Algorithm
Most of the time, we do worst-case analysis to analyze algorithms. In worst-case analysis, we guarantee an upper bound on the running time of an algorithm, which is useful information.
Average-case analysis is not easy to do in most practical cases and is rarely done. In average-case analysis, we must know (or predict) the mathematical distribution of all possible inputs.
Best-case analysis is of little use. Guaranteeing a lower bound on an algorithm doesn't provide any information because, in the worst case, the algorithm may still take years to run.
For some algorithms, all the cases are asymptotically the same, i.e., there are no distinct worst and best cases. For example, Merge Sort does Θ(nLogn) operations in all cases. Most other sorting algorithms have worst and best cases. For example, in the typical implementation of Quick Sort (where the pivot is chosen as a corner element), the worst case occurs when the input array is already sorted and the best case occurs when the pivot elements always divide the array into two halves. For Insertion Sort, the worst case occurs when the array is reverse sorted and the best case occurs when the array is already sorted in the same order as the output.
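The gap between the best and worst case of Insertion Sort can be observed directly. The following is a minimal Python sketch; the helper name insertion_sort_shifts is hypothetical and not part of the original text. It counts how many element shifts the algorithm performs on a sorted and on a reverse sorted input.

```python
def insertion_sort_shifts(values):
    """Sort a copy of values with insertion sort and return the number of shifts."""
    arr = list(values)
    shifts = 0
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1
        # Shift every element greater than key one position to the right
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j]
            j -= 1
            shifts += 1
        arr[j + 1] = key
    return shifts


n = 100
print(insertion_sort_shifts(range(n)))         # best case (sorted input): 0
print(insertion_sort_shifts(range(n, 0, -1)))  # worst case (reverse sorted): n*(n-1)/2 = 4950
```

The sorted input needs no shifts at all (linear work overall), while the reverse sorted input needs a quadratic number of shifts.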
References:
MIT's Video lecture 1 on Introduction to Algorithms.
1) Θ Notation: The theta notation bounds a function from above and below, so it defines exact asymptotic behavior.
A simple way to get the Theta notation of an expression is to drop low order terms and ignore leading constants. For example, consider the following expression.
3n^3 + 6n^2 + 6000 = Θ(n^3)
Dropping lower order terms is always fine because there will always be an n0 after which Θ(n^3) has higher values than Θ(n^2), irrespective of the constants involved.
For a given function g(n), we denote by Θ(g(n)) the following set of functions.
Θ(g(n)) = {f(n): there exist positive constants c1, c2 and n0 such that
0 <= c1*g(n) <= f(n) <= c2*g(n) for all n >= n0}
The above definition means that if f(n) is theta of g(n), then the value of f(n) is always between c1*g(n) and c2*g(n) for large values of n (n >= n0). The definition of theta also requires that f(n) be non-negative for values of n greater than n0.
2) Big O Notation: The Big O notation defines an upper bound of an algorithm; it bounds a function only from above. For example, consider the case of Insertion Sort. It takes linear time in the best case and quadratic time in the worst case. We can safely say that the time complexity of Insertion Sort is O(n^2). Note that O(n^2) also covers linear time.
If we use Θ notation to represent the time complexity of Insertion Sort, we have to use two statements for the best and worst cases:
1. The worst case time complexity of Insertion Sort is Θ(n^2).
2. The best case time complexity of Insertion Sort is Θ(n).
The Big O notation is useful when we only have an upper bound on the time complexity of an algorithm. Many times we can easily find an upper bound by simply looking at the algorithm.
O(g(n)) = { f(n): there exist positive constants c and n0 such that
0 <= f(n) <= c*g(n) for all n >= n0}
3) Ω Notation: Just as Big O notation provides an asymptotic upper bound on a function, Ω notation provides an asymptotic lower bound.
Ω notation can be useful when we have a lower bound on the time complexity of an algorithm. Since the best case performance of an algorithm is generally not useful, the Omega notation is the least used of the three notations.
For a given function g(n), we denote by Ω(g(n)) the following set of functions.
Ω(g(n)) = {f(n): there exist positive constants c and n0 such that
0 <= c*g(n) <= f(n) for all n >= n0}
Let us consider the same Insertion Sort example here. The time complexity of Insertion Sort can be written as Ω(n), but this is not very useful information about Insertion Sort, as we are generally interested in the worst case and sometimes in the average case.
Exercise:
Which of the following statements is/are valid?
1. Time Complexity of QuickSort is Θ(n^2)
2. Time Complexity of QuickSort is O(n^2)
3. For any two functions f(n) and g(n), we have f(n) = Θ(g(n)) if and only if f(n) = O(g(n)) and f(n) = Ω(g(n)).
4. Time complexity of all computer algorithms can be written as Ω(1)
References:
Lec 1 | MIT (Introduction to Algorithms)
Introduction to Algorithms 3rd Edition by Clifford Stein, Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest
2) O(n): Time complexity of a loop is considered O(n) if the loop variable is incremented / decremented by a constant amount. For example, the following loops have O(n) time complexity.
// Here c is a positive integer constant
for (int i = 1; i <= n; i += c) {
// some O(1) expressions
}
for (int i = n; i > 0; i -= c) {
// some O(1) expressions
}
3) O(n^c): Time complexity of nested loops is equal to the number of times the innermost statement is executed. For example, the following sample loops have O(n^2) time complexity.
for (int i = 1; i <=n; i += c) {
for (int j = 1; j <=n; j += c) {
// some O(1) expressions
}
}
for (int i = n; i > 0; i -= c) {
for (int j = i+1; j <=n; j += c) {
// some O(1) expressions
}
}
For example, Selection Sort and Insertion Sort have O(n^2) time complexity.
4) O(Logn): Time complexity of a loop is considered O(Logn) if the loop variable is divided / multiplied by a constant amount.
for (int i = 1; i <=n; i *= c) {
// some O(1) expressions
}
for (int i = n; i > 0; i /= c) {
// some O(1) expressions
}
For example, Binary Search (see the iterative implementation) has O(Logn) time complexity.
5) O(LogLogn): Time complexity of a loop is considered O(LogLogn) if the loop variable is reduced / increased exponentially by a constant amount.
// Here c is a constant greater than 1
for (int i = 2; i <=n; i = pow(i, c)) {
// some O(1) expressions
}
// Here fun is sqrt or cuberoot or any other constant root.
// The loop stops at 1 because a root of 1 is still 1.
for (int i = n; i > 1; i = fun(i)) {
// some O(1) expressions
}
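The loop patterns above can be checked by simply counting iterations. The following Python sketch mirrors the additive, multiplicative and exponential loops; the function names count_linear, count_log and count_loglog are hypothetical helpers introduced only for this illustration.

```python
def count_linear(n, c):
    """Iterations of: for (i = 1; i <= n; i += c)."""
    count, i = 0, 1
    while i <= n:
        count += 1
        i += c
    return count


def count_log(n, c):
    """Iterations of: for (i = 1; i <= n; i *= c), with c > 1."""
    count, i = 0, 1
    while i <= n:
        count += 1
        i *= c
    return count


def count_loglog(n, c):
    """Iterations of: for (i = 2; i <= n; i = pow(i, c)), with c > 1."""
    count, i = 0, 2
    while i <= n:
        count += 1
        i = i ** c
    return count


n = 65536
print(count_linear(n, 2))  # 32768, grows like n
print(count_log(n, 2))     # 17, grows like Log n
print(count_loglog(n, 2))  # 5, grows like Log Log n
```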
How to calculate time complexity when there are many if, else statements inside loops?
Worst case time complexity is the most useful among best, average and worst. Therefore we need to consider the worst case. We evaluate the situation in which the values in the if-else conditions cause the maximum number of statements to be executed.
For example, consider the linear search function, where the worst case is when the element is present at the end or not present at all.
When the code is too complex to consider all if-else cases, we can get an upper bound by ignoring if-else and other complex control statements.
How to calculate time complexity of recursive functions?
Time complexity of a recursive function can be written as a mathematical recurrence relation. To calculate time complexity, we must know how to
solve recurrences. We will soon be discussing recurrence solving techniques as a separate post.
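To show T(n) <= cnLogn for the recurrence T(n) = 2T(n/2) + n, assume the bound holds for values smaller than n (Log is base 2):
T(n) = 2T(n/2) + n
     <= 2(c(n/2)Log(n/2)) + n
     = cnLogn - cnLog2 + n
     = cnLogn - cn + n
     <= cnLogn (for c >= 1)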
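For the recurrence T(n) = 2T(n/2) + n, the bound T(n) <= cnLogn can also be checked numerically. The sketch below is only an illustration: the base case T(1) = 1, the choice c = 2, and the restriction to powers of two are assumptions made here, not part of the original derivation.

```python
import math


def T(n):
    """Evaluate T(n) = 2T(n/2) + n with the assumed base case T(1) = 1 (n a power of two)."""
    if n == 1:
        return 1
    return 2 * T(n // 2) + n


def bound_holds(n, c=2):
    """Check T(n) <= c * n * log2(n) for one value of n."""
    return T(n) <= c * n * math.log2(n)


for k in range(1, 11):
    n = 2 ** k
    print(n, T(n), bound_holds(n))
```

With this base case T(n) equals nLogn + n, so the bound holds with c = 2 for every n >= 2.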
2) Recurrence Tree Method: In this method, we draw a recurrence tree and calculate the time taken by every level of the tree. Finally, we sum the work done at all levels. To draw the recurrence tree, we start from the given recurrence and keep drawing until we find a pattern among the levels. The pattern is typically an arithmetic or geometric series.
For example, consider the recurrence relation
T(n) = T(n/4) + T(n/2) + cn^2
          cn^2
         /    \
    T(n/4)    T(n/2)
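Expanding the tree further gives the level sums. The following is a sketch of the standard calculation for this recurrence; it is not part of the tree shown in the text. The root costs cn^2, the next level costs c(n/4)^2 + c(n/2)^2, and each level costs 5/16 of the level above it, so the total is bounded by a geometric series:

```latex
\begin{aligned}
\text{level } 0 &: \; cn^2 \\
\text{level } 1 &: \; c\left(\tfrac{n}{4}\right)^2 + c\left(\tfrac{n}{2}\right)^2 = \tfrac{5}{16}\,cn^2 \\
\text{level } 2 &: \; \left(\tfrac{5}{16}\right)^2 cn^2 \\
T(n) &\le cn^2 \sum_{i=0}^{\infty} \left(\tfrac{5}{16}\right)^i = \tfrac{16}{11}\,cn^2 = O(n^2)
\end{aligned}
```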
3) Master Method:
Master Method is a direct way to get the solution. The master method works only for following type of recurrences or for recurrences that can be
transformed to following type.
T(n) = aT(n/b) + f(n) where a >= 1 and b > 1
In recurrence tree method, we calculate total work done. If the work done at leaves is polynomially more, then leaves are the dominant part, and
our result becomes the work done at leaves (Case 1). If work done at leaves and root is asymptotically same, then our result becomes height
multiplied by work done at any level (Case 2). If work done at root is asymptotically more, then our result becomes work done at root (Case 3).
Examples of some standard algorithms whose time complexity can be evaluated using Master Method
Merge Sort: T(n) = 2T(n/2) + Θ(n). It falls in case 2 as c is 1 and Log_b(a) is also 1. So the solution is Θ(nLogn).
Binary Search: T(n) = T(n/2) + Θ(1). It also falls in case 2 as c is 0 and Log_b(a) is also 0. So the solution is Θ(Logn).
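The case selection can be sketched in code. The Python function below assumes the common simplified statement of the Master Method for f(n) = Θ(n^c): case 1 when c < Log_b(a), case 2 when c = Log_b(a), case 3 when c > Log_b(a). That statement, the function name master_theorem and its string output format are assumptions of this sketch rather than something spelled out in the text above.

```python
import math


def master_theorem(a, b, c):
    """Solve T(n) = a*T(n/b) + Theta(n^c) by comparing c with log_b(a)."""
    if a < 1 or b <= 1:
        raise ValueError("requires a >= 1 and b > 1")
    critical = math.log(a, b)  # the exponent Log_b(a)
    if math.isclose(c, critical):
        # Case 2: work is the same at every level, multiplied by the height
        return f"Theta(n^{c:g} Log n)" if c else "Theta(Log n)"
    if c < critical:
        # Case 1: the leaves dominate
        return f"Theta(n^{critical:g})"
    # Case 3: the root dominates
    return f"Theta(n^{c:g})"


print(master_theorem(2, 2, 1))  # Merge Sort
print(master_theorem(1, 2, 0))  # Binary Search
```

Recurrences that fall into the gaps between the cases, such as f(n) = n/Logn, are outside what this sketch handles.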
Notes:
1) It is not necessary that a recurrence of the form T(n) = aT(n/b) + f(n) can be solved using the Master Theorem. The given three cases have some gaps between them. For example, the recurrence T(n) = 2T(n/2) + n/Logn cannot be solved using the master method.
2) Case 2 can be extended for f(n) = Θ(n^c Log^k n):
If f(n) = Θ(n^c Log^k n) for some constant k >= 0 and c = Log_b(a), then T(n) = Θ(n^c Log^(k+1) n)
Practice Problems and Solutions on Master Theorem.
References:
http://en.wikipedia.org/wiki/Master_theorem
MIT Video Lecture on Asymptotic Notation | Recurrences | Substitution, Master Method
Introduction to Algorithms 3rd Edition by Clifford Stein, Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest
The solution to this trade-off problem is to use Dynamic Table (or Arrays). The idea is to increase size of table whenever it becomes full. Following
are the steps to follow when table becomes full.
1) Allocate memory for a larger table, typically twice the size of the old table.
2) Copy the contents of old table to new table.
3) Free the old table.
If the table has space available, we simply insert new item in available space.
What is the time complexity of n insertions using the above scheme?
If we use simple analysis, the worst case cost of an insertion is O(n). Therefore, the worst case cost of n inserts is n * O(n), which is O(n^2). This analysis gives an upper bound, but not a tight upper bound for n insertions, as not all insertions take Θ(n) time.
Using Amortized Analysis, we can prove that the Dynamic Table scheme has O(1) amortized insertion time, which is a great result used in hashing. The concept of a dynamic table is also used in vectors in C++ and ArrayList in Java.
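The O(1) amortized bound can be seen by simulating the doubling scheme and adding up the work. The following Python sketch assumes a cost model of one unit per element written (either a fresh insert or a copy into the new table) and an initial capacity of 1; both are assumptions made for illustration, and total_insert_cost is a hypothetical name.

```python
def total_insert_cost(n):
    """Simulate n insertions into a doubling table and return the total cost."""
    capacity, size, cost = 1, 0, 0
    for _ in range(n):
        if size == capacity:
            cost += size    # copy the old contents into the new, larger table
            capacity *= 2
        cost += 1           # write the new item
        size += 1
    return cost


for n in (8, 100, 1000):
    cost = total_insert_cost(n)
    print(n, cost, cost / n)  # the average cost per insertion stays below 3
```

The copies happen only at sizes 1, 2, 4, 8, ..., so their total is less than 2n and the whole sequence costs less than 3n.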
Following are a few important notes.
1) The amortized cost of a sequence of operations can be seen as the expenses of a salaried person. The average monthly expense of the person is less than or equal to the salary, but the person can spend more money in a particular month by buying a car or something. In other months, he or she saves money for the expensive month.
2) The above Amortized Analysis done for the Dynamic Array example is called the Aggregate Method. There are two more powerful ways to do Amortized Analysis, called the Accounting Method and the Potential Method. We will be discussing the other two methods in separate posts.
3) Amortized analysis doesn't involve probability. There is another, different notion of average case running time, where algorithms use randomization to make them faster and the expected running time is better than the worst case running time. These algorithms are analyzed using Randomized Analysis. Examples of these algorithms are Randomized Quick Sort, Quick Select and Hashing. We will soon be covering Randomized Analysis in a different post.
Sources:
Berkeley Lecture 35: Amortized Analysis
MIT Lecture 13: Amortized Algorithms, Table Doubling, Potential Method
http://www.cs.cornell.edu/courses/cs3110/2011sp/lectures/lec20-amortized/amortized.htm
A decision problem L is NP-Complete if:
1) L is in NP (any given solution for NP-Complete problems can be verified quickly, but there is no known efficient way to find one).
2) Every problem in NP is reducible to L in polynomial time (reduction is defined below).
A problem is NP-Hard if it satisfies property 2 above; it doesn't need to satisfy property 1. Therefore, the NP-Complete set is also a subset of the NP-Hard set.
NP-completeness applies to the realm of decision problems. It was set up this way because it's easier to compare the difficulty of decision problems than that of optimization problems. In reality, though, being able to solve a decision problem in polynomial time will often permit us to solve the corresponding optimization problem in polynomial time (using a polynomial number of calls to the decision problem). So, discussing the difficulty of decision problems is often really equivalent to discussing the difficulty of optimization problems (Source: Ref 2).
For example, consider the vertex cover problem (given a graph, find the minimum sized vertex set that covers all edges). It is an optimization problem. The corresponding decision problem is: given an undirected graph G and k, is there a vertex cover of size k?
What is Reduction?
Let L1 and L2 be two decision problems. Suppose algorithm A2 solves L2. That is, if y is an input for L2 then algorithm A2 will answer Yes or No
depending upon whether y belongs to L2 or not.
The idea is to find a transformation from L1 to L2 so that the algorithm A2 can be part of an algorithm A1 to solve L1.
Learning reduction in general is very important. For example, if we have library functions to solve a certain problem and we can reduce a new problem to one of the solved problems, we save a lot of time. Consider the example of a problem where we have to find the minimum product path in a given directed graph, where the product of a path is the multiplication of the weights of the edges along the path. If we have code for Dijkstra's algorithm to find the shortest path, we can take the log of all weights (assuming every weight is at least 1, so that the logs are non-negative) and use Dijkstra's algorithm to find the minimum product path rather than writing fresh code for this new problem.
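The minimum product path reduction can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the graph is a dictionary mapping each vertex to a list of (neighbor, weight) pairs, every weight is at least 1 so that its logarithm is non-negative, and the names dijkstra and min_product are hypothetical.

```python
import heapq
import math


def dijkstra(graph, source):
    """Shortest distances from source; graph maps u -> [(v, weight), ...], weights >= 0."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def min_product(graph, source, target):
    """Reduce the minimum product path to a shortest path on log-weights."""
    log_graph = {u: [(v, math.log(w)) for v, w in edges]
                 for u, edges in graph.items()}
    dist = dijkstra(log_graph, source)
    return math.exp(dist[target]) if target in dist else None


graph = {"a": [("b", 2), ("c", 10)], "b": [("c", 3)], "c": []}
print(min_product(graph, "a", "c"))  # close to 6, via a -> b -> c
```

Because log(x * y) = log(x) + log(y), minimizing the sum of log-weights is the same as minimizing the product of the original weights.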
From the definition of NP-Complete, it appears impossible to prove that a problem L is NP-Complete. By definition, it requires us to show that every problem in NP is polynomial time reducible to L. Fortunately, there is an alternate way to prove it. The idea is to take a known NP-Complete problem and reduce it to L. If a polynomial time reduction is possible, we can prove that L is NP-Complete by transitivity of reduction (if an NP-Complete problem is reducible to L in polynomial time, then all problems in NP are reducible to L in polynomial time).
What was the first problem proved as NP-Complete?
There must be some first NP-Complete problem proved by definition of NP-Complete problems. SAT (Boolean satisfiability problem) is the first
NP-Complete problem proved by Cook (See CLRS book for proof).
It is always useful to know about NP-Completeness, even for engineers. Suppose you are asked to write an efficient algorithm to solve an extremely important problem for your company. After a lot of thinking, you can only come up with an exponential time approach, which is impractical. If you don't know about NP-Completeness, you can only say that you could not come up with an efficient algorithm. If you know about NP-Completeness and prove that the problem is NP-Complete, you can proudly say that a polynomial time solution is unlikely to exist. If a polynomial time solution is possible, then that solution solves a big problem of computer science that many scientists have been working on for years.
We will soon be discussing more NP-Complete problems and their proofs of NP-Completeness.
References:
MIT Video Lecture on Computational Complexity
Introduction to Algorithms 3rd Edition by Clifford Stein, Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest
http://www.ics.uci.edu/~eppstein/161/960312.html
Time complexity of the above function can be written as Θ(log 1) + Θ(log 2) + Θ(log 3) + . . . . + Θ(log n), which is Θ(log n!).
The order of growth of log n! and n log n is the same for large values of n, i.e., Θ(log n!) = Θ(n log n). So the time complexity of fun() is Θ(n log n).
The expression Θ(log n!) = Θ(n log n) can be easily derived from the following Stirling's approximation (or Stirling's formula).
log n! = n log n - n + O(log(n))
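The relation between log n! and n log n can be checked numerically. The following Python sketch uses math.lgamma, which returns the natural log of the gamma function, so lgamma(n + 1) equals log n!; the helper name stirling_ratio is hypothetical.

```python
import math


def stirling_ratio(n):
    """Return log(n!) / (n log n) using natural logarithms."""
    return math.lgamma(n + 1) / (n * math.log(n))


for n in (10, 1000, 10 ** 6):
    print(n, stirling_ratio(n))
```

The ratio stays below 1 and moves toward 1 as n grows, which matches the n log n - n leading terms of the formula.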
Sources:
http://en.wikipedia.org/wiki/Stirling%27s_approximation
k=1
Sum = 1 + 2 + 3 ... n
    = n(n+1)/2
    = n^2/2 + n/2
k=2
Sum = 1^2 + 2^2 + 3^2 + ... n^2
    = n(n+1)(2n+1)/6
    = n^3/3 + n^2/2 + n/6
k=3
Sum = 1^3 + 2^3 + 3^3 + ... n^3
    = n^2(n+1)^2/4
    = n^4/4 + n^3/2 + n^2/4
Binary Search
Given a sorted array arr[] of n elements, write a function to search a given element x in arr[].
A simple approach is to do a linear search, i.e., start from the leftmost element of arr[] and compare x with each element of arr[] one by one. If x matches an element, return its index. If x doesn't match any of the elements, return -1.
C/C++
// Linearly search x in arr[]. If x is present then return its
// location, otherwise return -1
int search(int arr[], int n, int x)
{
int i;
for (i=0; i<n; i++)
if (arr[i] == x)
return i;
return -1;
}
The idea of binary search is to use the information that the array is sorted and reduce the time complexity to O(Logn). We basically ignore half of
the elements just after one comparison.
1) Compare x with the middle element.
2) If x matches with middle element, we return the mid index.
3) Else If x is greater than the mid element, then x can only lie in right half subarray after the mid element. So we recur for right half.
4) Else (x is smaller) recur for the left half.
Following is Recursive implementation of Binary Search.
C/C++
#include <stdio.h>

// A recursive binary search function. It returns the location of x in
// the given array arr[l..r] if present, otherwise -1
int binarySearch(int arr[], int l, int r, int x)
{
    if (r >= l)
    {
        int mid = l + (r - l)/2;

        // If the element is present at the middle itself
        if (arr[mid] == x) return mid;

        // If the element is smaller than mid, then it can only be present
        // in the left subarray
        if (arr[mid] > x) return binarySearch(arr, l, mid-1, x);

        // Else the element can only be present in the right subarray
        return binarySearch(arr, mid+1, r, x);
    }

    // We reach here when the element is not present in the array
    return -1;
}
Python
# Python 3 program for recursive binary search.
# Returns index of x in arr if present, else -1
def binarySearch(arr, l, r, x):
    # Check base case
    if r >= l:
        mid = l + (r - l) // 2
        # If element is present at the middle itself
        if arr[mid] == x:
            return mid
        # If element is smaller than mid, then it can only
        # be present in left subarray
        elif arr[mid] > x:
            return binarySearch(arr, l, mid - 1, x)
        # Else the element can only be present in right subarray
        else:
            return binarySearch(arr, mid + 1, r, x)
    else:
        # Element is not present in the array
        return -1

# Test array
arr = [2, 3, 4, 10, 40]
x = 10

# Function call
result = binarySearch(arr, 0, len(arr) - 1, x)
if result != -1:
    print("Element is present at index %d" % result)
else:
    print("Element is not present in array")
C/C++
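#include <stdio.h>

// An iterative binary search function. It returns the location of x in
// the given array arr[l..r] if present, otherwise -1
int binarySearch(int arr[], int l, int r, int x)
{
    while (l <= r)
    {
        int m = l + (r-l)/2;

        if (arr[m] == x) return m; // Check if x is present at mid

        if (arr[m] < x) l = m + 1; // If x is greater, ignore the left half
        else r = m - 1;            // If x is smaller, ignore the right half
    }

    // If we reach here, then the element was not present
    return -1;
}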
Python
# Iterative Binary Search Function (Python 3)
# It returns location of x in given array arr if present,
# else returns -1
def binarySearch(arr, l, r, x):
    while l <= r:
        mid = l + (r - l) // 2
        # Check if x is present at mid
        if arr[mid] == x:
            return mid
        # If x is greater, ignore left half
        elif arr[mid] < x:
            l = mid + 1
        # If x is smaller, ignore right half
        else:
            r = mid - 1
    # If we reach here, then the element was not present
    return -1

# Test array
arr = [2, 3, 4, 10, 40]
x = 10

# Function call
result = binarySearch(arr, 0, len(arr) - 1, x)
if result != -1:
    print("Element is present at index %d" % result)
else:
    print("Element is not present in array")
Time Complexity:
The time complexity of Binary Search can be written as
T(n) = T(n/2) + c
The above recurrence can be solved using either the Recurrence Tree method or the Master method. It falls in case 2 of the Master Method and the solution of the recurrence is Θ(Logn).
Auxiliary Space: O(1) in the case of the iterative implementation. In the case of the recursive implementation, O(Logn) recursion call stack space.
Algorithmic Paradigm: Divide and Conquer
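In Python, the same search is often written with the standard library instead of a hand-coded loop. The sketch below swaps in bisect.bisect_left from the bisect module, which returns the leftmost insertion point for x in a sorted list; the wrapper name binary_search is a hypothetical helper, and this is an alternative technique rather than the implementation discussed above.

```python
from bisect import bisect_left


def binary_search(arr, x):
    """Return the index of x in the sorted list arr, or -1 if x is absent."""
    i = bisect_left(arr, x)  # leftmost position where x could be inserted
    if i < len(arr) and arr[i] == x:
        return i
    return -1


arr = [2, 3, 4, 10, 40]
print(binary_search(arr, 10))  # 3
print(binary_search(arr, 5))   # -1
```

Unlike the functions above, this returns the first occurrence when the array contains duplicates of x.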
Following are some interesting articles based on Binary Search.
The Ubiquitous Binary Search
Interpolation search vs Binary search
Find the minimum element in a sorted and rotated array
Selection Sort
The selection sort algorithm sorts an array by repeatedly finding the minimum element (considering ascending order) from unsorted part and putting
it at the beginning. The algorithm maintains two subarrays in a given array.
1) The subarray which is already sorted.
2) Remaining subarray which is unsorted.
In every iteration of selection sort, the minimum element (considering ascending order) from the unsorted subarray is picked and moved to the
sorted subarray.
Following example explains the above steps:
arr[] = 64 25 12 22 11
// Find the minimum element in arr[0...4] and place it at beginning
11 25 12 22 64
// Find the minimum element in arr[1...4] and
// place it at beginning of arr[1...4]
11 12 25 22 64
// Find the minimum element in arr[2...4] and
// place it at beginning of arr[2...4]
11 12 22 25 64
// Find the minimum element in arr[3...4] and
// place it at beginning of arr[3...4]
11 12 22 25 64
// C program for implementation of selection sort
#include <stdio.h>
void swap(int *xp, int *yp)
{
int temp = *xp;
*xp = *yp;
*yp = temp;
}
void selectionSort(int arr[], int n)
{
int i, j, min_idx;
// One by one move boundary of unsorted subarray
for (i = 0; i < n-1; i++)
{
// Find the minimum element in unsorted array
min_idx = i;
for (j = i+1; j < n; j++)
if (arr[j] < arr[min_idx])
min_idx = j;
// Swap the found minimum element with the first element
swap(&arr[min_idx], &arr[i]);
}
}
/* Function to print an array */
void printArray(int arr[], int size)
{
int i;
for (i=0; i < size; i++)
printf("%d ", arr[i]);
printf("\n");
}
// Driver program to test above functions
int main()
{
int arr[] = {64, 25, 12, 22, 11};
int n = sizeof(arr)/sizeof(arr[0]);
selectionSort(arr, n);
printf("Sorted array: \n");
printArray(arr, n);
return 0;
}
Output:
Sorted array:
11 12 22 25 64
Bubble Sort
Bubble Sort is the simplest sorting algorithm that works by repeatedly swapping the adjacent elements if they are in wrong order.
Example:
First Pass:
( 5 1 4 2 8 ) > ( 1 5 4 2 8 ), Here, algorithm compares the first two elements, and swaps since 5 > 1.
( 1 5 4 2 8 ) > ( 1 4 5 2 8 ), Swap since 5 > 4
( 1 4 5 2 8 ) > ( 1 4 2 5 8 ), Swap since 5 > 2
( 1 4 2 5 8 ) > ( 1 4 2 5 8 ), Now, since these elements are already in order (8 > 5), algorithm does not swap them.
Second Pass:
( 1 4 2 5 8 ) > ( 1 4 2 5 8 )
( 1 4 2 5 8 ) > ( 1 2 4 5 8 ), Swap since 4 > 2
( 1 2 4 5 8 ) > ( 1 2 4 5 8 )
( 1 2 4 5 8 ) > ( 1 2 4 5 8 )
Now, the array is already sorted, but our algorithm does not know if it is completed. The algorithm needs one whole pass without any swap to know it is sorted.
Third Pass:
( 1 2 4 5 8 ) > ( 1 2 4 5 8 )
( 1 2 4 5 8 ) > ( 1 2 4 5 8 )
( 1 2 4 5 8 ) > ( 1 2 4 5 8 )
( 1 2 4 5 8 ) > ( 1 2 4 5 8 )
Following is C implementation of Bubble Sort.
// C program for implementation of Bubble sort
#include <stdio.h>
void swap(int *xp, int *yp)
{
int temp = *xp;
*xp = *yp;
*yp = temp;
}
// A function to implement bubble sort
void bubbleSort(int arr[], int n)
{
int i, j;
for (i = 0; i < n-1; i++)
for (j = 0; j < n-i-1; j++) //Last i elements are already in place
if (arr[j] > arr[j+1])
swap(&arr[j], &arr[j+1]);
}
/* Function to print an array */
void printArray(int arr[], int size)
{
int i;
for (i=0; i < size; i++)
printf("%d ", arr[i]);
printf("\n");
}
// Driver program to test above functions
int main()
{
int arr[] = {64, 34, 25, 12, 22, 11, 90};
int n = sizeof(arr)/sizeof(arr[0]);
bubbleSort(arr, n);
printf("Sorted array: \n");
printArray(arr, n);
return 0;
}
Output:
Sorted array:
11 12 22 25 34 64 90
Optimized Implementation:
The above function always takes O(n^2) time even if the array is sorted. It can be optimized by stopping the algorithm if the inner loop didn't cause any swap.
// Optimized implementation of Bubble sort
#include <stdio.h>
#include <stdbool.h>
void swap(int *xp, int *yp)
{
int temp = *xp;
*xp = *yp;
*yp = temp;
}
// An optimized version of Bubble Sort
void bubbleSort(int arr[], int n)
{
int i, j;
bool swapped;
for (i = 0; i < n-1; i++)
{
swapped = false;
for (j = 0; j < n-i-1; j++)
{
if (arr[j] > arr[j+1])
{
swap(&arr[j], &arr[j+1]);
swapped = true;
}
}
// IF no two elements were swapped by inner loop, then break
if (swapped == false)
break;
}
}
/* Function to print an array */
void printArray(int arr[], int size)
{
int i;
for (i=0; i < size; i++)
printf("%d ", arr[i]);
printf("\n");
}
// Driver program to test above functions
int main()
{
int arr[] = {64, 34, 25, 12, 22, 11, 90};
int n = sizeof(arr)/sizeof(arr[0]);
bubbleSort(arr, n);
printf("Sorted array: \n");
printArray(arr, n);
return 0;
}
Output:
Sorted array:
11 12 22 25 34 64 90
Worst and Average Case Time Complexity: O(n*n). Worst case occurs when array is reverse sorted.
Best Case Time Complexity: O(n). Best case occurs when array is already sorted.
Auxiliary Space: O(1)
Boundary Cases: Bubble sort takes minimum time (Order of n) when elements are already sorted.
Sorting In Place: Yes
Stable: Yes
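The effect of the early exit can be made visible by counting passes. The following Python sketch follows the optimized scheme described above; returning the pass count is an addition made here for illustration only.

```python
def bubble_sort(arr):
    """Sort arr in place with early-exit bubble sort; return the number of passes made."""
    n = len(arr)
    passes = 0
    for i in range(n - 1):
        swapped = False
        passes += 1
        # The last i elements are already in place
        for j in range(n - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        if not swapped:
            break  # a full pass without swaps means the array is sorted
    return passes


data = [5, 1, 4, 2, 8]
print(bubble_sort(data), data)   # 3 passes, as in the worked example
print(bubble_sort([1, 2, 3, 4]))  # 1 pass for an already sorted array
```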
Due to its simplicity, bubble sort is often used to introduce the concept of a sorting algorithm.
In computer graphics it is popular for its capability to detect a very small error (like swap of just two elements) in almost-sorted arrays and fix it
with just linear complexity (2n). For example, it is used in a polygon filling algorithm, where bounding lines are sorted by their x coordinate at a
specific scan line (a line parallel to x axis) and with incrementing y their order changes (two elements are swapped) only at intersections of two lines
(Source: Wikipedia)
Insertion Sort
Insertion sort is a simple sorting algorithm that works the way we sort playing cards in our hands.
Algorithm
// Sort an arr[] of size n
insertionSort(arr, n)
Loop from i = 1 to n-1.
a) Pick element arr[i] and insert it into the sorted sequence arr[0..i-1]
Example:
12, 11, 13, 5, 6
Let us loop for i = 1 (second element of the array) to 4 (last index of the array)
i = 1. Since 11 is smaller than 12, move 12 and insert 11 before 12
11, 12, 13, 5, 6
i = 2. 13 will remain at its position as all elements in arr[0..i-1] are smaller than 13
11, 12, 13, 5, 6
i = 3. 5 will move to the beginning and all other elements from 11 to 13 will move one position ahead of their current position.
5, 11, 12, 13, 6
i = 4. 6 will move to position after 5, and elements from 11 to 13 will move one position ahead of their current position.
5, 6, 11, 12, 13
// C program for insertion sort
#include <stdio.h>
#include <math.h>
/* Function to sort an array using insertion sort*/
void insertionSort(int arr[], int n)
{
int i, key, j;
for (i = 1; i < n; i++)
{
key = arr[i];
j = i-1;
/* Move elements of arr[0..i-1], that are
greater than key, to one position ahead
of their current position */
while (j >= 0 && arr[j] > key)
{
arr[j+1] = arr[j];
j = j-1;
}
arr[j+1] = key;
}
}
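// A utility function to print an array of size n
void printArray(int arr[], int n)
{
    int i;
    for (i=0; i < n; i++)
        printf("%d ", arr[i]);
    printf("\n");
}

/* Driver program to test insertion sort */
int main()
{
    int arr[] = {12, 11, 13, 5, 6};
    int n = sizeof(arr)/sizeof(arr[0]);
    insertionSort(arr, n);
    printArray(arr, n);
    return 0;
}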
Output:
5 6 11 12 13
Merge Sort
Merge Sort is a Divide and Conquer algorithm. It divides the input array into two halves, calls itself for the two halves and then merges the two sorted halves. The merge() function is used for merging two halves. The merge(arr, l, m, r) call is the key process that assumes that arr[l..m] and arr[m+1..r] are sorted and merges the two sorted sub-arrays into one. See the following C implementation for details.
MergeSort(arr[], l, r)
If r > l
1. Find the middle point to divide the array into two halves:
middle m = (l+r)/2
2. Call mergeSort for first half:
Call mergeSort(arr, l, m)
3. Call mergeSort for second half:
Call mergeSort(arr, m+1, r)
4. Merge the two halves sorted in step 2 and 3:
Call merge(arr, l, m, r)
Consider the complete merge sort process for an example array {38, 27, 43, 3, 9, 82, 10} (diagram from Wikipedia). The array is recursively divided into two halves until the size becomes 1. Once the size becomes 1, the merge process comes into action and starts merging arrays back until the complete array is merged.
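/* C program for Merge Sort */
#include <stdio.h>

/* Merges two sorted subarrays of arr[]: arr[l..m] and arr[m+1..r] */
void merge(int arr[], int l, int m, int r)
{
    int i, j, k;
    int n1 = m - l + 1;
    int n2 = r - m;

    /* Temporary arrays holding copies of the two halves */
    int L[n1], R[n2];
    for (i = 0; i < n1; i++)
        L[i] = arr[l + i];
    for (j = 0; j < n2; j++)
        R[j] = arr[m + 1 + j];

    /* Merge the temporary arrays back into arr[l..r] */
    i = 0;
    j = 0;
    k = l;
    while (i < n1 && j < n2)
    {
        if (L[i] <= R[j])
        {
            arr[k] = L[i];
            i++;
        }
        else
        {
            arr[k] = R[j];
            j++;
        }
        k++;
    }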
/* Copy the remaining elements of L[], if there are any */
while (i < n1)
{
arr[k] = L[i];
i++;
k++;
}
/* Copy the remaining elements of R[], if there are any */
while (j < n2)
{
arr[k] = R[j];
j++;
k++;
}
}
/* l is for left index and r is right index of the sub-array
of arr to be sorted */
void mergeSort(int arr[], int l, int r)
{
if (l < r)
{
int m = l+(r-l)/2; //Same as (l+r)/2, but avoids overflow for large l and r
mergeSort(arr, l, m);
mergeSort(arr, m+1, r);
merge(arr, l, m, r);
}
}
/* UTILITY FUNCTIONS */
/* Function to print an array */
void printArray(int A[], int size)
{
int i;
for (i=0; i < size; i++)
printf("%d ", A[i]);
printf("\n");
}
/* Driver program to test above functions */
int main()
{
int arr[] = {12, 11, 13, 5, 6, 7};
int arr_size = sizeof(arr)/sizeof(arr[0]);
printf("Given array is \n");
printArray(arr, arr_size);
mergeSort(arr, 0, arr_size - 1);
printf("\nSorted array is \n");
printArray(arr, arr_size);
return 0;
}
Output:
Given array is
12 11 13 5 6 7
Sorted array is
5 6 7 11 12 13
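For comparison, the same divide, recurse and merge structure can be sketched in Python. This version is an illustration only: unlike the C program, it returns a new sorted list instead of sorting in place, and the name merge_sort is chosen here.

```python
def merge_sort(values):
    """Return a new list with the items of values in ascending order."""
    if len(values) <= 1:
        return list(values)
    mid = len(values) // 2
    left = merge_sort(values[:mid])   # sort the first half
    right = merge_sort(values[mid:])  # sort the second half

    # Merge the two sorted halves
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # copy whatever remains of the left half
    merged.extend(right[j:])  # copy whatever remains of the right half
    return merged


print(merge_sort([12, 11, 13, 5, 6, 7]))  # [5, 6, 7, 11, 12, 13]
```

Taking from the left half on ties (the <= comparison) keeps equal elements in their original order, so the sort is stable.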
Time Complexity: Merge Sort is a recursive algorithm and its time complexity can be expressed as the following recurrence relation.
T(n) = 2T(n/2) + Θ(n)
The above recurrence can be solved using either the Recurrence Tree method or the Master method. It falls in case 2 of the Master Method and the solution of the recurrence is Θ(nLogn).
Time complexity of Merge Sort is Θ(nLogn) in all 3 cases (worst, average and best), as merge sort always divides the array into two halves and takes linear time to merge the two halves.
Heap Sort
Heap sort is a comparison based sorting technique based on the Binary Heap data structure. It is similar to selection sort in that we first find the maximum element and place it at the end. We repeat the same process for the remaining elements.
What is Binary Heap?
Let us first define a Complete Binary Tree. A complete binary tree is a binary tree in which every level, except possibly the last, is completely filled, and all nodes are as far left as possible (Source: Wikipedia).
A Binary Heap is a Complete Binary Tree where items are stored in a special order such that the value in a parent node is greater (or smaller) than the values in its two children nodes. The former is called a max heap and the latter is called a min heap. The heap can be represented by a binary tree or an array.
Why array based representation for Binary Heap?
Since a Binary Heap is a Complete Binary Tree, it can be easily represented as array and array based representation is space efficient. If the
parent node is stored at index I, the left child can be calculated by 2 * I + 1 and right child by 2 * I + 2.
Heap Sort Algorithm for sorting in increasing order:
1. Build a max heap from the input data.
2. At this point, the largest item is stored at the root of the heap. Replace it with the last item of the heap, then reduce the size of the heap by 1. Finally, heapify the root of the tree.
3. Repeat step 2 while the size of the heap is greater than 1.
How to build the heap?
Heapify procedure can be applied to a node only if its children nodes are heapified. So the heapification must be performed in the bottom up
order.
Let's understand with the help of an example:
Input data: 4, 10, 3, 5, 1
         4(0)
        /    \
    10(1)    3(2)
    /   \
 5(3)   1(4)
The numbers in bracket represent the indices in the array representation of data.
Applying heapify procedure to index 1:
         4(0)
        /    \
    10(1)    3(2)
    /   \
 5(3)   1(4)
Applying heapify procedure to index 0:
        10(0)
        /    \
     5(1)    3(2)
    /   \
 4(3)   1(4)
The heapify procedure calls itself recursively to build heap in top down manner.
// C implementation of Heap Sort
#include <stdio.h>
#include <stdlib.h>
// A heap has current size and array of elements
struct MaxHeap
{
int size;
int* array;
};
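// A utility function to swap two integers
void swap(int* a, int* b) { int t = *a; *a = *b; *b = t; }
// The main function to heapify a Max Heap. The function
// assumes that everything under given root (element at
// index idx) is already heapified
void maxHeapify(struct MaxHeap* maxHeap, int idx)
{
    int largest = idx; // Initialize largest as root
    int left = (idx << 1) + 1; // left = 2*idx + 1
    int right = (idx + 1) << 1; // right = 2*idx + 2
    // See if the left child exists and is greater than the root
    if (left < maxHeap->size &&
        maxHeap->array[left] > maxHeap->array[largest])
        largest = left;
    // See if the right child exists and is greater than the largest so far
    if (right < maxHeap->size &&
        maxHeap->array[right] > maxHeap->array[largest])
        largest = right;
    // Move the largest value to the root and heapify the affected subtree
    if (largest != idx)
    {
        swap(&maxHeap->array[largest], &maxHeap->array[idx]);
        maxHeapify(maxHeap, largest);
    }
}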
Output:
Given array is
12 11 13 5 6 7
Sorted array is
5 6 7 11 12 13
Notes:
Heap sort is an in-place algorithm.
Its typical implementation is not stable, but can be made stable (See this)
Time Complexity: Time complexity of heapify is O(Logn). Time complexity of createAndBuildHeap() is O(n) and overall time complexity of
Heap Sort is O(nLogn).
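Since the listing above is cut off, the three steps of the algorithm can be sketched as follows. This is a minimal sketch, not the original program: it works on a std::vector instead of the MaxHeap struct, uses an iterative sift-down, and the names maxHeapify and heapSort with these signatures are assumptions made for illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sift the element at index idx down until the subtree rooted there
// satisfies the max-heap property. Only a[0..n-1] is treated as the heap.
void maxHeapify(std::vector<int>& a, std::size_t n, std::size_t idx)
{
    while (true) {
        std::size_t largest = idx;
        std::size_t left = 2 * idx + 1;
        std::size_t right = 2 * idx + 2;
        if (left < n && a[left] > a[largest]) largest = left;
        if (right < n && a[right] > a[largest]) largest = right;
        if (largest == idx) return;      // heap property holds here
        std::swap(a[idx], a[largest]);
        idx = largest;                   // continue in the affected subtree
    }
}

// Sort a[] in increasing order using a max heap.
void heapSort(std::vector<int>& a)
{
    std::size_t n = a.size();
    // Step 1: build the heap bottom-up, starting at the last internal node.
    for (std::size_t i = n / 2; i-- > 0; )
        maxHeapify(a, n, i);
    // Steps 2 and 3: move the current maximum behind the heap, shrink the
    // heap by one, and restore the heap property at the root.
    for (std::size_t end = n; end > 1; --end) {
        std::swap(a[0], a[end - 1]);
        maxHeapify(a, end - 1, 0);
    }
}
```

Calling heapSort on {12, 11, 13, 5, 6, 7} leaves the vector as {5, 6, 7, 11, 12, 13}, which matches the output shown above.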
Applications of HeapSort
1. Sort a nearly sorted (or K sorted) array
2. k largest(or smallest) elements in an array
Heap sort algorithm has limited uses because Quicksort and Mergesort are better in practice. Nevertheless, the Heap data structure itself is
enormously used. See Applications of Heap Data Structure
Other Sorting Algorithms on GeeksforGeeks/GeeksQuiz:
Selection Sort
Bubble Sort
Insertion Sort
Merge Sort
QuickSort
Radix Sort
Counting Sort
Bucket Sort
ShellSort
QuickSort
Like Merge Sort, QuickSort is a Divide and Conquer algorithm. It picks an element as pivot and partitions the given array around the picked
pivot. There are many different versions of quickSort that pick pivot in different ways.
1) Always pick first element as pivot.
2) Always pick last element as pivot (implemented below)
3) Pick a random element as pivot.
4) Pick median as pivot.
The key process in quickSort is partition(). Target of partitions is, given an array and an element x of array as pivot, put x at its correct position in
sorted array and put all smaller elements (smaller than x) before x, and put all greater elements (greater than x) after x. All this should be done in
linear time.
Partition Algorithm
There can be many ways to do partition, following code adopts the method given in CLRS book. The logic is simple, we start from the leftmost
element and keep track of index of smaller (or equal to) elements as i. While traversing, if we find a smaller element, we swap current element with
arr[i]. Otherwise we ignore current element.
Implementation:
Following is C++ implementation of QuickSort.
/* A typical recursive implementation of quick sort */
#include<stdio.h>
// A utility function to swap two elements
void swap(int* a, int* b)
{
int t = *a;
*a = *b;
*b = t;
}
/* This function takes last element as pivot, places the pivot element at its
correct position in sorted array, and places all smaller (smaller than pivot)
to left of pivot and all greater elements to right of pivot */
int partition (int arr[], int l, int h)
{
int x = arr[h]; // pivot
int i = (l - 1); // Index of smaller element
for (int j = l; j <= h- 1; j++)
{
// If current element is smaller than or equal to pivot
if (arr[j] <= x)
{
i++; // increment index of smaller element
swap(&arr[i], &arr[j]); // Swap current element with element at index i
}
}
swap(&arr[i + 1], &arr[h]);
return (i + 1);
}
/* arr[] --> Array to be sorted, l --> Starting index, h --> Ending index */
void quickSort(int arr[], int l, int h)
{
if (l < h)
{
int p = partition(arr, l, h); /* Partitioning index */
quickSort(arr, l, p - 1);
quickSort(arr, p + 1, h);
}
}
/* Function to print an array */
void printArray(int arr[], int size)
{
int i;
for (i=0; i < size; i++)
printf("%d ", arr[i]);
printf("\n");
}
// Driver program to test above functions
int main()
{
int arr[] = {10, 7, 8, 9, 1, 5};
int n = sizeof(arr)/sizeof(arr[0]);
quickSort(arr, 0, n-1);
printf("Sorted array: \n");
printArray(arr, n);
return 0;
}
Output:
Sorted array:
1 5 7 8 9 10
Analysis of QuickSort
Time taken by QuickSort in general can be written as following.
T(n) = T(k) + T(n-k-1) + Θ(n)
The first two terms are for two recursive calls, the last term is for the partition process. k is the number of elements which are smaller than pivot.
The time taken by QuickSort depends upon the input array and partition strategy. Following are three cases.
Worst Case: The worst case occurs when the partition process always picks greatest or smallest element as pivot. If we consider above partition
strategy where last element is always picked as pivot, the worst case would occur when the array is already sorted in increasing or decreasing
order. Following is recurrence for worst case.
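T(n) = T(0) + T(n-1) + Θ(n)
which is equivalent to
T(n) = T(n-1) + Θ(n)
The solution of above recurrence is Θ(n^2).
Best Case: The best case occurs when the partition process always picks the middle element as pivot. Following is recurrence for best case.
T(n) = 2T(n/2) + Θ(n)
The solution of above recurrence is Θ(nLogn). It can be solved using case 2 of Master Theorem.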
Average Case:
To do average case analysis, we need to consider all possible permutations of the array and calculate the time taken by every permutation, which doesn't
look easy.
We can get an idea of average case by considering the case when partition puts O(n/10) elements in one set and O(9n/10) elements in other set.
Following is recurrence for this case.
T(n) = T(n/10) + T(9n/10) + Θ(n)
The solution of above recurrence is also O(nLogn).
Radix Sort
The lower bound for Comparison based sorting algorithm (Merge Sort, Heap Sort, Quick-Sort .. etc) is Ω(nLogn), i.e., they cannot do better than
nLogn.
Counting sort is a linear time sorting algorithm that sorts in O(n+k) time when elements are in range from 1 to k.
What if the elements are in range from 1 to n^2?
We can't use counting sort because counting sort will take O(n^2) which is worse than comparison based sorting algorithms. Can we sort such an
array in linear time?
Radix Sort is the answer. The idea of Radix Sort is to do digit by digit sort starting from least significant digit to most significant digit. Radix sort
uses counting sort as a subroutine to sort.
The Radix Sort Algorithm
1) Do following for each digit i where i varies from least significant digit to the most significant digit.
.a) Sort input array using counting sort (or any stable sort) according to the ith digit.
Example:
Original, unsorted list:
170, 45, 75, 90, 802, 24, 2, 66
Sorting by least significant digit (1s place) gives: [*Notice that we keep 802 before 2, because 802 occurred before 2 in the original list, and
similarly for pairs 170 & 90 and 45 & 75.]
170, 90, 802, 2, 24, 45, 75, 66
Sorting by next digit (10s place) gives: [*Notice that 802 again comes before 2 as 802 comes before 2 in the previous list.]
802, 2, 24, 45, 66, 170, 75, 90
Sorting by most significant digit (100s place) gives:
2, 24, 45, 66, 75, 90, 170, 802
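The digit-by-digit passes in this example can be sketched in code. The following is a minimal sketch under stated assumptions: inputs are non-negative integers, the base is fixed at 10, and the names countingSortByDigit and radixSort are chosen for this illustration rather than taken from the original listing.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <utility>
#include <vector>

// Stable counting sort of non-negative ints by the decimal digit selected
// by exp (1 for the 1s place, 10 for the 10s place, and so on).
void countingSortByDigit(std::vector<int>& a, int exp)
{
    std::array<std::size_t, 10> count{};           // all zeros
    for (int v : a) ++count[(v / exp) % 10];
    // Prefix sums: count[d] becomes the end position of digit d's block.
    for (std::size_t d = 1; d < 10; ++d) count[d] += count[d - 1];
    std::vector<int> output(a.size());
    // Walk the input backwards so that equal digits keep their order.
    for (std::size_t i = a.size(); i-- > 0; ) {
        int d = (a[i] / exp) % 10;
        output[--count[d]] = a[i];
    }
    a = std::move(output);
}

// LSD radix sort: one stable pass per decimal digit of the maximum value.
void radixSort(std::vector<int>& a)
{
    if (a.empty()) return;
    int mx = *std::max_element(a.begin(), a.end());
    // exp is kept in a wider type so that exp *= 10 cannot overflow int.
    for (long long exp = 1; mx / exp > 0; exp *= 10)
        countingSortByDigit(a, static_cast<int>(exp));
}
```

A single call countingSortByDigit(a, 1) on the list above reproduces the first intermediate line (170, 90, 802, 2, 24, 45, 75, 66), and radixSort produces the final sorted list.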
What is the running time of Radix Sort?
Let there be d digits in input integers. Radix Sort takes O(d*(n+b)) time where b is the base for representing numbers, for example, for decimal
system, b is 10. What is the value of d? If k is the maximum possible value, then d would be O(log_b(k)). So overall time complexity is O((n+b) *
log_b(k)), which looks more than the time complexity of comparison based sorting algorithms for a large k. Let us first limit k. Let k <= n^c where c
is a constant. In that case, the complexity becomes O(n log_b(n)). But it still doesn't beat comparison based sorting algorithms.
What if we make the value of b larger? What should be the value of b to make the time complexity linear? If we set b as n, we get the time complexity
as O(n). In other words, we can sort an array of integers with range from 1 to n^c in linear time if the numbers are represented in base n (or every digit takes
log2(n) bits).
Is Radix Sort preferable to Comparison based sorting algorithms like Quick-Sort?
If we have log2(n) bits for every digit, the running time of Radix Sort appears to be better than Quick Sort for a wide range of input numbers. However, the
constant factors hidden in asymptotic notation are higher for Radix Sort, and Quick-Sort uses hardware caches more effectively. Also, Radix Sort
uses counting sort as a subroutine and counting sort takes extra space to sort numbers.
Implementation of Radix Sort
Following is a simple C++ implementation of Radix Sort. For simplicity, the value of b (the base) is assumed to be 10. We recommend you to see Counting
Sort for details of countSort() function in below code.
C/C++
// C++ implementation of Radix Sort
#include<iostream>
using namespace std;
// A utility function to get maximum value in arr[]
int getMax(int arr[], int n)
{
int mx = arr[0];
for (int i = 1; i < n; i++)
if (arr[i] > mx)
mx = arr[i];
return mx;
}
Java
// Radix sort Java implementation
import java.io.*;
import java.util.*;
class Radix {
// A utility function to get maximum value in arr[]
static int getMax(int arr[], int n)
{
int mx = arr[0];
for (int i = 1; i < n; i++)
if (arr[i] > mx)
mx = arr[i];
return mx;
}
// A function to do counting sort of arr[] according to
// the digit represented by exp.
2 24 45 66 75 90 170 802
http://en.wikipedia.org/wiki/Radix_sort
http://alg12.wikischolars.columbia.edu/file/view/RADIX.pdf
MIT Video Lecture
Introduction to Algorithms 3rd Edition by Clifford Stein, Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest
Counting Sort
Counting sort is a sorting technique based on keys between a specific range. It works by counting the number of objects having distinct key values
(kind of hashing). Then doing some arithmetic to calculate the position of each object in the output sequence.
Let us understand it with the help of an example.
For simplicity, consider the data in the range 0 to 9.
Input data: 1, 4, 1, 2, 7, 5, 2
1) Take a count array to store the count of each unique object.
Index: 0 1 2 3 4 5 6 7 8 9
Count: 0 2 2 0 1 1 0 1 0 0
2) Modify the count array such that each element at each index
stores the sum of previous counts.
Index: 0 1 2 3 4 5 6 7 8 9
Count: 0 2 4 4 5 6 6 7 7 7
The modified count array indicates the position of each object in
the output sequence.
3) Output each object from the input sequence followed by
decreasing its count by 1.
Process the input data: 1, 4, 1, 2, 7, 5, 2. Position of 1 is 2.
Put data 1 at index 2 in output. Decrease count by 1 to place
next data 1 at an index 1 smaller than this index.
Time Complexity: O(n+k) where n is the number of elements in input array and k is the range of input.
Auxiliary Space: O(n+k)
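The three steps of the example can be sketched as below. This is a hedged illustration, not the original program: it assumes every key lies in the range 0 to maxValue, returns a new vector instead of printing, and uses the illustrative name countingSort. It walks the input from the end, which is one common way to keep equal keys in their original order.

```cpp
#include <cstddef>
#include <vector>

// Sort keys that are assumed to lie in [0, maxValue].
std::vector<int> countingSort(const std::vector<int>& input, int maxValue)
{
    // Step 1: count how often each key occurs.
    std::vector<std::size_t> count(static_cast<std::size_t>(maxValue) + 1, 0);
    for (int v : input) ++count[v];
    // Step 2: prefix sums, so count[v] is the number of keys <= v.
    for (std::size_t i = 1; i < count.size(); ++i) count[i] += count[i - 1];
    // Step 3: place each key at position count[key] - 1 (0-based) and
    // decrease the count so the next equal key lands one slot earlier.
    std::vector<int> output(input.size());
    for (std::size_t i = input.size(); i-- > 0; )
        output[--count[input[i]]] = input[i];
    return output;
}
```

For the input 1, 4, 1, 2, 7, 5, 2 with maxValue 9, the result is 1, 1, 2, 2, 4, 5, 7.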
Points to be noted:
1. Counting sort is efficient if the range of input data is not significantly greater than the number of objects to be sorted. Consider the situation
where the input sequence is between range 1 to 10K and the data is 10, 5, 10K, 5K.
2. It is not a comparison based sort. Its running time complexity is O(n+k), with space proportional to the range of data.
3. It is often used as a sub-routine to another sorting algorithm like radix sort.
4. Counting sort uses a partial hashing to count the occurrence of the data object in O(1).
5. Counting sort can be extended to work for negative inputs also.
Exercise:
1. Modify above code to sort the input data in the range from M to N.
2. Modify above code to sort negative input data.
3. Is counting sort stable and online?
4. Thoughts on parallelizing the counting sort algorithm.
Bucket Sort
Bucket sort is mainly useful when input is uniformly distributed over a range. For example, consider the following problem.
Sort a large set of floating point numbers which are in range from 0.0 to 1.0 and are uniformly distributed across the range. How do we
sort the numbers efficiently?
A simple way is to apply a comparison based sorting algorithm. The lower bound for Comparison based sorting algorithm (Merge Sort, Heap
Sort, Quick-Sort .. etc) is Ω(n Log n), i.e., they cannot do better than nLogn.
Can we sort the array in linear time? Counting sort can not be applied here as we use keys as index in counting sort. Here keys are floating point
numbers.
The idea is to use bucket sort. Following is bucket algorithm.
bucketSort(arr[], n)
1) Create n empty buckets (Or lists).
2) Do following for every array element arr[i].
.......a) Insert arr[i] into bucket[n*arr[i]]
3) Sort individual buckets using insertion sort.
4) Concatenate all sorted buckets.
Following diagram (taken from CLRS book) demonstrates working of bucket sort.
Time Complexity: If we assume that insertion in a bucket takes O(1) time then steps 1 and 2 of the above algorithm clearly take O(n) time. The
O(1) is easily possible if we use a linked list to represent a bucket (In the following code, C++ vector is used for simplicity). Step 4 also takes
O(n) time as there will be n items in all buckets.
The main step to analyze is step 3. This step also takes O(n) time on average if all numbers are uniformly distributed (please refer CLRS book for
more details)
Following is C++ implementation of the above algorithm.
// C++ program to sort an array using bucket sort
#include <iostream>
#include <algorithm>
#include <vector>
using namespace std;
// Function to sort arr[] of size n using bucket sort
void bucketSort(float arr[], int n)
{
// 1) Create n empty buckets (a vector of vectors, since an array
// of runtime size is not standard C++)
vector<vector<float> > b(n);
// 2) Put array elements in different buckets
for (int i=0; i<n; i++)
{
int bi = n*arr[i]; // Index in bucket
b[bi].push_back(arr[i]);
}
// 3) Sort individual buckets
for (int i=0; i<n; i++)
sort(b[i].begin(), b[i].end());
// 4) Concatenate all buckets into arr[]
int index = 0;
for (int i = 0; i < n; i++)
for (int j = 0; j < b[i].size(); j++)
arr[index++] = b[i][j];
}
/* Driver program to test above function */
int main()
{
float arr[] = {0.897, 0.565, 0.656, 0.1234, 0.665, 0.3434};
int n = sizeof(arr)/sizeof(arr[0]);
bucketSort(arr, n);
cout << "Sorted array is \n";
for (int i=0; i<n; i++)
cout << arr[i] << " ";
return 0;
}
Output:
Sorted array is
0.1234 0.3434 0.565 0.656 0.665 0.897
References:
Introduction to Algorithms 3rd Edition by Clifford Stein, Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest
http://en.wikipedia.org/wiki/Bucket_sort
ShellSort
ShellSort is mainly a variation of Insertion Sort. In insertion sort, we move elements only one position ahead. When an element has to be moved far
ahead, many movements are involved. The idea of shellSort is to allow exchange of far items. In shellSort, we make the array h-sorted for a large
value of h. We keep reducing the value of h until it becomes 1. An array is said to be h-sorted if all sublists of every hth element are sorted.
Following is C++ implementation of ShellSort.
#include <iostream>
using namespace std;
/* function to sort arr using shellSort */
int shellSort(int arr[], int n)
{
// Start with a big gap, then reduce the gap
for (int gap = n/2; gap > 0; gap /= 2)
{
// Do a gapped insertion sort for this gap size.
// The first gap elements a[0..gap-1] are already in gapped order
// keep adding one more element until the entire array is
// gap sorted
for (int i = gap; i < n; i += 1)
{
// add a[i] to the elements that have been gap sorted
// save a[i] in temp and make a hole at position i
int temp = arr[i];
// shift earlier gap-sorted elements up until the correct
// location for a[i] is found
int j;
for (j = i; j >= gap && arr[j - gap] > temp; j -= gap)
arr[j] = arr[j - gap];
// put temp (the original a[i]) in its correct location
arr[j] = temp;
}
}
return 0;
}
void printArray(int arr[], int n)
{
for (int i=0; i<n; i++)
cout << arr[i] << " ";
}
int main()
{
int arr[] = {12, 34, 54, 2, 3};
int n = sizeof(arr)/sizeof(arr[0]);
cout << "Array before sorting: \n";
printArray(arr, n);
shellSort(arr, n);
cout << "\nArray after sorting: \n";
printArray(arr, n);
return 0;
}
Output:
Array before sorting:
12 34 54 2 3
Array after sorting:
2 3 12 34 54
Time Complexity: Time complexity of above implementation of shellsort is O(n^2). In the above implementation gap is reduced by half in every
iteration. There are many other ways to reduce gap which lead to better time complexity. See the references below for more details.
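As one example of a different gap sequence, the sketch below uses Knuth's gaps 1, 4, 13, 40, ... (each gap is 3 times the previous one plus 1), which are known to give an O(n^(3/2)) worst case. The function name shellSortKnuth and the use of std::vector are assumptions made for this illustration; the inner gapped insertion sort is the same idea as in the implementation above.

```cpp
#include <cstddef>
#include <vector>

// Shellsort using Knuth's gap sequence instead of repeated halving.
void shellSortKnuth(std::vector<int>& a)
{
    std::size_t n = a.size();
    // Find the largest gap of the form 1, 4, 13, 40, ... below n/3.
    std::size_t gap = 1;
    while (gap < n / 3) gap = 3 * gap + 1;
    // Integer division by 3 steps back through the same sequence
    // (13 -> 4 -> 1 -> 0), so the last pass is a plain insertion sort.
    for (; gap > 0; gap /= 3) {
        for (std::size_t i = gap; i < n; ++i) {
            int temp = a[i];
            std::size_t j = i;
            // Shift earlier gap-sorted elements up until temp fits.
            for (; j >= gap && a[j - gap] > temp; j -= gap)
                a[j] = a[j - gap];
            a[j] = temp;
        }
    }
}
```

On the sample array 12, 34, 54, 2, 3 this produces 2, 3, 12, 34, 54, the same result as the halving version.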
References:
https://www.youtube.com/watch?v=pGhazjsFW28
http://en.wikipedia.org/wiki/Shellsort
Therefore, any comparison based sorting algorithm must make Ω(nLog2n) comparisons in the worst case to sort the input array, and Heapsort and merge sort
are asymptotically optimal comparison sorts.
References:
Introduction to Algorithms, by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein
Find the Minimum length Unsorted Subarray, sorting which makes the complete array sorted
Given an unsorted array arr[0..n-1] of size n, find the minimum length subarray arr[s..e] such that sorting this subarray makes the whole array
sorted.
Examples:
1) If the input array is [10, 12, 20, 30, 25, 40, 32, 31, 35, 50, 60], your program should be able to find that the subarray lies between the indexes
3 and 8.
2) If the input array is [0, 1, 15, 25, 6, 7, 30, 40, 50], your program should be able to find that the subarray lies between the indexes 2 and 5.
Solution:
1) Find the candidate unsorted subarray
a) Scan from left to right and find the first element which is greater than the next element. Let s be the index of such an element. In the above
example 1, s is 3 (index of 30).
b) Scan from right to left and find the first element (first in right to left order) which is smaller than the next element (next in right to left order). Let e
be the index of such an element. In the above example 1, e is 7 (index of 31).
2) Check whether sorting the candidate unsorted subarray makes the complete array sorted or not. If not, then include more
elements in the subarray.
a) Find the minimum and maximum values in arr[s..e]. Let minimum and maximum values be min and max. min and max for [30, 25, 40, 32, 31]
are 25 and 40 respectively.
b) Find the first element (if there is any) in arr[0..s-1] which is greater than min, change s to index of this element. There is no such element in
above example 1.
c) Find the last element (if there is any) in arr[e+1..n-1] which is smaller than max, change e to index of this element. In the above example 1, e is
changed to 8 (index of 35)
3) Print s and e.
Implementation:
#include<stdio.h>
void printUnsorted(int arr[], int n)
{
int s = 0, e = n-1, i, max, min;
// step 1(a) of above algo
for (s = 0; s < n-1; s++)
{
if (arr[s] > arr[s+1])
break;
}
if (s == n-1)
{
printf("The complete array is sorted");
return;
}
// step 1(b) of above algo
for(e = n - 1; e > 0; e--)
{
if(arr[e] < arr[e-1])
break;
}
// step 2(a) of above algo
max = arr[s]; min = arr[s];
for(i = s + 1; i <= e; i++)
{
if(arr[i] > max)
max = arr[i];
if(arr[i] < min)
min = arr[i];
}
// step 2(b) of above algo
for( i = 0; i < s; i++)
{
if(arr[i] > min)
{
s = i;
break;
}
}
// step 2(c) of above algo
for( i = n -1; i >= e+1; i--)
{
if(arr[i] < max)
{
e = i;
break;
}
}
// step 3 of above algo
printf("The unsorted subarray which makes the given array "
"sorted lies between the indexes %d and %d", s, e);
return;
}
int main()
{
int arr[] = {10, 12, 20, 30, 25, 40, 32, 31, 35, 50, 60};
int arr_size = sizeof(arr)/sizeof(arr[0]);
printUnsorted(arr, arr_size);
getchar();
return 0;
}
{
result = b;
result->next = SortedMerge(a, b->next);
}
return(result);
}
/* UTILITY FUNCTIONS */
/* Split the nodes of the given list into front and back halves,
and return the two lists using the reference parameters.
If the length is odd, the extra node should go in the front list.
Uses the fast/slow pointer strategy. */
void FrontBackSplit(struct node* source,
struct node** frontRef, struct node** backRef)
{
struct node* fast;
struct node* slow;
if (source==NULL || source->next==NULL)
{
/* length < 2 cases */
*frontRef = source;
*backRef = NULL;
}
else
{
slow = source;
fast = source->next;
/* Advance 'fast' two nodes, and advance 'slow' one node */
while (fast != NULL)
{
fast = fast->next;
if (fast != NULL)
{
slow = slow->next;
fast = fast->next;
}
}
/* 'slow' is before the midpoint in the list, so split it in two
at that point. */
*frontRef = source;
*backRef = slow->next;
slow->next = NULL;
}
}
/* Function to print nodes in a given linked list */
void printList(struct node *node)
{
while(node!=NULL)
{
printf("%d ", node->data);
node = node->next;
}
}
/* Function to insert a node at the beginning of the linked list */
void push(struct node** head_ref, int new_data)
{
/* allocate node */
struct node* new_node =
(struct node*) malloc(sizeof(struct node));
/* put in the data */
new_node->data = new_data;
/* link the old list off the new node */
new_node->next = (*head_ref);
/* move the head to point to the new node */
(*head_ref) = new_node;
}
/* Driver program to test above functions */
int main()
{
/* Start with the empty list */
struct node* res = NULL;
struct node* a = NULL;
The inner loop will run at most k times. To move every element to its correct place, at most k elements need to be moved. So overall complexity
will be O(nk)
We can sort such arrays more efficiently with the help of Heap data structure. Following is the detailed process that uses Heap.
1) Create a Min Heap of size k+1 with first k+1 elements. This will take O(k) time (See this GFact)
2) One by one remove min element from heap, put it in result array, and add a new element to heap from remaining elements.
Removing an element and adding a new element to min heap will take Logk time. So overall complexity will be O(k) + O((n-k)*logK)
#include<iostream>
using namespace std;
// Prototype of a utility function to swap two integers
void swap(int *x, int *y);
// A class for Min Heap
class MinHeap
{
int *harr; // pointer to array of elements in heap
int heap_size; // size of min heap
public:
// Constructor
MinHeap(int a[], int size);
// to heapify a subtree with root at given index
void MinHeapify(int );
// to get index of left child of node at index i
int left(int i) { return (2*i + 1); }
// to get index of right child of node at index i
int right(int i) { return (2*i + 2); }
// to remove min (or root), add a new value x, and return old root
int replaceMin(int x);
// to extract the root which is the minimum element
int extractMin();
};
// Given an array of size n, where every element is at most k away from its
// target position, sorts the array in O(nLogk) time.
void sortK(int arr[], int n, int k)
{
// Create a Min Heap of first (k+1) elements from the input
// array (or of all n elements when k+1 is greater than n)
int size = (k + 1 < n) ? k + 1 : n;
int *harr = new int[size];
for (int i = 0; i < size; i++)
harr[i] = arr[i];
MinHeap hp(harr, size);
// i is index for remaining elements in arr[] and ti
// is target index for the current minimum element in
// Min Heap 'hp'.
for(int i = size, ti = 0; ti < n; i++, ti++)
{
// If there are remaining elements, then place
// root of heap at target index and add arr[i]
// to Min Heap
if (i < n)
arr[ti] = hp.replaceMin(arr[i]);
// Otherwise place root at its target index and
// reduce heap size
else
arr[ti] = hp.extractMin();
}
delete[] harr; // release the array that backed the heap
}
// FOLLOWING ARE IMPLEMENTATIONS OF STANDARD MIN HEAP METHODS FROM CORMEN BOOK
// Constructor: Builds a heap from a given array a[] of given size
MinHeap::MinHeap(int a[], int size)
{
heap_size = size;
harr = a; // store address of array
int i = (heap_size - 1)/2;
while (i >= 0)
{
MinHeapify(i);
i--;
}
}
// Method to remove minimum element (or root) from min heap
int MinHeap::extractMin()
{
int root = harr[0];
if (heap_size > 1)
{
harr[0] = harr[heap_size-1];
heap_size--;
MinHeapify(0);
}
return root;
}
// Method to change root with given value x, and return the old root
int MinHeap::replaceMin(int x)
{
int root = harr[0];
harr[0] = x;
if (root < x)
MinHeapify(0);
return root;
}
// A recursive method to heapify a subtree with root at given index
// This method assumes that the subtrees are already heapified
void MinHeap::MinHeapify(int i)
{
int l = left(i);
int r = right(i);
int smallest = i;
if (l < heap_size && harr[l] < harr[i])
smallest = l;
if (r < heap_size && harr[r] < harr[smallest])
smallest = r;
if (smallest != i)
{
swap(&harr[i], &harr[smallest]);
MinHeapify(smallest);
}
}
// A utility function to swap two elements
void swap(int *x, int *y)
{
int temp = *x;
*x = *y;
*y = temp;
}
// A utility function to print array elements
void printArray(int arr[], int size)
{
for (int i=0; i < size; i++)
cout << arr[i] << " ";
cout << endl;
}
// Driver program to test above functions
int main()
{
int k = 3;
int arr[] = {2, 6, 3, 12, 56, 8};
int n = sizeof(arr)/sizeof(arr[0]);
sortK(arr, n, k);
cout << "Following is sorted array\n";
printArray (arr, n);
return 0;
}
Output:
Following is sorted array
2 3 6 8 12 56
The Min Heap based method takes O(nLogk) time and uses O(k) auxiliary space.
We can also use a Balanced Binary Search Tree instead of Heap to store k+1 elements. The insert and delete operations on Balanced BST
also take O(Logk) time. So Balanced BST based method will also take O(nLogk) time, but the Heap based method seems to be more efficient
as the minimum element will always be at root. Also, Heap doesn't need extra space for left and right pointers.
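The same idea can also be sketched with the standard library's std::priority_queue instead of a hand-written MinHeap class. This is an illustrative alternative under the same assumption (every element is at most k positions away from its sorted position); the function name sortKWithPriorityQueue is made up for this sketch.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Sort a k-sorted array in place using a min heap of at most k+1 elements.
void sortKWithPriorityQueue(std::vector<int>& a, std::size_t k)
{
    // std::greater turns the default max heap into a min heap.
    std::priority_queue<int, std::vector<int>, std::greater<int>> heap;
    std::size_t target = 0;   // next position to fill in sorted order
    for (std::size_t i = 0; i < a.size(); ++i) {
        heap.push(a[i]);
        // Once the heap holds k+1 elements, its minimum is final.
        // target never exceeds i, so only already-read slots are overwritten.
        if (heap.size() > k) {
            a[target++] = heap.top();
            heap.pop();
        }
    }
    // Drain the remaining elements in increasing order.
    while (!heap.empty()) {
        a[target++] = heap.top();
        heap.pop();
    }
}
```

With the driver data above (2, 6, 3, 12, 56, 8 and k = 3) the array becomes 2, 3, 6, 8, 12, 56.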
Output:
1 2 2 3 3 3 4 5
The above mentioned optimizations for recursive quick sort can also be applied to iterative version.
1) Partition process is same in both recursive and iterative. The same techniques to choose optimal pivot can also be applied to iterative version.
2) To reduce the stack size, push the indexes of the larger half first, so that the smaller half is on top of the stack and is processed first.
3) Use insertion sort when the size reduces below an experimentally calculated threshold.
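A rough sketch of an iterative quick sort with these optimizations is given below. It is not the original program: the helper names, the std::stack of index pairs, and the default cutoff of 8 are all assumptions for illustration (a real threshold would be found by measurement). The partition is the same last-element scheme used for the recursive version.

```cpp
#include <stack>
#include <utility>
#include <vector>

// Lomuto partition of a[lo..hi] with the last element as pivot.
int partitionRange(std::vector<int>& a, int lo, int hi)
{
    int pivot = a[hi];
    int i = lo - 1;
    for (int j = lo; j < hi; ++j)
        if (a[j] <= pivot) std::swap(a[++i], a[j]);
    std::swap(a[i + 1], a[hi]);
    return i + 1;
}

// Plain insertion sort of a[lo..hi], used for small ranges.
void insertionSortRange(std::vector<int>& a, int lo, int hi)
{
    for (int i = lo + 1; i <= hi; ++i) {
        int key = a[i];
        int j = i - 1;
        while (j >= lo && a[j] > key) { a[j + 1] = a[j]; --j; }
        a[j + 1] = key;
    }
}

// Iterative quick sort with an explicit stack of [lo, hi] ranges.
void quickSortIterative(std::vector<int>& a, int cutoff = 8)
{
    if (a.size() < 2) return;
    std::stack<std::pair<int, int>> ranges;
    ranges.push({0, static_cast<int>(a.size()) - 1});
    while (!ranges.empty()) {
        std::pair<int, int> r = ranges.top();
        ranges.pop();
        int lo = r.first, hi = r.second;
        // Small (or empty) ranges are finished with insertion sort.
        if (lo >= hi || hi - lo + 1 <= cutoff) {
            insertionSortRange(a, lo, hi);
            continue;
        }
        int p = partitionRange(a, lo, hi);
        // Push the larger side first so the smaller side is popped next;
        // this keeps the stack depth logarithmic in the array size.
        if (p - lo > hi - p) {
            ranges.push({lo, p - 1});
            ranges.push({p + 1, hi});
        } else {
            ranges.push({p + 1, hi});
            ranges.push({lo, p - 1});
        }
    }
}
```

Passing a small cutoff such as 2 forces the partition path to run even on short arrays, which is convenient when experimenting.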
References:
http://en.wikipedia.org/wiki/Quicksort
// During partition, both the head and end of the list might change
// which is updated in the newHead and newEnd variables
while (cur != pivot)
{
if (cur->data < pivot->data)
{
// First node that has a value less than the pivot - becomes
// the new head
if ((*newHead) == NULL)
(*newHead) = cur;
prev = cur;
cur = cur->next;
}
else // If cur node is greater than pivot
{
// Move cur node to next of tail, and change tail
if (prev)
prev->next = cur->next;
struct node *tmp = cur->next;
cur->next = NULL;
tail->next = cur;
tail = cur;
cur = tmp;
}
}
// If the pivot data is the smallest element in the current list,
// pivot becomes the head
if ((*newHead) == NULL)
(*newHead) = pivot;
// Update newEnd to the current last node
(*newEnd) = tail;
// Return the pivot node
return pivot;
}
printList(a);
quickSort(&a);
cout << "Linked List after sorting \n";
printList(a);
return 0;
}
Output:
Linked List before sorting
30 3 4 20 5
Linked List after sorting
3 4 5 20 30
push(&a, 4);
push(&a, 3);
push(&a, 30);
cout << "Linked List before sorting \n";
printList(a);
quickSort(a);
cout << "Linked List after sorting \n";
printList(a);
return 0;
}
Output :
Linked List before sorting
30 3 4 20 5
Linked List after sorting
3 4 5 20 30
Time Complexity: Time complexity of the above implementation is same as time complexity of QuickSort() for arrays. It takes O(n^2) time in
worst case and O(nLogn) in average and best cases. The worst case occurs when the linked list is already sorted.
Can we implement random quick sort for linked list?
Quicksort can be implemented for Linked List only when we can pick a fixed point as pivot (like last element in above implementation). Random
QuickSort cannot be efficiently implemented for Linked Lists by picking random pivot.
Exercise:
The above implementation is for doubly linked list. Modify it for singly linked list. Note that we don't have prev pointer in singly linked list.
Refer QuickSort on Singly Linked List for solution.
Note that if the element is present in array, then it should not be in output, only the other closest elements are required.
In the following solutions, it is assumed that all elements of array are distinct.
A simple solution is to do linear search for k closest elements.
1) Start from the first element and search for the crossover point (The point before which elements are smaller than or equal to X and after which
elements are greater). This step takes O(n) time.
2) Once we find the crossover point, we can compare elements on both sides of crossover point to print k closest elements. This step takes O(k)
time.
The time complexity of the above solution is O(n).
An Optimized Solution is to find k elements in O(Logn + k) time. The idea is to use Binary Search to find the crossover point. Once we find
index of crossover point, we can print k closest elements in O(k) time.
#include<stdio.h>
/* Function to find the cross over point (the point before
which elements are smaller than or equal to x and after
which greater than x)*/
int findCrossOver(int arr[], int low, int high, int x)
{
// Base cases
if (arr[high] <= x) // x is greater than all
return high;
if (arr[low] > x) // x is smaller than all
return low;
// Find the middle point
int mid = (low + high)/2; /* low + (high - low)/2 */
/* If mid is the crossover point (arr[mid] <= x < arr[mid+1]),
then return mid */
if (arr[mid] <= x && arr[mid+1] > x)
return mid;
/* If x is greater than arr[mid], then the crossover point
lies in arr[mid+1...high] */
if(arr[mid] < x)
return findCrossOver(arr, mid+1, high, x);
return findCrossOver(arr, low, mid - 1, x);
}
// This function prints k closest elements to x in arr[].
// n is the number of elements in arr[]
void printKclosest(int arr[], int x, int k, int n)
{
// Find the crossover point
int l = findCrossOver(arr, 0, n-1, x); // Left index to search
int r = l+1; // Right index to search
int count = 0; // To keep track of count of elements already printed
// If x is present in arr[], then reduce left index
// Assumption: all elements in arr[] are distinct
if (arr[l] == x) l--;
// Compare elements on left and right of crossover
// point to find the k closest elements
while (l >= 0 && r < n && count < k)
{
if (x - arr[l] < arr[r] - x)
printf("%d ", arr[l--]);
else
printf("%d ", arr[r++]);
count++;
}
Output:
39 30 42 45
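Because the listing above stops inside the main loop, here is a hedged, self-contained sketch of the O(Logn + k) approach. It differs from the original in a few labeled ways: it returns the elements in a vector instead of printing them, it finds the crossover point with std::upper_bound rather than a hand-written recursion, and the name kClosest is an assumption. As in the text, all elements are assumed to be distinct and sorted.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Return the k elements of sorted arr[] closest to x, excluding x itself.
std::vector<int> kClosest(const std::vector<int>& arr, int x, std::size_t k)
{
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(arr.size());
    // r is the first index whose element is greater than x, so l = r - 1
    // is the crossover point (last element smaller than or equal to x).
    std::ptrdiff_t r = std::upper_bound(arr.begin(), arr.end(), x) - arr.begin();
    std::ptrdiff_t l = r - 1;
    // If x is present, skip it (elements are assumed distinct).
    if (l >= 0 && arr[l] == x) --l;
    std::vector<int> result;
    // Take whichever neighbour is closer to x until k elements are found.
    while (l >= 0 && r < n && result.size() < k) {
        if (x - arr[l] < arr[r] - x) result.push_back(arr[l--]);
        else result.push_back(arr[r++]);
    }
    // One side may be exhausted; fill up from the other side.
    while (l >= 0 && result.size() < k) result.push_back(arr[l--]);
    while (r < n && result.size() < k) result.push_back(arr[r++]);
    return result;
}
```

As a sample input chosen for this sketch, the array 12, 16, 22, 30, 35, 39, 42, 45, 48, 50, 53, 55, 56 with x = 35 and k = 4 yields 39, 30, 42, 45.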
Solution: If we use Counting Sort, it would take O(n^2) time as the given range is of size n^2. Using any comparison based sorting like Merge
Sort, Heap Sort, .. etc would take O(nLogn) time.
Now the question arises: how do we do this in O(n)? Firstly, is it possible? Can we use the data given in the question, i.e., n numbers in range from 0 to n^2 - 1?
The idea is to use Radix Sort. Following is standard Radix Sort algorithm.
1) Do following for each digit i where i varies from least
significant digit to the most significant digit.
..a) Sort input array using counting sort (or any stable
sort) according to the ith digit
Let there be d digits in input integers. Radix Sort takes O(d*(n+b)) time where b is the base for representing numbers, for example, for decimal
system, b is 10. Since n^2 - 1 is the maximum possible value, the value of d would be O(log_b(n)). So overall time complexity is O((n+b)*log_b(n)),
which looks more than the time complexity of comparison based sorting algorithms when b is a small constant. The idea is to change base b. If we set b as n,
the value of O(log_b(n)) becomes O(1) and overall time complexity becomes O(n).
arr[] = {0, 10, 13, 12, 7}
Let us consider the elements in base 5. For example 13 in
base 5 is 23, and 7 in base 5 is 12.
arr[] = {00(0), 20(10), 23(13), 22(12), 12(7)}
After first iteration (Sorting according to the last digit in
base 5), we get. Note that the sort is stable, so 22(12) stays before 12(7).
arr[] = {00(0), 20(10), 22(12), 12(7), 23(13)}
After second iteration, we get
arr[] = {00(0), 12(7), 20(10), 22(12), 23(13)}
Following is C++ implementation to sort an array of size n where elements are in range from 0 to n^2 - 1.
#include<iostream>
#include<vector>
using namespace std;
// A function to do counting sort of arr[] according to
// the digit represented by exp.
void countSort(int arr[], int n, int exp)
{
vector<int> output(n); // output array
vector<int> count(n, 0); // count of each base-n digit, initially 0
int i;
// Store count of occurrences in count[]
for (i = 0; i < n; i++)
count[ (arr[i]/exp)%n ]++;
// Change count[i] so that count[i] now contains actual
// position of this digit in output[]
for (i = 1; i < n; i++)
count[i] += count[i - 1];
// Build the output array (traverse from the end to keep the sort stable)
for (i = n - 1; i >= 0; i--)
{
output[count[ (arr[i]/exp)%n] - 1] = arr[i];
count[(arr[i]/exp)%n]--;
}
// Copy the output array to arr[], so that arr[] now
// contains sorted numbers according to current digit
for (i = 0; i < n; i++)
arr[i] = output[i];
}
// The main function that sorts arr[] of size n using Radix Sort
void sort(int arr[], int n)
{
// Do counting sort for first digit in base n. Note that
// instead of passing digit number, exp (n^0 = 1) is passed.
countSort(arr, n, 1);
// Do counting sort for second digit in base n. Note that
// instead of passing digit number, exp (n^1 = n) is passed.
countSort(arr, n, n);
}
// A utility function to print an array
void printArr(int arr[], int n)
{
for (int i = 0; i < n; i++)
cout << arr[i] << " ";
}
// Driver program to test above functions
int main()
{
// Since array size is 7, elements should be from 0 to 48
int arr[] = {40, 12, 45, 32, 33, 1, 22};
int n = sizeof(arr)/sizeof(arr[0]);
cout << "Given array is \n";
printArr(arr, n);
sort(arr, n);
cout << "\nSorted array is \n";
printArr(arr, n);
return 0;
}
Output:
Given array is
40 12 45 32 33 1 22
Sorted array is
1 12 22 32 33 40 45
The above looks fine except for one subtle thing: the expression m = (l+r)/2. It fails for large values of l and r. Specifically, it fails if the sum of low and high is greater than the maximum positive int value (2^31 - 1). The sum overflows to a negative value, and the value stays negative when divided by two. In C this causes an array index out of bounds with unpredictable results.
What is the way to resolve this problem?
Following is one way:
int mid = low + ((high - low) / 2);
Probably faster, and arguably as clear, is the following (works only in Java):
int mid = (low + high) >>> 1;
In C and C++ (where you don't have the >>> operator), you can do this:
mid = ((unsigned int)low + (unsigned int)high) >> 1;
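As a minimal sketch, the two C/C++ fixes can be wrapped in small functions and checked near the top of the int range. The function names `safeMid` and `unsignedMid` are hypothetical, and both versions assume 0 <= low <= high and a 32-bit int.

```cpp
#include <climits>

// Overflow-safe midpoint: high - low cannot overflow when 0 <= low <= high.
int safeMid(int low, int high) {
    return low + (high - low) / 2;
}

// Same midpoint via unsigned arithmetic. The sum of two non-negative ints
// always fits in an unsigned int, so the addition cannot wrap around.
int unsignedMid(int low, int high) {
    return static_cast<int>((static_cast<unsigned int>(low) +
                             static_cast<unsigned int>(high)) >> 1);
}
```

With low = INT_MAX - 1 and high = INT_MAX, both functions return INT_MAX - 1, whereas the naive (low + high) / 2 would overflow.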
Output:
error: size of array 'arr' is too large
Even when we try boolean array, the program compiles fine, but crashes when run in Windows 7.0 and Code Blocks 32 bit compiler
#include <stdbool.h>
int main()
{
bool arr[1<<30];
return 0;
}
http://locklessinc.com/articles/binary_search/
A simple solution is to linearly search the given key in the given array. Time complexity of this solution is O(n). We can modify binary search to do it in O(Logn) time.
The idea is to compare the key with middle 3 elements, if present then return the index. If not present, then compare the key with middle element
to decide whether to go in left half or right half. Comparing with middle element is enough as all the elements after mid+2 must be greater than
element mid and all elements before mid-2 must be smaller than mid element.
Following is C++ implementation of this approach.
// C++ program to find an element in an almost sorted array
#include <stdio.h>
// A recursive binary search based function. It returns index of x in
// given array arr[l..r] if present, otherwise -1
int binarySearch(int arr[], int l, int r, int x)
{
if (r >= l)
{
int mid = l + (r - l)/2;
//
if
if
if
Output:
Element is present at index 3
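The approach described above can be sketched as an iterative function. This is an illustrative version under the stated assumption that every element is at most one position away from its sorted position; the name `searchAlmostSorted` and the use of std::vector are choices made for this sketch and are not taken from the original listing.

```cpp
#include <vector>

// Returns the index of x in an almost sorted array (each element may be
// displaced by at most one position), or -1 if x is not present.
int searchAlmostSorted(const std::vector<int>& arr, int x) {
    int l = 0, r = static_cast<int>(arr.size()) - 1;
    while (l <= r) {
        int mid = l + (r - l) / 2;
        // Compare the key with the middle three elements
        if (arr[mid] == x) return mid;
        if (mid > l && arr[mid - 1] == x) return mid - 1;
        if (mid < r && arr[mid + 1] == x) return mid + 1;
        // mid-1, mid and mid+1 are ruled out, so skip two positions
        if (arr[mid] > x) r = mid - 2;
        else l = mid + 2;
    }
    return -1;
}
```

For the sample array {3, 2, 10, 4, 40} (chosen here for illustration) and x = 4, the first iteration inspects indexes 1, 2 and 3 and returns 3.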
A Simple Solution is to use sorting. First sort the input array, then swap all adjacent elements.
For example, let the input array be {3, 6, 5, 10, 7, 20}. After sorting, we get {3, 5, 6, 7, 10, 20}. After swapping adjacent elements, we get {5,
3, 7, 6, 20, 10}.
Below are implementations of this simple approach.
C++
// A C++ program to sort an array in wave form using a sorting function
#include<iostream>
#include<algorithm>
using namespace std;
// A utility method to swap two numbers.
void swap(int *x, int *y)
{
int temp = *x;
*x = *y;
*y = temp;
}
// This function sorts arr[0..n-1] in wave form, i.e.,
// arr[0] >= arr[1] <= arr[2] >= arr[3] <= arr[4] >= arr[5]..
void sortInWave(int arr[], int n)
{
// Sort the input array
sort(arr, arr+n);
// Swap adjacent elements
for (int i=0; i<n-1; i += 2)
swap(&arr[i], &arr[i+1]);
}
// Driver program to test above function
int main()
{
int arr[] = {10, 90, 49, 2, 1, 5, 23};
int n = sizeof(arr)/sizeof(arr[0]);
sortInWave(arr, n);
for (int i=0; i<n; i++)
cout << arr[i] << " ";
return 0;
}
Python
# Python function to sort the array arr[0..n-1] in wave form,
# i.e., arr[0] >= arr[1] <= arr[2] >= arr[3] <= arr[4] >= arr[5]
def sortInWave(arr, n):
    # Sort the array
    arr.sort()
    # Swap adjacent elements
    for i in range(0, n-1, 2):
        arr[i], arr[i+1] = arr[i+1], arr[i]
# Driver program
arr = [10, 90, 49, 2, 1, 5, 23]
sortInWave(arr, len(arr))
for i in range(0, len(arr)):
    print(arr[i], end=" ")
# This code is contributed by __Devesh Agrawal__
Output:
2 1 10 5 49 23 90
The time complexity of the above solution is O(nLogn) if an O(nLogn) sorting algorithm like Merge Sort or Heap Sort is used.
This can be done in O(n) time by doing a single traversal of the given array. The idea is based on the fact that if we make sure that all even positioned (at index 0, 2, 4, ..) elements are greater than their adjacent odd elements, we don't need to worry about odd positioned elements.
Following are simple steps.
1) Traverse all even positioned elements of input array, and do following.
.a) If current element is smaller than previous odd element, swap previous and current.
.b) If current element is smaller than next odd element, swap next and current.
Below are implementations of above simple algorithm.
C++
// A O(n) program to sort an input array in wave form
#include<iostream>
using namespace std;
// A utility method to swap two numbers.
void swap(int *x, int *y)
{
int temp = *x;
*x = *y;
*y = temp;
}
// This function sorts arr[0..n-1] in wave form, i.e., arr[0] >=
// arr[1] <= arr[2] >= arr[3] <= arr[4] >= arr[5] ....
void sortInWave(int arr[], int n)
{
// Traverse all even elements
for (int i = 0; i < n; i+=2)
{
// If current even element is smaller than previous
if (i>0 && arr[i-1] > arr[i] )
swap(&arr[i], &arr[i-1]);
// If current even element is smaller than next
if (i<n-1 && arr[i] < arr[i+1] )
swap(&arr[i], &arr[i + 1]);
}
}
// Driver program to test above function
int main()
{
int arr[] = {10, 90, 49, 2, 1, 5, 23};
int n = sizeof(arr)/sizeof(arr[0]);
sortInWave(arr, n);
for (int i=0; i<n; i++)
cout << arr[i] << " ";
return 0;
}
Python
# Python function to sort the array arr[0..n-1] in wave form,
# i.e., arr[0] >= arr[1] <= arr[2] >= arr[3] <= arr[4] >= arr[5]
def sortInWave(arr, n):
Output:
90 10 49 1 5 2 23
The following is recursive formula for counting comparisons in worst case of Ternary Search.
T(n) = T(n/3) + 4, T(1) = 1
In binary search, there are 2*log_2(n) + 1 comparisons in the worst case. In ternary search, there are 4*log_3(n) + 1 comparisons in the worst case.
Therefore, the comparison of Ternary and Binary Searches boils down to the comparison of the expressions 2*log_3(n) and log_2(n). The value of 2*log_3(n) can be written as (2 / log_2(3)) * log_2(n). Since the value of (2 / log_2(3)) is more than one, Ternary Search does more comparisons than Binary Search in the worst case.
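The two worst-case formulas above can be evaluated numerically with a small sketch. The function names are hypothetical, and the code only evaluates the closed-form expressions from the text; it does not instrument an actual search.

```cpp
#include <cmath>

// Worst-case comparison count of binary search: 2*log_2(n) + 1.
double binaryComparisons(double n) {
    return 2.0 * std::log2(n) + 1.0;
}

// Worst-case comparison count of ternary search: 4*log_3(n) + 1.
double ternaryComparisons(double n) {
    return 4.0 * std::log(n) / std::log(3.0) + 1.0;
}
```

For n = 1000 this gives roughly 20.9 comparisons for binary search and roughly 26.2 for ternary search, in line with the factor 2 / log_2(3), which is about 1.26.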
Exercise:
Why does Merge Sort divide the input array into two halves, and not into three or more parts?
i++;
}
}
swap(&arr[i], &arr[r]);
return i;
}
// Picks a random pivot element between l and r and partitions
// arr[l..r] around the randomly picked element using partition()
int randomPartition(int arr[], int l, int r)
{
int n = r-l+1;
int pivot = rand() % n;
swap(&arr[l + pivot], &arr[r]);
return partition(arr, l, r);
}
// Driver program to test above methods
int main()
{
int arr[] = {12, 3, 5, 7, 4, 19, 26};
int n = sizeof(arr)/sizeof(arr[0]), k = 3;
cout << "K'th smallest element is " << kthSmallest(arr, 0, n-1, k);
return 0;
}
Output:
K'th smallest element is 5
Time Complexity:
The worst case time complexity of the above solution is still O(n^2). In the worst case, the randomized function may always pick a corner element. The expected time complexity of the above randomized QuickSelect is Θ(n); see the CLRS book or the MIT video lecture for a proof. The assumption in the analysis is that the random number generator is equally likely to generate any number in the input range.
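A compact, self-contained sketch of randomized QuickSelect is given here for reference. It is an illustrative variant, not the original listing: the names `partitionLast` and `kthSmallestRandom` are hypothetical, the array is passed as a std::vector, and the recursion is written as a loop. It assumes 1 <= k <= r - l + 1.

```cpp
#include <cstdlib>
#include <utility>
#include <vector>

// Lomuto partition of a[l..r] around a[r]; returns the pivot's final index.
int partitionLast(std::vector<int>& a, int l, int r) {
    int x = a[r], i = l;
    for (int j = l; j < r; ++j)
        if (a[j] <= x) std::swap(a[i++], a[j]);
    std::swap(a[i], a[r]);
    return i;
}

// Returns the k'th smallest (1-based) element of a[l..r].
int kthSmallestRandom(std::vector<int>& a, int l, int r, int k) {
    while (true) {
        // Move a randomly chosen element to the end and partition around it
        int pivot = l + std::rand() % (r - l + 1);
        std::swap(a[pivot], a[r]);
        int pos = partitionLast(a, l, r);
        int rank = pos - l + 1;           // rank of the pivot within a[l..r]
        if (rank == k) return a[pos];
        if (rank > k) r = pos - 1;        // answer lies in the left part
        else { k -= rank; l = pos + 1; }  // answer lies in the right part
    }
}
```

For the array {12, 3, 5, 7, 4, 19, 26} and k = 3 the function returns 5, whichever pivots happen to be chosen.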
Sources:
MIT Video Lecture on Order Statistics, Median
Introduction to Algorithms by Clifford Stein, Thomas H. Cormen, Charles E. Leiserson, Ronald L.
Kth Smallest/Largest Element in Unsorted Array | Set 3 (Worst Case Linear Time)
We recommend reading the following posts as a prerequisite for this post.
Kth Smallest/Largest Element in Unsorted Array | Set 1
Kth Smallest/Largest Element in Unsorted Array | Set 2 (Expected Linear Time)
Given an array and a number k where k is smaller than the size of the array, we need to find the kth smallest element in the given array. It is given that all array elements are distinct.
Examples:
Input: arr[] = {7, 10, 4, 3, 20, 15}
k = 3
Output: 7
Input: arr[] = {7, 10, 4, 3, 20, 15}
k = 4
Output: 10
In the previous post, we discussed an expected linear time algorithm. In this post, a worst case linear time method is discussed. The idea in this new method is similar to quickSelect(): we get worst case linear time by selecting a pivot that divides the array in a balanced way (so that there are never very few elements on one side and many on the other). After the array is divided in a balanced way, we apply the same steps as used in quickSelect() to decide whether to go left or right of the pivot.
Following is complete algorithm.
kthSmallest(arr[0..n-1], k)
1) Divide arr[] into ⌈n/5⌉ groups where size of each group is 5
except possibly the last group which may have less than 5 elements.
2) Sort the above created ⌈n/5⌉ groups and find median
of all groups. Create an auxiliary array 'median[]' and store medians
of all ⌈n/5⌉ groups in this median array.
// Recursively call this method to find median of median[0..⌈n/5⌉-1]
3) medOfMed = kthSmallest(median[0..⌈n/5⌉-1], ⌈n/10⌉)
4) Partition arr[] around medOfMed and obtain its position.
pos = partition(arr, n, medOfMed)
5) If pos == k return medOfMed
6) If pos > k return kthSmallest(arr[l..pos-1], k)
7) If pos < k return kthSmallest(arr[pos+1..r], k-pos+l-1)
In above algorithm, last 3 steps are same as algorithm in previous post. The first four steps are used to obtain a good point for partitioning the array
(to make sure that there are not too many elements either side of pivot).
Following is C++ implementation of above algorithm.
// C++ implementation of worst case linear time algorithm
// to find k'th smallest element
#include<iostream>
#include<algorithm>
#include<climits>
using namespace std;
int partition(int arr[], int l, int r, int k);
// A simple function to find median of arr[]. This is called
// only for an array of size 5 in this program.
int findMedian(int arr[], int n)
{
sort(arr, arr+n); // Sort the array
return arr[n/2]; // Return middle element
}
// Returns k'th smallest element in arr[l..r] in worst case
// linear time. ASSUMPTION: ALL ELEMENTS IN ARR[] ARE DISTINCT
int kthSmallest(int arr[], int l, int r, int k)
{
// If k is smaller than number of elements in array
if (k > 0 && k <= r - l + 1)
{
int n = r-l+1; // Number of elements in arr[l..r]
// Divide arr[] in groups of size 5, calculate median
Output:
K'th smallest element is 5
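The steps of the algorithm can be sketched in a compact, self-contained form. This is an illustrative version under the same assumption that all elements are distinct; the names `partitionAround` and `kthSmallestMoM`, the use of std::vector, and the choice to sort directly when five or fewer elements remain are assumptions of this sketch rather than details of the original program.

```cpp
#include <algorithm>
#include <vector>

// Partitions a[l..r] around the value pivot (assumed present in a[l..r],
// all elements distinct) and returns the final index of pivot.
int partitionAround(std::vector<int>& a, int l, int r, int pivot) {
    for (int i = l; i < r; ++i)
        if (a[i] == pivot) { std::swap(a[i], a[r]); break; }
    int store = l;
    for (int j = l; j < r; ++j)
        if (a[j] < pivot) std::swap(a[store++], a[j]);
    std::swap(a[store], a[r]);
    return store;
}

// Returns the k'th smallest (1-based) element of a[l..r];
// assumes 1 <= k <= r - l + 1.
int kthSmallestMoM(std::vector<int>& a, int l, int r, int k) {
    int n = r - l + 1;
    if (n <= 5) {                       // small input: sort directly
        std::sort(a.begin() + l, a.begin() + r + 1);
        return a[l + k - 1];
    }
    // Steps 1 and 2: median of every group of (at most) 5 elements
    std::vector<int> medians;
    for (int i = l; i <= r; i += 5) {
        int end = std::min(i + 4, r);
        std::sort(a.begin() + i, a.begin() + end + 1);
        medians.push_back(a[i + (end - i) / 2]);
    }
    // Step 3: median of medians, found recursively
    int m = static_cast<int>(medians.size());
    int medOfMed = kthSmallestMoM(medians, 0, m - 1, (m + 1) / 2);
    // Step 4: partition around the median of medians
    int pos = partitionAround(a, l, r, medOfMed);
    int rank = pos - l + 1;
    // Steps 5 to 7: stop, or recur into the side that contains the answer
    if (rank == k) return a[pos];
    if (rank > k) return kthSmallestMoM(a, l, pos - 1, k);
    return kthSmallestMoM(a, pos + 1, r, k - rank);
}
```

Called on {12, 3, 5, 7, 4, 19, 26} with k = 3, the sketch returns 5, matching the output shown above.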
Time Complexity:
The worst case time complexity of the above algorithm is O(n). Let us analyze all steps.
The steps 1) and 2) take O(n) time as finding median of an array of size 5 takes O(1) time and there are n/5 arrays of size 5.
The step 3) takes T(n/5) time. The step 4 is standard partition and takes O(n) time.
The interesting steps are 6) and 7). At most one of them is executed. These are recursive steps. What is the worst case size of these recursive calls? The answer is the maximum number of elements greater than medOfMed (obtained in step 3) or the maximum number of elements smaller than medOfMed.
How many elements are greater than medOfMed and how many are smaller?
At least half of the medians found in step 2 are greater than or equal to medOfMed. Thus, at least half of the ⌈n/5⌉ groups contribute 3 elements that are greater than medOfMed, except for the one group that has fewer than 5 elements and the one group that contains medOfMed itself. Therefore, the number of elements greater than medOfMed is at least
3(⌈(1/2)⌈n/5⌉⌉ - 2) >= 3n/10 - 6
Similarly, the number of elements that are less than medOfMed is at least 3n/10 - 6. In the worst case, the function recurs for at most n - (3n/10 - 6), which is 7n/10 + 6 elements.
Note that 7n/10 + 6 < n for n > 20 and that any input of 80 or fewer elements requires O(1) time. We can therefore obtain the recurrence
T(n) <= Θ(1), if n <= 80
T(n) <= T(⌈n/5⌉) + T(7n/10 + 6) + O(n), if n > 80
We show that the running time is linear by substitution. Assume that T(n) <= cn for some constant c and all n > 80. Substituting this inductive hypothesis into the right-hand side of the recurrence yields
T(n) <= c⌈n/5⌉ + c(7n/10 + 6) + O(n)
<= cn/5 + c + 7cn/10 + 6c + O(n)
<= 9cn/10 + 7c + O(n)
<= cn,
since we can pick c large enough so that c(n/10 - 7) is larger than the function described by the O(n) term for all n > 80. The worst-case running time of the algorithm is therefore linear (Source: http://staff.ustc.edu.cn/~csli/graduate/algorithms/book6/chap10.htm ).
Note that the above algorithm is linear in the worst case, but its constants are very high. Therefore, it doesn't work well in practical situations; randomized quickSelect works much better and is preferred.
Sources:
MIT Video Lecture on Order Statistics, Median
Introduction to Algorithms by Clifford Stein, Thomas H. Cormen, Charles E. Leiserson, Ronald L.
http://staff.ustc.edu.cn/~csli/graduate/algorithms/book6/chap10.htm
A Simple Solution is to run two loops. The outer loop considers every element of first array and inner loop checks for the pair in second array.
We keep track of minimum difference between ar1[i] + ar2[j] and x.
We can do it in O(n) time using following steps.
1) Merge given two arrays into an auxiliary array of size m+n using merge process of merge sort. While merging keep another boolean array of
size m+n to indicate whether the current element in merged array is from ar1[] or ar2[].
2) Consider the merged array and use the linear time algorithm to find the pair with sum closest to x. One extra thing: we need to consider only those pairs which have one element from ar1[] and the other from ar2[]; we use the boolean array for this purpose.
Can we do it in a single pass and O(1) extra space?
The idea is to start from left side of one array and right side of another array, and use the algorithm same as step 2 of above approach. Following is
detailed algorithm.
1) Initialize a variable diff as infinite (Diff is used to store the
difference between pair and x). We need to find the minimum diff.
2) Initialize two index variables l and r in the given sorted array.
(a) Initialize first to the leftmost index in ar1: l = 0
(b) Initialize second to the rightmost index in ar2: r = n-1
3) Loop while l < m and r >= 0
(a) If abs(ar1[l] + ar2[r] - sum) < diff then
update diff and result
(b) Else if(ar1[l] + ar2[r] < sum ) then l++
(c) Else r--
4) Print the result.
}
// If sum of this pair is more than x, move to smaller
// side
if (ar1[l] + ar2[r] > x)
r--;
else // move to the greater side
l++;
}
// Print the result
cout << "The closest pair is [" << ar1[res_l] << ", "
<< ar2[res_r] << "] \n";
}
// Driver program to test above functions
int main()
{
int ar1[] = {1, 4, 5, 7};
int ar2[] = {10, 20, 30, 40};
int m = sizeof(ar1)/sizeof(ar1[0]);
int n = sizeof(ar2)/sizeof(ar2[0]);
int x = 38;
printClosest(ar1, ar2, m, n, x);
return 0;
}
Output:
The closest pair is [7, 30]
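The single-pass idea (start at the left end of one array and the right end of the other) can be sketched as a function that returns the pair instead of printing it. The name `closestPair` and the std::pair return type are assumptions of this sketch; both arrays are assumed to be non-empty and sorted in increasing order.

```cpp
#include <climits>
#include <cstdlib>
#include <utility>
#include <vector>

// Returns {element of ar1, element of ar2} whose sum is closest to x.
std::pair<int, int> closestPair(const std::vector<int>& ar1,
                                const std::vector<int>& ar2, int x) {
    int m = static_cast<int>(ar1.size());
    int l = 0, r = static_cast<int>(ar2.size()) - 1;
    int diff = INT_MAX, resL = 0, resR = r;
    while (l < m && r >= 0) {
        int sum = ar1[l] + ar2[r];
        // Remember this pair if it is closer than the best one so far
        if (std::abs(sum - x) < diff) {
            resL = l;
            resR = r;
            diff = std::abs(sum - x);
        }
        // Sum too large: move to a smaller value in ar2, else a larger in ar1
        if (sum > x) --r;
        else ++l;
    }
    return {ar1[resL], ar2[resR]};
}
```

For ar1 = {1, 4, 5, 7}, ar2 = {10, 20, 30, 40} and x = 38 this returns the pair (7, 30), matching the output above.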
ar1[] = {1, 5, 10, 20, 40, 80}
ar2[] = {6, 7, 20, 80, 100}
ar3[] = {3, 4, 15, 20, 30, 70, 80, 120}
Output: 20, 80
ar1[] = {1, 5, 5}
ar2[] = {3, 4, 5, 5, 10}
ar3[] = {5, 5, 10, 20}
Output: 5, 5
A simple solution is to first find intersection of two arrays and store the intersection in a temporary array, then find the intersection of third array
and temporary array. Time complexity of this solution is O(n1 + n2 + n3) where n1, n2 and n3 are sizes of ar1[], ar2[] and ar3[] respectively.
The above solution requires extra space and two loops, we can find the common elements using a single loop and without extra space. The idea is
similar to intersection of two arrays. Like two arrays loop, we run a loop and traverse three arrays.
Let the current element traversed in ar1[] be x, in ar2[] be y and in ar3[] be z. We can have following cases inside the loop.
1) If x, y and z are same, we can simply print any of them as common element and move ahead in all three arrays.
2) Else if x < y, we can move ahead in ar1[] as x cannot be a common element.
3) Else if y < z, we can move ahead in ar2[] as y cannot be a common element.
4) Else (we reach here when x >= y and y >= z, and the three are not all equal), we can simply move ahead in ar3[] as z cannot be a common element.
Following are implementations of the above idea.
C++
// C++ program to print common elements in three arrays
#include <iostream>
using namespace std;
// This function prints common elements in ar1[], ar2[] and ar3[]
void findCommon(int ar1[], int ar2[], int ar3[], int n1, int n2, int n3)
{
// Initialize starting indexes for ar1[], ar2[] and ar3[]
int i = 0, j = 0, k = 0;
// Iterate through three arrays while all arrays have elements
while (i < n1 && j < n2 && k < n3)
{
// If x = y and y = z, print any of them and move ahead
// in all arrays
if (ar1[i] == ar2[j] && ar2[j] == ar3[k])
{ cout << ar1[i] << " "; i++; j++; k++; }
// x < y
else if (ar1[i] < ar2[j])
i++;
// y < z
else if (ar2[j] < ar3[k])
j++;
// We reach here when x > y and z < y, i.e., z is smallest
else
k++;
}
}
// Driver program to test above function
int main()
{
int ar1[] = {1, 5, 10, 20, 40, 80};
int ar2[] = {6, 7, 20, 80, 100};
int ar3[] = {3, 4, 15, 20, 30, 70, 80, 120};
int n1 = sizeof(ar1)/sizeof(ar1[0]);
int n2 = sizeof(ar2)/sizeof(ar2[0]);
int n3 = sizeof(ar3)/sizeof(ar3[0]);
cout << "Common Elements are ";
findCommon(ar1, ar2, ar3, n1, n2, n3);
return 0;
}
Python
# Python function to print common elements in three sorted arrays
def findCommon(ar1, ar2, ar3, n1, n2, n3):
    # Initialize starting indexes for ar1[], ar2[] and ar3[]
    i, j, k = 0, 0, 0
    # Iterate through three arrays while all arrays have elements
    while i < n1 and j < n2 and k < n3:
        # If x = y and y = z, print any of them and move ahead
        # in all arrays
        if ar1[i] == ar2[j] and ar2[j] == ar3[k]:
            print(ar1[i], end=" ")
            i += 1
            j += 1
            k += 1
        # x < y
        elif ar1[i] < ar2[j]:
            i += 1
        # y < z
        elif ar2[j] < ar3[k]:
            j += 1
        # We reach here when x >= y and y >= z, i.e., z is smallest
        else:
            k += 1
# Driver program to check above function
ar1 = [1, 5, 10, 20, 40, 80]
ar2 = [6, 7, 20, 80, 100]
ar3 = [3, 4, 15, 20, 30, 70, 80, 120]
n1 = len(ar1)
n2 = len(ar2)
n3 = len(ar3)
print("Common elements are", end=" ")
findCommon(ar1, ar2, ar3, n1, n2, n3)
# This code is contributed by __Devesh Agrawal__
Time complexity of the above solution is O(n1 + n2 + n3). In worst case, the largest sized array may have all small elements and middle sized array
has all middle elements.
Given a sorted array and a number x, find the pair in array whose sum is closest to x
Given a sorted array and a number x, find a pair in array whose sum is closest to x.
Examples:
Input: arr[] = {10, 22, 28, 29, 30, 40}, x = 54
Output: 22 and 30
Input: arr[] = {1, 3, 4, 7, 10}, x = 15
Output: 4 and 10
A simple solution is to consider every pair and keep track of the closest pair (absolute difference between pair sum and x is minimum). Finally print the closest pair. Time complexity of this solution is O(n^2).
An efficient solution can find the pair in O(n) time. The idea is similar to method 2 of this post. Following is detailed algorithm.
1) Initialize a variable diff as infinite (Diff is used to store the
difference between pair and x). We need to find the minimum diff.
2) Initialize two index variables l and r in the given sorted array.
(a) Initialize first to the leftmost index: l = 0
(b) Initialize second to the rightmost index: r = n-1
3) Loop while l < r.
(a) If abs(arr[l] + arr[r] - x) < diff then
update diff and result
(b) If (arr[l] + arr[r] < x) then l++
(c) Else r--
C++
// Simple C++ program to find the pair with sum closest to a given no.
#include <iostream>
#include <climits>
#include <cstdlib>
using namespace std;
// Prints the pair with sum closest to x
void printClosest(int arr[], int n, int x)
{
int res_l, res_r; // To store indexes of result pair
// Initialize left and right indexes and difference between
// pair sum and x
int l = 0, r = n-1, diff = INT_MAX;
// While there are elements between l and r
while (r > l)
{
// Check if this pair is closer than the closest pair so far
if (abs(arr[l] + arr[r] - x) < diff)
{
res_l = l;
res_r = r;
diff = abs(arr[l] + arr[r] - x);
}
// If this pair has more sum, move to smaller values.
if (arr[l] + arr[r] > x)
r--;
else // Move to larger values
l++;
}
cout <<" The closest pair is " << arr[res_l] << " and " << arr[res_r];
}
// Driver program to test above functions
int main()
{
int arr[] = {10, 22, 28, 29, 30, 40}, x = 54;
int n = sizeof(arr)/sizeof(arr[0]);
printClosest(arr, n, x);
return 0;
}
Java
// Java program to find pair with sum closest to x
import java.io.*;
import java.util.*;
import java.lang.Math;
class CloseSum {
// Prints the pair with sum closest to x
static void printClosest(int arr[], int n, int x)
{
int res_l=0, res_r=0; // To store indexes of result pair
// Initialize left and right indexes and difference between
// pair sum and x
int l = 0, r = n-1, diff = Integer.MAX_VALUE;
// While there are elements between l and r
while (r > l)
{
// Check if this pair is closer than the closest pair so far
if (Math.abs(arr[l] + arr[r] - x) < diff)
{
res_l = l;
res_r = r;
diff = Math.abs(arr[l] + arr[r] - x);
}
// If this pair has more sum, move to smaller values.
if (arr[l] + arr[r] > x)
r--;
else // Move to larger values
l++;
}
System.out.println(" The closest pair is "+arr[res_l]+" and "+ arr[res_r]);
}
// Driver program to test above function
public static void main(String[] args)
{
int arr[] = {10, 22, 28, 29, 30, 40}, x = 54;
int n = arr.length;
printClosest(arr, n, x);
}
}
/*This code is contributed by Devesh Agrawal*/
A simple solution is to linearly traverse the array. The time complexity of the simple solution is O(n). We can use Binary Search to find the count in O(Logn) time. The idea is to look for the last occurrence of 1 using Binary Search. Once we find the index of the last occurrence, we return index + 1 as the count.
The following is C++ implementation of above idea.
C++
// C++ program to count one's in a boolean array
#include <iostream>
using namespace std;
/* Returns counts of 1's in arr[low..high]. The array is
assumed to be sorted in non-increasing order */
int countOnes(bool arr[], int low, int high)
{
if (high >= low)
{
// get the middle index
int mid = low + (high - low)/2;
// check if the element at middle index is last 1
if ( (mid == high || arr[mid+1] == 0) && (arr[mid] == 1))
return mid+1;
// If element is not last 1, recur for right side
if (arr[mid] == 1)
return countOnes(arr, (mid + 1), high);
// else recur for left side
return countOnes(arr, low, (mid -1));
}
return 0;
}
/* Driver program to test above functions */
int main()
{
bool arr[] = {1, 1, 1, 1, 0, 0, 0};
int n = sizeof(arr)/sizeof(arr[0]);
cout << "Count of 1's in given array is " << countOnes(arr, 0, n-1);
return 0;
}
Python
# Python program to count one's in a boolean array
# Returns counts of 1's in arr[low..high]. The array is
# assumed to be sorted in non-increasing order
def countOnes(arr, low, high):
    if high >= low:
        # get the middle index
        mid = low + (high - low) // 2
        # check if the element at middle index is last 1
        if (mid == high or arr[mid+1] == 0) and arr[mid] == 1:
            return mid + 1
        # If element is not last 1, recur for right side
        if arr[mid] == 1:
            return countOnes(arr, mid + 1, high)
        # else recur for left side
        return countOnes(arr, low, mid - 1)
    return 0
# Driver function
arr = [1, 1, 1, 1, 0, 0, 0]
print("Count of 1's in given array is", countOnes(arr, 0, len(arr) - 1))
# This code is contributed by __Devesh Agrawal__
Output:
Count of 1's in given array is 4
Output:
Sorted array:
0 12 17 23 31 37 46 54 72 88 100
Time Complexity: The algorithm as a whole still has a worst case running time of O(n^2) because of the series of swaps required for each insertion.
The main step is (2.a) which has been covered in below post.
Sorted Insert for Singly Linked List
Below is C implementation of above algorithm
/* C program for insertion sort on a linked list */
#include<stdio.h>
#include<stdlib.h>
/* Link list node */
struct node
{
int data;
struct node* next;
};
// Function to insert a given node in a sorted linked list
void sortedInsert(struct node**, struct node*);
// function to sort a singly linked list using insertion sort
void insertionSort(struct node **head_ref)
{
// Initialize sorted linked list
struct node *sorted = NULL;
// Traverse the given linked list and insert every
// node to sorted
struct node *current = *head_ref;
while (current != NULL)
{
// Store next for next iteration
struct node *next = current->next;
// insert current in sorted linked list
sortedInsert(&sorted, current);
// Update current
current = next;
}
// Update head_ref to point to sorted linked list
*head_ref = sorted;
}
/* function to insert a new_node in a list. Note that this
function expects a pointer to head_ref as this can modify the
head of the input linked list (similar to push())*/
void sortedInsert(struct node** head_ref, struct node* new_node)
{
struct node* current;
/* Special case for the head end */
if (*head_ref == NULL || (*head_ref)->data >= new_node->data)
{
new_node->next = *head_ref;
*head_ref = new_node;
}
else
{
/* Locate the node before the point of insertion */
current = *head_ref;
while (current->next!=NULL &&
current->next->data < new_node->data)
{
current = current->next;
}
new_node->next = current->next;
current->next = new_node;
}
}
/* BELOW FUNCTIONS ARE JUST UTILITY TO TEST sortedInsert */
/* A utility function to create a new node */
struct node *newNode(int new_data)
{
/* allocate node */
struct node* new_node =
(struct node*) malloc(sizeof(struct node));
/* put in the data */
new_node->data = new_data;
new_node->next = NULL;
return new_node;
}
/* Function to print linked list */
void printList(struct node *head)
{
struct node *temp = head;
while(temp != NULL)
{
printf("%d ", temp->data);
temp = temp->next;
}
}
/* A utility function to insert a node at the beginning of linked list */
void push(struct node** head_ref, int new_data)
{
/* allocate node */
struct node* new_node = (struct node*) malloc(sizeof(struct node));
/* put in the data */
new_node->data = new_data;
/* link the old list off the new node */
new_node->next = (*head_ref);
/* move the head to point to the new node */
(*head_ref) = new_node;
}
// Driver program to test above functions
int main()
{
struct node *a = NULL;
push(&a, 5);
push(&a, 20);
push(&a, 4);
push(&a, 3);
push(&a, 30);
printf("Linked List before sorting \n");
printList(a);
insertionSort(&a);
printf("\nLinked List after sorting \n");
printList(a);
return 0;
}
Output:
Linked List before sorting
30 3 4 20 5
Linked List after sorting
3 4 5 20 30
Why Quick Sort preferred for Arrays and Merge Sort for Linked Lists?
Why is Quick Sort preferred for arrays?
Below are recursive and iterative implementations of Quick Sort and Merge Sort for arrays.
Recursive Quick Sort for array.
Iterative Quick Sort for arrays.
Recursive Merge Sort for arrays
Iterative Merge Sort for arrays
Quick Sort in its general form is an in-place sort (i.e. it doesn't require any extra storage), whereas merge sort requires O(N) extra storage, N denoting the array size, which may be quite expensive. Allocating and de-allocating the extra space used for merge sort increases the running time of the algorithm. Comparing average complexity, we find that both sorts have O(NlogN) average complexity but the constants differ. For arrays, merge sort loses due to the use of extra O(N) storage space.
Most practical implementations of Quick Sort use a randomized version. The randomized version has an expected time complexity of O(nLogn). The worst case is still possible in the randomized version, but it doesn't occur for a particular pattern (like a sorted array), and randomized Quick Sort works well in practice.
Quick Sort is also a cache friendly sorting algorithm, as it has good locality of reference when used for arrays.
Quick Sort is also tail recursive, so tail call optimization can be applied.
Why is Merge Sort preferred for Linked Lists?
Below are implementations of Quicksort and Mergesort for singly and doubly linked lists.
Quick Sort for Doubly Linked List
Quick Sort for Singly Linked List
Merge Sort for Singly Linked List
Merge Sort for Doubly Linked List
In the case of linked lists, the situation is different, mainly due to the difference in memory allocation of arrays and linked lists. Unlike arrays, linked list nodes may not be adjacent in memory. Unlike an array, in a linked list we can insert items in the middle in O(1) extra space and O(1) time (given a pointer to the insertion point). Therefore, the merge operation of merge sort can be implemented without extra space for linked lists.
In arrays, we can do random access as elements are contiguous in memory. Let us say we have an integer (4-byte) array A and let the address of A[0] be x; then to access A[i], we can directly access the memory at (x + i*4). Unlike arrays, we cannot do random access in a linked list. Quick Sort requires a lot of this kind of access. In a linked list, to access the ith index, we have to traverse every node from the head to the ith node, as we don't have a contiguous block of memory. Therefore, the overhead increases for quick sort. Merge sort accesses data sequentially and its need for random access is low.
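The claim that the merge step needs no extra space on linked lists can be illustrated with a minimal sketch. The `Node` type and the name `sortedMerge` are assumptions made for this example; the function relinks the existing nodes of two sorted singly linked lists and allocates nothing.

```cpp
// Minimal singly linked list node used only for this sketch.
struct Node {
    int data;
    Node* next;
};

// Merges two sorted lists by relinking their nodes. Only a constant number
// of pointers is used, and the lists are traversed sequentially.
Node* sortedMerge(Node* a, Node* b) {
    Node dummy{0, nullptr};   // placeholder head, lives on the stack
    Node* tail = &dummy;
    while (a && b) {
        if (a->data <= b->data) { tail->next = a; a = a->next; }
        else                    { tail->next = b; b = b->next; }
        tail = tail->next;
    }
    tail->next = a ? a : b;   // append whatever remains
    return dummy.next;
}
```

Merging the lists 1 -> 3 -> 5 and 2 -> 4 this way produces 1 -> 2 -> 3 -> 4 -> 5 without creating any node, whereas merging two array halves typically copies elements into an auxiliary buffer.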
{
temp->next = *head;
(*head)->prev = temp;
(*head) = temp;
}
}
// A utility function to print a doubly linked list in
// both forward and backward directions
void print(struct node *head)
{
struct node *temp = head;
printf("Forward Traversal using next pointer\n");
while (head)
{
printf("%d ",head->data);
temp = head;
head = head->next;
}
printf("\nBackward Traversal using prev pointer\n");
while (temp)
{
printf("%d ", temp->data);
temp = temp->prev;
}
}
// Utility function to swap two integers
void swap(int *A, int *B)
{
int temp = *A;
*A = *B;
*B = temp;
}
// Split a doubly linked list (DLL) into 2 DLLs of
// half sizes
struct node *split(struct node *head)
{
struct node *fast = head,*slow = head;
while (fast->next && fast->next->next)
{
fast = fast->next->next;
slow = slow->next;
}
struct node *temp = slow->next;
slow->next = NULL;
return temp;
}
// Driver program
int main(void)
{
struct node *head = NULL;
insert(&head,5);
insert(&head,20);
insert(&head,4);
insert(&head,3);
insert(&head,30);
insert(&head,10);
printf("Linked List before sorting\n");
print(head);
head = mergeSort(head);
printf("\n\nLinked List after sorting\n");
print(head);
return 0;
}
Output:
Linked List before sorting
Forward Traversal using next pointer
10 30 3 4 20 5
Backward Traversal using prev pointer
5 20 4 3 30 10
Linked List after sorting
Forward Traversal using next pointer
3 4 5 10 20 30
Backward Traversal using prev pointer
30 20 10 5 4 3
The greedy choice is to always pick the next activity whose finish time is the least among the remaining activities and whose start time is greater than or
equal to the finish time of the previously selected activity. We can sort the activities by finish time so that the next activity considered is always the one
with the minimum finish time.
1) Sort the activities according to their finish time.
2) Select the first activity from the sorted array and print it.
3) Do the following for the remaining activities in the sorted array.
a) If the start time of this activity is greater than or equal to the finish time of the previously selected activity, then select this activity and print it.
In the following C implementation, it is assumed that the activities are already sorted according to their finish time.
C++
#include<stdio.h>
// Prints a maximum set of activities that can be done by a single
// person, one at a time.
// n --> Total number of activities
// s[] --> An array that contains start time of all activities
// f[] --> An array that contains finish time of all activities
void printMaxActivities(int s[], int f[], int n)
{
int i, j;
printf ("Following activities are selected \n");
// The first activity always gets selected
i = 0;
printf("%d ", i);
// Consider rest of the activities
for (j = 1; j < n; j++)
{
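// If this activity has start time greater than or
// equal to the finish time of the previously selected
// activity, then select it
if (s[j] >= f[i])
{
printf("%d ", j);
i = j;
}
}
}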
Python
"""The following implementation assumes that the activities
are already sorted according to their finish time"""

"""Prints a maximum set of activities that can be done by a
single person, one at a time"""
# n --> Total number of activities
# s[] --> An array that contains start time of all activities
# f[] --> An array that contains finish time of all activities
def printMaxActivities(s, f):
    n = len(f)
    print("The following activities are selected")
    # The first activity is always selected
    i = 0
    print(i, end=" ")
    # Consider rest of the activities
    for j in range(1, n):
        # If this activity has start time greater than
        # or equal to the finish time of previously
        # selected activity, then select it
        if s[j] >= f[i]:
            print(j, end=" ")
            i = j
    print()

# Driver program to test above function
s = [1, 3, 0, 5, 8, 5]
f = [2, 4, 6, 7, 9, 9]
printMaxActivities(s, f)
# This code is contributed by Nikhil Kumar Singh
How does Greedy Choice work for Activities sorted according to finish time?
Let the given set of activities be S = {1, 2, 3, ..., n}, sorted by finish time. The greedy choice is to always pick activity 1. Why does
activity 1 always belong to some optimal solution? We can prove it by showing that if there is an optimal solution B whose first activity is not
1, then there is also a solution A of the same size with activity 1 as its first activity. Let the first activity selected by B be k. Then A = (B - {k}) U {1}
is also a valid solution. (Note that the activities in B are mutually compatible and k has the smallest finish time among them. Since k is not 1, finish(k) >= finish(1),
so activity 1 finishes before every other activity in B starts and replacing k with 1 causes no overlap.)
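The exchange argument above can be sanity-checked on the sample data. The following is a minimal sketch in Python 3, not part of the original implementations: the helper names greedy_select and brute_force_max are hypothetical, and the code assumes the activities are already sorted by finish time. It compares the size of the greedy selection with the size found by exhaustive search.

```python
from itertools import combinations


def greedy_select(s, f):
    """Greedy selection; assumes activities are sorted by finish time."""
    if not f:
        return []
    selected = [0]          # the first activity is always selected
    last = 0
    for j in range(1, len(f)):
        if s[j] >= f[last]:
            selected.append(j)
            last = j
    return selected


def brute_force_max(s, f):
    """Size of the largest compatible subset, found by exhaustive search."""
    n = len(f)
    for size in range(n, 0, -1):
        for subset in combinations(range(n), size):
            # Indices are in finish-time order, so checking neighbours
            # in the subset is enough to detect any overlap.
            if all(s[b] >= f[a] for a, b in zip(subset, subset[1:])):
                return size
    return 0


s = [1, 3, 0, 5, 8, 5]
f = [2, 4, 6, 7, 9, 9]
chosen = greedy_select(s, f)
print(chosen, brute_force_max(s, f))
```

The exhaustive search is exponential and only meant for small inputs; it serves as a check on the greedy result, not as an alternative algorithm.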
References:
Introduction to Algorithms by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein
Algorithms by S. Dasgupta, C.H. Papadimitriou, and U.V. Vazirani
http://en.wikipedia.org/wiki/Greedy_algorithm
Step #2 uses the Union-Find algorithm to detect cycles, so we recommend reading the following posts as a prerequisite.
Union-Find Algorithm | Set 1 (Detect Cycle in a Graph)
Union-Find Algorithm | Set 2 (Union By Rank and Path Compression)
The algorithm is a Greedy Algorithm. The Greedy Choice is to pick the smallest weight edge that does not cause a cycle in the MST constructed
so far. Let us understand it with an example: Consider the below input graph.
The graph contains 9 vertices and 14 edges. So, the minimum spanning tree formed will have (9 - 1) = 8 edges.
After sorting:
Weight   Src   Dest
1        7     6
2        8     2
2        6     5
4        0     1
4        2     5
6        8     6
7        2     3
7        7     8
8        0     7
8        1     2
9        3     4
10       5     4
11       1     7
14       3     5
Now pick all edges one by one from the sorted list of edges.
1. Pick edge 7-6: No cycle is formed, include it.
2. Pick edge 8-2: No cycle is formed, include it.
3. Pick edge 6-5: No cycle is formed, include it.
4. Pick edge 0-1: No cycle is formed, include it.
5. Pick edge 2-5: No cycle is formed, include it.
6. Pick edge 8-6: Since including this edge results in a cycle, discard it.
7. Pick edge 2-3: No cycle is formed, include it.
8. Pick edge 7-8: Since including this edge results in a cycle, discard it.
9. Pick edge 0-7: No cycle is formed, include it.
10. Pick edge 1-2: Since including this edge results in a cycle, discard it.
11. Pick edge 3-4: No cycle is formed, include it.
Since the number of edges included equals (V - 1), the algorithm stops here.
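The walkthrough above can be replayed with a short Python 3 sketch. This is an illustration under stated assumptions rather than the article's own implementation: the function names find and kruskal and the (weight, src, dest) tuple layout are chosen here for brevity, and the edge list is copied from the sorted table.

```python
def find(parent, i):
    """Return the root of i's set, halving the path on the way up."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def kruskal(num_vertices, edges):
    """edges: list of (weight, src, dest). Returns MST edges as (src, dest, weight)."""
    parent = list(range(num_vertices))
    rank = [0] * num_vertices
    mst = []
    # sorted() is stable, so edges of equal weight keep their input order
    for weight, src, dest in sorted(edges, key=lambda e: e[0]):
        a, b = find(parent, src), find(parent, dest)
        if a == b:
            continue                      # edge would close a cycle: discard
        if rank[a] < rank[b]:
            a, b = b, a                   # make a the root of higher rank
        parent[b] = a                     # union by rank
        if rank[a] == rank[b]:
            rank[a] += 1
        mst.append((src, dest, weight))
        if len(mst) == num_vertices - 1:
            break                         # V - 1 edges: tree is complete
    return mst


edges = [(1, 7, 6), (2, 8, 2), (2, 6, 5), (4, 0, 1), (4, 2, 5),
         (6, 8, 6), (7, 2, 3), (7, 7, 8), (8, 0, 7), (8, 1, 2),
         (9, 3, 4), (10, 5, 4), (11, 1, 7), (14, 3, 5)]
mst = kruskal(9, edges)
for src, dest, weight in mst:
    print(src, "--", dest, "==", weight)
print("total weight:", sum(w for _, _, w in mst))
```

The edges are accepted in the same order as in the numbered steps, and the loop stops as soon as V - 1 = 8 edges have been included.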
C/C++
// C++ program for Kruskal's algorithm to find Minimum Spanning Tree
// of a given connected, undirected and weighted graph
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
// a structure to represent a weighted edge in graph
struct Edge
{
int src, dest, weight;
};
// a structure to represent a connected, undirected and weighted graph
struct Graph
{
// V-> Number of vertices, E-> Number of edges
int V, E;
// graph is represented as an array of edges. Since the graph is
// undirected, the edge from src to dest is also edge from dest
// to src. Both are counted as 1 edge here.
struct Edge* edge;
};
// Creates a graph with V vertices and E edges
struct Graph* createGraph(int V, int E)
{
struct Graph* graph = (struct Graph*) malloc( sizeof(struct Graph) );
graph->V = V;
graph->E = E;
graph->edge = (struct Edge*) malloc( graph->E * sizeof( struct Edge ) );
return graph;
}
// A structure to represent a subset for union-find
struct subset
{
int parent;
int rank;
};
// A utility function to find set of an element i
// (uses path compression technique)
int find(struct subset subsets[], int i)
{
// find root and make root as parent of i (path compression)
if (subsets[i].parent != i)
subsets[i].parent = find(subsets, subsets[i].parent);
return subsets[i].parent;
}
// A function that does union of two sets of x and y
// (uses union by rank)
void Union(struct subset subsets[], int x, int y)
{
int xroot = find(subsets, x);
int yroot = find(subsets, y);
// Attach smaller rank tree under root of high rank tree
// (Union by Rank)
if (subsets[xroot].rank < subsets[yroot].rank)
subsets[xroot].parent = yroot;
else if (subsets[xroot].rank > subsets[yroot].rank)
subsets[yroot].parent = xroot;
// If ranks are same, then make one as root and increment
// its rank by one
else
{
subsets[yroot].parent = xroot;
subsets[xroot].rank++;
}
}
// Compare two edges according to their weights.
// Used in qsort() for sorting an array of edges
int myComp(const void* a, const void* b)
{
struct Edge* a1 = (struct Edge*)a;
struct Edge* b1 = (struct Edge*)b;
// qsort() expects a negative, zero or positive result
return (a1->weight > b1->weight) - (a1->weight < b1->weight);
}
// The main function to construct MST using Kruskal's algorithm
void KruskalMST(struct Graph* graph)
{
int V = graph->V;
struct Edge result[V]; // This will store the resultant MST
graph->edge[3].weight = 15;
// add edge 2-3
graph->edge[4].src = 2;
graph->edge[4].dest = 3;
graph->edge[4].weight = 4;
KruskalMST(graph);
return 0;
}
Java
// Java program for Kruskal's algorithm to find Minimum Spanning Tree
// of a given connected, undirected and weighted graph
import java.util.*;
import java.lang.*;
import java.io.*;
class Graph
{
// A class to represent a graph edge
class Edge implements Comparable<Edge>
{
int src, dest, weight;
// Comparator function used for sorting edges based on
// their weight
public int compareTo(Edge compareEdge)
{
return Integer.compare(this.weight, compareEdge.weight);
}
};
// A class to represent a subset for union-find
class subset
{
int parent, rank;
};
int V, E; // V-> no. of vertices & E-> no. of edges
Edge edge[]; // collection of all edges
// Creates a graph with V vertices and E edges
Graph(int v, int e)
{
V = v;
E = e;
edge = new Edge[E];
for (int i=0; i<e; ++i)
edge[i] = new Edge();
}
// A utility function to find set of an element i
// (uses path compression technique)
int find(subset subsets[], int i)
{
// find root and make root as parent of i (path compression)
if (subsets[i].parent != i)
subsets[i].parent = find(subsets, subsets[i].parent);
return subsets[i].parent;
}
// A function that does union of two sets of x and y
// (uses union by rank)
void Union(subset subsets[], int x, int y)
{
int xroot = find(subsets, x);
int yroot = find(subsets, y);
// Attach smaller rank tree under root of high rank tree
// (Union by Rank)
if (subsets[xroot].rank < subsets[yroot].rank)
subsets[xroot].parent = yroot;
else if (subsets[xroot].rank > subsets[yroot].rank)
subsets[yroot].parent = xroot;
// If ranks are same, then make one as root and increment
graph.edge[0].src = 0;
graph.edge[0].dest = 1;
graph.edge[0].weight = 10;
// add edge 0-2
graph.edge[1].src = 0;
graph.edge[1].dest = 2;
graph.edge[1].weight = 6;
// add edge 0-3
graph.edge[2].src = 0;
graph.edge[2].dest = 3;
graph.edge[2].weight = 5;
// add edge 1-3
graph.edge[3].src = 1;
graph.edge[3].dest = 3;
graph.edge[3].weight = 15;
// add edge 2-3
graph.edge[4].src = 2;
graph.edge[4].dest = 3;
graph.edge[4].weight = 4;
graph.KruskalMST();
}
}
//This code is contributed by Aakash Hasija
Following
2 -- 3 == 4
0 -- 3 == 5
0 -- 1 == 10
Time Complexity: O(E log E) or O(E log V). Sorting the edges takes O(E log E) time. After sorting, we iterate through all edges and apply the find-union algorithm. The find and union operations take at most O(log V) time each. So the overall complexity is O(E log E + E log V). The value of E
can be at most O(V^2), so O(log V) and O(log E) are the same. Therefore, the overall time complexity is O(E log E) or O(E log V).
References:
http://www.ics.uci.edu/~eppstein/161/960206.html
http://en.wikipedia.org/wiki/Minimum_spanning_tree
character   Frequency
a           5
b           9
c           12
d           13
e           16
f           45
Step 1: Build a min heap that contains 6 nodes, where each node represents the root of a tree with a single node.
Step 2: Extract the two minimum frequency nodes from the min heap. Add a new internal node with frequency 5 + 9 = 14.
Now the min heap contains 5 nodes, where 4 nodes are roots of trees with a single element each, and one heap node is the root of a tree with 3 elements.
character       Frequency
c               12
d               13
Internal Node   14
e               16
f               45
Step 3: Extract the two minimum frequency nodes from the heap. Add a new internal node with frequency 12 + 13 = 25.
Now the min heap contains 4 nodes, where 2 nodes are roots of trees with a single element each, and two heap nodes are roots of trees with more than
one node.
character       Frequency
Internal Node   14
e               16
Internal Node   25
f               45
Step 4: Extract the two minimum frequency nodes. Add a new internal node with frequency 14 + 16 = 30.
character       Frequency
Internal Node   25
Internal Node   30
f               45
Step 5: Extract the two minimum frequency nodes. Add a new internal node with frequency 25 + 30 = 55.
Step 6: Extract the two minimum frequency nodes. Add a new internal node with frequency 45 + 55 = 100.
Since the heap contains only one node, the algorithm stops here.
Steps to print codes from Huffman Tree:
Traverse the tree formed starting from the root. Maintain an auxiliary array. While moving to the left child, write 0 to the array. While moving to the
right child, write 1 to the array. Print the array when a leaf node is encountered.
character   code-word
f           0
c           100
d           101
a           1100
b           1101
e           111
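The steps above can also be sketched compactly in Python 3 using the standard heapq module. This is an illustrative sketch, not the article's implementation: huffman_codes is a hypothetical name, leaves are represented by the symbol itself and internal nodes by a (left, right) tuple, and a running counter is added to each heap entry only so that entries with equal frequency never compare the trees themselves.

```python
import heapq
from itertools import count


def huffman_codes(freqs):
    """freqs: dict symbol -> frequency. Returns dict symbol -> code string."""
    tiebreak = count()
    # Heap entries are (frequency, tiebreak, tree)
    heap = [(f, next(tiebreak), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Extract the two minimum frequency nodes
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        # Add a new internal node whose frequency is their sum
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))

    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):        # internal node
            walk(tree[0], prefix + "0")    # left edge gets 0
            walk(tree[1], prefix + "1")    # right edge gets 1
        else:                              # leaf: record the code
            codes[tree] = prefix or "0"

    if heap:
        walk(heap[0][2], "")
    return codes


freqs = {"a": 5, "b": 9, "c": 12, "d": 13, "e": 16, "f": 45}
codes = huffman_codes(freqs)
for sym, code in codes.items():
    print(sym + ":", code)
```

With the sample frequencies, the merges happen in the same order as Steps 2 to 6 and the resulting codes agree with the code-word table.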
{
struct MinHeapNode* temp = minHeap->array[0];
minHeap->array[0] = minHeap->array[minHeap->size - 1];
--minHeap->size;
minHeapify(minHeap, 0);
return temp;
}
// A utility function to insert a new node to Min Heap
void insertMinHeap(struct MinHeap* minHeap, struct MinHeapNode* minHeapNode)
{
++minHeap->size;
int i = minHeap->size - 1;
while (i && minHeapNode->freq < minHeap->array[(i - 1)/2]->freq)
{
minHeap->array[i] = minHeap->array[(i - 1)/2];
i = (i - 1)/2;
}
minHeap->array[i] = minHeapNode;
}
// A standard function to build min heap
void buildMinHeap(struct MinHeap* minHeap)
{
int n = minHeap->size - 1;
int i;
for (i = (n - 1) / 2; i >= 0; --i)
minHeapify(minHeap, i);
}
// A utility function to print an array of size n
void printArr(int arr[], int n)
{
int i;
for (i = 0; i < n; ++i)
printf("%d", arr[i]);
printf("\n");
}
// Utility function to check if this node is leaf
int isLeaf(struct MinHeapNode* root)
{
return !(root->left) && !(root->right) ;
}
// Creates a min heap of capacity equal to size and inserts all character of
// data[] in min heap. Initially size of min heap is equal to capacity
struct MinHeap* createAndBuildMinHeap(char data[], int freq[], int size)
{
struct MinHeap* minHeap = createMinHeap(size);
for (int i = 0; i < size; ++i)
minHeap->array[i] = newNode(data[i], freq[i]);
minHeap->size = size;
buildMinHeap(minHeap);
return minHeap;
}
// The main function that builds Huffman tree
struct MinHeapNode* buildHuffmanTree(char data[], int freq[], int size)
{
struct MinHeapNode *left, *right, *top;
// Step 1: Create a min heap of capacity equal to size. Initially, there are
// nodes equal to size.
struct MinHeap* minHeap = createAndBuildMinHeap(data, freq, size);
// Iterate while size of heap doesn't become 1
while (!isSizeOne(minHeap))
{
// Step 2: Extract the two minimum freq items from min heap
left = extractMin(minHeap);
right = extractMin(minHeap);
// Step 3: Create a new internal node with frequency equal to the
// sum of the two nodes' frequencies. Make the two extracted nodes the
// left and right children of this new node. Add this node to the min heap.
// '$' is a placeholder value for internal nodes; it is never printed
top = newNode('$', left->freq + right->freq);
top->left = left;
top->right = right;
insertMinHeap(minHeap, top);
}
// Step 4: The remaining node is the root node and the tree is complete.
return extractMin(minHeap);
}
// Prints huffman codes from the root of Huffman Tree. It uses arr[] to
// store codes
void printCodes(struct MinHeapNode* root, int arr[], int top)
{
// Assign 0 to left edge and recur
if (root->left)
{
arr[top] = 0;
printCodes(root->left, arr, top + 1);
}
// Assign 1 to right edge and recur
if (root->right)
{
arr[top] = 1;
printCodes(root->right, arr, top + 1);
}
// If this is a leaf node, then it contains one of the input
// characters, print the character and its code from arr[]
if (isLeaf(root))
{
printf("%c: ", root->data);
printArr(arr, top);
}
}
// The main function that builds a Huffman Tree and print codes by traversing
// the built Huffman Tree
void HuffmanCodes(char data[], int freq[], int size)
{
// Construct Huffman Tree
struct MinHeapNode* root = buildHuffmanTree(data, freq, size);
// Print Huffman codes using the Huffman tree built above
int arr[MAX_TREE_HT], top = 0;
printCodes(root, arr, top);
}
// Driver program to test above functions
int main()
{
char arr[] = {'a', 'b', 'c', 'd', 'e', 'f'};
int freq[] = {5, 9, 12, 13, 16, 45};
int size = sizeof(arr)/sizeof(arr[0]);
HuffmanCodes(arr, freq, size);
return 0;
}
f: 0
c: 100
d: 101
a: 1100
b: 1101
e: 111
Time complexity: O(n log n), where n is the number of unique characters. If there are n nodes, extractMin() is called 2*(n - 1) times. extractMin()
takes O(log n) time as it calls minHeapify(). So, the overall complexity is O(n log n).
If the input array is sorted, there exists a linear time algorithm. We will discuss it in the next post.
Reference:
http://en.wikipedia.org/wiki/Huffman_coding
{
return queue->front == -1;
}
// A utility function to check if given queue is full
int isFull(struct Queue* queue)
{
return queue->rear == queue->capacity - 1;
}
// A utility function to add an item to queue
void enQueue(struct Queue* queue, struct QueueNode* item)
{
if (isFull(queue))
return;
queue->array[++queue->rear] = item;
if (queue->front == -1)
++queue->front;
}
// A utility function to remove an item from queue
struct QueueNode* deQueue(struct Queue* queue)
{
if (isEmpty(queue))
return NULL;
struct QueueNode* temp = queue->array[queue->front];
if (queue->front == queue->rear) // If there is only one item in queue
queue->front = queue->rear = -1;
else
++queue->front;
return temp;
}
// A utility function to get front of queue
struct QueueNode* getFront(struct Queue* queue)
{
if (isEmpty(queue))
return NULL;
return queue->array[queue->front];
}
/* A function to get minimum item from two queues */
struct QueueNode* findMin(struct Queue* firstQueue, struct Queue* secondQueue)
{
// Step 3.a: If first queue is empty, dequeue from second queue
if (isEmpty(firstQueue))
return deQueue(secondQueue);
// Step 3.b: If second queue is empty, dequeue from first queue
if (isEmpty(secondQueue))
return deQueue(firstQueue);
// Step 3.c: Else, compare the front of two queues and dequeue minimum
if (getFront(firstQueue)->freq < getFront(secondQueue)->freq)
return deQueue(firstQueue);
return deQueue(secondQueue);
}
// Utility function to check if this node is leaf
int isLeaf(struct QueueNode* root)
{
return !(root->left) && !(root->right) ;
}
// A utility function to print an array of size n
void printArr(int arr[], int n)
{
int i;
for (i = 0; i < n; ++i)
printf("%d", arr[i]);
printf("\n");
}
// The main function that builds Huffman tree
struct QueueNode* buildHuffmanTree(char data[], int freq[], int size)
{
struct QueueNode *left, *right, *top;
// Step 1: Create two empty queues
struct Queue* firstQueue = createQueue(size);
Output:
f: 0
c: 100
d: 101
a: 1100
b: 1101
e: 111
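The two-queue idea used by findMin() in the C code above can be sketched in Python 3 as follows. This is a minimal sketch assuming the input is already sorted by non-decreasing frequency; huffman_sorted is a hypothetical name, collections.deque stands in for the array-based queue, and trees are plain tuples rather than QueueNode structs.

```python
from collections import deque


def huffman_sorted(symbols, freqs):
    """Linear-time Huffman coding for input sorted by non-decreasing frequency."""
    first = deque(zip(freqs, symbols))   # leaves, in sorted order
    second = deque()                     # internal nodes, created in sorted order

    def find_min():
        # Same three cases as findMin() in the C code
        if not first:
            return second.popleft()
        if not second:
            return first.popleft()
        if first[0][0] < second[0][0]:
            return first.popleft()
        return second.popleft()

    while len(first) + len(second) > 1:
        f1, left = find_min()
        f2, right = find_min()
        # The new internal node goes to the rear of the second queue
        second.append((f1 + f2, (left, right)))

    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"

    remaining = first or second
    if remaining:
        walk(remaining[0][1], "")
    return codes


codes = huffman_sorted("abcdef", [5, 9, 12, 13, 16, 45])
for sym, code in codes.items():
    print(sym + ":", code)
```

Because internal nodes are created with non-decreasing frequencies, the second queue stays sorted without any extra work, which is what removes the heap and its log factor.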
The set mstSet is initially empty and the keys assigned to vertices are {0, INF, INF, INF, INF, INF, INF, INF, INF}, where INF indicates infinite. Now
pick the vertex with the minimum key value. Vertex 0 is picked and included in mstSet, so mstSet becomes {0}. After including it in mstSet, update the
key values of its adjacent vertices. The adjacent vertices of 0 are 1 and 7. The key values of 1 and 7 are updated to 4 and 8. The following subgraph shows
vertices and their key values; only the vertices with finite key values are shown. The vertices included in the MST are shown in green.
Pick the vertex with the minimum key value that is not already included in the MST (not in mstSet). Vertex 1 is picked and added to mstSet. So mstSet
now becomes {0, 1}. Update the key values of the adjacent vertices of 1. The key value of vertex 2 becomes 8.
Pick the vertex with the minimum key value that is not already included in the MST (not in mstSet). We can pick either vertex 7 or vertex 2; suppose vertex 7 is
picked. So mstSet now becomes {0, 1, 7}. Update the key values of the adjacent vertices of 7. The key values of vertices 6 and 8 become finite (1
and 7 respectively).
Pick the vertex with the minimum key value that is not already included in the MST (not in mstSet). Vertex 6 is picked. So mstSet now becomes {0, 1, 7,
6}. Update the key values of the adjacent vertices of 6. The key values of vertices 5 and 8 are updated (to 2 and 6 respectively).
We repeat the above steps until mstSet includes all vertices of the given graph. Finally, we get the following graph.
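The walkthrough can be traced with a small Python 3 sketch of the O(V^2) version. It is an illustration only: prim_matrix and build_matrix are hypothetical names, the edge list is taken from the 9-vertex example graph used later in this document, and ties between equal keys are broken in favour of the later vertex so that vertex 7 is picked before vertex 2, as in the text.

```python
INF = float("inf")


def build_matrix(n, edges):
    """Symmetric adjacency matrix; 0 means 'no edge'."""
    g = [[0] * n for _ in range(n)]
    for u, v, w in edges:
        g[u][v] = w
        g[v][u] = w
    return g


def prim_matrix(graph):
    n = len(graph)
    key = [INF] * n          # cheapest known edge into each vertex
    parent = [None] * n      # parent[v] is the other end of that edge
    in_mst = [False] * n
    key[0] = 0               # start from vertex 0
    for _ in range(n):
        # Pick the vertex outside the MST with the minimum key
        u = None
        for v in range(n):
            if not in_mst[v] and (u is None or key[v] <= key[u]):
                u = v
        in_mst[u] = True
        # Update keys of the neighbours of u that are still outside the MST
        for v in range(n):
            w = graph[u][v]
            if w and not in_mst[v] and w < key[v]:
                key[v] = w
                parent[v] = u
    return parent, key


edges = [(0, 1, 4), (0, 7, 8), (1, 2, 8), (1, 7, 11), (2, 3, 7),
         (2, 8, 2), (2, 5, 4), (3, 4, 9), (3, 5, 14), (4, 5, 10),
         (5, 6, 2), (6, 7, 1), (6, 8, 6), (7, 8, 7)]
parent, key = prim_matrix(build_matrix(9, edges))
for v in range(1, 9):
    print(parent[v], "-", v, "  ", key[v])
print("total weight:", sum(key))
```

A different tie-breaking rule may produce a different parent array, but the total weight of the tree is the same.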
C/C++
// A C / C++ program for Prim's Minimum Spanning Tree (MST) algorithm.
// The program is for adjacency matrix representation of the graph
#include <stdio.h>
#include <limits.h>
#include <stdbool.h> // bool, true and false when compiled as C
// Number of vertices in the graph
#define V 5
// A utility function to find the vertex with minimum key value, from
// the set of vertices not yet included in MST
int minKey(int key[], bool mstSet[])
{
// Initialize min value
int min = INT_MAX, min_index;
for (int v = 0; v < V; v++)
if (mstSet[v] == false && key[v] < min)
min = key[v], min_index = v;
return min_index;
}
// A utility function to print the constructed MST stored in parent[]
void printMST(int parent[], int n, int graph[V][V])
{
printf("Edge Weight\n");
for (int i = 1; i < V; i++)
printf("%d - %d    %d \n", parent[i], i, graph[i][parent[i]]);
}
// Function to construct and print MST for a graph represented using adjacency
// matrix representation
void primMST(int graph[V][V])
{
int parent[V]; // Array to store constructed MST
int key[V]; // Key values used to pick minimum weight edge in cut
bool mstSet[V]; // mstSet[v] is true once vertex v is included in MST
Java
// A Java program for Prim's Minimum Spanning Tree (MST) algorithm.
// The program is for adjacency matrix representation of the graph
import java.util.*;
import java.lang.*;
import java.io.*;
class MST
{
// Number of vertices in the graph
private static final int V=5;
// A utility function to find the vertex with minimum key
// value, from the set of vertices not yet included in MST
int minKey(int key[], Boolean mstSet[])
{
// Initialize min value
int min = Integer.MAX_VALUE, min_index=-1;
{
parent[v] = u;
key[v] = graph[u][v];
}
}
// print the constructed MST
printMST(parent, V, graph);
}
public static void main (String[] args)
{
/* Let us create the following graph
   2    3
(0)--(1)--(2)
 |   / \   |
6| 8/   \5 |7
 | /     \ |
(3)-------(4)
      9
*/
MST t = new MST();
int graph[][] = new int[][] {{0, 2, 0, 6, 0},
{2, 0, 3, 8, 5},
{0, 3, 0, 0, 7},
{6, 8, 0, 0, 9},
{0, 5, 7, 9, 0},
};
// Print the solution
t.primMST(graph);
}
}
// This code is contributed by Aakash Hasija
Edge     Weight
0 - 1    2
1 - 2    3
0 - 3    6
1 - 4    5
The time complexity of the above program is O(V^2). If the input graph is represented using an adjacency list, the time complexity of Prim's
algorithm can be reduced to O(E log V) with the help of a binary heap. Please see Prim's MST for Adjacency List Representation for more details.
Initially, the key value of the first vertex is 0 and INF (infinite) for all other vertices. So vertex 0 is extracted from Min Heap and the key values of the vertices
adjacent to 0 (1 and 7) are updated. Min Heap contains all vertices except vertex 0.
The vertices in green are the vertices included in the MST.
Since the key value of vertex 1 is the minimum among all nodes in Min Heap, it is extracted from Min Heap and the key values of the vertices adjacent to 1 are
updated (a key is updated if the vertex is still in Min Heap and its previous key value is greater than the weight of the edge from 1 to that vertex). Min
Heap contains all vertices except vertices 0 and 1.
Since the key value of vertex 7 is the minimum among all nodes in Min Heap, it is extracted from Min Heap and the key values of the vertices adjacent to 7 are
updated (a key is updated if the vertex is still in Min Heap and its previous key value is greater than the weight of the edge from 7 to that vertex). Min
Heap contains all vertices except vertices 0, 1 and 7.
Since the key value of vertex 6 is the minimum among all nodes in Min Heap, it is extracted from Min Heap and the key values of the vertices adjacent to 6 are
updated (a key is updated if the vertex is still in Min Heap and its previous key value is greater than the weight of the edge from 6 to that vertex). Min
Heap contains all vertices except vertices 0, 1, 7 and 6.
The above steps are repeated for the rest of the nodes in Min Heap until Min Heap becomes empty.
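A heap-based variant can be sketched in Python 3 with the standard heapq module. Note the swapped-in technique: heapq has no decreaseKey() operation, so instead of updating a key in place this sketch pushes a new entry and skips stale entries when they are popped ("lazy deletion"). The name prim_heap and the -1 "no parent" sentinel are assumptions made for this illustration.

```python
import heapq


def prim_heap(num_vertices, edges, start=0):
    """edges: list of (u, v, weight). Returns (total weight, parent list)."""
    adj = [[] for _ in range(num_vertices)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))        # undirected graph

    in_mst = [False] * num_vertices
    parent = [-1] * num_vertices     # -1 marks the root
    total = 0
    heap = [(0, start, -1)]          # (key, vertex, parent of vertex)
    while heap:
        key, u, p = heapq.heappop(heap)
        if in_mst[u]:
            continue                 # stale entry: u was added with a smaller key
        in_mst[u] = True
        parent[u] = p
        total += key
        for v, w in adj[u]:
            if not in_mst[v]:
                # Stands in for decreaseKey(): push a candidate edge into v
                heapq.heappush(heap, (w, v, u))
    return total, parent


edges = [(0, 1, 4), (0, 7, 8), (1, 2, 8), (1, 7, 11), (2, 3, 7),
         (2, 8, 2), (2, 5, 4), (3, 4, 9), (3, 5, 14), (4, 5, 10),
         (5, 6, 2), (6, 7, 1), (6, 8, 6), (7, 8, 7)]
total, parent = prim_heap(9, edges)
print("total weight:", total)
```

The heap can hold up to O(E) entries in this variant, so the bound is O(E log E), which is the same as O(E log V) since E is at most V^2.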
// C / C++ program for Prim's MST for adjacency list representation of graph
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
#include <stdbool.h> // bool, true and false when compiled as C
// A structure to represent a node in adjacency list
struct AdjListNode
{
int dest;
int weight;
struct AdjListNode* next;
};
// A structure to represent an adjacency list
struct AdjList
{
struct AdjListNode *head; // pointer to head node of list
};
// A structure to represent a graph. A graph is an array of adjacency lists.
// Size of array will be V (number of vertices in graph)
struct Graph
{
int V;
struct AdjList* array;
};
// A utility function to create a new adjacency list node
struct AdjListNode* newAdjListNode(int dest, int weight)
{
struct AdjListNode* newNode =
(struct AdjListNode*) malloc(sizeof(struct AdjListNode));
newNode->dest = dest;
newNode->weight = weight;
newNode->next = NULL;
return newNode;
}
// A utility function that creates a graph of V vertices
struct Graph* createGraph(int V)
{
struct Graph* graph = (struct Graph*) malloc(sizeof(struct Graph));
graph->V = V;
// Create an array of adjacency lists. Size of array will be V
graph->array = (struct AdjList*) malloc(V * sizeof(struct AdjList));
// Initialize each adjacency list as empty by making head as NULL
for (int i = 0; i < V; ++i)
graph->array[i].head = NULL;
return graph;
}
// Adds an edge to an undirected graph
void addEdge(struct Graph* graph, int src, int dest, int weight)
{
// Add an edge from src to dest. A new node is added to the adjacency
// list of src. The node is added at the beginning
struct AdjListNode* newNode = newAdjListNode(dest, weight);
newNode->next = graph->array[src].head;
graph->array[src].head = newNode;
// Since graph is undirected, add an edge from dest to src also
newNode = newAdjListNode(src, weight);
newNode->next = graph->array[dest].head;
graph->array[dest].head = newNode;
}
// Structure to represent a min heap node
struct MinHeapNode
{
int v;
int key;
};
// Structure to represent a min heap
struct MinHeap
{
int size; // Number of heap nodes present currently
int capacity; // Capacity of min heap
int *pos; // This is needed for decreaseKey()
struct MinHeapNode **array;
};
// A utility function to create a new Min Heap Node
struct MinHeapNode* newMinHeapNode(int v, int key)
{
struct MinHeapNode* minHeapNode =
(struct MinHeapNode*) malloc(sizeof(struct MinHeapNode));
minHeapNode->v = v;
minHeapNode->key = key;
return minHeapNode;
}
// A utility function to create a Min Heap
struct MinHeap* createMinHeap(int capacity)
{
struct MinHeap* minHeap =
(struct MinHeap*) malloc(sizeof(struct MinHeap));
minHeap->pos = (int *)malloc(capacity * sizeof(int));
minHeap->size = 0;
minHeap->capacity = capacity;
minHeap->array =
(struct MinHeapNode**) malloc(capacity * sizeof(struct MinHeapNode*));
return minHeap;
}
// A utility function to swap two nodes of min heap. Needed for min heapify
void swapMinHeapNode(struct MinHeapNode** a, struct MinHeapNode** b)
{
struct MinHeapNode* t = *a;
*a = *b;
*b = t;
}
// A standard function to heapify at given idx
// This function also updates position of nodes when they are swapped.
// Position is needed for decreaseKey()
void minHeapify(struct MinHeap* minHeap, int idx)
{
int smallest, left, right;
smallest = idx;
left = 2 * idx + 1;
right = 2 * idx + 2;
if (left < minHeap->size &&
minHeap->array[left]->key < minHeap->array[smallest]->key )
smallest = left;
if (right < minHeap->size &&
minHeap->array[right]->key < minHeap->array[smallest]->key )
smallest = right;
if (smallest != idx)
{
// The nodes to be swapped in min heap
struct MinHeapNode *smallestNode = minHeap->array[smallest];
struct MinHeapNode *idxNode = minHeap->array[idx];
// Swap positions
minHeap->pos[smallestNode->v] = idx;
minHeap->pos[idxNode->v] = smallest;
// Swap nodes
swapMinHeapNode(&minHeap->array[smallest], &minHeap->array[idx]);
minHeapify(minHeap, smallest);
}
}
// A utility function to check if the given minHeap is empty or not
int isEmpty(struct MinHeap* minHeap)
{
return minHeap->size == 0;
}
// Standard function to extract minimum node from heap
struct MinHeapNode* extractMin(struct MinHeap* minHeap)
{
if (isEmpty(minHeap))
return NULL;
// Store the root node
struct MinHeapNode* root = minHeap->array[0];
// Replace root node with last node
struct MinHeapNode* lastNode = minHeap->array[minHeap->size - 1];
minHeap->array[0] = lastNode;
// Update position of last node
minHeap->pos[root->v] = minHeap->size-1;
minHeap->pos[lastNode->v] = 0;
// Reduce heap size and heapify root
--minHeap->size;
minHeapify(minHeap, 0);
return root;
}
// Function to decrease key value of a given vertex v. This function
// uses pos[] of min heap to get the current index of node in min heap
void decreaseKey(struct MinHeap* minHeap, int v, int key)
{
// Get the index of v in heap array
int i = minHeap->pos[v];
// Get the node and update its key value
minHeap->array[i]->key = key;
// Travel up while the complete tree is not heapified.
// This is an O(Logn) loop
while (i && minHeap->array[i]->key < minHeap->array[(i - 1) / 2]->key)
{
// Swap this node with its parent
minHeap->pos[minHeap->array[i]->v] = (i-1)/2;
minHeap->pos[minHeap->array[(i-1)/2]->v] = i;
swapMinHeapNode(&minHeap->array[i], &minHeap->array[(i - 1) / 2]);
// move to parent index
i = (i - 1) / 2;
}
}
// A utility function to check if a given vertex
// 'v' is in min heap or not
bool isInMinHeap(struct MinHeap *minHeap, int v)
{
if (minHeap->pos[v] < minHeap->size)
return true;
return false;
}
// A utility function used to print the constructed MST
void printArr(int arr[], int n)
{
for (int i = 1; i < n; ++i)
printf("%d - %d\n", arr[i], i);
}
// The main function that constructs Minimum Spanning Tree (MST)
// using Prim's algorithm
}
pCrawl = pCrawl->next;
}
}
// print edges of MST
printArr(parent, V);
}
// Driver program to test above functions
int main()
{
// Let us create the graph given in above figure
int V = 9;
struct Graph* graph = createGraph(V);
addEdge(graph, 0, 1, 4);
addEdge(graph, 0, 7, 8);
addEdge(graph, 1, 2, 8);
addEdge(graph, 1, 7, 11);
addEdge(graph, 2, 3, 7);
addEdge(graph, 2, 8, 2);
addEdge(graph, 2, 5, 4);
addEdge(graph, 3, 4, 9);
addEdge(graph, 3, 5, 14);
addEdge(graph, 4, 5, 10);
addEdge(graph, 5, 6, 2);
addEdge(graph, 6, 7, 1);
addEdge(graph, 6, 8, 6);
addEdge(graph, 7, 8, 7);
PrimMST(graph);
return 0;
}
Output:
0 - 1
5 - 2
2 - 3
3 - 4
6 - 5
7 - 6
0 - 7
2 - 8
Time Complexity: The time complexity of the above code/algorithm looks like O(V^2), as there are two nested while loops. If we take a closer look,
we can observe that the statements in the inner loop are executed O(V+E) times (similar to BFS). The inner loop has a decreaseKey() operation, which
takes O(log V) time. So the overall time complexity is O(E+V) * O(log V), which is O((E+V) log V) = O(E log V) (for a connected graph, V =
O(E)).
References:
Introduction to Algorithms by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein
http://en.wikipedia.org/wiki/Prims_algorithm
The set sptSet is initially empty and the distances assigned to vertices are {0, INF, INF, INF, INF, INF, INF, INF, INF}, where INF indicates infinite.
Now pick the vertex with the minimum distance value. Vertex 0 is picked and included in sptSet, so sptSet becomes {0}. After including 0 in
sptSet, update the distance values of its adjacent vertices. The adjacent vertices of 0 are 1 and 7. The distance values of 1 and 7 are updated to 4 and 8.
The following subgraph shows vertices and their distance values; only the vertices with finite distance values are shown. The vertices included in the SPT
are shown in green.
Pick the vertex with the minimum distance value that is not already included in the SPT (not in sptSet). Vertex 1 is picked and added to sptSet. So
sptSet now becomes {0, 1}. Update the distance values of the adjacent vertices of 1. The distance value of vertex 2 becomes 12.
Pick the vertex with the minimum distance value that is not already included in the SPT (not in sptSet). Vertex 7 is picked. So sptSet now becomes {0, 1,
7}. Update the distance values of the adjacent vertices of 7. The distance values of vertices 6 and 8 become finite (9 and 15 respectively).
Pick the vertex with the minimum distance value that is not already included in the SPT (not in sptSet). Vertex 6 is picked. So sptSet now becomes {0, 1,
7, 6}. Update the distance values of the adjacent vertices of 6. The distance value of vertex 5 becomes 11; vertex 8 stays at 15, because the path through 6 also costs 9 + 6 = 15.
We repeat the above steps until sptSet includes all vertices of the given graph. Finally, we get the following Shortest Path Tree (SPT).
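The steps above can be traced with a short Python 3 sketch of the O(V^2) adjacency-matrix version. It is an illustration under assumptions, not the article's code: dijkstra_matrix and build_matrix are hypothetical names, and the matrix is built from the edge list of the 9-vertex example graph so that it is guaranteed to be symmetric.

```python
INF = float("inf")


def build_matrix(n, edges):
    """Symmetric adjacency matrix; 0 means 'no edge'."""
    g = [[0] * n for _ in range(n)]
    for u, v, w in edges:
        g[u][v] = w
        g[v][u] = w
    return g


def dijkstra_matrix(graph, src):
    n = len(graph)
    dist = [INF] * n
    done = [False] * n       # plays the role of sptSet
    dist[src] = 0
    for _ in range(n):
        # Pick the unfinished vertex with the minimum distance value
        u = min((v for v in range(n) if not done[v]), key=lambda v: dist[v])
        if dist[u] == INF:
            break            # remaining vertices are unreachable
        done[u] = True
        # Relax every edge leaving u
        for v in range(n):
            w = graph[u][v]
            if w and not done[v] and dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist


edges = [(0, 1, 4), (0, 7, 8), (1, 2, 8), (1, 7, 11), (2, 3, 7),
         (2, 8, 2), (2, 5, 4), (3, 4, 9), (3, 5, 14), (4, 5, 10),
         (5, 6, 2), (6, 7, 1), (6, 8, 6), (7, 8, 7)]
dist = dijkstra_matrix(build_matrix(9, edges), 0)
print("Vertex   Distance from Source")
for v, d in enumerate(dist):
    print(v, "\t", d)
```

The intermediate values match the walkthrough: vertex 2 first receives distance 12, vertex 6 receives 9, and vertex 8 is later improved from 15 to 14 once vertex 2 is processed.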
C/C++
// A C / C++ program for Dijkstra's single source shortest path algorithm.
// The program is for adjacency matrix representation of the graph
#include <stdio.h>
#include <limits.h>
#include <stdbool.h> // bool, true and false when compiled as C
// Number of vertices in the graph
#define V 9
// A utility function to find the vertex with minimum distance value, from
// the set of vertices not yet included in shortest path tree
int minDistance(int dist[], bool sptSet[])
{
// Initialize min value
int min = INT_MAX, min_index;
for (int v = 0; v < V; v++)
if (sptSet[v] == false && dist[v] <= min)
min = dist[v], min_index = v;
return min_index;
}
// A utility function to print the constructed distance array
void printSolution(int dist[], int n)
{
printf("Vertex Distance from Source\n");
for (int i = 0; i < V; i++)
printf("%d \t\t %d\n", i, dist[i]);
}
// Function that implements Dijkstra's single source shortest path algorithm
// for a graph represented using adjacency matrix representation
void dijkstra(int graph[V][V], int src)
{
int dist[V]; // The output array. dist[i] will hold the shortest
// distance from src to i
bool sptSet[V]; // sptSet[i] will be true if vertex i is included in shortest
// path tree or shortest distance from src to i is finalized
// Initialize all distances as INFINITE and sptSet[] as false
for (int i = 0; i < V; i++)
dist[i] = INT_MAX, sptSet[i] = false;
// Distance of source vertex from itself is always 0
dist[src] = 0;
// Find shortest path for all vertices
for (int count = 0; count < V-1; count++)
{
// Pick the minimum distance vertex from the set of vertices not
}
// print the constructed distance array
printSolution(dist, V);
}
// driver program to test above function
int main()
{
/* Let us create the example graph discussed above */
int graph[V][V] = {{0, 4, 0, 0, 0, 0, 0, 8, 0},
{4, 0, 8, 0, 0, 0, 0, 11, 0},
{0, 8, 0, 7, 0, 4, 0, 0, 2},
{0, 0, 7, 0, 9, 14, 0, 0, 0},
{0, 0, 0, 9, 0, 10, 0, 0, 0},
{0, 0, 4, 0, 10, 0, 2, 0, 0},
{0, 0, 0, 14, 0, 2, 0, 1, 6},
{8, 11, 0, 0, 0, 0, 1, 0, 7},
{0, 0, 2, 0, 0, 0, 6, 7, 0}
};
dijkstra(graph, 0);
return 0;
}
Java
// A Java program for Dijkstra's single source shortest path algorithm.
// The program is for adjacency matrix representation of the graph
import java.util.*;
import java.lang.*;
import java.io.*;
class ShortestPath
{
// A utility function to find the vertex with minimum distance value,
// from the set of vertices not yet included in shortest path tree
static final int V=9;
int minDistance(int dist[], Boolean sptSet[])
{
// Initialize min value
int min = Integer.MAX_VALUE, min_index=-1;
for (int v = 0; v < V; v++)
if (sptSet[v] == false && dist[v] <= min)
{
min = dist[v];
min_index = v;
}
return min_index;
}
// A utility function to print the constructed distance array
void printSolution(int dist[], int n)
{
System.out.println("Vertex Distance from Source");
for (int i = 0; i < V; i++)
System.out.println(i+" \t\t "+dist[i]);
}
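// Function that implements Dijkstra's single source shortest path
// algorithm for a graph represented using adjacency matrix
// representation
void dijkstra(int graph[][], int src)
{
    int dist[] = new int[V]; // The output array. dist[i] will hold
                             // the shortest distance from src to i
    // sptSet[i] will be true if vertex i is included in shortest
    // path tree or shortest distance from src to i is finalized
    Boolean sptSet[] = new Boolean[V];
    // Initialize all distances as INFINITE and sptSet[] as false
    for (int i = 0; i < V; i++)
    {
        dist[i] = Integer.MAX_VALUE;
        sptSet[i] = false;
    }
    // Distance of source vertex from itself is always 0
    dist[src] = 0;
    // Find shortest path for all vertices
    for (int count = 0; count < V-1; count++)
    {
        // Pick the minimum distance vertex from the set of vertices
        // not yet processed
        int u = minDistance(dist, sptSet);
        // Mark the picked vertex as processed
        sptSet[u] = true;
        // Update dist value of the adjacent vertices of the picked vertex
        for (int v = 0; v < V; v++)
            if (!sptSet[v] && graph[u][v] != 0 &&
                dist[u] != Integer.MAX_VALUE &&
                dist[u] + graph[u][v] < dist[v])
                dist[v] = dist[u] + graph[u][v];
    }
    // print the constructed distance array
    printSolution(dist, V);
}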
// Driver method
public static void main (String[] args)
{
/* Let us create the example graph discussed above */
int graph[][] = new int[][]{{0, 4, 0, 0, 0, 0, 0, 8, 0},
{4, 0, 8, 0, 0, 0, 0, 11, 0},
{0, 8, 0, 7, 0, 4, 0, 0, 2},
{0, 0, 7, 0, 9, 14, 0, 0, 0},
{0, 0, 0, 9, 0, 10, 0, 0, 0},
{0, 0, 4, 0, 10, 0, 2, 0, 0},
{0, 0, 0, 14, 0, 2, 0, 1, 6},
{8, 11, 0, 0, 0, 0, 1, 0, 7},
{0, 0, 2, 0, 0, 0, 6, 7, 0}
};
ShortestPath t = new ShortestPath();
t.dijkstra(graph, 0);
}
}
//This code is contributed by Aakash Hasija
Output:
Vertex   Distance from Source
0        0
1        4
2        12
3        19
4        21
5        11
6        9
7        8
8        14
Notes:
1) The code calculates the shortest distances, but doesn't calculate the path information. We can create a parent array, update the parent array when
a distance is updated (as in Prim's implementation), and use it to show the shortest path from the source to different vertices.
2) The code is for an undirected graph; the same dijkstra function can be used for directed graphs also.
3) The code finds shortest distances from the source to all vertices. If we are interested only in the shortest distance from the source to a single target, we can
break the for loop when the picked minimum distance vertex is equal to the target (Step 3.a of the algorithm).
4) Time Complexity of the implementation is O(V^2). If the input graph is represented using an adjacency list, it can be reduced to O(E log V) with
the help of a binary heap. Please see
Dijkstra's Algorithm for Adjacency List Representation for more details.
5) Dijkstra's algorithm doesn't work for graphs with negative weight edges. For graphs with negative weight edges, the Bellman-Ford algorithm can be
used; we will soon be discussing it as a separate post.
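Note 1 can be illustrated with a short Python sketch. It is not the original listing: the names dijkstra_with_paths and path_to are hypothetical, and the graph is the 9-vertex adjacency matrix used in the example above (0 means "no edge").

```python
INF = float("inf")

def dijkstra_with_paths(graph, src):
    """Adjacency-matrix Dijkstra that also records a parent array."""
    n = len(graph)
    dist = [INF] * n
    parent = [-1] * n        # parent[v] = previous vertex on the shortest path to v
    done = [False] * n       # plays the role of sptSet
    dist[src] = 0
    for _ in range(n):
        # Pick the unprocessed vertex with the smallest tentative distance
        u = min((v for v in range(n) if not done[v]), key=lambda v: dist[v])
        if dist[u] == INF:
            break            # the remaining vertices are unreachable
        done[u] = True
        for v in range(n):
            w = graph[u][v]
            if w and not done[v] and dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                parent[v] = u    # remember where the improvement came from
    return dist, parent

def path_to(parent, target):
    """Follow parent[] back from target to the source, then reverse."""
    path = []
    while target != -1:
        path.append(target)
        target = parent[target]
    return path[::-1]

graph = [[0, 4, 0, 0, 0, 0, 0, 8, 0],
         [4, 0, 8, 0, 0, 0, 0, 11, 0],
         [0, 8, 0, 7, 0, 4, 0, 0, 2],
         [0, 0, 7, 0, 9, 14, 0, 0, 0],
         [0, 0, 0, 9, 0, 10, 0, 0, 0],
         [0, 0, 4, 0, 10, 0, 2, 0, 0],
         [0, 0, 0, 14, 0, 2, 0, 1, 6],
         [8, 11, 0, 0, 0, 0, 1, 0, 7],
         [0, 0, 2, 0, 0, 0, 6, 7, 0]]

dist, parent = dijkstra_with_paths(graph, 0)
print(dist)                 # [0, 4, 12, 19, 21, 11, 9, 8, 14]
print(path_to(parent, 4))   # [0, 7, 6, 5, 4]
```

The parent array costs only O(V) extra space, and each path is recovered in time proportional to its length.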
Dijkstra's Algorithm for Adjacency List Representation
Initially, the distance value of the source vertex is 0 and INF (infinite) for all other vertices. So the source vertex is extracted from the Min Heap and the distance
values of vertices adjacent to 0 (1 and 7) are updated. The Min Heap now contains all vertices except vertex 0.
The vertices in green color are the vertices for which minimum distances are finalized and which are no longer in the Min Heap.
Since the distance value of vertex 1 is minimum among all nodes in the Min Heap, it is extracted from the Min Heap and the distance values of vertices adjacent
to 1 are updated (a distance is updated if the vertex is still in the Min Heap and the distance through 1 is shorter than the previous distance). The Min Heap
now contains all vertices except vertices 0 and 1.
Pick the vertex with minimum distance value from the min heap. Vertex 7 is picked. So the min heap now contains all vertices except 0, 1 and 7. Update
the distance values of adjacent vertices of 7. The distance values of vertices 6 and 8 become finite (9 and 15 respectively).
Pick the vertex with minimum distance from the min heap. Vertex 6 is picked. So the min heap now contains all vertices except 0, 1, 7 and 6. Update the
distance values of adjacent vertices of 6. The distance values of vertices 5 and 8 are updated.
The above steps are repeated until the min heap becomes empty. Finally, we get the following shortest path tree.
{
// The nodes to be swapped in min heap
MinHeapNode *smallestNode = minHeap->array[smallest];
MinHeapNode *idxNode = minHeap->array[idx];
// Swap positions
minHeap->pos[smallestNode->v] = idx;
minHeap->pos[idxNode->v] = smallest;
// Swap nodes
swapMinHeapNode(&minHeap->array[smallest], &minHeap->array[idx]);
minHeapify(minHeap, smallest);
}
}
// A utility function to check if the given minHeap is empty or not
int isEmpty(struct MinHeap* minHeap)
{
return minHeap->size == 0;
}
// Standard function to extract minimum node from heap
struct MinHeapNode* extractMin(struct MinHeap* minHeap)
{
if (isEmpty(minHeap))
return NULL;
// Store the root node
struct MinHeapNode* root = minHeap->array[0];
// Replace root node with last node
struct MinHeapNode* lastNode = minHeap->array[minHeap->size - 1];
minHeap->array[0] = lastNode;
// Update position of last node
minHeap->pos[root->v] = minHeap->size-1;
minHeap->pos[lastNode->v] = 0;
// Reduce heap size and heapify root
--minHeap->size;
minHeapify(minHeap, 0);
return root;
}
// Function to decrease dist value of a given vertex v. This function
// uses pos[] of min heap to get the current index of node in min heap
void decreaseKey(struct MinHeap* minHeap, int v, int dist)
{
// Get the index of v in heap array
int i = minHeap->pos[v];
// Get the node and update its dist value
minHeap->array[i]->dist = dist;
// Travel up while the complete tree is not heapified.
// This is a O(Logn) loop
while (i && minHeap->array[i]->dist < minHeap->array[(i - 1) / 2]->dist)
{
// Swap this node with its parent
minHeap->pos[minHeap->array[i]->v] = (i-1)/2;
minHeap->pos[minHeap->array[(i-1)/2]->v] = i;
swapMinHeapNode(&minHeap->array[i], &minHeap->array[(i - 1) / 2]);
// move to parent index
i = (i - 1) / 2;
}
}
// A utility function to check if a given vertex
// 'v' is in min heap or not
bool isInMinHeap(struct MinHeap *minHeap, int v)
{
if (minHeap->pos[v] < minHeap->size)
return true;
return false;
}
// A utility function used to print the solution
void printArr(int dist[], int n)
{
printf("Vertex Distance from Source\n");
for (int i = 0; i < n; ++i)
printf("%d \t\t %d\n", i, dist[i]);
}
// The main function that calculates distances of shortest paths from src to all
// vertices. It is a O(ELogV) function
void dijkstra(struct Graph* graph, int src)
{
int V = graph->V;// Get the number of vertices in graph
int dist[V];
// dist values used to pick minimum weight edge in cut
// minHeap represents set E
struct MinHeap* minHeap = createMinHeap(V);
// Initialize min heap with all vertices. dist value of all vertices
for (int v = 0; v < V; ++v)
{
dist[v] = INT_MAX;
minHeap->array[v] = newMinHeapNode(v, dist[v]);
minHeap->pos[v] = v;
}
// Make dist value of src vertex as 0 so that it is extracted first
minHeap->array[src] = newMinHeapNode(src, dist[src]);
minHeap->pos[src] = src;
dist[src] = 0;
decreaseKey(minHeap, src, dist[src]);
// Initially size of min heap is equal to V
minHeap->size = V;
// In the following loop, min heap contains all nodes
// whose shortest distance is not yet finalized.
while (!isEmpty(minHeap))
{
// Extract the vertex with minimum distance value
struct MinHeapNode* minHeapNode = extractMin(minHeap);
int u = minHeapNode->v; // Store the extracted vertex number
// Traverse through all adjacent vertices of u (the extracted
// vertex) and update their distance values
struct AdjListNode* pCrawl = graph->array[u].head;
while (pCrawl != NULL)
{
int v = pCrawl->dest;
// If shortest distance to v is not finalized yet, and distance to v
// through u is less than its previously calculated distance
if (isInMinHeap(minHeap, v) && dist[u] != INT_MAX &&
pCrawl->weight + dist[u] < dist[v])
{
dist[v] = dist[u] + pCrawl->weight;
// update distance value in min heap also
decreaseKey(minHeap, v, dist[v]);
}
pCrawl = pCrawl->next;
}
}
// print the calculated shortest distances
printArr(dist, V);
}
// Driver program to test above functions
int main()
{
// create the graph given in above figure
int V = 9;
struct Graph* graph = createGraph(V);
addEdge(graph, 0, 1, 4);
addEdge(graph, 0, 7, 8);
addEdge(graph, 1, 2, 8);
addEdge(graph, 1, 7, 11);
addEdge(graph, 2, 3, 7);
addEdge(graph, 2, 8, 2);
addEdge(graph, 2, 5, 4);
addEdge(graph, 3, 4, 9);
addEdge(graph, 3, 5, 14);
addEdge(graph, 4, 5, 10);
addEdge(graph, 5, 6, 2);
addEdge(graph, 6, 7, 1);
addEdge(graph, 6, 8, 6);
addEdge(graph, 7, 8, 7);
dijkstra(graph, 0);
return 0;
}
Output:
Vertex   Distance from Source
0        0
1        4
2        12
3        19
4        21
5        11
6        9
7        8
8        14
Time Complexity: The time complexity of the above code/algorithm looks like O(V^2) as there are two nested while loops. If we take a closer look,
we can observe that the statements in the inner loop are executed O(V+E) times (similar to BFS). The inner loop has a decreaseKey() operation which
takes O(LogV) time. So the overall time complexity is O(E+V)*O(LogV), which is O((E+V)*LogV) = O(ELogV) for a connected graph.
Note that the above code uses a Binary Heap for the Priority Queue implementation. Time complexity can be reduced to O(E + VLogV) using a
Fibonacci Heap. The reason is that a Fibonacci Heap takes O(1) amortized time for the decrease-key operation while a Binary Heap takes O(LogV) time.
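For comparison, here is a compact Python sketch of the heap-based approach. It is a variation, not the code above: Python's heapq module has no decrease-key operation, so this sketch pushes a new entry on every improvement and skips stale entries when they are popped ("lazy deletion"). The names dijkstra_heap and adj are hypothetical; the edge list is the example graph.

```python
import heapq

def dijkstra_heap(adj, src):
    """adj maps each vertex to a list of (neighbor, weight) pairs."""
    dist = {v: float("inf") for v in adj}
    dist[src] = 0
    heap = [(0, src)]                 # entries are (tentative distance, vertex)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                  # stale entry: u was already settled with a smaller distance
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))   # push instead of decreaseKey
    return dist

edges = [(0, 1, 4), (0, 7, 8), (1, 2, 8), (1, 7, 11), (2, 3, 7),
         (2, 8, 2), (2, 5, 4), (3, 4, 9), (3, 5, 14), (4, 5, 10),
         (5, 6, 2), (6, 7, 1), (6, 8, 6), (7, 8, 7)]
adj = {v: [] for v in range(9)}
for u, v, w in edges:                 # undirected graph: add both directions
    adj[u].append((v, w))
    adj[v].append((u, w))

dist = dijkstra_heap(adj, 0)
print([dist[v] for v in range(9)])    # [0, 4, 12, 19, 21, 11, 9, 8, 14]
```

With lazy deletion the heap can hold up to O(E) entries, so the bound is O(E log E), which is still O(E log V) because E is at most V^2.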
Notes:
1) The code calculates the shortest distances, but doesn't calculate the path information. We can create a parent array, update the parent array when
a distance is updated (as in Prim's implementation), and use it to show the shortest path from the source to different vertices.
2) The code is for an undirected graph; the same dijkstra function can be used for directed graphs also.
3) The code finds shortest distances from the source to all vertices. If we are interested only in the shortest distance from the source to a single target, we can
break the while loop when the extracted minimum distance vertex is equal to the target (Step 3.a of the algorithm).
4) Dijkstra's algorithm doesn't work for graphs with negative weight edges. For graphs with negative weight edges, the Bellman-Ford algorithm can be
used; we will soon be discussing it as a separate post.
References:
Introduction to Algorithms by Clifford Stein, Thomas H. Cormen, Charles E. Leiserson, Ronald L.
Algorithms by Sanjoy Dasgupta, Christos Papadimitriou, Umesh Vazirani
A Simple Solution is to generate all subsets of given set of jobs and check individual subset for feasibility of jobs in that subset. Keep track of
maximum profit among all feasible subsets. The time complexity of this solution is exponential.
This is a standard Greedy Algorithm problem. Following is the algorithm.
1) Sort all jobs in decreasing order of profit.
2) Initialize the result sequence as the first job in the sorted jobs.
3) Do the following for the remaining n-1 jobs:
   a) If the current job can fit in the current result sequence
      without missing the deadline, add the current job to the result.
      Else ignore the current job.
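The steps above can be sketched in Python as follows. This is a minimal illustration, not the original listing: the function name job_sequencing and the five sample jobs (id, deadline, profit) are assumptions, and every job is assumed to take one unit of time.

```python
def job_sequencing(jobs):
    """jobs: list of (job_id, deadline, profit); each job takes one time slot."""
    # Step 1: consider the most profitable jobs first
    ordered = sorted(jobs, key=lambda job: job[2], reverse=True)
    n = len(ordered)
    slots = [None] * n                    # slots[t] = job scheduled in time slot t
    for job_id, deadline, _profit in ordered:
        # Try the latest free slot that still meets the deadline
        for t in range(min(n, deadline) - 1, -1, -1):
            if slots[t] is None:
                slots[t] = job_id
                break                     # placed; if no slot is free the job is ignored
    return [job_id for job_id in slots if job_id is not None]

# Hypothetical sample data: (id, deadline, profit)
jobs = [("a", 2, 100), ("b", 1, 19), ("c", 2, 27), ("d", 1, 25), ("e", 3, 15)]
print("Following is maximum profit sequence of jobs")
print(" ".join(job_sequencing(jobs)))     # c a e
```

Placing each job in the latest free slot before its deadline keeps earlier slots available for jobs with tighter deadlines.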
Output:
Following is maximum profit sequence of jobs
c a e
Time Complexity of the above solution is O(n^2). After the initial sort, the slot search can be optimized to almost O(n) by using a union-find data structure. We will soon be discussing
the optimized solution.
Sources:
http://ocw.mit.edu/courses/civil-and-environmental-engineering/1-204-computer-algorithms-in-systems-engineering-spring-2010/lecturenotes/MIT1_204S10_lec10.pdf
Output:
Following is minimal number of change for 93 is 50 20 20 2 1
Note that the above approach may not work for all denominations. For example, it doesn't work for denominations {9, 6, 5, 1} and V = 11. The
above approach would print 9, 1 and 1. But we can use just 2 coins, of denominations 5 and 6.
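Both observations can be checked with a small Python sketch. The function names are hypothetical, and the denomination list used for 93 is an assumption chosen to reproduce the output quoted above; min_coins is a simple tabulation included only to confirm the optimum for the counterexample.

```python
def greedy_change(denominations, value):
    """Repeatedly take the largest coin that still fits."""
    coins = []
    for d in sorted(denominations, reverse=True):
        while value >= d:
            value -= d
            coins.append(d)
    return coins

def min_coins(denominations, value):
    """best[v] = fewest coins that sum to v (infinity if impossible)."""
    best = [0] + [float("inf")] * value
    for v in range(1, value + 1):
        for d in denominations:
            if d <= v and best[v - d] + 1 < best[v]:
                best[v] = best[v - d] + 1
    return best[value]

# Assumed denomination set for the first example
print(greedy_change([1, 2, 5, 10, 20, 50, 100, 500, 1000], 93))  # [50, 20, 20, 2, 1]
print(greedy_change([9, 6, 5, 1], 11))                           # [9, 1, 1]
print(min_coins([9, 6, 5, 1], 11))                               # 2
```

The greedy choice commits to the coin 9 and can never recover, whereas the table considers every denomination for every intermediate value.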
For general input, we use below dynamic programming approach.
Find minimum number of coins that make a given value
Thanks to Utkarsh for providing above solution here.
No polynomial time solution is known for this problem, as it is a known NP-Hard problem. There is a polynomial time Greedy
approximate algorithm; the greedy algorithm provides a solution which is never worse than twice the optimal solution. The greedy solution works
only if the distances between cities follow the Triangle Inequality (the distance between two points is never larger than the sum of distances through a
third point).
The 2-Approximate Greedy Algorithm:
1) Choose the first center arbitrarily.
2) Choose remaining k-1 centers using the following criteria.
Let c1, c2, c3, ..., ci be the already chosen centers. Choose
(i+1)th center by picking the city which is farthest from already
selected centers, i.e., the point p which has following value as maximum
Min[dist(p, c1), dist(p, c2), dist(p, c3), ..., dist(p, ci)]
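The farthest-first rule above can be sketched in Python as follows. This is an illustration under stated assumptions: cities are points in the plane with Euclidean distance (which satisfies the triangle inequality), the first center is simply the point at index 0, k is at most the number of points, and the name greedy_k_centers and the sample points are hypothetical.

```python
import math

def greedy_k_centers(points, k):
    """Return (indices of chosen centers, largest distance of any point to its nearest center)."""
    centers = [0]                                  # step 1: arbitrary first center
    # nearest[p] = distance from point p to its closest chosen center so far
    nearest = [math.dist(points[0], p) for p in points]
    while len(centers) < k:
        # step 2: pick the point that maximizes Min[dist(p, c1), ..., dist(p, ci)]
        far = max(range(len(points)), key=lambda p: nearest[p])
        centers.append(far)
        nearest = [min(nearest[p], math.dist(points[far], points[p]))
                   for p in range(len(points))]
    return centers, max(nearest)

points = [(0, 0), (1, 0), (10, 0), (11, 0), (5, 8)]
centers, radius = greedy_k_centers(points, 3)
print(centers, radius)    # [0, 3, 4] 1.0
```

Keeping the nearest[] array up to date means each new center costs one pass over the points, so the whole selection takes O(nk) distance computations.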
The following diagram taken from here illustrates above algorithm.
b) This means that distances between all centers are also > 2OPT.
c) We have k + 1 points with distances > 2OPT between every pair.
d) Each point has a center of the optimal solution with distance <= OPT to it.
e) There exists a pair of points with the same center X in the optimal solution (pigeonhole principle: k optimal centers, k+1 points)
f) The distance between them is at most 2OPT (triangle inequality) which is a contradiction.
Source:
http://algo2.iti.kit.edu/vanstee/courses/kcenter.pdf
We can see that the function f(3) is being called 2 times. If we had stored the value of f(3), then instead of computing it again, we could
have reused the stored value. There are the following two different ways to store the values so that they can be reused.
a) Memoization (Top Down):
b) Tabulation (Bottom Up):
a) Memoization (Top Down): The memoized program for a problem is similar to the recursive version with a small modification that it looks into a
lookup table before computing solutions. We initialize a lookup array with all initial values as NIL. Whenever we need solution to a subproblem,
we first look into the lookup table. If the precomputed value is there then we return that value, otherwise we calculate the value and put the result in
lookup table so that it can be reused later.
Following is the memoized version for nth Fibonacci Number.
/* Memoized version for nth Fibonacci number */
#include<stdio.h>
#define NIL -1
#define MAX 100
int lookup[MAX];
/* Function to initialize NIL values in lookup table */
void _initialize()
{
int i;
for (i = 0; i < MAX; i++)
lookup[i] = NIL;
}
/* function for nth Fibonacci number */
int fib(int n)
{
    if (lookup[n] == NIL)
    {
        if (n <= 1)
            lookup[n] = n;
        else
            lookup[n] = fib(n-1) + fib(n-2);
    }
    return lookup[n];
}
int main ()
{
int n = 40;
_initialize();
printf("Fibonacci number is %d ", fib(n));
getchar();
return 0;
}
b) Tabulation (Bottom Up): The tabulated program for a given problem builds a table in bottom up fashion and returns the last entry from table.
/* tabulated version */
#include<stdio.h>
int fib(int n)
{
int f[n+2]; /* one extra entry so that f[1] is valid even when n is 0 */
int i;
f[0] = 0; f[1] = 1;
for (i = 2; i <= n; i++)
f[i] = f[i-1] + f[i-2];
return f[n];
}
int main ()
{
int n = 9;
printf("Fibonacci number is %d ", fib(n));
getchar();
return 0;
}
Both the tabulated and memoized versions store the solutions of subproblems. In the memoized version, the table is filled on demand, while in the tabulated version,
starting from the first entry, all entries are filled one by one. Unlike the tabulated version, the memoized version does not necessarily fill all entries of the lookup
table. For example, the memoized solution of the LCS problem doesn't necessarily fill all entries.
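The saving can be made concrete by counting function calls. The following Python sketch is only an illustration: the function names and the one-element list used as a call counter are assumptions, not part of the programs above.

```python
def fib_naive(n, counter):
    counter[0] += 1                      # count every call
    if n <= 1:
        return n
    return fib_naive(n - 1, counter) + fib_naive(n - 2, counter)

def fib_memo(n, counter, lookup=None):
    if lookup is None:
        lookup = {}                      # filled on demand (top down)
    counter[0] += 1
    if n not in lookup:
        if n <= 1:
            lookup[n] = n
        else:
            lookup[n] = fib_memo(n - 1, counter, lookup) + fib_memo(n - 2, counter, lookup)
    return lookup[n]

def fib_table(n):
    f = [0, 1] + [0] * max(0, n - 1)     # filled entry by entry (bottom up)
    for i in range(2, n + 1):
        f[i] = f[i - 1] + f[i - 2]
    return f[n]

naive_calls, memo_calls = [0], [0]
print(fib_naive(20, naive_calls), naive_calls[0])   # 6765 21891
print(fib_memo(20, memo_calls), memo_calls[0])      # 6765 39
print(fib_table(20))                                # 6765
```

The naive version makes a number of calls that grows exponentially with n, while the memoized version makes a number of calls that is linear in n.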
To see the optimization achieved by memoized and tabulated versions over the basic recursive version, see the time taken by following runs for
40th Fibonacci number.
Simple recursive program
Memoized version
tabulated version
Also see method 2 of Ugly Number post for one more simple example where we have overlapping subproblems and we store the results of
subproblems.
We will be covering Optimal Substructure Property and some more example problems in future posts on Dynamic Programming.
Try following questions as an exercise of this post.
1) Write a memoized version for LCS problem. Note that the tabular version is given in the CLRS book.
2) How would you choose between Memoization and Tabulation?
References:
http://www.youtube.com/watch?v=V5hZoJ6uK-s
C/C++
/* A Naive C/C++ recursive implementation of LIS problem */
#include<stdio.h>
#include<stdlib.h>
/* To make use of recursive calls, this function must return
two things:
1) Length of LIS ending with element arr[n-1]. We use
max_ending_here for this purpose
2) Overall maximum as the LIS may end with an element
before arr[n-1] max_ref is used this purpose.
The value of LIS of full array of size n is stored in
*max_ref which is our final result */
int _lis( int arr[], int n, int *max_ref)
{
/* Base case */
if (n == 1)
return 1;
// 'max_ending_here' is length of LIS ending with arr[n-1]
int res, max_ending_here = 1;
/* Recursively get all LIS ending with arr[0], arr[1] ...
arr[n-2]. If arr[i-1] is smaller than arr[n-1], and
max ending with arr[n-1] needs to be updated, then
update it */
for (int i = 1; i < n; i++)
{
res = _lis(arr, i, max_ref);
if (arr[i-1] < arr[n-1] && res + 1 > max_ending_here)
max_ending_here = res + 1;
}
// Compare max_ending_here with the overall max. And
// update the overall max if needed
if (*max_ref < max_ending_here)
*max_ref = max_ending_here;
// Return length of LIS ending with arr[n-1]
return max_ending_here;
}
// The wrapper function for _lis()
int lis(int arr[], int n)
{
// The max variable holds the result
int max = 1;
// The function _lis() stores its result in max
_lis( arr, n, &max );
// returns max
return max;
}
/* Driver program to test above function */
int main()
{
int arr[] = { 10, 22, 9, 33, 21, 50, 41, 60 };
int n = sizeof(arr)/sizeof(arr[0]);
printf("Length of LIS is %d\n", lis( arr, n ));
return 0;
}
Python
# A naive Python implementation of LIS problem
""" To make use of recursive calls, this function must return
two things:
1) Length of LIS ending with element arr[n-1]. We use
   maxEndingHere for this purpose
2) Overall maximum, as the LIS may end with an element
   before arr[n-1]. The global variable maximum is used
   for this purpose.
The value of LIS of full array of size n is stored in
maximum, which is our final result """
# global variable to store the maximum
maximum = 1

def _lis(arr, n):
    # to allow the access of global variable
    global maximum
    # Base Case
    if n == 1:
        return 1
    # maxEndingHere is the length of LIS ending with arr[n-1]
    maxEndingHere = 1
    """Recursively get all LIS ending with arr[0], arr[1]..arr[n-2]
    If arr[i-1] is smaller than arr[n-1], and max ending with
    arr[n-1] needs to be updated, then update it"""
    for i in range(1, n):
        res = _lis(arr, i)
        if arr[i-1] < arr[n-1] and res + 1 > maxEndingHere:
            maxEndingHere = res + 1
    # Compare maxEndingHere with overall maximum. And update
    # the overall maximum if needed
    maximum = max(maximum, maxEndingHere)
    return maxEndingHere

def lis(arr):
    # to allow the access of global variable
    global maximum
    # length of arr
    n = len(arr)
    # maximum variable holds the result
    maximum = 1
    # The function _lis() stores its result in maximum
    _lis(arr, n)
    return maximum

# Driver program to test the above function
arr = [10, 22, 9, 33, 21, 41, 60]
n = len(arr)
print("Length of LIS is", lis(arr))
# This code is contributed by NIKHIL KUMAR SINGH
              lis(4)
        /        |     \
      lis(3)    lis(2)   lis(1)
     /   \        /
   lis(2) lis(1) lis(1)
   /
lis(1)
We can see that there are many subproblems which are solved again and again. So this problem has the Overlapping Subproblems property, and
recomputation of the same subproblems can be avoided by using either Memoization or Tabulation. Following is a tabulated implementation for the
LIS problem.
C/C++
/* Dynamic Programming C/C++ implementation of LIS problem */
#include<stdio.h>
#include<stdlib.h>
/* lis() returns the length of the longest increasing
subsequence in arr[] of size n */
int lis( int arr[], int n )
{
int *lis, i, j, max = 0;
lis = (int*) malloc ( sizeof( int ) * n );
/* Initialize LIS values for all indexes */
for ( i = 0; i < n; i++ )
lis[i] = 1;
/* Compute optimized LIS values in bottom up manner */
for ( i = 1; i < n; i++ )
for ( j = 0; j < i; j++ )
if ( arr[i] > arr[j] && lis[i] < lis[j] + 1)
lis[i] = lis[j] + 1;
/* Pick maximum of all LIS values */
for ( i = 0; i < n; i++ )
if ( max < lis[i] )
max = lis[i];
/* Free memory to avoid memory leak */
free( lis );
return max;
}
/* Driver program to test above function */
int main()
{
int arr[] = { 10, 22, 9, 33, 21, 50, 41, 60 };
int n = sizeof(arr)/sizeof(arr[0]);
printf("Length of LIS is %d\n", lis( arr, n ) );
return 0;
}
Python
# Dynamic programming Python implementation of LIS problem
# lis returns length of the longest increasing subsequence
# in arr of size n
def lis(arr):
    n = len(arr)
    # Declare the list (array) for LIS and initialize LIS
    # values for all indexes
    lis = [1]*n
    # Compute optimized LIS values in bottom up manner
    for i in range(1, n):
        for j in range(0, i):
            if arr[i] > arr[j] and lis[i] < lis[j] + 1:
                lis[i] = lis[j] + 1
    # Initialize maximum to 0 to get the maximum of all
    # LIS values
    maximum = 0
    # Pick maximum of all LIS values
    for i in range(n):
        maximum = max(maximum, lis[i])
    return maximum

# Driver program to test the above function
arr = [10, 22, 9, 33, 21, 50, 41, 60]
print("Length of LIS is", lis(arr))
Length of LIS is 5
Note that the time complexity of the above Dynamic Programming (DP) solution is O(n^2) and there is a O(nLogn) solution for the LIS problem.
We have not discussed the O(n Log n) solution here as the purpose of this post is to explain Dynamic Programming with a simple example. See
below post for O(n Log n) solution.
Longest Increasing Subsequence Size (N log N)
C/C++
/* A Naive recursive implementation of LCS problem */
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
int max(int a, int b);
/* Returns length of LCS for X[0..m-1], Y[0..n-1] */
int lcs( char *X, char *Y, int m, int n )
{
if (m == 0 || n == 0)
return 0;
if (X[m-1] == Y[n-1])
return 1 + lcs(X, Y, m-1, n-1);
else
return max(lcs(X, Y, m, n-1), lcs(X, Y, m-1, n));
}
/* Utility function to get max of 2 integers */
int max(int a, int b)
{
return (a > b)? a : b;
}
/* Driver program to test above function */
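int main()
{
    char X[] = "AGGTAB";
    char Y[] = "GXTXAYB";
    int m = strlen(X);
    int n = strlen(Y);
    printf("Length of LCS is %d\n", lcs( X, Y, m, n ) );
    return 0;
}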
Python
# A Naive recursive Python implementation of LCS problem
def lcs(X, Y, m, n):
    if m == 0 or n == 0:
        return 0
    elif X[m-1] == Y[n-1]:
        return 1 + lcs(X, Y, m-1, n-1)
    else:
        return max(lcs(X, Y, m, n-1), lcs(X, Y, m-1, n))

# Driver program to test the above function
X = "AGGTAB"
Y = "GXTXAYB"
print("Length of LCS is", lcs(X, Y, len(X), len(Y)))
Length of LCS is 4
Time complexity of the above naive recursive approach is O(2^n) in the worst case. The worst case happens when all characters of X and Y mismatch,
i.e., the length of LCS is 0.
Considering the above implementation, following is a partial recursion tree for input strings AXYT and AYZX
                          lcs("AXYT", "AYZX")
                        /                     \
          lcs("AXY", "AYZX")                lcs("AXYT", "AYZ")
          /              \                  /               \
lcs("AX", "AYZX")  lcs("AXY", "AYZ")  lcs("AXY", "AYZ")  lcs("AXYT", "AY")
In the above partial recursion tree, lcs("AXY", "AYZ") is being solved twice. If we draw the complete recursion tree, then we can see that there are
many subproblems which are solved again and again. So this problem has the Overlapping Subproblems property, and recomputation of the same
subproblems can be avoided by using either Memoization or Tabulation. Following is a tabulated implementation for the LCS problem.
C/C++
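/* Dynamic Programming C/C++ implementation of LCS problem */
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
int max(int a, int b);
/* Returns length of LCS for X[0..m-1], Y[0..n-1] */
int lcs( char *X, char *Y, int m, int n )
{
    int L[m+1][n+1];
    int i, j;
    /* Following steps build L[m+1][n+1] in bottom up fashion. Note
       that L[i][j] contains length of LCS of X[0..i-1] and Y[0..j-1] */
    for (i=0; i<=m; i++)
    {
        for (j=0; j<=n; j++)
        {
            if (i == 0 || j == 0)
                L[i][j] = 0;
            else if (X[i-1] == Y[j-1])
                L[i][j] = L[i-1][j-1] + 1;
            else
                L[i][j] = max(L[i-1][j], L[i][j-1]);
        }
    }
    /* L[m][n] contains length of LCS for X[0..m-1] and Y[0..n-1] */
    return L[m][n];
}
/* Utility function to get max of 2 integers */
int max(int a, int b)
{
    return (a > b)? a : b;
}
/* Driver program to test above function */
int main()
{
    char X[] = "AGGTAB";
    char Y[] = "GXTXAYB";
    int m = strlen(X);
    int n = strlen(Y);
    printf("Length of LCS is %d\n", lcs( X, Y, m, n ) );
    return 0;
}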
Python
# Dynamic Programming implementation of LCS problem
def lcs(X, Y):
    # find the length of the strings
    m = len(X)
    n = len(Y)
    # declaring the array for storing the dp values
    L = [[None]*(n+1) for i in range(m+1)]
    """Following steps build L[m+1][n+1] in bottom up fashion
    Note: L[i][j] contains length of LCS of X[0..i-1]
    and Y[0..j-1]"""
    for i in range(m+1):
        for j in range(n+1):
            if i == 0 or j == 0:
                L[i][j] = 0
            elif X[i-1] == Y[j-1]:
                L[i][j] = L[i-1][j-1] + 1
            else:
                L[i][j] = max(L[i-1][j], L[i][j-1])
    # L[m][n] contains the length of LCS of X[0..m-1] & Y[0..n-1]
    return L[m][n]
# end of function lcs

# Driver program to test the above function
X = "AGGTAB"
Y = "GXTXAYB"
print("Length of LCS is", lcs(X, Y))
# This code is contributed by Nikhil Kumar Singh(nickzuck_007)
The above algorithm/code returns only length of LCS. Please see the following post for printing the LCS.
Printing Longest Common Subsequence
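As a rough illustration of how the subsequence itself can be recovered, the following Python sketch fills the same table L and then walks back from L[m][n]. The function name lcs_string is hypothetical and this is only one possible way to do the backtracking.

```python
def lcs_string(X, Y):
    m, n = len(X), len(Y)
    # L[i][j] = length of LCS of X[0..i-1] and Y[0..j-1]
    L = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i-1] == Y[j-1]:
                L[i][j] = L[i-1][j-1] + 1
            else:
                L[i][j] = max(L[i-1][j], L[i][j-1])
    # Walk back from the bottom-right corner, collecting matched characters
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if X[i-1] == Y[j-1]:
            out.append(X[i-1])       # this character is part of the LCS
            i -= 1
            j -= 1
        elif L[i-1][j] >= L[i][j-1]:
            i -= 1                   # the LCS length came from the cell above
        else:
            j -= 1                   # the LCS length came from the cell to the left
    return "".join(reversed(out))

print(lcs_string("AGGTAB", "GXTXAYB"))   # GTAB
```

The walk back visits at most m + n cells, so printing the subsequence adds only O(m + n) work on top of the O(mn) table construction.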
References:
http://www.youtube.com/watch?v=V5hZoJ6uK-s
http://www.algorithmist.com/index.php/Longest_Common_Subsequence
http://www.ics.uci.edu/~eppstein/161/960229.html
http://en.wikipedia.org/wiki/Longest_common_subsequence_problem
1. If the last characters of the two strings are the same, there is nothing much to do. Ignore the last characters and get the count for the remaining strings. So we recur for
lengths m-1 and n-1.
2. Else (if the last characters are not the same), we consider all three operations on the last character of the first string,
recursively compute the minimum cost for all three operations, and take the minimum of the three values.
a. Insert: Recur for m and n-1
b. Remove: Recur for m-1 and n
c. Replace: Recur for m-1 and n-1
Below is C++ implementation of above Naive recursive solution.
C++
// A Naive recursive C++ program to find minimum number
// operations to convert str1 to str2
#include<bits/stdc++.h>
using namespace std;
// Utility function to find minimum of three numbers
int min(int x, int y, int z)
{
return min(min(x, y), z);
}
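int editDist(string str1, string str2, int m, int n)
{
    // If first string is empty, the only option is to
    // insert all characters of second string into first
    if (m == 0) return n;
    // If second string is empty, the only option is to
    // remove all characters of first string
    if (n == 0) return m;
    // If last characters of two strings are same, nothing
    // much to do. Ignore last characters and get count for
    // remaining strings.
    if (str1[m-1] == str2[n-1])
        return editDist(str1, str2, m-1, n-1);
    // If last characters are not same, consider all three
    // operations on last character of first string, recursively
    // compute minimum cost for all three operations and take
    // minimum of three values.
    return 1 + min(editDist(str1, str2, m, n-1),    // Insert
                   editDist(str1, str2, m-1, n),    // Remove
                   editDist(str1, str2, m-1, n-1)); // Replace
}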
// Driver program
int main()
{
// your code goes here
string str1 = "sunday";
string str2 = "saturday";
cout << editDist( str1 , str2 , str1.length(), str2.length());
return 0;
}
Python
# A Naive recursive Python program to find minimum number
# operations to convert str1 to str2
def editDistance(str1, str2, m, n):
    # If first string is empty, the only option is to
    # insert all characters of second string into first
    if m == 0:
        return n
    # If second string is empty, the only option is to
    # remove all characters of first string
    if n == 0:
        return m
    # If last characters of two strings are same, nothing
    # much to do. Ignore last characters and get count for
    # remaining strings.
    if str1[m-1] == str2[n-1]:
        return editDistance(str1, str2, m-1, n-1)
    # If last characters are not same, consider all three
    # operations on last character of first string, recursively
    # compute minimum cost for all three operations and take
    # minimum of three values.
    return 1 + min(editDistance(str1, str2, m, n-1),    # Insert
                   editDistance(str1, str2, m-1, n),    # Remove
                   editDistance(str1, str2, m-1, n-1))  # Replace

# Driver program to test the above function
str1 = "sunday"
str2 = "saturday"
print(editDistance(str1, str2, len(str1), len(str2)))
# This code is contributed by Bhavya Jain
The time complexity of the above solution is exponential. In the worst case, we may end up doing O(3^m) operations. The worst case happens when none
of the characters of the two strings match. Below is a recursive call diagram for the worst case.
We can see that many subproblems are solved again and again; for example, eD(2,2) is called three times. Since the same subproblems are called again,
this problem has the Overlapping Subproblems property. So the Edit Distance problem has both properties (see this and this) of a dynamic programming
problem. Like other typical Dynamic Programming (DP) problems, recomputations of the same subproblems can be avoided by constructing a
temporary array that stores the results of subproblems.
C++
// A Dynamic Programming based C++ program to find minimum
// number operations to convert str1 to str2
#include<bits/stdc++.h>
using namespace std;
// Utility function to find minimum of three numbers
int min(int x, int y, int z)
{
return min(min(x, y), z);
}
int editDistDP(string str1, string str2, int m, int n)
{
    // Create a table to store results of subproblems
    int dp[m+1][n+1];
    // Fill dp[][] in bottom up manner
    for (int i=0; i<=m; i++)
    {
        for (int j=0; j<=n; j++)
        {
            // If first string is empty, only option is to
            // insert all characters of second string
            if (i==0)
                dp[i][j] = j; // Min. operations = j
            // If second string is empty, only option is to
            // remove all characters of first string
            else if (j==0)
                dp[i][j] = i; // Min. operations = i
            // If last characters are same, ignore last char
            // and use the result for the remaining strings
            else if (str1[i-1] == str2[j-1])
                dp[i][j] = dp[i-1][j-1];
            // If last characters are different, consider all
            // possibilities and find minimum
            else
                dp[i][j] = 1 + min(dp[i][j-1],    // Insert
                                   dp[i-1][j],    // Remove
                                   dp[i-1][j-1]); // Replace
        }
    }
    return dp[m][n];
}
// Driver program
int main()
{
    string str1 = "sunday";
    string str2 = "saturday";
    cout << editDistDP(str1, str2, str1.length(), str2.length());
    return 0;
}
Python
# A Dynamic Programming based Python program for edit
# distance problem
def editDistDP(str1, str2, m, n):
    # Create a table to store results of subproblems
    dp = [[0 for x in range(n+1)] for x in range(m+1)]
    # Fill dp[][] in bottom up manner
    for i in range(m+1):
        for j in range(n+1):
            # If first string is empty, only option is to
            # insert all characters of second string
            if i == 0:
                dp[i][j] = j    # Min. operations = j
            # If second string is empty, only option is to
            # remove all characters of first string
            elif j == 0:
                dp[i][j] = i    # Min. operations = i
            # If last characters are same, ignore last char
            # and use the result for the remaining strings
            elif str1[i-1] == str2[j-1]:
                dp[i][j] = dp[i-1][j-1]
            # If last characters are different, consider all
            # possibilities and find minimum
            else:
                dp[i][j] = 1 + min(dp[i][j-1],    # Insert
                                   dp[i-1][j],    # Remove
                                   dp[i-1][j-1])  # Replace
    return dp[m][n]

# Driver program
str1 = "sunday"
str2 = "saturday"
print(editDistDP(str1, str2, len(str1), len(str2)))
# This code is contributed by Bhavya Jain
Output:
3
The path with minimum cost is highlighted in the following figure. The path is (0, 0) -> (0, 1) -> (1, 2) -> (2, 2). The cost of the path is 8 (1 + 2 + 2 +
3).
1) Optimal Substructure
The path to reach (m, n) must be through one of the 3 cells: (m-1, n-1) or (m-1, n) or (m, n-1). So minimum cost to reach (m, n) can be written as
minimum of the 3 cells plus cost[m][n].
minCost(m, n) = min (minCost(m-1, n-1), minCost(m-1, n), minCost(m, n-1)) + cost[m][n]
2) Overlapping Subproblems
Following is simple recursive implementation of the MCP (Minimum Cost Path) problem. The implementation simply follows the recursive structure
mentioned above.
/* A Naive recursive implementation of MCP(Minimum Cost Path) problem */
#include<stdio.h>
#include<limits.h>
#define R 3
#define C 3
int min(int x, int y, int z);
/* Returns cost of minimum cost path from (0,0) to (m, n) in mat[R][C]*/
int minCost(int cost[R][C], int m, int n)
{
if (n < 0 || m < 0)
return INT_MAX;
else if (m == 0 && n == 0)
return cost[m][n];
else
return cost[m][n] + min( minCost(cost, m-1, n-1),
minCost(cost, m-1, n),
minCost(cost, m, n-1) );
}
/* A utility function that returns minimum of 3 integers */
int min(int x, int y, int z)
{
    if (x < y)
        return (x < z)? x : z;
    else
        return (y < z)? y : z;
}
{1, 5, 3} };
printf(" %d ", minCost(cost, 2, 2));
return 0;
}
It should be noted that the above function computes the same subproblems again and again. In the following recursion tree, many nodes
appear more than once. The time complexity of this naive recursive solution is exponential, so it is terribly slow.
mC refers to minCost()
                                    mC(2, 2)
                          /            |           \
                         /             |            \
                 mC(1, 1)           mC(1, 2)           mC(2, 1)
              /     |     \       /     |     \       /     |     \
             /      |      \     /      |      \     /      |      \
       mC(0,0) mC(0,1) mC(1,0) mC(0,1) mC(0,2) mC(1,1) mC(1,0) mC(1,1) mC(2,0)
So the MCP problem has both properties (optimal substructure and overlapping subproblems) of a dynamic programming problem. Like other
typical Dynamic Programming (DP) problems, recomputation of the same subproblems can be avoided by constructing a temporary array tc[][] in a
bottom-up manner.
C++
/* Dynamic Programming implementation of MCP problem */
#include<stdio.h>
#include<limits.h>
#define R 3
#define C 3
int min(int x, int y, int z);
int minCost(int cost[R][C], int m, int n)
{
int i, j;
// Instead of the following line, we can use int tc[m+1][n+1] or
// dynamically allocate memory to save space. The following line is
// used to keep the program simple and make it work on all compilers.
int tc[R][C];
tc[0][0] = cost[0][0];
/* Initialize first column of total cost(tc) array */
for (i = 1; i <= m; i++)
tc[i][0] = tc[i-1][0] + cost[i][0];
/* Initialize first row of tc array */
for (j = 1; j <= n; j++)
tc[0][j] = tc[0][j-1] + cost[0][j];
/* Construct rest of the tc array */
for (i = 1; i <= m; i++)
for (j = 1; j <= n; j++)
tc[i][j] = min(tc[i-1][j-1], tc[i-1][j], tc[i][j-1]) + cost[i][j];
return tc[m][n];
}
/* A utility function that returns minimum of 3 integers */
int min(int x, int y, int z)
{
    if (x < y)
        return (x < z)? x : z;
    else
        return (y < z)? y : z;
}
Time Complexity of the DP implementation is O(mn) which is much better than Naive Recursive implementation.
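As a rough Python counterpart to the C listing, the bottom-up construction of tc[][] can be sketched as follows. The function name min_cost and the 3 x 3 cost matrix are assumptions made for illustration; the matrix was chosen so that the path (0, 0) -> (0, 1) -> (1, 2) -> (2, 2) has cost 8, as in the example discussed earlier.

```python
def min_cost(cost, m, n):
    """Return the minimum cost of a path from (0, 0) to (m, n).

    A cell (i, j) can be reached from (i-1, j-1), (i-1, j) or (i, j-1).
    """
    # tc[i][j] holds the minimum total cost to reach cell (i, j)
    tc = [[0] * (n + 1) for _ in range(m + 1)]
    tc[0][0] = cost[0][0]

    # First column: each cell can only be reached from the cell above it
    for i in range(1, m + 1):
        tc[i][0] = tc[i - 1][0] + cost[i][0]

    # First row: each cell can only be reached from the cell to its left
    for j in range(1, n + 1):
        tc[0][j] = tc[0][j - 1] + cost[0][j]

    # Remaining cells: best of the three predecessors plus the cell's own cost
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            tc[i][j] = cost[i][j] + min(tc[i - 1][j - 1],
                                        tc[i - 1][j],
                                        tc[i][j - 1])
    return tc[m][n]


# Hypothetical example matrix (an assumption for illustration)
cost = [[1, 2, 3],
        [4, 8, 2],
        [1, 5, 3]]
print(min_cost(cost, 2, 2))  # 8
```

Each of the (m+1)(n+1) cells is filled once with a constant amount of work, which is where the O(mn) bound comes from.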
It should be noted that the above function computes the same subproblems again and again. See the following recursion tree for S = {1, 2, 3} and
n = 5.
The function C({1}, 3) is called two times. If we draw the complete tree, then we can see that there are many subproblems being called more than
once.
C() --> count()
                              C({1,2,3}, 5)
                            /               \
                           /                 \
              C({1,2,3}, 2)                   C({1,2}, 5)
              /          \                    /          \
             /            \                  /            \
  C({1,2,3}, -1)     C({1,2}, 2)       C({1,2}, 3)       C({1}, 5)
                     /       \          /       \          /      \
                    /         \        /         \        /        \
            C({1,2},0)  C({1},2)  C({1,2},1)  C({1},3)  C({1}, 4)  C({}, 5)
                          / \        / \         / \       /    \
                         /   \      /   \       /   \     /      \
                        .     .    .     .     .     .  C({1}, 3)  C({}, 4)
                                                          /  \
                                                         /    \
                                                        .      .
Since the same subproblems are called again, this problem has the Overlapping Subproblems property. So the Coin Change problem has both
properties (optimal substructure and overlapping subproblems) of a dynamic programming problem. Like other typical Dynamic Programming (DP)
problems, recomputation of the same subproblems can be avoided by constructing a temporary array table[][] in a bottom-up manner.
Dynamic Programming Solution
C
#include<stdio.h>
int count( int S[], int m, int n )
{
int i, j, x, y;
// We need n+1 rows as the table is constructed in bottom up manner using
// the base case 0 value case (n = 0)
int table[n+1][m];
// Fill the entries for 0 value case (n = 0)
for (i=0; i<m; i++)
table[0][i] = 1;
// Fill rest of the table entries in bottom up manner
for (i = 1; i < n+1; i++)
{
for (j = 0; j < m; j++)
{
// Count of solutions including S[j]
x = (i-S[j] >= 0)? table[i - S[j]][j]: 0;
// Count of solutions excluding S[j]
y = (j >= 1)? table[i][j-1]: 0;
// total count
table[i][j] = x + y;
}
}
return table[n][m-1];
}
// Driver program to test above function
int main()
{
int arr[] = {1, 2, 3};
int m = sizeof(arr)/sizeof(arr[0]);
int n = 4;
printf(" %d ", count(arr, m, n));
return 0;
}
Python
# Dynamic Programming Python implementation of Coin Change problem
def count(S, m, n):
    # We need n+1 rows as the table is constructed in bottom up
    # manner using the base case 0 value case (n = 0)
    table = [[0 for x in range(m)] for x in range(n+1)]
    # Fill the entries for 0 value case (n = 0)
    for i in range(m):
        table[0][i] = 1
    # Fill rest of the table entries in bottom up manner
    for i in range(1, n+1):
        for j in range(m):
            # Count of solutions including S[j]
            x = table[i - S[j]][j] if i-S[j] >= 0 else 0
            # Count of solutions excluding S[j]
            y = table[i][j-1] if j >= 1 else 0
            # total count
            table[i][j] = x + y
    return table[n][m-1]
# Driver program to test above function
arr = [1, 2, 3]
m = len(arr)
n = 4
print(count(arr, m, n))
# This code is contributed by Bhavya Jain
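The two-dimensional table is not the only way to organise this computation. As a sketch of a common alternative (not the listing's own method), the same counts can be kept in a single row if the coins are processed one at a time. The name count_ways is an assumption for illustration.

```python
def count_ways(coins, amount):
    """Count the combinations of coins (unlimited supply) that sum to amount."""
    # ways[v] = number of ways to form value v using the coins processed so far
    ways = [0] * (amount + 1)
    ways[0] = 1  # exactly one way to form 0: take no coins

    # Processing coin by coin counts each combination once,
    # regardless of the order in which its coins are written.
    for coin in coins:
        for v in range(coin, amount + 1):
            ways[v] += ways[v - coin]
    return ways[amount]


print(count_ways([1, 2, 3], 4))  # 4
print(count_ways([1, 2, 3], 5))  # 5
```

This variant does the same O(mn) work as the table version but needs only O(n) extra space.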
However, the order in which we parenthesize the product affects the number of simple arithmetic operations needed to compute the product, or
the efficiency. For example, suppose A is a 10 x 30 matrix, B is a 30 x 5 matrix, and C is a 5 x 60 matrix. Then,
(AB)C = (10 x 30 x 5) + (10 x 5 x 60) = 1500 + 3000 = 4500 operations
A(BC) = (30 x 5 x 60) + (10 x 30 x 60) = 9000 + 18000 = 27000 operations.
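The arithmetic above can be checked with a few lines of Python. This is only a sketch; the helper name multiply_cost is an assumption, and it uses the usual count of rows x inner x cols scalar multiplications for one matrix product.

```python
def multiply_cost(rows, inner, cols):
    """Scalar multiplications needed for a (rows x inner) by (inner x cols) product."""
    return rows * inner * cols


# A is 10 x 30, B is 30 x 5, C is 5 x 60
cost_ab_then_c = multiply_cost(10, 30, 5) + multiply_cost(10, 5, 60)
cost_a_then_bc = multiply_cost(30, 5, 60) + multiply_cost(10, 30, 60)

print(cost_ab_then_c)  # 4500
print(cost_a_then_bc)  # 27000
```

The first parenthesization is six times cheaper here, even though both produce the same 10 x 60 result.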
1) Optimal Substructure:
A simple solution is to place parentheses at all possible places, calculate the cost for each placement and return the minimum value. In a chain of
n matrices, we can place the first set of parentheses in n-1 ways. For example, if the given chain has 4 matrices, say ABCD,
then there are 3 ways to place the first set of parentheses: A(BCD), (AB)CD and (ABC)D. So when we place a set of parentheses, we divide the
problem into subproblems of smaller size. Therefore, the problem has the optimal substructure property and can be easily solved using recursion.
Minimum number of multiplications needed to multiply a chain of size n = Minimum of all n-1 placements (these placements create subproblems of
smaller size)
2) Overlapping Subproblems
Following is a recursive implementation that simply follows the above optimal substructure property.
/* A naive recursive implementation that simply follows the above optimal
substructure property */
#include<stdio.h>
#include<limits.h>
// Matrix Ai has dimension p[i-1] x p[i] for i = 1..n
int MatrixChainOrder(int p[], int i, int j)
{
if(i == j)
return 0;
int k;
int min = INT_MAX;
int count;
// place parentheses at different places between first and last matrix,
// recursively calculate count of multiplications for each parenthesis
// placement and return the minimum count
for (k = i; k <j; k++)
{
count = MatrixChainOrder(p, i, k) +
MatrixChainOrder(p, k+1, j) +
p[i-1]*p[k]*p[j];
if (count < min)
min = count;
}
// Return minimum count
return min;
}
// Driver program to test above function
int main()
{
int arr[] = {1, 2, 3, 4, 3};
int n = sizeof(arr)/sizeof(arr[0]);
printf("Minimum number of multiplications is %d ",
MatrixChainOrder(arr, 1, n-1));
getchar();
return 0;
}
Time complexity of the above naive recursive approach is exponential. It should be noted that the above function computes the same subproblems
again and again. See the following recursion tree for a matrix chain of size 4. The function MatrixChainOrder(p, 3, 4) is called two times. We can
see that there are many subproblems being called more than once.
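The repeated calls can also be observed directly. The sketch below mirrors the naive recursion in Python and records how often each (i, j) pair is requested; the names matrix_chain_order and calls are assumptions for illustration, and the dimension array is the one used by the C driver above.

```python
from collections import Counter

# calls[(i, j)] counts how many times the subproblem (i, j) is requested
calls = Counter()


def matrix_chain_order(p, i, j):
    """Naive recursion: minimum multiplications for the chain A[i..j]."""
    calls[(i, j)] += 1
    if i == j:
        return 0
    # Try every split point k and keep the cheapest combination
    return min(
        matrix_chain_order(p, i, k)
        + matrix_chain_order(p, k + 1, j)
        + p[i - 1] * p[k] * p[j]
        for k in range(i, j)
    )


p = [1, 2, 3, 4, 3]  # four matrices: 1x2, 2x3, 3x4, 4x3
best = matrix_chain_order(p, 1, len(p) - 1)
print(best)           # 30
print(calls[(3, 4)])  # 2
```

Even for four matrices the pair (3, 4) is solved twice; the number of repeated calls grows quickly with the chain length, which is what the table-based solution avoids.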
Since the same subproblems are called again, this problem has the Overlapping Subproblems property. So the Matrix Chain Multiplication problem has
both properties (optimal substructure and overlapping subproblems) of a dynamic programming problem. Like other typical Dynamic Programming (DP)
problems, recomputation of the same subproblems can be avoided by constructing a temporary array m[][] in a bottom-up manner.
Dynamic Programming Solution
Following is C/C++ implementation for Matrix Chain Multiplication problem using Dynamic Programming.
C
// See the Cormen book for details of the following algorithm
#include<stdio.h>
#include<limits.h>
// Matrix Ai has dimension p[i-1] x p[i] for i = 1..n
int MatrixChainOrder(int p[], int n)
{
/* For simplicity of the program, one extra row and one extra column are
allocated in m[][]. 0th row and 0th column of m[][] are not used */
int m[n][n];
int i, j, k, L, q;
/* m[i,j] = Minimum number of scalar multiplications needed to compute
the matrix A[i]A[i+1]...A[j] = A[i..j] where dimension of A[i] is
p[i-1] x p[i] */
// cost is zero when multiplying one matrix.
for (i = 1; i < n; i++)
m[i][i] = 0;
// L is chain length.
for (L=2; L<n; L++)
{
for (i=1; i<n-L+1; i++)
{
j = i+L-1;
m[i][j] = INT_MAX;
for (k=i; k<=j-1; k++)
{
// q = cost/scalar multiplications
q = m[i][k] + m[k+1][j] + p[i-1]*p[k]*p[j];
if (q < m[i][j])
m[i][j] = q;
}
}
}
return m[1][n-1];
}
int main()
{
int arr[] = {1, 2, 3, 4};
int size = sizeof(arr)/sizeof(arr[0]);
printf("Minimum number of multiplications is %d ",
MatrixChainOrder(arr, size));
getchar();
return 0;
}
Python
# Dynamic Programming Python implementation of Matrix Chain Multiplication
# See the Cormen book for details of the following algorithm
import sys
# Matrix Ai has dimension p[i-1] x p[i] for i = 1..n
def MatrixChainOrder(p, n):
    # For simplicity of the program, one extra row and one extra column are
    # allocated in m[][]. 0th row and 0th column of m[][] are not used
    m = [[0 for x in range(n)] for x in range(n)]
    # m[i,j] = Minimum number of scalar multiplications needed to compute
    # the matrix A[i]A[i+1]...A[j] = A[i..j] where dimension of A[i] is
    # p[i-1] x p[i]
    # cost is zero when multiplying one matrix.
    for i in range(1, n):
        m[i][i] = 0
    # L is chain length.
    for L in range(2, n):
        for i in range(1, n-L+1):
            j = i+L-1
            m[i][j] = sys.maxsize
            for k in range(i, j):
                # q = cost/scalar multiplications
                q = m[i][k] + m[k+1][j] + p[i-1]*p[k]*p[j]
                if q < m[i][j]:
                    m[i][j] = q
    return m[1][n-1]
# Driver program to test above function
arr = [1, 2, 3, 4]
size = len(arr)
print("Minimum number of multiplications is " + str(MatrixChainOrder(arr, size)))
# This Code is contributed by Bhavya Jain
2) Overlapping Subproblems
Following is simple recursive implementation that simply follows the recursive structure mentioned above.
// A Naive Recursive Implementation
#include<stdio.h>
// Returns value of Binomial Coefficient C(n, k)
int binomialCoeff(int n, int k)
{
// Base Cases
if (k==0 || k==n)
return 1;
// Recur
return binomialCoeff(n-1, k-1) + binomialCoeff(n-1, k);
}
/* Driver program to test above function */
int main()
{
int n = 5, k = 2;
printf("Value of C(%d, %d) is %d ", n, k, binomialCoeff(n, k));
return 0;
}
It should be noted that the above function computes the same subproblems again and again. See the following recursion tree for n = 5 and k = 2.
The function C(3, 1) is called two times. For large values of n, there will be many common subproblems.
                             C(5, 2)
                    /                      \
           C(4, 1)                           C(4, 2)
            /   \                          /           \
       C(3, 0)   C(3, 1)             C(3, 1)               C(3, 2)
                /    \               /     \               /     \
         C(2, 0)    C(2, 1)      C(2, 0) C(2, 1)       C(2, 1)  C(2, 2)
                   /      \              /   \         /    \
               C(1, 0)  C(1, 1)    C(1, 0)  C(1, 1) C(1, 0)  C(1, 1)
Since the same subproblems are called again, this problem has the Overlapping Subproblems property. So the Binomial Coefficient problem has both
properties (optimal substructure and overlapping subproblems) of a dynamic programming problem. Like other typical Dynamic Programming (DP)
problems, recomputation of the same subproblems can be avoided by constructing a temporary array C[][] in a bottom-up manner. Following is a
Dynamic Programming based implementation.
C
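// A Dynamic Programming based solution that uses table C[][] to
// calculate the Binomial Coefficient
#include<stdio.h>
// A utility function that returns minimum of two integers
int min(int a, int b) { return (a < b)? a : b; }
// Returns value of Binomial Coefficient C(n, k)
int binomialCoeff(int n, int k)
{
    int C[n+1][k+1];
    int i, j;
    // Calculate value of Binomial Coefficient in bottom up manner
    for (i = 0; i <= n; i++)
        for (j = 0; j <= min(i, k); j++)
            if (j == 0 || j == i)
                C[i][j] = 1;
            else
                C[i][j] = C[i-1][j-1] + C[i-1][j];
    return C[n][k];
}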
Python
# A Dynamic Programming based Python Program that uses table C[][]
# to calculate the Binomial Coefficient
# Returns value of Binomial Coefficient C(n, k)
def binomialCoef(n, k):
    C = [[0 for x in range(k+1)] for x in range(n+1)]
    # Calculate value of Binomial Coefficient in bottom up manner
    for i in range(n+1):
        for j in range(min(i, k)+1):
            # Base Cases
            if j == 0 or j == i:
                C[i][j] = 1
            # Calculate value using previously stored values
            else:
                C[i][j] = C[i-1][j-1] + C[i-1][j]
    return C[n][k]
# Driver program to test above function
n = 5
k = 2
print("Value of C[" + str(n) + "][" + str(k) + "] is "
      + str(binomialCoef(n, k)))
# This code is contributed by Bhavya Jain
Value of C[5][2] is 10
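Each row of the table depends only on the row before it, so the full (n+1) x (k+1) table is not strictly needed. The following is a sketch of that space-saving idea, offered as an alternative to the listing above; the name binomial_coeff_1d is an assumption for illustration.

```python
def binomial_coeff_1d(n, k):
    """Compute C(n, k) using one row of Pascal's triangle, O(k) extra space."""
    # row[j] holds C(i, j) for the row i processed so far
    row = [0] * (k + 1)
    row[0] = 1  # C(0, 0) = 1

    for i in range(1, n + 1):
        # Go right to left so row[j - 1] still holds the value from row i-1
        for j in range(min(i, k), 0, -1):
            row[j] += row[j - 1]
    return row[k]


print(binomial_coeff_1d(5, 2))  # 10
```

The right-to-left inner loop is the important detail: iterating left to right would overwrite values from the previous row before they are used.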
C/C++
/* A Naive recursive implementation of 0-1 Knapsack problem */
#include<stdio.h>
// A utility function that returns maximum of two integers
int max(int a, int b) { return (a > b)? a : b; }
// Returns the maximum value that can be put in a knapsack of capacity W
int knapSack(int W, int wt[], int val[], int n)
{
// Base Case
if (n == 0 || W == 0)
return 0;
// If weight of the nth item is more than Knapsack capacity W, then
// this item cannot be included in the optimal solution
if (wt[n-1] > W)
return knapSack(W, wt, val, n-1);
// Return the maximum of two cases:
// (1) nth item included
// (2) not included
else return max( val[n-1] + knapSack(W-wt[n-1], wt, val, n-1),
knapSack(W, wt, val, n-1)
);
}
// Driver program to test above function
int main()
{
int val[] = {60, 100, 120};
int wt[] = {10, 20, 30};
int W = 50;
int n = sizeof(val)/sizeof(val[0]);
printf("%d", knapSack(W, wt, val, n));
return 0;
}
Python
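# A naive recursive implementation of 0-1 Knapsack Problem
# Returns the maximum value that can be put in a knapsack of
# capacity W
def knapSack(W, wt, val, n):
    # Base Case
    if n == 0 or W == 0:
        return 0
    # If weight of the nth item is more than Knapsack of capacity
    # W, then this item cannot be included in the optimal solution
    if wt[n-1] > W:
        return knapSack(W, wt, val, n-1)
    # Return the maximum of two cases:
    # (1) nth item included
    # (2) not included
    return max(val[n-1] + knapSack(W-wt[n-1], wt, val, n-1),
               knapSack(W, wt, val, n-1))
Output:
220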
It should be noted that the above function computes the same subproblems again and again. See the following recursion tree, K(1, 1) is being
evaluated twice. Time complexity of this naive recursive solution is exponential (2^n).
In the following recursion tree, K() refers to knapSack(). The two
parameters indicated in the following recursion tree are n and W.
The recursion tree is for the following sample inputs.
wt[] = {1, 1, 1}, W = 2, val[] = {10, 20, 30}
                       K(3, 2)         ---------> K(n, W)
                   /            \
                 /                \
            K(2,2)                  K(2,1)
          /       \                  /    \
        /           \              /        \
     K(1,2)        K(1,1)       K(1,1)      K(1,0)
     /  \          /   \         /   \
   /      \      /       \     /       \
K(0,2)  K(0,1) K(0,1)  K(0,0) K(0,1)  K(0,0)
Recursion tree for Knapsack capacity 2 units and 3 items of 1 unit weight.
Since subproblems are evaluated again, this problem has the Overlapping Subproblems property. So the 0-1 Knapsack problem has both properties
(optimal substructure and overlapping subproblems) of a dynamic programming problem. Like other typical Dynamic Programming (DP) problems,
recomputation of the same subproblems can be avoided by constructing a temporary array K[][] in a bottom-up manner. Following is a Dynamic
Programming based implementation.
C++
// A Dynamic Programming based solution for 0-1 Knapsack problem
#include<stdio.h>
// A utility function that returns maximum of two integers
int max(int a, int b) { return (a > b)? a : b; }
// Returns the maximum value that can be put in a knapsack of capacity W
int knapSack(int W, int wt[], int val[], int n)
{
int i, w;
int K[n+1][W+1];
// Build table K[][] in bottom up manner
for (i = 0; i <= n; i++)
{
for (w = 0; w <= W; w++)
{
if (i==0 || w==0)
K[i][w] = 0;
else if (wt[i-1] <= w)
K[i][w] = max(val[i-1] + K[i-1][w-wt[i-1]], K[i-1][w]);
else
K[i][w] = K[i-1][w];
}
}
return K[n][W];
}
int main()
{
int val[] = {60, 100, 120};
int wt[] = {10, 20, 30};
int W = 50;
int n = sizeof(val)/sizeof(val[0]);
printf("%d", knapSack(W, wt, val, n));
return 0;
}
Python
# A Dynamic Programming based Python Program for 0-1 Knapsack problem
# Returns the maximum value that can be put in a knapsack of capacity W
def knapSack(W, wt, val, n):
    K = [[0 for x in range(W+1)] for x in range(n+1)]
    # Build table K[][] in bottom up manner
    for i in range(n+1):
        for w in range(W+1):
            if i == 0 or w == 0:
                K[i][w] = 0
            elif wt[i-1] <= w:
                K[i][w] = max(val[i-1] + K[i-1][w-wt[i-1]], K[i-1][w])
            else:
                K[i][w] = K[i-1][w]
    return K[n][W]
# Driver program to test above function
val = [60, 100, 120]
wt = [10, 20, 30]
W = 50
n = len(val)
print(knapSack(W, wt, val, n))
# This code is contributed by Bhavya Jain
220
Time Complexity: O(nW) where n is the number of items and W is the capacity of knapsack.
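The same O(nW) bound can also be reached top-down by caching the results of the naive recursion. The sketch below is one way to do that in Python with functools.lru_cache; the name knapsack_memo and the inner helper best are assumptions for illustration, not part of the listings above.

```python
from functools import lru_cache


def knapsack_memo(W, wt, val):
    """0-1 knapsack: best total value with capacity W, via memoized recursion."""

    @lru_cache(maxsize=None)
    def best(n, cap):
        # best(n, cap): best value using the first n items with capacity cap
        if n == 0 or cap == 0:
            return 0
        skip = best(n - 1, cap)
        if wt[n - 1] > cap:
            # Item n does not fit, so it cannot be part of the solution
            return skip
        take = val[n - 1] + best(n - 1, cap - wt[n - 1])
        return max(take, skip)

    return best(len(val), W)


print(knapsack_memo(50, [10, 20, 30], [60, 100, 120]))  # 220
```

There are at most (n+1)(W+1) distinct (n, cap) pairs, and each is computed once, so the cached recursion does no more work than filling the table K[][].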
References:
http://www.es.ele.tue.nl/education/5MC10/Solutions/knapsack.pdf
http://www.cse.unl.edu/~goddard/Courses/CSCE310J/Lectures/Lecture8-DynamicProgramming.pdf
2) Overlapping Subproblems
Following is recursive implementation that simply follows the recursive structure mentioned above.
# include <stdio.h>
# include <limits.h>
// A utility function to get maximum of two integers
int max(int a, int b) { return (a > b)? a: b; }
/* Function to get minimum number of trials needed in worst
case with n eggs and k floors */
int eggDrop(int n, int k)
{
// If there are no floors, then no trials needed. OR if there is
// one floor, one trial needed.
if (k == 1 || k == 0)
return k;
// We need k trials for one egg and k floors
if (n == 1)
return k;
int min = INT_MAX, x, res;
// Consider all droppings from 1st floor to kth floor and
// return the minimum of these values plus 1.
for (x = 1; x <= k; x++)
{
res = max(eggDrop(n-1, x-1), eggDrop(n, k-x));
if (res < min)
min = res;
}
return min + 1;
}
/* Driver program to test eggDrop() */
int main()
{
int n = 2, k = 10;
printf ("\nMinimum number of trials in worst case with %d eggs and "
"%d floors is %d \n", n, k, eggDrop(n, k));
return 0;
}
Output:
Minimum number of trials in worst case with 2 eggs and 10 floors is 4
It should be noted that the above function computes the same subproblems again and again. See the following partial recursion tree, where E(2, 2) is
being evaluated twice. There will be many repeated subproblems when you draw the complete recursion tree, even for small values of n and k.
                         E(2,4)
                           |
          -------------------------------------
          |             |           |         |
          |             |           |         |
      x=1/\          x=2/\      x=3/ \    x=4/ \
        /  \           /  \       ....      ....
       /    \         /    \
 E(1,0)  E(2,3)     E(1,1)  E(2,2)
          /\  /\...         /  \
      x=1/  \               .....
        /    \
     E(1,0)  E(2,2)
            /   \
            ......
Partial recursion tree for 2 eggs and 4 floors.
Since the same subproblems are called again, this problem has the Overlapping Subproblems property. So the Egg Dropping Puzzle has both properties
(optimal substructure and overlapping subproblems) of a dynamic programming problem. Like other typical Dynamic Programming (DP) problems,
recomputation of the same subproblems can be avoided by constructing a temporary array eggFloor[][] in a bottom-up manner.
Dynamic Programming Solution
Following are C++ and Python implementations for Egg Dropping problem using Dynamic Programming.
C++
// A Dynamic Programming based C++ Program for the Egg Dropping Puzzle
# include <stdio.h>
# include <limits.h>
// A utility function to get maximum of two integers
int max(int a, int b) { return (a > b)? a: b; }
/* Function to get minimum number of trials needed in worst
case with n eggs and k floors */
int eggDrop(int n, int k)
{
    /* A 2D table where entry eggFloor[i][j] will represent minimum
       number of trials needed for i eggs and j floors. */
    int eggFloor[n+1][k+1];
    int res;
    int i, j, x;
    // We need one trial for one floor and 0 trials for 0 floors
    for (i = 1; i <= n; i++)
    {
        eggFloor[i][1] = 1;
        eggFloor[i][0] = 0;
    }
    // We always need j trials for one egg and j floors.
    for (j = 1; j <= k; j++)
        eggFloor[1][j] = j;
    // Fill rest of the entries in table using optimal substructure
    // property
    for (i = 2; i <= n; i++)
    {
        for (j = 2; j <= k; j++)
        {
            eggFloor[i][j] = INT_MAX;
            for (x = 1; x <= j; x++)
            {
                res = 1 + max(eggFloor[i-1][x-1], eggFloor[i][j-x]);
                if (res < eggFloor[i][j])
                    eggFloor[i][j] = res;
            }
        }
    }
    // eggFloor[n][k] holds the result
    return eggFloor[n][k];
}
Python
# A Dynamic Programming based Python Program for the Egg Dropping Puzzle
INT_MAX = 32767
# Function to get minimum number of trials needed in worst
# case with n eggs and k floors
def eggDrop(n, k):
    # A 2D table where entry eggFloor[i][j] will represent minimum
    # number of trials needed for i eggs and j floors.
    eggFloor = [[0 for x in range(k+1)] for x in range(n+1)]
    # We need one trial for one floor and 0 trials for 0 floors
    for i in range(1, n+1):
        eggFloor[i][1] = 1
        eggFloor[i][0] = 0
    # We always need j trials for one egg and j floors.
    for j in range(1, k+1):
        eggFloor[1][j] = j
    # Fill rest of the entries in table using optimal substructure
    # property
    for i in range(2, n+1):
        for j in range(2, k+1):
            eggFloor[i][j] = INT_MAX
            for x in range(1, j+1):
                res = 1 + max(eggFloor[i-1][x-1], eggFloor[i][j-x])
                if res < eggFloor[i][j]:
                    eggFloor[i][j] = res
    # eggFloor[n][k] holds the result
    return eggFloor[n][k]
# Driver program to test eggDrop()
n = 2
k = 36
print("Minimum number of trials in worst case with " + str(n) + " eggs and "
      + str(k) + " floors is " + str(eggDrop(n, k)))
# This code is contributed by Bhavya Jain
Output:
Minimum number of trials in worst case with 2 eggs and 36 floors is 8
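For comparison, the recursive formulation can be kept almost unchanged and made efficient by caching its results. This is a sketch using functools.lru_cache rather than the bottom-up table; the name egg_drop is an assumption for illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def egg_drop(n, k):
    """Minimum number of trials needed in the worst case with n eggs and k floors."""
    # No floors need no trials; one floor needs exactly one trial
    if k <= 1:
        return k
    # With a single egg, floors must be tried one by one from the bottom
    if n == 1:
        return k
    # Drop from floor x: the egg breaks (n-1 eggs, x-1 floors below)
    # or survives (n eggs, k-x floors above); assume the worse outcome.
    return 1 + min(
        max(egg_drop(n - 1, x - 1), egg_drop(n, k - x))
        for x in range(1, k + 1)
    )


print(egg_drop(2, 10))  # 4
print(egg_drop(2, 36))  # 8
```

With the cache in place each (n, k) pair is evaluated once, so the work matches the table-based version: O(n * k^2) in total.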
References:
http://archive.ite.journal.informs.org/Vol4No1/Sniedovich/index.php
2) Overlapping Subproblems
Following is simple recursive implementation of the LPS problem. The implementation simply follows the recursive structure mentioned above.
#include<stdio.h>
#include<string.h>
// A utility function to get max of two integers
int max (int x, int y) { return (x > y)? x : y; }
// Returns the length of the longest palindromic subsequence in seq
int lps(char *seq, int i, int j)
{
// Base Case 1: If there is only 1 character
if (i == j)
return 1;
// Base Case 2: If there are only 2 characters and both are same
if (seq[i] == seq[j] && i + 1 == j)
return 2;
// If the first and last characters match
if (seq[i] == seq[j])
return lps (seq, i+1, j-1) + 2;
// If the first and last characters do not match
return max( lps(seq, i, j-1), lps(seq, i+1, j) );
}
/* Driver program to test above functions */
int main()
{
char seq[] = "GEEKSFORGEEKS";
int n = strlen(seq);
printf ("The length of the LPS is %d", lps(seq, 0, n-1));
getchar();
return 0;
}
Output:
The length of the LPS is 5
Considering the above implementation, following is a partial recursion tree for a sequence of length 6 with all different characters.
               L(0, 5)
             /        \
            /          \
        L(1,5)          L(0,4)
       /    \            /    \
      /      \          /      \
  L(2,5)    L(1,4)  L(1,4)  L(0,3)
In the above partial recursion tree, L(1, 4) is being solved twice. If we draw the complete recursion tree, then we can see that there are many
subproblems which are solved again and again. Since the same subproblems are called again, this problem has the Overlapping Subproblems property.
So the LPS problem has both properties (optimal substructure and overlapping subproblems) of a dynamic programming problem. Like other typical
Dynamic Programming (DP) problems, recomputation of the same subproblems can be avoided by constructing a temporary array L[][] in a bottom-up manner.
Dynamic Programming Solution
C++
// A Dynamic Programming based C++ program for LPS problem
#include<stdio.h>
#include<string.h>
// A utility function to get max of two integers
int max (int x, int y) { return (x > y)? x : y; }
// Returns the length of the longest palindromic subsequence in seq
int lps(char *str)
{
int n = strlen(str);
int i, j, cl;
int L[n][n]; // Create a table to store results of subproblems
// Strings of length 1 are palindromes of length 1
for (i = 0; i < n; i++)
L[i][i] = 1;
// Build the table. Note that the lower diagonal values of table are
// useless and not filled in the process. The values are filled in a
// manner similar to Matrix Chain Multiplication DP solution (See
// http://www.geeksforgeeks.org/archives/15553). cl is length of
// substring
for (cl=2; cl<=n; cl++)
{
for (i=0; i<n-cl+1; i++)
{
j = i+cl-1;
if (str[i] == str[j] && cl == 2)
L[i][j] = 2;
else if (str[i] == str[j])
L[i][j] = L[i+1][j-1] + 2;
else
L[i][j] = max(L[i][j-1], L[i+1][j]);
}
}
return L[0][n-1];
}
/* Driver program to test above functions */
int main()
{
char seq[] = "GEEKS FOR GEEKS";
int n = strlen(seq);
printf ("The length of the LPS is %d", lps(seq));
getchar();
return 0;
}
Python
# A Dynamic Programming based Python program for LPS problem
# Returns the length of the longest palindromic subsequence in seq
def lps(str):
    n = len(str)
    # Create a table to store results of subproblems
    L = [[0 for x in range(n)] for x in range(n)]
    # Strings of length 1 are palindromes of length 1
    for i in range(n):
        L[i][i] = 1
    # Build the table. Note that the lower diagonal values of table are
    # useless and not filled in the process. The values are filled in a
    # manner similar to Matrix Chain Multiplication DP solution (See
    # http://www.geeksforgeeks.org/dynamic-programming-set-8-matrix-chain-multiplication/).
    # cl is length of substring
    for cl in range(2, n+1):
        for i in range(n-cl+1):
            j = i+cl-1
            if str[i] == str[j] and cl == 2:
                L[i][j] = 2
            elif str[i] == str[j]:
                L[i][j] = L[i+1][j-1] + 2
            else:
                L[i][j] = max(L[i][j-1], L[i+1][j])
    return L[0][n-1]
# Driver program to test above functions
seq = "GEEKS FOR GEEKS"
n = len(seq)
print("The length of the LPS is " + str(lps(seq)))
# This code is contributed by Bhavya Jain
Time Complexity of the above implementation is O(n^2) which is much better than the worst case time complexity of Naive Recursive
implementation.
This problem is close to the Longest Common Subsequence (LCS) problem. In fact, we can use LCS as a subroutine to solve this problem.
Following is the two step solution that uses LCS.
1) Reverse the given sequence and store the reverse in another array, say rev[0..n-1]
2) The length of the LCS of the given sequence and rev[] is the length of the longest palindromic subsequence.
This solution is also an O(n^2) solution.
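The two steps above can be sketched in Python as follows. The names lcs_length and lps_via_lcs are assumptions for illustration, and the LCS routine is the standard table-based one rather than code taken from this text.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b."""
    # dp[i][j] = LCS length of a[:i] and b[:j]
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]


def lps_via_lcs(seq):
    """Length of the longest palindromic subsequence, via LCS with the reverse."""
    rev = seq[::-1]               # step 1: reverse the sequence
    return lcs_length(seq, rev)   # step 2: LCS of the sequence and its reverse


print(lps_via_lcs("GEEKSFORGEEKS"))  # 5
```

Note that this yields the length only. A particular common subsequence of the sequence and its reverse is not guaranteed to be a palindrome itself, so recovering the actual subsequence takes a little more care.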
References:
http://users.eecs.northwestern.edu/~dda902/336/hw6-sol.pdf
And if the prices are as following, then the maximum obtainable value is 24 (by cutting in eight pieces of length 1)
length   | 1   2   3   4   5   6   7   8
--------------------------------------------
price    | 3   5   8   9  10  17  17  20
The naive solution for this problem is to generate all configurations of different pieces and find the highest priced configuration. This solution is
exponential in terms of time complexity. Let us see how this problem possesses both important properties of a Dynamic Programming (DP)
problem and can be solved efficiently using Dynamic Programming.
1) Optimal Substructure:
We can get the best price by making a cut at different positions and comparing the values obtained after a cut. We can recursively call the same
function for a piece obtained after a cut.
Let cutRod(n) be the required (best possible price) value for a rod of length n. cutRod(n) can be written as follows.
cutRod(n) = max(price[i] + cutRod(n-i-1)) for all i in {0, 1 .. n-1}
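The recurrence can be turned into code almost literally. The following Python sketch caches each cutRod(n) value so it is computed only once; the names cut_rod and best are assumptions for illustration, and price[i] is taken to be the price of a piece of length i+1, as in the recurrence.

```python
from functools import lru_cache


def cut_rod(price, n):
    """Best obtainable price for a rod of length n; price[i] is for length i+1."""

    @lru_cache(maxsize=None)
    def best(length):
        if length <= 0:
            return 0
        # First piece has length i+1; the remaining length-i-1 is cut optimally
        return max(price[i] + best(length - i - 1) for i in range(length))

    return best(n)


print(cut_rod([3, 5, 8, 9, 10, 17, 17, 20], 8))  # 24
print(cut_rod([1, 5, 8, 9, 10, 17, 17, 20], 8))  # 22
```

The first call uses the price table shown above, where eight pieces of length 1 give 24; the second uses the prices from the C driver program in this section.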
2) Overlapping Subproblems
Following is simple recursive implementation of the Rod Cutting problem. The implementation simply follows the recursive structure mentioned
above.
// A Naive recursive solution for Rod cutting problem
#include<stdio.h>
#include<limits.h>
// A utility function to get the maximum of two integers
int max(int a, int b) { return (a > b)? a : b;}
/* Returns the best obtainable price for a rod of length n and
price[] as prices of different pieces */
int cutRod(int price[], int n)
{
if (n <= 0)
return 0;
int max_val = INT_MIN;
// Recursively cut the rod in different pieces and compare different
// configurations
for (int i = 0; i<n; i++)
max_val = max(max_val, price[i] + cutRod(price, n-i-1));
return max_val;
}
/* Driver program to test above functions */
int main()
{
int arr[] = {1, 5, 8, 9, 10, 17, 17, 20};
int size = sizeof(arr)/sizeof(arr[0]);
printf("Maximum Obtainable Value is %d\n", cutRod(arr, size));
getchar();
return 0;
}
Output:
Maximum Obtainable Value is 22
Considering the above implementation, following is recursion tree for a Rod of length 4.
cR() ---> cutRod()
                              cR(4)
                 /          /       \        \
                /          /         \        \
           cR(3)        cR(2)       cR(1)    cR(0)
          /  |  \       /   \         |
         /   |   \     /     \        |
    cR(2) cR(1) cR(0) cR(1) cR(0)   cR(0)
    /  \    |           |
   /    \   |           |
cR(1) cR(0) cR(0)     cR(0)
  |
  |
cR(0)
In the above recursion tree, cR(2) is being solved twice. We can see that there are many subproblems which are solved again and again.
Since the same subproblems are called again, this problem has the Overlapping Subproblems property. So the Rod Cutting problem has both properties
(optimal substructure and overlapping subproblems) of a dynamic programming problem. Like other typical Dynamic Programming (DP) problems,
recomputation of the same subproblems can be avoided by constructing a temporary array val[] in a bottom-up manner.
C++
// A Dynamic Programming solution for Rod cutting problem
#include<stdio.h>
#include<limits.h>
// A utility function to get the maximum of two integers
int max(int a, int b) { return (a > b)? a : b;}
/* Returns the best obtainable price for a rod of length n and
price[] as prices of different pieces */
int cutRod(int price[], int n)
{
int val[n+1];
val[0] = 0;
int i, j;
// Build the table val[] in bottom up manner and return the last entry
// from the table
for (i = 1; i<=n; i++)
{
int max_val = INT_MIN;
for (j = 0; j < i; j++)
max_val = max(max_val, price[j] + val[i-j-1]);
val[i] = max_val;
}
return val[n];
}
/* Driver program to test above functions */
int main()
{
int arr[] = {1, 5, 8, 9, 10, 17, 17, 20};
int size = sizeof(arr)/sizeof(arr[0]);
printf("Maximum Obtainable Value is %d\n", cutRod(arr, size));
getchar();
return 0;
}
Python
# A Dynamic Programming solution for Rod cutting problem
INT_MIN = -32767
# Returns the best obtainable price for a rod of length n and
# price[] as prices of different pieces
def cutRod(price, n):
    val = [0 for x in range(n+1)]
    val[0] = 0
    # Build the table val[] in bottom up manner and return
    # the last entry from the table
    for i in range(1, n+1):
        max_val = INT_MIN
        for j in range(i):
            max_val = max(max_val, price[j] + val[i-j-1])
        val[i] = max_val
    return val[n]
# Driver program to test above functions
arr = [1, 5, 8, 9, 10, 17, 17, 20]
size = len(arr)
print("Maximum Obtainable Value is " + str(cutRod(arr, size)))
Time Complexity of the above implementation is O(n^2) which is much better than the worst case time complexity of Naive Recursive
implementation.
C/C++
/* Dynamic Programming implementation of Maximum Sum Increasing
Subsequence (MSIS) problem */
#include<stdio.h>
/* maxSumIS() returns the maximum sum of increasing subsequence
in arr[] of size n */
int maxSumIS( int arr[], int n )
{
int i, j, max = 0;
int msis[n];
/* Initialize msis values for all indexes */
for ( i = 0; i < n; i++ )
msis[i] = arr[i];
/* Compute maximum sum values in bottom up manner */
for ( i = 1; i < n; i++ )
for ( j = 0; j < i; j++ )
if ( arr[i] > arr[j] && msis[i] < msis[j] + arr[i])
msis[i] = msis[j] + arr[i];
/* Pick maximum of all msis values */
for ( i = 0; i < n; i++ )
if ( max < msis[i] )
max = msis[i];
return max;
}
/* Driver program to test above function */
int main()
{
int arr[] = {1, 101, 2, 3, 100, 4, 5};
int n = sizeof(arr)/sizeof(arr[0]);
printf("Sum of maximum sum increasing subsequence is %d\n",
maxSumIS( arr, n ) );
return 0;
}
Python
# Dynamic Programming based Python implementation of Maximum Sum Increasing
# Subsequence (MSIS) problem
# maxSumIS() returns the maximum sum of increasing subsequence in arr[] of
# size n
def maxSumIS(arr, n):
    max = 0
    msis = [0 for x in range(n)]
    # Initialize msis values for all indexes
    for i in range(n):
        msis[i] = arr[i]
    # Compute maximum sum values in bottom up manner
    for i in range(1, n):
        for j in range(i):
            if arr[i] > arr[j] and msis[i] < msis[j] + arr[i]:
                msis[i] = msis[j] + arr[i]
    # Pick maximum of all msis values
    for i in range(n):
        if max < msis[i]:
            max = msis[i]
    return max
# Driver program to test above function
arr = [1, 101, 2, 3, 100, 4, 5]
n = len(arr)
print("Sum of maximum sum increasing subsequence is " +
      str(maxSumIS(arr, n)))
# This code is contributed by Bhavya Jain
Output:
Sum of maximum sum increasing subsequence is 106
C++
/* Dynamic Programming implementation of longest bitonic subsequence problem */
#include<stdio.h>
#include<stdlib.h>
/* lbs() returns the length of the Longest Bitonic Subsequence in
arr[] of size n. The function mainly creates two temporary arrays
lis[] and lds[] and returns the maximum lis[i] + lds[i] - 1.
lis[i] ==> Longest Increasing subsequence ending with arr[i]
lds[i] ==> Longest decreasing subsequence starting with arr[i]
*/
int lbs( int arr[], int n )
{
int i, j;
/* Allocate memory for LIS[] and initialize LIS values as 1 for
all indexes */
int *lis = new int[n];
for (i = 0; i < n; i++)
lis[i] = 1;
/* Compute LIS values from left to right */
for (i = 1; i < n; i++)
for (j = 0; j < i; j++)
if (arr[i] > arr[j] && lis[i] < lis[j] + 1)
lis[i] = lis[j] + 1;
/* Allocate memory for lds and initialize LDS values for
all indexes */
int *lds = new int [n];
for (i = 0; i < n; i++)
lds[i] = 1;
/* Compute LDS values from right to left */
for (i = n-2; i >= 0; i--)
for (j = n-1; j > i; j--)
if (arr[i] > arr[j] && lds[i] < lds[j] + 1)
lds[i] = lds[j] + 1;
/* Return the maximum value of lis[i] + lds[i] - 1 */
int max = lis[0] + lds[0] - 1;
for (i = 1; i < n; i++)
if (lis[i] + lds[i] - 1 > max)
max = lis[i] + lds[i] - 1;
return max;
}
/* Driver program to test above function */
int main()
{
int arr[] = {0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5,
13, 3, 11, 7, 15};
int n = sizeof(arr)/sizeof(arr[0]);
printf("Length of LBS is %d\n", lbs( arr, n ) );
return 0;
}
Java
/* Dynamic Programming implementation in Java for longest bitonic
subsequence problem */
import java.util.*;
import java.lang.*;
import java.io.*;
class LBS
{
/* lbs() returns the length of the Longest Bitonic Subsequence in
arr[] of size n. The function mainly creates two temporary arrays
lis[] and lds[] and returns the maximum lis[i] + lds[i] - 1.
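lis[i] ==> Longest Increasing subsequence ending with arr[i]
lds[i] ==> Longest decreasing subsequence starting with arr[i]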
*/
static int
{
int i,
Output:
Length of LBS is 7
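The lis[]/lds[] idea described in the comments above can be sketched in Python as follows. This is only an illustrative sketch; the function name longest_bitonic_subsequence is an assumption and does not come from the programs above.

```python
def longest_bitonic_subsequence(arr):
    """Length of the longest subsequence that first increases, then decreases."""
    n = len(arr)
    if n == 0:
        return 0

    # lis[i]: longest increasing subsequence ending with arr[i]
    lis = [1] * n
    for i in range(1, n):
        for j in range(i):
            if arr[i] > arr[j] and lis[i] < lis[j] + 1:
                lis[i] = lis[j] + 1

    # lds[i]: longest decreasing subsequence starting with arr[i]
    lds = [1] * n
    for i in range(n - 2, -1, -1):
        for j in range(n - 1, i, -1):
            if arr[i] > arr[j] and lds[i] < lds[j] + 1:
                lds[i] = lds[j] + 1

    # arr[i] is counted in both arrays, hence the -1
    return max(lis[i] + lds[i] - 1 for i in range(n))


arr = [0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]
print("Length of LBS is", longest_bitonic_subsequence(arr))  # prints 7
```

Both passes are O(n^2), so the overall time stays O(n^2) with O(n) extra space.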
C/C++
// C Program for Floyd Warshall Algorithm
#include<stdio.h>
// Number of vertices in the graph
#define V 4
/* Define Infinite as a large enough value. This value will be used
for vertices not connected to each other */
#define INF 99999
// A function to print the solution matrix
void printSolution(int dist[][V]);
// Solves the all-pairs shortest path problem using Floyd Warshall algorithm
void floydWarshell (int graph[][V])
{
/* dist[][] will be the output matrix that will finally have the shortest
distances between every pair of vertices */
int dist[V][V], i, j, k;
/* Initialize the solution matrix same as input graph matrix. Or
we can say the initial values of shortest distances are based
on shortest paths considering no intermediate vertex. */
Java
// A Java program for Floyd Warshall All Pairs Shortest
// Path algorithm.
import java.util.*;
import java.lang.*;
import java.io.*;
class AllPairShortestPath
{
final static int INF = 99999, V = 4;
void floydWarshall(int graph[][])
{
int dist[][] = new int[V][V];
int i, j, k;
/* Initialize the solution matrix same as input graph matrix.
Or we can say the initial values of shortest distances
are based on shortest paths considering no intermediate
vertex. */
for (i = 0; i < V; i++)
for (j = 0; j < V; j++)
dist[i][j] = graph[i][j];
/* Add all vertices one by one to the set of intermediate
vertices.
---> Before start of a iteration, we have shortest
distances between all pairs of vertices such that
the shortest distances consider only the vertices in
set {0, 1, 2, .. k-1} as intermediate vertices.
----> After the end of a iteration, vertex no. k is added
to the set of intermediate vertices and the set
becomes {0, 1, 2, .. k} */
for (k = 0; k < V; k++)
{
// Pick all vertices as source one by one
for (i = 0; i < V; i++)
{
// Pick all vertices as destination for the
// above picked source
for (j = 0; j < V; j++)
{
// If vertex k is on the shortest path from
// i to j, then update the value of dist[i][j]
if (dist[i][k] + dist[k][j] < dist[i][j])
dist[i][j] = dist[i][k] + dist[k][j];
}
}
}
// Print the shortest distance matrix
printSolution(dist);
}
void printSolution(int dist[][])
{
System.out.println("Following matrix shows the shortest "+
"distances between every pair of vertices");
for (int i=0; i<V; ++i)
{
for (int j=0; j<V; ++j)
{
if (dist[i][j]==INF)
System.out.print("INF ");
else
System.out.print(dist[i][j]+" ");
}
System.out.println();
}
}
// Driver program to test above function
public static void main (String[] args)
{
/* Let us create the following weighted graph
      10
(0)------->(3)
 |         /|\
5|          |
 |          | 1
\|/         |
(1)------->(2)
      3           */
int graph[][] = { {0, 5, INF, 10},
{INF, 0, 3, INF},
{INF, INF, 0, 1},
{INF, INF, INF, 0}
};
AllPairShortestPath a = new AllPairShortestPath();
// Print the solution
a.floydWarshall(graph);
}
}
// Contributed by Aakash Hasija
Output:
Following matrix shows the shortest distances between every pair of vertices
0   5   8   9
INF 0   3   4
INF INF 0   1
INF INF INF 0
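The same triple loop can be sketched in Python. This is a minimal sketch, not the program above: the function name floyd_warshall is an assumption, and float("inf") is used for missing edges instead of the 99999 sentinel, so no "large enough" constant has to be chosen.

```python
INF = float("inf")  # assumption: infinity marks "no edge"


def floyd_warshall(graph):
    """Return the all-pairs shortest distance matrix for an adjacency matrix."""
    n = len(graph)
    # Start from the direct edges (no intermediate vertices allowed)
    dist = [row[:] for row in graph]
    # Allow vertex k as an intermediate vertex, one k at a time
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


graph = [[0, 5, INF, 10],
         [INF, 0, 3, INF],
         [INF, INF, 0, 1],
         [INF, INF, INF, 0]]
for row in floyd_warshall(graph):
    print(" ".join("INF" if d == INF else str(d) for d in row))
```

The input matrix is copied, so the caller's graph is left unchanged.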
Following is Dynamic Programming solution. It stores the solutions to subproblems in two arrays P[][] and C[][], and reuses the calculated values.
// Dynamic Programming Solution for Palindrome Partitioning Problem
#include <stdio.h>
#include <string.h>
#include <limits.h>
#include <stdbool.h> // for bool, true and false when compiled as C
// A utility function to get minimum of two integers
int min (int a, int b) { return (a < b)? a : b; }
// Returns the minimum number of cuts needed to partition a string
// such that every part is a palindrome
int minPalPartion(char *str)
{
// Get the length of the string
int n = strlen(str);
/* Create two arrays to build the solution in bottom up manner
C[i][j] = Minimum number of cuts needed for palindrome partitioning
of substring str[i..j]
P[i][j] = true if substring str[i..j] is palindrome, else false
Note that C[i][j] is 0 if P[i][j] is true */
int C[n][n];
bool P[n][n];
int i, j, k, L; // different looping variables
// Every substring of length 1 is a palindrome
for (i=0; i<n; i++)
{
P[i][i] = true;
C[i][i] = 0;
}
/* L is substring length. Build the solution in bottom up manner by
considering all substrings of length starting from 2 to n.
The loop structure is the same as in the Matrix Chain Multiplication problem (
See http://www.geeksforgeeks.org/archives/15553 )*/
for (L=2; L<=n; L++)
{
// For substring of length L, set different possible starting indexes
for (i=0; i<n-L+1; i++)
{
j = i+L-1; // Set ending index
// If L is 2, then we just need to compare two characters. Else
// need to check two corner characters and value of P[i+1][j-1]
if (L == 2)
P[i][j] = (str[i] == str[j]);
else
P[i][j] = (str[i] == str[j]) && P[i+1][j-1];
// IF str[i..j] is palindrome, then C[i][j] is 0
if (P[i][j] == true)
C[i][j] = 0;
else
{
// Make a cut at every possible location starting from i to j,
// and get the minimum cost cut.
C[i][j] = INT_MAX;
for (k=i; k<=j-1; k++)
C[i][j] = min (C[i][j], C[i][k] + C[k+1][j]+1);
}
}
}
// Return the min cut value for complete string. i.e., str[0..n-1]
return C[0][n-1];
}
// Driver program to test above function
int main()
{
char str[] = "ababbbabbababa";
printf("Min cuts needed for Palindrome Partitioning is %d",
minPalPartion(str));
return 0;
}
Output:
Min cuts needed for Palindrome Partitioning is 3
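The program above fills C[i][j] for every substring, which takes O(n^3) time. A commonly used variation keeps only one value per prefix and runs in O(n^2). The Python sketch below shows that variation, not the program above; the names min_palindrome_cuts, is_pal and cuts are assumptions chosen for illustration.

```python
def min_palindrome_cuts(s):
    """Minimum cuts so that every part of s is a palindrome."""
    n = len(s)
    if n == 0:
        return 0
    # is_pal[i][j] is True when s[i..j] is a palindrome
    is_pal = [[False] * n for _ in range(n)]
    # cuts[j] is the minimum number of cuts for the prefix s[0..j]
    cuts = [0] * n
    for j in range(n):
        best = j  # worst case: cut between every pair of characters
        for i in range(j + 1):
            if s[i] == s[j] and (j - i < 2 or is_pal[i + 1][j - 1]):
                is_pal[i][j] = True
                # s[i..j] can be the last part of the partition
                best = 0 if i == 0 else min(best, cuts[i - 1] + 1)
        cuts[j] = best
    return cuts[n - 1]


print(min_palindrome_cuts("ababbbabbababa"))  # prints 3
```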
C/C++
// A recursive C program for partition problem
#include <stdio.h>
#include <stdbool.h> // for bool, true and false when compiled as C
// A utility function that returns true if there is
// a subset of arr[] with sum equal to given sum
bool isSubsetSum (int arr[], int n, int sum)
{
// Base Cases
if (sum == 0)
return true;
if (n == 0 && sum != 0)
return false;
// If last element is greater than sum, then
// ignore it
if (arr[n-1] > sum)
return isSubsetSum (arr, n-1, sum);
/* else, check if sum can be obtained by any of
the following
(a) including the last element
(b) excluding the last element
*/
return isSubsetSum (arr, n-1, sum) ||
isSubsetSum (arr, n-1, sum-arr[n-1]);
}
// Returns true if arr[] can be partitioned in two
// subsets of equal sum, otherwise false
bool findPartiion (int arr[], int n)
{
// Calculate sum of the elements in array
int sum = 0;
for (int i = 0; i < n; i++)
sum += arr[i];
// If sum is odd, there cannot be two subsets
// with equal sum
if (sum%2 != 0)
return false;
Java
// A recursive Java solution for partition problem
import java.io.*;
class Partition
{
// A utility function that returns true if there is a
// subset of arr[] with sum equal to given sum
static boolean isSubsetSum (int arr[], int n, int sum)
{
// Base Cases
if (sum == 0)
return true;
if (n == 0 && sum != 0)
return false;
// If last element is greater than sum, then ignore it
if (arr[n-1] > sum)
return isSubsetSum (arr, n-1, sum);
/* else, check if sum can be obtained by any of
the following
(a) including the last element
(b) excluding the last element
*/
return isSubsetSum (arr, n-1, sum) ||
isSubsetSum (arr, n-1, sum-arr[n-1]);
}
// Returns true if arr[] can be partitioned in two
// subsets of equal sum, otherwise false
static boolean findPartition (int arr[], int n)
{
// Calculate sum of the elements in array
int sum = 0;
for (int i = 0; i < n; i++)
sum += arr[i];
// If sum is odd, there cannot be two subsets
// with equal sum
if (sum%2 != 0)
return false;
// Find if there is subset with sum equal to half
// of total sum
return isSubsetSum (arr, n, sum/2);
}
/*Driver function to check for above function*/
public static void main (String[] args)
{
int arr[] = {3, 1, 5, 9, 12};
int n = arr.length;
if (findPartition(arr, n) == true)
System.out.println("Can be divided into two "+
"subsets of equal sum");
else
System.out.println("Can not be divided into " +
"two subsets of equal sum");
}
}
Time Complexity: O(2^n). In the worst case, this solution tries two possibilities (include or exclude) for every element.
C/C++
// A Dynamic Programming based C program to partition problem
#include <stdio.h>
#include <stdbool.h> // for bool, true and false when compiled as C
// Returns true if arr[] can be partitioned in two subsets of
// equal sum, otherwise false
bool findPartiion (int arr[], int n)
{
int sum = 0;
int i, j;
// Calculate sum of all elements
for (i = 0; i < n; i++)
sum += arr[i];
if (sum%2 != 0)
return false;
bool part[sum/2+1][n+1];
// initialize top row as true
for (i = 0; i <= n; i++)
part[0][i] = true;
// initialize leftmost column, except part[0][0], as false
for (i = 1; i <= sum/2; i++)
part[i][0] = false;
// Fill the partition table in bottom up manner
for (i = 1; i <= sum/2; i++)
{
for (j = 1; j <= n; j++)
{
part[i][j] = part[i][j-1];
if (i >= arr[j-1])
part[i][j] = part[i][j] || part[i - arr[j-1]][j-1];
}
}
/* // uncomment this part to print table
for (i = 0; i <= sum/2; i++)
{
for (j = 0; j <= n; j++)
printf ("%4d", part[i][j]);
printf("\n");
} */
return part[sum/2][n];
}
// Driver program to test above function
int main()
{
int arr[] = {3, 1, 1, 2, 2, 1};
int n = sizeof(arr)/sizeof(arr[0]);
if (findPartiion(arr, n) == true)
printf("Can be divided into two subsets of equal sum");
else
printf("Can not be divided into two subsets of equal sum");
return 0;
}
Java
// A dynamic programming based Java program for partition problem
import java.io.*;
class Partition {
// Returns true if arr[] can be partitioned in two subsets of
// equal sum, otherwise false
static boolean findPartition (int arr[], int n)
{
int sum = 0;
int i, j;
// Calculate sum of all elements
for (i = 0; i < n; i++)
sum += arr[i];
if (sum%2 != 0)
return false;
boolean part[][]=new boolean[sum/2+1][n+1];
// initialize top row as true
for (i = 0; i <= n; i++)
part[0][i] = true;
// initialize leftmost column, except part[0][0], as false
for (i = 1; i <= sum/2; i++)
part[i][0] = false;
// Fill the partition table in bottom up manner
for (i = 1; i <= sum/2; i++)
{
for (j = 1; j <= n; j++)
{
part[i][j] = part[i][j-1];
if (i >= arr[j-1])
part[i][j] = part[i][j] ||
part[i - arr[j-1]][j-1];
}
}
/* // uncomment this part to print table
for (i = 0; i <= sum/2; i++)
{
for (j = 0; j <= n; j++)
System.out.print(part[i][j] ? "   1" : "   0");
System.out.println();
} */
return part[sum/2][n];
}
/*Driver function to check for above function*/
public static void main (String[] args)
{
int arr[] = {3, 1, 1, 2, 2,1};
int n = arr.length;
if (findPartition(arr, n) == true)
System.out.println("Can be divided into two " +
"subsets of equal sum");
else
System.out.println("Can not be divided into" +
" two subsets of equal sum");
}
}
/* This code is contributed by Devesh Agrawal */
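The table part[][] above only ever reads the previous column, so it can be collapsed into a single array of reachable sums. The Python sketch below shows this space-reduced variation under the assumption that all elements are non-negative integers; the name can_partition is an assumption.

```python
def can_partition(arr):
    """True if arr can be split into two subsets with equal sum."""
    total = sum(arr)
    if total % 2 != 0:
        return False  # an odd total can never be split evenly
    half = total // 2
    # reachable[s] is True when some subset of the elements seen so far sums to s
    reachable = [False] * (half + 1)
    reachable[0] = True
    for x in arr:
        # Go downwards so that each element is used at most once
        for s in range(half, x - 1, -1):
            if reachable[s - x]:
                reachable[s] = True
    return reachable[half]


print(can_partition([3, 1, 1, 2, 2, 1]))  # prints True
```

Time is still O(sum * n), but the extra space drops from O(sum * n) to O(sum).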
Following diagram shows the values in partition table. The diagram is taken from the wiki page of partition problem.
Please note that the total cost function is not sum of extra spaces, but sum of cubes (or square is also used) of extra spaces. The idea behind this
cost function is to balance the spaces among lines. For example, consider the following two arrangement of same set of words:
1) There are 3 lines. One line has 3 extra spaces and all other lines have 0 extra spaces. Total extra spaces = 3 + 0 + 0 = 3. Total cost = 3*3*3 +
0*0*0 + 0*0*0 = 27.
2) There are 3 lines. Each of the 3 lines has one extra space. Total extra spaces = 1 + 1 + 1 = 3. Total cost = 1*1*1 + 1*1*1 + 1*1*1 = 3.
Total extra spaces are 3 in both scenarios, but second arrangement should be preferred because extra spaces are balanced in all three lines. The
cost function with cubic sum serves the purpose because the value of total cost in second scenario is less.
Method 1 (Greedy Solution)
The greedy solution is to place as many words as possible in the first line, then do the same for the second line, and so on until all words are
placed. This gives an optimal solution in many cases, but not in all cases. For example, consider the string
"aaa bb cc ddddd" and a line width of 6. The greedy method will produce the following output.
aaa bb
cc
ddddd
Extra spaces in the above 3 lines are 0, 4 and 1 respectively. So total cost is 0 + 64 + 1 = 65.
But the above solution is not the best solution. The following arrangement has more balanced spaces and therefore a lower value of the total cost function.
aaa
bb cc
ddddd
Extra spaces in the above 3 lines are 3, 1 and 1 respectively. So total cost is 27 + 1 + 1 = 29.
Despite being sub-optimal in some cases, the greedy approach is used by many word processors like MS Word and OpenOffice.org Writer.
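The two costs quoted above (65 for the greedy arrangement and 29 for the balanced one) can be checked with a short Python sketch. The helper names greedy_wrap and cubic_cost are assumptions, and the sketch assumes every single word fits within the line width.

```python
def greedy_wrap(words, width):
    """Put as many words as possible on each line, left to right."""
    lines, current, used = [], [], 0
    for w in words:
        # Length of the line if w is appended (one space between words)
        needed = len(w) if not current else used + 1 + len(w)
        if current and needed > width:
            lines.append(current)
            current, used = [w], len(w)
        else:
            current.append(w)
            used = needed
    if current:
        lines.append(current)
    return lines


def cubic_cost(lines, width):
    """Sum of cubes of the extra spaces left at the end of each line."""
    total = 0
    for line in lines:
        length = sum(len(w) for w in line) + len(line) - 1
        total += (width - length) ** 3
    return total


words = "aaa bb cc ddddd".split()
greedy = greedy_wrap(words, 6)
balanced = [["aaa"], ["bb", "cc"], ["ddddd"]]
print(cubic_cost(greedy, 6))    # prints 65
print(cubic_cost(balanced, 6))  # prints 29
```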
Method 2 (Dynamic Programming)
The following Dynamic Programming approach strictly follows the algorithm given in the solution of the Cormen book. First we compute the costs of all possible lines in a
2D table lc[][]. The value lc[i][j] indicates the cost of putting words from i to j in a single line, where i and j are indexes of words in the input
sequence. If a sequence of words from i to j cannot fit in a single line, then lc[i][j] is considered infinite (to keep it from being a part of the
solution). Once we have the lc[][] table constructed, we can calculate the total cost using the following recursive formula. In the formula, c[j] is
the optimized total cost for arranging words from 1 to j.
c[j] = 0 if j = 0
c[j] = min { c[i-1] + lc[i][j] } over all 1 <= i <= j if j > 0
The above recursion has the overlapping subproblems property. For example, the solution of subproblem c(2) is used by c(3), c(4) and so on. So
Dynamic Programming is used to store the results of subproblems. The array c[] can be computed from left to right, since each value depends only
on earlier values.
To print the output, we keep track of which words go on which lines by keeping a parallel array p[] that points to where each c value came from.
The last line starts at word p[n] and goes through word n. The previous line starts at word p[p[n] - 1] and goes through word p[n] - 1, etc. The function
printSolution() uses p[] to print the solution.
In the below program, input is an array l[] that represents lengths of words in a sequence. The value l[i] indicates the length of the ith word (i starts
from 1) in the input sequence.
// A Dynamic programming solution for Word Wrap Problem
#include <limits.h>
#include <stdio.h>
#define INF INT_MAX
// A utility function to print the solution
int printSolution (int p[], int n);
// l[] represents lengths of different words in input sequence. For example,
// l[] = {3, 2, 2, 5} is for a sentence like "aaa bb cc ddddd". n is size of
// l[] and M is line width (maximum no. of characters that can fit in a line)
void solveWordWrap (int l[], int n, int M)
{
// For simplicity, 1 extra space is used in all below arrays
// extras[i][j] will have number of extra spaces if words from i
// to j are put in a single line
int extras[n+1][n+1];
// lc[i][j] will have cost of a line which has words from
// i to j
int lc[n+1][n+1];
// c[i] will have total cost of optimal arrangement of words
// from 1 to i
int c[n+1];
// p[] is used to print the solution.
int p[n+1];
int i, j;
// calculate extra spaces in a single line. The value extra[i][j]
// indicates extra spaces if words from word number i to j are
// placed in a single line
for (i = 1; i <= n; i++)
{
extras[i][i] = M - l[i-1];
for (j = i+1; j <= n; j++)
extras[i][j] = extras[i][j-1] - l[j-1] - 1;
}
// Calculate line cost corresponding to the above calculated extra
// spaces. The value lc[i][j] indicates cost of putting words from
// word number i to j in a single line
for (i = 1; i <= n; i++)
{
for (j = i; j <= n; j++)
{
if (extras[i][j] < 0)
lc[i][j] = INF;
else if (j == n && extras[i][j] >= 0)
lc[i][j] = 0;
else
lc[i][j] = extras[i][j]*extras[i][j]; // square of extra spaces (a cube can be used instead)
}
}
// Calculate minimum cost and find minimum cost arrangement.
// The value c[j] indicates optimized cost to arrange words
// from word number 1 to j.
c[0] = 0;
for (j = 1; j <= n; j++)
{
c[j] = INF;
for (i = 1; i <= j; i++)
{
if (c[i-1] != INF && lc[i][j] != INF && (c[i-1] + lc[i][j] < c[j]))
{
c[j] = c[i-1] + lc[i][j];
p[j] = i;
}
}
}
printSolution(p, n);
}
int printSolution (int p[], int n)
{
int k;
if (p[n] == 1)
k = 1;
else
k = printSolution (p, p[n]-1) + 1;
printf ("Line number %d: From word no. %d to %d \n", k, p[n], n);
return k;
}
// Driver program to test above functions
int main()
{
int l[] = {3, 2, 2, 5};
int n = sizeof(l)/sizeof(l[0]);
int M = 6;
solveWordWrap (l, n, M);
return 0;
}
Output:
Line number 1: From word no. 1 to 1
Line number 2: From word no. 2 to 3
Line number 3: From word no. 4 to 4
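The recurrence for c[] and the backtracking through p[] can also be sketched compactly in Python. This is an illustrative sketch that follows the same cost rule as the C program (square of extra spaces, last line free); the name solve_word_wrap and the returned list of (first word, last word) pairs are assumptions, and every word is assumed to fit within the line width.

```python
def solve_word_wrap(lengths, width):
    """Return a list of (first_word, last_word) pairs, 1-based, one per line."""
    n = len(lengths)
    INF = float("inf")
    cost = [0] + [INF] * n   # cost[j]: best total cost for words 1..j
    start = [0] * (n + 1)    # start[j]: first word of the last line for words 1..j
    for j in range(1, n + 1):
        line_len = -1        # length of words i..j with single spaces between them
        for i in range(j, 0, -1):
            line_len += lengths[i - 1] + 1
            if line_len > width:
                break        # words i..j no longer fit on one line
            extra = width - line_len
            line_cost = 0 if j == n else extra * extra  # last line costs nothing
            if cost[i - 1] + line_cost < cost[j]:
                cost[j] = cost[i - 1] + line_cost
                start[j] = i
    # Walk back from the last word to recover the lines
    lines = []
    j = n
    while j > 0:
        lines.append((start[j], j))
        j = start[j] - 1
    lines.reverse()
    return lines


for k, (first, last) in enumerate(solve_word_wrap([3, 2, 2, 5], 6), 1):
    print("Line number %d: From word no. %d to %d" % (k, first, last))
```

Computing the line length incrementally avoids the separate extras[][] and lc[][] tables, so the extra space is O(n).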
Output:
Length of maximum size chain is 3
The Box Stacking problem is a variation of LIS problem. We need to build a maximum height stack.
Following are the key points to note in the problem statement:
1) A box can be placed on top of another box only if both width and depth of the upper placed box are smaller than width and depth of the lower
box respectively.
2) We can rotate boxes. For example, if there is a box with dimensions {1x2x3} where 1 is the height and 2x3 is the base, then there can be three possibilities,
{1x2x3}, {2x1x3} and {3x1x2}.
3) We can use multiple instances of boxes. What it means is, we can have two different rotations of a box as part of our maximum height stack.
Following is the solution based on DP solution of LIS problem.
1) Generate all 3 rotations of all boxes. The size of the rotation array becomes 3 times the size of the original array. For simplicity, we consider width as
always smaller than or equal to depth.
2) Sort the above generated 3n boxes in decreasing order of base area.
3) After sorting the boxes, the problem is same as LIS with following optimal substructure property.
MSH(i) = Maximum possible Stack Height with box i at top of stack
MSH(i) = { Max ( MSH(j) ) + height(i) } where j < i and width(j) > width(i) and depth(j) > depth(i).
If there is no such j then MSH(i) = height(i)
4) To get the overall maximum height, we return max(MSH(i)) over all 3n rotated boxes.
Following is a C++ implementation of the above solution.
/* Dynamic Programming implementation of Box Stacking problem */
#include<stdio.h>
#include<stdlib.h>
/* Representation of a box */
struct Box
{
// h --> height, w --> width, d --> depth
int h, w, d; // for simplicity of solution, always keep w <= d
};
// A utility function to get minimum of two integers
int min (int x, int y)
{ return (x < y)? x : y; }
// A utility function to get maximum of two integers
int max (int x, int y)
{ return (x > y)? x : y; }
/* Following function is needed for library function qsort(). We
use qsort() to sort boxes in decreasing order of base area.
Refer following link for help of qsort() and compare()
http://www.cplusplus.com/reference/clibrary/cstdlib/qsort/ */
int compare (const void *a, const void * b)
{
return ( (*(Box *)b).d * (*(Box *)b).w ) -
( (*(Box *)a).d * (*(Box *)a).w );
}
/* Returns the height of the tallest stack that can be formed with the given types of boxes */
int maxStackHeight( Box arr[], int n )
{
/* Create an array of all rotations of given boxes
For example, for a box {1, 2, 3}, we consider three
instances{{1, 2, 3}, {2, 1, 3}, {3, 1, 2}} */
Box rot[3*n];
int index = 0;
for (int i = 0; i < n; i++)
{
// Copy the original box
rot[index] = arr[i];
index++;
// First rotation of box
rot[index].h = arr[i].w;
rot[index].d = max(arr[i].h, arr[i].d);
rot[index].w = min(arr[i].h, arr[i].d);
index++;
// Second rotation of box
rot[index].h = arr[i].d;
rot[index].d = max(arr[i].h, arr[i].w);
rot[index].w = min(arr[i].h, arr[i].w);
index++;
}
// Now the number of boxes is 3n
n = 3*n;
/* Sort the array rot[] in decreasing order, using library
function for quick sort */
qsort (rot, n, sizeof(rot[0]), compare);
// Uncomment following two lines to print all rotations
// for (int i = 0; i < n; i++ )
//    printf("%d x %d x %d\n", rot[i].h, rot[i].w, rot[i].d);
/* Initialize msh values for all indexes
msh[i] --> Maximum possible Stack Height with box i on top */
int msh[n];
for (int i = 0; i < n; i++ )
msh[i] = rot[i].h;
/* Compute optimized msh values in bottom up manner */
for (int i = 1; i < n; i++ )
for (int j = 0; j < i; j++ )
if ( rot[i].w < rot[j].w &&
rot[i].d < rot[j].d &&
msh[i] < msh[j] + rot[i].h
)
{
msh[i] = msh[j] + rot[i].h;
}
/* Pick maximum of all msh values */
int max = -1;
for ( int i = 0; i < n; i++ )
if ( max < msh[i] )
max = msh[i];
return max;
}
/* Driver program to test above function */
int main()
{
Box arr[] = { {4, 6, 7}, {1, 2, 3}, {4, 5, 6}, {10, 12, 32} };
int n = sizeof(arr)/sizeof(arr[0]);
printf("The maximum possible height of stack is %d\n",
maxStackHeight (arr, n) );
return 0;
}
Output:
The maximum possible height of stack is 60
In the above program, the given input boxes are {4, 6, 7}, {1, 2, 3}, {4, 5, 6}, {10, 12, 32}. Following are all rotations of the boxes in decreasing
order of base area.
10 x 12 x 32
12 x 10 x 32
32 x 10 x 12
4 x 6 x 7
4 x 5 x 6
6 x 4 x 7
5 x 4 x 6
7 x 4 x 6
6 x 4 x 5
1 x 2 x 3
2 x 1 x 3
3 x 1 x 2
The height 60 is obtained by boxes { {3, 1, 2}, {1, 2, 3}, {6, 4, 5}, {4, 5, 6}, {4, 6, 7}, {32, 10, 12}, {10, 12, 32}}
Time Complexity: O(n^2)
Auxiliary Space: O(n)
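The four steps (generate rotations, sort by base area, run the LIS-style recurrence, take the maximum) can be sketched in Python as below. The name max_stack_height and the tuple layout (height, width, depth) are assumptions made for this sketch.

```python
def max_stack_height(boxes):
    """boxes: list of (a, b, c) dimensions. Returns the tallest possible stack."""
    # Step 1: every dimension in turn becomes the height; keep width <= depth
    rotations = []
    for a, b, c in boxes:
        for h, x, y in ((a, b, c), (b, a, c), (c, a, b)):
            rotations.append((h, min(x, y), max(x, y)))

    # Step 2: sort in decreasing order of base area
    rotations.sort(key=lambda r: r[1] * r[2], reverse=True)

    # Step 3: msh[i] = best stack height with rotation i on top
    msh = [r[0] for r in rotations]
    for i in range(1, len(rotations)):
        hi, wi, di = rotations[i]
        for j in range(i):
            _, wj, dj = rotations[j]
            if wi < wj and di < dj and msh[i] < msh[j] + hi:
                msh[i] = msh[j] + hi

    # Step 4: the answer is the best value over all rotations
    return max(msh, default=0)


boxes = [(4, 6, 7), (1, 2, 3), (4, 5, 6), (10, 12, 32)]
print("The maximum possible height of stack is", max_stack_height(boxes))  # prints 60
```

Two rotations with equal base area can never be stacked on each other under the strict width and depth test, so the order of ties in the sort does not affect the result.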
Write a function int fib(int n) that returns F(n). For example, if n = 0, then fib() should return 0. If n = 1, then it should return 1. For n > 1, it should
return F(n-1) + F(n-2).
Following are different methods to get the nth Fibonacci number.
Method 1 ( Use recursion )
A simple method that is a direct recursive implementation of the mathematical recurrence relation given above.
#include<stdio.h>
int fib(int n)
{
if (n <= 1)
return n;
return fib(n-1) + fib(n-2);
}
int main ()
{
int n = 9;
printf("%d", fib(n));
getchar();
return 0;
}
Extra Space: O(n) if we consider the function call stack size, otherwise O(1).
Method 2 ( Use Dynamic Programming )
We can avoid the repeated work done in Method 1 by storing the Fibonacci numbers calculated so far.
#include<stdio.h>
int fib(int n)
{
/* Declare an array to store Fibonacci numbers.
One extra slot is kept so that f[1] is valid even when n = 0. */
int f[n+2];
int i;
/* 0th and 1st number of the series are 0 and 1*/
f[0] = 0;
f[1] = 1;
for (i = 2; i <= n; i++)
{
/* Add the previous 2 numbers in the series
and store it */
f[i] = f[i-1] + f[i-2];
}
return f[n];
}
int main ()
{
int n = 9;
printf("%d", fib(n));
getchar();
return 0;
}
#include <stdio.h>
/* Helper function that multiplies 2 matricies F and M of size 2*2, and
puts the multiplication result back to F[][] */
void multiply(int F[2][2], int M[2][2]);
/* Helper function that calculates F[][] raise to the power n and puts the
result in F[][]
Note that this function is desinged only for fib() and won't work as general
power function */
void power(int F[2][2], int n);
int fib(int n)
{
int F[2][2] = {{1,1},{1,0}};
if (n == 0)
return 0;
power(F, n-1);
return F[0][0];
}
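void multiply(int F[2][2], int M[2][2])
{
int x = F[0][0]*M[0][0] + F[0][1]*M[1][0];
int y = F[0][0]*M[0][1] + F[0][1]*M[1][1];
int z = F[1][0]*M[0][0] + F[1][1]*M[1][0];
int w = F[1][0]*M[0][1] + F[1][1]*M[1][1];
F[0][0] = x;
F[0][1] = y;
F[1][0] = z;
F[1][1] = w;
}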
void power(int F[2][2], int n)
{
int i;
int M[2][2] = {{1,1},{1,0}};
// multiply F by M = {{1,1},{1,0}} n - 1 times
for (i = 2; i <= n; i++)
multiply(F, M);
}
/* Driver program to test above function */
int main()
{
int n = 9;
printf("%d", fib(n));
getchar();
return 0;
}
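The power() function above multiplies the matrix n - 1 times, so the program takes O(n) time. The Python sketch below swaps in repeated squaring, which needs only O(log n) matrix multiplications. It is a variation on the program above, not a translation of it, and the names mat_mult and fib are assumptions.

```python
def mat_mult(a, b):
    """Multiply two 2x2 matrices and return the product as a new matrix."""
    return [
        [a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]],
        [a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]],
    ]


def fib(n):
    """nth Fibonacci number, read from the top-left entry of {{1,1},{1,0}}^(n-1)."""
    if n == 0:
        return 0
    result = [[1, 0], [0, 1]]  # identity matrix
    base = [[1, 1], [1, 0]]
    power = n - 1
    while power > 0:
        if power & 1:          # this bit of the exponent is set
            result = mat_mult(result, base)
        base = mat_mult(base, base)
        power >>= 1
    return result[0][0]


print(fib(9))  # prints 34
```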
The first element is 1, so we can only go to 3. The second element is 3, so we can make at most 3 steps, e.g. to 5, 8 or 9.
Method 1 (Naive Recursive Approach)
A naive approach is to start from the first element and recursively call for all the elements reachable from the first element. The minimum number of
jumps to reach the end from the first element can be calculated using the minimum number of jumps needed to reach the end from the elements reachable from it.
minJumps(start, end) = 1 + Min ( minJumps(k, end) ) for all k reachable from start
#include <stdio.h>
#include <limits.h>
// Returns minimum number of jumps to reach arr[h] from arr[l]
int minJumps(int arr[], int l, int h)
{
// Base case: when source and destination are same
if (h == l)
return 0;
// When nothing is reachable from the given source
if (arr[l] == 0)
return INT_MAX;
// Traverse through all the points reachable from arr[l]. Recursively
// get the minimum number of jumps needed to reach arr[h] from these
// reachable points.
int min = INT_MAX;
for (int i = l+1; i <= h && i <= l + arr[l]; i++)
{
int jumps = minJumps(arr, i, h);
if(jumps != INT_MAX && jumps + 1 < min)
min = jumps + 1;
}
return min;
}
// Driver program to test above function
int main()
{
int arr[] = {1, 3, 6, 3, 2, 3, 6, 8, 9, 5};
int n = sizeof(arr)/sizeof(arr[0]);
printf("Minimum number of jumps to reach end is %d ", minJumps(arr, 0, n-1));
return 0;
}
If we trace the execution of this method, we can see that there will be overlapping subproblems. For example, minJumps(3, 9) will be called two
times as arr[3] is reachable from arr[1] and arr[2]. So this problem has both properties (optimal substructure and overlapping subproblems) of
Dynamic Programming.
if (n == 0 || arr[0] == 0)
return INT_MAX;
jumps[0] = 0;
// Find the minimum number of jumps to reach arr[i]
// from arr[0], and assign this value to jumps[i]
for (i = 1; i < n; i++)
{
jumps[i] = INT_MAX;
for (j = 0; j < i; j++)
{
if (i <= j + arr[j] && jumps[j] != INT_MAX)
{
jumps[i] = min(jumps[i], jumps[j] + 1);
break;
}
}
}
return jumps[n-1];
}
// Driver program to test above function
int main()
{
int arr[] = {1, 3, 6, 1, 0, 9};
int size = sizeof(arr)/sizeof(int);
printf("Minimum number of jumps to reach end is %d ", minJumps(arr,size));
return 0;
}
Output:
Minimum number of jumps to reach end is 3
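The bottom-up idea, where jumps[i] holds the minimum number of jumps needed to reach arr[i] from arr[0], can be sketched in Python as follows. The name min_jumps is an assumption, and this sketch returns None instead of INT_MAX when the end cannot be reached.

```python
def min_jumps(arr):
    """Minimum number of jumps from the first to the last element, or None."""
    n = len(arr)
    if n == 0:
        return None
    INF = float("inf")
    # jumps[i]: minimum number of jumps needed to reach arr[i] from arr[0]
    jumps = [INF] * n
    jumps[0] = 0
    for i in range(1, n):
        for j in range(i):
            # arr[i] is reachable from arr[j] if j + arr[j] >= i
            if j + arr[j] >= i and jumps[j] != INF:
                jumps[i] = min(jumps[i], jumps[j] + 1)
    return None if jumps[-1] == INF else jumps[-1]


print(min_jumps([1, 3, 6, 1, 0, 9]))  # prints 3
```

Time complexity of this sketch is O(n^2) and it uses O(n) extra space.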
0 1 1 0 1
1 1 0 1 0
0 1 1 1 0
1 1 1 1 0
1 1 1 1 1
0 0 0 0 0
Algorithm:
Let the given binary matrix be M[R][C]. The idea of the algorithm is to construct an auxiliary size matrix S[][] in which each entry S[i][j] represents
size of the square sub-matrix with all 1s including M[i][j] where M[i][j] is the rightmost and bottommost entry in sub-matrix.
1) Construct a size matrix S[R][C] for the given M[R][C].
   a) Copy the first row and first column as they are from M[][] to S[][]
   b) For other entries, use the following expressions to construct S[][]
      If M[i][j] is 1 then
         S[i][j] = min(S[i][j-1], S[i-1][j], S[i-1][j-1]) + 1
      Else /* If M[i][j] is 0 */
         S[i][j] = 0
2) Find the maximum entry in S[R][C]
3) Using the value and coordinates of the maximum entry in S[][], print
   the sub-matrix of M[][]
For the given M[R][C] in above example, constructed S[R][C] would be:
0 1 1 0 1
1 1 0 1 0
0 1 1 1 0
1 1 2 2 0
1 2 2 3 1
0 0 0 0 0
The value of maximum entry in above matrix is 3 and coordinates of the entry are (4, 3). Using the maximum value and its coordinates, we can find
out the required sub-matrix.
#include<stdio.h>
#define bool int
#define R 6
#define C 5
void printMaxSubSquare(bool M[R][C])
{
int i,j;
int S[R][C];
int max_of_s, max_i, max_j;
/* Set first column of S[][]*/
for(i = 0; i < R; i++)
S[i][0] = M[i][0];
/* Set first row of S[][]*/
for(j = 0; j < C; j++)
S[0][j] = M[0][j];
/* Construct other entries of S[][]*/
for(i = 1; i < R; i++)
{
for(j = 1; j < C; j++)
{
if(M[i][j] == 1)
S[i][j] = min(S[i][j-1], S[i-1][j], S[i-1][j-1]) + 1;
else
S[i][j] = 0;
}
}
Time Complexity: O(m*n) where m is number of rows and n is number of columns in the given matrix.
Auxiliary Space: O(m*n) where m is number of rows and n is number of columns in the given matrix.
Algorithmic Paradigm: Dynamic Programming
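The construction of S[][] and the search for its maximum entry can be sketched in Python as below. The name max_square_submatrix and the (size, (row, column)) return value are assumptions; the coordinates are 0-based and identify the bottom-right cell of the square.

```python
def max_square_submatrix(m):
    """Return (side, (row, col)) of the largest all-1 square in binary matrix m."""
    if not m or not m[0]:
        return 0, None
    rows, cols = len(m), len(m[0])
    # s[i][j]: side of the largest all-1 square whose bottom-right cell is (i, j)
    s = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, None
    for i in range(rows):
        for j in range(cols):
            if m[i][j] == 1:
                if i == 0 or j == 0:
                    s[i][j] = 1  # first row and column are copied from m
                else:
                    s[i][j] = min(s[i][j - 1], s[i - 1][j], s[i - 1][j - 1]) + 1
            if s[i][j] > best:
                best, best_pos = s[i][j], (i, j)
    return best, best_pos


M = [[0, 1, 1, 0, 1],
     [1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0],
     [1, 1, 1, 1, 0],
     [1, 1, 1, 1, 1],
     [0, 0, 0, 0, 0]]
print(max_square_submatrix(M))  # prints (3, (4, 3))
```

The square itself is then the block of rows 4-3+1 .. 4 and columns 3-3+1 .. 3 of M, i.e. rows 2 to 4 and columns 1 to 3.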
Ugly Numbers
Ugly numbers are numbers whose only prime factors are 2, 3 or 5. The sequence
1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, ...
shows the first 11 ugly numbers. By convention, 1 is included.
Write a program to find and print the 150th ugly number.
METHOD 1 (Simple)
Thanks to Nedylko Draganov for suggesting this solution.
Algorithm:
Loop over all positive integers until the ugly number count reaches n. If an integer is ugly, then increment the ugly number count.
To check if a number is ugly, divide the number by the greatest divisible powers of 2, 3 and 5. If the number becomes 1, then it is an ugly number,
otherwise it is not.
For example, let us see how to check whether 300 is ugly or not. The greatest divisible power of 2 is 4; after dividing 300 by 4 we get 75. The greatest
divisible power of 3 is 3; after dividing 75 by 3 we get 25. The greatest divisible power of 5 is 25; after dividing 25 by 25 we get 1. Since we get 1
finally, 300 is an ugly number.
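The check described above can be tried out with a few lines of Python. This is a small sketch; the names max_divide and is_ugly are assumptions chosen for this illustration.

```python
def max_divide(a, b):
    """Divide a by the greatest power of b that divides it."""
    while a % b == 0:
        a //= b
    return a


def is_ugly(n):
    """True if the only prime factors of n are 2, 3 and 5 (1 counts as ugly)."""
    if n <= 0:
        return False
    for p in (2, 3, 5):
        n = max_divide(n, p)
    return n == 1


print(is_ugly(300))  # prints True  (300 -> 75 -> 25 -> 1)
print(is_ugly(14))   # prints False (14 -> 7, and 7 is left over)
```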
Implementation:
# include<stdio.h>
# include<stdlib.h>
/*This function divides a by greatest divisible
power of b*/
int maxDivide(int a, int b)
{
while (a%b == 0)
a = a/b;
return a;
}
/* Function to check if a number is ugly or not */
int isUgly(int no)
{
no = maxDivide(no, 2);
no = maxDivide(no, 3);
no = maxDivide(no, 5);
return (no == 1)? 1 : 0;
}
This method is not time efficient as it checks all integers until the ugly number count becomes n, but the space complexity of this method is O(1).
METHOD 2 (Use Dynamic Programming)
Here is a time efficient solution with O(n) extra space. The ugly-number sequence is 1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, ...
Because every number can only be divided by 2, 3 or 5, one way to look at the sequence is to split it into three groups as below:
(1) 1×2, 2×2, 3×2, 4×2, 5×2, ...
(2) 1×3, 2×3, 3×3, 4×3, 5×3, ...
(3) 1×5, 2×5, 3×5, 4×5, 5×5, ...
We can see that every subsequence is the ugly sequence itself (1, 2, 3, 4, 5, ...) multiplied by 2, 3 and 5 respectively. We then use a merge step similar to merge sort
to get every ugly number from the three subsequences. At every step we choose the smallest one and move one step forward in its subsequence.
Algorithm:
1 Declare an array for ugly numbers: ugly[150]
2 Initialize first ugly no: ugly[0] = 1
3 Initialize three array index variables i2, i3, i5 to point to
  the 1st element of the ugly array:
      i2 = i3 = i5 = 0;
4 Initialize 3 choices for the next ugly no:
      next_multiple_of_2 = ugly[i2]*2;
      next_multiple_of_3 = ugly[i3]*3;
      next_multiple_of_5 = ugly[i5]*5;
5 Now go in a loop to fill all ugly numbers till 150:
For (i = 1; i < 150; i++ )
{
    /* These small steps are not optimized for good
       readability. Will optimize them in C program */
    next_ugly_no = Min(next_multiple_of_2,
                       next_multiple_of_3,
                       next_multiple_of_5);
    /* Store the new value first, so that ugly[i2], ugly[i3]
       and ugly[i5] below never read an unfilled slot */
    ugly[i] = next_ugly_no;
    if (next_ugly_no == next_multiple_of_2)
    {
        i2 = i2 + 1;
        next_multiple_of_2 = ugly[i2]*2;
    }
    if (next_ugly_no == next_multiple_of_3)
    {
        i3 = i3 + 1;
        next_multiple_of_3 = ugly[i3]*3;
    }
    if (next_ugly_no == next_multiple_of_5)
    {
        i5 = i5 + 1;
        next_multiple_of_5 = ugly[i5]*5;
    }
}/* end of for loop */
6 return next_ugly_no
Example:
Let us see how it works
initialize
ugly[] = | 1 |
i2 = i3 = i5 = 0;
First iteration
ugly[1] = Min(ugly[i2]*2, ugly[i3]*3, ugly[i5]*5)
= Min(2, 3, 5)
= 2
ugly[] = | 1 | 2 |
i2 = 1, i3 = i5 = 0 (i2 got incremented )
Second iteration
ugly[2] = Min(ugly[i2]*2, ugly[i3]*3, ugly[i5]*5)
= Min(4, 3, 5)
= 3
ugly[] = | 1 | 2 | 3 |
i2 = 1, i3 = 1, i5 = 0 (i3 got incremented )
Third iteration
ugly[3] = Min(ugly[i2]*2, ugly[i3]*3, ugly[i5]*5)
= Min(4, 6, 5)
= 4
ugly[] = | 1 | 2 | 3 | 4 |
i2 = 2, i3 = 1, i5 = 0 (i2 got incremented )
Fourth iteration
ugly[4] = Min(ugly[i2]*2, ugly[i3]*3, ugly[i5]*5)
= Min(6, 6, 5)
= 5
ugly[] = | 1 | 2 | 3 | 4 | 5 |
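The iterations traced above can be reproduced with a short Python sketch of the three-pointer merge. The name nth_ugly is an assumption, and n is assumed to be at least 1.

```python
def nth_ugly(n):
    """Return the nth ugly number (1-based), assuming n >= 1."""
    ugly = [1] * n
    i2 = i3 = i5 = 0  # positions of the next candidates in each subsequence
    for i in range(1, n):
        next_ugly = min(ugly[i2] * 2, ugly[i3] * 3, ugly[i5] * 5)
        ugly[i] = next_ugly
        # Advance every pointer that produced this value, which skips duplicates
        if next_ugly == ugly[i2] * 2:
            i2 += 1
        if next_ugly == ugly[i3] * 3:
            i3 += 1
        if next_ugly == ugly[i5] * 5:
            i5 += 1
    return ugly[n - 1]


print([nth_ugly(k) for k in range(1, 12)])
# prints [1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15]
```

Using three separate if statements (rather than else-if) matters: a value such as 6 is produced by both the 2-subsequence and the 3-subsequence, and both pointers must move past it.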
Program:
# include<stdio.h>
# include<stdlib.h>
# define bool int
/* Function to find minimum of 3 numbers */
unsigned min(unsigned , unsigned , unsigned );
/* Function to get the nth ugly number*/
unsigned getNthUglyNo(unsigned n)
{
unsigned *ugly =
(unsigned *)(malloc (sizeof(unsigned)*n));
unsigned i2 = 0, i3 = 0, i5 = 0;
unsigned i;
unsigned next_multiple_of_2 = 2;
unsigned next_multiple_of_3 = 3;
unsigned next_multiple_of_5 = 5;
unsigned next_ugly_no = 1;
*(ugly+0) = 1;
for(i=1; i<n; i++)
{
next_ugly_no = min(next_multiple_of_2,
next_multiple_of_3,
next_multiple_of_5);
*(ugly+i) = next_ugly_no;
if(next_ugly_no == next_multiple_of_2)
{
i2 = i2+1;
next_multiple_of_2 = *(ugly+i2)*2;
}
if(next_ugly_no == next_multiple_of_3)
{
i3 = i3+1;
next_multiple_of_3 = *(ugly+i3)*3;
}
if(next_ugly_no == next_multiple_of_5)
{
i5 = i5+1;
next_multiple_of_5 = *(ugly+i5)*5;
}
} /*End of for loop (i=1; i<n; i++) */
free(ugly); /* release the table; the result is already in next_ugly_no */
return next_ugly_no;
}
/* Function to find minimum of 3 numbers */
unsigned min(unsigned a, unsigned b, unsigned c)
{
if(a <= b)
{
if(a <= c)
return a;
else
return c;
}
if(b <= c)
return b;
else
return c;
}
/* Driver program to test above functions */
int main()
{
    unsigned no = getNthUglyNo(150);
    printf("%uth ugly no. is %u ", 150u, no);
    getchar();
    return 0;
}
Explanation:
The simple idea of Kadane's algorithm is to look for all positive contiguous segments of the array (max_ending_here is used for this) and keep track of the maximum-sum contiguous segment among all positive segments (max_so_far is used for this). Each time we get a positive sum, we compare it with max_so_far and update max_so_far if the new sum is greater.
Let's take the example:
{-2, -3, 4, -1, -2, 1, 5, -3}
max_so_far = max_ending_here = 0
for i=0, a[0] = -2
max_ending_here = max_ending_here + (-2)
Set max_ending_here = 0 because max_ending_here < 0
for i=1, a[1] = -3
max_ending_here = max_ending_here + (-3)
Set max_ending_here = 0 because max_ending_here < 0
for i=2, a[2] = 4
max_ending_here = max_ending_here + (4)
max_ending_here = 4
max_so_far is updated to 4 because max_ending_here greater
than max_so_far which was 0 till now
for i=3, a[3] = -1
max_ending_here = max_ending_here + (-1)
max_ending_here = 3
for i=4, a[4] = -2
max_ending_here = max_ending_here + (-2)
max_ending_here = 1
for i=5, a[5] = 1
max_ending_here = max_ending_here + (1)
max_ending_here = 2
for i=6, a[6] = 5
max_ending_here = max_ending_here + (5)
max_ending_here = 7
max_so_far is updated to 7 because max_ending_here is
greater than max_so_far
for i=7, a[7] = -3
max_ending_here = max_ending_here + (-3)
max_ending_here = 4
Program:
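C++
// C++ program to print largest contiguous array sum
#include<iostream>
using namespace std;
int maxSubArraySum(int a[], int size)
{
    int max_so_far = 0, max_ending_here = 0;
    for (int i = 0; i < size; i++)
    {
        max_ending_here = max_ending_here + a[i];
        if (max_ending_here < 0)
            max_ending_here = 0;
        if (max_so_far < max_ending_here)
            max_so_far = max_ending_here;
    }
    return max_so_far;
}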
Python
# Python program to find maximum contiguous subarray
# Function to find the maximum contiguous subarray
def maxSubArraySum(a, size):
    max_so_far = 0
    max_ending_here = 0
    for i in range(0, size):
        max_ending_here = max_ending_here + a[i]
        if max_ending_here < 0:
            max_ending_here = 0
        if max_so_far < max_ending_here:
            max_so_far = max_ending_here
    return max_so_far
# Driver function to check the above function
a = [-2, -3, 4, -1, -2, 1, 5, -3]
print("Maximum contiguous sum is", maxSubArraySum(a, len(a)))
# This code is contributed by _Devesh Agrawal_
Notes:
The algorithm doesn't work when all numbers are negative: it simply returns 0 in that case. To handle this, we can add an extra phase before the actual implementation that checks whether all numbers are negative and, if they are, returns the maximum of them (the one smallest in absolute value). There may be other ways to handle it, though.
The above program can be optimized further if we compare max_so_far with max_ending_here only when max_ending_here is greater than 0.
C++
int maxSubArraySum(int a[], int size)
{
int max_so_far = 0, max_ending_here = 0;
for (int i = 0; i < size; i++)
{
max_ending_here = max_ending_here + a[i];
if (max_ending_here < 0)
max_ending_here = 0;
/* Do not compare for all elements. Compare only
when max_ending_here > 0 */
else if (max_so_far < max_ending_here)
max_so_far = max_ending_here;
}
return max_so_far;
}
Python
def maxSubArraySum(a, size):
    max_so_far = 0
    max_ending_here = 0
    for i in range(0, size):
        max_ending_here = max_ending_here + a[i]
        if max_ending_here < 0:
            max_ending_here = 0
        # Do not compare for all elements. Compare only
        # when max_ending_here > 0
        elif max_so_far < max_ending_here:
            max_so_far = max_ending_here
    return max_so_far
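The notes above mention that an all-negative array needs special handling. One commonly used alternative, sketched here as a different variant rather than the programs shown above, is to seed both running values with the first element so that no extra phase is needed. The name max_subarray_sum is an assumption for this example, and the input is assumed to be non-empty.

```python
def max_subarray_sum(a):
    """Maximum sum of a non-empty contiguous subarray of a (a must be non-empty)."""
    best = current = a[0]
    for x in a[1:]:
        # Either extend the running segment with x or start a new one at x.
        current = max(x, current + x)
        best = max(best, current)
    return best


print(max_subarray_sum([-2, -3, 4, -1, -2, 1, 5, -3]))  # prints 7
print(max_subarray_sum([-5, -2, -8]))                   # prints -2
```

Unlike the versions above, this variant never returns 0 for an all-negative input, because the answer is always the sum of at least one element.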
if (k > maxLength)
{
start = i;
maxLength = k;
}
}
}
}
printf("Longest palindrome substring is: ");
printSubStr( str, start, start + maxLength - 1 );
return maxLength; // return length of LPS
}
// Driver program to test above functions
int main()
{
char str[] = "forgeeksskeegfor";
printf("\nLength is: %d\n", longestPalSubstr( str ) );
return 0;
}
Output:
Longest palindrome substring is: geeksskeeg
Length is: 10
Let all edges be processed in the following order: (B,E), (D,B), (B,D), (A,B), (A,C), (D,C), (B,C), (E,D). We get the following distances when all edges are processed the first time. The first row shows the initial distances. The second row shows distances when edges (B,E), (D,B), (B,D) and (A,B) are processed. The third row shows distances when (A,C) is processed. The fourth row shows distances when (D,C), (B,C) and (E,D) are processed.
The first iteration guarantees to give all shortest paths which are at most 1 edge long. We get the following distances when all edges are processed a second time (the last row shows the final values).
The second iteration guarantees to give all shortest paths which are at most 2 edges long. The algorithm processes all edges 2 more times. The distances are minimized after the second iteration, so the third and fourth iterations don't update the distances.
Implementation:
C++
// A C / C++ program for Bellman-Ford's single source
// shortest path algorithm.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>
BellmanFord(graph, 0);
return 0;
}
Java
// A Java program for Bellman-Ford's single source shortest path
// algorithm.
import java.util.*;
import java.lang.*;
import java.io.*;
// A class to represent a connected, directed and weighted graph
class Graph
{
// A class to represent a weighted edge in graph
class Edge {
int src, dest, weight;
Edge() {
src = dest = weight = 0;
}
};
int V, E;
Edge edge[];
// Creates a graph with V vertices and E edges
Graph(int v, int e)
{
V = v;
E = e;
edge = new Edge[e];
for (int i=0; i<e; ++i)
edge[i] = new Edge();
}
// The main function that finds shortest distances from src
// to all other vertices using Bellman-Ford algorithm. The
// function also detects negative weight cycle
void BellmanFord(Graph graph,int src)
{
int V = graph.V, E = graph.E;
int dist[] = new int[V];
// Step 1: Initialize distances from src to all other
// vertices as INFINITE
for (int i=0; i<V; ++i)
dist[i] = Integer.MAX_VALUE;
dist[src] = 0;
// Step 2: Relax all edges |V| - 1 times. A simple
// shortest path from src to any other vertex can
// have at-most |V| - 1 edges
for (int i=1; i<V; ++i)
{
for (int j=0; j<E; ++j)
{
int u = graph.edge[j].src;
int v = graph.edge[j].dest;
int weight = graph.edge[j].weight;
if (dist[u]!=Integer.MAX_VALUE &&
dist[u]+weight<dist[v])
dist[v]=dist[u]+weight;
}
}
// Step 3: check for negative-weight cycles. The above
// step guarantees shortest distances if graph doesn't
// contain negative weight cycle. If we get a shorter
// path, then there is a cycle.
for (int j=0; j<E; ++j)
{
int u = graph.edge[j].src;
int v = graph.edge[j].dest;
int weight = graph.edge[j].weight;
if (dist[u]!=Integer.MAX_VALUE &&
dist[u]+weight<dist[v])
System.out.println("Graph contains negative weight cycle");
}
printArr(dist, V);
}
// A utility function used to print the solution
void printArr(int dist[], int V)
{
System.out.println("Vertex Distance from Source");
for (int i=0; i<V; ++i)
System.out.println(i+"\t\t"+dist[i]);
}
// Driver method to test above function
public static void main(String[] args)
{
int V = 5; // Number of vertices in graph
int E = 8; // Number of edges in graph
Graph graph = new Graph(V, E);
// add edge 0-1 (or A-B in above figure)
graph.edge[0].src = 0;
graph.edge[0].dest = 1;
graph.edge[0].weight = -1;
// add edge 0-2 (or A-C in above figure)
graph.edge[1].src = 0;
graph.edge[1].dest = 2;
graph.edge[1].weight = 4;
// add edge 1-2 (or B-C in above figure)
graph.edge[2].src = 1;
graph.edge[2].dest = 2;
graph.edge[2].weight = 3;
// add edge 1-3 (or B-D in above figure)
graph.edge[3].src = 1;
graph.edge[3].dest = 3;
graph.edge[3].weight = 2;
// add edge 1-4 (or B-E in above figure)
graph.edge[4].src = 1;
graph.edge[4].dest = 4;
graph.edge[4].weight = 2;
// add edge 3-2 (or D-C in above figure)
graph.edge[5].src = 3;
graph.edge[5].dest = 2;
graph.edge[5].weight = 5;
// add edge 3-1 (or D-B in above figure)
graph.edge[6].src = 3;
graph.edge[6].dest = 1;
graph.edge[6].weight = 1;
// add edge 4-3 (or E-D in above figure)
graph.edge[7].src = 4;
graph.edge[7].dest = 3;
graph.edge[7].weight = -3;
graph.BellmanFord(graph, 0);
}
}
// Contributed by Aakash Hasija
Output:
Vertex Distance from Source
0        0
1        -1
2        2
3        -2
4        1
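The three steps of the Java program (initialize, relax all edges |V| - 1 times, check once more for a negative weight cycle) can also be sketched compactly in Python. The function name bellman_ford and the (u, v, weight) tuple representation of edges are assumptions made for this illustration.

```python
def bellman_ford(num_vertices, edges, src):
    """edges is a list of (u, v, weight) tuples.

    Returns the list of shortest distances from src, or None if a
    negative weight cycle is reachable from src.
    """
    INF = float("inf")
    dist = [INF] * num_vertices
    dist[src] = 0
    # Step 2: relax every edge |V| - 1 times.
    for _ in range(num_vertices - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:  # INF + w is never smaller than anything
                dist[v] = dist[u] + w
    # Step 3: one more pass; any further improvement means a negative cycle.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            return None
    return dist


# Same graph as the Java driver: vertices 0..4 stand for A..E.
edges = [(0, 1, -1), (0, 2, 4), (1, 2, 3), (1, 3, 2),
         (1, 4, 2), (3, 2, 5), (3, 1, 1), (4, 3, -3)]
print(bellman_ford(5, edges, 0))  # prints [0, -1, 2, -2, 1]
```

Using a floating-point infinity avoids the explicit "dist[u] != Integer.MAX_VALUE" guard that the Java version needs to prevent integer overflow.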
Notes
1) Negative weights are found in various applications of graphs. For example, instead of paying cost for a path, we may get some advantage if we
follow the path.
2) Bellman-Ford works better (better than Dijkstra's) for distributed systems. Unlike Dijkstra's, where we need to find the minimum value of all vertices, in Bellman-Ford, edges are considered one by one.
Exercise
1) The standard Bellman-Ford algorithm reports shortest paths only if there are no negative weight cycles. Modify it so that it reports minimum distances even if there is a negative weight cycle.
2) Can we use Dijkstra's algorithm for shortest paths for graphs with negative weights? One idea can be: calculate the minimum weight value, add a positive value (equal to the absolute value of the minimum weight value) to all weights and run Dijkstra's algorithm for the modified graph. Will this algorithm work?
References:
http://www.youtube.com/watch?v=Ttezuzs39nk
http://en.wikipedia.org/wiki/Bellman%E2%80%93Ford_algorithm
http://www.cs.arizona.edu/classes/cs445/spring07/ShortestPath2.prn.pdf
1) Optimal Substructure:
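The optimal cost for freq[i..j] can be recursively calculated using the following formula.
optCost(i, j) = sum(freq[i..j]) + min over r = i..j of [ optCost(i, r-1) + optCost(r+1, j) ]
Here r is the key chosen as the root for keys[i..j], and optCost of an empty range (j < i) is 0.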
Output:
Cost of Optimal BST is 142
Time complexity of the above naive recursive approach is exponential. It should be noted that the above function computes the same subproblems again and again. We can see many subproblems being repeated in the following recursion tree for freq[1..4].
Since the same subproblems are called again, this problem has the Overlapping Subproblems property. So the optimal BST problem has both properties of a dynamic programming problem. Like other typical Dynamic Programming (DP) problems, recomputations of the same subproblems can be avoided by constructing a temporary array cost[][] in bottom up manner.
Dynamic Programming Solution
Following is C/C++ implementation for the optimal BST problem using Dynamic Programming. We use an auxiliary array cost[n][n] to store the solutions of subproblems. cost[0][n-1] will hold the final result. The challenge in implementation is that all diagonal values must be filled first, then the values which lie on the line just above the diagonal. In other words, we must first fill all cost[i][i] values, then all cost[i][i+1] values, then all cost[i][i+2] values. So how do we fill the 2D array in such a manner? The idea used in the implementation is the same as in the Matrix Chain Multiplication problem: we use a variable L for chain length and increment L one by one. We calculate column number j using the values of i and L.
// Dynamic Programming code for Optimal Binary Search Tree Problem
#include <stdio.h>
#include <limits.h>
// A utility function to get sum of array elements freq[i] to freq[j]
int sum(int freq[], int i, int j);
/* A Dynamic Programming based function that calculates minimum cost of
a Binary Search Tree. */
int optimalSearchTree(int keys[], int freq[], int n)
{
/* Create an auxiliary 2D matrix to store results of subproblems */
int cost[n][n];
/* cost[i][j] = Optimal cost of binary search tree that can be
formed from keys[i] to keys[j].
Output:
Cost of Optimal BST is 142
Notes
1) The time complexity of the above solution is O(n^4). The time complexity can be easily reduced to O(n^3) by pre-calculating sum of
frequencies instead of calling sum() again and again.
2) In the above solutions, we have computed optimal cost only. The solutions can be easily modified to store the structure of BSTs also. We can
create another auxiliary array of size n to store the structure of tree. All we need to do is, store the chosen r in the innermost loop.
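Note 1 can be illustrated with a short Python sketch that fills cost[][] by increasing chain length and uses prefix sums in place of repeated sum() calls, which gives O(n^3) time. The function name optimal_bst_cost and the prefix array are assumptions for this example, the frequencies [34, 8, 50] are illustrative values chosen to reproduce the cost 142 reported above, and at least one key is assumed.

```python
def optimal_bst_cost(freq):
    """Minimum search cost of a BST over keys with the given frequencies (len >= 1)."""
    n = len(freq)
    # prefix[k] = freq[0] + ... + freq[k-1], so sum(freq[i..j]) is O(1).
    prefix = [0] * (n + 1)
    for i, f in enumerate(freq):
        prefix[i + 1] = prefix[i] + f

    cost = [[0] * n for _ in range(n)]
    for i in range(n):
        cost[i][i] = freq[i]          # single key: one comparison per search

    for length in range(2, n + 1):    # chain length L
        for i in range(n - length + 1):
            j = i + length - 1
            best = min(
                (cost[i][r - 1] if r > i else 0) +
                (cost[r + 1][j] if r < j else 0)
                for r in range(i, j + 1)   # r is the candidate root
            )
            cost[i][j] = best + prefix[j + 1] - prefix[i]
    return cost[0][n - 1]


print(optimal_bst_cost([34, 8, 50]))  # prints 142
```

Recording the minimizing r for each (i, j) in a second table would be one way to recover the tree structure mentioned in note 2.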
A Dynamic Programming solution solves a given problem using solutions of subproblems in bottom up manner. Can the given problem be solved using solutions to subproblems? If yes, then what are the subproblems? Can we find the largest independent set size (LISS) for a node X if we know LISS for all descendants of X? If a node is considered part of the largest independent set (LIS), then its children cannot be part of the LIS, but its grandchildren can be.
Following is the optimal substructure property.
1) Optimal Substructure:
Let LISS(X) denote the size of the largest independent set of a tree with root X.
LISS(X) = MAX { (1 + sum of LISS for all grandchildren of X),
(sum of LISS for all children of X) }
The idea is simple: there are two possibilities for every node X, either X is a member of the set or it is not. If X is a member, then the value of LISS(X) is 1 plus the sum of LISS of all grandchildren. If X is not a member, then the value is the sum of LISS of all children.
2) Overlapping Subproblems
Following is recursive implementation that simply follows the recursive structure mentioned above.
// A naive recursive implementation of Largest Independent Set problem
#include <stdio.h>
#include <stdlib.h>
// A utility function to find max of two integers
int max(int x, int y) { return (x > y)? x: y; }
/* A binary tree node has data, pointer to left child and a pointer to
right child */
struct node
{
int data;
struct node *left, *right;
};
// The function returns size of the largest independent set in a given
// binary tree
int LISS(struct node *root)
{
if (root == NULL)
return 0;
// Calculate size excluding the current node
int size_excl = LISS(root->left) + LISS(root->right);
// Calculate size including the current node
int size_incl = 1;
if (root->left)
size_incl += LISS(root->left->left) + LISS(root->left->right);
if (root->right)
size_incl += LISS(root->right->left) + LISS(root->right->right);
// Return the maximum of two sizes
return max(size_incl, size_excl);
}
// A utility function to create a node
struct node* newNode( int data )
{
struct node* temp = (struct node *) malloc( sizeof(struct node) );
temp->data = data;
temp->left = temp->right = NULL;
return temp;
}
// Driver program to test above functions
int main()
{
// Let us construct the tree given in the above diagram
struct node *root = newNode(20);
root->left = newNode(8);
root->left->left = newNode(4);
root->left->right = newNode(12);
root->left->right->left = newNode(10);
root->left->right->right = newNode(14);
root->right = newNode(22);
root->right->right = newNode(25);
printf ("Size of the Largest Independent Set is %d ", LISS(root));
return 0;
}
Output:
Size of the Largest Independent Set is 5
Time complexity of the above naive recursive approach is exponential. It should be noted that the above function computes the same subproblems again and again. For example, LISS of the node with value 50 is evaluated for the nodes with values 10 and 20, as 50 is a grandchild of 10 and a child of 20.
Since the same subproblems are called again, this problem has the Overlapping Subproblems property. So the LISS problem has both properties of a dynamic programming problem. Like other typical Dynamic Programming (DP) problems, recomputations of the same subproblems can be avoided by storing the solutions to subproblems and solving problems in bottom up manner.
Following is C implementation of Dynamic Programming based solution. In the following solution, an additional field liss is added to tree nodes.
The initial value of liss is set as 0 for all nodes. The recursive function LISS() calculates liss for a node only if it is not already set.
/* Dynamic programming based program for Largest Independent Set problem */
#include <stdio.h>
#include <stdlib.h>
// A utility function to find max of two integers
int max(int x, int y) { return (x > y)? x: y; }
/* A binary tree node has data, pointer to left child and a pointer to
right child */
struct node
{
int data;
int liss;
struct node *left, *right;
};
// A memoization function returns size of the largest independent set in
// a given binary tree
int LISS(struct node *root)
{
if (root == NULL)
return 0;
if (root->liss)
return root->liss;
if (root->left == NULL && root->right == NULL)
return (root->liss = 1);
// Calculate size excluding the current node
int liss_excl = LISS(root->left) + LISS(root->right);
// Calculate size including the current node
int liss_incl = 1;
if (root->left)
liss_incl += LISS(root->left->left) + LISS(root->left->right);
if (root->right)
liss_incl += LISS(root->right->left) + LISS(root->right->right);
// Maximum of two sizes is LISS, store it for future uses.
root->liss = max(liss_incl, liss_excl);
return root->liss;
}
// A utility function to create a node
struct node* newNode(int data)
{
struct node* temp = (struct node *) malloc( sizeof(struct node) );
temp->data = data;
temp->left = temp->right = NULL;
temp->liss = 0;
return temp;
}
// Driver program to test above functions
int main()
{
// Let us construct the tree given in the above diagram
struct node *root = newNode(20);
root->left = newNode(8);
root->left->left = newNode(4);
root->left->right = newNode(12);
root->left->right->left = newNode(10);
root->left->right->right = newNode(14);
root->right = newNode(22);
root->right->right = newNode(25);
printf ("Size of the Largest Independent Set is %d ", LISS(root));
return 0;
}
Output
Size of the Largest Independent Set is 5
Time Complexity: O(n) where n is the number of nodes in given Binary tree.
Following extensions to above solution can be tried as an exercise.
1) Extend the above solution for n-ary tree.
2) The above solution modifies the given tree structure by adding an additional field liss to tree nodes. Extend the solution so that it doesn't modify the tree structure.
3) The above solution only returns the size of the LIS; it doesn't print the elements of the LIS. Extend the solution to print all nodes that are part of the LIS.
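As a hint for extension 2, one possible approach (a sketch, not the only answer) is to keep the memo table outside the tree, keyed by node. The Node class and the liss function below are hypothetical names introduced for this illustration.

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data, self.left, self.right = data, left, right


def liss(root, memo=None):
    """Size of the largest independent set; memo maps node -> computed size."""
    if memo is None:
        memo = {}
    if root is None:
        return 0
    if root in memo:                 # nodes hash by identity by default
        return memo[root]
    # Size excluding the current node: children may be used.
    excl = liss(root.left, memo) + liss(root.right, memo)
    # Size including the current node: only grandchildren may be used.
    incl = 1
    for child in (root.left, root.right):
        if child is not None:
            incl += liss(child.left, memo) + liss(child.right, memo)
    memo[root] = max(incl, excl)
    return memo[root]


# Same tree as in the C driver program.
root = Node(20,
            Node(8, Node(4), Node(12, Node(10), Node(14))),
            Node(22, None, Node(25)))
print(liss(root))  # prints 5
```

Because the dictionary lives outside the nodes, the tree itself is left untouched and each node is still evaluated only once.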
Let isSubsetSum(int set[], int n, int sum) be the function to find whether there is a subset of set[] with sum equal to sum. n is the number of elements in set[].
The isSubsetSum problem can be divided into two subproblems:
a) Include the last element, recur for n = n-1, sum = sum - set[n-1]
b) Exclude the last element, recur for n = n-1.
If any of the above subproblems returns true, then return true.
Following is the recursive formula for the isSubsetSum() problem.
isSubsetSum(set, n, sum) = isSubsetSum(set, n-1, sum) ||
isSubsetSum(set, n-1, sum-set[n-1])
Base Cases:
isSubsetSum(set, n, sum) = false, if sum > 0 and n == 0
isSubsetSum(set, n, sum) = true, if sum == 0
Following is naive recursive implementation that simply follows the recursive structure mentioned above.
// A recursive solution for subset sum problem
#include <stdio.h>
#include <stdbool.h> // bool, true and false when compiled as C
// Returns true if there is a subset of set[] with sum equal to given sum
bool isSubsetSum(int set[], int n, int sum)
{
// Base Cases
if (sum == 0)
return true;
if (n == 0 && sum != 0)
return false;
// If last element is greater than sum, then ignore it
if (set[n-1] > sum)
return isSubsetSum(set, n-1, sum);
/* else, check if sum can be obtained by any of the following
(a) including the last element
(b) excluding the last element */
return isSubsetSum(set, n-1, sum) || isSubsetSum(set, n-1, sum-set[n-1]);
}
// Driver program to test above function
int main()
{
int set[] = {3, 34, 4, 12, 5, 2};
int sum = 9;
int n = sizeof(set)/sizeof(set[0]);
if (isSubsetSum(set, n, sum) == true)
printf("Found a subset with given sum");
else
printf("No subset with given sum");
return 0;
}
Output:
Found a subset with given sum
The above solution may try all subsets of the given set in the worst case. Therefore the time complexity of the above solution is exponential. The problem is in fact NP-Complete (there is no known polynomial time solution for this problem).
We can solve the problem in pseudo-polynomial time using dynamic programming. We create a boolean 2D table subset[][] and fill it in bottom up manner. The value of subset[i][j] will be true if there is a subset of set[0..j-1] with sum equal to i, otherwise false. Finally, we return subset[sum][n].
// A Dynamic Programming solution for subset sum problem
#include <stdio.h>
#include <stdbool.h> // bool, true and false when compiled as C
// Returns true if there is a subset of set[] with sum equal to given sum
bool isSubsetSum(int set[], int n, int sum)
{
Output:
Found a subset with given sum
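The bottom-up table described above can be sketched in Python as follows. This is an illustrative version under the stated definition of subset[i][j]; the function name is_subset_sum is an assumption, and the inputs are assumed to be non-negative integers.

```python
def is_subset_sum(nums, target):
    """True if some subset of nums (non-negative ints) sums to target."""
    n = len(nums)
    # subset[i][j] is True if a subset of nums[0..j-1] has sum i.
    subset = [[False] * (n + 1) for _ in range(target + 1)]
    for j in range(n + 1):
        subset[0][j] = True            # the empty subset has sum 0
    for i in range(1, target + 1):
        for j in range(1, n + 1):
            subset[i][j] = subset[i][j - 1]              # exclude nums[j-1]
            if i >= nums[j - 1]:                         # include nums[j-1]
                subset[i][j] = subset[i][j] or subset[i - nums[j - 1]][j - 1]
    return subset[target][n]


print(is_subset_sum([3, 34, 4, 12, 5, 2], 9))   # prints True
print(is_subset_sum([3, 34, 4, 12, 5, 2], 30))  # prints False
```

The table has (sum + 1) x (n + 1) entries, which is why the running time depends on the numeric value of sum and is only pseudo-polynomial.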
This problem is mainly an extension of Largest Sum Contiguous Subarray for 1D array.
The naive solution for this problem is to check every possible rectangle in the given 2D array. This solution requires 4 nested loops and the time complexity of this solution would be O(n^4).
Kadane's algorithm for 1D array can be used to reduce the time complexity to O(n^3). The idea is to fix the left and right columns one by one and find the maximum sum contiguous rows for every left and right column pair. We basically find top and bottom row numbers (which have maximum sum) for every fixed left and right column pair. To find the top and bottom row numbers, calculate the sum of elements in every row from left to right and store these sums in an array, say temp[]. So temp[i] indicates the sum of elements from left to right in row i. If we apply Kadane's 1D algorithm on temp[] and get the maximum sum subarray of temp, this maximum sum would be the maximum possible sum with left and right as boundary columns. To get the overall maximum sum, we compare this sum with the maximum sum so far.
// Program to find maximum sum subarray in a given 2D array
#include <stdio.h>
#include <string.h>
#include <limits.h>
#define ROW 4
#define COL 5
// Implementation of Kadane's algorithm for 1D array. The function returns the
// maximum sum and stores starting and ending indexes of the maximum sum subarray
// at addresses pointed by start and finish pointers respectively.
int kadane(int* arr, int* start, int* finish, int n)
{
// initialize sum and maxSum
int sum = 0, maxSum = INT_MIN, i;
// Just some initial value to check for all negative values case
*finish = -1;
// local variable
int local_start = 0;
for (i = 0; i < n; ++i)
{
sum += arr[i];
if (sum < 0)
{
sum = 0;
local_start = i+1;
}
else if (sum > maxSum)
{
maxSum = sum;
*start = local_start;
*finish = i;
}
}
// There is at-least one non-negative number
if (*finish != -1)
return maxSum;
// Special Case: When all numbers in arr[] are negative
maxSum = arr[0];
*start = *finish = 0;
Output:
(Top, Left) (1, 1)
(Bottom, Right) (3, 3)
Max sum is: 29
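The column-pair idea can be sketched in Python as follows. The names kadane and max_sum_rectangle and the returned tuple layout are assumptions for this illustration, and the 4 x 5 matrix is an illustrative input chosen to reproduce the output above. This kadane variant seeds its state with the first element, so it needs no separate all-negative case.

```python
def kadane(arr):
    """Return (max_sum, start, finish) for a non-empty list arr."""
    best, best_start, best_finish = arr[0], 0, 0
    current, current_start = arr[0], 0
    for i in range(1, len(arr)):
        if current < 0:
            current, current_start = arr[i], i   # start a new segment at i
        else:
            current += arr[i]                    # extend the running segment
        if current > best:
            best, best_start, best_finish = current, current_start, i
    return best, best_start, best_finish


def max_sum_rectangle(m):
    """Return (sum, top, left, bottom, right) of the max-sum rectangle."""
    rows, cols = len(m), len(m[0])
    best = None
    for left in range(cols):
        temp = [0] * rows                        # row sums for columns left..right
        for right in range(left, cols):
            for r in range(rows):
                temp[r] += m[r][right]
            s, top, bottom = kadane(temp)
            if best is None or s > best[0]:
                best = (s, top, left, bottom, right)
    return best


matrix = [[1, 2, -1, -4, -20],
          [-8, -3, 4, 2, 1],
          [3, 8, 10, 1, 3],
          [-4, -1, 1, 7, -6]]
print(max_sum_rectangle(matrix))  # prints (29, 1, 1, 3, 3)
```

Extending temp[] by one column at a time, instead of recomputing row sums for each column pair, is what keeps the total work at O(n^3).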
This problem can be solved using Dynamic Programming. Let a[i] be the number of binary strings of length i which do not contain any two
consecutive 1s and which end in 0. Similarly, let b[i] be the number of such strings which end in 1. We can append either 0 or 1 to a string ending
in 0, but we can only append 0 to a string ending in 1. This yields the recurrence relation:
a[i] = a[i - 1] + b[i - 1]
b[i] = a[i - 1]
The base cases of above recurrence are a[1] = b[1] = 1. The total number of strings of length i is just a[i] + b[i].
Following is C++ implementation of above solution. In the following implementation, indexes start from 0. So a[i] represents the number of binary
strings for input length i+1. Similarly, b[i] represents binary strings for input length i+1.
// C++ program to count all distinct binary strings
// without two consecutive 1's
#include <iostream>
using namespace std;
int countStrings(int n)
{
int a[n], b[n];
a[0] = b[0] = 1;
for (int i = 1; i < n; i++)
{
a[i] = a[i-1] + b[i-1];
b[i] = a[i-1];
}
return a[n-1] + b[n-1];
}
// Driver program to test above functions
int main()
{
cout << countStrings(3) << endl;
return 0;
}
Output:
5
Source:
courses.csail.mit.edu/6.006/oldquizzes/solutions/q2-f2009-sol.pdf
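The recurrence can be cross-checked against exhaustive enumeration for small lengths. The sketch below keeps only the two most recent values instead of full arrays; the names count_strings and brute_force are assumptions for this example, and n is assumed to be at least 1.

```python
from itertools import product


def count_strings(n):
    """Binary strings of length n (n >= 1) without two consecutive 1s."""
    a, b = 1, 1                  # length 1: one string ends in 0, one in 1
    for _ in range(n - 1):
        a, b = a + b, a          # a[i] = a[i-1] + b[i-1], b[i] = a[i-1]
    return a + b


def brute_force(n):
    """Enumerate all 2^n strings and count those without '11'."""
    return sum("11" not in "".join(bits) for bits in product("01", repeat=n))


print(count_strings(3))  # prints 5
print(all(count_strings(n) == brute_force(n) for n in range(1, 11)))  # prints True
```

Since each step needs only the previous pair (a, b), the arrays in the C++ version can be replaced by two variables without changing the result.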
Count the number of ways we can parenthesize the expression so that the value of expression evaluates to true.
Let the input be in the form of two arrays: one contains the symbols (T and F) in order and the other contains the operators (&, | and ^).
Examples:
Input: symbol[] = {T, F, T}
operator[] = {^, &}
Output: 2
The given expression is "T ^ F & T", it evaluates true
in two ways "((T ^ F) & T)" and "(T ^ (F & T))"
Input: symbol[] = {T, F, F}
operator[] = {^, |}
Output: 2
The given expression is "T ^ F | F", it evaluates true
in two ways "( (T ^ F) | F )" and "( T ^ (F | F) )".
Input: symbol[] = {T, T, F, T}
operator[] = {|, &, ^}
Output: 4
The given expression is "T | T & F ^ T", it evaluates true
in 4 ways ((T|T)&(F^T)), (T|(T&(F^T))), (((T|T)&F)^T)
and (T|((T&F)^T)).
Solution:
Let T(i, j) represent the number of ways to parenthesize the symbols between i and j (both inclusive) such that the subexpression between i and j evaluates to true.
Let F(i, j) represent the number of ways to parenthesize the symbols between i and j (both inclusive) such that the subexpression between i and j evaluates to false.
Base Cases:
T(i, i) = 1 if symbol[i] = 'T'
T(i, i) = 0 if symbol[i] = 'F'
F(i, i) = 1 if symbol[i] = 'F'
F(i, i) = 0 if symbol[i] = 'T'
If we draw the recursion tree of the above recursive solution, we can observe that it has many overlapping subproblems. Like other dynamic programming problems, it can be solved by filling a table in bottom up manner. Following is C++ implementation of the dynamic programming solution.
#include<iostream>
#include<cstring>
using namespace std;
// Returns count of all possible parenthesizations that lead to
// result true for a boolean expression with symbols like true
// and false and operators like &, | and ^ filled between symbols
Output:
4
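A table-filling solution for T(i, j) and F(i, j) can be sketched in Python as follows. The split rules used here are the standard truth-table counts for &, | and ^ (for example, a split at & is true only when both sides are true); they are stated as an assumption of this sketch, as is the function name count_true_parenthesizations.

```python
def count_true_parenthesizations(symbols, operators):
    """symbols: string of 'T'/'F'; operators: string of '&', '|', '^'."""
    n = len(symbols)
    T = [[0] * n for _ in range(n)]
    F = [[0] * n for _ in range(n)]
    for i in range(n):                      # base cases
        T[i][i] = 1 if symbols[i] == "T" else 0
        F[i][i] = 1 if symbols[i] == "F" else 0
    for gap in range(1, n):
        for i in range(n - gap):
            j = i + gap
            for k in range(i, j):           # split after symbol k, at operators[k]
                lt, lf = T[i][k], F[i][k]
                rt, rf = T[k + 1][j], F[k + 1][j]
                op = operators[k]
                if op == "&":
                    T[i][j] += lt * rt
                    F[i][j] += lt * rf + lf * rt + lf * rf
                elif op == "|":
                    T[i][j] += lt * rt + lt * rf + lf * rt
                    F[i][j] += lf * rf
                elif op == "^":
                    T[i][j] += lt * rf + lf * rt
                    F[i][j] += lt * rt + lf * rf
    return T[0][n - 1]


print(count_true_parenthesizations("TTFT", "|&^"))  # prints 4
```

Both tables are needed even though only T(0, n-1) is returned, because the | and ^ cases depend on how many ways each side can be false.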
Consider the example shown in diagram. The value of n is 3. There are 3 ways to reach the top. The diagram is taken from Easier Fibonacci
puzzles
More Examples:
Input: n = 1
Output: 1
There is only one way to climb 1 stair
Input: n = 2
Output: 2
There are two ways: (1, 1) and (2)
Input: n = 4
Output: 5
(1, 1, 1, 1), (1, 1, 2), (2, 1, 1), (1, 2, 1), (2, 2)
We can easily find the recursive nature in the above problem. The person can reach the nth stair from either the (n-1)th stair or the (n-2)th stair. Let the total number of ways to reach the nth stair be ways(n). The value of ways(n) can be written as follows.
ways(n) = ways(n-1) + ways(n-2)
The above expression is actually the expression for Fibonacci numbers, but there is one thing to notice, the value of ways(n) is equal to
fibonacci(n+1).
ways(1) = fib(2) = 1
ways(2) = fib(3) = 2
ways(3) = fib(4) = 3
So we can use a function for Fibonacci numbers to find the value of ways(n). Following is C implementation of the above idea.
// A C program to count number of ways to reach n'th stair when
// a person can climb 1 or 2 stairs at a time.
#include<stdio.h>
// A simple recursive program to find n'th fibonacci number
int fib(int n)
{
if (n <= 1)
return n;
return fib(n-1) + fib(n-2);
}
// Returns number of ways to reach s'th stair
int countWays(int s)
{
return fib(s + 1);
}
// Driver program to test above functions
int main ()
{
int s = 4;
printf("Number of ways = %d", countWays(s));
getchar();
return 0;
}
Output:
Number of ways = 5
The time complexity of the above implementation is exponential (golden ratio raised to power n). It can be optimized to work in O(Logn) time
using the previously discussed Fibonacci function optimizations.
Generalization of the above problem
How to count number of ways if the person can climb up to m stairs for a given value m? For example if m is 4, the person can climb 1 stair or 2
stairs or 3 stairs or 4 stairs at a time.
We can write the recurrence as following.
ways(n, m) = ways(n-1, m) + ways(n-2, m) + ... ways(n-m, m)
Output:
Number of ways = 5
The time complexity of above solution is exponential. It can be optimized to O(mn) by using dynamic programming. Following is dynamic
programming based solution. We build a table res[] in bottom up manner.
// A C program to count number of ways to reach n'th stair when
// a person can climb 1, 2, ..m stairs at a time
#include<stdio.h>
// A bottom-up (table-filling) helper used by countWays
int countWaysUtil(int n, int m)
{
int res[n];
res[0] = 1; res[1] = 1;
for (int i=2; i<n; i++)
{
res[i] = 0;
for (int j=1; j<=m && j<=i; j++)
res[i] += res[i-j];
}
return res[n-1];
}
Output:
Number of ways = 5
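The generalized recurrence ways(n, m) can be sketched directly in Python. Here the table is indexed by the stair number itself, with ways[0] = 1 standing for "already at the bottom"; the function name count_ways and this indexing are assumptions made for the illustration.

```python
def count_ways(n, m):
    """Number of ways to climb n stairs taking 1..m stairs at a time."""
    ways = [0] * (n + 1)
    ways[0] = 1                              # one way to stay at the ground
    for i in range(1, n + 1):
        # Last step can have length j = 1..m, as long as it does not overshoot.
        ways[i] = sum(ways[i - j] for j in range(1, min(m, i) + 1))
    return ways[n]


print(count_ways(4, 2))  # prints 5
print(count_ways(4, 4))  # prints 8
```

Each of the n table entries sums at most m earlier entries, which is where the O(mn) bound comes from.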
Two triangulations of the same convex pentagon. The triangulation on the left has a cost of 8 + 2√2 + 2√5 (approximately 15.30), the one on the right has a cost of 4 + 2√2 + 4√5 (approximately 15.77).
This problem has recursive substructure. The idea is to divide the polygon into three parts: a single triangle, the sub-polygon to the left, and the
sub-polygon to the right. We try all possible divisions like this and find the one that minimizes the cost of the triangle plus the cost of the
triangulation of the two sub-polygons.
Let Minimum Cost of triangulation of vertices from i to j be minCost(i, j)
If j < i + 2 Then
minCost(i, j) = 0
Else
minCost(i, j) = Min { minCost(i, k) + minCost(k, j) + cost(i, k, j) }
Here k varies from 'i+1' to 'j-1'
Cost of a triangle formed by edges (i, j), (j, k) and (k, i) is
cost(i, j, k) = dist(i, j) + dist(j, k) + dist(k, i)
return 0;
// Initialize result as infinite
double res = MAX;
// Find minimum triangulation by considering all
for (int k=i+1; k<j; k++)
res = min(res, (mTC(points, i, k) + mTC(points, k, j) +
cost(points, i, k, j)));
return res;
}
// Driver program to test above functions
int main()
{
Point points[] = {{0, 0}, {1, 0}, {2, 1}, {1, 2}, {0, 2}};
int n = sizeof(points)/sizeof(points[0]);
cout << mTC(points, 0, n-1);
return 0;
}
Output:
15.3006
The above problem is similar to Matrix Chain Multiplication. The following is recursion tree for mTC(points[], 0, 4).
It can be easily seen in the above recursion tree that the problem has many overlapping subproblems. Since the problem has both properties:
Optimal Substructure and Overlapping Subproblems, it can be efficiently solved using dynamic programming.
Following is C++ implementation of dynamic programming solution.
// A Dynamic Programming based program to find minimum cost of convex
// polygon triangulation
#include <iostream>
#include <cmath>
#define MAX 1000000.0
using namespace std;
// Structure of a point in 2D plane
struct Point
{
int x, y;
};
// Utility function to find minimum of two double values
double min(double x, double y)
{
return (x <= y)? x : y;
}
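// A utility function to find distance between two points in a plane
double dist(Point p1, Point p2)
{
    return sqrt((p1.x - p2.x)*(p1.x - p2.x) +
                (p1.y - p2.y)*(p1.y - p2.y));
}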
Output:
15.3006
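The minCost recurrence can also be sketched as a table-filling routine in Python. The function name min_triangulation_cost and the tuple representation of points are assumptions for this example; the pentagon is the one used by the driver programs above.

```python
from math import hypot


def min_triangulation_cost(points):
    """Minimum total perimeter of the triangles in a triangulation of a
    convex polygon whose vertices are given in order."""
    n = len(points)

    def dist(a, b):
        return hypot(points[a][0] - points[b][0], points[a][1] - points[b][1])

    def cost(i, k, j):
        return dist(i, k) + dist(k, j) + dist(j, i)

    # table[i][j] = minCost(i, j); entries with j < i + 2 stay 0.
    table = [[0.0] * n for _ in range(n)]
    for gap in range(2, n):
        for i in range(n - gap):
            j = i + gap
            table[i][j] = min(table[i][k] + table[k][j] + cost(i, k, j)
                              for k in range(i + 1, j))
    return table[0][n - 1]


pentagon = [(0, 0), (1, 0), (2, 1), (1, 2), (0, 2)]
print(round(min_triangulation_cost(pentagon), 4))  # prints 15.3006
```

Filling the table by increasing gap guarantees that table[i][k] and table[k][j] are ready before table[i][j] is computed, the same ordering used for Matrix Chain Multiplication.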
Given the mobile numeric keypad, you can only press the current button again or a button that is directly up, left, right or down of the current button. You are not allowed to press the bottom row corner buttons (i.e. * and #).
Given a number N, find out the number of possible numbers of the given length.
Examples:
For N=1, number of possible numbers would be 10 (0, 1, 2, 3, ..., 9)
For N=2, number of possible numbers would be 36
Possible numbers: 00, 08, 11, 12, 14, 22, 21, 23, 25 and so on.
If we start with 0, valid numbers will be 00, 08 (count: 2)
If we start with 1, valid numbers will be 11, 12, 14 (count: 3)
If we start with 2, valid numbers will be 22, 21, 23,25 (count: 4)
If we start with 3, valid numbers will be 33, 32, 36 (count: 3)
If we start with 4, valid numbers will be 44,41,45,47 (count: 4)
If we start with 5, valid numbers will be 55,54,52,56,58 (count: 5)
We need to print the count of possible numbers.
N = 1 is the trivial case: the number of possible numbers is 10 (0, 1, 2, 3, ..., 9).
For N > 1, we need to start from some button, then move in any of the four directions (up, left, right or down) that leads to a valid button (should not go to *, #). Keep doing this until an N length number is obtained (depth first traversal).
Recursive Solution:
Mobile Keypad is a rectangular grid of 4X3 (4 rows and 3 columns)
Let's say Count(i, j, N) represents the count of N length numbers starting from position (i, j)
If N = 1
Count(i, j, N) = 1 (only the digit at (i, j) itself; the 10 digit buttons together give 10)
Else
Count(i, j, N) = Sum of all Count(r, c, N-1) where (r, c) is new
position after valid move of length 1 from current
position (i, j), or (i, j) itself if the same button is pressed again
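The recurrence above can be sketched directly as a plain recursion. This is only an illustrative sketch, not the original listing: the names countFrom and countAll are hypothetical, the keypad is assumed to be the usual 4x3 layout with '*' and '#' in the bottom corners, and the base case counts the single digit at the starting cell.

```cpp
// Hypothetical helper: count of n-digit numbers that start at keypad cell (i, j).
long long countFrom(const char keypad[4][3], int i, int j, int n)
{
    // Cells outside the grid and the '*' / '#' buttons contribute nothing.
    if (i < 0 || i >= 4 || j < 0 || j >= 3) return 0;
    if (keypad[i][j] == '*' || keypad[i][j] == '#') return 0;
    if (n == 1) return 1;  // the single digit at (i, j)

    // Press the same button again, or move up, left, right, down.
    static const int dr[] = {0, -1, 0, 0, 1};
    static const int dc[] = {0, 0, -1, 1, 0};
    long long total = 0;
    for (int m = 0; m < 5; m++)
        total += countFrom(keypad, i + dr[m], j + dc[m], n - 1);
    return total;
}

// Hypothetical wrapper: sum the counts over every starting button.
long long countAll(const char keypad[4][3], int n)
{
    if (n <= 0) return 0;
    long long total = 0;
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 3; j++)
            total += countFrom(keypad, i, j, n);
    return total;
}
```

Calling countAll with the standard keypad and n = 2 should give 36, matching the example above. The running time grows exponentially with n, since the same (cell, length) pairs are recomputed many times.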
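printf("Count for numbers of length %d: %d\n", 1, getCount(keypad, 1));
printf("Count for numbers of length %d: %d\n", 2, getCount(keypad, 2));
printf("Count for numbers of length %d: %d\n", 3, getCount(keypad, 3));
printf("Count for numbers of length %d: %d\n", 4, getCount(keypad, 4));
printf("Count for numbers of length %d: %d\n", 5, getCount(keypad, 5));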
return 0;
}
Output:
Count for numbers of length 1: 10
Count for numbers of length 2: 36
Count for numbers of length 3: 138
Count for numbers of length 4: 532
Count for numbers of length 5: 2062
Dynamic Programming
There are many repeated traversals of smaller paths (traversals for smaller N) when finding all possible longer paths (traversals for bigger N). See the following
two diagrams for example. In this traversal, for N = 4 from two starting positions (buttons 4 and 8), we can see there are a few repeated traversals
for N = 2 (e.g. 4 -> 1, 6 -> 3, 8 -> 9, 8 -> 7 etc).
Since the problem has both properties: Optimal Substructure and Overlapping Subproblems, it can be efficiently solved using dynamic
programming.
Following is C program for dynamic programming implementation.
// A Dynamic Programming based C program to count number of
// possible numbers of given length
#include <stdio.h>
// Return count of all possible numbers of length n
// in a given numeric keyboard
int getCount(char keypad[][3], int n)
{
if(keypad == NULL || n <= 0)
return 0;
if(n == 1)
return 10;
// left, up, right, down move from current location
int row[] = {0, 0, -1, 0, 1};
int col[] = {0, -1, 0, 1, 0};
// taking n+1 for simplicity - count[i][j] will store
// number count starting with digit i and length j
int count[10][n+1];
int i=0, j=0, k=0, move=0, ro=0, co=0, num = 0;
int nextNum=0, totalCount = 0;
// count numbers starting with digit i and of lengths 0 and 1
for (i=0; i<=9; i++)
{
count[i][0] = 0;
count[i][1] = 1;
}
// Bottom up - Get number count of length 2, 3, 4, ... , n
for (k=2; k<=n; k++)
{
for (i=0; i<4; i++) // Loop on keypad row
{
for (j=0; j<3; j++) // Loop on keypad column
{
// Process for 0 to 9 digits
if (keypad[i][j] != '*' && keypad[i][j] != '#')
{
// Here we are counting the numbers starting with
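printf("Count for numbers of length %d: %d\n", 1, getCount(keypad, 1));
printf("Count for numbers of length %d: %d\n", 2, getCount(keypad, 2));
printf("Count for numbers of length %d: %d\n", 3, getCount(keypad, 3));
printf("Count for numbers of length %d: %d\n", 4, getCount(keypad, 4));
printf("Count for numbers of length %d: %d\n", 5, getCount(keypad, 5));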
return 0;
}
Output:
Count for numbers of length 1: 10
Count for numbers of length 2: 36
Count for numbers of length 3: 138
Count for numbers of length 4: 532
Count for numbers of length 5: 2062
The idea is simple: we try every digit from 0 to 9 as the next digit, subtract it from the given sum, and recur for the remaining digits with sum minus that digit. Below is the recursive formula.
countRec(n, sum) = Σ countRec(n-1, sum-x)
where 0 <= x <= 9 and
sum-x >= 0
One important observation is, leading 0's must be
handled explicitly as they are not counted as digits.
So our final count can be written as below.
finalCount(n, sum) = Σ countRec(n-1, sum-x)
where 1 <= x <= 9 and
sum-x >= 0
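A minimal sketch of the two formulas as a plain (non-memoized) recursion is given below. It assumes that countRec counts digit strings that may start with 0 and that finalCount restricts only the leading digit to 1..9; this is an illustrative reading of the formulas, not necessarily the original listing.

```cpp
// Number of length-n digit strings (leading zeros allowed) whose digits add up to sum.
unsigned long long countRec(int n, int sum)
{
    if (n == 0) return sum == 0 ? 1ULL : 0ULL;
    unsigned long long ans = 0;
    // Try every digit x that does not exceed the remaining sum.
    for (int x = 0; x <= 9 && x <= sum; x++)
        ans += countRec(n - 1, sum - x);
    return ans;
}

// Number of n-digit numbers (no leading zero) whose digits add up to sum.
unsigned long long finalCount(int n, int sum)
{
    if (n <= 0) return 0;
    unsigned long long ans = 0;
    // The leading digit runs over 1..9 only.
    for (int x = 1; x <= 9 && x <= sum; x++)
        ans += countRec(n - 1, sum - x);
    return ans;
}
```

For n = 2 and sum = 5 this counts 14, 23, 32, 41 and 50, i.e. 5 numbers.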
{
int n = 2, sum = 5;
cout << finalCount(n, sum);
return 0;
}
Output:
5
The time complexity of the above solution is exponential. If we draw the complete recursion tree, we can observe that many subproblems are solved
again and again. For example, if we start with n = 3 and sum = 10, we can reach n = 1, sum = 8 by considering digit sequences 1, 1 or 2, 0.
Since the same subproblems are called again, this problem has the Overlapping Subproblems property. So this problem has both properties
of a dynamic programming problem.
Below is a memoization based C++ implementation.
// A memoization based recursive program to count
// n digit numbers whose digits add up to given 'sum'
#include<bits/stdc++.h>
using namespace std;
// A lookup table used for memoization
unsigned long long int lookup[101][50001];
// Memoization based implementation of recursive
// function
unsigned long long int countRec(int n, int sum)
{
// Base case
if (n == 0)
return sum == 0;
// If this subproblem is already evaluated,
// return the evaluated value
if (lookup[n][sum] != -1)
return lookup[n][sum];
// Initialize answer
unsigned long long int ans = 0;
// Traverse through every digit and
// recursively count numbers beginning
// with it
for (int i=0; i<10; i++)
if (sum-i >= 0)
ans += countRec(n-1, sum-i);
return lookup[n][sum] = ans;
}
// This is mainly a wrapper over countRec. It
// explicitly handles leading digit and calls
// countRec() for remaining n.
unsigned long long int finalCount(int n, int sum)
{
// Initialize all entries of lookup table
memset(lookup, -1, sizeof lookup);
// Initialize final answer
unsigned long long int ans = 0;
// Traverse through every digit from 1 to
// 9 and count numbers beginning with it
for (int i = 1; i <= 9; i++)
if (sum-i >= 0)
ans += countRec(n-1, sum-i);
return ans;
}
// Driver program
int main()
{
int n = 3, sum = 5;
cout << finalCount(n, sum);
return 0;
}
Output:
15
One way to look at the problem is: the count of n digit non-decreasing numbers is equal to the count of such numbers ending with 9, plus the count ending
with 8, plus the count ending with 7, and so on. How do we get the count ending with a particular digit? We can recur for length n-1 and last digits
smaller than or equal to that digit. So below is the recursive formula.
Count of n digit numbers ending with digit d =
(Count of (n-1) digit numbers ending with digit d) +
(Count of (n-1) digit numbers ending with digit d-1) +
.............................................+
(Count of (n-1) digit numbers ending with digit 0)
Count of n digit numbers = sum of the above count over d = 0, 1, ..., 9
The above recursive solution is going to have many overlapping subproblems. Therefore, we can use Dynamic Programming to build a table in
bottom up manner. Below is Dynamic programming based C++ program.
// C++ program to count non-decreasing number with n digits
#include<bits/stdc++.h>
using namespace std;
long long int countNonDecreasing(int n)
{
// dp[i][j] contains total count of non decreasing
// numbers ending with digit i and of length j
long long int dp[10][n+1];
memset(dp, 0, sizeof dp);
// Fill table for non decreasing numbers of length 1
// Base cases 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
for (int i = 0; i < 10; i++)
dp[i][1] = 1;
// Fill the table in bottom-up manner
for (int digit = 0; digit <= 9; digit++)
{
// Compute total numbers of non decreasing
// numbers of length 'len'
for (int len = 2; len <= n; len++)
{
// sum of all numbers of length of len-1
// in which last digit x is <= 'digit'
for (int x = 0; x <= digit; x++)
dp[digit][len] += dp[x][len-1];
}
}
long long int count = 0;
// The total count of non-decreasing numbers of length n
// will be dp[0][n] + dp[1][n] ..+ dp[9][n]
for (int i = 0; i < 10; i++)
count += dp[i][n];
return count;
}
// Driver program
int main()
{
int n = 3;
cout << countNonDecreasing(n);
return 0;
}
Output:
220
For the general n digit case, we can apply Mathematical Induction. The count would be equal to the count of n-1 digit numbers beginning with 0, i.e.,
N*(N+1)/2*(N+2)/3*...*(N+n-1-1)/(n-1), plus the count of n-1 digit numbers beginning with 1, i.e., (N-1)*(N)/2*(N+1)/3*...*(N-1+n-1-1)/(n-1)
(note that N is replaced by N-1), and so on.
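As a sketch of where this induction leads, the code below assumes that the product pattern quoted above extends to n digits, i.e. that the count is N*(N+1)/2*(N+2)/3*...*(N+n-1)/n with N = 10 digits. The function name is hypothetical; the result for n = 3 agrees with the 220 printed by the dynamic programming version.

```cpp
// Count of non-decreasing n-digit strings over N = 10 digits, computed as
// N*(N+1)/2*(N+2)/3*...*(N+n-1)/n (assumed closed form, see lead-in).
long long countNonDecreasingClosedForm(int n)
{
    const int N = 10;
    long long count = 1;
    for (int i = 1; i <= n; i++)
    {
        count *= (N + i - 1);
        count /= i;  // exact: the running value is the binomial C(N+i-1, i)
    }
    return count;
}
```

Multiplying before dividing keeps every intermediate value an integer, so no floating point is needed.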
Find length of the longest consecutive path from a given starting character
Given a matrix of characters. Find length of the longest path from a given character, such that all characters in the path are consecutive to each
other, i.e., every character in path is next to previous in alphabetical order. It is allowed to move in all 8 directions from a cell.
Example
Input: mat[][] = { {a, c, d},
{h, b, e},
{i, g, f}}
Starting Point = 'e'
Output: 5
If starting point is 'e', then longest path with consecutive
characters is "e f g h i".
Input: mat[R][C] = { {b, e, f},
{h, d, a},
{i, c, a}};
Starting Point = 'b'
Output: 1
'c' is not present in any adjacent cell of 'b'
The idea is to first search given starting character in the given matrix. Do Depth First Search (DFS) from all occurrences to find all consecutive
paths. While doing DFS, we may encounter many subproblems again and again. So we use dynamic programming to store results of subproblems.
Below is C++ implementation of above idea.
// C++ program to find the longest consecutive path
#include<bits/stdc++.h>
#define R 3
#define C 3
using namespace std;
// tool matrices to recur for adjacent cells.
int x[] = {0, 1, 1, -1, 1, 0, -1, -1};
int y[] = {1, 0, 1, 1, -1, -1, 0, -1};
// dp[i][j] Stores length of longest consecutive path
// starting at arr[i][j].
int dp[R][C];
// check whether mat[i][j] is a valid cell or not.
bool isvalid(int i, int j)
{
if (i < 0 || j < 0 || i >= R || j >= C)
return false;
return true;
}
// Check whether current character is adjacent to previous
// character (character processed in parent call) or not.
bool isadjacent(char prev, char curr)
{
return ((curr - prev) == 1);
}
... 'a') << endl;
... 'e') << endl;
... 'b') << endl;
... 'f') << endl;
Output:
4
0
3
4
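The search-with-memoization idea described above can be sketched as follows. This is a self-contained illustration under stated assumptions, not the original listing: the names Grid, longestFrom and longestConsecutivePath are hypothetical, the matrix is passed as a vector of rows, and the returned length counts the starting character itself (so the first example above returns 5 for 'e').

```cpp
#include <algorithm>
#include <vector>

using Grid = std::vector<std::vector<char>>;

// Hypothetical helper: longest consecutive path starting at cell (i, j).
// dp[i][j] caches the answer for a cell; -1 means "not computed yet".
static int longestFrom(const Grid& mat, std::vector<std::vector<int>>& dp, int i, int j)
{
    if (dp[i][j] != -1) return dp[i][j];
    const int rows = static_cast<int>(mat.size());
    const int cols = static_cast<int>(mat[0].size());
    int best = 0;
    // Visit all 8 neighbours that hold the next character in the alphabet.
    for (int di = -1; di <= 1; di++)
        for (int dj = -1; dj <= 1; dj++)
        {
            if (di == 0 && dj == 0) continue;
            int r = i + di, c = j + dj;
            if (r < 0 || c < 0 || r >= rows || c >= cols) continue;
            if (mat[r][c] - mat[i][j] == 1)
                best = std::max(best, longestFrom(mat, dp, r, c));
        }
    return dp[i][j] = 1 + best;
}

// Longest consecutive path over all occurrences of 'start'; 0 if it is absent.
int longestConsecutivePath(const Grid& mat, char start)
{
    if (mat.empty() || mat[0].empty()) return 0;
    const int rows = static_cast<int>(mat.size());
    const int cols = static_cast<int>(mat[0].size());
    std::vector<std::vector<int>> dp(rows, std::vector<int>(cols, -1));
    int best = 0;
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < cols; j++)
            if (mat[i][j] == start)
                best = std::max(best, longestFrom(mat, dp, i, j));
    return best;
}
```

Because each step moves to a strictly larger character, the recursion cannot cycle, and each cell is solved at most once.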
This problem is a variation of the Coin Change Problem. Here, instead of finding the total number of possible solutions, we need to
find the solution with the minimum number of coins.
The minimum number of coins for a value V can be computed using the below recursive formula.
If V == 0, then 0 coins required.
If V > 0
minCoins(coins[0..m-1], V) = min {1 + minCoins(coins[0..m-1], V-coins[i])}
where i varies from 0 to m-1
and coins[i] <= V
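The recursive formula can be written down almost literally. The sketch below is illustrative: the name minCoinsRec is hypothetical, all coin values are assumed to be positive, and INT_MAX is used to signal that V cannot be formed.

```cpp
#include <climits>

// Minimum number of coins from coins[0..m-1] needed to make value V,
// or INT_MAX when V cannot be formed. Assumes every coin value is positive.
int minCoinsRec(const int coins[], int m, int V)
{
    if (V == 0) return 0;
    int res = INT_MAX;
    for (int i = 0; i < m; i++)
    {
        if (coins[i] <= V)
        {
            int sub = minCoinsRec(coins, m, V - coins[i]);
            // Skip unreachable subproblems so that sub + 1 cannot overflow.
            if (sub != INT_MAX && sub + 1 < res)
                res = sub + 1;
        }
    }
    return res;
}
```

With coins {9, 6, 5, 1} and V = 11 the result is 2 (6 + 5).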
Output:
Minimum coins required is 2
The time complexity of the above solution is exponential. If we draw the complete recursion tree, we can observe that many subproblems are solved
again and again. For example, when we start from V = 11, we can reach 6 by subtracting 1 five times and by subtracting 5 once. So the
subproblem for 6 is called twice.
Since the same subproblems are called again, this problem has the Overlapping Subproblems property. So the min coins problem has both properties
of a dynamic programming problem. Like other typical Dynamic Programming (DP) problems, recomputations of the same subproblems
can be avoided by constructing a temporary array table[] in a bottom up manner. Below is the Dynamic Programming based solution.
// A Dynamic Programming based C++ program to find minimum of coins
// to make a given change V
#include<bits/stdc++.h>
using namespace std;
// m is size of coins array (number of different coins)
int minCoins(int coins[], int m, int V)
{
// table[i] will be storing the minimum number of coins
// required for i value. So table[V] will have result
int table[V+1];
// Base case (If given value V is 0)
table[0] = 0;
// Initialize all table values as Infinite
for (int i=1; i<=V; i++)
table[i] = INT_MAX;
// Compute minimum coins required for all
// values from 1 to V
for (int i=1; i<=V; i++)
{
// Go through all coins smaller than i
for (int j=0; j<m; j++)
if (coins[j] <= i)
{
int sub_res = table[i-coins[j]];
if (sub_res != INT_MAX && sub_res + 1 < table[i])
table[i] = sub_res + 1;
}
}
return table[V];
}
// Driver program to test above function
int main()
{
int coins[] = {9, 6, 5, 1};
int m = sizeof(coins)/sizeof(coins[0]);
int V = 11;
cout << "Minimum coins required is "
<< minCoins(coins, m, V);
return 0;
}
Output:
Minimum coins required is 2
{..., 6, 8, 2},
{..., 2, 4, 3},
{..., 1, 20, 10},
{..., 1, 20, 10},
{..., 1, 20, 10},
Output: 73
Explanation :
Source: http://qa.geeksforgeeks.org/1485/running-through-the-grid-to-get-maximum-nutritional-value
Both traversals always move forward along x
Base Cases:
// If destinations reached
if (x == R-1 && y1 == 0 && y2 == C-1)
maxPoints(arr, x, y1, y2) = arr[x][y1] + arr[x][y2];
// If any of the two locations is invalid (going out of grid)
if input is not valid
maxPoints(arr, x, y1, y2) = -INF (minus infinite)
// If both traversals are at same cell, then we count the value of cell
// only once.
If y1 and y2 are same
result = arr[x][y1]
Else
result = arr[x][y1] + arr[x][y2]
result += max { // Max of 9 cases
maxPoints(arr, x+1, y1+1, y2),
maxPoints(arr, x+1, y1+1, y2+1),
maxPoints(arr, x+1, y1+1, y2-1),
maxPoints(arr, x+1, y1-1, y2),
maxPoints(arr, x+1, y1-1, y2+1),
maxPoints(arr, x+1, y1-1, y2-1),
maxPoints(arr, x+1, y1, y2),
maxPoints(arr, x+1, y1, y2+1),
maxPoints(arr, x+1, y1, y2-1)
}
The above recursive solution has many subproblems that are solved again and again. Therefore, we can use Dynamic Programming to solve the
above problem more efficiently. Below is memoization (Memoization is alternative to table based iterative solution in Dynamic Programming) based
implementation. In below implementation, we use a memoization table mem to keep track of already solved problems.
// A Memoization based program to find maximum collection
// using two traversals of a grid
#include<bits/stdc++.h>
using namespace std;
#define R 5
#define C 4
// checks whether a given input is valid or not
bool isValid(int x, int y1, int y2)
{
return (x >= 0 && x < R && y1 >=0 &&
y1 < C && y2 >=0 && y2 < C);
}
// Driver function to collect max value
int getMaxUtil(int arr[R][C], int mem[R][C][C], int x, int y1, int y2)
{
/*---------- BASE CASES -----------*/
// if P1 or P2 is at an invalid cell
if (!isValid(x, y1, y2)) return INT_MIN;
// if both traversals reach their destinations
if (x == R-1 && y1 == 0 && y2 == C-1)
return arr[x][y1] + arr[x][y2];
// If both traversals are at last row but not at their destination
if (x == R-1) return INT_MIN;
// If subproblem is already solved
if (mem[x][y1][y2] != -1) return mem[x][y1][y2];
// Initialize answer for this subproblem
int ans = INT_MIN;
// this variable is used to store gain of current cell(s)
int temp = (y1 == y2)? arr[x][y1]: arr[x][y1] + arr[x][y2];
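/* Recur for all 9 possible cases, then store and return the
one with max value */
ans = max(ans, temp + getMaxUtil(arr, mem, x+1, y1, y2-1));
ans = max(ans, temp + getMaxUtil(arr, mem, x+1, y1, y2+1));
ans = max(ans, temp + getMaxUtil(arr, mem, x+1, y1, y2));
ans = max(ans, temp + getMaxUtil(arr, mem, x+1, y1-1, y2));
ans = max(ans, temp + getMaxUtil(arr, mem, x+1, y1-1, y2-1));
ans = max(ans, temp + getMaxUtil(arr, mem, x+1, y1-1, y2+1));
ans = max(ans, temp + getMaxUtil(arr, mem, x+1, y1+1, y2));
ans = max(ans, temp + getMaxUtil(arr, mem, x+1, y1+1, y2-1));
ans = max(ans, temp + getMaxUtil(arr, mem, x+1, y1+1, y2+1));
return (mem[x][y1][y2] = ans);
}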
Output:
Maximum collection is 73
Thanks to Gaurav Ahirwar for suggesting above problem and solution here.
This problem is closely related to the longest common subsequence problem. Below are the steps.
1) Find the Longest Common Subsequence (lcs) of the two given strings. For example, lcs of "geek" and "eke" is "ek".
2) Insert non-lcs characters (in their original order in the strings) into the lcs found above, and return the result. So "ek" becomes "geeke", which is the shortest
common supersequence.
Let us consider another example, str1 = "AGGTAB" and str2 = "GXTXAYB". LCS of str1 and str2 is "GTAB". Once we find the LCS, we insert
characters of both strings in order and we get "AGXGTXAYB".
How does this work?
We need to find a string that has both strings as subsequences and is the shortest such string. If the two strings have no characters in common, then the result is
the sum of lengths of the two given strings. If there are common characters, then we don't want them multiple times, as the task is to minimize length.
Therefore, we first find the longest common subsequence, take one occurrence of this subsequence and add the extra characters.
Length of the shortest supersequence = (Sum of lengths of given two strings) - (Length of LCS of two given strings)
Below is C implementation of above idea. The below implementation only finds length of the shortest supersequence.
/* C program to find length of the shortest supersequence */
#include<stdio.h>
#include<string.h>
/* Utility function to get max of 2 integers */
int max(int a, int b) { return (a > b)? a : b; }
/* Returns length of LCS for X[0..m-1], Y[0..n-1] */
int lcs( char *X, char *Y, int m, int n);
// Function to find length of the shortest supersequence
// of X and Y.
int shortestSuperSequence(char *X, char *Y)
{
int m = strlen(X), n = strlen(Y);
int l = lcs(X, Y, m, n); // find lcs
// Result is sum of input string lengths - length of lcs
return (m + n - l);
}
/* Returns length of LCS for X[0..m-1], Y[0..n-1] */
int lcs( char *X, char *Y, int m, int n)
{
int L[m+1][n+1];
int i, j;
/* Following steps build L[m+1][n+1] in bottom up fashion.
Note that L[i][j] contains length of LCS of X[0..i-1]
and Y[0..j-1] */
for (i=0; i<=m; i++)
{
for (j=0; j<=n; j++)
{
if (i == 0 || j == 0)
L[i][j] = 0;
else if (X[i-1] == Y[j-1])
L[i][j] = L[i-1][j-1] + 1;
else
L[i][j] = max(L[i-1][j], L[i][j-1]);
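}
}
/* L[m][n] contains length of LCS for X[0..m-1] and Y[0..n-1] */
return L[m][n];
}
/* Driver program to test above function */
int main()
{
char X[] = "AGGTAB";
char Y[] = "GXTXAYB";
printf("Length of the shortest supersequence is %d\n",
shortestSuperSequence(X, Y));
return 0;
}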
Output:
Length of the shortest supersequence is 9
Time complexity of the above solution is exponential, O(2^min(m, n)). Since there are overlapping subproblems, we can efficiently solve this recursive
problem using Dynamic Programming. Below is the Dynamic Programming based implementation. Time complexity of this solution is O(mn).
/* A dynamic programming based C++ program to find length
of the shortest supersequence */
#include<bits/stdc++.h>
using namespace std;
// Returns length of the shortest supersequence of X and Y
int superSeq(char* X, char* Y, int m, int n)
{
int dp[m+1][n+1];
// Fill table in bottom up manner
for (int i = 0; i <= m; i++)
{
for (int j = 0; j <= n; j++)
{
// Below steps follow above recurrence
if (!i)
dp[i][j] = j;
else if (!j)
dp[i][j] = i;
else if (X[i-1] == Y[j-1])
dp[i][j] = 1 + dp[i-1][j-1];
else
dp[i][j] = 1 + min(dp[i-1][j], dp[i][j-1]);
}
}
return dp[m][n];
}
// Driver program to test above function
int main()
{
char X[] = "AGGTAB";
char Y[] = "GXTXAYB";
cout << "Length of the shortest supersequence is "
<< superSeq(X, Y, strlen(X), strlen(Y));
return 0;
}
Output:
Length of the shortest supersequence is 9
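The implementations above return only the length. As a sketch of the two steps described earlier (find the LCS, then insert the non-LCS characters in order), the function below builds one actual shortest supersequence by walking back through the LCS table. The name shortestSuperSeqString is hypothetical, and when several shortest supersequences exist it returns just one of them, which need not be the string quoted in the text.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Builds one shortest common supersequence of X and Y from the LCS table.
std::string shortestSuperSeqString(const std::string& X, const std::string& Y)
{
    const size_t m = X.size(), n = Y.size();
    // L[i][j] = length of the LCS of X[0..i-1] and Y[0..j-1]
    std::vector<std::vector<int>> L(m + 1, std::vector<int>(n + 1, 0));
    for (size_t i = 1; i <= m; i++)
        for (size_t j = 1; j <= n; j++)
            L[i][j] = (X[i-1] == Y[j-1]) ? L[i-1][j-1] + 1
                                         : std::max(L[i-1][j], L[i][j-1]);

    // Walk back from L[m][n], emitting characters in reverse order.
    std::string result;
    size_t i = m, j = n;
    while (i > 0 && j > 0)
    {
        if (X[i-1] == Y[j-1]) { result += X[i-1]; i--; j--; }       // shared LCS character
        else if (L[i-1][j] >= L[i][j-1]) { result += X[i-1]; i--; } // non-LCS character of X
        else { result += Y[j-1]; j--; }                              // non-LCS character of Y
    }
    while (i > 0) result += X[--i];  // leftover prefix of X
    while (j > 0) result += Y[--j];  // leftover prefix of Y
    std::reverse(result.begin(), result.end());
    return result;
}
```

For "AGGTAB" and "GXTXAYB" the returned string has length 9 and contains both inputs as subsequences.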
Naive Solution:
A naive solution is to go through every number x from 1 to n, and compute the sum of digits of x by traversing all digits of x. Below is C++ implementation of
this idea.
// A Simple C++ program to compute sum of digits in numbers from 1 to n
#include<iostream>
using namespace std;
int sumOfDigits(int );
// Returns sum of all digits in numbers from 1 to n
int sumOfDigitsFrom1ToN(int n)
{
int result = 0; // initialize result
// One by one compute sum of digits in every number from
// 1 to n
for (int x=1; x<=n; x++)
result += sumOfDigits(x);
return result;
}
// A utility function to compute sum of digits in a
// given number x
int sumOfDigits(int x)
{
int sum = 0;
while (x != 0)
{
sum += x %10;
x = x /10;
}
return sum;
}
// Driver Program
int main()
{
int n = 328;
cout << "Sum of digits in numbers from 1 to " << n << " is "
<< sumOfDigitsFrom1ToN(n);
return 0;
}
Output
Sum of digits in numbers from 1 to 328 is 3241
Efficient Solution:
Above is a naive solution. We can do it more efficiently by finding a pattern.
Let us take few examples.
sum(9) = 1 + 2 + 3 + 4 ........... + 9
= 9*10/2
= 45
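sum(99) = 45 + (10 + 45) + (20 + 45) + ..... + (90 + 45)
= 45*10 + (10 + 20 + 30 ... + 90)
= 45*10 + 10*(1 + 2 + ... + 9)
= 45*10 + 45*10
= sum(9)*10 + 45*10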
sum(999) = sum(99)*10 + 45*100
In below implementation, the above formula is implemented using dynamic programming as there are overlapping subproblems.
The above formula is one core step of the idea. Below is complete algorithm
Algorithm: sum(n)
1) Find number of digits minus one in n. Let this value be 'd'.
For 328, d is 2.
2) Compute sum of digits in numbers from 1 to 10^d - 1.
Let this sum be w. For 328, we compute sum of digits from 1 to
99 using above formula.
3) Find Most significant digit (msd) in n. For 328, msd is 3.
4) Overall sum is sum of following terms
a) Sum of digits in 1 to "msd * 10^d - 1". For 328, sum of
digits in numbers from 1 to 299.
For 328, we compute 3*sum(99) + (1 + 2)*100. Note that
sum(299) is sum(99) + sum of digits from 100 to 199 + sum of digits
from 200 to 299.
Sum of 100 to 199 is sum(99) + 1*100 and sum of 200 to 299 is sum(99) + 2*100.
In general, this sum can be computed as w*msd + (msd*(msd-1)/2)*10^d
b) Sum of digits in msd * 10^d to n. For 328, sum of digits in
300 to 328.
For 328, this sum is computed as 3*29 + recursive call "sum(28)"
In general, this sum can be computed as msd * (n % (msd*10^d) + 1)
+ sum(n % (10^d))
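The steps of the algorithm can be sketched as below. This is an illustrative version, not the original listing: the name sumOfDigitsUpTo is hypothetical, and w is computed with a short loop based on sum(10^k - 1) = 10*sum(10^(k-1) - 1) + 45*10^(k-1) rather than with a precomputed array.

```cpp
// Sum of digits of all numbers from 1 to n, following steps 1) to 4) above.
long long sumOfDigitsUpTo(long long n)
{
    if (n < 10) return n * (n + 1) / 2;

    // Step 1: d = number of digits minus one, p = 10^d.
    int d = 0;
    long long p = 1;
    while (p * 10 <= n) { p *= 10; d++; }

    // Step 2: w = sum of digits in 1 .. 10^d - 1.
    long long w = 0, q = 1;
    for (int k = 1; k <= d; k++) { w = w * 10 + 45 * q; q *= 10; }

    // Step 3: most significant digit.
    long long msd = n / p;

    // Step 4: (a) 1 .. msd*10^d - 1, plus (b) msd*10^d .. n.
    // Here n % p equals n % (msd * 10^d) because n < (msd + 1) * 10^d.
    return w * msd + (msd * (msd - 1) / 2) * p
         + msd * (n % p + 1) + sumOfDigitsUpTo(n % p);
}
```

For n = 328 the two parts are 3000 (numbers 1 to 299) and 87 + 154 (numbers 300 to 328), which gives 3241.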
Output
Sum of digits in numbers from 1 to 328 is 3241
The efficient algorithm has one more advantage that we need to compute the array a[] only once even when we are given multiple inputs.
on one side.
on other side
building.
on both sides.
N = 3
Output = 25
3 sections, which means possible ways for one side are
BSS, BSB, SSS, SBS, SSB where B represents a building
and S represents an empty space
Total possible ways are 25, because a way to place on
one side can correspond to any of 5 ways on other side.
N = 4
Output = 64
We can simplify the problem to first calculate for one side only. If we know the result for one side, we can always do square of the result and get
result for two sides.
A new building can be placed on a section only if the section just before it has a space. A space can be placed anywhere (it doesn't matter whether the
previous section has a building or not).
Let countB(i) be count of possible ways with i sections
ending with a building.
countS(i) be count of possible ways with i sections
ending with a space.
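The two placement rules translate into a pair of recurrences: a space may follow anything, so countS(i) = countS(i-1) + countB(i-1), while a building may only follow a space, so countB(i) = countS(i-1). The sketch below applies them iteratively; it assumes N >= 1 and is an illustration rather than the original listing.

```cpp
// Number of ways to place buildings on both sides of a road with N sections
// so that no two buildings on the same side are adjacent. Assumes N >= 1.
long long countWays(int N)
{
    // One section: either a building or a space.
    long long countB = 1, countS = 1;
    for (int i = 2; i <= N; i++)
    {
        long long prevB = countB, prevS = countS;
        countS = prevB + prevS;  // a space may follow a building or a space
        countB = prevS;          // a building must follow a space
    }
    // The two sides are independent, so square the one-side count.
    long long oneSide = countB + countS;
    return oneSide * oneSide;
}
```

For N = 3 the one-side count is 5, giving 25 in total; for N = 4 it is 8, giving 64.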
}
// Result for one side is sum of ways ending with building
// and ending with space
int result = countS + countB;
// Result for 2 sides is square of result for one side
return (result*result);
}
// Driver program
int main()
{
int N = 3;
cout << "Count of ways for " << N
<< " sections is " << countWays(N);
return 0;
}
Output:
25
Therefore, we can use O(LogN) implementation of Fibonacci Numbers to find number of ways in O(logN) time.
The maximum profit possible using one transaction can be calculated using the following O(n) algorithm:
Maximum difference between two elements such that larger element appears after the smaller number
Time complexity of the above simple solution is O(n^2).
We can do this in O(n) using the following Efficient Solution. The idea is to store the maximum possible profit of every subarray and solve the problem in
the following two phases.
1) Create a table profit[0..n-1] and initialize all values in it to 0.
2) Traverse price[] from right to left and update profit[i] such that profit[i] stores the maximum profit achievable from one transaction in subarray
price[i..n-1]
3) Traverse price[] from left to right and update profit[i] such that profit[i] stores the maximum achievable
profit from two transactions in subarray price[0..i].
4) Return profit[n-1]
To do step 2, we need to keep track of the maximum price from right to left, and to do step 3, we need to keep track of the minimum price from left
to right. Why do we traverse in reverse directions? The idea is to save space: in the third step, we use the same array for both purposes, maximum with 1
transaction and maximum with 2 transactions. After an iteration i, the array profit[0..i] contains the maximum profit with 2 transactions and
profit[i+1..n-1] contains the profit with one transaction.
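A compact sketch of the four steps is shown here. It is only an illustration: the name maxProfitTwoTransactions is hypothetical and std::vector is used in place of a raw array.

```cpp
#include <algorithm>
#include <vector>

// Maximum profit from at most two non-overlapping buy/sell transactions.
int maxProfitTwoTransactions(const std::vector<int>& price)
{
    const int n = static_cast<int>(price.size());
    if (n < 2) return 0;
    std::vector<int> profit(n, 0);  // step 1

    // Step 2 (right to left): profit[i] = best single transaction in price[i..n-1].
    int maxPrice = price[n-1];
    for (int i = n - 2; i >= 0; i--)
    {
        maxPrice = std::max(maxPrice, price[i]);
        profit[i] = std::max(profit[i+1], maxPrice - price[i]);
    }

    // Step 3 (left to right): profit[i] = best of up to two transactions in price[0..i],
    // combining "sell at i after buying at the minimum so far" with the
    // one-transaction value still stored in profit[i].
    int minPrice = price[0];
    for (int i = 1; i < n; i++)
    {
        minPrice = std::min(minPrice, price[i]);
        profit[i] = std::max(profit[i-1], profit[i] + (price[i] - minPrice));
    }
    return profit[n-1];  // step 4
}
```

For prices {10, 22, 5, 75, 65, 80} this gives 87: buy at 10 and sell at 22, then buy at 5 and sell at 80.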
Below are implementations of above idea.
C++
// C++ program to find maximum possible profit with at most
// two transactions
#include<iostream>
using namespace std;
// Returns maximum profit with two transactions on a given
// list of stock prices, price[0..n-1]
int maxProfit(int price[], int n)
{
// Create profit array and initialize it as 0
int *profit = new int[n];
for (int i=0; i<n; i++)
profit[i] = 0;
Python
# Returns maximum profit with two transactions on a given
# list of stock prices price[0..n-1]
def maxProfit(price, n):
    # Create profit array and initialize it as 0
    profit = [0]*n
    # Get the maximum profit with only one transaction
    # allowed. After this loop, profit[i] contains maximum
    # profit from price[i..n-1] using at most one trans.
    max_price = price[n-1]
    for i in range(n-2, -1, -1):
        if price[i] > max_price:
            max_price = price[i]
        # we can get profit[i] by taking maximum of:
        # a) previous maximum, i.e., profit[i+1]
        # b) profit by buying at price[i] and selling at
        #    max_price
        profit[i] = max(profit[i+1], max_price - price[i])
    # Get the maximum profit with two transactions allowed
    # After this loop, profit[n-1] contains the result
    min_price = price[0]
    for i in range(1, n):
        if price[i] < min_price:
            min_price = price[i]
        # Maximum of the previous result, profit[i-1], and
        # (buy at min_price, sell at price[i]) plus the best
        # single transaction in price[i..n-1]
        profit[i] = max(profit[i-1], profit[i] + (price[i] - min_price))
    return profit[n-1]
Examples:
Input: N = 3
Output: 3
We can at most get 3 A's on screen by pressing
following key sequence.
A, A, A
Input: N = 7
Output: 9
We can at most get 9 A's on screen by pressing
following key sequence.
A, A, A, Ctrl A, Ctrl C, Ctrl V, Ctrl V
Input: N = 11
Output: 27
We can at most get 27 A's on screen by pressing
following key sequence.
A, A, A, Ctrl A, Ctrl C, Ctrl V, Ctrl V, Ctrl A,
Ctrl C, Ctrl V, Ctrl V
max = curr;
}
return max;
}
// Driver program
int main()
{
int N;
// for the rest of the array we will rely on the previous
// entries to compute new ones
for (N=1; N<=20; N++)
printf("Maximum Number of A's with %d keystrokes is %d\n",
N, findoptimal(N));
}
Output:
Maximum Number of A's with 1 keystrokes is 1
Maximum Number of A's with 2 keystrokes is 2
Maximum Number of A's with 3 keystrokes is 3
Maximum Number of A's with 4 keystrokes is 4
Maximum Number of A's with 5 keystrokes is 5
Maximum Number of A's with 6 keystrokes is 6
Maximum Number of A's with 7 keystrokes is 9
Maximum Number of A's with 8 keystrokes is 12
Maximum Number of A's with 9 keystrokes is 16
Maximum Number of A's with 10 keystrokes is 20
Maximum Number of A's with 11 keystrokes is 27
Maximum Number of A's with 12 keystrokes is 36
Maximum Number of A's with 13 keystrokes is 48
Maximum Number of A's with 14 keystrokes is 64
Maximum Number of A's with 15 keystrokes is 81
Maximum Number of A's with 16 keystrokes is 108
Maximum Number of A's with 17 keystrokes is 144
Maximum Number of A's with 18 keystrokes is 192
Maximum Number of A's with 19 keystrokes is 256
Maximum Number of A's with 20 keystrokes is 324
The above function computes the same subproblems again and again. Recomputations of same subproblems can be avoided by storing the
solutions to subproblems and solving problems in bottom up manner.
Below is Dynamic Programming based C implementation where an auxiliary array screen[N] is used to store result of subproblems.
/* A Dynamic Programming based C program to find maximum number of A's
that can be printed using four keys */
#include<stdio.h>
// this function returns the optimal length string for N keystrokes
int findoptimal(int N)
{
// The optimal string length is N when N is smaller than 7
if (N <= 6)
return N;
// An array to store result of subproblems
int screen[N];
int b; // To pick a breakpoint
// Initializing the optimal lengths array for up to 6 input
// strokes.
int n;
for (n=1; n<=6; n++)
screen[n-1] = n;
// Solve all subproblems in bottom-up manner
for (n=7; n<=N; n++)
{
// Initialize length of optimal string for n keystrokes
screen[n-1] = 0;
// For any keystroke n, we need to loop from n-3 keystrokes
// back to 1 keystroke to find a breakpoint 'b' after which we
// will have ctrl-a, ctrl-c and then only ctrl-v all the way.
for (b=n-3; b>=1; b--)
{
// if the breakpoint is at b'th keystroke then
// the optimal string would have length
// (n-b-1)*screen[b-1];
int curr = (n-b-1)*screen[b-1];
if (curr > screen[n-1])
screen[n-1] = curr;
}
}
return screen[N-1];
}
// Driver program
int main()
{
int N;
// for the rest of the array we will rely on the previous
// entries to compute new ones
for (N=1; N<=20; N++)
printf("Maximum Number of A's with %d keystrokes is %d\n",
N, findoptimal(N));
}
Output:
Maximum Number of A's with 1 keystrokes is 1
Maximum Number of A's with 2 keystrokes is 2
Maximum Number of A's with 3 keystrokes is 3
Maximum Number of A's with 4 keystrokes is 4
Maximum Number of A's with 5 keystrokes is 5
Maximum Number of A's with 6 keystrokes is 6
Maximum Number of A's with 7 keystrokes is 9
Maximum Number of A's with 8 keystrokes is 12
Maximum Number of A's with 9 keystrokes is 16
Maximum Number of A's with 10 keystrokes is 20
Maximum Number of A's with 11 keystrokes is 27
Maximum Number of A's with 12 keystrokes is 36
Maximum Number of A's with 13 keystrokes is 48
Maximum Number of A's with 14 keystrokes is 64
Maximum Number of A's with 15 keystrokes is 81
Maximum Number of A's with 16 keystrokes is 108
Maximum Number of A's with 17 keystrokes is 144
Maximum Number of A's with 18 keystrokes is 192
Maximum Number of A's with 19 keystrokes is 256
Maximum Number of A's with 20 keystrokes is 324
Thanks to Gaurav Saxena for providing the above approach to solve this problem.
The minimum cost to reach N-1 from 0 can be recursively written as following:
minCost(0, N-1) = MIN { cost[0][N-1],
cost[0][1] + minCost(1, N-1),
minCost(0, 2) + minCost(2, N-1),
........,
minCost(0, N-2) + cost[N-2][N-1] }
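A direct recursive sketch of this formula is given below. It is an assumption-laden illustration rather than the original listing: the name minCostRec is hypothetical, the cost matrix is passed as a vector of rows, and only entries cost[i][j] with i < j are read.

```cpp
#include <algorithm>
#include <vector>

// Minimum cost to travel from station s to station d (s <= d).
// Only the upper triangle cost[i][j], i < j, is used.
int minCostRec(const std::vector<std::vector<int>>& cost, int s, int d)
{
    // Same station, or adjacent stations: the direct ticket is the only option.
    if (s == d || s + 1 == d) return cost[s][d];

    int best = cost[s][d];  // direct ticket from s to d
    // Try every intermediate station k and split the trip at k.
    for (int k = s + 1; k < d; k++)
        best = std::min(best, minCostRec(cost, s, k) + minCostRec(cost, k, d));
    return best;
}
```

With the 4-station cost matrix used in the program further below, minCostRec(cost, 0, 3) evaluates to 65 (station 0 to 1 for 15, then 1 to 3 for 50).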
Output:
The Minimum cost to reach station 4 is 65
Time complexity of the above implementation is exponential as it tries every possible path from 0 to N-1. The above solution solves the same
subproblems multiple times (this can be seen by drawing the recursion tree for minCostPathRec(0, 5)).
Since this problem has both properties of dynamic programming problems, like other typical Dynamic Programming (DP)
problems, re-computations of the same subproblems can be avoided by storing the solutions to subproblems and solving problems in a bottom up
manner.
One dynamic programming solution is to create a 2D table and fill the table using the above given recursive formula. The extra space required in this
solution would be O(N^2) and time complexity would be O(N^3).
We can solve this problem using O(N) extra space and O(N^2) time. The idea is based on the fact that the given input matrix is a Directed Acyclic
Graph (DAG). The shortest path in a DAG can be calculated using the approach discussed in the below post.
Shortest Path in Directed Acyclic Graph
We need to do less work here compared to above mentioned post as we know topological sorting of the graph. The topological sorting of vertices
here is 0, 1, ..., N-1. Following is the idea once topological sorting is known.
The idea in below code is to first calculate min cost for station 1, then for station 2, and so on. These costs are stored in an array dist[0...N-1].
1) The min cost for station 0 is 0, i.e., dist[0] = 0
2) The min cost for station 1 is cost[0][1], i.e., dist[1] = cost[0][1]
3) The min cost for station 2 is minimum of following two.
a) dist[0] + cost[0][2]
b) dist[1] + cost[1][2]
4) The min cost for station 3 is minimum of following three.
a) dist[0] + cost[0][3]
b) dist[1] + cost[1][3]
c) dist[2] + cost[2][3]
Similarly, dist[4], dist[5], ... dist[N-1] are calculated.
Below is C++ implementation of above idea.
// A Dynamic Programming based solution to find min cost
// to reach station N-1 from station 0.
#include<iostream>
#include<climits>
using namespace std;
#define INF INT_MAX
#define N 4
// This function returns the smallest possible cost to
// reach station N-1 from station 0.
int minCost(int cost[][N])
{
// dist[i] stores minimum cost to reach station i
// from station 0.
int dist[N];
for (int i=0; i<N; i++)
dist[i] = INF;
dist[0] = 0;
// Go through every station and check if using it
// as an intermediate station gives better path
for (int i=0; i<N; i++)
for (int j=i+1; j<N; j++)
if (dist[j] > dist[i] + cost[i][j])
dist[j] = dist[i] + cost[i][j];
return dist[N-1];
}
// Driver program to test above function
int main()
{
int cost[N][N] = { {0, 15, 80, 90},
{INF, 0, 40, 50},
{INF, INF, 0, 70},
{INF, INF, INF, 0}
};
cout << "The Minimum cost to reach station "
<< N << " is " << minCost(cost);
return 0;
}
Output:
The Minimum cost to reach station 4 is 65
The idea is to consider following two possibilities for root and recursively for all nodes down the root.
1) Root is part of vertex cover: In this case root covers all children edges. We recursively calculate size of vertex covers for left and right
subtrees and add 1 to the result (for root).
2) Root is not part of vertex cover: In this case, both children of root must be included in vertex cover to cover all root to children edges. We
recursively calculate size of vertex covers of all grandchildren and add the number of children to the result (for the children of root).
Below is C implementation of above idea.
// A naive recursive C implementation for vertex cover problem for a tree
#include <stdio.h>
#include <stdlib.h>
// A utility function to find min of two integers
int min(int x, int y) { return (x < y)? x: y; }
/* A binary tree node has data, pointer to left child and a pointer to
right child */
struct node
{
int data;
struct node *left, *right;
};
// The function returns size of the minimum vertex cover
int vCover(struct node *root)
{
// The size of minimum vertex cover is zero if tree is empty or there
// is only one node
if (root == NULL)
return 0;
if (root->left == NULL && root->right == NULL)
return 0;
// Calculate size of vertex cover when root is part of it
int size_incl = 1 + vCover(root->left) + vCover(root->right);
// Calculate size of vertex cover when root is not part of it
int size_excl = 0;
if (root->left)
size_excl += 1 + vCover(root->left->left) + vCover(root->left->right);
if (root->right)
size_excl += 1 + vCover(root->right->left) + vCover(root->right->right);
// Return the minimum of two sizes
return min(size_incl, size_excl);
}
// A utility function to create a node
struct node* newNode( int data )
{
struct node* temp = (struct node *) malloc( sizeof(struct node) );
temp->data = data;
temp->left = temp->right = NULL;
return temp;
}
// Driver program to test above functions
int main()
{
// Let us construct the tree given in the above diagram
struct node *root = newNode(20);
root->left = newNode(8);
root->left->left = newNode(4);
root->left->right = newNode(12);
root->left->right->left = newNode(10);
root->left->right->right = newNode(14);
root->right = newNode(22);
root->right->right = newNode(25);
printf ("Size of the smallest vertex cover is %d ", vCover(root));
return 0;
}
Output:
Size of the smallest vertex cover is 3
Time complexity of the above naive recursive approach is exponential. It should be noted that the above function computes the same subproblems
again and again. For example, vCover of node with value 50 is evaluated twice as 50 is grandchild of 10 and child of 20.
Since the same subproblems are called again, this problem has the Overlapping Subproblems property. So the Vertex Cover problem has both properties
(Optimal Substructure and Overlapping Subproblems) of a dynamic programming problem. Like other typical Dynamic Programming (DP) problems,
re-computations of the same subproblems can be avoided by storing the solutions to subproblems and solving problems in a bottom up manner.
Following is C implementation of Dynamic Programming based solution. In the following solution, an additional field vc is added to tree nodes. The
initial value of vc is set as 0 for all nodes. The recursive function vCover() calculates vc for a node only if it is not already set.
/* Dynamic programming based program for Vertex Cover problem for
a Binary Tree */
#include <stdio.h>
#include <stdlib.h>
// A utility function to find min of two integers
int min(int x, int y) { return (x < y)? x: y; }
/* A binary tree node has data, pointer to left child and a pointer to
right child */
struct node
{
int data;
int vc;
struct node *left, *right;
};
// A memoization based function that returns size of the minimum vertex cover.
int vCover(struct node *root)
{
// The size of minimum vertex cover is zero if tree is empty or there
// is only one node
if (root == NULL)
return 0;
if (root->left == NULL && root->right == NULL)
return 0;
// If vertex cover for this node is already evaluated, then return it
// to save recomputation of same subproblem again.
if (root->vc != 0)
return root->vc;
// Calculate size of vertex cover when root is part of it
int size_incl = 1 + vCover(root->left) + vCover(root->right);
// Calculate size of vertex cover when root is not part of it
int size_excl = 0;
if (root->left)
size_excl += 1 + vCover(root->left->left) + vCover(root->left->right);
if (root->right)
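size_excl += 1 + vCover(root->right->left) + vCover(root->right->right);
// Store the minimum of the two sizes in vc and return it
root->vc = min(size_incl, size_excl);
return root->vc;
}
// A utility function to create a node
struct node* newNode( int data )
{
struct node* temp = (struct node *) malloc( sizeof(struct node) );
temp->data = data;
temp->left = temp->right = NULL;
temp->vc = 0; // vc stays 0 until the cover for this node is computed
return temp;
}
// Driver program to test above functions
int main()
{
// Same tree as in the naive program above
struct node *root = newNode(20);
root->left = newNode(8);
root->left->left = newNode(4);
root->left->right = newNode(12);
root->left->right->left = newNode(10);
root->left->right->right = newNode(14);
root->right = newNode(22);
root->right->right = newNode(25);
printf ("Size of the smallest vertex cover is %d ", vCover(root));
return 0;
}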
Output:
Size of the smallest vertex cover is 3
References:
http://courses.csail.mit.edu/6.006/spring11/lectures/lec21.pdf
Exercise:
Extend the above solution for n-ary trees.
This problem is a variation of coin change problem and can be solved in O(n) time and O(n) auxiliary space.
The idea is to create a table of size n+1 to store counts of all scores from 0 to n. For every possible move (3, 5 and 10), increment values in table.
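As a rough illustration of the table idea, the sketch below fills the table one move value at a time. Handling the moves one after another is what makes each combination count once regardless of order. The function name `count_ways` and the `moves` parameter are hypothetical names chosen for this sketch.

```python
def count_ways(n, moves=(3, 5, 10)):
    """Count combinations of moves (order ignored) that add up to n."""
    table = [0] * (n + 1)
    table[0] = 1  # one way to reach score 0: make no move
    # Processing one move value at a time means each combination is
    # counted once, whatever the order of its moves.
    for move in moves:
        for score in range(move, n + 1):
            table[score] += table[score - move]
    return table[n]

print(count_ways(20))  # 4
print(count_ways(13))  # 2
```

Each table entry is touched once per move value, so the work is proportional to n times the number of moves, which is O(n) for the three fixed moves.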
// A C program to count the number of ways a given score
// can be reached in a game where a move can earn 3 or 5 or 10
#include <stdio.h>
#include <string.h> // for memset
// Returns number of ways to reach score n
int count(int n)
{
// table[i] will store count of solutions for
// value i.
int table[n+1], i;
// Initialize all table values as 0
memset(table, 0, sizeof(table));
// Base case (If given value is 0)
table[0] = 1;
// One by one consider given 3 moves and update the table[]
// values after the index greater than or equal to the
// value of the picked move
for (i=3; i<=n; i++)
table[i] += table[i-3];
for (i=5; i<=n; i++)
table[i] += table[i-5];
for (i=10; i<=n; i++)
table[i] += table[i-10];
return table[n];
}
// Driver program
int main(void)
{
int n = 20;
printf("Count for %d is %d\n", n, count(n));
n = 13;
printf("Count for %d is %d", n, count(n));
return 0;
}
Output:
Count for 20 is 4
Count for 13 is 2
Exercise: How to count score when (10, 5, 5), (5, 5, 10) and (5, 10, 5) are considered as different sequences of moves. Similarly, (5, 3, 3), (3,
5, 3) and (3, 3, 5) are considered different.
A simpler version of this problem, in which every job has the same profit or value, is the activity selection problem. The Greedy Strategy for
activity selection doesn't work here as the longer schedule may have smaller profit or value.
The above problem can be solved using following recursive solution.
1) First sort jobs according to finish time.
2) Now apply following recursive process.
// Here arr[] is array of n jobs
findMaximumProfit(arr[], n)
{
a) if (n == 1) return profit of arr[0];
b) Return the maximum of following two profits.
(i) Maximum profit by excluding current job, i.e.,
findMaximumProfit(arr, n-1)
(ii) Maximum profit by including the current job
}
How to find the profit including current job?
The idea is to find the latest job before the current job (in
sorted array) that doesn't conflict with current job 'arr[n-1]'.
Once we find such a job, we recur for all jobs till that job and
add profit of current job to result.
In the above example, "job 1" is the latest non-conflicting
for "job 4" and "job 2" is the latest non-conflicting for "job 3".
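// C++ program for weighted job scheduling using the naive recursive method
#include <iostream>
#include <algorithm>
using namespace std;
// A job has start time, finish time and profit.
struct Job
{
int start, finish, profit;
};
// A utility function that is used for sorting events
// according to finish time
bool myfunction(Job s1, Job s2)
{
return (s1.finish < s2.finish);
}
// Find the latest job (in sorted array) that doesn't
// conflict with the last job arr[n-1]. Returns -1 if
// there is no such job.
int latestNonConflict(Job arr[], int n)
{
for (int j=n-2; j>=0; j--)
{
if (arr[j].finish <= arr[n-1].start)
return j;
}
return -1;
}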
// A recursive function that returns the maximum possible
// profit from given array of jobs. The array of jobs must
// be sorted according to finish time.
int findMaxProfitRec(Job arr[], int n)
{
// Base case
if (n == 1) return arr[n-1].profit;
// Find profit when current job is included
int inclProf = arr[n-1].profit;
int i = latestNonConflict(arr, n);
if (i != -1)
inclProf += findMaxProfitRec(arr, i+1);
// Find profit when current job is excluded
int exclProf = findMaxProfitRec(arr, n-1);
return max(inclProf, exclProf);
}
// The main function that returns the maximum possible
// profit from given array of jobs
int findMaxProfit(Job arr[], int n)
{
// Sort jobs according to finish time
sort(arr, arr+n, myfunction);
return findMaxProfitRec(arr, n);
}
// Driver program
int main()
{
Job arr[] = {{3, 10, 20}, {1, 2, 50}, {6, 19, 100}, {2, 100, 200}};
int n = sizeof(arr)/sizeof(arr[0]);
cout << "The optimal profit is " << findMaxProfit(arr, n);
return 0;
}
Output:
The optimal profit is 250
The above solution may contain many overlapping subproblems. For example, if latestNonConflict() always returns the previous job, then
findMaxProfitRec(arr, n-1) is called twice and the time complexity becomes O(n*2^n). As another example, when latestNonConflict() returns
the job before the previous job, there are two recursive calls, for n-2 and n-1. In this case, the recursion becomes the same as that of Fibonacci Numbers.
So this problem has both properties of Dynamic Programming, Optimal Substructure and Overlapping Subproblems.
Like other Dynamic Programming Problems, we can solve this problem by making a table that stores solution of subproblems.
Below is C++ implementation based on Dynamic Programming.
// C++ program for weighted job scheduling using Dynamic Programming.
#include <iostream>
#include <algorithm>
using namespace std;
// A job has start time, finish time and profit.
struct Job
{
int start, finish, profit;
};
// A utility function that is used for sorting events
// according to finish time
bool myfunction(Job s1, Job s2)
{
return (s1.finish < s2.finish);
}
// Find the latest job (in sorted array) that doesn't
// conflict with the job[i]
int latestNonConflict(Job arr[], int i)
{
for (int j=i-1; j>=0; j--)
{
if (arr[j].finish <= arr[i].start)
return j;
}
return -1;
}
// The main function that returns the maximum possible
// profit from given array of jobs
int findMaxProfit(Job arr[], int n)
{
// Sort jobs according to finish time
sort(arr, arr+n, myfunction);
// Create an array to store solutions of subproblems. table[i]
// stores the profit for jobs till arr[i] (including arr[i])
int *table = new int[n];
table[0] = arr[0].profit;
// Fill entries in table[] using recursive property
for (int i=1; i<n; i++)
{
// Find profit including the current job
int inclProf = arr[i].profit;
int l = latestNonConflict(arr, i);
if (l != -1)
inclProf += table[l];
// Store maximum of including and excluding
table[i] = max(inclProf, table[i-1]);
}
// Store result and free dynamic memory allocated for table[]
int result = table[n-1];
delete[] table;
return result;
}
// Driver program
int main()
{
Job arr[] = {{3, 10, 20}, {1, 2, 50}, {6, 19, 100}, {2, 100, 200}};
int n = sizeof(arr)/sizeof(arr[0]);
cout << "The optimal profit is " << findMaxProfit(arr, n);
return 0;
}
Output:
The optimal profit is 250
Time Complexity of the above Dynamic Programming Solution is O(n^2). Note that the above solution can be optimized to O(nLogn) using Binary
Search in latestNonConflict() instead of linear search. Thanks to Garvit for suggesting this optimization.
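The binary search idea can be sketched as follows. This is one possible arrangement rather than the article's code: it assumes jobs are given as (start, finish, profit) tuples, uses a table indexed by "number of jobs considered" so that an empty prefix has profit 0, and relies on Python's standard `bisect_right`. The name `max_profit` is made up for the sketch.

```python
from bisect import bisect_right

def max_profit(jobs):
    """jobs: list of (start, finish, profit) tuples."""
    jobs = sorted(jobs, key=lambda job: job[1])  # sort by finish time
    finishes = [job[1] for job in jobs]
    # table[i] = best profit using only the first i jobs (table[0] = 0)
    table = [0] * (len(jobs) + 1)
    for i, (start, finish, profit) in enumerate(jobs):
        # Number of earlier jobs whose finish time is <= start.
        # Searching only finishes[0:i] keeps the current job out.
        k = bisect_right(finishes, start, 0, i)
        table[i + 1] = max(table[i], profit + table[k])
    return table[-1]

print(max_profit([(3, 10, 20), (1, 2, 50), (6, 19, 100), (2, 100, 200)]))  # 250
```

Sorting costs O(nLogn) and each of the n lookups costs O(Logn), which gives the O(nLogn) bound mentioned above.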
References:
http://courses.cs.washington.edu/courses/cse521/13wi/slides/06dp-sched.pdf
Longest Even Length Substring such that Sum of First and Second Half is same
Given a string str of digits, find length of the longest substring of str, such that the length of the substring is 2k digits and sum of left k digits is equal
to the sum of right k digits.
Examples:
Input: str = "123123"
Output: 6
The complete string is of even length and sum of first and second
half digits is same
Input: str = "1538023"
Output: 4
The longest substring with same first and second half sum is "5380"
Output:
Length of the substring is 4
#include <stdio.h>
#include <string.h>
int findLength(char *str)
{
int n = strlen(str);
int maxlen = 0; // Initialize result
// A 2D table where sum[i][j] stores sum of digits
// from str[i] to str[j]. Only filled entries are
// the entries where j >= i
int sum[n][n];
// Fill the diagonal values for substrings of length 1
for (int i =0; i<n; i++)
sum[i][i] = str[i]-'0';
// Fill entries for substrings of length 2 to n
for (int len=2; len<=n; len++)
{
// Pick i and j for current substring
for (int i=0; i<n-len+1; i++)
{
int j = i+len-1;
int k = len/2;
// Calculate value of sum[i][j]
sum[i][j] = sum[i][j-k] + sum[j-k+1][j];
// Update result if 'len' is even, left and right
// sums are same and len is more than maxlen
if (len%2 == 0 && sum[i][j-k] == sum[(j-k+1)][j]
&& len > maxlen)
maxlen = len;
}
}
return maxlen;
}
// Driver program to test above function
int main(void)
{
char str[] = "153803";
printf("Length of the substring is %d", findLength(str));
return 0;
}
Output:
Length of the substring is 4
Time complexity of the above solution is O(n^2), but it requires O(n^2) extra space.
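One way to cut the extra space, offered here only as a sketch and not necessarily the approach the original article takes next, is to replace the 2D table with a 1D array of prefix sums. The sum of any run of digits is then a difference of two prefix sums. The name `find_length` is hypothetical.

```python
def find_length(digits):
    n = len(digits)
    # prefix[i] = sum of the first i digits
    prefix = [0] * (n + 1)
    for i, ch in enumerate(digits):
        prefix[i + 1] = prefix[i] + int(ch)
    # Try even lengths from longest to shortest; the first hit is the answer.
    start_len = n if n % 2 == 0 else n - 1
    for length in range(start_len, 0, -2):
        half = length // 2
        for i in range(n - length + 1):
            left = prefix[i + half] - prefix[i]
            right = prefix[i + length] - prefix[i + half]
            if left == right:
                return length
    return 0

print(find_length("123123"))   # 6
print(find_length("1538023"))  # 4
```

The running time stays O(n^2) in the worst case, while the extra space drops to O(n).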
}
}
return ans;
}
// Driver program to test above function
int main()
{
string str = "123123";
cout << "Length of the substring is " << findLength(str, str.length());
return 0;
}
Output:
Length of the substring is 6
Two triangulations of the same convex pentagon. The triangulation on the left has a cost of 8 + 2√2 + 2√5 (approximately 15.30), the
one on the right has a cost of 4 + 2√2 + 4√5 (approximately 15.77).
This problem has recursive substructure. The idea is to divide the polygon into three parts: a single triangle, the sub-polygon to the left, and the
sub-polygon to the right. We try all possible divisions like this and find the one that minimizes the cost of the triangle plus the cost of the
triangulation of the two sub-polygons.
Let Minimum Cost of triangulation of vertices from i to j be minCost(i, j)
If j < i + 2 Then
minCost(i, j) = 0
Else
minCost(i, j) = Min { minCost(i, k) + minCost(k, j) + cost(i, k, j) }
Here k varies from 'i+1' to 'j-1'
Cost of a triangle formed by edges (i, j), (j, k) and (k, i) is
cost(i, j, k) = dist(i, j) + dist(j, k) + dist(k, i)
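The recurrence can be sketched in a few lines with memoization. This is an illustrative version, not the article's program: it assumes points are (x, y) tuples listed in order around the convex polygon, treats a range with fewer than three vertices as cost 0 (the base case that reproduces the 15.3006 figure for the pentagon used in this article), and uses `math.dist`, which needs Python 3.8 or later. The function names are hypothetical.

```python
from functools import lru_cache
from math import dist  # Python 3.8+

def min_triangulation_cost(points):
    points = tuple(points)

    def cost(i, j, k):
        # Perimeter of the triangle with vertices i, j, k
        return (dist(points[i], points[j]) + dist(points[j], points[k])
                + dist(points[k], points[i]))

    @lru_cache(maxsize=None)
    def min_cost(i, j):
        if j < i + 2:  # fewer than three vertices: nothing to triangulate
            return 0.0
        return min(min_cost(i, k) + min_cost(k, j) + cost(i, k, j)
                   for k in range(i + 1, j))

    return min_cost(0, len(points) - 1)

pentagon = [(0, 0), (1, 0), (2, 1), (1, 2), (0, 2)]
print(round(min_triangulation_cost(pentagon), 4))  # 15.3006
```

With memoization there are O(n^2) distinct (i, j) pairs and each tries O(n) values of k, so the sketch runs in O(n^3) time.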
return 0;
// Initialize result as infinite
double res = MAX;
// Find minimum triangulation by considering all
for (int k=i+1; k<j; k++)
res = min(res, (mTC(points, i, k) + mTC(points, k, j) +
cost(points, i, k, j)));
return res;
}
// Driver program to test above functions
int main()
{
Point points[] = {{0, 0}, {1, 0}, {2, 1}, {1, 2}, {0, 2}};
int n = sizeof(points)/sizeof(points[0]);
cout << mTC(points, 0, n-1);
return 0;
}
Output:
15.3006
The above problem is similar to Matrix Chain Multiplication. The following is recursion tree for mTC(points[], 0, 4).
It can be easily seen in the above recursion tree that the problem has many overlapping subproblems. Since the problem has both properties:
Optimal Substructure and Overlapping Subproblems, it can be efficiently solved using dynamic programming.
Following is C++ implementation of dynamic programming solution.
// A Dynamic Programming based program to find minimum cost of convex
// polygon triangulation
#include <iostream>
#include <cmath>
#define MAX 1000000.0
using namespace std;
// Structure of a point in 2D plane
struct Point
{
int x, y;
};
// Utility function to find minimum of two double values
double min(double x, double y)
{
return (x <= y)? x : y;
}
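// A utility function to find distance between two points in a plane
double dist(Point p1, Point p2)
{
return sqrt((p1.x - p2.x)*(p1.x - p2.x) +
(p1.y - p2.y)*(p1.y - p2.y));
}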
Output:
15.3006
Output:
Pattern found at index 10
2) Input:
txt[] = "AABAACAADAABAAABAA"
pat[] = "AABA"
Output:
Pattern found at index 0
Pattern found at index 9
Pattern found at index 13
Pattern searching is an important problem in computer science. When we do search for a string in notepad/word file or browser or database,
pattern searching algorithms are used to show the search results.
Naive Pattern Searching:
Slide the pattern over the text one position at a time and check for a match. If a match is found, slide by 1 again to check for subsequent matches.
C
// C program for Naive Pattern Searching algorithm
#include<stdio.h>
#include<string.h>
void search(char *pat, char *txt)
{
int M = strlen(pat);
int N = strlen(txt);
/* A loop to slide pat[] one by one */
for (int i = 0; i <= N - M; i++)
{
int j;
/* For current index i, check for pattern match */
for (j = 0; j < M; j++)
if (txt[i+j] != pat[j])
break;
if (j == M) // if pat[0...M-1] = txt[i, i+1, ...i+M-1]
printf("Pattern found at index %d \n", i);
}
}
/* Driver program to test above function */
int main()
{
char txt[] = "AABAACAADAABAAABAA";
char pat[] = "AABA";
search(pat, txt);
return 0;
}
Python
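# Python program for Naive Pattern Searching
def search(pat, txt):
    M = len(pat)
    N = len(txt)
    # A loop to slide pat[] one by one
    for i in range(N-M+1):
        # For current index i, check for pattern match
        if txt[i:i+M] == pat:
            print("Pattern found at index " + str(i))

# Driver program to test above function
search("AABA", "AABAACAADAABAAABAA")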
2) Worst case also occurs when only the last character is different.
txt[] = "AAAAAAAAAAAAAAAAAB"
pat[] = "AAAAB"
Number of comparisons in worst case is O(m*(n-m+1)). Although strings which have repeated characters are not likely to appear in English text,
they may well occur in other applications (for example, in binary texts). The KMP matching algorithm improves the worst case to O(n). We will be
covering KMP in the next post. Also, we will be writing more posts to cover all pattern searching algorithms and data structures.
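To make the worst-case count concrete, the following sketch instruments the naive search with a comparison counter and runs it on the example above. The helper name `count_comparisons` is hypothetical and exists only for this illustration.

```python
def count_comparisons(pat, txt):
    """Run the naive search; return (match positions, character comparisons)."""
    m, n = len(pat), len(txt)
    matches, comparisons = [], 0
    for i in range(n - m + 1):
        j = 0
        while j < m:
            comparisons += 1
            if txt[i + j] != pat[j]:
                break
            j += 1
        if j == m:
            matches.append(i)
    return matches, comparisons

txt = "AAAAAAAAAAAAAAAAAB"
pat = "AAAAB"
matches, comparisons = count_comparisons(pat, txt)
# Every window needs all m comparisons, so the count equals m*(n-m+1).
print(matches, comparisons, len(pat) * (len(txt) - len(pat) + 1))
```

For this input every window matches on its first four characters and is only decided on the fifth, which is why the count reaches the m*(n-m+1) bound exactly.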
We have discussed Naive pattern searching algorithm in the previous post. The worst case complexity of Naive algorithm is O(m(n-m+1)). Time
complexity of KMP algorithm is O(n) in worst case.
KMP (Knuth Morris Pratt) Pattern Searching
The Naive pattern searching algorithm doesn't work well in cases where we see many matching characters followed by a mismatching character.
Following are some examples.
txt[] = "AAAAAAAAAAAAAAAAAB"
pat[] = "AAAAB"
txt[] = "ABABABCABABABCABABABC"
pat[] = "ABABAC" (not a worst case, but a bad case for Naive)
The KMP matching algorithm uses the degenerating property (a pattern having the same sub-patterns appearing more than once) of the
pattern and improves the worst case complexity to O(n). The basic idea behind KMP's algorithm is: whenever we detect a mismatch (after some
matches), we already know some of the characters in the text (since they matched the pattern characters prior to the mismatch). We take
advantage of this information to avoid matching the characters that we know will anyway match.
KMP algorithm does some preprocessing over the pattern pat[] and constructs an auxiliary array lps[] of size m (same as size of pattern). Here
the name lps indicates longest proper prefix which is also suffix. For each sub-pattern pat[0..i] where i = 0 to m-1, lps[i] stores the length of the
maximum matching proper prefix which is also a suffix of the sub-pattern pat[0..i].
lps[i] = the length of the longest proper prefix of pat[0..i]
which is also a suffix of pat[0..i].
Examples:
For the pattern AABAACAABAA, lps[] is [0, 1, 0, 1, 2, 0, 1, 2, 3, 4, 5]
For the pattern ABCDE, lps[] is [0, 0, 0, 0, 0]
For the pattern AAAAA, lps[] is [0, 1, 2, 3, 4]
For the pattern AAABAAA, lps[] is [0, 1, 2, 0, 1, 2, 3]
For the pattern AAACAAAAAC, lps[] is [0, 1, 2, 0, 1, 2, 3, 3, 3, 4]
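The lps[] values above can be checked straight from the definition. The sketch below is a deliberately slow brute-force check, not the preprocessing algorithm KMP actually uses; the name `lps_by_definition` is hypothetical.

```python
def lps_by_definition(pat):
    """lps[i] = length of the longest proper prefix of pat[0..i]
    that is also a suffix of pat[0..i]."""
    lps = []
    for i in range(len(pat)):
        sub = pat[:i + 1]
        best = 0
        # A proper prefix is strictly shorter than the string itself.
        for k in range(1, len(sub)):
            if sub[:k] == sub[-k:]:
                best = k
        lps.append(best)
    return lps

print(lps_by_definition("AAABAAA"))     # [0, 1, 2, 0, 1, 2, 3]
print(lps_by_definition("AAACAAAAAC"))  # [0, 1, 2, 0, 1, 2, 3, 3, 3, 4]
```

This takes cubic time in the pattern length, so it is useful only for checking small examples by hand.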
Searching Algorithm:
Unlike the Naive algo where we slide the pattern by one, we use a value from lps[] to decide the next sliding position. Let us see how we do that.
When we compare pat[j] with txt[i] and see a mismatch, we know that characters pat[0..j-1] match with txt[i-j..i-1], and we also know that
lps[j-1] characters of pat[0..j-1] are both proper prefix and suffix, which means we do not need to match these lps[j-1] characters against txt[i-j..i-1] again
because we know that these characters will anyway match. See KMPSearch() in the below code for details.
Preprocessing Algorithm:
In the preprocessing part, we calculate values in lps[]. To do that, we keep track of the length of the longest prefix suffix value (we use len variable
for this purpose) for the previous index. We initialize lps[0] and len as 0. If pat[len] and pat[i] match, we increment len by 1 and assign the
incremented value to lps[i]. If pat[i] and pat[len] do not match and len is not 0, we update len to lps[len-1]. See computeLPSArray () in the below
code for details.
C
// C program for implementation of KMP pattern searching
// algorithm
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
void computeLPSArray(char *pat, int M, int *lps);
void KMPSearch(char *pat, char *txt)
{
int M = strlen(pat);
int N = strlen(txt);
// create lps[] that will hold the longest prefix suffix
// values for pattern
int *lps = (int *)malloc(sizeof(int)*M);
int j = 0; // index for pat[]
// Preprocess the pattern (calculate lps[] array)
computeLPSArray(pat, M, lps);
int i = 0; // index for txt[]
while (i < N)
{
if (pat[j] == txt[i])
{
j++;
i++;
}
if (j == M)
{
printf("Found pattern at index %d \n", i-j);
j = lps[j-1];
}
// mismatch after j matches
else if (i < N && pat[j] != txt[i])
{
// Do not match lps[0..lps[j-1]] characters,
// they will match anyway
if (j != 0)
j = lps[j-1];
else
i = i+1;
}
}
free(lps); // to avoid memory leak
}
void computeLPSArray(char *pat, int M, int *lps)
{
int len = 0; // length of the previous longest prefix suffix
int i;
lps[0] = 0; // lps[0] is always 0
i = 1;
// the loop calculates lps[i] for i = 1 to M-1
while (i < M)
{
if (pat[i] == pat[len])
{
len++;
lps[i] = len;
i++;
}
else // (pat[i] != pat[len])
{
if (len != 0)
{
// This is tricky. Consider the example
// AAACAAAA and i = 7.
len = lps[len-1];
// Also, note that we do not increment i here
}
else // if (len == 0)
{
lps[i] = 0;
i++;
}
}
}
}
// Driver program to test above function
int main()
{
char *txt = "ABABDABACDABABCABAB";
char *pat = "ABABCABAB";
KMPSearch(pat, txt);
return 0;
}
Python
# Python program for KMP Algorithm
def KMPSearch(pat, txt):
    M = len(pat)
    N = len(txt)
    # create lps[] that will hold the longest prefix suffix
    # values for pattern
    lps = [0]*M
    j = 0 # index for pat[]
    # Preprocess the pattern (calculate lps[] array)
    computeLPSArray(pat, M, lps)
    i = 0 # index for txt[]
    while i < N:
        if pat[j] == txt[i]:
            i += 1
            j += 1
        if j == M:
            print("Found pattern at index " + str(i-j))
            j = lps[j-1]
        # mismatch after j matches
        elif i < N and pat[j] != txt[i]:
            # Do not match lps[0..lps[j-1]] characters,
            # they will match anyway
            if j != 0:
                j = lps[j-1]
            else:
                i += 1

def computeLPSArray(pat, M, lps):
    len = 0 # length of the previous longest prefix suffix
    lps[0] = 0 # lps[0] is always 0
    i = 1
    # the loop calculates lps[i] for i = 1 to M-1
    while i < M:
        if pat[i] == pat[len]:
            len += 1
            lps[i] = len
            i += 1
        else:
            if len != 0:
                # This is tricky. Consider the example AAACAAAA
                # and i = 7
                len = lps[len-1]
                # Also, note that we do not increment i here
            else:
                lps[i] = 0
                i += 1

txt = "ABABDABACDABABCABAB"
pat = "ABABCABAB"
KMPSearch(pat, txt)
# This code is contributed by Bhavya Jain
The Naive String Matching algorithm slides the pattern one by one. After each slide, it one by one checks characters at the current shift and if all
characters match then prints the match.
Like the Naive Algorithm, Rabin-Karp algorithm also slides the pattern one by one. But unlike the Naive algorithm, Rabin Karp algorithm matches
the hash value of the pattern with the hash value of current substring of text, and if the hash values match then only it starts matching individual
characters. So Rabin Karp algorithm needs to calculate hash values for following strings.
1) Pattern itself.
2) All the substrings of text of length m.
Since we need to efficiently calculate hash values for all the substrings of size m of text, we must have a hash function which has following property.
Hash at the next shift must be efficiently computable from the current hash value and next character in text or we can say hash(txt[s+1 .. s+m])
must be efficiently computable from hash(txt[s .. s+m-1]) and txt[s+m], i.e., hash(txt[s+1 .. s+m]) = rehash(txt[s+m], hash(txt[s .. s+m-1])), and rehash must be an O(1) operation.
The hash function suggested by Rabin and Karp calculates an integer value. The integer value for a string is numeric value of a string. For example,
if all possible characters are from 1 to 10, the numeric value of 122 will be 122. The number of possible characters is higher than 10 (256 in
general) and pattern length can be large. So the numeric values cannot be practically stored as an integer. Therefore, the numeric value is calculated
using modular arithmetic to make sure that the hash values can be stored in an integer variable (can fit in memory words). To do rehashing, we
need to take off the most significant digit and add the new least significant digit in the hash value. Rehashing is done using the following formula.
hash( txt[s+1 .. s+m] ) = ( d ( hash( txt[s .. s+m-1] ) - txt[s]*h ) + txt[s + m] ) mod q
hash( txt[s .. s+m-1] ) : Hash value at shift s.
hash( txt[s+1 .. s+m] ) : Hash value at next shift (or shift s+1)
d: Number of characters in the alphabet
q: A prime number
h: d^(m-1)
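The rehash formula can be checked numerically. In the sketch below, `direct_hash` and `rehash` are hypothetical helper names; d = 256 and q = 101 are simply sample values, and the loop confirms that rolling the hash forward gives the same value as hashing each window from scratch.

```python
d = 256   # alphabet size
q = 101   # a prime modulus

def direct_hash(s):
    """Hash of s computed from scratch: the base-d value of s, mod q."""
    value = 0
    for ch in s:
        value = (d * value + ord(ch)) % q
    return value

def rehash(old_hash, outgoing, incoming, h):
    """Hash of the next window: drop `outgoing`, append `incoming`."""
    # Python's % returns a non-negative result for a positive modulus,
    # so no separate "add q if negative" step is needed here.
    return (d * (old_hash - ord(outgoing) * h) + ord(incoming)) % q

txt = "GEEKS FOR GEEKS"
m = 4
h = pow(d, m - 1, q)          # d^(m-1) mod q
t = direct_hash(txt[:m])
for s in range(len(txt) - m):
    t = rehash(t, txt[s], txt[s + m], h)
    assert t == direct_hash(txt[s + 1:s + 1 + m])
print("rolling hash agrees with direct hash for every window")
```

In C, the % operator can yield a negative value for a negative left operand, which is why a C implementation needs an explicit correction after this step.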
C/C++
/* Following program is a C implementation of Rabin Karp
Algorithm given in the CLRS book */
#include<stdio.h>
#include<string.h>
// d is the number of characters in input alphabet
#define d 256
/* pat -> pattern
txt -> text
q -> A prime number
*/
void search(char pat[], char txt[], int q)
{
int M = strlen(pat);
int N = strlen(txt);
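int i, j;
int p = 0; // hash value for pattern
int t = 0; // hash value for txt
int h = 1;
// The value of h would be "pow(d, M-1)%q"
for (i = 0; i < M-1; i++)
h = (h*d)%q;
// Calculate the hash value of pattern and first window of text
for (i = 0; i < M; i++)
{
p = (d*p + pat[i])%q;
t = (d*t + txt[i])%q;
}
// Slide the pattern over text one by one
for (i = 0; i <= N - M; i++)
{
// Check the hash values of current window of text and pattern.
// Only if the hash values match, check the characters one by one
if ( p == t )
{
/* Check for characters one by one */
for (j = 0; j < M; j++)
if (txt[i+j] != pat[j])
break;
// if p == t and pat[0...M-1] = txt[i, i+1, ...i+M-1]
if (j == M)
printf("Pattern found at index %d \n", i);
}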
// Calculate hash value for next window of text: Remove
// leading digit, add trailing digit
if ( i < N-M )
{
t = (d*(t - txt[i]*h) + txt[i+M])%q;
// We might get negative value of t, converting it
// to positive
if (t < 0)
t = (t + q);
}
}
}
/* Driver program to test above function */
int main()
{
char txt[] = "GEEKS FOR GEEKS";
char pat[] = "GEEK";
int q = 101; // A prime number
search(pat, txt, q);
return 0;
}
Python
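# Following program is the python implementation of
# Rabin Karp Algorithm given in CLRS book
# d is the number of characters in input alphabet
d = 256
# pat -> pattern
# txt -> text
# q   -> A prime number
def search(pat, txt, q):
    M = len(pat)
    N = len(txt)
    p = 0 # hash value for pattern
    t = 0 # hash value for txt
    h = 1
    # The value of h would be "pow(d, M-1)%q"
    for i in range(M-1):
        h = (h*d)%q
    # Calculate the hash value of pattern and first window of text
    for i in range(M):
        p = (d*p + ord(pat[i]))%q
        t = (d*t + ord(txt[i]))%q
    # Slide the pattern over text one by one
    for i in range(N-M+1):
        # Compare characters only when the hash values match
        if p == t and txt[i:i+M] == pat:
            print("Pattern found at index " + str(i))
        # Calculate hash value for next window of text: Remove
        # leading digit, add trailing digit
        if i < N-M:
            t = (d*(t - ord(txt[i])*h) + ord(txt[i+M]))%q

# Driver program to test above function
search("GEEK", "GEEKS FOR GEEKS", 101)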
The average and best case running time of the Rabin-Karp algorithm is O(n+m), but its worst-case time is O(nm). The worst case of the Rabin-Karp
algorithm occurs when all characters of the pattern and text are the same, since the hash values of all the substrings of txt[] then match the hash value of pat[]. For
example, pat[] = "AAA" and txt[] = "AAAAAAA".
References:
http://net.pku.edu.cn/~course/cs101/2007/resource/Intro2Algorithm/book6/chap34.htm
http://www.cs.princeton.edu/courses/archive/fall04/cos226/lectures/string.4up.pdf
http://en.wikipedia.org/wiki/Rabin-Karp_string_search_algorithm
C
/* C program for A modified Naive Pattern Searching
algorithm that is optimized for the cases when all
characters of pattern are different */
#include<stdio.h>
#include<string.h>
/* A modified Naive Pattern Searching algorithm that is optimized
for the cases when all characters of pattern are different */
void search(char pat[], char txt[])
{
int M = strlen(pat);
int N = strlen(txt);
int i = 0;
while (i <= N - M)
{
int j;
/* For current index i, check for pattern match */
for (j = 0; j < M; j++)
if (txt[i+j] != pat[j])
break;
if (j == M) // if pat[0...M-1] = txt[i, i+1, ...i+M-1]
{
printf("Pattern found at index %d \n", i);
i = i + M;
}
else if (j == 0)
i = i + 1;
else
i = i + j; // slide the pattern by j
}
}
/* Driver program to test above function */
int main()
{
char txt[] = "ABCEABCDABCEABCD";
char pat[] = "ABCD";
search(pat, txt);
return 0;
}
Python
# Python program for A modified Naive Pattern Searching
# algorithm that is optimized for the cases when all
# characters of pattern are different
def search(pat, txt):
    M = len(pat)
    N = len(txt)
    i = 0
    while i <= N-M:
        # For current index i, check for pattern match
        j = 0
        while j < M and txt[i+j] == pat[j]:
            j += 1
        if j == M:
            # if pat[0...M-1] = txt[i,i+1,...i+M-1]
            print("Pattern found at index " + str(i))
            i = i + M
        elif j == 0:
            i = i + 1
        else:
            i = i + j # slide the pattern by j

# Driver program to test above function
search("ABCD", "ABCEABCDABCEABCD")
We have discussed the following algorithms in the previous posts:
Naive Algorithm
KMP Algorithm
Rabin Karp Algorithm
In this post, we will discuss Finite Automata (FA) based pattern searching algorithm. In FA based algorithm, we preprocess the pattern and build a
2D array that represents a Finite Automata. Construction of the FA is the main tricky part of this algorithm. Once the FA is built, the searching is
simple. In search, we simply need to start from the first state of the automata and first character of the text. At every step, we consider next
character of text, look for the next state in the built FA and move to new state. If we reach final state, then pattern is found in text. Time complexity
of the search process is O(n).
Before we discuss FA construction, let us take a look at the following FA for pattern ACACAGA.
The above diagrams represent graphical and tabular representations of the FA for pattern ACACAGA.
Number of states in FA will be M+1 where M is length of the pattern. The main thing to construct FA is to get the next state from the current state
for every possible character. Given a character x and a state k, we can get the next state by considering the string pat[0..k-1]x which is basically
concatenation of pattern characters pat[0], pat[1], ..., pat[k-1] and the character x. The idea is to get length of the longest prefix of the given pattern
such that the prefix is also suffix of pat[0..k-1]x. The value of length gives us the next state. For example, let us see how to get the next state from
current state 5 and character C in the above diagram. We need to consider the string, pat[0..4]C which is ACACAC. The length of the longest
prefix of the pattern such that the prefix is suffix of ACACAC is 4 (ACAC). So the next state (from state 5) is 4 for character C.
In the following code, computeTF() constructs the FA. The time complexity of the computeTF() is O(m^3*NO_OF_CHARS) where m is length
of the pattern and NO_OF_CHARS is size of alphabet (total number of possible characters in pattern and text). The implementation tries all
possible prefixes starting from the longest possible that can be a suffix of pat[0..k-1]x. There are better implementations to construct FA in
O(m*NO_OF_CHARS) (Hint: we can use something like lps array construction in KMP algorithm). We have covered the better implementation
in our next post on pattern searching.
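The next-state rule can be tried out directly from its definition before reading the C code. The sketch below is a brute-force illustration; the name `next_state` and its string-based signature are hypothetical and differ from getNextState() in the listing that follows.

```python
def next_state(pat, state, ch):
    """Length of the longest prefix of pat that is a suffix of pat[:state] + ch."""
    s = pat[:state] + ch
    # Try the longest candidate first; a prefix of pat is at most len(pat) long.
    for k in range(min(len(pat), len(s)), 0, -1):
        if s.endswith(pat[:k]):
            return k
    return 0

pat = "ACACAGA"
print(next_state(pat, 5, "C"))  # 4
print(next_state(pat, 5, "G"))  # 6
```

The first call reproduces the worked example: from state 5 on character C the string considered is ACACAC, whose longest suffix that is also a prefix of the pattern is ACAC.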
#include<stdio.h>
#include<string.h>
#define NO_OF_CHARS 256
int getNextState(char *pat, int M, int state, int x)
{
// If the character x is same as next character in pattern,
// then simply increment state
if (state < M && x == pat[state])
return state+1;
int ns, i; // ns stores the result which is next state
// ns finally contains the longest prefix which is also suffix
// in "pat[0..state-1]x"
// Start from the largest possible value and stop when you find
// a prefix which is also suffix
for (ns = state; ns > 0; ns--)
{
if(pat[ns-1] == x)
{
for(i = 0; i < ns-1; i++)
{
if (pat[i] != pat[state-ns+1+i])
break;
}
if (i == ns-1)
return ns;
}
}
return 0;
}
/* This function builds the TF table which represents Finite Automata for a
given pattern */
void computeTF(char *pat, int M, int TF[][NO_OF_CHARS])
{
int state, x;
for (state = 0; state <= M; ++state)
for (x = 0; x < NO_OF_CHARS; ++x)
TF[state][x] = getNextState(pat, M, state, x);
}
/* Prints all occurrences of pat in txt */
void search(char *pat, char *txt)
{
int M = strlen(pat);
int N = strlen(txt);
int TF[M+1][NO_OF_CHARS];
computeTF(pat, M, TF);
// Process txt over FA.
int i, state=0;
for (i = 0; i < N; i++)
{
state = TF[state][txt[i]];
if (state == M)
{
printf ("\n Pattern found at index %d", i-M+1);
}
}
}
// Driver program to test above function
int main()
{
char *txt = "AABAACAADAABAAABAA";
char *pat = "AABA";
search(pat, txt);
return 0;
}
Output:
Pattern found at index 0
Pattern found at index 9
Pattern found at index 13
References:
Introduction to Algorithms by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein
The above diagrams represent graphical and tabular representations of the FA for pattern ACACAGA.
Algorithm:
1) Fill the first row. All entries in the first row are always 0 except the entry for the pat[0] character. For the pat[0] character, we always need to go to state 1.
2) Initialize lps as 0. lps for the first index is always 0.
3) Do the following for rows at index i = 1 to M (M is the length of the pattern). Steps b and c apply only while i < M, since there is no pattern character at index M.
..a) Copy the entries from the row at index equal to lps.
..b) Update the entry for pat[i] character to i+1.
..c) Update lps for the next row: lps = TF[lps][pat[i]], where TF is the 2D array which is being constructed.
Implementation
Following is C implementation for the above algorithm.
#include<stdio.h>
#include<string.h>
#define NO_OF_CHARS 256
/* This function builds the TF table which represents Finite Automata for a
given pattern */
void computeTransFun(char *pat, int M, int TF[][NO_OF_CHARS])
{
int i, lps = 0, x;
// Fill entries in first row
for (x =0; x < NO_OF_CHARS; x++)
TF[0][x] = 0;
TF[0][pat[0]] = 1;
// Fill entries in other rows
for (i = 1; i<= M; i++)
{
// Copy values from row at index lps
for (x = 0; x < NO_OF_CHARS; x++)
TF[i][x] = TF[lps][x];
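// Row M has no pattern character (pat[M] is the string terminator),
// so only rows below M get their own entry and update lps
if (i < M)
{
// Update the entry corresponding to this character
TF[i][pat[i]] = i + 1;
// Update lps for next row to be filled
lps = TF[lps][pat[i]];
}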
}
}
/* Prints all occurrences of pat in txt */
Output:
pattern found at index 0
pattern found at index 10
Time Complexity for FA construction is O(M*NO_OF_CHARS). The code for search is same as the previous post and time complexity for it is
O(n).
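The construction steps above, together with the search loop from the previous post, can be sketched in Python as follows. This is a minimal sketch, not the post's implementation: the names compute_trans_fun and fa_search are made up for illustration, and it assumes a non-empty pattern and characters with codes below 256.

```python
NO_OF_CHARS = 256  # assumed alphabet size, as in the C code

def compute_trans_fun(pat):
    """Build the transition table TF in O(M * NO_OF_CHARS) time."""
    m = len(pat)
    tf = [[0] * NO_OF_CHARS for _ in range(m + 1)]
    # First row: everything is 0 except the entry for pat[0]
    tf[0][ord(pat[0])] = 1
    lps = 0
    for i in range(1, m + 1):
        # Copy the row at index lps
        tf[i] = tf[lps][:]
        if i < m:
            c = ord(pat[i])
            # A match on pat[i] moves to the next state
            tf[i][c] = i + 1
            # lps for the next row to be filled
            lps = tf[lps][c]
    return tf

def fa_search(pat, txt):
    """Return the start indexes of all occurrences of pat in txt."""
    m = len(pat)
    tf = compute_trans_fun(pat)
    state = 0
    found = []
    for i, ch in enumerate(txt):
        state = tf[state][ord(ch)]
        if state == m:
            found.append(i - m + 1)
    return found

print(fa_search("AABA", "AABAACAADAABAAABAA"))
```

Returning a list of indexes instead of printing inside the loop is a choice made here only to keep the sketch easy to check.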
Pattern searching is an important problem in computer science. When we do search for a string in notepad/word file or browser or database,
pattern searching algorithms are used to show the search results.
We have discussed the following algorithms in the previous posts:
Naive Algorithm
KMP Algorithm
Rabin Karp Algorithm
Finite Automata based Algorithm
In this post, we will discuss Boyer Moore pattern searching algorithm. Like KMP and Finite Automata algorithms, Boyer Moore algorithm also
preprocesses the pattern.
Boyer Moore is a combination of following two approaches.
1) Bad Character Heuristic
2) Good Suffix Heuristic
Both of the above heuristics can also be used independently to search a pattern in a text. Let us first understand how the two independent approaches work together in the Boyer Moore algorithm. The Naive algorithm slides the pattern over the text one position at a time. The KMP algorithm preprocesses the pattern so that the pattern can be shifted by more than one position. The Boyer Moore algorithm does preprocessing for the same reason. It preprocesses the pattern and creates different arrays for both heuristics. At every step, it slides the pattern by the maximum of the slides suggested by the two heuristics, so it uses the better of the two heuristics at every step. Unlike the previous pattern searching algorithms, the Boyer Moore algorithm starts matching from the last character of the pattern.
In this post, we will discuss the bad character heuristic, and discuss the Good Suffix heuristic in the next post.
The idea of the bad character heuristic is simple. The character of the text which doesn't match the current character of the pattern is called the Bad Character. Whenever a character doesn't match, we slide the pattern so that the bad character is aligned with its last occurrence in the pattern. We preprocess the pattern and store the last occurrence of every possible character in an array of size equal to the alphabet size. If the bad character is not present in the pattern at all, the pattern can be moved past it, which may result in a shift by m (the length of the pattern). Therefore, the bad character heuristic takes O(n/m) time in the best case.
/* Program for Bad Character Heuristic of Boyer Moore String Matching Algorithm */
# include <limits.h>
# include <string.h>
# include <stdio.h>
# define NO_OF_CHARS 256
// A utility function to get maximum of two integers
int max (int a, int b) { return (a > b)? a: b; }
// The preprocessing function for Boyer Moore's bad character heuristic
void badCharHeuristic( char *str, int size, int badchar[NO_OF_CHARS])
{
int i;
// Initialize all occurrences as -1
Output:
pattern occurs at shift = 4
The Bad Character Heuristic may take O(mn) time in worst case. The worst case occurs when all characters of the text and pattern are same. For
example, txt[] = AAAAAAAAAAAAAAAAAA and pat[] = AAAAA.
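The bad character heuristic described above can be sketched in Python as follows. This is a rough sketch under stated assumptions, not the C program of this post: the names bad_char_table and bad_char_search are hypothetical, the pattern is assumed non-empty, and characters are assumed to have codes below 256.

```python
NO_OF_CHARS = 256  # assumed alphabet size

def bad_char_table(pat):
    """Last index of every character in pat, or -1 if it does not occur."""
    table = [-1] * NO_OF_CHARS
    for i, ch in enumerate(pat):
        table[ord(ch)] = i
    return table

def bad_char_search(txt, pat):
    """Return all shifts at which pat occurs in txt."""
    n, m = len(txt), len(pat)
    badchar = bad_char_table(pat)
    shifts = []
    s = 0  # current shift of the pattern with respect to the text
    while s <= n - m:
        j = m - 1
        # Match from the last character of the pattern towards the first
        while j >= 0 and pat[j] == txt[s + j]:
            j -= 1
        if j < 0:
            shifts.append(s)
            # Align the character after the window with its last occurrence
            s += (m - badchar[ord(txt[s + m])]) if s + m < n else 1
        else:
            # Align the bad character with its last occurrence in the pattern;
            # max() keeps the shift positive
            s += max(1, j - badchar[ord(txt[s + j])])
    return shifts

print(bad_char_search("ABAAABCD", "ABC"))
```

On the worst-case input mentioned above (all characters equal), every shift is a match and the pattern moves by one position at a time, which is where the O(mn) behaviour comes from.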
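Suffixes of "banana" with their starting indexes:
0 banana
1 anana
2 nana
3 ana
4 na
5 a
The same suffixes sorted alphabetically:
5 a
3 ana
1 anana
0 banana
4 na
2 nana
So the suffix array for "banana" is {5, 3, 1, 0, 4, 2}.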
Output:
Following is suffix array for banana
5 3 1 0 4 2
The time complexity of the above method to build the suffix array is O(n^2 Logn) if we consider an O(nLogn) algorithm used for sorting. The sorting step itself takes O(n^2 Logn) time, as every comparison is a comparison of two strings and each comparison takes O(n) time.
There are many efficient algorithms to build suffix array. We will soon be covering them as separate posts.
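As a compact illustration of the naive method analysed above, the suffix array can be built in Python by sorting the starting indexes by the suffix they point to. This is only a sketch of the simple O(n^2 Logn) approach; the name build_suffix_array mirrors the C++ helper used in this post, but the code below is an assumption made for illustration, not that helper.

```python
def build_suffix_array(txt):
    """Return the starting indexes of all suffixes of txt in sorted order."""
    # Each comparison inside sorted() compares two suffixes, which costs O(n)
    return sorted(range(len(txt)), key=lambda i: txt[i:])

print(build_suffix_array("banana"))
```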
Search a pattern using the built Suffix Array
To search a pattern in a text, we preprocess the text and build a suffix array of the text. Since we have a sorted array of all suffixes, Binary Search can be used to search. Following is the search function. Note that the function doesn't report all occurrences of the pattern; it only reports one of them.
// This code only contains search() and main. To make it a complete running
// program, combine it with the above code or see http://ideone.com/1Io9eN
// A suffix array based search function to search a given pattern
// 'pat' in given text 'txt' using suffix array suffArr[]
void search(char *pat, char *txt, int *suffArr, int n)
{
int m = strlen(pat); // get length of pattern, needed for strncmp()
// Do simple binary search for the pat in txt using the
// built suffix array
int l = 0, r = n-1; // Initialize left and right indexes
while (l <= r)
{
// See if 'pat' is prefix of middle suffix in suffix array
int mid = l + (r - l)/2;
int res = strncmp(pat, txt+suffArr[mid], m);
// If match found at the middle, print it and return
if (res == 0)
{
cout << "Pattern found at index " << suffArr[mid];
return;
}
// Move to left half if pattern is alphabetically less than
// the mid suffix
if (res < 0) r = mid - 1;
// Otherwise move to right half
else l = mid + 1;
}
// We reach here if return statement in loop is not executed
cout << "Pattern not found";
}
// Driver program to test above function
int main()
{
char txt[] = "banana"; // text
char pat[] = "nan"; // pattern to be searched in text
// Build suffix array
int n = strlen(txt);
int *suffArr = buildSuffixArray(txt, n);
Output:
Pattern found at index 2
The time complexity of the above search function is O(mLogn). There are more efficient algorithms to search a pattern once the suffix array is built. In fact, there is an O(m) suffix array based algorithm to search a pattern. We will soon be discussing an efficient algorithm for search.
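The O(mLogn) binary search described above can also be sketched in Python. This is a minimal sketch with a hypothetical function name, search_suffix_array; it returns one matching index (or -1) instead of printing, and it expects the suffix array to be passed in already built.

```python
def search_suffix_array(pat, txt, suff_arr):
    """Return the start index of one occurrence of pat in txt, or -1."""
    m = len(pat)
    lo, hi = 0, len(suff_arr) - 1
    while lo <= hi:
        mid = lo + (hi - lo) // 2
        start = suff_arr[mid]
        # First m characters of the middle suffix (fewer if the suffix is short)
        prefix = txt[start:start + m]
        if pat == prefix:
            return start
        if pat < prefix:
            hi = mid - 1   # pattern is alphabetically smaller: go left
        else:
            lo = mid + 1   # otherwise go right
    return -1

print(search_suffix_array("nan", "banana", [5, 3, 1, 0, 4, 2]))
```

Each of the O(Logn) steps compares at most m characters, which gives the O(mLogn) bound.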
Applications of Suffix Array
Suffix array is an extremely useful data structure, it can be used for a wide range of problems. Following are some famous problems where Suffix
array can be used.
1) Pattern Searching
2) Finding the longest repeated substring
3) Finding the longest common substring
4) Finding the longest palindrome in a string
See this for more problems where Suffix arrays can be used.
This post is a simple introduction. There is a lot to cover in Suffix arrays. We have discussed a O(nLogn) algorithm for Suffix Array construction
here. We will soon be discussing more efficient suffix array algorithms.
References:
http://www.stanford.edu/class/cs97si/suffix-array.pdf
http://en.wikipedia.org/wiki/Suffix_array
This problem is slightly different from standard pattern searching problem, here we need to search for anagrams as well. Therefore, we cannot
directly apply standard pattern searching algorithms like KMP, Rabin Karp, Boyer Moore, etc.
A simple idea is to modify the Rabin Karp algorithm. For example, we can keep the hash value as the sum of ASCII values of all characters modulo a big prime number. For every character of the text, we add the current character to the hash value and subtract the first character of the previous window. This solution looks good, but like standard Rabin Karp, its worst case time complexity is O(mn). The worst case occurs when all hash values match and we compare all characters one by one.
We can achieve O(n) time complexity under the assumption that the alphabet size is fixed, which is typically true as there are at most 256 possible values for a one-byte character. The idea is to use two count arrays:
1) The first count array stores frequencies of characters in the pattern.
2) The second count array stores frequencies of characters in the current window of text.
The important thing to note is that the time complexity to compare two count arrays is O(1), as the number of elements in them is fixed (independent of pattern and text sizes). Following are the steps of this algorithm.
1) Store frequencies of characters of the pattern in the first count array countP[]. Also store frequencies of characters in the first window of text in array countTW[].
2) Now run a loop from i = M to N-1. Do the following in the loop.
..a) If the two count arrays are identical, we found an occurrence.
..b) Increment the count of the current character of text in countTW[].
..c) Decrement the count of the first character of the previous window in countTW[].
3) The last window is not checked by the above loop, so explicitly check it.
Following is C++ implementation of above algorithm.
// C++ program to search all anagrams of a pattern in a text
#include<iostream>
#include<cstring>
#define MAX 256
using namespace std;
// This function returns true if contents of arr1[] and arr2[]
// are same, otherwise false.
bool compare(int arr1[], int arr2[])
{
for (int i=0; i<MAX; i++)
if (arr1[i] != arr2[i])
return false;
return true;
}
// This function search for all permutations of pat[] in txt[]
void search(char *pat, char *txt)
{
int M = strlen(pat), N = strlen(txt);
// countP[]: Store count of all characters of pattern
// countTW[]: Store count of current window of text
// int counters (rather than char) avoid overflow when a character
// occurs more than 127 times in a window
int countP[MAX] = {0}, countTW[MAX] = {0};
for (int i = 0; i < M; i++)
{
// cast to unsigned char so the index is never negative
(countP[(unsigned char)pat[i]])++;
(countTW[(unsigned char)txt[i]])++;
}
Output:
Found at Index 0
Found at Index 5
Found at Index 6
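The count-array steps listed above can be sketched in Python as follows. This is a sketch under assumptions, not the C++ program of this post: the name search_anagrams is hypothetical, characters are assumed to have codes below 256, and the indexes are returned as a list rather than printed.

```python
MAX = 256  # assumed alphabet size

def search_anagrams(pat, txt):
    """Return start indexes of all windows of txt that are anagrams of pat."""
    m, n = len(pat), len(txt)
    if m == 0 or m > n:
        return []
    count_p = [0] * MAX   # frequencies of characters in the pattern
    count_tw = [0] * MAX  # frequencies of characters in the current window
    for i in range(m):
        count_p[ord(pat[i])] += 1
        count_tw[ord(txt[i])] += 1
    found = []
    for i in range(m, n):
        # Comparing two fixed-size arrays costs O(MAX), i.e. O(1) in n and m
        if count_p == count_tw:
            found.append(i - m)
        # Slide the window: add the current character, drop the oldest one
        count_tw[ord(txt[i])] += 1
        count_tw[ord(txt[i - m])] -= 1
    # The last window is not checked inside the loop
    if count_p == count_tw:
        found.append(n - m)
    return found

print(search_anagrams("ABCD", "BACDGABCDA"))
```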
If we consider all of the above suffixes as individual words and build a Trie, we get following.
Output:
Search for 'ee'
Pattern found at position 9
Pattern found at position 1
Search for 'geek'
Pattern found at position 8
Pattern found at position 0
Search for 'quiz'
Pattern not found
Search for 'forgeeks'
Pattern found at position 5
Time Complexity of the above search function is O(m+k) where m is length of the pattern and k is the number of occurrences of pattern in text.
Here a center position is not only an actual character position of the string; it can also be the position between two characters.
Consider the string "abaaba" of even length. This string is a palindrome around the position between the 3rd and 4th characters, a and a.
To find the Longest Palindromic Substring (LPS) of a string of length N, one way is to take each of the 2*N + 1 possible centers (the N character positions, the N-1 positions between two characters and the 2 positions at the left and right ends), match characters in both left and right directions at each center, and keep track of the LPS. This approach takes O(N^2) time and that's what we are doing in Set 2.
Let's consider two strings "abababa" and "abaaba" as shown below:
In these two strings, the left and right sides of the center position (position 7 in the 1st string and position 6 in the 2nd string) are symmetric. Why? Because the whole string is a palindrome around the center position.
If we need to calculate the Longest Palindromic Substring at each of the 2*N+1 positions from left to right, then the symmetric property of palindromes can help to avoid some unnecessary computations (i.e. character comparisons). If there is a palindrome of some length L centered at any position P, then we may not need to compare all characters on the left and right side at position P+1. We already calculated LPS at positions before P and they can help to avoid some of the comparisons after position P.
This use of information from previous positions at a later point of time makes Manacher's algorithm linear. In Set 2, there is no reuse of previous information, and so it is quadratic.
Manacher's algorithm is often considered complex to understand, so here we will discuss it in as much detail as we can. Some of its portions may require multiple readings to understand properly.
Let's look at the string "abababa". In the 3rd figure above, 15 center positions are shown. We need to calculate the length of the longest palindromic substring at each of these positions.
At position 0, there is no LPS at all (no character on the left side to compare), so the length of LPS will be 0.
At position 1, LPS is "a", so the length of LPS will be 1.
At position 2, there is no LPS at all (left and right characters a and b don't match), so the length of LPS will be 0.
At position 3, LPS is "aba", so the length of LPS will be 3.
At position 4, there is no LPS at all (left and right characters b and a don't match), so the length of LPS will be 0.
At position 5, LPS is "ababa", so the length of LPS will be 5.
and so on.
We store all these palindromic lengths in an array, say L. Then string S and LPS Length L look like below:
In LPS Array L:
LPS length value at odd positions (the actual character positions) will be odd and greater than or equal to 1 (1 will come from the center
character itself if nothing else matches in left and right side of it)
LPS length value at even positions (the positions between two characters, extreme left and right positions) will be even and greater than or
equal to 0 (0 will come when there is no match in left and right side)
Position and index for the string are two different things here. For a given string S of length N, indexes will be from 0 to N-1 (total N
indexes) and positions will be from 0 to 2*N (total 2*N+1 positions).
The LPS length value can be interpreted in two ways, one in terms of index and the other in terms of position. LPS value d at position i (L[i] = d) tells that:
Substring from position i-d to i+d is a palindrome of length d (in terms of position)
Substring from index (i-d)/2 to [(i+d)/2 - 1] is a palindrome of length d (in terms of index)
e.g. in string "abaaba", L[3] = 3 means the substring from position 0 (3-3) to 6 (3+3) is a palindrome which is "aba" of length 3; it also means that the substring from index 0 [(3-3)/2] to 2 [(3+3)/2 - 1] is a palindrome which is "aba" of length 3.
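To make the position/index relationship concrete, the LPS length array can be computed by plain expansion at every position. The Python sketch below is the quadratic baseline, not Manacher's algorithm; the names lps_lengths_naive and palindrome_at are hypothetical and are used here only to illustrate the definitions above.

```python
def lps_lengths_naive(text):
    """LPS length at each of the 2*N+1 positions, by direct expansion (O(N^2))."""
    n = len(text)
    lengths = [0] * (2 * n + 1)
    for i in range(2 * n + 1):
        # Odd positions are characters (length starts at 1),
        # even positions lie between characters (length starts at 0)
        d = i % 2
        while True:
            left = (i - d) // 2 - 1   # index just left of the current palindrome
            right = (i + d) // 2      # index just right of the current palindrome
            if left >= 0 and right < n and text[left] == text[right]:
                d += 2                # one matching pair adds two characters
            else:
                break
        lengths[i] = d
    return lengths

def palindrome_at(text, lengths, i):
    """Substring from index (i-d)/2 to (i+d)/2 - 1 for d = L[i]."""
    d = lengths[i]
    return text[(i - d) // 2:(i + d) // 2]

L = lps_lengths_naive("abaaba")
print(L[3], palindrome_at("abaaba", L, 3))
print(lps_lengths_naive("abababa"))
```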
Now the main task is to compute LPS array efficiently. Once this array is computed, LPS of string S will be centered at position with maximum
LPS length value.
We will see it in Part 2.
If we already know LPS length values at positions 1, 2, 3, 4, 5, 6 and 7, then we may not need to calculate LPS length at positions 8, 9, 10, 11, 12 and 13 because they are equal to the LPS length values at the corresponding positions on the left side of position 7.
Can you see why LPS length values are symmetric around positions 3, 6, 9 in string "abaaba"? That's because there is a palindromic substring around these positions. Same is the case in string "abababa" around position 7.
Is it always true that LPS length values around a palindromic center position are symmetric (same)?
Answer is NO.
Look at positions 3 and 11 in string "abababa". Both positions have LPS length 3. Immediate left and right positions are symmetric (with value 0), but not the next ones. Positions 1 and 5 (around position 3) are not symmetric. Similarly, positions 9 and 13 (around position 11) are not symmetric.
At this point, we can see that if there is a palindrome in a string centered at some position, then LPS length values around the center position may or may not be symmetric depending on the situation. If we can identify the situation when left and right positions WILL BE SYMMETRIC around the center position, we NEED NOT calculate the LPS length of the right position because it will be exactly the same as the LPS value of the corresponding position on the left side, which is already known. This fact, that we avoid LPS length computation at some positions, makes Manacher's Algorithm linear.
In situations when left and right positions WILL NOT BE SYMMETRIC around the center position, we compare characters on the left and right side to find the palindrome, but here also the algorithm tries to avoid a certain number of comparisons. We will see all these scenarios soon.
Let's introduce a few terms to proceed further:
centerPosition - This is the position for which LPS length is calculated, and let's say LPS length at centerPosition is d (i.e. L[centerPosition] = d)
centerRightPosition - This is the position which is right of the centerPosition and d positions away from centerPosition (i.e. centerRightPosition = centerPosition + d)
centerLeftPosition - This is the position which is left of the centerPosition and d positions away from centerPosition (i.e. centerLeftPosition = centerPosition - d)
currentRightPosition - This is the position which is right of the centerPosition for which LPS length is not yet known and has to be calculated
currentLeftPosition - This is the position on the left side of centerPosition which corresponds to the currentRightPosition
centerPosition - currentLeftPosition = currentRightPosition - centerPosition
currentLeftPosition = 2*centerPosition - currentRightPosition
i-left palindrome - The palindrome i positions left of centerPosition, i.e. at currentLeftPosition
i-right palindrome - The palindrome i positions right of centerPosition, i.e. at currentRightPosition
center palindrome - The palindrome at centerPosition
When we are at centerPosition for which LPS length is known, then we also know LPS length of all positions smaller than centerPosition. Let's say LPS length at centerPosition is d, i.e.
L[centerPosition] = d
It means that the substring between positions centerPosition-d and centerPosition+d is a palindrome.
Now we proceed further to calculate LPS length of positions greater than centerPosition.
Let's say we are at currentRightPosition ( > centerPosition) where we need to find LPS length.
For this we look at LPS length of currentLeftPosition which is already calculated.
If LPS length of currentLeftPosition is less than centerRightPosition - currentRightPosition, then LPS length of currentRightPosition will be equal to LPS length of currentLeftPosition. So
L[currentRightPosition] = L[currentLeftPosition] if L[currentLeftPosition] < centerRightPosition - currentRightPosition. This is Case 1.
Let's consider the below scenario for string "abababa":
We have calculated LPS length up to position 7 where L[7] = 7. If we consider position 7 as centerPosition, then centerLeftPosition will be 0 and centerRightPosition will be 14.
Now we need to calculate LPS length of other positions on the right of centerPosition.
For currentRightPosition = 8, currentLeftPosition is 6 and L[currentLeftPosition] = 0
Also centerRightPosition - currentRightPosition = 14 - 8 = 6
Case 1 applies here and so L[currentRightPosition] = L[8] = 0
Case 1 applies to positions 10 and 12, so,
L[10] = L[4] = 0
L[12] = L[2] = 0
If we look at position 9, then:
currentRightPosition = 9
currentLeftPosition = 2*centerPosition - currentRightPosition = 2*7 - 9 = 5
centerRightPosition - currentRightPosition = 14 - 9 = 5
Here L[currentLeftPosition] = centerRightPosition - currentRightPosition, so Case 1 doesn't apply here. Also note that centerRightPosition is the extreme end position of the string. That means the center palindrome is a suffix of the input string. In that case, L[currentRightPosition] = L[currentLeftPosition]. This is Case 2.
Case 2 applies to positions 9, 11, 13 and 14, so:
L[9] = L[5] = 5
L[11] = L[3] = 3
L[13] = L[1] = 1
L[14] = L[0] = 0
What is really happening in Case 1 and Case 2? They just use the symmetric property of palindromes to find the LPS length of new positions without any character match.
When a bigger palindrome contains a smaller palindrome centered on the left side of its own center, then based on the symmetric property, there will be another identical smaller palindrome centered on the right of the bigger palindrome's center. If the left side smaller palindrome is not a prefix of the bigger palindrome, then Case 1 applies, and if it is a prefix AND the bigger palindrome is a suffix of the input string itself, then Case 2 applies.
The longest palindrome i places to the right of the current center (the i-right palindrome) is as long as the longest palindrome i places to the left of the current center (the i-left palindrome) if the i-left palindrome is completely contained in the longest palindrome around the current center (the center palindrome) and the i-left palindrome is not a prefix of the center palindrome (Case 1), or, when the i-left palindrome is a prefix of the center palindrome, if the center palindrome is a suffix of the entire string (Case 2).
In Case 1 and Case 2, the i-right palindrome can't expand more than the corresponding i-left palindrome (can you visualize why it can't expand more?), and so the LPS length of the i-right palindrome is exactly the same as the LPS length of the i-left palindrome.
Here both i-left and i-right palindromes are completely contained in the center palindrome (i.e. L[currentLeftPosition] <= centerRightPosition - currentRightPosition).
Now if the i-left palindrome is not a prefix of the center palindrome (L[currentLeftPosition] < centerRightPosition - currentRightPosition), that means that the i-left palindrome was not able to expand up to position centerLeftPosition.
If we look at following with centerPosition = 11, then
If we take center position 7, then Case 3 applies at currentRightPosition 11 because the i-left palindrome at currentLeftPosition 3 is a prefix of the center palindrome and the i-right palindrome is not a suffix of the input string, so here L[11] = 9, which is greater than the i-left palindrome length L[3] = 3. In this case, it is guaranteed that L[11] will be at least 3, and so in the implementation, we first set L[11] = 3 and then try to expand it by comparing characters on the left and right side starting from distance 4 (as up to distance 3, it is already known that characters will match).
If we take center position 11, then Case 4 applies at currentRightPosition 15 because L[currentLeftPosition] = L[7] = 7 > centerRightPosition - currentRightPosition = 20 - 15 = 5. In this case, it is guaranteed that L[15] will be at least 5, and so in the implementation, we first set L[15] = 5 and then try to expand it by comparing characters on the left and right side starting from distance 6 (as up to distance 5, it is already known that characters will match).
One point left to discuss is how to choose the next center position while we work at one center position and compute LPS lengths for different right positions. We change centerPosition to currentRightPosition if the palindrome centered at currentRightPosition expands beyond centerRightPosition.
Here we have seen four different cases of how the LPS length of a position depends on a previous position's LPS length.
In Part 3, we have discussed code implementation of it and also we have looked at these four cases in a different way and implemented that too.
If at all we need a comparison, we will only compare actual characters, which are at odd positions like 1, 3, 5, 7, etc.
Even positions do not represent a character in the string, so no comparison will be performed for even positions.
If two characters at different odd positions match, then they will increase LPS length by 2.
There are many ways to implement this depending on how even and odd positions are handled. One way would be to first create a new string where we insert some unique character (say #, $ etc) in all even positions and then run the algorithm on that (to avoid different handling of even and odd positions). The other way could be to work on the given string itself, but here even and odd positions should be handled appropriately.
Here we will start with the given string itself. When there is a need for expansion and character comparison is required, we will expand to the left and right positions one by one. When an odd position is found, a comparison will be done and LPS Length will be incremented by ONE. When an even position is found, no comparison is done and LPS Length will be incremented by ONE (so overall, one odd and one even position on both left and right side will increase LPS Length by TWO).
C/C++
// A C program to implement Manacher's Algorithm
#include <stdio.h>
#include <string.h>
char text[100];
void findLongestPalindromicString()
{
int N = strlen(text);
if(N == 0)
return;
N = 2*N + 1; //Position count
int L[N]; //LPS Length Array
L[0] = 0;
L[1] = 1;
int C = 1; //centerPosition
int R = 2; //centerRightPosition
int i = 0; //currentRightPosition
int iMirror; //currentLeftPosition
int expand = -1;
int diff = -1;
int maxLPSLength = 0;
int maxLPSCenterPosition = 0;
int start = -1;
int end = -1;
//Uncomment it to print LPS Length array
//printf("%d %d ", L[0], L[1]);
for (i = 2; i < N; i++)
{
//get currentLeftPosition iMirror for currentRightPosition i
iMirror = 2*C-i;
//Reset expand - means no expansion required
expand = 0;
diff = R - i;
//If currentRightPosition i is within centerRightPosition R
if(diff > 0)
{
if(L[iMirror] < diff) // Case 1
L[i] = L[iMirror];
else if(L[iMirror] == diff && i == N-1) // Case 2
L[i] = L[iMirror];
else if(L[iMirror] == diff && i < N-1) // Case 3
{
L[i] = L[iMirror];
expand = 1; // expansion required
}
else if(L[iMirror] > diff) // Case 4
{
L[i] = diff;
expand = 1; // expansion required
}
}
else
{
L[i] = 0;
expand = 1; // expansion required
}
if (expand == 1)
{
//Attempt to expand palindrome centered at currentRightPosition i
//Here for odd positions, we compare characters and
//if match then increment LPS Length by ONE
//If even position, we just increment LPS by ONE without
//any character comparison
while (((i + L[i]) < N && (i - L[i]) > 0) &&
( ((i + L[i] + 1) % 2 == 0) ||
(text[(i + L[i] + 1)/2] == text[(i-L[i]-1)/2] )))
{
L[i]++;
}
}
if(L[i] > maxLPSLength) // Track maxLPSLength
{
maxLPSLength = L[i];
maxLPSCenterPosition = i;
}
// If palindrome centered at currentRightPosition i
// expand beyond centerRightPosition R,
// adjust centerPosition C based on expanded palindrome.
if (i + L[i] > R)
{
C = i;
R = i + L[i];
}
//Uncomment it to print LPS Length array
//printf("%d ", L[i]);
}
//printf("\n");
start = (maxLPSCenterPosition - maxLPSLength)/2;
end = start + maxLPSLength - 1;
//printf("start: %d end: %d\n", start, end);
printf("LPS of string is %s : ", text);
for(i=start; i<=end; i++)
printf("%c", text[i]);
printf("\n");
}
int main(int argc, char *argv[])
{
strcpy(text, "babcbabcbaccba");
findLongestPalindromicString();
strcpy(text, "abaaba");
findLongestPalindromicString();
strcpy(text, "abababa");
findLongestPalindromicString();
strcpy(text, "abcbabcbabcba");
findLongestPalindromicString();
strcpy(text, "forgeeksskeegfor");
findLongestPalindromicString();
strcpy(text, "caba");
findLongestPalindromicString();
strcpy(text, "abacdfgdcaba");
findLongestPalindromicString();
strcpy(text, "abacdfgdcabba");
findLongestPalindromicString();
strcpy(text, "abacdedcaba");
findLongestPalindromicString();
return 0;
}
Python
# Python program to implement Manacher's Algorithm
def findLongestPalindromicString(text):
    N = len(text)
    if N == 0:
        return
    N = 2*N+1    # Position count
    L = [0] * N
    L[0] = 0
    L[1] = 1
    C = 1    # centerPosition
    R = 2    # centerRightPosition
    i = 0    # currentRightPosition
    iMirror = 0    # currentLeftPosition
    maxLPSLength = 0
    maxLPSCenterPosition = 0
    start = -1
    end = -1
    diff = -1
    # Uncomment it to print LPS Length array
    # print(L[0], L[1], end=" ")
    for i in range(2, N):
        # get currentLeftPosition iMirror for currentRightPosition i
        iMirror = 2*C-i
        L[i] = 0
        diff = R - i
        # If currentRightPosition i is within centerRightPosition R
        if diff > 0:
            L[i] = min(L[iMirror], diff)
        # Attempt to expand palindrome centered at currentRightPosition i
        # Here for odd positions, we compare characters and
        # if match then increment LPS Length by ONE
        # If even position, we just increment LPS by ONE without
        # any character comparison
        # The bound (i + L[i] + 1) < N keeps the character index inside text,
        # and // is integer division
        while ((i + L[i] + 1) < N and (i - L[i]) > 0) and \
                (((i + L[i] + 1) % 2 == 0) or
                 (text[(i + L[i] + 1)//2] == text[(i - L[i] - 1)//2])):
            L[i] += 1
        if L[i] > maxLPSLength:    # Track maxLPSLength
            maxLPSLength = L[i]
            maxLPSCenterPosition = i
        # If palindrome centered at currentRightPosition i
        # expands beyond centerRightPosition R,
        # adjust centerPosition C based on expanded palindrome.
        if i + L[i] > R:
            C = i
            R = i + L[i]
        # Uncomment it to print LPS Length array
        # print(L[i], end=" ")
    start = (maxLPSCenterPosition - maxLPSLength) // 2
    end = start + maxLPSLength - 1
    print("LPS of string is " + text + " : " + text[start:end+1])

# Driver program
text1 = "babcbabcbaccba"
findLongestPalindromicString(text1)
text2 = "abaaba"
findLongestPalindromicString(text2)
text3 = "abababa"
findLongestPalindromicString(text3)
text4 = "abcbabcbabcba"
findLongestPalindromicString(text4)
text5 = "forgeeksskeegfor"
findLongestPalindromicString(text5)
text6 = "caba"
findLongestPalindromicString(text6)
text7 = "abacdfgdcaba"
findLongestPalindromicString(text7)
text8 = "abacdfgdcabba"
findLongestPalindromicString(text8)
text9 = "abacdedcaba"
findLongestPalindromicString(text9)
# This code is contributed by BHAVYA JAIN
Output:
LPS of string is babcbabcbaccba : abcbabcba
LPS of string is abaaba : abaaba
LPS of string is abababa : abababa
LPS of string is abcbabcbabcba : abcbabcbabcba
LPS of string is forgeeksskeegfor : geeksskeeg
LPS of string is caba : aba
LPS of string is abacdfgdcaba : aba
LPS of string is abacdfgdcabba : abba
LPS of string is abacdedcaba : abacdedcaba
This is the implementation based on the four cases discussed in Part 2. In Part 4, we have discussed a different way to look at these four cases and a few other approaches.
If we look at all four cases, we will see that we first set L[i] to the minimum of L[iMirror] and R-i and then try to expand the palindrome in whichever case it can expand.
The above observation may look more intuitive and easier to understand and implement, given that one understands the LPS length array, position, index, symmetry property etc.
C/C++
// A C program to implement Manacher's Algorithm
#include <stdio.h>
#include <string.h>
char text[100];
int min(int a, int b)
{
int res = a;
if(b < a)
res = b;
return res;
}
void findLongestPalindromicString()
{
int N = strlen(text);
if(N == 0)
return;
N = 2*N + 1; //Position count
int L[N]; //LPS Length Array
L[0] = 0;
L[1] = 1;
int C = 1; //centerPosition
int R = 2; //centerRightPosition
int i = 0; //currentRightPosition
int iMirror; //currentLeftPosition
int maxLPSLength = 0;
int maxLPSCenterPosition = 0;
int start = -1;
int end = -1;
int diff = -1;
//Uncomment it to print LPS Length array
//printf("%d %d ", L[0], L[1]);
for (i = 2; i < N; i++)
{
//get currentLeftPosition iMirror for currentRightPosition i
iMirror = 2*C-i;
L[i] = 0;
diff = R - i;
//If currentRightPosition i is within centerRightPosition R
if(diff > 0)
L[i] = min(L[iMirror], diff);
//Attempt to expand palindrome centered at currentRightPosition i
//Here for odd positions, we compare characters and
//if match then increment LPS Length by ONE
//If even position, we just increment LPS by ONE without
//any character comparison
while ( ((i + L[i]) < N && (i - L[i]) > 0) &&
( ((i + L[i] + 1) % 2 == 0) ||
(text[(i + L[i] + 1)/2] == text[(i - L[i] - 1)/2] )))
{
L[i]++;
}
if(L[i] > maxLPSLength) // Track maxLPSLength
{
maxLPSLength = L[i];
maxLPSCenterPosition = i;
}
//If palindrome centered at currentRightPosition i
//expand beyond centerRightPosition R,
//adjust centerPosition C based on expanded palindrome.
if (i + L[i] > R)
{
C = i;
R = i + L[i];
}
//Uncomment it to print LPS Length array
//printf("%d ", L[i]);
}
//printf("\n");
start = (maxLPSCenterPosition - maxLPSLength)/2;
end = start + maxLPSLength - 1;
printf("LPS of string is %s : ", text);
for(i=start; i<=end; i++)
printf("%c", text[i]);
printf("\n");
}
int main(int argc, char *argv[])
{
strcpy(text, "babcbabcbaccba");
findLongestPalindromicString();
strcpy(text, "abaaba");
findLongestPalindromicString();
strcpy(text, "abababa");
findLongestPalindromicString();
strcpy(text, "abcbabcbabcba");
findLongestPalindromicString();
strcpy(text, "forgeeksskeegfor");
findLongestPalindromicString();
strcpy(text, "caba");
findLongestPalindromicString();
strcpy(text, "abacdfgdcaba");
findLongestPalindromicString();
strcpy(text, "abacdfgdcabba");
findLongestPalindromicString();
strcpy(text, "abacdedcaba");
findLongestPalindromicString();
return 0;
}
Python
# Python program to implement Manacher's Algorithm
def findLongestPalindromicString(text):
    N = len(text)
    if N == 0:
        return
    N = 2*N+1    # Position count
    L = [0] * N
    L[0] = 0
    L[1] = 1
    C = 1    # centerPosition
    R = 2    # centerRightPosition
    i = 0    # currentRightPosition
    iMirror = 0    # currentLeftPosition
    maxLPSLength = 0
    maxLPSCenterPosition = 0
    start = -1
    end = -1
    diff = -1
    # Uncomment it to print LPS Length array
    # print(L[0], L[1], end=" ")
    for i in range(2, N):
        # get currentLeftPosition iMirror for currentRightPosition i
        iMirror = 2*C-i
        L[i] = 0
        diff = R - i
        # If currentRightPosition i is within centerRightPosition R
        if diff > 0:
            L[i] = min(L[iMirror], diff)
        # Attempt to expand palindrome centered at currentRightPosition i
        # Here for odd positions, we compare characters and
        # if match then increment LPS Length by ONE
        # If even position, we just increment LPS by ONE without
        # any character comparison
        # The bound (i + L[i] + 1) < N keeps the character index inside text,
        # and // is integer division
        while ((i + L[i] + 1) < N and (i - L[i]) > 0) and \
                (((i + L[i] + 1) % 2 == 0) or
                 (text[(i + L[i] + 1)//2] == text[(i - L[i] - 1)//2])):
            L[i] += 1
        if L[i] > maxLPSLength:    # Track maxLPSLength
            maxLPSLength = L[i]
            maxLPSCenterPosition = i
        # If palindrome centered at currentRightPosition i
        # expands beyond centerRightPosition R,
        # adjust centerPosition C based on expanded palindrome.
        if i + L[i] > R:
            C = i
            R = i + L[i]
    start = (maxLPSCenterPosition - maxLPSLength) // 2
    end = start + maxLPSLength - 1
    print("LPS of string is " + text + " : " + text[start:end+1])
Output:
LPS of string is babcbabcbaccba : abcbabcba
LPS of string is abaaba : abaaba
LPS of string is abababa : abababa
LPS of string is abcbabcbabcba : abcbabcbabcba
LPS of string is forgeeksskeegfor : geeksskeeg
LPS of string is caba : aba
LPS of string is abacdfgdcaba : aba
LPS of string is abacdfgdcabba : abba
LPS of string is abacdedcaba : abacdedcaba
Other Approaches
We have discussed two approaches: one in Part 3 and the other in the current article. Both work directly on the given string, so even and odd
positions must be handled differently when comparing characters during expansion (even positions do not represent any character in the
string).
To avoid this special handling, even positions must also represent a character, and it must be the SAME character at every even position
because these positions MUST always match during comparison. One way to do this is to insert a character at all even positions, either by
modifying the given string or by creating a new copy of it. For example, if the input string is abcb, the new string is #a#b#c#b# when # is
chosen as a character that does not occur in the input.
Both approaches discussed so far can be adapted with small changes to work on the modified string, where no special handling of even and odd
positions is needed.
We may also add two DIFFERENT characters (not used anywhere else in the string) at the start and end of the string as sentinels to avoid
bound checks. With these changes, the string abcb becomes ^#a#b#c#b#$ where ^ and $ are the sentinels.
This implementation may look cleaner, at the cost of more memory.
We do not implement these variants here because they are a simple change to the given implementations.
An implementation of the current article's approach on a modified string can be found at Longest Palindromic Substring Part II, along with a
Java translation of the same by Princeton.
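As an illustration only, the modified-string idea can be sketched in Python as follows. The function name longest_palindrome_transformed is hypothetical, and the sketch assumes that the characters ^, # and $ do not occur in the input. It is not the implementation referred to above, just one way the sentinels remove both the bound checks and the even/odd special case.

```python
def longest_palindrome_transformed(text):
    """Manacher's algorithm on the padded string ^#a#b#...#$ (a sketch)."""
    if not text:
        return ""
    # Assumption: '^', '#' and '$' do not occur in text.
    t = "^#" + "#".join(text) + "#$"
    n = len(t)
    p = [0] * n          # p[i] = palindrome radius around t[i]
    center = right = 0   # center and right edge of the rightmost palindrome
    for i in range(1, n - 1):
        if i < right:
            # Reuse the radius of the mirror position, capped at the edge.
            p[i] = min(right - i, p[2 * center - i])
        # The sentinels differ from every other character, so this loop
        # stops at either end of t without an explicit bound check.
        while t[i + p[i] + 1] == t[i - p[i] - 1]:
            p[i] += 1
        if i + p[i] > right:
            center, right = i, i + p[i]
    length, mid = max((p[i], i) for i in range(1, n - 1))
    start = (mid - length) // 2   # map back to an index in text
    return text[start:start + length]
```

Every position is handled by the same comparison loop, at the cost of building a string of length 2n+3.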
Longest Even Length Substring such that Sum of First and Second Half is same
Given a string str of digits, find length of the longest substring of str, such that the length of the substring is 2k digits and sum of left k digits is equal
to the sum of right k digits.
Examples:
Input: str = "123123"
Output: 6
The complete string is of even length and sum of first and second
half digits is same
Input: str = "1538023"
Output: 4
The longest substring with same first and second half sum is "5380"
Output:
Length of the substring is 4
int findLength(char *str)
{
int n = strlen(str);
int maxlen = 0; // Initialize result
// A 2D table where sum[i][j] stores sum of digits
// from str[i] to str[j]. Only filled entries are
// the entries where j >= i
int sum[n][n];
// Fill the diagonal values for substrings of length 1
for (int i =0; i<n; i++)
sum[i][i] = str[i]-'0';
// Fill entries for substrings of length 2 to n
for (int len=2; len<=n; len++)
{
// Pick i and j for current substring
for (int i=0; i<n-len+1; i++)
{
int j = i+len-1;
int k = len/2;
// Calculate value of sum[i][j]
sum[i][j] = sum[i][j-k] + sum[j-k+1][j];
// Update result if 'len' is even, left and right
// sums are same and len is more than maxlen
if (len%2 == 0 && sum[i][j-k] == sum[(j-k+1)][j]
&& len > maxlen)
maxlen = len;
}
}
return maxlen;
}
// Driver program to test above function
int main(void)
{
char str[] = "153803";
printf("Length of the substring is %d", findLength(str));
return 0;
}
Output:
Length of the substring is 4
Time complexity of the above solution is O(n^2), but it requires O(n^2) extra space.
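The extra space can be reduced with prefix sums. The following Python sketch (find_length is a hypothetical name, and the input is assumed to contain only digits) keeps the O(n^2) running time but needs only O(n) extra space, because the sum of any substring is the difference of two prefix sums.

```python
def find_length(digits):
    """Length of the longest even-length substring with equal half sums."""
    n = len(digits)
    # prefix[i] = sum of the first i digits, so the sum of
    # digits[i:j] is prefix[j] - prefix[i].
    prefix = [0] * (n + 1)
    for i, ch in enumerate(digits):
        prefix[i + 1] = prefix[i] + int(ch)
    best = 0
    for start in range(n):
        # Only even lengths longer than the best found so far matter.
        for length in range(best + 2, n - start + 1, 2):
            mid = start + length // 2
            end = start + length
            if prefix[mid] - prefix[start] == prefix[end] - prefix[mid]:
                best = length
    return best
```

For example, find_length("1538023") examines "5380" through prefix[3] - prefix[1] and prefix[5] - prefix[3], which are both 8.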
}
}
return ans;
}
// Driver program to test above function
int main()
{
string str = "123123";
cout << "Length of the substring is " << findLength(str, str.length());
return 0;
}
Output:
Length of the substring is 6
C/C++
// C++ program to print permutations of a given string with spaces.
#include <iostream>
#include <cstring>
using namespace std;
/* Function recursively prints the strings having space pattern.
i and j are indices in 'str[]' and 'buff[]' respectively */
void printPatternUtil(char str[], char buff[], int i, int j, int n)
{
if (i==n)
{
buff[j] = '\0';
cout << buff << endl;
return;
}
// Either put the character
buff[j] = str[i];
printPatternUtil(str, buff, i+1, j+1, n);
// Or put a space followed by next character
buff[j] = ' ';
buff[j+1] = str[i];
printPatternUtil(str, buff, i+1, j+2, n);
}
// This function creates buf[] to store individual output string and uses
// printPatternUtil() to print all permutations.
void printPattern(char *str)
{
int n = strlen(str);
// Buffer to hold the string containing spaces
char buf[2*n]; // 2n-1 characters and 1 string terminator
// Copy the first character as it is, since it will be always
// at first position
buf[0] = str[0];
printPatternUtil(str, buf, 1, 1, n);
}
// Driver program to test above functions
int main()
{
char str[] = "ABCD";
printPattern(str);
return 0;
}
Python
# Python program to print permutations of a given string with
# spaces.
# Utility function
def toString(List):
    s = ""
    for x in List:
        if x == '\0':
            break
        s += x
    return s
# Function recursively prints the strings having space pattern.
# i and j are indices in 'str[]' and 'buff[]' respectively
def printPatternUtil(string, buff, i, j, n):
    if i == n:
        buff[j] = '\0'
        print(toString(buff))
        return
    # Either put the character
    buff[j] = string[i]
    printPatternUtil(string, buff, i+1, j+1, n)
    # Or put a space followed by next character
    buff[j] = ' '
    buff[j+1] = string[i]
    printPatternUtil(string, buff, i+1, j+2, n)
# This function creates buf[] to store individual output string
# and uses printPatternUtil() to print all permutations.
def printPattern(string):
    n = len(string)
    # Buffer to hold the string containing spaces
    buff = [0] * (2*n)  # 2n-1 characters and 1 string terminator
    # Copy the first character as it is, since it will be always
    # at first position
    buff[0] = string[0]
    printPatternUtil(string, buff, 1, 1, n)
# Driver program
string = "ABCD"
printPattern(string)
# This code is contributed by BHAVYA JAIN
ABCD
ABC D
AB CD
AB C D
A BCD
A BC D
A B CD
A B C D
Time Complexity: Since the number of gaps is n-1, there are a total of 2^(n-1) patterns, each with a length ranging from n to 2n-1. Thus the
overall complexity is O(n*(2^n)).
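The count of 2^(n-1) patterns can be made concrete with a small iterative sketch in Python. The name space_patterns is hypothetical. Each of the n-1 gaps is treated as one bit of a counter, and the bit order is chosen (an assumption of this sketch) so that the patterns come out in the same order as the recursive programs above.

```python
def space_patterns(s):
    """Return every way of placing spaces in the gaps of s."""
    n = len(s)
    if n == 0:
        return []
    patterns = []
    # One bit per gap: there are n-1 gaps, hence 2^(n-1) masks.
    for mask in range(1 << (n - 1)):
        parts = [s[0]]
        for gap in range(n - 1):
            # The leftmost gap is the most significant bit.
            if mask >> (n - 2 - gap) & 1:
                parts.append(" ")
            parts.append(s[gap + 1])
        patterns.append("".join(parts))
    return patterns
```

Each pattern takes O(n) work to build, which gives the O(n*(2^n)) bound stated above.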
C/C++
// C program to print all permutations with duplicates allowed
#include <stdio.h>
#include <string.h>
/* Function to swap values at two pointers */
void swap(char *x, char *y)
{
char temp;
temp = *x;
*x = *y;
*y = temp;
}
/* Function to print permutations of string
This function takes three parameters:
1. String
2. Starting index of the string
3. Ending index of the string. */
void permute(char *a, int l, int r)
{
int i;
if (l == r)
printf("%s\n", a);
else
{
for (i = l; i <= r; i++)
{
swap((a+l), (a+i));
permute(a, l+1, r);
swap((a+l), (a+i)); //backtrack
}
}
}
/* Driver program to test above functions */
int main()
{
char str[] = "ABC";
int n = strlen(str);
permute(str, 0, n-1);
return 0;
}
Python
# Python program to print all permutations with
# duplicates allowed
ABC
ACB
BAC
BCA
CBA
CAB
Backtracking attacks problems incrementally. Typically, we start from an empty solution vector and add items one by one (the meaning of an
item varies from problem to problem; in the Knight's tour problem, an item is a Knight's move). When we add an item, we check whether it
violates the problem constraints. If it does, we remove the item and try the other alternatives. If none of the alternatives works, we go
back to the previous stage and remove the item added there. If we get back to the initial stage, we say that no solution exists. If adding
an item doesn't violate the constraints, we recursively add further items one by one. When the solution vector becomes complete, we print
the solution.
Backtracking Algorithm for Knight's tour
Following is the Backtracking algorithm for the Knight's tour problem.
If all squares are visited
print the solution
Else
a) Add one of the next moves to solution vector and recursively
check if this move leads to a solution. (A Knight can make maximum
eight moves. We choose one of the 8 moves in this step).
b) If the move chosen in the above step doesn't lead to a solution
then remove this move from the solution vector and try other
alternative moves.
c) If none of the alternatives work then return false (Returning false
will remove the previously added item in recursion and if false is
returned by the initial call of recursion then "no solution exists" )
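The steps above can be sketched in Python as follows. This is a minimal illustration, not the implementation this section refers to: the function name solve_knights_tour, the board size parameter n and the particular order of the eight moves are assumptions of the sketch. The tour starts at the top-left square, and unvisited squares hold -1.

```python
def solve_knights_tour(n=8):
    """Return an n x n board of step numbers, or None if no tour exists."""
    # The eight possible Knight moves (order chosen for this sketch).
    moves = [(2, 1), (1, 2), (-1, 2), (-2, 1),
             (-2, -1), (-1, -2), (1, -2), (2, -1)]
    board = [[-1] * n for _ in range(n)]
    board[0][0] = 0  # the Knight starts at the top-left square

    def place(x, y, step):
        # All squares are visited: the solution vector is complete.
        if step == n * n:
            return True
        for dx, dy in moves:
            nx, ny = x + dx, y + dy
            if 0 <= nx < n and 0 <= ny < n and board[nx][ny] == -1:
                board[nx][ny] = step          # a) add the move
                if place(nx, ny, step + 1):
                    return True
                board[nx][ny] = -1            # b) remove it and try others
        return False                          # c) no alternative worked

    return board if place(0, 0, 1) else None
```

Plain backtracking can be slow on the full 8*8 board in an interpreted language, so a smaller board such as solve_knights_tour(5) is a more comfortable way to try the sketch.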
Following are implementations for the Knight's tour problem. Each prints one of the possible solutions in 2D matrix form. The output is a 2D
8*8 matrix with numbers from 0 to 63, and these numbers show the steps made by the Knight.
C
// C program for Knight Tour problem
#include<stdio.h>
#include<stdbool.h>
#define N 8
int solveKTUtil(int x, int y, int movei, int sol[N][N],
int xMove[], int yMove[]);
/* A utility function to check if i,j are valid indexes
for N*N chessboard */
bool isSafe(int x, int y, int sol[N][N])
{
return ( x >= 0 && x < N && y >= 0 &&
y < N && sol[x][y] == -1);
}
/* A utility function to print solution matrix sol[N][N] */
void printSolution(int sol[N][N])
{
for (int x = 0; x < N; x++)
{
Java
return false;
}
/* Driver program to test above functions */
public static void main(String args[]) {
solveKT();
}
}
// This code is contributed by Abhishek Shankhadhar
  0  59  38  33  30  17   8  63
 37  34  31  60   9  62  29  16
 58   1  36  39  32  27  18   7
 35  48  41  26  61  10  15  28
 42  57   2  49  40  23   6  19
 47  50  45  54  25  20  11  14
 56  43  52   3  22  13  24   5
 51  46  55  44  53   4  21  12
Note that Backtracking is not the best solution for the Knight's tour problem; better approaches exist. The purpose of this post is to
explain Backtracking with an example.
{1, 0, 0, 0}
{1, 1, 0, 1}
{0, 1, 0, 0}
{1, 1, 1, 1}
Following is the solution matrix (output of program) for the above input matrix.
{1, 0, 0, 0}
{1, 1, 0, 0}
{0, 1, 0, 0}
{0, 1, 1, 1}
All entries in the solution path are marked as 1.
Naive Algorithm
The Naive Algorithm is to generate all paths from source to destination and one by one check if the generated path satisfies the constraints.
while there are untried paths
{
generate the next path
if this path has all blocks as 1
{
print this path;
}
}
Backtracking Algorithm
If destination is reached
print the solution matrix
Else
a) Mark current cell in solution matrix as 1.
b) Move forward in horizontal direction and recursively check if this
move leads to a solution.
c) If the move chosen in the above step doesn't lead to a solution
then move down and check if this move leads to a solution.
C/C++
/* C/C++ program to solve Rat in a Maze problem using
backtracking */
#include<stdio.h>
#include<stdbool.h>
// Maze size
#define N 4
bool solveMazeUtil(int maze[N][N], int x, int y, int sol[N][N]);
/* A utility function to print solution matrix sol[N][N] */
void printSolution(int sol[N][N])
{
for (int i = 0; i < N; i++)
{
for (int j = 0; j < N; j++)
printf(" %d ", sol[i][j]);
printf("\n");
}
}
/* A utility function to check if x,y is valid index for N*N maze */
bool isSafe(int maze[N][N], int x, int y)
{
// if (x,y outside maze) return false
if(x >= 0 && x < N && y >= 0 && y < N && maze[x][y] == 1)
return true;
return false;
}
/* This function solves the Maze problem using Backtracking. It mainly
uses solveMazeUtil() to solve the problem. It returns false if no
path is possible, otherwise return true and prints the path in the
form of 1s. Please note that there may be more than one solutions,
this function prints one of the feasible solutions.*/
bool solveMaze(int maze[N][N])
{
int sol[N][N] = { {0, 0, 0, 0},
{0, 0, 0, 0},
{0, 0, 0, 0},
{0, 0, 0, 0}
};
if(solveMazeUtil(maze, 0, 0, sol) == false)
{
printf("Solution doesn't exist");
return false;
}
printSolution(sol);
return true;
}
/* A recursive utility function to solve Maze problem */
bool solveMazeUtil(int maze[N][N], int x, int y, int sol[N][N])
{
// if (x,y is goal) return true
if(x == N-1 && y == N-1)
{
sol[x][y] = 1;
return true;
}
// Check if maze[x][y] is valid
if(isSafe(maze, x, y) == true)
{
// mark x,y as part of solution path
sol[x][y] = 1;
/* Move forward in x direction */
if (solveMazeUtil(maze, x+1, y, sol) == true)
return true;
Java
/* Java program to solve Rat in a Maze problem using
backtracking */
public class RatMaze
{
final int N = 4;
/* A utility function to print solution matrix
sol[N][N] */
void printSolution(int sol[][])
{
for (int i = 0; i < N; i++)
{
for (int j = 0; j < N; j++)
System.out.print(" " + sol[i][j] +
" ");
System.out.println();
}
}
/* A utility function to check if x,y is valid
index for N*N maze */
boolean isSafe(int maze[][], int x, int y)
{
// if (x,y outside maze) return false
return (x >= 0 && x < N && y >= 0 &&
y < N && maze[x][y] == 1);
}
/* This function solves the Maze problem using
Backtracking. It mainly uses solveMazeUtil()
to solve the problem. It returns false if no
path is possible, otherwise return true and
prints the path in the form of 1s. Please note
that there may be more than one solutions, this
function prints one of the feasible solutions.*/
boolean solveMaze(int maze[][])
{
int sol[][] = {{0, 0, 0, 0},
{0, 0, 0, 0},
{0, 0, 0, 0},
{0, 0, 0, 0}
};
if (solveMazeUtil(maze, 0, 0, sol) == false)
{
System.out.print("Solution doesn't exist");
return false;
}
printSolution(sol);
return true;
}
/* A recursive utility function to solve Maze
problem */
boolean solveMazeUtil(int maze[][], int x, int y,
int sol[][])
{
// if (x,y is goal) return true
if (x == N - 1 && y == N - 1)
{
sol[x][y] = 1;
return true;
}
// Check if maze[x][y] is valid
if (isSafe(maze, x, y) == true)
{
// mark x,y as part of solution path
sol[x][y] = 1;
/* Move forward in x direction */
if (solveMazeUtil(maze, x + 1, y, sol))
return true;
/* If moving in x direction doesn't give
solution then Move down in y direction */
if (solveMazeUtil(maze, x, y + 1, sol))
return true;
/* If none of the above movements work then
BACKTRACK: unmark x,y as part of solution
path */
sol[x][y] = 0;
return false;
}
return false;
}
public static void main(String args[])
{
RatMaze rat = new RatMaze();
int maze[][] = {{1, 0, 0, 0},
{1, 1, 0, 1},
{0, 1, 0, 0},
{1, 1, 1, 1}
};
rat.solveMaze(maze);
}
}
// This code is contributed by Abhishek Shankhadhar
 1  0  0  0
 1  1  0  0
 0  1  0  0
 0  1  1  1
The expected output is a binary matrix which has 1s for the blocks where queens are placed. For example following is the output matrix for above
4 queen solution.
{ 0, 1, 0, 0}
{ 0, 0, 0, 1}
{ 1, 0, 0, 0}
{ 0, 0, 1, 0}
Naive Algorithm
Generate all possible configurations of queens on board and print a configuration that satisfies the given constraints.
while there are untried configurations
{
generate the next configuration
if queens don't attack in this configuration then
{
print this configuration;
}
}
Backtracking Algorithm
The idea is to place queens one by one in different columns, starting from the leftmost column. When we place a queen in a column, we check for
clashes with already placed queens. In the current column, if we find a row for which there is no clash, we mark this row and column as part of the
solution. If we do not find such a row due to clashes then we backtrack and return false.
1) Start in the leftmost column
2) If all queens are placed
return true
3) Try all rows in the current column. Do following for every tried row.
a) If the queen can be placed safely in this row then mark this [row,
column] as part of the solution and recursively check if placing
queen here leads to a solution.
b) If placing queen in [row, column] leads to a solution then return
true.
c) If placing queen doesn't lead to a solution then unmark this [row,
column] (Backtrack) and go to step (a) to try other rows.
4) If all rows have been tried and nothing worked, return false to trigger
backtracking.
C/C++
/* C/C++ program to solve N Queen Problem using
backtracking */
#define N 4
#include<stdio.h>
#include<stdbool.h>
/* A utility function to print solution */
void printSolution(int board[N][N])
{
for (int i = 0; i < N; i++)
{
for (int j = 0; j < N; j++)
printf(" %d ", board[i][j]);
printf("\n");
}
}
/* A utility function to check if a queen can
be placed on board[row][col]. Note that this
function is called when "col" queens are
already placed in columns from 0 to col -1.
So we need to check only left side for
attacking queens */
bool isSafe(int board[N][N], int row, int col)
{
int i, j;
/* Check this row on left side */
for (i = 0; i < col; i++)
if (board[row][i])
return false;
/* Check upper diagonal on left side */
for (i=row, j=col; i>=0 && j>=0; i--, j--)
if (board[i][j])
return false;
/* Check lower diagonal on left side */
for (i=row, j=col; j>=0 && i<N; i++, j--)
if (board[i][j])
return false;
return true;
}
/* A recursive utility function to solve N
Queen problem */
bool solveNQUtil(int board[N][N], int col)
{
/* base case: If all queens are placed
then return true */
if (col >= N)
return true;
/* Consider this column and try placing
this queen in all rows one by one */
for (int i = 0; i < N; i++)
{
/* Check if queen can be placed on
board[i][col] */
if ( isSafe(board, i, col) )
{
/* Place this queen in board[i][col] */
board[i][col] = 1;
/* recur to place rest of the queens */
if ( solveNQUtil(board, col + 1) )
return true;
/* If placing queen in board[i][col]
doesn't lead to a solution, then
remove queen from board[i][col] */
board[i][col] = 0; // BACKTRACK
}
}
/* If queen cannot be placed in any row in
this column col then return false */
return false;
}
/* This function solves the N Queen problem using
Backtracking. It mainly uses solveNQUtil() to
solve the problem. It returns false if queens
cannot be placed, otherwise return true and
prints placement of queens in the form of 1s.
Please note that there may be more than one
solutions, this function prints one of the
feasible solutions.*/
bool solveNQ()
{
int board[N][N] = { {0, 0, 0, 0},
{0, 0, 0, 0},
{0, 0, 0, 0},
{0, 0, 0, 0}
};
if ( solveNQUtil(board, 0) == false )
{
printf("Solution does not exist");
return false;
}
printSolution(board);
return true;
}
// driver program to test above function
int main()
{
solveNQ();
return 0;
}
Java
/* Java program to solve N Queen Problem using
backtracking */
public class NQueenProblem
{
final int N = 4;
/* A utility function to print solution */
void printSolution(int board[][])
{
for (int i = 0; i < N; i++)
{
for (int j = 0; j < N; j++)
System.out.print(" " + board[i][j]
+ " ");
System.out.println();
}
}
/* A utility function to check if a queen can
be placed on board[row][col]. Note that this
function is called when "col" queens are already
placed in columns from 0 to col -1. So we need
to check only left side for attacking queens */
boolean isSafe(int board[][], int row, int col)
{
int i, j;
/* Check this row on left side */
for (i = 0; i < col; i++)
if (board[row][i] == 1)
return false;
/* Check upper diagonal on left side */
for (i=row, j=col; i>=0 && j>=0; i--, j--)
if (board[i][j] == 1)
return false;
/* Check lower diagonal on left side */
for (i=row, j=col; j>=0 && i<N; i++, j--)
if (board[i][j] == 1)
return false;
return true;
}
/* A recursive utility function to solve N
Queen problem */
boolean solveNQUtil(int board[][], int col)
{
/* base case: If all queens are placed
then return true */
if (col >= N)
return true;
/* Consider this column and try placing
this queen in all rows one by one */
for (int i = 0; i < N; i++)
{
/* Check if queen can be placed on
board[i][col] */
if (isSafe(board, i, col))
{
/* Place this queen in board[i][col] */
board[i][col] = 1;
/* recur to place rest of the queens */
if (solveNQUtil(board, col + 1) == true)
return true;
/* If placing queen in board[i][col]
doesn't lead to a solution then
remove queen from board[i][col] */
board[i][col] = 0; // BACKTRACK
}
}
/* If queen cannot be placed in any row in
this column col, then return false */
return false;
}
/* This function solves the N Queen problem using
Backtracking. It mainly uses solveNQUtil() to
solve the problem. It returns false if queens
cannot be placed, otherwise return true and
prints placement of queens in the form of 1s.
Please note that there may be more than one
solutions, this function prints one of the
feasible solutions.*/
boolean solveNQ()
{
int board[][] = {{0, 0, 0, 0},
{0, 0, 0, 0},
{0, 0, 0, 0},
{0, 0, 0, 0}
};
if (solveNQUtil(board, 0) == false)
{
System.out.print("Solution does not exist");
return false;
}
printSolution(board);
return true;
}
// driver program to test above function
public static void main(String args[])
{
NQueenProblem Queen = new NQueenProblem();
Queen.solveNQ();
}
}
// This code is contributed by Abhishek Shankhadhar
 0  0  1  0
 1  0  0  0
 0  0  0  1
 0  1  0  0
In the above tree, a node represents a function call and a branch represents a candidate element. The root node has 4 children. In other
words, the root considers every element of the set as a different branch. The next-level sub-trees correspond to the subsets that include
the parent node. The branches at each level represent the tuple element to be considered. For example, if we are at level 1, tuple_vector[1]
can take any value of the four branches generated. If we are at level 2 of the leftmost node, tuple_vector[2] can take any value of the three
branches generated, and so on.
For example, the leftmost child of the root generates all those subsets that include w[1]. Similarly, the second child of the root generates
all those subsets that include w[2] and exclude w[1].
As we go down along the depth of the tree, we add elements to the sum so far, and if the sum satisfies the explicit constraints, we continue
to generate child nodes. Whenever the constraints are not met, we stop generating sub-trees of that node and backtrack to the previous node
to explore the nodes not yet explored. In many scenarios, this saves a considerable amount of processing time.
The tree should give a clue for implementing the backtracking algorithm (try it yourself). It prints all those subsets whose sum adds up to
the given number. We need to explore the nodes along the breadth and depth of the tree. Generating nodes along the breadth is controlled by a
loop, and nodes along the depth are generated using recursion (post-order traversal). Pseudo code is given below:
if(subset is satisfying the constraint)
print the subset
exclude the current element and consider next element
else
generate the nodes of present level along breadth of tree and
recur for next levels
Following is a C implementation of subset sum using a variable-size tuple vector. Note that the following program explores all
possibilities, similar to exhaustive search. It demonstrates how backtracking can be used. See the next code to verify how we can optimize
the backtracking solution.
#include <stdio.h>
#include <stdlib.h>
#define ARRAYSIZE(a) (sizeof(a))/(sizeof(a[0]))
static int total_nodes;
// prints subset found
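The exhaustive variable-size tuple search described above can be sketched in Python as follows. The names subset_sums, chosen and explore are hypothetical, and this is an illustration of the idea rather than a translation of the C program. The loop generates nodes along the breadth, and the recursive call generates nodes along the depth.

```python
def subset_sums(weights, target):
    """Return every subset (as a list) of weights that sums to target."""
    found = []
    chosen = []  # the variable-size tuple vector

    def explore(start, total):
        if total == target:
            found.append(list(chosen))
        # Breadth: each remaining element is tried as the next member.
        for i in range(start, len(weights)):
            chosen.append(weights[i])
            # Depth: extend the tuple with elements after position i.
            explore(i + 1, total + weights[i])
            chosen.pop()  # backtrack

    explore(0, 0)
    return found
```

Like the exhaustive program described above, this sketch visits every subset and applies no pruning.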
The power of backtracking appears when we combine explicit and implicit constraints and stop generating nodes as soon as these checks fail.
We can improve the above algorithm by strengthening the constraint checks and presorting the data. Once the array is sorted, we need not
consider the rest of the array as soon as the sum so far exceeds the target number; we can backtrack and check other possibilities.
Similarly, assume the array is presorted and we have found one subset. We generate the next node, excluding the present node, only when
including the next node satisfies the constraints. Given below is an optimized implementation (it prunes the subtree if it does not satisfy
the constraints).
#include <stdio.h>
#include <stdlib.h>
#define ARRAYSIZE(a) (sizeof(a))/(sizeof(a[0]))
static int total_nodes;
// prints subset found
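The pruning idea can be sketched in Python as follows. The name subset_sums_pruned is hypothetical, and the sketch assumes that all weights are positive integers, so that the running sum can only grow as elements are added.

```python
def subset_sums_pruned(weights, target):
    """Subset sum with presorting and pruning (assumes positive weights)."""
    items = sorted(weights)
    found = []
    chosen = []

    def explore(start, total):
        if total == target:
            found.append(list(chosen))
            return  # positive weights: any further element overshoots
        for i in range(start, len(items)):
            if total + items[i] > target:
                break  # sorted input: every later element overshoots too
            chosen.append(items[i])
            explore(i + 1, total + items[i])
            chosen.pop()  # backtrack

    explore(0, 0)
    return found
```

The break statement is where the sub-tree is cut off: because the input is sorted, no sibling to the right of an overshooting element needs to be generated.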
As another approach, we can generate the tree with fixed-size tuples, analogous to a binary pattern. We kill the sub-trees when the
constraints are not satisfied.
Naive Algorithm
Generate all possible configurations of colors and print a configuration that satisfies the given constraints.
while there are untried configurations
{
generate the next configuration
if no adjacent vertices are colored with same color
{
print this configuration;
}
}
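For comparison with the backtracking programs, the naive algorithm can be sketched in Python with itertools.product, which enumerates all m^V color assignments. The name naive_coloring is hypothetical, and the graph is assumed to be an adjacency matrix of 0/1 values.

```python
from itertools import product

def naive_coloring(graph, m):
    """Try every assignment of colors 1..m; return the first valid one."""
    n = len(graph)
    for colors in product(range(1, m + 1), repeat=n):
        # Valid when no edge joins two vertices of the same color.
        if all(not graph[u][v] or colors[u] != colors[v]
               for u in range(n) for v in range(u + 1, n)):
            return list(colors)
    return None
```

Every configuration is built completely before it is checked, which is the work that backtracking avoids by rejecting a partial assignment as soon as a conflict appears.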
C/C++
#include<stdio.h>
#include<stdbool.h>
// Number of vertices in the graph
#define V 4
void printSolution(int color[]);
/* A utility function to check if the current color assignment
is safe for vertex v */
bool isSafe (int v, bool graph[V][V], int color[], int c)
{
for (int i = 0; i < V; i++)
if (graph[v][i] && c == color[i])
return false;
return true;
}
/* A recursive utility function to solve m coloring problem */
bool graphColoringUtil(bool graph[V][V], int m, int color[], int v)
{
/* base case: If all vertices are assigned a color then
return true */
if (v == V)
return true;
/* Consider this vertex v and try different colors */
for (int c = 1; c <= m; c++)
{
/* Check if assignment of color c to v is fine*/
if (isSafe(v, graph, color, c))
{
color[v] = c;
/* recur to assign colors to rest of the vertices */
if (graphColoringUtil (graph, m, color, v+1) == true)
return true;
/* If assigning color c doesn't lead to a solution
then remove it */
color[v] = 0;
}
}
/* If no color can be assigned to this vertex then return false */
return false;
}
/* This function solves the m Coloring problem using Backtracking.
It mainly uses graphColoringUtil() to solve the problem. It returns
false if the m colors cannot be assigned, otherwise return true and
prints assignments of colors to all vertices. Please note that there
may be more than one solutions, this function prints one of the
feasible solutions.*/
bool graphColoring(bool graph[V][V], int m)
{
// Initialize all color values as 0. This initialization is needed
// for correct functioning of isSafe()
int color[V];
for (int i = 0; i < V; i++)
color[i] = 0;
// Call graphColoringUtil() for vertex 0
if (graphColoringUtil(graph, m, color, 0) == false)
{
printf("Solution does not exist");
return false;
}
// Print the solution
printSolution(color);
return true;
}
/* A utility function to print solution */
void printSolution(int color[])
{
printf("Solution Exists:"
" Following are the assigned colors \n");
for (int i = 0; i < V; i++)
printf(" %d ", color[i]);
printf("\n");
}
// driver program to test above function
int main()
{
/* Create following graph and test whether it is 3 colorable
(3)---(2)
| / |
| / |
| / |
(0)---(1)
*/
bool graph[V][V] = {{0, 1, 1, 1},
{1, 0, 1, 0},
{1, 1, 0, 1},
{1, 0, 1, 0},
};
int m = 3; // Number of colors
graphColoring (graph, m);
return 0;
}
Java
/* Java program for solution of M Coloring problem
using backtracking */
public class mColoringProblem {
final int V = 4;
int color[];
/* A utility function to check if the current
color assignment is safe for vertex v */
boolean isSafe(int v, int graph[][], int color[],
int c)
{
for (int i = 0; i < V; i++)
if (graph[v][i] == 1 && c == color[i])
return false;
return true;
}
/* A recursive utility function to solve m
coloring problem */
boolean graphColoringUtil(int graph[][], int m,
int color[], int v)
{
/* base case: If all vertices are assigned
a color then return true */
if (v == V)
return true;
/* Consider this vertex v and try different
colors */
for (int c = 1; c <= m; c++)
{
/* Check if assignment of color c to v
is fine*/
if (isSafe(v, graph, color, c))
{
color[v] = c;
/* recur to assign colors to rest
of the vertices */
if (graphColoringUtil(graph, m,
color, v + 1))
return true;
/* If assigning color c doesn't lead
to a solution then remove it */
color[v] = 0;
}
}
/* If no color can be assigned to this vertex
then return false */
return false;
}
/* This function solves the m Coloring problem using
Backtracking. It mainly uses graphColoringUtil()
to solve the problem. It returns false if the m
colors cannot be assigned, otherwise return true
and prints assignments of colors to all vertices.
Please note that there may be more than one
solutions, this function prints one of the
feasible solutions.*/
boolean graphColoring(int graph[][], int m)
{
// Initialize all color values as 0. This
// initialization is needed correct functioning
// of isSafe()
color = new int[V];
for (int i = 0; i < V; i++)
color[i] = 0;
// Call graphColoringUtil() for vertex 0
if (!graphColoringUtil(graph, m, color, 0))
{
System.out.println("Solution does not exist");
return false;
}
Naive Algorithm
Generate all possible configurations of vertices and print a configuration that satisfies the given constraints. There will be n! (n factorial)
configurations.
while there are untried configurations
{
generate the next configuration
if ( there are edges between two consecutive vertices of this
configuration and there is an edge from the last vertex to
the first ).
{
print this configuration;
break;
}
}
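The naive algorithm can be sketched in Python with itertools.permutations. The name naive_hamiltonian_cycle is hypothetical, and the graph is assumed to be an adjacency matrix of 0/1 values. As a small simplification that is an assumption of this sketch, vertex 0 is fixed as the starting point, so (n-1)! orderings are generated instead of n!.

```python
from itertools import permutations

def naive_hamiltonian_cycle(graph):
    """Return the first vertex ordering that forms a cycle, or None."""
    n = len(graph)
    if n == 0:
        return None
    # A cycle can start anywhere, so fix vertex 0 as the first vertex.
    for rest in permutations(range(1, n)):
        cycle = (0,) + rest
        # Consecutive vertices must be adjacent, including last -> first.
        if all(graph[cycle[i]][cycle[(i + 1) % n]] for i in range(n)):
            return list(cycle)
    return None
```

The factorial number of orderings makes this usable only for very small graphs.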
Backtracking Algorithm
Create an empty path array and add vertex 0 to it. Add other vertices, starting from vertex 1. Before adding a vertex, check whether it is
adjacent to the previously added vertex and not already added. If we find such a vertex, we add it as part of the solution. If we do not
find such a vertex, we return false.
Implementation of Backtracking solution
Following are implementations of the Backtracking solution.
C/C++
/* C/C++ program for solution of Hamiltonian Cycle problem
using backtracking */
#include<stdio.h>
#include<stdbool.h>
// Number of vertices in the graph
#define V 5
void printSolution(int path[]);
/* A utility function to check if the vertex v can be added at
index 'pos' in the Hamiltonian Cycle constructed so far (stored
in 'path[]') */
bool isSafe(int v, bool graph[V][V], int path[], int pos)
{
/* Check if this vertex is an adjacent vertex of the previously
added vertex. */
if (graph [ path[pos-1] ][ v ] == 0)
return false;
/* Check if the vertex has already been included.
This step can be optimized by creating an array of size V */
for (int i = 0; i < pos; i++)
if (path[i] == v)
return false;
return true;
}
/* A recursive utility function to solve hamiltonian cycle problem */
bool hamCycleUtil(bool graph[V][V], int path[], int pos)
{
/* base case: If all vertices are included in Hamiltonian Cycle */
if (pos == V)
{
// And if there is an edge from the last included vertex to the
// first vertex
if ( graph[ path[pos-1] ][ path[0] ] == 1 )
return true;
else
return false;
}
// Try different vertices as a next candidate in Hamiltonian Cycle.
// We don't try for 0 as we included 0 as starting point in hamCycle()
for (int v = 1; v < V; v++)
{
/* Check if this vertex can be added to Hamiltonian Cycle */
if (isSafe(v, graph, path, pos))
{
path[pos] = v;
/* recur to construct rest of the path */
if (hamCycleUtil (graph, path, pos+1) == true)
return true;
/* If adding vertex v doesn't lead to a solution,
then remove it */
path[pos] = -1;
}
}
/* If no vertex can be added to Hamiltonian Cycle constructed so far,
then return false */
return false;
}
/* This function solves the Hamiltonian Cycle problem using Backtracking.
It mainly uses hamCycleUtil() to solve the problem. It returns false
if there is no Hamiltonian Cycle possible, otherwise return true and
prints the path. Please note that there may be more than one solutions,
this function prints one of the feasible solutions. */
bool hamCycle(bool graph[V][V])
{
int path[V]; // V is a compile-time constant, so a local array avoids leaking heap memory
for (int i = 0; i < V; i++)
path[i] = -1;
/* Let us put vertex 0 as the first vertex in the path. If there is
a Hamiltonian Cycle, then the path can be started from any point
of the cycle as the graph is undirected */
path[0] = 0;
if ( hamCycleUtil(graph, path, 1) == false )
{
printf("\nSolution does not exist");
return false;
}
printSolution(path);
return true;
}
/* A utility function to print solution */
void printSolution(int path[])
{
printf ("Solution Exists:"
" Following is one Hamiltonian Cycle \n");
for (int i = 0; i < V; i++)
printf(" %d ", path[i]);
// Let us print the first vertex again to show the complete cycle
printf(" %d ", path[0]);
printf("\n");
}
// driver program to test above function
int main()
{
/* Let us create the following graph
(0)--(1)--(2)
 |   / \   |
 |  /   \  |
 | /     \ |
(3)-------(4)
*/
bool graph1[V][V] = {{0, 1, 0, 1, 0},
{1, 0, 1, 1, 1},
{0, 1, 0, 0, 1},
{1, 1, 0, 0, 1},
{0, 1, 1, 1, 0},
};
// Print the solution
hamCycle(graph1);
/* Let us create the following graph
(0)--(1)--(2)
 |   / \   |
 |  /   \  |
 | /     \ |
(3)       (4)
*/
bool graph2[V][V] = {{0, 1, 0, 1, 0},
{1, 0, 1, 1, 1},
{0, 1, 0, 0, 1},
{1, 1, 0, 0, 0},
{0, 1, 1, 0, 0},
};
// Print the solution
hamCycle(graph2);
return 0;
}
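The comment in isSafe() notes that the "already included" check can be optimized with an array of size V. A minimal sketch of that idea follows; it assumes the graph is passed as a std::vector adjacency matrix, and the names Graph, extend and findHamCycle are hypothetical, chosen only for this illustration.

```cpp
#include <cassert>
#include <vector>

// Adjacency matrix: g[i][j] is 1 when edge i-j exists (assumed layout).
using Graph = std::vector<std::vector<int>>;

// Tries to extend path[0..pos-1] to a Hamiltonian cycle.
// used[v] mirrors membership of v in the path, so the "already included"
// test costs O(1) instead of a scan over path[].
bool extend(const Graph& g, std::vector<int>& path,
            std::vector<bool>& used, int pos)
{
    const int n = static_cast<int>(g.size());
    if (pos == n)  // all vertices placed: need an edge back to the start
        return g[path[pos - 1]][path[0]] == 1;
    for (int v = 1; v < n; v++) {
        if (used[v] || g[path[pos - 1]][v] == 0)
            continue;
        path[pos] = v;
        used[v] = true;
        if (extend(g, path, used, pos + 1))
            return true;
        used[v] = false;  // backtrack
        path[pos] = -1;
    }
    return false;
}

// Returns one Hamiltonian cycle starting at vertex 0, or an empty
// vector when no cycle exists.
std::vector<int> findHamCycle(const Graph& g)
{
    const int n = static_cast<int>(g.size());
    assert(n > 0);
    std::vector<int> path(n, -1);
    std::vector<bool> used(n, false);
    path[0] = 0;
    used[0] = true;
    if (!extend(g, path, used, 1))
        return {};
    return path;
}
```

The search order is the same as in the listing above; only the membership test changes, at the cost of O(V) extra space.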
Java
/* Java program for solution of Hamiltonian Cycle problem
using backtracking */
class HamiltonianCycle
{
final int V = 5;
int path[];
/* A utility function to check if the vertex v can be
added at index 'pos'in the Hamiltonian Cycle
constructed so far (stored in 'path[]') */
boolean isSafe(int v, int graph[][], int path[], int pos)
{
/* Check if this vertex is an adjacent vertex of
the previously added vertex. */
if (graph[path[pos - 1]][v] == 0)
return false;
/* Check if the vertex has already been included.
This step can be optimized by creating an array
of size V */
for (int i = 0; i < pos; i++)
if (path[i] == v)
return false;
return true;
}
/* A recursive utility function to solve hamiltonian
cycle problem */
boolean hamCycleUtil(int graph[][], int path[], int pos)
{
/* base case: If all vertices are included in
Hamiltonian Cycle */
if (pos == V)
{
// And if there is an edge from the last included
// vertex to the first vertex
if (graph[path[pos - 1]][path[0]] == 1)
return true;
else
return false;
}
// Try different vertices as a next candidate in
// Hamiltonian Cycle. We don't try for 0 as we
// included 0 as starting point in hamCycle()
for (int v = 1; v < V; v++)
{
/* Check if this vertex can be added to Hamiltonian
Cycle */
if (isSafe(v, graph, path, pos))
{
path[pos] = v;
/* recur to construct rest of the path */
if (hamCycleUtil(graph, path, pos + 1) == true)
return true;
/* If adding vertex v doesn't lead to a solution,
then remove it */
path[pos] = -1;
}
}
/* If no vertex can be added to Hamiltonian Cycle
constructed so far, then return false */
return false;
}
/* This function solves the Hamiltonian Cycle problem using
Backtracking. It mainly uses hamCycleUtil() to solve the
problem. It returns false if there is no Hamiltonian Cycle
possible, otherwise return true and prints the path.
Please note that there may be more than one solutions,
this function prints one of the feasible solutions. */
int hamCycle(int graph[][])
{
path = new int[V];
for (int i = 0; i < V; i++)
path[i] = -1;
/* Let us put vertex 0 as the first vertex in the path.
If there is a Hamiltonian Cycle, then the path can be
started from any point of the cycle as the graph is
undirected */
path[0] = 0;
if (hamCycleUtil(graph, path, 1) == false)
{
System.out.println("\nSolution does not exist");
return 0;
}
printSolution(path);
return 1;
}
/* A utility function to print solution */
void printSolution(int path[])
{
System.out.println("Solution Exists: Following" +
" is one Hamiltonian Cycle");
for (int i = 0; i < V; i++)
System.out.print(" " + path[i] + " ");
// Let us print the first vertex again to show the
// complete cycle
System.out.println(" " + path[0] + " ");
}
// driver program to test above function
public static void main(String args[])
{
HamiltonianCycle hamiltonian =
new HamiltonianCycle();
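/* Let us create the following graph
(0)--(1)--(2)
 |   / \   |
 |  /   \  |
 | /     \ |
(3)-------(4) */
int graph1[][] = {{0, 1, 0, 1, 0},
{1, 0, 1, 1, 1},
{0, 1, 0, 0, 1},
{1, 1, 0, 0, 1},
{0, 1, 1, 1, 0},
};
// Print the solution
hamiltonian.hamCycle(graph1);
/* Let us create the following graph
(0)--(1)--(2)
 |   / \   |
 |  /   \  |
 | /     \ |
(3)       (4) */
int graph2[][] = {{0, 1, 0, 1, 0},
{1, 0, 1, 1, 1},
{0, 1, 0, 0, 1},
{1, 1, 0, 0, 0},
{0, 1, 1, 0, 0},
};
// Print the solution
hamiltonian.hamCycle(graph2);
}
}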
Naive Algorithm
The Naive Algorithm is to generate all possible configurations of numbers from 1 to 9 to fill the empty cells. Try every configuration one by one
until the correct configuration is found.
Backtracking Algorithm
Like other backtracking problems, Sudoku can be solved by assigning numbers to empty cells one at a time. Before assigning a number, we
check whether it is safe to assign, that is, that the same number is not already present in the current row, the current column, or the current 3x3 subgrid.
After checking for safety, we assign the number and recursively check whether this assignment leads to a solution. If the assignment doesn't
lead to a solution, we try the next number for the current empty cell. If none of the numbers (1 to 9) leads to a solution, we return false.
Find row, col of an unassigned cell
If there is none, return true
For digits from 1 to 9
a) If there is no conflict for digit at row,col
assign digit to row,col and recursively try fill in rest of grid
b) If recursion successful, return true
c) Else, remove digit and try another
If all digits have been tried and nothing worked, return false
Following is C++ implementation for Sudoku problem. It prints the completely filled grid as output.
// A Backtracking program in C++ to solve Sudoku problem
#include <stdio.h>
// UNASSIGNED is used for empty cells in sudoku grid
#define UNASSIGNED 0
// N is used for size of Sudoku grid. Size will be NxN
#define N 9
// This function finds an entry in grid that is still unassigned
bool FindUnassignedLocation(int grid[N][N], int &row, int &col);
// Checks whether it will be legal to assign num to the given row,col
bool isSafe(int grid[N][N], int row, int col, int num);
/* Takes a partially filled-in grid and attempts to assign values to
all unassigned locations in such a way to meet the requirements
for Sudoku solution (non-duplication across rows, columns, and boxes) */
bool SolveSudoku(int grid[N][N])
{
int row, col;
// If there is no unassigned location, we are done
if (!FindUnassignedLocation(grid, row, col))
return true; // success!
// consider digits 1 to 9
for (int num = 1; num <= 9; num++)
{
// if looks promising
if (isSafe(grid, row, col, num))
{
// make tentative assignment
grid[row][col] = num;
// return, if success, yay!
if (SolveSudoku(grid))
return true;
// failure, unmake & try again
grid[row][col] = UNASSIGNED;
}
}
return false; // this triggers backtracking
}
/* Searches the grid to find an entry that is still unassigned. If
found, the reference parameters row, col will be set to the location
that is unassigned, and true is returned. If no unassigned entries
remain, false is returned. */
bool FindUnassignedLocation(int grid[N][N], int &row, int &col)
{
for (row = 0; row < N; row++)
for (col = 0; col < N; col++)
if (grid[row][col] == UNASSIGNED)
return true;
return false;
}
/* Returns a boolean which indicates whether any assigned entry
in the specified row matches the given number. */
bool UsedInRow(int grid[N][N], int row, int num)
{
for (int col = 0; col < N; col++)
if (grid[row][col] == num)
return true;
return false;
}
/* Returns a boolean which indicates whether any assigned entry
in the specified column matches the given number. */
bool UsedInCol(int grid[N][N], int col, int num)
{
for (int row = 0; row < N; row++)
if (grid[row][col] == num)
return true;
return false;
}
/* Returns a boolean which indicates whether any assigned entry
within the specified 3x3 box matches the given number. */
bool UsedInBox(int grid[N][N], int boxStartRow, int boxStartCol, int num)
{
for (int row = 0; row < 3; row++)
for (int col = 0; col < 3; col++)
if (grid[row+boxStartRow][col+boxStartCol] == num)
return true;
return false;
}
/* Returns a boolean which indicates whether it will be legal to assign
num to the given row,col location. */
bool isSafe(int grid[N][N], int row, int col, int num)
{
/* Check if 'num' is not already placed in current row,
current column and current 3x3 box */
return !UsedInRow(grid, row, num) &&
!UsedInCol(grid, col, num) &&
!UsedInBox(grid, row - row%3 , col - col%3, num);
}
/* A utility function to print grid */
void printGrid(int grid[N][N])
{
for (int row = 0; row < N; row++)
{
for (int col = 0; col < N; col++)
printf("%2d", grid[row][col]);
printf("\n");
}
}
/* Driver Program to test above functions */
int main()
{
// 0 means unassigned cells
int grid[N][N] = {{3, 0, 6, 5, 0, 8, 4, 0, 0},
{5, 2, 0, 0, 0, 0, 0, 0, 0},
{0, 8, 7, 0, 0, 0, 0, 3, 1},
{0, 0, 3, 0, 1, 0, 0, 8, 0},
{9, 0, 0, 8, 6, 3, 0, 0, 5},
{0, 5, 0, 0, 9, 0, 6, 0, 0},
{1, 3, 0, 0, 0, 0, 2, 5, 0},
{0, 0, 0, 0, 0, 0, 0, 7, 4},
{0, 0, 5, 2, 0, 6, 3, 0, 0}};
if (SolveSudoku(grid) == true)
printGrid(grid);
else
printf("No solution exists");
return 0;
}
Output:
3 1 6 5 7 8 4 9 2
5 2 9 1 3 4 7 6 8
4 8 7 6 2 9 5 3 1
2 6 3 4 1 5 9 8 7
9 7 4 8 6 3 1 2 5
8 5 1 7 9 2 6 4 3
1 3 8 9 4 7 2 5 6
6 9 2 3 5 1 8 7 4
7 4 5 2 8 6 3 1 9
References:
http://see.stanford.edu/materials/icspacs106b/H19-RecBacktrackExamples.pdf
Tug of War
Given a set of n integers, divide the set into two subsets of size n/2 each such that the difference between the sums of the two subsets is as small as
possible. If n is even, both subsets must have exactly n/2 elements; if n is odd, one subset must have (n-1)/2 elements and the other
must have (n+1)/2 elements.
For example, let given set be {3, 4, 5, -3, 100, 1, 89, 54, 23, 20}, the size of set is 10. Output for this set should be {4, 100, 1, 23, 20} and {3,
5, -3, 89, 54}. Both output subsets are of size 5 and sum of elements in both subsets is same (148 and 148).
Let us consider another example where n is odd. Let given set be {23, 45, -34, 12, 0, 98, -99, 4, 189, -1, 4}. The output subsets should be {45,
-34, 12, 98, -1} and {23, 0, -99, 4, 189, 4}. The sums of elements in two subsets are 120 and 121 respectively.
The following solution tries every possible subset of half size. Once one subset of half size is formed, the remaining elements form the other subset. We
initialize the current set as empty and build it one element at a time. There are two possibilities for every element: either it is part of the current set, or it is part of the
remaining elements (the other subset). We consider both possibilities for every element. When the size of the current set becomes n/2, we check whether
this solution is better than the best solution found so far. If it is, we update the best solution.
Following is C++ implementation for Tug of War problem. It prints the required arrays.
#include <iostream>
#include <stdlib.h>
#include <limits.h>
using namespace std;
// function that tries every possible solution by calling itself recursively
void TOWUtil(int* arr, int n, bool* curr_elements, int no_of_selected_elements,
bool* soln, int* min_diff, int sum, int curr_sum, int curr_position)
{
// stop when the current position goes out of bounds
if (curr_position == n)
return;
// checks that the numbers of elements left are not less than the
// number of elements required to form the solution
if ((n/2 - no_of_selected_elements) > (n - curr_position))
return;
// consider the cases when current element is not included in the solution
TOWUtil(arr, n, curr_elements, no_of_selected_elements,
soln, min_diff, sum, curr_sum, curr_position+1);
// add the current element to the solution
no_of_selected_elements++;
curr_sum = curr_sum + arr[curr_position];
curr_elements[curr_position] = true;
// checks if a solution is formed
if (no_of_selected_elements == n/2)
{
// checks if the solution formed is better than the best solution so far.
// Compare the exact difference |sum - 2*curr_sum|; sum/2 truncates for odd sums
if (abs(sum - 2*curr_sum) < *min_diff)
{
*min_diff = abs(sum - 2*curr_sum);
for (int i = 0; i<n; i++)
soln[i] = curr_elements[i];
}
}
else
{
// consider the cases where current element is included in the solution
TOWUtil(arr, n, curr_elements, no_of_selected_elements, soln,
min_diff, sum, curr_sum, curr_position+1);
}
// removes current element before returning to the caller of this function
curr_elements[curr_position] = false;
}
// main function that generate an arr
void tugOfWar(int *arr, int n)
{
// the boolean array that records the inclusion or exclusion of each element
// in the current set. The excluded numbers automatically form the other set
bool* curr_elements = new bool[n];
// The inclusion/exclusion array for final solution
bool* soln = new bool[n];
Output:
The first subset is: 45 -34 12 98 -1
The second subset is: 23 0 -99 4 189 4
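The listing above stops partway through tugOfWar(). As a rough, self-contained sketch of the same include/exclude idea (not the original program), the following C++ code returns both subsets and their difference. The names Split, searchSubsets and tugOfWarSplit are hypothetical, and the sketch compares the exact difference |total - 2*currSum| rather than working with sum/2.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical result type for this sketch.
struct Split {
    std::vector<int> first;   // chosen subset of size n/2
    std::vector<int> second;  // remaining elements
    long long diff;           // |sum(first) - sum(second)|
};

// Decides for arr[pos], arr[pos+1], ... whether each joins the chosen
// subset; 'need' is how many more elements must still be chosen.
void searchSubsets(const std::vector<int>& arr, std::size_t pos, std::size_t need,
                   long long currSum, long long total,
                   std::vector<bool>& chosen, std::vector<bool>& best,
                   long long& bestDiff)
{
    if (need == 0) {  // a subset of size n/2 is complete
        long long diff = std::llabs(total - 2 * currSum);
        if (diff < bestDiff) {
            bestDiff = diff;
            best = chosen;
        }
        return;
    }
    if (arr.size() - pos < need)  // not enough elements left
        return;
    chosen[pos] = true;           // include arr[pos]
    searchSubsets(arr, pos + 1, need - 1, currSum + arr[pos], total,
                  chosen, best, bestDiff);
    chosen[pos] = false;          // exclude arr[pos]
    searchSubsets(arr, pos + 1, need, currSum, total, chosen, best, bestDiff);
}

Split tugOfWarSplit(const std::vector<int>& arr)
{
    long long total = 0;
    for (int x : arr)
        total += x;
    std::vector<bool> chosen(arr.size(), false), best(arr.size(), false);
    long long bestDiff = LLONG_MAX;
    searchSubsets(arr, 0, arr.size() / 2, 0, total, chosen, best, bestDiff);

    Split result;
    result.diff = bestDiff;
    for (std::size_t i = 0; i < arr.size(); i++)
        (best[i] ? result.first : result.second).push_back(arr[i]);
    return result;
}
```

For the two example sets in the text, this sketch reports differences of 0 and 1 respectively; which of several equally good splits it returns depends on the search order.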
The goal here is to assign each letter a digit from 0 to 9 so that the arithmetic works out correctly. The rules are that all occurrences of a letter must
be assigned the same digit, and no digit can be assigned to more than one letter.
First, create a list of all the characters that need assigning to pass to Solve
If all characters are assigned, return true if puzzle is solved, false otherwise
Otherwise, consider the first unassigned character
for (every possible choice among the digits not in use)
make that choice and then recursively try to assign the rest of the characters
if recursion successful, return true
if !successful, unmake assignment and try another digit
If all digits have been tried and nothing worked, return false to trigger backtracking
/* ExhaustiveSolve
 * ---------------
 * This is the "not-very-smart" version of cryptarithmetic solver. It takes
* the puzzle itself (with the 3 strings for the two addends and sum) and a
* string of letters as yet unassigned. If no more letters to assign
* then we've hit a base-case, if the current letter-to-digit mapping solves
* the puzzle, we're done, otherwise we return false to trigger backtracking
* If we have letters to assign, we take the first letter from that list, and
* try assigning it the digits from 0 to 9 and then recursively working
* through solving puzzle from here. If we manage to make a good assignment
* that works, we've succeeded, else we need to unassign that choice and try
* another digit. This version is easy to write, since it uses a simple
* approach (quite similar to permutations if you think about it) but it is
* not so smart because it doesn't take into account the structure of the
* puzzle constraints (for example, once the two digits for the addends have
* been assigned, there is no reason to try anything other than the correct
* digit for the sum) yet it tries a lot of useless combos regardless
*/
bool ExhaustiveSolve(puzzleT puzzle, string lettersToAssign)
{
if (lettersToAssign.empty()) // no more choices to make
return PuzzleSolved(puzzle); // checks arithmetic to see if works
for (int digit = 0; digit <= 9; digit++) // try all digits
{
if (AssignLetterToDigit(lettersToAssign[0], digit))
{
if (ExhaustiveSolve(puzzle, lettersToAssign.substr(1)))
return true;
UnassignLetterFromDigit(lettersToAssign[0], digit);
}
}
return false; // nothing worked, need to backtrack
}
The algorithm above has a lot in common with the permutations algorithm: it essentially creates every arrangement of the mapping from
characters to digits and tries each one until one works or all have been tried without success. For a large puzzle, this could take a while.
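ExhaustiveSolve() relies on helpers (puzzleT, PuzzleSolved, AssignLetterToDigit, UnassignLetterFromDigit) that are not shown here. The following is a minimal self-contained sketch of the same exhaustive idea. The Puzzle struct and the functions makePuzzle, lettersOf, valueOf, puzzleSolved and exhaustiveSolve are assumptions made for this illustration. It also assumes uppercase letters A to Z and, like the rules stated above, does not forbid leading zeros.

```cpp
#include <cassert>
#include <string>

// Hypothetical puzzle state: digitOf[c - 'A'] is the digit assigned to
// letter c (or -1), inUse[d] tells whether digit d is already taken.
struct Puzzle {
    std::string addend1, addend2, sum;
    int digitOf[26];
    bool inUse[10];
};

Puzzle makePuzzle(const std::string& a, const std::string& b, const std::string& c)
{
    Puzzle p;
    p.addend1 = a;
    p.addend2 = b;
    p.sum = c;
    for (int& d : p.digitOf) d = -1;
    for (bool& u : p.inUse) u = false;
    return p;
}

// Collects each distinct letter once, in order of first appearance.
std::string lettersOf(const Puzzle& p)
{
    std::string letters;
    for (char ch : p.addend1 + p.addend2 + p.sum)
        if (letters.find(ch) == std::string::npos)
            letters += ch;
    return letters;
}

// Numeric value of a word under the current (complete) assignment.
long long valueOf(const Puzzle& p, const std::string& word)
{
    long long v = 0;
    for (char ch : word)
        v = v * 10 + p.digitOf[ch - 'A'];
    return v;
}

bool puzzleSolved(const Puzzle& p)
{
    return valueOf(p, p.addend1) + valueOf(p, p.addend2) == valueOf(p, p.sum);
}

bool exhaustiveSolve(Puzzle& p, const std::string& lettersToAssign)
{
    if (lettersToAssign.empty())      // no more choices to make
        return puzzleSolved(p);
    const int idx = lettersToAssign[0] - 'A';
    for (int digit = 0; digit <= 9; digit++) {
        if (p.inUse[digit])
            continue;                 // digit already given to another letter
        p.digitOf[idx] = digit;
        p.inUse[digit] = true;
        if (exhaustiveSolve(p, lettersToAssign.substr(1)))
            return true;
        p.digitOf[idx] = -1;          // unmake the choice and try the next digit
        p.inUse[digit] = false;
    }
    return false;                     // nothing worked, need to backtrack
}
```

A call such as exhaustiveSolve(p, lettersOf(p)) on makePuzzle("SEND", "MORE", "MONEY") leaves one consistent assignment in p.digitOf. Because leading zeros are allowed in this sketch, it need not be the classic 9567 + 1085 = 10652 solution.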
A smarter algorithm could take into account the structure of the puzzle and avoid going down dead-end paths. For example, if we assign the
characters starting from the ones place and moving to the left, at each stage, we can verify the correctness of what we have so far before we
continue onwards. This definitely complicates the code but leads to a tremendous improvement in efficiency, making it much more feasible to solve
large puzzles.
The pseudocode below has more special cases, but follows the same general design:
Start by examining the rightmost digit of the topmost row, with a carry of 0
If we are beyond the leftmost digit of the puzzle, return true if no carry, false otherwise
If we are currently trying to assign a char in one of the addends
If char already assigned, just recur on row beneath this one, adding value into sum
If not assigned, then
for (every possible choice among the digits not in use)
make that choice and then on row beneath this one, if successful, return true
if !successful, unmake assignment and try another digit
return false if no assignment worked to trigger backtracking
Else if trying to assign a char in the sum
If char assigned & matches correct,
recur on next column to the left with carry, if success return true,
If char assigned & doesn't match, return false
If char unassigned & correct digit already used, return false
If char unassigned & correct digit unused,
assign it and recur on next column to left with carry, if success return true
return false to trigger backtracking
Source:
http://see.stanford.edu/materials/icspacs106b/H19-RecBacktrackExamples.pdf
return 0;
}
Example:
ar1[] = {1, 12, 15, 26, 38}
ar2[] = {2, 13, 17, 30, 45}
Implementation:
#include<stdio.h>
int max(int, int); /* to get maximum of two integers */
int min(int, int); /* to get minimum of two integers */
int median(int [], int); /* to get median of a sorted array */
Example:
ar1[] = {1, 5, 7, 10, 13}
ar2[] = {11, 15, 23, 30, 45}
The middle element of ar1[] is 7. Compare 7 with 23 and 30: since 7 is smaller than both, move to the right in ar1[]. Binary search in
{10, 13} picks 10. Now compare 10 with 15 and 23. Since 10 is smaller than both 15 and 23, again move to the right. Only 13 remains
on the right side. Since 13 is greater than 11 and smaller than 15, terminate here. The median is 12 (the average of 11 and 13).
Implementation:
#include<stdio.h>
int getMedianRec(int ar1[], int ar2[], int left, int right, int n);
/* This function returns median of ar1[] and ar2[].
Assumptions in this function:
Both ar1[] and ar2[] are sorted arrays
Both have n elements */
int getMedian(int ar1[], int ar2[], int n)
{
return getMedianRec(ar1, ar2, 0, n-1, n);
}
/* A recursive function to get the median of ar1[] and ar2[]
using binary search */
int getMedianRec(int ar1[], int ar2[], int left, int right, int n)
{
int i, j;
/* We have reached at the end (left or right) of ar1[] */
if (left > right)
return getMedianRec(ar2, ar1, 0, n-1, n);
i = (left + right)/2;
j = n - i - 1; /* Index of ar2[] */
/* Recursion terminates here.*/
if (ar1[i] > ar2[j] && (j == n-1 || ar1[i] <= ar2[j+1]))
{
/* ar1[i] is decided as median 2, now select the median 1
(element just before ar1[i] in merged array) to get the
average of both*/
if (i == 0 || ar2[j] > ar1[i-1])
return (ar1[i] + ar2[j])/2;
else
return (ar1[i] + ar1[i-1])/2;
}
/*Search in left half of ar1[]*/
else if (ar1[i] > ar2[j] && j != n-1 && ar1[i] > ar2[j+1])
return getMedianRec(ar1, ar2, left, i-1, n);
/*Search in right half of ar1[]*/
else /* ar1[i] is smaller than both ar2[j] and ar2[j+1]*/
return getMedianRec(ar1, ar2, i+1, right, n);
}
/* Driver program to test above function */
int main()
{
int ar1[] = {1, 12, 15, 26, 38};
int ar2[] = {2, 13, 17, 30, 45};
int n1 = sizeof(ar1)/sizeof(ar1[0]);
int n2 = sizeof(ar2)/sizeof(ar2[0]);
if (n1 == n2)
printf("Median is %d", getMedian(ar1, ar2, n1));
else
printf("Doesn't work for arrays of unequal size");
getchar();
return 0;
}
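A simple way to cross-check the binary-search version is a linear scan in merge order. The sketch below is an O(n) alternative rather than the method above; the function name medianByMerge is hypothetical. It returns a double, whereas getMedian() above uses integer division and therefore truncates a median such as 11.5 to 11.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Median of the 2n elements of two sorted arrays of equal size n, found
// by walking the merge order up to merged positions n-1 and n.
double medianByMerge(const std::vector<int>& a, const std::vector<int>& b)
{
    const std::size_t n = a.size();
    assert(n > 0 && b.size() == n);
    std::size_t i = 0, j = 0;
    int prev = 0, curr = 0;
    for (std::size_t count = 0; count <= n; count++) {
        prev = curr;
        // Take from a[] when b[] is exhausted or a[i] is the smaller head.
        if (j == n || (i < n && a[i] <= b[j]))
            curr = a[i++];
        else
            curr = b[j++];
    }
    // prev and curr are now the two middle elements of the merged order.
    return (static_cast<double>(prev) + curr) / 2.0;
}
```

On the two examples from the text this gives 16 for {1, 12, 15, 26, 38} with {2, 13, 17, 30, 45} and 12 for {1, 5, 7, 10, 13} with {11, 15, 23, 30, 45}.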
References:
http://en.wikipedia.org/wiki/Median
http://ocw.alfaisal.edu/NR/rdonlyres/Electrical-Engineering-and-Computer-Science/6-046JFall-2005/30C68118-E436-4FE3-8C796BAFBB07D935/0/ps9sol.pdf
Asked by Snehal
Implementation:
#include <stdio.h>
#include <stdlib.h>
int _mergeSort(int arr[], int temp[], int left, int right);
int merge(int arr[], int temp[], int left, int mid, int right);
/* This function sorts the input array and returns the
number of inversions in the array */
int mergeSort(int arr[], int array_size)
{
int *temp = (int *)malloc(sizeof(int)*array_size);
int inv_count = _mergeSort(arr, temp, 0, array_size - 1);
free(temp); /* release the scratch buffer */
return inv_count;
}
/* An auxiliary recursive function that sorts the input array and
returns the number of inversions in the array. */
int _mergeSort(int arr[], int temp[], int left, int right)
{
int mid, inv_count = 0;
if (right > left)
{
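/* Divide the array into two parts and call _mergeSort()
for each of the parts */
mid = (right + left)/2;
/* Inversion count will be the sum of inversions in the left part,
the right part, and number of inversions in merging */
inv_count = _mergeSort(arr, temp, left, mid);
inv_count += _mergeSort(arr, temp, mid+1, right);
/* Merge the two parts; merge() expects the start of the right part */
inv_count += merge(arr, temp, left, mid+1, right);
}
return inv_count;
}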
/* This function merges two sorted arrays and returns inversion count in
the arrays.*/
int merge(int arr[], int temp[], int left, int mid, int right)
{
int i, j, k;
int inv_count = 0;
i = left; /* i is index for left subarray*/
j = mid; /* j is index for right subarray*/
k = left; /* k is index for resultant merged subarray*/
while ((i <= mid - 1) && (j <= right))
{
if (arr[i] <= arr[j])
{
temp[k++] = arr[i++];
}
else
{
temp[k++] = arr[j++];
Note that above code modifies (or sorts) the input array. If we want to count only inversions then we need to create a copy of original array and
call mergeSort() on copy.
Time Complexity: O(nlogn)
Algorithmic Paradigm: Divide and Conquer
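Since the listing above is cut off inside merge(), here is a compact self-contained sketch of the same merge-sort-based counting. It is an illustration rather than the original code: the names sortAndCount and countInversions are hypothetical, ranges are half-open [lo, hi), and the array is taken by value so that the caller's data is left unsorted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sorts arr[lo..hi) and returns the number of inversions inside that range.
long long sortAndCount(std::vector<int>& arr, std::vector<int>& temp,
                       std::size_t lo, std::size_t hi)
{
    if (hi - lo < 2)
        return 0;
    const std::size_t mid = lo + (hi - lo) / 2;
    long long inv = sortAndCount(arr, temp, lo, mid)
                  + sortAndCount(arr, temp, mid, hi);

    std::size_t i = lo, j = mid, k = lo;
    while (i < mid && j < hi) {
        if (arr[i] <= arr[j]) {
            temp[k++] = arr[i++];
        } else {
            // arr[j] is smaller than every remaining element of the left
            // half, so it forms (mid - i) inversions at once.
            temp[k++] = arr[j++];
            inv += static_cast<long long>(mid - i);
        }
    }
    while (i < mid) temp[k++] = arr[i++];
    while (j < hi)  temp[k++] = arr[j++];
    for (k = lo; k < hi; k++)
        arr[k] = temp[k];
    return inv;
}

// Counts inversions without modifying the caller's array (works on a copy).
long long countInversions(std::vector<int> arr)
{
    std::vector<int> temp(arr.size());
    return sortAndCount(arr, temp, 0, arr.size());
}
```

For example, {1, 20, 6, 4, 5} has the five inversions (20,6), (20,4), (20,5), (6,4) and (6,5).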
References:
http://www.cs.umd.edu/class/fall2009/cmsc451/lectures/Lec08-inversions.pdf
http://www.cp.eng.chula.ac.th/~piak/teaching/algo/algo2008/count-inv.htm
The Brute force solution is O(n^2), compute the distance between each pair and return the smallest. We can calculate the smallest distance in
O(nLogn) time using Divide and Conquer strategy. In this post, a O(n x (Logn)^2) approach is discussed. We will be discussing a O(nLogn)
approach in a separate post.
Algorithm
Following are the detailed steps of an O(n (Logn)^2) algorithm.
Input: An array of n points P[]
Output: The smallest distance between two points in the given array.
As a pre-processing step, input array is sorted according to x coordinates.
1) Find the middle point in the sorted array, we can take P[n/2] as middle point.
2) Divide the given array in two halves. The first subarray contains points from P[0] to P[n/2]. The second subarray contains points from P[n/2+1]
to P[n-1].
3) Recursively find the smallest distances in both subarrays. Let the distances be dl and dr. Find the minimum of dl and dr. Let the minimum be d.
4) From the above 3 steps, we have an upper bound d on the minimum distance. Now we need to consider the pairs such that one point of the pair is from the left
half and the other is from the right half. Consider the vertical line passing through P[n/2] and find all points whose x coordinate is closer
than d to the middle vertical line. Build an array strip[] of all such points.
5) Sort the array strip[] according to y coordinates. This step is O(nLogn). It can be optimized to O(n) by recursively sorting and merging.
6) Find the smallest distance in strip[]. This is tricky. At first glance it seems to be an O(n^2) step, but it is actually O(n). It can be proved
geometrically that for every point in the strip, we only need to check at most 7 points after it (note that the strip is sorted according to y coordinate).
7) Finally return the minimum of d and distance calculated in above step (step 6)
Implementation
Following is C/C++ implementation of the above algorithm.
// A divide and conquer program in C/C++ to find the smallest distance from a
// given set of points.
#include <stdio.h>
#include <float.h>
#include <stdlib.h>
#include <math.h>
// A structure to represent a Point in 2D plane
struct Point
{
int x, y;
};
/* Following two functions are needed for library function qsort().
Refer: http://www.cplusplus.com/reference/clibrary/cstdlib/qsort/ */
// Needed to sort array of points according to X coordinate
int compareX(const void* a, const void* b)
{
Point *p1 = (Point *)a, *p2 = (Point *)b;
return (p1->x - p2->x);
}
// Needed to sort array of points according to Y coordinate
int compareY(const void* a, const void* b)
{
Point *p1 = (Point *)a, *p2 = (Point *)b;
return (p1->y - p2->y);
}
// A utility function to find the distance between two points
float dist(Point p1, Point p2)
{
return sqrt( (p1.x - p2.x)*(p1.x - p2.x) +
(p1.y - p2.y)*(p1.y - p2.y)
);
}
// A Brute Force method to return the smallest distance between two points
// in P[] of size n
float bruteForce(Point P[], int n)
{
float min = FLT_MAX;
for (int i = 0; i < n; ++i)
for (int j = i+1; j < n; ++j)
if (dist(P[i], P[j]) < min)
min = dist(P[i], P[j]);
return min;
}
// A utility function to find minimum of two float values
float min(float x, float y)
{
return (x < y)? x : y;
}
// A utility function to find the distance between the closest points of
// strip of given size. All points in strip[] are sorted according to
// y coordinate. They all have an upper bound on minimum distance as d.
// Note that this method seems to be a O(n^2) method, but it's a O(n)
// method as the inner loop runs at most 6 times
float stripClosest(Point strip[], int size, float d)
{
float min = d; // Initialize the minimum distance as d
qsort(strip, size, sizeof(Point), compareY);
// Pick all points one by one and try the next points till the difference
// between y coordinates is smaller than d.
// This is a proven fact that this loop runs at most 6 times
for (int i = 0; i < size; ++i)
for (int j = i+1; j < size && (strip[j].y - strip[i].y) < min; ++j)
if (dist(strip[i],strip[j]) < min)
min = dist(strip[i], strip[j]);
return min;
}
// A recursive function to find the smallest distance. The array P contains
// all points sorted according to x coordinate
float closestUtil(Point P[], int n)
{
// If there are 2 or 3 points, then use brute force
if (n <= 3)
return bruteForce(P, n);
Output:
The smallest distance is 1.414214
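The listing above breaks off inside closestUtil(). As a hedged, self-contained sketch of steps 1 to 7 (not the original program), the following C++ code uses std::vector and std::sort. The names Pt, distanceBetween, bruteForceRange, closestInRange and closestPair are hypothetical, and coordinates are stored as double for simplicity.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

struct Pt { double x, y; };  // hypothetical point type for this sketch

double distanceBetween(const Pt& a, const Pt& b)
{
    return std::hypot(a.x - b.x, a.y - b.y);
}

// Quadratic scan of p[lo..hi), used only for ranges of at most 3 points.
double bruteForceRange(const std::vector<Pt>& p, std::size_t lo, std::size_t hi)
{
    double best = std::numeric_limits<double>::infinity();
    for (std::size_t i = lo; i < hi; ++i)
        for (std::size_t j = i + 1; j < hi; ++j)
            best = std::min(best, distanceBetween(p[i], p[j]));
    return best;
}

// p must be sorted by x; returns the smallest distance within p[lo..hi).
double closestInRange(const std::vector<Pt>& p, std::size_t lo, std::size_t hi)
{
    if (hi - lo <= 3)
        return bruteForceRange(p, lo, hi);
    const std::size_t mid = lo + (hi - lo) / 2;
    const double midX = p[mid].x;
    double d = std::min(closestInRange(p, lo, mid), closestInRange(p, mid, hi));

    // Steps 4 and 5: collect points within d of the dividing line, sort by y.
    std::vector<Pt> strip;
    for (std::size_t i = lo; i < hi; ++i)
        if (std::fabs(p[i].x - midX) < d)
            strip.push_back(p[i]);
    std::sort(strip.begin(), strip.end(),
              [](const Pt& a, const Pt& b) { return a.y < b.y; });

    // Step 6: compare each point only with later points whose y gap is below d.
    for (std::size_t i = 0; i < strip.size(); ++i)
        for (std::size_t j = i + 1;
             j < strip.size() && strip[j].y - strip[i].y < d; ++j)
            d = std::min(d, distanceBetween(strip[i], strip[j]));
    return d;
}

// Sorts a copy of the points by x (pre-processing step) and solves it.
double closestPair(std::vector<Pt> pts)
{
    assert(pts.size() >= 2);
    std::sort(pts.begin(), pts.end(),
              [](const Pt& a, const Pt& b) { return a.x < b.x; });
    return closestInRange(pts, 0, pts.size());
}
```

With the sample points (2,3), (12,30), (40,50), (5,1), (12,10) and (3,4), the closest pair is (2,3) and (3,4), at distance sqrt(2), which agrees with the output shown above.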
Time Complexity: Let the time complexity of the above algorithm be T(n). Let us assume that we use an O(nLogn) sorting algorithm. The above algorithm
divides all points into two sets and recursively calls itself on both sets. After dividing, it finds the strip in O(n) time, sorts the strip in O(nLogn) time and
finally finds the closest points in the strip in O(n) time. So T(n) can be expressed as follows:
T(n) = 2T(n/2) + O(n) + O(nLogn) + O(n)
T(n) = 2T(n/2) + O(nLogn)
T(n) = O(n x Logn x Logn)
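To see where the extra Logn factor comes from, the recurrence can be unrolled level by level. The sketch below assumes that n is a power of 2 and writes the O(nLogn) term as c n log_2 n for some constant c; both are simplifying assumptions made only for this derivation.

```latex
\begin{aligned}
T(n) &= 2\,T(n/2) + c\,n\log_2 n \\
     &= \sum_{i=0}^{\log_2 n - 1} 2^i \cdot c\,\frac{n}{2^i}\,\log_2\frac{n}{2^i} \;+\; n\,T(1) \\
     &= c\,n \sum_{i=0}^{\log_2 n - 1} \left(\log_2 n - i\right) \;+\; n\,T(1) \\
     &= c\,n\,\frac{\log_2 n\,(\log_2 n + 1)}{2} \;+\; n\,T(1) \\
     &= O\!\left(n \log^2 n\right)
\end{aligned}
```

Each of the log_2 n levels of recursion contributes at most c n log_2 n work, which is where the squared logarithm comes from.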
Notes
1) Time complexity can be improved to O(nLogn) by optimizing step 5 of the above algorithm. We will soon be discussing the optimized solution
in a separate post.
2) The code finds smallest distance. It can be easily modified to find the points with smallest distance.
3) The code uses the library function qsort(), which is often implemented as quick sort and can then be O(n^2) in the worst case. To have the upper bound as O(n (Logn)^2), an O(nLogn) sorting algorithm like
merge sort or heap sort can be used.
References:
http://www.cs.umd.edu/class/fall2013/cmsc451/Lects/lect10.pdf
http://www.youtube.com/watch?v=vS4Zn1a9KUc
http://www.youtube.com/watch?v=T3T7T8Ym20M
http://en.wikipedia.org/wiki/Closest_pair_of_points_problem
In the above method, we do 8 multiplications for matrices of size N/2 x N/2 and 4 additions. Addition of two matrices takes O(N^2) time. So the
time complexity can be written as
T(N) = 8T(N/2) + O(N^2)
From the Master Theorem, the time complexity of the above method is O(N^3),
which is unfortunately the same as the above naive method.
Simple Divide and Conquer also leads to O(N^3). Can there be a better way?
In the above divide and conquer method, the main contributor to the high time complexity is the 8 recursive calls. The idea of Strassen's method is to
reduce the number of recursive calls to 7. Strassen's method is similar to the above simple divide and conquer method in the sense that it also
divides matrices into sub-matrices of size N/2 x N/2 as shown in the above diagram, but in Strassen's method, the four sub-matrices of the result are
calculated using the following formulae.
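One standard way of writing the seven products is sketched below in C++ for the scalar 2 x 2 case. The letters a to h and p1 to p7, and the names Mat2, strassen2x2 and naive2x2, are assumptions chosen for this illustration. In the actual algorithm a to h are N/2 x N/2 blocks, each product is a recursive call, and the additions and subtractions are matrix operations.

```cpp
#include <array>
#include <cassert>

using Mat2 = std::array<std::array<long long, 2>, 2>;

// Multiplies A = [[a, b], [c, d]] by B = [[e, f], [g, h]] using
// 7 multiplications instead of 8.
Mat2 strassen2x2(const Mat2& A, const Mat2& B)
{
    const long long a = A[0][0], b = A[0][1], c = A[1][0], d = A[1][1];
    const long long e = B[0][0], f = B[0][1], g = B[1][0], h = B[1][1];

    const long long p1 = a * (f - h);
    const long long p2 = (a + b) * h;
    const long long p3 = (c + d) * e;
    const long long p4 = d * (g - e);
    const long long p5 = (a + d) * (e + h);
    const long long p6 = (b - d) * (g + h);
    const long long p7 = (a - c) * (e + f);

    Mat2 C;
    C[0][0] = p5 + p4 - p2 + p6;  // equals a*e + b*g
    C[0][1] = p1 + p2;            // equals a*f + b*h
    C[1][0] = p3 + p4;            // equals c*e + d*g
    C[1][1] = p1 + p5 - p3 - p7;  // equals c*f + d*h
    return C;
}

// Straightforward product with 8 multiplications, for comparison.
Mat2 naive2x2(const Mat2& A, const Mat2& B)
{
    Mat2 C;
    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++)
            C[i][j] = A[i][0] * B[0][j] + A[i][1] * B[1][j];
    return C;
}
```

With 7 recursive calls the recurrence becomes T(N) = 7T(N/2) + O(N^2), which the Master Theorem solves to O(N^(Log2 7)), roughly O(N^2.81).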
Generally, Strassen's method is not preferred for practical applications for the following reasons.
1) The constants used in Strassen's method are high, and for a typical application the naive method works better.
2) For sparse matrices, there are better methods especially designed for them.
3) The submatrices in recursion take extra space.
4) Because of the limited precision of computer arithmetic on noninteger values, larger errors accumulate in Strassen's algorithm than in the naive
method (Source: CLRS Book)
References:
Introduction to Algorithms 3rd Edition by Clifford Stein, Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest
https://www.youtube.com/watch?v=LOLebQ8nKHA
https://www.youtube.com/watch?v=QXY4RskLQcI
// A Brute Force method to return the smallest distance between two points
// in P[] of size n
float bruteForce(Point P[], int n)
{
float min = FLT_MAX;
for (int i = 0; i < n; ++i)
for (int j = i+1; j < n; ++j)
if (dist(P[i], P[j]) < min)
min = dist(P[i], P[j]);
return min;
}
// A utility function to find minimum of two float values
float min(float x, float y)
{
return (x < y)? x : y;
}
// A utility function to find the distance between the closest points of
// strip of given size. All points in strip[] are sorted according to
// y coordinate. They all have an upper bound on minimum distance as d.
// Note that this method seems to be a O(n^2) method, but it's a O(n)
// method as the inner loop runs at most 6 times
float stripClosest(Point strip[], int size, float d)
{
float min = d; // Initialize the minimum distance as d
// Pick all points one by one and try the next points till the difference
// between y coordinates is smaller than d.
// This is a proven fact that this loop runs at most 6 times
for (int i = 0; i < size; ++i)
for (int j = i+1; j < size && (strip[j].y - strip[i].y) < min; ++j)
if (dist(strip[i],strip[j]) < min)
min = dist(strip[i], strip[j]);
return min;
}
// A recursive function to find the smallest distance. The array Px contains
// all points sorted according to x coordinates and Py contains all points
// sorted according to y coordinates
float closestUtil(Point Px[], Point Py[], int n)
{
// If there are 2 or 3 points, then use brute force
if (n <= 3)
return bruteForce(Px, n);
// Find the middle point
int mid = n/2;
Point midPoint = Px[mid];
// Divide points in y sorted array around the vertical line.
// Assumption: All x coordinates are distinct.
Point Pyl[mid+1]; // y sorted points on left of vertical line
Point Pyr[n-mid-1]; // y sorted points on right of vertical line
int li = 0, ri = 0; // indexes of left and right subarrays
for (int i = 0; i < n; i++)
{
if (Py[i].x <= midPoint.x)
Pyl[li++] = Py[i];
else
Pyr[ri++] = Py[i];
}
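// Consider the vertical line passing through the middle point
// calculate the smallest distance dl on left of middle point and
// dr on right side. Pyl holds the mid+1 points with x <= midPoint.x
// (that is, Px[0..mid]) and Pyr the remaining n-mid-1 points, so the
// sizes passed to the recursive calls must match that split.
float dl = closestUtil(Px, Pyl, mid + 1);
float dr = closestUtil(Px + mid + 1, Pyr, n - mid - 1);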
// Find the smaller of two distances
float d = min(dl, dr);
// Build an array strip[] that contains points close (closer than d)
// to the line passing through the middle point
Point strip[n];
int j = 0;
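for (int i = 0; i < n; i++)
if (abs(Py[i].x - midPoint.x) < d)
strip[j++] = Py[i];
// Find the closest points in strip. Return the minimum of d and the
// closest distance in strip[]
return min(d, stripClosest(strip, j, d));
}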
Output:
The smallest distance is 1.41421
Time Complexity: Let the time complexity of the above algorithm be T(n). Let us assume that we use an O(nLogn) sorting algorithm. The above
algorithm divides all points into two sets and recursively calls itself for the two sets. After dividing, it finds the strip in O(n) time. It also takes O(n) time to
divide the Py array around the mid vertical line. Finally, it finds the closest points in the strip in O(n) time. So T(n) can be expressed as follows
T(n) = 2T(n/2) + O(n) + O(n) + O(n)
T(n) = 2T(n/2) + O(n)
T(n) = O(nLogn)
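The last step of the recurrence can be unrolled explicitly. The derivation below is a sketch under two assumptions that are not stated in the text: n is a power of two, and the three linear terms together are bounded by cn for some constant c.

```latex
\begin{aligned}
T(n) &= 2\,T(n/2) + cn \\
     &= 4\,T(n/4) + 2cn \\
     &= 2^{k}\,T(n/2^{k}) + k\,cn \\
     &= n\,T(1) + cn\log_2 n \qquad (k = \log_2 n) \\
     &= O(n \log n)
\end{aligned}
```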
References:
http://www.cs.umd.edu/class/fall2013/cmsc451/Lects/lect10.pdf
http://www.youtube.com/watch?v=vS4Zn1a9KUc
http://www.youtube.com/watch?v=T3T7T8Ym20M
http://en.wikipedia.org/wiki/Closest_pair_of_points_problem
Orientation of an ordered triplet (a, b, c) can be counterclockwise, clockwise or colinear.
[Figure: the different possible orientations of (a, b, c)]
Note the word ordered here. Orientation of (a, b, c) may be different from orientation of (c, b, a).
How is Orientation useful here?
Two segments (p1,q1) and (p2,q2) intersect if and only if one of the following two conditions is verified
1. General Case:
(p1, q1, p2) and (p1, q1, q2) have different orientations and
(p2, q2, p1) and (p2, q2, q1) have different orientations
2. Special Case:
(p1, q1, p2), (p1, q1, q2), (p2, q2, p1), and (p2, q2, q1) are all colinear and
the x-projections of (p1, q1) and (p2, q2) intersect and
the y-projections of (p1, q1) and (p2, q2) intersect
Examples of General Case:
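struct Point
{
int x;
int y;
};
// Given three colinear points p, q, r, the function checks if
// point q lies on line segment 'pr'
bool onSegment(Point p, Point q, Point r)
{
if (q.x <= max(p.x, r.x) && q.x >= min(p.x, r.x) &&
q.y <= max(p.y, r.y) && q.y >= min(p.y, r.y))
return true;
return false;
}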
// To find orientation of ordered triplet (p, q, r).
// The function returns following values
// 0 --> p, q and r are colinear
// 1 --> Clockwise
// 2 --> Counterclockwise
int orientation(Point p, Point q, Point r)
{
// See the 10th slide from the following link for derivation of the formula
// http://www.dcs.gla.ac.uk/~pat/52233/slides/Geometry1x1.pdf
int val = (q.y - p.y) * (r.x - q.x) - (q.x - p.x) * (r.y - q.y);
if (val == 0) return 0; // colinear
return (val > 0)? 1: 2; // clock or counterclock wise
}
// The main function that returns true if line segment 'p1q1'
// and 'p2q2' intersect.
bool doIntersect(Point p1, Point q1, Point p2, Point q2)
{
// Find the four orientations needed for general and
// special cases
int o1 = orientation(p1, q1, p2);
int o2 = orientation(p1, q1, q2);
int o3 = orientation(p2, q2, p1);
int o4 = orientation(p2, q2, q1);
// General case
if (o1 != o2 && o3 != o4)
return true;
// Special Cases
// p1, q1 and p2 are colinear and p2 lies on segment p1q1
if (o1 == 0 && onSegment(p1, p2, q1)) return true;
// p1, q1 and q2 are colinear and q2 lies on segment p1q1
if (o2 == 0 && onSegment(p1, q2, q1)) return true;
// p2, q2 and p1 are colinear and p1 lies on segment p2q2
if (o3 == 0 && onSegment(p2, p1, q2)) return true;
// p2, q2 and q1 are colinear and q1 lies on segment p2q2
if (o4 == 0 && onSegment(p2, q1, q2)) return true;
return false; // Doesn't fall in any of the above cases
}
// Driver program to test above functions
int main()
{
struct Point p1 = {1, 1}, q1 = {10, 1};
struct Point p2 = {1, 2}, q2 = {10, 2};
doIntersect(p1, q1, p2, q2)? cout << "Yes\n": cout << "No\n";
p1 = {10, 0}, q1 = {0, 10};
p2 = {0, 0}, q2 = {10, 10};
doIntersect(p1, q1, p2, q2)? cout << "Yes\n": cout << "No\n";
p1 = {-5, -5}, q1 = {0, 0};
p2 = {1, 1}, q2 = {10, 10};
doIntersect(p1, q1, p2, q2)? cout << "Yes\n": cout << "No\n";
return 0;
}
Output:
No
Yes
No
Sources:
http://www.dcs.gla.ac.uk/~pat/52233/slides/Geometry1x1.pdf
Introduction to Algorithms 3rd Edition by Clifford Stein, Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest
// 1 --> Clockwise
// 2 --> Counterclockwise
int orientation(Point p, Point q, Point r)
{
int val = (q.y - p.y) * (r.x - q.x) - (q.x - p.x) * (r.y - q.y);
if (val == 0) return 0; // colinear
return (val > 0)? 1: 2; // clock or counterclock wise
}
// The function that returns true if line segment 'p1q1'
// and 'p2q2' intersect.
bool doIntersect(Point p1, Point q1, Point p2, Point q2)
{
// Find the four orientations needed for general and
// special cases
int o1 = orientation(p1, q1, p2);
int o2 = orientation(p1, q1, q2);
int o3 = orientation(p2, q2, p1);
int o4 = orientation(p2, q2, q1);
// General case
if (o1 != o2 && o3 != o4)
return true;
// Special Cases
// p1, q1 and p2 are colinear and p2 lies on segment p1q1
if (o1 == 0 && onSegment(p1, p2, q1)) return true;
// p1, q1 and q2 are colinear and q2 lies on segment p1q1
if (o2 == 0 && onSegment(p1, q2, q1)) return true;
// p2, q2 and p1 are colinear and p1 lies on segment p2q2
if (o3 == 0 && onSegment(p2, p1, q2)) return true;
// p2, q2 and q1 are colinear and q1 lies on segment p2q2
if (o4 == 0 && onSegment(p2, q1, q2)) return true;
return false; // Doesn't fall in any of the above cases
}
// Returns true if the point p lies inside the polygon[] with n vertices
bool isInside(Point polygon[], int n, Point p)
{
// There must be at least 3 vertices in polygon[]
if (n < 3) return false;
// Create a point for line segment from p to infinite
Point extreme = {INF, p.y};
// Count intersections of the above line with sides of polygon
int count = 0, i = 0;
do
{
int next = (i+1)%n;
// Check if the line segment from 'p' to 'extreme' intersects
// with the line segment from 'polygon[i]' to 'polygon[next]'
if (doIntersect(polygon[i], polygon[next], p, extreme))
{
// If the point 'p' is colinear with line segment 'i-next',
// then check if it lies on segment. If it lies, return true,
// otherwise false
if (orientation(polygon[i], p, polygon[next]) == 0)
return onSegment(polygon[i], p, polygon[next]);
count++;
}
i = next;
} while (i != 0);
// Return true if count is odd, false otherwise
return count&1; // Same as (count%2 == 1)
}
// Driver program to test above functions
int main()
{
Point polygon1[] = {{0, 0}, {10, 0}, {10, 10}, {0, 10}};
int n = sizeof(polygon1)/sizeof(polygon1[0]);
Output:
No
Yes
Yes
Yes
No
No
Time Complexity: O(n) where n is the number of vertices in the given polygon.
Source:
http://www.dcs.gla.ac.uk/~pat/52233/slides/Geometry1x1.pdf
3)
0)
0)
3)
Time Complexity: For every point on the hull we examine all the other points to determine the next point. Time complexity is Θ(m * n) where n is
the number of input points and m is the number of output or hull points (m <= n). In the worst case, time complexity is O(n^2). The worst case occurs when all
the points are on the hull (m = n).
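The m * n bound can be read off a direct implementation of the wrapping step. The following is a minimal sketch, not the original listing: `P`, `cross` and `jarvisHull` are hypothetical names, and the input is assumed to contain at least three distinct points with no three of them colinear and a unique leftmost point.

```cpp
#include <vector>

struct P { int x, y; };  // hypothetical point type for this sketch

// > 0 if a -> b -> c turns counterclockwise, < 0 if clockwise, 0 if colinear.
long long cross(P a, P b, P c)
{
    return 1LL * (b.x - a.x) * (c.y - a.y) - 1LL * (b.y - a.y) * (c.x - a.x);
}

// Gift wrapping: returns the hull vertices in counterclockwise order.
// Assumes no duplicates and no three colinear points (assumption of this sketch).
std::vector<P> jarvisHull(const std::vector<P>& pts)
{
    int n = static_cast<int>(pts.size());
    std::vector<P> hull;
    if (n < 3) return hull;

    int start = 0;
    for (int i = 1; i < n; ++i)        // the leftmost point is always on the hull
        if (pts[i].x < pts[start].x) start = i;

    int p = start;
    do {                               // runs once per hull point: m iterations
        hull.push_back(pts[p]);
        int q = (p + 1) % n;           // current candidate for the next hull point
        for (int r = 0; r < n; ++r)    // examine all points: n steps
            if (cross(pts[p], pts[q], pts[r]) < 0)  // r lies to the right of p -> q
                q = r;
        p = q;
    } while (p != start);
    return hull;
}
```

The outer loop executes once per hull vertex and the inner loop scans all n points, which is where the m * n product comes from.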
We will soon be discussing other algorithms for finding convex hulls.
Sources:
http://www.cs.uiuc.edu/~jeffe/teaching/373/notes/x05-convexhull.pdf
http://www.dcs.gla.ac.uk/~pat/52233/slides/Hull1x1.pdf
What should be the sorting criteria? Computing actual angles would be inefficient since trigonometric functions are not simple to evaluate. The
idea is to use the orientation to compare angles without actually computing them (see the compare() function below).
Phase 2 (Accept or Reject Points): Once we have the closed path, the next step is to traverse the path and remove concave points on this path.
How to decide which point to remove and which to keep? Again, orientation helps here. The first two points in sorted array are always part of
Convex Hull. For remaining points, we keep track of recent three points, and find the angle formed by them. Let the three points be prev(p),
curr(c) and next(n). If orientation of these points (considering them in same order) is not counterclockwise, we discard c, otherwise we keep it.
Following diagram shows step by step process of this phase (Source of these diagrams is Ref 2).
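The angle comparison from the sorting step can be sketched with an orientation test alone. This is a minimal illustration, not the original compare() function: `Pt`, `orient`, `distSq` and `sortByPolarAngle` are hypothetical names, the orientation is expressed as a signed cross product rather than the 0/1/2 codes used elsewhere, and the first element of the array is assumed to already be the bottom-most point with all points distinct.

```cpp
#include <algorithm>
#include <vector>

struct Pt { int x, y; };  // hypothetical point type for this sketch

// Cross product of (q - p) and (r - p):
// > 0 means p -> q -> r turns counterclockwise, < 0 clockwise, 0 colinear.
long long orient(Pt p, Pt q, Pt r)
{
    return 1LL * (q.x - p.x) * (r.y - p.y) - 1LL * (q.y - p.y) * (r.x - p.x);
}

long long distSq(Pt a, Pt b)
{
    return 1LL * (a.x - b.x) * (a.x - b.x) + 1LL * (a.y - b.y) * (a.y - b.y);
}

// Sorts pts[1..] by polar angle around pts[0] without computing any angle.
// Assumes pts[0] is the bottom-most (then left-most) point.
void sortByPolarAngle(std::vector<Pt>& pts)
{
    if (pts.size() < 2) return;
    const Pt p0 = pts[0];
    std::sort(pts.begin() + 1, pts.end(), [p0](Pt a, Pt b) {
        long long o = orient(p0, a, b);
        if (o != 0) return o > 0;              // a first if p0 -> a -> b turns left
        return distSq(p0, a) < distSq(p0, b);  // same angle: nearer point first
    });
}
```

Only integer multiplications and subtractions are involved, so no trigonometric function is evaluated during the sort.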
Output:
(0, 3)
(4, 4)
(3, 1)
(0, 0)
Time Complexity: Let n be the number of input points. The algorithm takes O(nLogn) time if we use a O(nLogn) sorting algorithm.
The first step (finding the bottom-most point) takes O(n) time. The second step (sorting points) takes O(nLogn) time. In third step, every element
is pushed and popped at most one time. So the third step to process points one by one takes O(n) time, assuming that the stack operations take
O(1) time. Overall complexity is O(n) + O(nLogn) + O(n) which is O(nLogn)
Example:
Let us consider the following example. There are 5 line segments 1, 2, 3, 4 and 5. The dotted green lines show sweep lines.
Following are the steps followed by the algorithm. All points from left to right are processed one by one. We maintain a self-balancing binary search
tree.
Left end point of line segment 1 is processed: 1 is inserted into the Tree. The tree contains 1. No intersection.
Left end point of line segment 2 is processed: Intersection of 1 and 2 is checked. No intersection. 2 is inserted into the Tree. The tree contains 1, 2.
Left end point of line segment 3 is processed: Intersection of 3 with 1 is checked. No intersection. 3 is inserted into the Tree. The tree contains 2, 1, 3.
Right end point of line segment 1 is processed: 1 is deleted from the Tree. Intersection of 2 and 3 is checked. Intersection of 2 and 3 is
reported. The tree contains 2, 3. Note that the above pseudocode returns at this point. We can continue from here to report all intersection points.
Left end point of line segment 4 is processed: Intersections of line 4 with lines 2 and 3 are checked. No intersection. 4 is inserted into the Tree.
The tree contains 2, 4, 3.
Left end point of line segment 5 is processed: Intersection of 5 with 3 is checked. No intersection. 5 is inserted into the Tree. The tree contains
2, 4, 3, 5.
Right end point of line segment 5 is processed: 5 is deleted from the Tree. The tree contains 2, 4, 3.
Right end point of line segment 4 is processed: 4 is deleted from the Tree. Intersection of 2 with 3 is checked.
Intersection of 2 with 3 is reported. The tree contains 2, 3. Note that the intersection of 2 and 3 is reported again. We can add some logic to
check for duplicates.
Right end points of line segments 2 and 3 are processed: Both are deleted from the tree and the tree becomes empty.
Time Complexity: The first step is sorting, which takes O(nLogn) time. The second step processes 2n points, and processing each point takes
O(Logn) time. Therefore, overall time complexity is O(nLogn).
References:
http://www.cs.uiuc.edu/~jeffe/teaching/373/notes/x06-sweepline.pdf
http://courses.csail.mit.edu/6.006/spring11/lectures/lec24.pdf
http://www.youtube.com/watch?v=dePDHVovJlE
http://www.eecs.wsu.edu/~cook/aa/lectures/l25/node10.html
Output:
Inside
Exercise: Given coordinates of four corners of a rectangle, and a point P. Write a function to check whether P lies inside the given rectangle or
not.
The idea is to pick any point and calculate its distance from the rest of the points. Let the picked point be p. To form a square, the distance of two
points from p must be the same; let this distance be d. The distance of the remaining point from p must be different from d and must be equal to √2 times d. Let
this point with the different distance be q.
The above condition is not good enough, as the point with the different distance can be on the other side. We also need to check that q is at the same
distance from the 2 other points and that this distance is the same as d.
Below is C++ implementation of above idea.
// A C++ program to check if four given points form a square or not.
#include<iostream>
using namespace std;
// Structure of a point in 2D space
struct Point
{
int x, y;
};
// A utility function to find square of distance
// from point 'p' to point 'q'
int distSq(Point p, Point q)
{
return (p.x - q.x)*(p.x - q.x) +
(p.y - q.y)*(p.y - q.y);
}
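// This function returns true if (p1, p2, p3, p4) form a
// square, otherwise false
bool isSquare(Point p1, Point p2, Point p3, Point p4)
{
int d2 = distSq(p1, p2); // from p1 to p2
int d3 = distSq(p1, p3); // from p1 to p3
int d4 = distSq(p1, p4); // from p1 to p4
// If lengths of (p1, p2) and (p1, p3) are same, then
// following conditions must be met to form a square.
// 1) Square of length of (p1, p4) is same as twice
// the square of (p1, p2)
// 2) p4 is at same distance from p2 and p3
if (d2 == d3 && 2*d2 == d4)
{
int d = distSq(p2, p4);
return (d == distSq(p3, p4) && d == d2);
}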
// The below two cases are similar to above case
if (d3 == d4 && 2*d3 == d2)
{
int d = distSq(p2, p3);
return (d == distSq(p2, p4) && d == d3);
}
if (d2 == d4 && 2*d2 == d3)
{
int d = distSq(p2, p3);
return (d == distSq(p3, p4) && d == d2);
}
return false;
}
Output:
Yes
Proof:
Above can be proved by taking the example of 11 in decimal numbers. (In this context 11 in decimal numbers is same as 3 in binary numbers)
If the difference between the sum of digits at odd positions and the sum of digits at even positions is a multiple of 11, then the decimal number is a multiple of 11. Let's see how.
Let's take the example of 2 digit numbers in decimal
AB = 11A - A + B = 11A + (B - A)
So if (B - A) is a multiple of 11 then so is AB.
Let us take 3 digit numbers.
ABC = 99A + A + 11B - B + C = (99A + 11B) + (A + C - B)
So if (A + C - B) is a multiple of 11 then so is ABC.
Let us take 4 digit numbers now.
ABCD = 1001A - A + 99B + B + 11C - C + D
= (1001A + 99B + 11C) + (D + B - A - C)
So, if (B + D - A - C) is a multiple of 11 then so is ABCD.
This can be continued for all decimal numbers.
Above concept can be proved for 3 in binary numbers in the same way.
Time Complexity: O(logn)
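The rule from the proof translates into counting set bits at odd and even positions and recursing on the difference. The block below is a minimal sketch of that idea rather than a definitive implementation; the variable names are illustrative, and the most negative int value is assumed not to occur as input.

```cpp
// Returns true if n is a multiple of 3, using only the bit-position rule.
bool isMultipleOf3(int n)
{
    if (n < 0) n = -n;          // the sign does not affect divisibility
    if (n == 0) return true;
    if (n == 1) return false;

    int oddCount = 0;           // set bits at positions 1, 3, 5, ... (from the right)
    int evenCount = 0;          // set bits at positions 2, 4, 6, ...
    while (n) {
        if (n & 1) ++oddCount;
        n >>= 1;
        if (n & 1) ++evenCount;
        n >>= 1;
    }

    int diff = oddCount - evenCount;
    if (diff < 0) diff = -diff;
    return isMultipleOf3(diff); // the difference is far smaller than n
}
```

For example, 21 is 10101 in binary: three bits at odd positions and none at even positions, so the difference is 3 and the recursive call reports a multiple of 3.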
Program:
#include<stdio.h>
/* Function to check if n is a multiple of 3 */
int isMultipleOf3(int n)
{
int odd_count = 0;
int even_count = 0;
/* Make no positive if +n is multiple of 3
then is -n. We are doing this to avoid
stack overflow in recursion*/
if(n < 0) n = -n;
if(n == 0) return 1;
if(n == 1) return 0;
while(n)
{
/* If odd bit is set then
C/C++
// C program to print all permutations with duplicates allowed
#include <stdio.h>
#include <string.h>
/* Function to swap values at two pointers */
void swap(char *x, char *y)
{
char temp;
temp = *x;
*x = *y;
*y = temp;
}
/* Function to print permutations of string
This function takes three parameters:
1. String
2. Starting index of the string
3. Ending index of the string. */
void permute(char *a, int l, int r)
{
int i;
if (l == r)
printf("%s\n", a);
else
{
for (i = l; i <= r; i++)
{
swap((a+l), (a+i));
permute(a, l+1, r);
swap((a+l), (a+i)); //backtrack
}
}
}
/* Driver program to test above functions */
int main()
{
char str[] = "ABC";
int n = strlen(str);
permute(str, 0, n-1);
return 0;
}
Python
# Python program to print all permutations with
# duplicates allowed
ABC
ACB
BAC
BCA
CBA
CAB
Lucky Numbers
Lucky numbers are a subset of integers. Rather than going into much theory, let us see the process of arriving at lucky numbers.
Take the set of integers
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,...
First, delete every second number, we get following reduced set.
1,3,5,7,9,11,13,15,17,19,...
Now, delete every third number, we get
1, 3, 7, 9, 13, 15, 19,...
Continue this process indefinitely.
Any number that does NOT get deleted due to above process is called lucky.
Therefore, set of lucky numbers is 1, 3, 7, 13, ...
Now, given an integer n, write a function to say whether this number is lucky or not.
bool isLucky(int n)
Algorithm:
Before every iteration, if we calculate the position of the given number, then we can determine whether the number will be deleted in that iteration. Suppose
the calculated position of the given number is P before some iteration, and every Ith number is going to be removed in this iteration. If P < I, then the input number is
lucky; if P is such that P%I == 0 (I is a divisor of P), then the input number is not lucky.
Recursive Way:
#include <stdio.h>
#define bool int
/* Returns 1 if n is a lucky no. otherwise returns 0 */
bool isLucky(int n)
{
static int counter = 2;
/*variable next_position is just for readability of
the program we can remove it and use n only */
int next_position = n;
if(counter > n)
return 1;
if(n%counter == 0)
return 0;
/*calculate next position of input no*/
next_position -= next_position/counter;
counter++;
return isLucky(next_position);
}
/*Driver function to test above function*/
int main()
{
int x = 5;
if( isLucky(x) )
printf("%d is a lucky no.", x);
else
printf("%d is not a lucky no.", x);
getchar();
}
Example:
Let us take an example of 19
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,...
1,3,5,7,9,11,13,15,17,19,...
1,3,7,9,13,15,19,...
1,3,7,13,15,19,...
1,3,7,13,19,...
In next step every 6th no .in sequence will be deleted. 19 will not be deleted after this step because position of 19 is 5th after this step. Therefore,
19 is lucky. Lets see how above C code finds out:
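Current function call    Position after this call    Counter for next call    Next Call
isLucky(19)              10                          3                        isLucky(10)
isLucky(10)              7                           4                        isLucky(7)
isLucky(7)               6                           5                        isLucky(6)
isLucky(6)               5                           6                        isLucky(5)
When isLucky(5) is called, the counter (6) is greater than 5, so the function returns 1.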
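The same position tracking can be written as a loop. This is a sketch under the assumption that n is a positive integer; `isLuckyIterative` is a hypothetical name, and the loop variable plays the role of the static counter in the recursive version.

```cpp
// Returns true if n survives every deletion round of the process above.
bool isLuckyIterative(int n)
{
    int position = n;  // 1-based position of n in the current list
    for (int counter = 2; counter <= position; ++counter) {
        if (position % counter == 0)
            return false;                // n is deleted in this round
        position -= position / counter;  // deletions before n shift it left
    }
    return true;  // counter > position: n can never be deleted again
}
```

Because the counter is a local variable, the function can be called repeatedly, whereas the recursive version keeps its counter in a static variable that is never reset and therefore gives a reliable answer only on the first top-level call.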
Method 2
Just add the numbers in base 14 in same way we add in base 10. Add numerals of both numbers one by one from right to left. If there is a carry
while adding two numerals, consider the carry for adding next numerals.
Let us consider the presentation of base 14 numbers same as hexadecimal numbers
A --> 10
B --> 11
C --> 12
D --> 13
Example:
num1 = 1 2 A
num2 = C D 3
Implementation of Method 2
#include <stdio.h>
#include <stdlib.h>
#include <string.h> /* strlen() */
#include <assert.h> /* assert() */
#define bool int
int getNumeralValue(char );
char getNumeral(int );
/* Function to add two numbers in base 14 */
char *sumBase14(char *num1, char *num2)
{
int l1 = strlen(num1);
int l2 = strlen(num2);
char *res;
int i;
int nml1, nml2, res_nml;
bool carry = 0;
if(l1 != l2)
{
printf("Function doesn't support numbers of different"
" lengths. If you want to add such numbers then"
" prefix smaller number with required no. of zeroes");
getchar();
assert(0);
}
/* Note the size of the allocated memory is two
more than i/p lengths: one for the cases where we
have carry at the last like adding D1 and A1, and
one for the terminating '\0' */
res = (char *)malloc(sizeof(char)*(l1 + 2));
res[l1 + 1] = '\0';
/* Add all numerals from right to left */
for(i = l1-1; i >= 0; i--)
{
/* Get decimal values of the numerals of
i/p numbers*/
nml1 = getNumeralValue(num1[i]);
nml2 = getNumeralValue(num2[i]);
/* Add decimal values of numerals and carry */
res_nml = carry + nml1 + nml2;
/* Check if we have carry for next addition
of numerals */
if(res_nml >= 14)
{
carry = 1;
res_nml -= 14;
}
else
{
carry = 0;
}
res[i+1] = getNumeral(res_nml);
}
/* if there is no carry after last iteration
then result should not include 0th character
of the resultant string */
if(carry == 0)
return (res + 1);
/* if we have carry after last iteration then
result should include 0th character */
res[0] = '1';
return res;
}
/* Function to get value of a numeral
For example it returns 10 for input 'A'
1 for '1', etc */
int getNumeralValue(char num)
{
if( num >= '0' && num <= '9')
return (num - '0');
if( num >= 'A' && num <= 'D')
return (num - 'A' + 10);
/* If we reach this line caller is giving
invalid character so we assert and fail*/
assert(0);
}
/* Function to get numeral for a value.
For example it returns 'A' for input 10
'1' for 1, etc */
char getNumeral(int val)
{
if( val >= 0 && val <= 9)
return (val + '0');
if( val >= 10 && val <= 13)
return (val + 'A' - 10);
/* If we reach this line caller is giving
invalid no. so we assert and fail*/
assert(0);
}
/*Driver program to test above functions*/
int main()
{
char *num1 = "DC2";
char *num2 = "0A3";
printf("Result is %s", sumBase14(num1, num2));
getchar();
return 0;
}
Notes:
Above approach can be used to add numbers in any base. We don't have to do string operations if base is smaller than 10.
You can try extending the above program for numbers of different lengths.
Implementation:
/* Returns the square root of n using the Babylonian method */
float squareRoot(float n)
{
/*We are using n itself as initial approximation
This can definitely be improved */
float x = n;
float y = 1;
float e = 0.000001; /* e decides the accuracy level*/
while(x - y > e)
{
x = (x + y)/2;
y = n/x;
}
return x;
}
/* Driver program to test above function*/
int main()
{
int n = 50;
printf ("Square root of %d is %f", n, squareRoot(n));
getchar();
}
Example:
n = 4 /*n itself is used for initial approximation*/
Initialize x = 4, y = 1
Next Approximation x = (x + y)/2 (= 2.500000),
y = n/x (=1.600000)
Next Approximation x = 2.050000,
y = 1.951220
Next Approximation x = 2.000610,
y = 1.999390
Next Approximation x = 2.000000,
y = 2.000000
Terminate as (x - y) > e no longer holds.
If we are sure that n is a perfect square, then we can use the following integer-only method. For other inputs it cannot return the exact root:
the loop stops as soon as x <= y, so the result is only an integer approximation (for example, for 3 it returns 1).
/*Returns the square root of n. Note that the function
will not work for numbers which are not perfect squares*/
unsigned int squareRoot(int n)
{
int x = n;
int y = 1;
while(x > y)
{
x = (x + y)/2;
y = n/x;
}
return x;
}
/* Driver program to test above function*/
int main()
{
int n = 49;
printf (" root of %d is %d", n, squareRoot(n));
getchar();
}
References:
http://en.wikipedia.org/wiki/Square_root
http://en.wikipedia.org/wiki/Babylonian_method#Babylonian_method
Asked by Snehal
Multiply two integers without using multiplication, division and bitwise operators,
and no loops
Asked by Kapil
By making use of recursion, we can multiply two integers with the given constraints.
To multiply x and y, recursively add x y times.
Thanks to geek4u for suggesting this method.
#include<stdio.h>
/* function to multiply two numbers x and y*/
int multiply(int x, int y)
{
/* 0 multiplied with anything gives 0 */
if(y == 0)
return 0;
/* Add x one by one */
if(y > 0 )
return (x + multiply(x, y-1));
/* the case where y is negative */
if(y < 0 )
return -multiply(x, -y);
}
int main()
{
printf("\n %d", multiply(5, -11));
getchar();
return 0;
}
/* UTILITY FUNCTIONS */
/* Utility function to print array arr[] */
void printArray(int arr[], int arr_size)
{
int i;
for (i = 0; i < arr_size; i++)
printf("%d ", arr[i]);
printf("\n");
}
/* Driver function to test above functions */
int main()
{
int n = 5;
printf("Differnt compositions formed by 1, 2 and 3 of %d are\n", n);
printCompositions(n, 0);
getchar();
return 0;
}
Asked by Aloe
Write your own Power without using multiplication(*) and division(/) operators
Method 1 (Using Nested Loops)
We can calculate power by using repeated addition.
For example to calculate 5^6.
1) First 5 times add 5, we get 25. (5^2)
2) Then 5 times add 25, we get 125. (5^3)
3) Then 5 times add 125, we get 625 (5^4)
4) Then 5 times add 625, we get 3125 (5^5)
5) Then 5 times add 3125, we get 15625 (5^6)
/* Works only if a >= 0 and b >= 0 */
int pow(int a, int b)
{
if (b == 0)
return 1;
int answer = a;
int increment = a;
int i, j;
for(i = 1; i < b; i++)
{
for(j = 1; j < a; j++)
{
answer += increment;
}
increment = answer;
}
return answer;
}
/* driver program to test above function */
int main()
{
printf("\n %d", pow(5, 3));
getchar();
return 0;
}
Write a function int fib(int n) that returns Fn. For example, if n = 0, then fib() should return 0. If n = 1, then it should return 1. For n > 1, it should
return Fn-1 + Fn-2
Following are different methods to get the nth Fibonacci number.
Method 1 ( Use recursion )
A simple method that is a direct recursive implementation of the mathematical recurrence relation given above.
#include<stdio.h>
int fib(int n)
{
if (n <= 1)
return n;
return fib(n-1) + fib(n-2);
}
int main ()
{
int n = 9;
printf("%d", fib(n));
getchar();
return 0;
}
Extra Space: O(n) if we consider the function call stack size, otherwise O(1).
Method 2 ( Use Dynamic Programming )
We can avoid the repeated work done in Method 1 by storing the Fibonacci numbers calculated so far.
#include<stdio.h>
int fib(int n)
{
/* Declare an array to store Fibonacci numbers.
One extra element so that f[1] is valid when n = 0 */
int f[n+2];
int i;
/* 0th and 1st number of the series are 0 and 1*/
f[0] = 0;
f[1] = 1;
for (i = 2; i <= n; i++)
{
/* Add the previous 2 numbers in the series
and store it */
f[i] = f[i-1] + f[i-2];
}
return f[n];
}
int main ()
{
int n = 9;
printf("%d", fib(n));
getchar();
return 0;
}
#include <stdio.h>
/* Helper function that multiplies 2 matrices F and M of size 2*2, and
puts the multiplication result back to F[][] */
void multiply(int F[2][2], int M[2][2]);
/* Helper function that calculates F[][] raise to the power n and puts the
result in F[][]
Note that this function is designed only for fib() and won't work as general
power function */
void power(int F[2][2], int n);
int fib(int n)
{
int F[2][2] = {{1,1},{1,0}};
if (n == 0)
return 0;
power(F, n-1);
return F[0][0];
}
void multiply(int F[2][2], int M[2][2])
{
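int x = F[0][0]*M[0][0] + F[0][1]*M[1][0];
int y = F[0][0]*M[0][1] + F[0][1]*M[1][1];
int z = F[1][0]*M[0][0] + F[1][1]*M[1][0];
int w = F[1][0]*M[0][1] + F[1][1]*M[1][1];
F[0][0] = x;
F[0][1] = y;
F[1][0] = z;
F[1][1] = w;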
}
void power(int F[2][2], int n)
{
int i;
int M[2][2] = {{1,1},{1,0}};
// multiply F by M n - 1 times; F starts as M, so F becomes M^n
for (i = 2; i <= n; i++)
multiply(F, M);
}
/* Driver program to test above function */
int main()
{
int n = 9;
printf("%d", fib(n));
getchar();
return 0;
}
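void multiply(int F[2][2], int M[2][2])
{
int x = F[0][0]*M[0][0] + F[0][1]*M[1][0];
int y = F[0][0]*M[0][1] + F[0][1]*M[1][1];
int z = F[1][0]*M[0][0] + F[1][1]*M[1][0];
int w = F[1][0]*M[0][1] + F[1][1]*M[1][1];
F[0][0] = x;
F[0][1] = y;
F[1][0] = z;
F[1][1] = w;
}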
/* Driver program to test above function */
int main()
{
int n = 9;
printf("%d", fib(9));
getchar();
return 0;
}
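Average of 1 numbers is 10.00
Average of 2 numbers is 15.00
Average of 3 numbers is 20.00
Average of 4 numbers is 25.00
Average of 5 numbers is 30.00
Average of 6 numbers is 35.00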
To print mean of a stream, we need to find out how to find average when a new number is being added to the stream. To do this, all we need is
count of numbers seen so far in the stream, previous average and new number. Let n be the count, prev_avg be the previous average and x be the
new number being added. The average after including x number can be written as (prev_avg*n + x)/(n+1).
#include <stdio.h>
// Returns the new average after including x
float getAvg(float prev_avg, int x, int n)
{
return (prev_avg*n + x)/(n+1);
}
// Prints average of a stream of numbers
void streamAvg(float arr[], int n)
{
float avg = 0;
for(int i = 0; i < n; i++)
{
avg = getAvg(avg, arr[i], i);
printf("Average of %d numbers is %f \n", i+1, avg);
}
return;
}
// Driver program to test above functions
int main()
{
float arr[] = {10, 20, 30, 40, 50, 60};
int n = sizeof(arr)/sizeof(arr[0]);
streamAvg(arr, n);
return 0;
}
The above function getAvg() can be simplified with the following changes. We can avoid passing prev_avg and the number of elements by using static
variables (assuming that only this function is called for the average of the stream). Following is the optimized version.
#include <stdio.h>
// Returns the new average after including x
float getAvg (int x)
{
static int sum, n;
sum += x;
return (((float)sum)/++n);
}
// Prints average of a stream of numbers
void streamAvg(float arr[], int n)
{
float avg = 0;
for(int i = 0; i < n; i++)
{
avg = getAvg(arr[i]);
printf("Average of %d numbers is %f \n", i+1, avg);
}
return;
}
// Driver program to test above functions
int main()
{
float arr[] = {10, 20, 30, 40, 50, 60};
int n = sizeof(arr)/sizeof(arr[0]);
streamAvg(arr, n);
return 0;
}
Solution:
We can solve it recursively. Let count(n) be the function that counts such numbers.
'msd' --> the most significant digit in n
'd' --> number of digits in n.
count(n) = n                                   if n < 3
count(n) = n - 1                               if 3 <= n < 10
count(n) = count(msd) * count(10^(d-1) - 1) +
           count(msd) +
           count(n % (10^(d-1)))               if n >= 10 and msd is not 3
count(n) = count(msd * (10^(d-1)) - 1)         if n >= 10 and msd is 3
Let us understand the solution with n = 578.
count(578) = 4*count(99) + 4 + count(78)
The middle term 4 is added to include numbers 100, 200, 400 and 500.
Let us take n = 35 as another example.
count(35) = count (3*10 - 1) = count(29)
#include <stdio.h>
/* returns count of numbers which are in range from 1 to n and don't contain 3
as a digit */
int count(int n)
{
// Base cases (Assuming n is not negative)
if (n < 3)
return n;
if (n >= 3 && n < 10)
return n-1;
// Calculate 10^(d-1) (10 raise to the power d-1) where d is
// number of digits in n. po will be 100 for n = 578
int po = 1;
while (n/po > 9)
po = po*10;
// find the most significant digit (msd is 5 for 578)
int msd = n/po;
if (msd != 3)
// For 578, total will be 4*count(10^2 - 1) + 4 + count(78)
return count(msd)*count(po - 1) + count(msd) + count(n%po);
else
// For 35, total will be equal to count(29)
return count(msd*po - 1);
}
// Driver program to test above function
int main()
{
printf ("%d ", count(578));
return 0;
}
Output:
385
Magic Square
A magic square of order n is an arrangement of n^2 numbers, usually distinct integers, in a square, such that the n numbers in all rows, all columns,
and both diagonals sum to the same constant. A magic square contains the integers from 1 to n^2.
The constant sum in every row, column and diagonal is called the magic constant or magic sum, M. The magic constant of a normal magic square
depends only on n and has the following value:
M = n(n^2+1)/2
For normal magic squares of order n = 3, 4, 5, ..., the magic constants are: 15, 34, 65, 111, 175, 260, ...
In this post, we will discuss how programmatically we can generate a magic square of size n. Before we go further, consider the below examples:
Magic Square of size 3
----------------------
2 7 6
9 5 1
4 3 8
Sum in each row & each column = 3*(3^2+1)/2 = 15
Magic Square of size 5
----------------------
9 3 22 16 15
2 21 20 14 8
25 19 13 7 1
18 12 6 5 24
11 10 4 23 17
Sum in each row & each column = 5*(5^2+1)/2 = 65
Magic Square of size 7
----------------------
20 12 4 45 37 29 28
11 3 44 36 35 27 19
2 43 42 34 26 18 10
49 41 33 25 17 9 1
40 32 24 16 8 7 48
31 23 15 14 6 47 39
22 21 13 5 46 38 30
Sum in each row & each column = 7*(7^2+1)/2 = 175
Did you find any pattern in which the numbers are stored?
In the squares above (the method applies to odd n), the first number, i.e. 1, is stored at position (n/2, n-1). Let this position be (i,j). The next number is stored at position (i-1, j+1),
where we treat each row and column as a circular array, i.e. they wrap around.
Three conditions hold:
1. The position of next number is calculated by decrementing row number of previous number by 1, and incrementing the column number of
previous number by 1. At any time, if the calculated row position becomes -1, it will wrap around to n-1. Similarly, if the calculated column
position becomes n, it will wrap around to 0.
2. If the magic square already contains a number at the calculated position, calculated column position will be decremented by 2, and calculated
row position will be incremented by 1.
3. If the calculated row position is -1 & calculated column position is n, the new position would be: (0, n-2).
Example:
Magic Square of size 3
----------------------
2 7 6
9 5 1
4 3 8
Steps:
1. position of number 1 = (3/2, 3-1) = (1, 2)
2. position of number 2 = (1-1, 2+1) = (0, 3) = (0, 0) // column wraps around
3. position of number 3 = (0-1, 0+1) = (-1, 1) = (3-1, 1) = (2, 1) // row wraps around
4. position of number 4 = (2-1, 1+1) = (1, 2)
Since, at this position, 1 is there. So, apply condition 2.
new position = (1+1, 2-2) = (2, 0)
5. position of number 5 = (2-1, 0+1) = (1, 1)
6. position of number 6 = (1-1, 1+1) = (0, 2)
7. position of number 7 = (0-1, 2+1) = (-1, 3) // this is tricky, see condition 3
new position = (0, 3-2) = (0, 1)
8. position of number 8 = (0-1, 1+1) = (-1, 2) = (2, 2) // wrap around
9. position of number 9 = (2-1, 2+1) = (1, 3) = (1, 0) // wrap around
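The three conditions can be turned into a short generator. The following is a minimal sketch, assuming n is odd; `generateMagicSquare` is a hypothetical name and a cell value of 0 is used here to mean "not filled yet".

```cpp
#include <vector>

// Builds an n x n magic square for odd n using the three conditions above.
std::vector<std::vector<int>> generateMagicSquare(int n)
{
    std::vector<std::vector<int>> sq(n, std::vector<int>(n, 0));
    int i = n / 2, j = n - 1;             // position of the number 1
    for (int num = 1; num <= n * n; ) {
        if (i == -1 && j == n) {          // condition 3: both indices ran off
            i = 0;
            j = n - 2;
        } else {
            if (j == n) j = 0;            // condition 1: column wraps around
            if (i < 0) i = n - 1;         // condition 1: row wraps around
        }
        if (sq[i][j] != 0) {              // condition 2: cell already filled
            j -= 2;
            i += 1;
            continue;                     // re-check the adjusted position
        }
        sq[i][j] = num++;                 // place the number
        --i;                              // next candidate: one row up,
        ++j;                              // one column to the right
    }
    return sq;
}
```

Calling `generateMagicSquare(3)` reproduces the 3 x 3 square traced in the steps above, with 1 at position (1, 2) and 9 at position (1, 0).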
Output:
The Magic Square for n=7:
Sum of each row or column 175:
20 12 4 45 37 29 28
11 3 44 36 35 27 19
2 43 42 34 26 18 10
49 41 33 25 17 9 1
40 32 24 16 8 7 48
31 23 15 14 6 47 39
22 21 13 5 46 38 30
Sieve of Eratosthenes
Given a number n, print all primes smaller than or equal to n. It is also given that n is a small number.
For example, if n is 10, the output should be "2, 3, 5, 7". If n is 20, the output should be "2, 3, 5, 7, 11, 13, 17, 19".
The sieve of Eratosthenes is one of the most efficient ways to find all primes smaller than n when n is smaller than 10 million or so (Ref Wiki).
Following is the algorithm to find all the prime numbers less than or equal to a given integer n by Eratosthenes' method:
1. Create a list of consecutive integers from 2 to n: (2, 3, 4, ..., n).
2. Initially, let p equal 2, the first prime number.
3. Starting from p, count up in increments of p and mark each of these numbers greater than p itself in the list. These numbers will be 2p, 3p, 4p,
etc.; note that some of them may have already been marked.
4. Find the first number greater than p in the list that is not marked. If there was no such number, stop. Otherwise, let p now equal this number
(which is the next prime), and repeat from step 3.
When the algorithm terminates, all the numbers in the list that are not marked are prime.
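The four steps above can be sketched in C as follows. This is an illustrative sketch rather than the article's listing: the function name sievePrimes and the choice to return the primes through an output array are assumptions.

```c
#include <stdio.h>
#include <stdlib.h>

/* Writes the primes <= n into out[] (the caller provides room for n
   entries) and returns how many were written. Returns 0 when n < 2
   or when the working array cannot be allocated. */
int sievePrimes(int n, int out[])
{
    if (n < 2)
        return 0;

    /* marked[v] == 1 means v is known to be a multiple of a smaller prime */
    char *marked = calloc((size_t)n + 1, 1);
    if (marked == NULL)
        return 0;

    int count = 0;
    for (int p = 2; p <= n; p++)
    {
        if (marked[p])
            continue;                 /* step 4: skip marked numbers */
        out[count++] = p;             /* p is unmarked, so it is prime */
        for (long long m = 2LL * p; m <= n; m += p)
            marked[m] = 1;            /* step 3: mark 2p, 3p, 4p, ... */
    }

    free(marked);
    return count;
}
```

The inner loop uses long long only so that 2p cannot overflow for large n.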
Explanation with Example:
Let us take an example when n = 50. So we need to print all prime numbers smaller than or equal to 50.
We create a list of all numbers from 2 to 50.
According to the algorithm, we mark all the numbers greater than 2 which are divisible by 2.
Now we move to our next unmarked number 3 and mark all the numbers greater than 3 which are multiples of 3.
We continue this process for every remaining unmarked number.
So the prime numbers are the unmarked ones: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47.
Thanks to Krishan Kumar for providing above explanation.
Implementation:
Following is C++ implementation of the above algorithm. In the following implementation, a boolean array arr[] of size n is used to mark multiples
of prime numbers.
#include <stdio.h>
#include <string.h>
// marks all multiples of 'a' (greater than 'a' but less than or equal to 'n') as 1.
void markMultiples(bool arr[], int a, int n)
{
    int i = 2, num;
    while ( (num = i*a) <= n )
    {
        arr[ num-1 ] = 1; // minus 1 because arr[] has size n and index starts from 0
        ++i;
    }
}
Output:
Following are the prime numbers below 30
2 3 5 7 11 13 17 19 23 29
References:
http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
Output: 1 (Monday)
See this for explanation of the above function.
References:
http://en.wikipedia.org/wiki/Determination_of_the_day_of_the_week
Output:
Not Divisible: Remainder is 2
DFA based division can be useful if we have a binary stream as input and we want to check for divisibility of the decimal value of stream at any
time.
Output:
1
2
.
.
24
25
The below program depicts how we can use foo() to return 1 to 7 with equal probability.
#include <stdio.h>
int foo() // given method that returns 1 to 5 with equal probability
{
// some code here
}
int my_rand() // returns 1 to 7 with equal probability
{
int i;
i = 5*foo() + foo() - 5;
if (i < 22)
return i%7 + 1;
return my_rand();
}
int main()
{
printf ("%d ", my_rand());
return 0;
}
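The expression 5*foo() + foo() - 5 works because each ordered pair of foo() results maps to exactly one value in 1..25, so all 25 values are equally likely; keeping only 1..21 leaves three values for each of the seven results. A minimal testable sketch follows. The body of foo() here is a hypothetical stand-in built on rand(), since the real method is simply given, and the retry is written as a loop instead of recursion.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the given method: returns 1 to 5.
   rand() % 5 is only approximately uniform. */
int foo(void)
{
    return rand() % 5 + 1;
}

/* Returns 1 to 7 with equal probability, assuming foo() is uniform on 1..5 */
int my_rand(void)
{
    for (;;)
    {
        /* Same value as 5*foo() + foo() - 5: uniform on 1..25 */
        int i = 5 * (foo() - 1) + foo();
        if (i < 22)              /* keep 1..21: three values per result */
            return i % 7 + 1;
        /* 22..25 are rejected; draw again */
    }
}
```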
Output:
Next palindrome is:
9 4 1 8 8 0 8 8 1 4 9
References:
http://en.wikipedia.org/wiki/Fair_coin#Fair_results_from_a_biased_coin
Check divisibility by 7
Given a number, check if it is divisible by 7. You are not allowed to use modulo operator, floating point arithmetic is also not allowed.
A simple method is repeated subtraction. Following is another interesting method.
Divisibility by 7 can be checked by a recursive method. A number of the form 10a + b is divisible by 7 if and only if a - 2b is divisible by 7. In other
words, subtract twice the last digit from the number formed by the remaining digits. Continue to do this until a small number is obtained.
Example: the number 371: 37 - (2*1) = 37 - 2 = 35; 3 - (2*5) = 3 - 10 = -7; thus, since -7 is divisible by 7, 371 is divisible by 7.
Following is C implementation of the above method
// A Program to check whether a number is divisible by 7
#include <stdio.h>
int isDivisibleBy7( int num )
{
// If number is negative, make it positive
if( num < 0 )
return isDivisibleBy7( -num );
// Base cases
if( num == 0 || num == 7 )
return 1;
if( num < 10 )
return 0;
// Recur for ( num / 10 - 2 * (num % 10) ); the last digit is computed without the modulo operator
return isDivisibleBy7( num / 10 - 2 * ( num - num / 10 * 10 ) );
}
// Driver program to test above function
int main()
{
int num = 616;
if( isDivisibleBy7(num ) )
printf( "Divisible" );
else
printf( "Not Divisible" );
return 0;
}
Output:
Divisible
How does this work? Let b be the last digit of a number n and let a be the number we get when we split off b, so that n = 10.a + b.
The representation of the number may also be multiplied by any number relatively prime to the divisor without changing its divisibility. After
observing that 7 divides 21, we can perform the following:
10.a + b
after multiplying by 2, this becomes
20.a + 2.b
and then
21.a - a + 2.b
Eliminating the multiple of 21 gives
-a + 2.b
and multiplying by -1 gives
a - 2.b
There are other interesting methods to check divisibility by 7 and other numbers. See following Wiki page for details.
References:
http://en.wikipedia.org/wiki/Divisibility_rule
Since (10^x)%3 is 1 for any x, the above expression gives the same remainder as following expression
1.a + 1.b + c
return top;
}
// Driver program to test above functions
int main()
{
int arr[] = {8, 1, 7, 6, 0};
int size = sizeof(arr)/sizeof(arr[0]);
if (findMaxMultupleOf3( arr, size ) == 0)
printf( "Not Possible" );
return 0;
}
Output
598
The time complexity of the above solution is O(n^2). We can reduce the time complexity to O(n) by creating an auxiliary array of size 256. See
following code.
// A O(n) solution for finding rank of string
#include <stdio.h>
#include <string.h>
#define MAX_CHAR 256
// A utility function to find factorial of n
int fact(int n)
{
return (n <= 1)? 1 :n * fact(n-1);
}
// Construct a count array where value at every index
// contains count of smaller characters in whole string
void populateAndIncreaseCount (int* count, char* str)
{
int i;
for( i = 0; str[i]; ++i )
++count[ str[i] ];
for( i = 1; i < 256; ++i )
count[i] += count[i-1];
}
// Removes a character ch from count[] array
// constructed by populateAndIncreaseCount()
void updatecount (int* count, char ch)
{
int i;
for( i = ch; i < MAX_CHAR; ++i )
--count[i];
}
// A function to find rank of a string in all permutations
// of characters
int findRank (char* str)
{
int len = strlen(str);
int mul = fact(len);
int rank = 1, i;
int count[MAX_CHAR] = {0}; // all elements of count[] are initialized with 0
// Populate the count array such that count[i] contains count of
// characters which are present in str and are smaller than i
populateAndIncreaseCount( count, str );
for (i = 0; i < len; ++i)
{
mul /= len - i;
// count number of chars smaller than str[i]
// from str[i+1] to str[len-1]
rank += count[ str[i] - 1] * mul;
// Reduce count of characters greater than str[i]
updatecount (count, str[i]);
}
return rank;
}
The above programs don't work for duplicate characters. To make them work for duplicate characters, find all the characters that are smaller
(this time including equal characters as well) and do the same as above, but divide the rank so formed by p!, where p is the count of occurrences of the
repeating character.
while ( ! isFinished )
{
// print this permutation
printf ("%s \n", str);
// Find the rightmost character which is smaller than its next
// character. Let us call it 'first char'
int i;
for ( i = size - 2; i >= 0; --i )
if (str[i] < str[i+1])
break;
// If there is no such character, all are sorted in decreasing order,
// means we just printed the last permutation and we are done.
if ( i == -1 )
isFinished = true;
else
{
// Find the ceil of 'first char' in right of first character.
// Ceil of a character is the smallest character greater than it
int ceilIndex = findCeil( str, str[i], i + 1, size - 1 );
// Swap 'first char' with its ceil
swap( &str[i], &str[ceilIndex] );
// Sort the string on right of 'first char'
qsort( str + i + 1, size - i - 1, sizeof(str[0]), compare );
}
}
}
// Driver program to test above function
int main()
{
char str[] = "ABCD";
sortedPermutations( str );
return 0;
}
Output:
ABCD
ABDC
....
....
DCAB
DCBA
The upper bound on time complexity of the above program is O(n^2 x n!). We can optimize step 4 of the above algorithm for finding next
permutation. Instead of sorting the subarray after the first character, we can reverse the subarray, because the subarray we get after swapping is
always sorted in non-increasing order. This optimization reduces the time complexity to O(n x n!). See following optimized code.
// An optimized version that uses reverse instead of sort for
// finding the next permutation
// A utility function to reverse a string str[l..h]
void reverse(char str[], int l, int h)
{
while (l < h)
{
swap(&str[l], &str[h]);
l++;
h--;
}
}
// Print all permutations of str in sorted order
void sortedPermutations ( char str[] )
{
// Get size of string
int size = strlen(str);
// Sort the string in increasing order
qsort( str, size, sizeof( str[0] ), compare );
// Print permutations one by one
bool isFinished = false;
while ( ! isFinished )
{
// print this permutation
The above programs print duplicate permutations when characters are repeated. We can avoid it by keeping track of the previous permutation.
While printing, if the current permutation is the same as the previous permutation, we won't print it.
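The reverse-based step can also be sketched as a standalone function. This is an illustrative variant, not the article's listing: the name nextPermutation is an assumption, and it uses non-strict comparisons so that equal characters are skipped, which means a repeated permutation is never produced in the first place (a different technique from comparing against the previous permutation).

```c
#include <string.h>

static void swapChars(char *a, char *b)
{
    char t = *a;
    *a = *b;
    *b = t;
}

/* Rearranges str into the next permutation in sorted order.
   Returns 1 on success, 0 if str was already the last permutation. */
int nextPermutation(char str[])
{
    int size = (int)strlen(str);

    /* Rightmost position whose character is smaller than its successor */
    int i = size - 2;
    while (i >= 0 && str[i] >= str[i + 1])
        i--;
    if (i < 0)
        return 0;

    /* The suffix is non-increasing, so the first character from the right
       that is greater than str[i] is its ceil */
    int j = size - 1;
    while (str[j] <= str[i])
        j--;
    swapChars(&str[i], &str[j]);

    /* The suffix is still non-increasing: reverse it instead of sorting */
    for (int l = i + 1, h = size - 1; l < h; l++, h--)
        swapChars(&str[l], &str[h]);
    return 1;
}
```

Starting from a sorted string and calling the function until it returns 0 visits every distinct permutation once, in sorted order.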
Let the given array be arr[]. A simple solution is to create an auxiliary array temp[] which is initially a copy of arr[]. Randomly select an element
from temp[], copy the randomly selected element to arr[0] and remove the selected element from temp[]. Repeat the same process n times and
keep copying elements to arr[1], arr[2], and so on. The time complexity of this solution will be O(n^2).
The Fisher-Yates shuffle algorithm works in O(n) time complexity. The assumption here is that we are given a function rand() that generates a random
number in O(1) time.
The idea is to start from the last element and swap it with a randomly selected element from the whole array (including the last). Now consider the array
from 0 to n-2 (size reduced by 1), and repeat the process till we hit the first element.
Following is the detailed algorithm
To shuffle an array a of n elements (indices 0..n-1):
for i from n - 1 downto 1 do
j = random integer with 0 <= j <= i
exchange a[j] and a[i]
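The pseudocode above translates almost line for line into C. The sketch below assumes the caller has already seeded rand(), and it accepts the slight modulo bias of rand() % (i + 1); the function name shuffle is chosen here for illustration.

```c
#include <stdlib.h>

/* Shuffles arr[0..n-1] in place using the Fisher-Yates algorithm.
   Assumes rand() has been seeded by the caller. */
void shuffle(int arr[], int n)
{
    for (int i = n - 1; i > 0; i--)
    {
        int j = rand() % (i + 1);   /* random index with 0 <= j <= i */
        int tmp = arr[i];           /* exchange arr[j] and arr[i] */
        arr[i] = arr[j];
        arr[j] = tmp;
    }
}
```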
Output:
7 8 4 6 3 1 2 5
Reservoir Sampling
Reservoir sampling is a family of randomized algorithms for randomly choosing k samples from a list of n items, where n is either a very large or
unknown number. Typically n is large enough that the list doesn't fit into main memory. For example, a list of search queries in Google and
Facebook.
So we are given a big array (or stream) of numbers (to simplify), and we need to write an efficient function to randomly select k numbers where 1
<= k <= n. Let the input array be stream[].
A simple solution is to create an array reservoir[] of maximum size k. One by one randomly select an item from stream[0..n-1]. If the selected
item is not previously selected, then put it in reservoir[]. To check if an item is previously selected or not, we need to search the item in
reservoir[]. The time complexity of this algorithm will be O(k^2). This can be costly if k is big. Also, this is not efficient if the input is in the form of
a stream.
It can be solved in O(n) time. The solution also suits input in the form of a stream. The idea is similar to the Fisher-Yates shuffle. Following are the steps.
1) Create an array reservoir[0..k-1] and copy first k items of stream[] to it.
2) Now one by one consider all items from (k+1)th item to nth item.
a) Generate a random number from 0 to i, where i is the index of the current item in stream[]. Let the generated random number be j.
b) If j is in range 0 to k-1, replace reservoir[j] with stream[i]
Following is C implementation of the above algorithm.
// An efficient program to randomly select k items from a stream of items
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
// A utility function to print an array
void printArray(int stream[], int n)
{
for (int i = 0; i < n; i++)
printf("%d ", stream[i]);
printf("\n");
}
// A function to randomly select k items from stream[0..n-1].
void selectKItems(int stream[], int n, int k)
{
int i; // index for elements in stream[]
// reservoir[] is the output array. Initialize it with
// first k elements from stream[]
int reservoir[k];
for (i = 0; i < k; i++)
reservoir[i] = stream[i];
// Use a different seed value so that we don't get
// same result each time we run this program
srand(time(NULL));
// Iterate from the (k+1)th element to nth element
for (; i < n; i++)
{
// Pick a random index from 0 to i.
int j = rand() % (i+1);
// If the randomly picked index is smaller than k, then replace
// the element present at the index with new element from stream
if (j < k)
reservoir[j] = stream[i];
}
printf("Following are k randomly selected items \n");
printArray(reservoir, k);
}
// Driver program to test above function.
int main()
{
int stream[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};
int n = sizeof(stream)/sizeof(stream[0]);
int k = 5;
selectKItems(stream, n, k);
return 0;
}
Output:
Following are k randomly selected items
6 2 11 8 12
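Why does every item end up in the reservoir with the same probability k/n? The derivation below is a short sketch, with items numbered 1..n (1-based) and m ranging over the later items; this notation is introduced here and is not part of the program above.

```latex
% Item i > k enters the reservoir with probability k/i.
% A later item m replaces one particular slot with probability
% (k/m)(1/k) = 1/m, so a stored item survives step m with probability (m-1)/m.
\[
P(\text{item } i \text{ is in the final reservoir})
  = \frac{k}{i} \prod_{m=i+1}^{n} \frac{m-1}{m}
  = \frac{k}{i} \cdot \frac{i}{n}
  = \frac{k}{n}, \qquad i > k.
\]
% The first k items start in the reservoir and only need to survive:
\[
P(\text{item } i \text{ is in the final reservoir})
  = \prod_{m=k+1}^{n} \frac{m-1}{m}
  = \frac{k}{n}, \qquad i \le k.
\]
```

Both products telescope, which is why the result does not depend on i.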
Pascal's Triangle
Pascal's triangle is a triangular array of the binomial coefficients. Write a function that takes an integer value n as input and prints the first n lines of
Pascal's triangle. Following are the first 6 rows of Pascal's Triangle.
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
The value of the ith entry in line number line is the binomial coefficient C(line, i), where
C(line, i) = line! / ( (line-i)! * i! )
A simple method is to run two loops and calculate the value of Binomial Coefficient in inner loop.
// A simple O(n^3) program for Pascal's Triangle
#include <stdio.h>
// See http://www.geeksforgeeks.org/archives/25621 for details of this function
int binomialCoeff(int n, int k);
// Function to print first n lines of Pascal's Triangle
void printPascal(int n)
{
// Iterate through every line and print entries in it
for (int line = 0; line < n; line++)
{
// Every line has number of integers equal to line number
for (int i = 0; i <= line; i++)
printf("%d ", binomialCoeff(line, i));
printf("\n");
}
}
// See http://www.geeksforgeeks.org/archives/25621 for details of this function
int binomialCoeff(int n, int k)
{
int res = 1;
if (k > n - k)
k = n - k;
for (int i = 0; i < k; ++i)
{
res *= (n - i);
res /= (i + 1);
}
return res;
}
// Driver program to test above function
int main()
{
int n = 7;
printPascal(n);
return 0;
}
// A O(n^2) time and O(n^2) extra space method for Pascal's Triangle
void printPascal(int n)
{
int arr[n][n]; // An auxiliary array to store generated Pascal triangle values
// Iterate through every line and print integer(s) in it
for (int line = 0; line < n; line++)
{
// Every line has number of integers equal to line number
for (int i = 0; i <= line; i++)
{
// First and last values in every row are 1
if (line == i || i == 0)
arr[line][i] = 1;
else // Other values are sum of values just above and left of above
arr[line][i] = arr[line-1][i-1] + arr[line-1][i];
printf("%d ", arr[line][i]);
}
printf("\n");
}
}
This method can be optimized to use O(n) extra space, as we need values only from the previous row. So we can create an auxiliary array of size n
and overwrite values. Following is another method that uses only O(1) extra space.
So method 3 is the best method among all, but it may cause integer overflow for large values of n as it multiplies two integers to obtain values.
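One way to get O(1) extra space is to derive each entry from the previous entry of the same line, using C(line, i+1) = C(line, i) * (line - i) / (i + 1). The sketch below illustrates that idea; it is not necessarily identical to the article's method 3, the helper name nextInLine is an assumption, and long long is used only to delay the overflow mentioned above.

```c
#include <stdio.h>

/* Given c = C(line, i), returns C(line, i + 1).
   The division is exact because c * (line - i) = C(line, i + 1) * (i + 1). */
long long nextInLine(long long c, int line, int i)
{
    return c * (line - i) / (i + 1);
}

/* Prints the first n lines of Pascal's triangle with O(1) extra space */
void printPascal(int n)
{
    for (int line = 0; line < n; line++)
    {
        long long c = 1;                   /* C(line, 0) is always 1 */
        for (int i = 0; i <= line; i++)
        {
            printf("%lld ", c);
            c = nextInLine(c, line, i);    /* move to the next entry */
        }
        printf("\n");
    }
}
```

Each entry costs one multiplication and one division, so printing n lines takes O(n^2) time.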
Output:
Random number from first 1 numbers is 1
Random number from first 2 numbers is 1
Random number from first 3 numbers is 3
Random number from first 4 numbers is 4
To calculate the sum of the first n terms, we can use the following loop.
for (i = n - 1, sum = 1; i > 0; --i )
sum = 1 + x * sum / i;
Output:
e^x = 2.718282
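The loop above evaluates the series 1 + x/1! + x^2/2! + ... in nested form, 1 + (x/1)(1 + (x/2)(1 + (x/3)(...))), working from the innermost term outwards. A minimal sketch that wraps it in a function follows; the name exponential and the use of double are assumptions made here.

```c
/* Approximates e^x using the first n terms of its Taylor series,
   evaluated from the innermost bracket outwards. */
double exponential(int n, double x)
{
    double sum = 1.0;               /* innermost bracket */
    for (int i = n - 1; i > 0; --i)
        sum = 1.0 + x * sum / i;    /* wrap one more (1 + (x/i) * ...) */
    return sum;
}
```

With n = 10 and x = 1 the result agrees with e to about six decimal places.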
Measure one litre using two vessels and infinite water supply
There are two vessels of capacities a and b respectively. We have infinite water supply. Give an efficient algorithm to make exactly 1 litre of water
in one of the vessels. You can throw away all the water from any vessel at any point of time. Assume that a and b are coprime.
Following are the steps:
Let V1 be the vessel of capacity a and V2 be the vessel of capacity b and a is smaller than b.
1) Do following while the amount of water in V1 is not 1.
.a) If V1 is empty, then completely fill V1
.b) Transfer water from V1 to V2. If V2 becomes full, then keep the remaining water in V1 and empty V2
2) V1 will have 1 litre after termination of loop in step 1. Return.
Following is C++ implementation of the above algorithm.
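/* Sample run of the algorithm for V1 with capacity 3 and V2 with capacity 7
1. V1 = 3, V2 = 0
2. V1 = 3, V2 = 3
2. V1 = 3, V2 = 6
3. V1 = 2, V2 = 0
4. V1 = 3, V2 = 2
5. V1 = 3, V2 = 5
6. V1 = 1, V2 = 0
7. Stop, as V1 now contains 1 litre
Note that V2 was made empty in steps 3 and 6 because it became full */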
#include <iostream>
using namespace std;
// A utility function to get GCD of two numbers
int gcd(int a, int b) { return b? gcd(b, a % b) : a; }
// Class to represent a Vessel
class Vessel
{
// A vessel has capacity, and current amount of water in it
int capacity, current;
public:
// Constructor: initializes capacity as given, and current as 0
Vessel(int capacity) { this->capacity = capacity; current = 0; }
// The main function to fill one litre in this vessel. Capacity of V2
// must be greater than this vessel and two capacities must be co-prime
void makeOneLitre(Vessel &V2);
// Fills vessel with given amount and returns the amount of water
// transferred to it. If the vessel becomes full, then the vessel
// is made empty.
int transfer(int amount);
};
// The main function to fill one litre in this vessel. Capacity
// of V2 must be greater than this vessel and two capacities
// must be coprime
void Vessel:: makeOneLitre(Vessel &V2)
{
// solution exists iff a and b are co-prime
if (gcd(capacity, V2.capacity) != 1)
return;
while (current != 1)
{
// fill A (smaller vessel)
if (current == 0)
current = capacity;
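cout << "Vessel 1: " << current << "   Vessel 2: "
<< V2.current << endl;
// Transfer water from V1 to V2 and reduce the amount in V1
// by the amount actually transferred
current = current - V2.transfer(current);
}
// Finally, there will be 1 litre in vessel 1
cout << "Vessel 1: " << current << "   Vessel 2: "
<< V2.current << endl;
}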
// Fills vessel with given amount and returns the amount of water
// transferred to it. If the vessel becomes full, then the vessel
// is made empty
int Vessel::transfer(int amount)
{
// If the vessel can accommodate the given amount
if (current + amount < capacity)
{
current += amount;
return amount;
}
// If the vessel cannot accommodate the given amount, then
// store the amount of water transferred
int transferred = capacity - current;
// Since the vessel becomes full, make the vessel
// empty so that it can be filled again
current = 0;
return transferred;
}
// Driver program to test above function
int main()
{
int a = 3, b = 7; // a must be smaller than b
// Create two vessels of capacities a and b
Vessel V1(a), V2(b);
// Get 1 litre in first vessel
V1.makeOneLitre(V2);
return 0;
}
Output:
Vessel 1: 3   Vessel 2: 0
Vessel 1: 3   Vessel 2: 3
Vessel 1: 3   Vessel 2: 6
Vessel 1: 2   Vessel 2: 0
Vessel 1: 3   Vessel 2: 2
Vessel 1: 3   Vessel 2: 5
Vessel 1: 1   Vessel 2: 0
Output:
3 3 5 7
C
// Program to print all combination of size r in an array of size n
#include <stdio.h>
void combinationUtil(int arr[], int data[], int start, int end,
int index, int r);
// The main function that prints all combinations of size r
// in arr[] of size n. This function mainly uses combinationUtil()
void printCombination(int arr[], int n, int r)
{
// A temporary array to store all combination one by one
int data[r];
// Print all combinations using temporary array 'data[]'
combinationUtil(arr, data, 0, n-1, 0, r);
}
/* arr[] ---> Input Array
data[] ---> Temporary array to store current combination
start & end ---> Starting and Ending indexes in arr[]
index ---> Current index in data[]
r ---> Size of a combination to be printed */
void combinationUtil(int arr[], int data[], int start, int end,
int index, int r)
{
// Current combination is ready to be printed, print it
if (index == r)
{
for (int j=0; j<r; j++)
printf("%d ", data[j]);
printf("\n");
return;
}
// replace index with all possible elements. The condition
// "end-i+1 >= r-index" makes sure that including one element
// at index will make a combination with remaining elements
// at remaining positions
for (int i=start; i<=end && end-i+1 >= r-index; i++)
{
data[index] = arr[i];
combinationUtil(arr, data, i+1, end, index+1, r);
}
}
// Driver program to test above functions
int main()
{
int arr[] = {1, 2, 3, 4, 5};
int r = 3;
int n = sizeof(arr)/sizeof(arr[0]);
printCombination(arr, n, r);
}
Java
// Java program to print all combination of size r in an array of size n
import java.io.*;
class Permutation {
/* arr[] ---> Input Array
data[] ---> Temporary array to store current combination
start & end ---> Starting and Ending indexes in arr[]
index ---> Current index in data[]
r ---> Size of a combination to be printed */
static void combinationUtil(int arr[], int data[], int start,
int end, int index, int r)
{
// Current combination is ready to be printed, print it
if (index == r)
{
for (int j=0; j<r; j++)
System.out.print(data[j]+" ");
System.out.println("");
return;
}
// replace index with all possible elements. The condition
// "end-i+1 >= r-index" makes sure that including one element
// at index will make a combination with remaining elements
// at remaining positions
for (int i=start; i<=end && end-i+1 >= r-index; i++)
{
data[index] = arr[i];
combinationUtil(arr, data, i+1, end, index+1, r);
}
}
// The main function that prints all combinations of size r
// in arr[] of size n. This function mainly uses combinationUtil()
static void printCombination(int arr[], int n, int r)
{
// A temporary array to store all combination one by one
int data[]=new int[r];
// Print all combinations using temporary array 'data[]'
combinationUtil(arr, data, 0, n-1, 0, r);
}
/*Driver function to check for above function*/
public static void main (String[] args) {
int arr[] = {1, 2, 3, 4, 5};
int r = 3;
int n = arr.length;
printCombination(arr, n, r);
}
}
/* This code is contributed by Devesh Agrawal */
Output:
1 2 3
1 2 4
1 2 5
1 3 4
1 3 5
1 4 5
2 3 4
2 3 5
2 4 5
3 4 5
as two different combinations. We can avoid duplicates by adding following two additional things to above code.
1) Add code to sort the array before calling combinationUtil() in printCombination()
2) Add following lines at the end of for loop in combinationUtil()
// Since the elements are sorted, all occurrences of an element
// must be together. The bound check keeps arr[i+1] inside the array.
while (i < end && arr[i] == arr[i+1])
i++;
C
// Program to print all combination of size r in an array of size n
#include<stdio.h>
void combinationUtil(int arr[],int n,int r,int index,int data[],int i);
// The main function that prints all combinations of size r
// in arr[] of size n. This function mainly uses combinationUtil()
void printCombination(int arr[], int n, int r)
{
// A temporary array to store all combination one by one
int data[r];
// Print all combinations using temporary array 'data[]'
combinationUtil(arr, n, r, 0, data, 0);
}
/* arr[] ---> Input Array
n ---> Size of input array
r ---> Size of a combination to be printed
index ---> Current index in data[]
data[] ---> Temporary array to store current combination
i ---> index of current element in arr[] */
void combinationUtil(int arr[], int n, int r, int index, int data[], int i)
{
// Current combination is ready, print it
if (index == r)
{
for (int j=0; j<r; j++)
printf("%d ",data[j]);
printf("\n");
return;
}
// When no more elements are there to put in data[]
if (i >= n)
return;
// current is included, put next at next location
data[index] = arr[i];
combinationUtil(arr, n, r, index+1, data, i+1);
// current is excluded, replace it with next (Note that
// i+1 is passed, but index is not changed)
combinationUtil(arr, n, r, index, data, i+1);
}
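// Driver program to test above functions
int main()
{
int arr[] = {1, 2, 3, 4, 5};
int r = 3;
int n = sizeof(arr)/sizeof(arr[0]);
printCombination(arr, n, r);
return 0;
}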
Java
// Java program to print all combination of size r in an array of size n
import java.io.*;
class Permutation {
/* arr[] ---> Input Array
n ---> Size of input array
r ---> Size of a combination to be printed
index ---> Current index in data[]
data[] ---> Temporary array to store current combination
i ---> index of current element in arr[] */
static void combinationUtil(int arr[], int n, int r, int index,
int data[], int i)
{
// Current combination is ready to be printed, print it
if (index == r)
{
for (int j=0; j<r; j++)
System.out.print(data[j]+" ");
System.out.println("");
return;
}
// When no more elements are there to put in data[]
if (i >= n)
return;
// current is included, put next at next location
data[index] = arr[i];
combinationUtil(arr, n, r, index+1, data, i+1);
// current is excluded, replace it with next (Note that
// i+1 is passed, but index is not changed)
combinationUtil(arr, n, r, index, data, i+1);
}
// The main function that prints all combinations of size r
// in arr[] of size n. This function mainly uses combinationUtil()
static void printCombination(int arr[], int n, int r)
{
// A temporary array to store all combination one by one
int data[]=new int[r];
// Print all combinations using temporary array 'data[]'
combinationUtil(arr, n, r, 0, data, 0);
}
/*Driver function to check for above function*/
public static void main (String[] args) {
int arr[] = {1, 2, 3, 4, 5};
int r = 3;
int n = arr.length;
printCombination(arr, n, r);
}
}
/* This code is contributed by Devesh Agrawal */
Output:
1 2 3
1 2 4
1 2 5
1 3 4
1 3 5
1 4 5
2 3 4
2 3 5
2 4 5
3 4 5
1/10
6/10
2/10
1/10
It is quite clear that a simple random number generator won't work here, as it doesn't keep track of the frequency of occurrence.
We need to somehow transform the problem into a problem whose solution is known to us.
One simple method is to take an auxiliary array (say aux[]) and duplicate the numbers according to their frequency of occurrence. Generate a
random number (say r) between 0 and Sum-1 (both inclusive), where Sum represents summation of frequency array (freq[] in above example).
Return the random number aux[r] (Implementation of this method is left as an exercise to the readers).
The limitation of the above method is huge memory consumption when the frequencies of occurrence are high. If the input is 997, 8761
and 1, this method is clearly not efficient.
How can we reduce the memory consumption? Following is a detailed algorithm that uses O(n) extra space, where n is the number of elements in the input
arrays.
1. Take an auxiliary array (say prefix[]) of size n.
2. Populate it with prefix sums, such that prefix[i] represents the sum of freq[0] to freq[i].
3. Generate a random number (say r) between 1 and Sum (both inclusive), where Sum represents summation of input frequency array.
4. Find index of Ceil of random number generated in step #3 in the prefix array. Let the index be indexc.
5. Return the random number arr[indexc], where arr[] contains the input n numbers.
Before we go to the implementation part, let us have quick look at the algorithm with an example:
arr[]: {10, 20, 30}
freq[]: {2, 3, 1}
Prefix[]: {2, 5, 6}
Since last entry in prefix is 6, all possible values of r are [1, 2, 3, 4, 5, 6]
1: Ceil is 2. Random number generated is 10.
2: Ceil is 2. Random number generated is 10.
3: Ceil is 5. Random number generated is 20.
4: Ceil is 5. Random number generated is 20.
5: Ceil is 5. Random number generated is 20.
6: Ceil is 6. Random number generated is 30.
In the above example
10 is generated with probability 2/6.
20 is generated with probability 3/6.
30 is generated with probability 1/6.
How does this work?
Any number arr[i] is generated as many times as its frequency of occurrence, because the count of integers in the range (prefix[i-1], prefix[i]]
is freq[i]. In the above example, 20 is generated three times, as there are 3 integers (3, 4 and 5) whose ceil is 5.
// C program to generate random numbers according to given frequency distribution
#include <stdio.h>
#include <stdlib.h>
#include <time.h> // for time(), used to seed rand()
// Utility function to find ceiling of r in arr[l..h]
int findCeil(int arr[], int r, int l, int h)
{
int mid;
while (l < h)
{
mid = l + ((h - l) >> 1); // Same as mid = (l+h)/2
(r > arr[mid]) ? (l = mid + 1) : (h = mid);
}
return (arr[l] >= r) ? l : -1;
}
// The main function that returns a random number from arr[] according to
// distribution array defined by freq[]. n is size of arrays.
int myRand(int arr[], int freq[], int n)
{
// Create and fill prefix array
int prefix[n], i;
prefix[0] = freq[0];
for (i = 1; i < n; ++i)
prefix[i] = prefix[i - 1] + freq[i];
// prefix[n-1] is sum of all frequencies. Generate a random number
// with value from 1 to this sum
int r = (rand() % prefix[n - 1]) + 1;
// Find index of ceiling of r in prefix array
int indexc = findCeil(prefix, r, 0, n - 1);
return arr[indexc];
}
// Driver program to test above functions
int main()
{
int arr[] = {1, 2, 3, 4};
int freq[] = {10, 5, 20, 100};
int i, n = sizeof(arr) / sizeof(arr[0]);
// Use a different seed value for every run.
srand(time(NULL));
// Let us generate 5 random numbers according to
// given distribution
for (i = 0; i < 5; i++)
printf("%d\n", myRand(arr, freq, n));
return 0;
}
Output:
1 is a Fibonacci Number
2 is a Fibonacci Number
3 is a Fibonacci Number
4 is a not Fibonacci Number
5 is a Fibonacci Number
6 is a not Fibonacci Number
7 is a not Fibonacci Number
8 is a Fibonacci Number
9 is a not Fibonacci Number
10 is a not Fibonacci Number
Output:
18
240
The idea is to look at the remainder of every element when divided by 3. A set of elements can form a group only if the sum of their remainders is a multiple of
3. Since the task is to enumerate groups, we count all elements with different remainders.
1. Hash all elements in a count array based on remainder, i.e,
for all elements a[i], do c[a[i]%3]++;
2. Now c[0] contains the number of elements which when divided
by 3 leave remainder 0 and similarly c[1] for remainder 1
and c[2] for 2.
3. Now for group of 2, we have 2 possibilities
a. 2 elements of remainder 0 group. Such possibilities are
c[0]*(c[0]-1)/2
b. 1 element of remainder 1 and 1 from remainder 2 group
Such groups are c[1]*c[2].
4. Now for group of 3,we have 4 possibilities
a. 3 elements from remainder group 0.
No. of such groups are c[0]C3
b. 3 elements from remainder group 1.
No. of such groups are c[1]C3
c. 3 elements from remainder group 2.
No. of such groups are c[2]C3
d. 1 element from each of 3 groups.
No. of such groups are c[0]*c[1]*c[2].
5. Add all the groups in steps 3 and 4 to obtain the result.
#include<stdio.h>
// Returns count of all possible groups that can be formed from elements
// of a[].
int findgroups(int arr[], int n)
{
// Create an array C[3] to store counts of elements with remainder
// 0, 1 and 2. c[i] would store count of elements with remainder i
int c[3] = {0}, i;
int res = 0; // To store the result
// Count elements with remainder 0, 1 and 2
for (i=0; i<n; i++)
c[arr[i]%3]++;
// Case 3.a: Count groups of size 2 from 0 remainder elements
res += ((c[0]*(c[0]-1))>>1);
// Case 3.b: Count groups of size 2 with one element with 1
// remainder and other with 2 remainder
res += c[1] * c[2];
// Case 4.a: Count groups of size 3 with all 0 remainder elements
res += (c[0] * (c[0]-1) * (c[0]-2))/6;
// Case 4.b: Count groups of size 3 with all 1 remainder elements
res += (c[1] * (c[1]-1) * (c[1]-2))/6;
// Case 4.c: Count groups of size 3 with all 2 remainder elements
res += ((c[2]*(c[2]-1)*(c[2]-2))/6);
// Case 4.d: Count groups of size 3 with different remainders
res += c[0]*c[1]*c[2];
// Return total count stored in res
return res;
}
// Driver program to test above functions
int main()
{
int arr[] = {3, 6, 7, 2, 9};
int n = sizeof(arr)/sizeof(arr[0]);
printf("Required number of groups are %d\n", findgroups(arr,n));
return 0;
}
Output:
Required number of groups are 8
Output:
Move disk 1 from rod A to rod B
Move disk 2 from rod A to rod C
Move disk 1 from rod B to rod C
Move disk 3 from rod A to rod B
Move disk 1 from rod C to rod A
Move disk 2 from rod C to rod B
Move disk 1 from rod A to rod B
Move disk 4 from rod A to rod C
Move disk 1 from rod B to rod C
Move disk 2 from rod B to rod A
Move disk 1 from rod C to rod A
Move disk 3 from rod B to rod C
Move disk 1 from rod A to rod B
Move disk 2 from rod A to rod C
Move disk 1 from rod B to rod C
A naive way to evaluate a polynomial is to evaluate all terms one by one. First calculate x^n, multiply the value with c_n, repeat the same steps for
other terms and return the sum. Time complexity of this approach is O(n^2) if we use a simple loop for evaluation of x^n. Time complexity can be
improved to O(nLogn) if we use O(Logn) approach for evaluation of x^n.
Horner's method can be used to evaluate a polynomial in O(n) time. To understand the method, let us consider the example of 2x^3 - 6x^2 + 2x - 1. The
polynomial can be evaluated as ((2x - 6)x + 2)x - 1. The idea is to initialize result as coefficient of x^n which is 2 in this case, repeatedly multiply result
with x and add next coefficient to result. Finally return result.
Following is C++ implementation of Horner's Method.
#include <iostream>
using namespace std;
// returns value of poly[0]x^(n-1) + poly[1]x^(n-2) + .. + poly[n-1]
int horner(int poly[], int n, int x)
{
int result = poly[0]; // Initialize result
// Evaluate value of polynomial using Horner's method
for (int i=1; i<n; i++)
result = result*x + poly[i];
return result;
}
// Driver program to test above function.
int main()
{
// Let us evaluate value of 2x^3 - 6x^2 + 2x - 1 for x = 3
int poly[] = {2, -6, 2, -1};
int x = 3;
int n = sizeof(poly)/sizeof(poly[0]);
cout << "Value of polynomial is " << horner(poly, n, x);
return 0;
}
Output:
Value of polynomial is 5
A simple method is to first calculate the factorial of n, then count trailing 0s in the result (we can count trailing 0s by repeatedly dividing the factorial by
10 as long as the remainder is 0).
The above method can cause overflow for slightly bigger numbers, as the factorial of a number is a big number (see factorial of 20 given in the above
examples). The idea is to consider prime factors of n!. A trailing zero is always produced by a pair of prime factors 2 and 5. If we can count the
number of 5s and 2s, our task is done. Consider the following examples.
n = 5: There is one 5 and three 2s in prime factors of 5! (2 * 2 * 2 * 3 * 5). So count of trailing 0s is 1.
n = 11: There are two 5s and eight 2s in prime factors of 11! (2^8 * 3^4 * 5^2 * 7 * 11). So count of trailing 0s is 2.
We can easily observe that the number of 2s in prime factors is always more than or equal to the number of 5s. So if we count 5s in prime factors,
we are done. How to count total number of 5s in prime factors of n!? A simple way is to calculate floor(n/5). For example, 7! has one 5, 10!
has two 5s. We are not done yet, there is one more thing to consider. Numbers like 25, 125, etc. have more than one 5. For example if we consider 28!,
we get one extra 5 and the number of 0s becomes 6. Handling this is simple: first divide n by 5 to count one 5 for every multiple of 5, then divide by 25 to count the
extra 5s, and so on. Following is the summarized formula for counting trailing 0s.
Trailing 0s in n! = Count of 5s in prime factors of n!
= floor(n/5) + floor(n/25) + floor(n/125) + ....
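The summarized formula can be sketched in C++ as below. This is a minimal illustration, not the original listing; the function name findTrailingZeros is an assumption made for this example.

```cpp
// Sketch: counts trailing 0s in n! by summing floor(n/5) + floor(n/25) + ...
// The name findTrailingZeros is hypothetical (chosen for this example).
int findTrailingZeros(int n)
{
    int count = 0;
    // p runs over 5, 25, 125, ... ; long long avoids overflow of p itself
    for (long long p = 5; n / p >= 1; p *= 5)
        count += static_cast<int>(n / p);
    return count;
}
```

For n = 100 the loop adds floor(100/5) = 20 and floor(100/25) = 4, giving 24.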
Output:
Count of trailing 0s in 100! is 24
Output :
1 1 2 5 14 42 132 429 1430 4862
Output:
1 1 2 5 14 42 132 429 1430 4862
Output:
1 1 2 5 14 42 132 429 1430 4862
This function will solve the purpose of generating 3 numbers with given three probabilities.
Output
Z
AY
AZ
CB
YZ
ZZ
AAC
printString(51);
printString(52);
printString(80);
printString(676);
printString(702);
printString(705);
return 0;
}
Output
Z
AY
AZ
CB
YZ
ZZ
AAC
// II) Find the smallest digit on right side of (i-1)'th digit that is
// greater than number[i-1]
int x = number[i-1], smallest = i;
for (j = i+1; j < n; j++)
if (number[j] > x && number[j] < number[smallest])
smallest = j;
// III) Swap the above found smallest digit with number[i-1]
swap(&number[smallest], &number[i-1]);
// IV) Sort the digits after (i-1) in ascending order
sort(number + i, number + n);
cout << "Next number with same set of digits is " << number;
return;
}
// Driver program to test above function
int main()
{
char digits[] = "534976";
int n = strlen(digits);
findNext(digits, n);
return 0;
}
Output:
Next number with same set of digits is 536479
An empty digit sequence is considered to have one decoding. It may be assumed that the input contains valid digits from 0 to 9 and there are no
leading 0s, no extra trailing 0s and no two or more consecutive 0s.
This problem is recursive and can be broken in sub-problems. We start from end of the given digit sequence. We initialize the total count of
decodings as 0. We recur for two subproblems.
1) If the last digit is non-zero, recur for remaining (n-1) digits and add the result to total count.
2) If the last two digits form a valid character (or smaller than 27), recur for remaining (n-2) digits and add the result to total count.
Following is C++ implementation of the above approach.
// A naive recursive C++ implementation to count number of decodings
// that can be formed from a given digit sequence
#include <iostream>
#include <cstring>
using namespace std;
// Given a digit sequence of length n, returns count of possible
// decodings by replacing 1 with A, 2 with B, ... 26 with Z
int countDecoding(char *digits, int n)
{
// base cases
if (n == 0 || n == 1)
return 1;
int count = 0; // Initialize count
// If the last digit is not 0, then last digit must add to
// the number of words
if (digits[n-1] > '0')
count = countDecoding(digits, n-1);
// If the last two digits form a number from 10 to 26 (a pair with a
// leading 0 such as "05" is not a valid code), then consider last
// two digits and recur
if (digits[n-2] == '1' || (digits[n-2] == '2' && digits[n-1] < '7') )
count += countDecoding(digits, n-2);
return count;
}
// Driver program to test above function
int main()
{
char digits[] = "1234";
int n = strlen(digits);
cout << "Count is " << countDecoding(digits, n);
return 0;
}
Output:
Count is 3
The time complexity of the above code is exponential. If we take a closer look at the above program, we can observe that the recursive solution is
similar to Fibonacci Numbers. Therefore, we can optimize the above solution to work in O(n) time using Dynamic Programming. Following is C++
implementation for the same.
// A Dynamic Programming based C++ implementation to count decodings
#include <iostream>
#include <cstring>
using namespace std;
// A Dynamic Programming based function to count decodings
int countDecodingDP(char *digits, int n)
{
int count[n+1]; // A table to store results of subproblems
count[0] = 1;
count[1] = 1;
for (int i = 2; i <= n; i++)
{
count[i] = 0;
// If the last digit is not 0, then last digit must add to
// the number of words
if (digits[i-1] > '0')
count[i] = count[i-1];
// If second last digit is 1, or it is 2 and the last digit is
// smaller than 7, then last two digits form a valid character
if (digits[i-2] == '1' || (digits[i-2] == '2' && digits[i-1] < '7') )
count[i] += count[i-2];
}
return count[n];
}
// Driver program to test above function
int main()
{
char digits[] = "1234";
int n = strlen(digits);
cout << "Count is " << countDecodingDP(digits, n);
return 0;
}
Output:
Count is 3
Time Complexity of the above solution is O(n) and it requires O(n) auxiliary space. We can reduce auxiliary space to O(1) by using space
optimized version discussed in the Fibonacci Number Post.
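A space optimized version can be sketched as follows. Since count[i] depends only on count[i-1] and count[i-2], two variables are enough. This is a sketch under the same input assumptions as above; the function name countDecodingConstSpace is an assumption of this example, and a two-digit pair is treated as valid only when it lies between 10 and 26.

```cpp
// Sketch: O(n) time, O(1) extra space count of decodings.
// countDecodingConstSpace is a hypothetical name used for illustration.
int countDecodingConstSpace(const char *digits, int n)
{
    if (n == 0 || n == 1)
        return 1;
    int prev2 = 1; // number of decodings of the first i-2 digits
    int prev1 = 1; // number of decodings of the first i-1 digits
    for (int i = 2; i <= n; i++)
    {
        int cur = 0;
        // A non-zero last digit can be decoded on its own
        if (digits[i-1] > '0')
            cur = prev1;
        // The last two digits form a number from 10 to 26
        if (digits[i-2] == '1' ||
            (digits[i-2] == '2' && digits[i-1] < '7'))
            cur += prev2;
        prev2 = prev1;
        prev1 = cur;
    }
    return prev1;
}
```

For the digit sequence "1234" this returns 3, the same count as the table based version.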
The idea is to take 12:00 (h = 12, m = 0) as a reference. Following are detailed steps.
1) Calculate the angle made by hour hand with respect to 12:00 in h hours and m minutes.
2) Calculate the angle made by minute hand with respect to 12:00 in h hours and m minutes.
3) The difference between two angles is the angle between two hands.
How to calculate the two angles with respect to 12:00?
The minute hand moves 360 degrees in 60 minutes (or 6 degrees in one minute) and the hour hand moves 360 degrees in 12 hours (or 0.5 degrees in 1
minute). In h hours and m minutes, the hour hand has moved (h*60 + m)*0.5 degrees from 12:00. The minute hand has moved (h*60 + m)*6 degrees in total, which leaves it at m*6 degrees from 12:00 once full turns are dropped.
// C program to find angle between hour and minute hands
#include <stdio.h>
#include <stdlib.h>
// Utility function to find minimum of two integers
int min(int x, int y) { return (x < y)? x: y; }
int calcAngle(double h, double m)
{
// validate the input
if (h <0 || m < 0 || h >12 || m > 60)
printf("Wrong input");
if (h == 12) h = 0;
if (m == 60) m = 0;
// Calculate the angles moved by hour and minute hands
// with reference to 12:00
int hour_angle = 0.5 * (h*60 + m);
int minute_angle = 6*m;
// Find the difference between two angles
int angle = abs(hour_angle - minute_angle);
// Return the smaller angle of two possible angles
angle = min(360-angle, angle);
return angle;
}
// Driver program to test above function
int main()
{
printf("%d \n", calcAngle(9, 60));
printf("%d \n", calcAngle(3, 30));
return 0;
}
Output:
90
75
Exercise: Find all times when hour and minute hands get superimposed.
This problem can be solved using Dynamic Programming. Let a[i] be the number of binary strings of length i which do not contain any two
consecutive 1s and which end in 0. Similarly, let b[i] be the number of such strings which end in 1. We can append either 0 or 1 to a string ending
in 0, but we can only append 0 to a string ending in 1. This yields the recurrence relation:
a[i] = a[i - 1] + b[i - 1]
b[i] = a[i - 1]
The base cases of above recurrence are a[1] = b[1] = 1. The total number of strings of length i is just a[i] + b[i].
Following is C++ implementation of above solution. In the following implementation, indexes start from 0. So a[i] represents the number of binary
strings for input length i+1. Similarly, b[i] represents binary strings for input length i+1.
// C++ program to count all distinct binary strings
// without two consecutive 1's
#include <iostream>
using namespace std;
int countStrings(int n)
{
int a[n], b[n];
a[0] = b[0] = 1;
for (int i = 1; i < n; i++)
{
a[i] = a[i-1] + b[i-1];
b[i] = a[i-1];
}
return a[n-1] + b[n-1];
}
// Driver program to test above functions
int main()
{
cout << countStrings(3) << endl;
return 0;
}
Output:
5
Source:
courses.csail.mit.edu/6.006/oldquizzes/solutions/q2-f2009-sol.pdf
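Since a[i] and b[i] depend only on a[i-1] and b[i-1], the two arrays are not strictly needed. The following is a minimal sketch of that idea, assuming n >= 1; the function name countStringsConstSpace is an assumption of this example.

```cpp
// Sketch: same recurrence as above, keeping only the previous values.
// countStringsConstSpace is a hypothetical name; n is assumed to be >= 1.
int countStringsConstSpace(int n)
{
    int endIn0 = 1; // strings of current length ending in 0
    int endIn1 = 1; // strings of current length ending in 1
    for (int len = 2; len <= n; len++)
    {
        int next0 = endIn0 + endIn1; // append 0 to any valid string
        int next1 = endIn0;          // append 1 only after a 0
        endIn0 = next0;
        endIn1 = next1;
    }
    return endIn0 + endIn1;
}
```

For n = 3 this returns 5, matching the array based program.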
printf("\n");
findSmallest(36);
printf("\n");
findSmallest(13);
printf("\n");
findSmallest(100);
return 0;
}
Output:
17
49
Not possible
455
Output:
2
_
6
3
5
7
Following is a simple C++ program to check whether a given instance of 8 puzzle is solvable or not. The idea is simple, we count inversions in the
given 8 puzzle.
// C++ program to check if a given instance of 8 puzzle is solvable or not
#include <iostream>
using namespace std;
// A utility function to count inversions in given array 'arr[]'
int getInvCount(int arr[])
{
int inv_count = 0;
for (int i = 0; i < 9 - 1; i++)
for (int j = i+1; j < 9; j++)
// Value 0 is used for empty space
if (arr[j] && arr[i] && arr[i] > arr[j])
inv_count++;
return inv_count;
}
// This function returns true if given 8 puzzle is solvable.
bool isSolvable(int puzzle[3][3])
{
// Count inversions in given 8 puzzle
int invCount = getInvCount((int *)puzzle);
// return true if inversion count is even.
return (invCount%2 == 0);
}
Output:
Solvable
Note that the above implementation uses simple algorithm for inversion count. It is done this way for simplicity. The code can be optimized to
O(nLogn) using the merge sort based algorithm for inversion count.
How does this work?
The idea is based on the fact that the parity of inversions remains the same after a set of moves, i.e., if the inversion count is odd in the initial state, then it
remains odd after any sequence of moves, and if the inversion count is even, then it remains even after any sequence of moves. In the goal state,
there are 0 inversions. So we can reach the goal state only from a state which has an even inversion count.
How is parity of inversion count invariant?
When we slide a tile, we either make a row move (moving a left or right tile into the blank space), or make a column move (moving an upper or lower
tile into the blank space).
a) A row move doesn't change the inversion count. See following examples.
1 2 3    Row Move      1 2 3
4 _ 5   ---------->    _ 4 5
8 6 7                  8 6 7
Inversion count remains 2 after the move
1 2 3    Row Move      1 2 3
4 _ 5   ---------->    4 5 _
8 6 7                  8 6 7
Inversion count remains 2 after the move
b) A column move can change the inversion count by 2. See following examples.
1 2 3   Column Move    1 _ 3
4 _ 5   ----------->   4 2 5
8 6 7                  8 6 7
Inversion count increases by 2 (changes from 2 to 4)
1 3 4   Column Move    1 3 4
5 _ 6   ------------>  5 2 6
7 2 8                  7 _ 8
Inversion count decreases by 2 (changes from 5 to 3)
So if a move either increases/decreases inversion count by 2, or keeps the inversion count same, then it is not possible to change parity of a state
by any sequence of row/column moves.
Exercise: How to check if a given instance of 15 puzzle is solvable or not? In a 15 puzzle, we have a 4x4 board where 15 tiles have a number and
one space is empty. Note that the above simple rules of inversion count don't directly work for 15 puzzle; the rules need to be modified for 15 puzzle.
Birthday Paradox
How many people must be there in a room to make the probability 100% that two people in the room have same birthday?
Answer: 367 (since there are 366 possible birthdays, including February 29).
The above question was simple. Try the below question yourself.
How many people must be there in a room to make the probability 50% that two people in the room have same birthday?
Answer: 23
The number is surprisingly very low. In fact, we need only 70 people to make the probability 99.9 %.
Let us discuss the generalized formula.
What is the probability that two persons among n have same birthday?
Let the probability that two people in a room with n have same birthday be P(same). P(Same) can be easily evaluated in terms of P(different)
where P(different) is the probability that all of them have different birthday.
P(same) = 1 - P(different)
P(different) can be written as 1 x (364/365) x (363/365) x (362/365) x ... x (1 - (n-1)/365)
How did we get the above expression?
Persons from first to last can get birthdays in following order for all birthdays to be distinct:
The first person can have any birthday among 365
The second person should have a birthday which is not same as first person
The third person should have a birthday which is not same as first two persons.
.
The nth person should have a birthday which is not same as any of the earlier considered (n-1) persons.
Approximation of above expression
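The above expression can be approximated using Taylor's Series.
e^x = 1 + x + x^2/2! + ...
provides a first-order approximation for e^x for x << 1:
e^x ≈ 1 + x
To apply this approximation to the first expression derived for p(different), set x = -a / 365. Thus,
e^(-a/365) ≈ 1 - a/365
Therefore,
p(different) ≈ 1 x e^(-1/365) x e^(-2/365) x ... x e^(-(n-1)/365)
= e^(-(1 + 2 + ... + (n-1))/365)
= e^(-n(n-1)/(2 x 365))
p(same) = 1 - p(different) ≈ 1 - e^(-n(n-1)/(2 x 365))
An even coarser approximation is given by
p(same) ≈ 1 - e^(-n^2/(2 x 365))
By taking Log on both sides, we get the reverse formula.
n ≈ sqrt(2 x 365 x ln(1/(1 - p(same))))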
Using the above approximate formula, we can approximate number of people for a given probability. For example the following C++ function
find() returns the smallest n for which the probability is greater than the given p.
C++ Implementation of approximate formula.
The following is C++ program to approximate number of people for a given probability.
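The original listing is not reproduced here; a minimal sketch of such a function, assuming the coarse approximation n ≈ sqrt(2 x 365 x ln(1/(1 - p))) and a probability p strictly between 0 and 1, could look like this.

```cpp
#include <cmath>

// Sketch: approximate number of people needed so that the probability of
// a shared birthday reaches p, using n ~ sqrt(2 * 365 * ln(1 / (1 - p))).
// Assumes 0 < p < 1; the result is rounded up to a whole person.
int find(double p)
{
    return static_cast<int>(std::ceil(std::sqrt(2 * 365 * std::log(1 / (1 - p)))));
}
```

For example, find(0.70) evaluates sqrt(730 x ln(1/0.3)), which is about 29.65, and returns 30; find(0.50) returns 23.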
Output:
30
Source:
http://en.wikipedia.org/wiki/Birthday_problem
Applications:
1) Birthday Paradox is generally discussed with hashing to show importance of collision handling even for a small set of keys.
2) Birthday Attack
A simple solution is to one by one consider every term of first polynomial and multiply it with every term of second polynomial. Following is
algorithm of this simple method.
multiply(A[0..m-1], B[0..n-1])
1) Create a product array prod[] of size m+n-1.
2) Initialize all entries in prod[] as 0.
3) Traverse array A[] and do following for every element A[i]
...(3.a) Traverse array B[] and do following for every element B[j]
prod[i+j] = prod[i+j] + A[i] * B[j]
4) Return prod[].
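The steps above can be sketched in C++ as follows. This is an illustrative version rather than the original listing; it assumes both coefficient arrays are non-empty and stores the coefficient of x^i at index i.

```cpp
#include <vector>

// Sketch of the simple O(m*n) method: A[i] is the coefficient of x^i.
// Assumes A and B are both non-empty.
std::vector<int> multiply(const std::vector<int> &A, const std::vector<int> &B)
{
    // Product of degrees (m-1) and (n-1) has m+n-1 coefficients
    std::vector<int> prod(A.size() + B.size() - 1, 0);
    for (std::size_t i = 0; i < A.size(); i++)
        for (std::size_t j = 0; j < B.size(); j++)
            prod[i + j] += A[i] * B[j]; // x^i * x^j contributes to x^(i+j)
    return prod;
}
```

Multiplying {5, 0, 10, 6} by {1, 2, 4} yields {5, 10, 30, 26, 52, 24}.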
Output
First polynomial is
5 + 0x^1 + 10x^2 + 6x^3
Second polynomial is
1 + 2x^1 + 4x^2
Product polynomial is
5 + 10x^1 + 30x^2 + 26x^3 + 52x^4 + 24x^5
Time complexity of the above solution is O(mn). If the sizes of the two polynomials are the same, then time complexity is O(n^2).
Can we do better?
There are methods to do multiplication faster than O(n^2) time. These methods are mainly based on divide and conquer. Following is one simple
method that divides the given polynomial (of degree n) into two polynomials, one containing lower degree terms (lower than n/2) and the other
containing higher degree terms (higher than or equal to n/2)
Let the two given polynomials be A and B.
For simplicity, Let us assume that the given two polynomials are of
same degree and have degree in powers of 2, i.e., n = 2^i
The polynomial 'A' can be written as A0 + A1*x^(n/2)
The polynomial 'B' can be written as B0 + B1*x^(n/2)
For example 1 + 10x + 6x^2 - 4x^3 + 5x^4 can be
written as (1 + 10x) + (6 - 4x + 5x^2)*x^2
A * B = (A0 + A1*x^(n/2)) * (B0 + B1*x^(n/2))
= A0*B0 + A0*B1*x^(n/2) + A1*B0*x^(n/2) + A1*B1*x^n
= A0*B0 + (A0*B1 + A1*B0)*x^(n/2) + A1*B1*x^n
So the above divide and conquer approach requires 4 multiplications and O(n) time to add all 4 results. Therefore the time complexity is T(n) =
4T(n/2) + O(n). The solution of the recurrence is O(n^2) which is same as the above simple solution.
The idea is to reduce number of multiplications to 3 and make the recurrence as T(n) = 3T(n/2) + O(n)
How to reduce number of multiplications?
This requires a little trick similar to Strassen's Matrix Multiplication. We do following 3 multiplications.
X = (A0 + A1)*(B0 + B1) // First Multiplication
Y = A0*B0 // Second
Z = A1*B1 // Third
The missing middle term in above multiplication equation A0*B0 + (A0*B1 +
A1*B0)*x^(n/2) + A1*B1*x^n can be obtained using below.
A0*B1 + A1*B0 = X - Y - Z
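The three-multiplication idea can be sketched recursively as below. This is an illustrative sketch, not a tuned implementation: it assumes both inputs have the same number of coefficients, that this number is a power of 2 (pad with zeros otherwise), and that index i holds the coefficient of x^i. The names Poly and karatsuba are assumptions of this example.

```cpp
#include <vector>

typedef std::vector<long long> Poly; // Poly[i] is the coefficient of x^i

// Sketch: multiplies a and b using 3 recursive multiplications.
// Assumes a.size() == b.size() and the size is a power of 2.
Poly karatsuba(const Poly &a, const Poly &b)
{
    std::size_t n = a.size();
    Poly prod(2 * n - 1, 0);
    if (n == 1)
    {
        prod[0] = a[0] * b[0];
        return prod;
    }
    std::size_t h = n / 2;
    // Split into lower halves (A0, B0) and upper halves (A1, B1)
    Poly a0(a.begin(), a.begin() + h), a1(a.begin() + h, a.end());
    Poly b0(b.begin(), b.begin() + h), b1(b.begin() + h, b.end());
    Poly sa(h), sb(h);
    for (std::size_t i = 0; i < h; i++)
    {
        sa[i] = a0[i] + a1[i];
        sb[i] = b0[i] + b1[i];
    }
    Poly x = karatsuba(sa, sb); // X = (A0 + A1)*(B0 + B1)
    Poly y = karatsuba(a0, b0); // Y = A0*B0
    Poly z = karatsuba(a1, b1); // Z = A1*B1
    for (std::size_t i = 0; i < 2 * h - 1; i++)
    {
        prod[i] += y[i];                      // A0*B0
        prod[i + h] += x[i] - y[i] - z[i];    // (A0*B1 + A1*B0) * x^(n/2)
        prod[i + 2 * h] += z[i];              // A1*B1 * x^n
    }
    return prod;
}
```

With a = {5, 0, 10, 6} and b = {1, 2, 4, 0} (the second polynomial padded with a zero), the result is {5, 10, 30, 26, 52, 24, 0}.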
Count Distinct Non-Negative Integer Pairs (x, y) that Satisfy the Inequality x*x + y*y < n
Given a positive number n, count all distinct Non-Negative Integer pairs (x, y) that satisfy the inequality x*x + y*y < n.
Examples:
Input: n = 5
Output: 6
The pairs are (0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (0, 2)
Input: n = 6
Output: 8
The pairs are (0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (0, 2),
(1, 2), (2, 1)
A Simple Solution is to run two loops. The outer loop goes for all possible values of x (from 0 to √n). The inner loop picks all possible values of
y for the current value of x (picked by outer loop). Following is C++ implementation of simple solution.
#include <iostream>
using namespace std;
// This function counts number of pairs (x, y) that satisfy
// the inequality x*x + y*y < n.
int countSolutions(int n)
{
int res = 0;
for (int x = 0; x*x < n; x++)
for (int y = 0; x*x + y*y < n; y++)
res++;
return res;
}
// Driver program to test above function
int main()
{
cout << "Total Number of distinct Non-Negative pairs is "
<< countSolutions(6) << endl;
return 0;
}
Output:
Total Number of distinct Non-Negative pairs is 8
An upper bound for time complexity of the above solution is O(n). The outer loop runs √n times. The inner loop runs at most √n times.
Using an Efficient Solution, we can find the count in O(√n) time. The idea is to first find the count of all y values corresponding to the 0 value of x.
Let the count of distinct y values be yCount. We can find yCount by running a loop and comparing yCount*yCount with n.
After we have the initial yCount, we can one by one increase the value of x and find the next value of yCount by reducing yCount.
// An efficient C++ program to find different (x, y) pairs that
// satisfy x*x + y*y < n.
#include <iostream>
using namespace std;
// This function counts number of pairs (x, y) that satisfy
// the inequality x*x + y*y < n.
int countSolutions(int n)
{
int x = 0, yCount, res = 0;
// Find the count of different y values for x = 0.
for (yCount = 0; yCount*yCount < n; yCount++) ;
// One by one increase value of x, and find yCount for
// current x. If yCount becomes 0, then we have reached
// maximum possible value of x.
while (yCount != 0)
{
// Add yCount (count of different possible values of y
// for current x) to result
res += yCount;
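// Increment x
x++;
// Update yCount for the current x: keep reducing yCount
// while the inequality is not satisfied
while (yCount != 0 && (x*x + (yCount-1)*(yCount-1) >= n))
yCount--;
}
return res;
}
// Driver program to test above function
int main()
{
cout << "Total Number of distinct Non-Negative pairs is "
<< countSolutions(6) << endl;
return 0;
}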
Output:
Total Number of distinct Non-Negative pairs is 8
Time Complexity of the above solution seems more, but if we take a closer look, we can see that it is O(√n). In every step inside the inner loop,
the value of yCount is decremented by 1. The value of yCount can decrement at most O(√n) times, as yCount starts as the count of y values for x = 0. In the outer
loop, the value of x is incremented. The value of x can also increment at most O(√n) times, as the last x is the one for which yCount equals 1.
Consider the example shown in diagram. The value of n is 3. There are 3 ways to reach the top. The diagram is taken from Easier Fibonacci
puzzles
More Examples:
Input: n = 1
Output: 1
There is only one way to climb 1 stair
Input: n = 2
Output: 2
There are two ways: (1, 1) and (2)
Input: n = 4
Output: 5
(1, 1, 1, 1), (1, 1, 2), (2, 1, 1), (1, 2, 1), (2, 2)
We can easily find the recursive nature of the above problem. The person can reach the n'th stair from either the (n-1)'th stair or the (n-2)'th stair. Let the total
number of ways to reach the n'th stair be ways(n). The value of ways(n) can be written as following.
ways(n) = ways(n-1) + ways(n-2)
The above expression is actually the expression for Fibonacci numbers, but there is one thing to notice, the value of ways(n) is equal to
fibonacci(n+1).
ways(1) = fib(2) = 1
ways(2) = fib(3) = 2
ways(3) = fib(4) = 3
So we can use a function for fibonacci numbers to find the value of ways(n). Following is C implementation of the above idea.
// A C program to count number of ways to reach n'th stair when
// a person can climb 1 or 2 stairs at a time.
#include<stdio.h>
// A simple recursive program to find n'th fibonacci number
int fib(int n)
{
if (n <= 1)
return n;
return fib(n-1) + fib(n-2);
}
// Returns number of ways to reach s'th stair
int countWays(int s)
{
return fib(s + 1);
}
// Driver program to test above functions
int main ()
{
int s = 4;
printf("Number of ways = %d", countWays(s));
getchar();
return 0;
}
Output:
Number of ways = 5
The time complexity of the above implementation is exponential (golden ratio raised to power n). It can be optimized to work in O(Logn) time
using the previously discussed Fibonacci function optimizations.
Generalization of the above problem
How to count number of ways if the person can climb up to m stairs for a given value m? For example if m is 4, the person can climb 1 stair or 2
stairs or 3 stairs or 4 stairs at a time.
We can write the recurrence as following.
ways(n, m) = ways(n-1, m) + ways(n-2, m) + ... ways(n-m, m)
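The recurrence can be written directly as a recursive function. The following is a minimal sketch, not the original listing; the name countWaysRec is an assumption of this example, and ways(0, m) is taken to be 1 (there is one way to stay at the ground).

```cpp
// Sketch: direct recursion on ways(n, m) = ways(n-1, m) + ... + ways(n-m, m).
// countWaysRec is a hypothetical name; n is the number of stairs and m is
// the maximum number of stairs climbed in one step.
int countWaysRec(int n, int m)
{
    if (n == 0)
        return 1; // nothing left to climb
    int total = 0;
    // The last step may cover j stairs for every j from 1 to min(m, n)
    for (int j = 1; j <= m && j <= n; j++)
        total += countWaysRec(n - j, m);
    return total;
}
```

For n = 4 and m = 2 this returns 5, the same value as the Fibonacci based solution.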
Output:
Number of ways = 5
The time complexity of above solution is exponential. It can be optimized to O(mn) by using dynamic programming. Following is dynamic
programming based solution. We build a table res[] in bottom up manner.
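// A C program to count number of ways to reach n'th stair when
// a person can climb 1, 2, ..m stairs at a time
#include<stdio.h>
// A table based (bottom up) function used by countWays. It is called
// with n = s + 1, and res[i] stores the number of ways to reach i'th stair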
int countWaysUtil(int n, int m)
{
int res[n];
res[0] = 1; res[1] = 1;
for (int i=2; i<n; i++)
{
res[i] = 0;
for (int j=1; j<=m && j<=i; j++)
res[i] += res[i-j];
}
return res[n-1];
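}
// Returns number of ways to reach s'th stair
int countWays(int s, int m)
{
return countWaysUtil(s+1, m);
}
// Driver program to test above functions (example values assumed:
// s = 4 stairs, at most m = 2 stairs at a time)
int main ()
{
int s = 4, m = 2;
printf("Number of ways = %d", countWays(s, m));
return 0;
}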
Output:
Number of ways = 5
Output:
15125
Output:
First polynomial is
5 + 0x^1 + 10x^2 + 6x^3
Second polynomial is
1 + 2x^1 + 4x^2
Sum polynomial is
6 + 2x^1 + 14x^2 + 6x^3
Time complexity of the above algorithm and program is O(m+n) where m and n are orders of two given polynomials.
Let us consider an example n = 7, k = 3. The first digit of 1/7 is 1; it can be obtained by taking the integer value of 10/7. Remainder of 10/7 is 3. The next
digit is 4, which can be obtained by taking the integer value of 30/7. Remainder of 30/7 is 2. The next digit is 2, which can be obtained by taking the integer
value of 20/7.
#include <iostream>
using namespace std;
// Function to print first k digits after dot in value
// of 1/n. n is assumed to be a positive integer.
void print(int n, int k)
{
int rem = 1; // Initialize remainder
// Run a loop k times to print k digits
for (int i = 0; i < k; i++)
{
// The next digit can always be obtained as
// doing (10*rem)/n
cout << (10 * rem) / n;
// Update remainder
rem = (10*rem) % n;
}
}
// Driver program to test above function
int main()
{
int n = 7, k = 3;
print(n, k);
cout << endl;
n = 21, k = 4;
print(n, k);
return 0;
}
Output:
142
0476
Reference:
Algorithms And Programming: Problems And Solutions by Alexander Shen
All digits of a number recursively add up to 9 if and only if the number is a multiple of 9. We basically need to check s%9 for all substrings s. One
trick used in the program below is to do modular arithmetic to avoid overflow for big strings.
Following is a simple implementation based on this approach. The implementation assumes that there are no leading 0s in input number.
// C++ program to count substrings with recursive sum equal to 9
#include <iostream>
#include <cstring>
using namespace std;
int count9s(const char number[])
{
int count = 0; // To store result
int n = strlen(number);
// Consider every character as beginning of substring
for (int i = 0; i < n; i++)
{
int sum = number[i] - '0'; //sum of digits in current substring
if (number[i] == '9') count++;
// One by one choose every character as an ending character
for (int j = i+1; j < n; j++)
{
// Add current digit to sum, if sum becomes multiple of 9
// then increment count. Let us do modular arithmetic to
// avoid overflow for big strings
sum = (sum + number[j] - '0')%9;
if (sum == 0)
count++;
}
}
return count;
}
// driver program to test above function
int main()
{
cout << count9s("4189") << endl;
cout << count9s("1809");
return 0;
}
Output:
3
5
Time complexity of the above program is O(n^2).
We can use sorting to do it in O(nLogn) time. We can also use hashing, but the worst case time complexity of hashing may be more than O(n) and
hashing requires extra space.
The idea is to use bitwise operators for a solution that is O(n) time and uses O(1) extra space. The solution is not easy like other XOR based
solutions, because all elements appear odd number of times here. The idea is taken from here.
Run a loop for all elements in array. At the end of every iteration, maintain following two values.
ones: The bits that have appeared 1st time or 4th time or 7th time .. etc.
twos: The bits that have appeared 2nd time or 5th time or 8th time .. etc.
Finally, we return the value of ones
How to maintain the values of ones and twos?
ones and twos are initialized as 0. For every new element in array, find out the common set bits in the new element and previous value of ones.
These common set bits are actually the bits that should be added to twos. So do bitwise OR of the common set bits with twos. twos also gets
some extra bits that appear third time. These extra bits are removed later.
Update ones by doing XOR of new element with previous value of ones. There may be some bits which appear 3rd time. These extra bits are also
removed later.
Both ones and twos contain those extra bits which appear 3rd time. Remove these extra bits by finding out common set bits in ones and twos.
#include <stdio.h>
int getSingle(int arr[], int n)
{
int ones = 0, twos = 0 ;
int common_bit_mask;
// Let us take the example of {3, 3, 2, 3} to understand this
for( int i=0; i< n; i++ )
{
/* The expression "ones & arr[i]" gives the bits that are
there in both 'ones' and new element from arr[]. We
add these bits to 'twos' using bitwise OR
Value of 'twos' will be set as 0, 3, 3 and 1 after 1st,
2nd, 3rd and 4th iterations respectively */
twos = twos | (ones & arr[i]);
/* XOR the new bits with previous 'ones' to get all bits
appearing odd number of times
Value of 'ones' will be set as 3, 0, 2 and 3 after 1st,
2nd, 3rd and 4th iterations respectively */
ones = ones ^ arr[i];
/* The common bits are those bits which appear third time
So these bits should not be there in both 'ones' and 'twos'.
common_bit_mask contains all these bits as 0, so that the bits can
be removed from 'ones' and 'twos'
Value of the common bits (ones & twos) will be 00, 00, 10 and 01
after 1st, 2nd, 3rd and 4th iterations respectively, and
common_bit_mask is the complement of that value */
common_bit_mask = ~(ones & twos);
/* Remove common bits (the bits that appear third time) from 'ones'
Value of 'ones' will be set as 3, 0, 0 and 2 after 1st,
2nd, 3rd and 4th iterations respectively */
ones &= common_bit_mask;
/* Remove common bits (the bits that appear third time) from 'twos'
Value of 'twos' will be set as 0, 3, 1 and 0 after 1st,
2nd, 3rd and 4th iterations respectively */
twos &= common_bit_mask;
// uncomment this code to see intermediate values
//printf (" %d %d \n", ones, twos);
}
return ones;
}
int main()
{
int arr[] = {3, 3, 2, 3};
int n = sizeof(arr) / sizeof(arr[0]);
printf("The element with single occurrence is %d ",
getSingle(arr, n));
return 0;
}
Output:
The element with single occurrence is 2
}
7
Output:
Signs are opposite
The first method is more efficient. The first method uses a bitwise XOR and a comparison operator. The second method uses two comparison
operators, and a bitwise XOR operation is more efficient than a comparison operation.
We can also use the following method. It doesn't use any comparison operator. The method is suggested by Hongliang and improved by gaurav.
bool oppositeSigns(int x, int y)
{
return ((x ^ y) >> 31);
}
The function is written only for compilers where the size of an integer is 32 bits. The expression basically checks the sign of (x^y) using the bitwise operator
>>. As mentioned above, the sign bit for negative numbers is always 1. The sign bit is the leftmost bit in binary representation. So we need to
check whether the 32nd bit (or leftmost bit) of x^y is 1 or not. We do it by right shifting the value of x^y by 31, so that the sign bit becomes the
least significant bit. If the sign bit is 1, then the value of (x^y)>>31 is non-zero (1 with a logical shift, -1 with the arithmetic shift that most compilers use for signed integers), so the function returns true; otherwise the value is 0.
Output:
Total set bit count is 6
Method 2 (Tricky)
If the input number is of the form 2^b - 1, e.g., 1, 3, 7, 15, etc., the number of set bits is b * 2^(b-1). This is because for all the numbers 0 to 2^b - 1, if you complement and flip the list you end up with the same list (half the bits are on, half off).
If the number does not have all set bits, then some position m is the position of the leftmost set bit. The number of set bits in that position is n - (1 << m)
+ 1. The remaining set bits are in two parts: 1) The bits in the (m-1) positions down to the point where the leftmost bit becomes 0, and 2) The
2^(m-1) numbers below that point, which is the closed form above. An easy way to look at it is to consider the number 6:
0|0 0
0|0 1
0|1 0
0|1 1
-|---
1|0 0
1|0 1
1|1 0
The leftmost set bit is in position 2 (positions are considered starting from 0). If we mask that off what remains is 2 (the "1 0" in the right part of the
last row). So the number of bits in the 2nd position (the lower left box) is 3 (that is, 2 + 1). The set bits from 0-3 (the upper right box above) is
2*2^(2-1) = 4. The box in the lower right is the remaining bits we haven't yet counted, and is the number of set bits for all the numbers up to 2 (the
value of the last entry in the lower right box) which can be figured recursively.
// A O(Logn) complexity program to count set bits in all numbers from 1 to n
#include <stdio.h>
/* Returns position of leftmost set bit. The rightmost
position is considered as 0 */
unsigned int getLeftmostBit (int n)
{
int m = 0;
while (n > 1)
{
n = n >> 1;
m++;
}
return m;
}
/* Given the position of previous leftmost set bit in n (or an upper
bound on leftmost position) returns the new position of leftmost
set bit in n */
unsigned int getNextLeftmostBit (int n, int m)
{
unsigned int temp = 1 << m;
while (n < temp)
{
temp = temp >> 1;
m--;
}
return m;
}
// The main recursive function used by countSetBits()
unsigned int _countSetBits(unsigned int n, int m);
// Returns count of set bits present in all numbers from 1 to n
unsigned int countSetBits(unsigned int n)
{
// Get the position of leftmost set bit in n. This will be
// used as an upper bound for next set bit function
int m = getLeftmostBit (n);
// Use the position
return _countSetBits (n, m);
}
unsigned int _countSetBits(unsigned int n, int m)
{
// Base Case: if n is 0, then set bit count is 0
if (n == 0)
return 0;
/* get position of next leftmost set bit */
m = getNextLeftmostBit(n, m);
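// If n is of the form 2^x - 1, i.e., if n is like 1, 3, 7, 15, 31, etc.,
// then we are done. Since positions are considered starting from 0,
// 1 is added to m
if (n == ((unsigned int)1 << (m+1)) - 1)
return (unsigned int)(m+1) * (1 << m);
// Update n for the next recursive call
n = n - (1 << m);
// (n+1) set bits in position m, plus the set bits of the numbers 1 to n,
// plus the m * 2^(m-1) set bits of the numbers below 2^m
return (n+1) + _countSetBits(n, m) + m * (1 << (m-1));
}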
Time Complexity: O(Logn). From the first look at the implementation, time complexity looks more. But if we take a closer look, statements inside
while loop of getNextLeftmostBit() are executed for all 0 bits in n. And the number of times recursion is executed is less than or equal to set bits in
n. In other words, if the control goes inside while loop of getNextLeftmostBit(), then it skips those many bits in recursion.
Thanks to agatsu and IC for suggesting this solution.
See this for another solution suggested by Piyush Kapoor.
Solution
We need to swap two sets of bits. XOR can be used in a similar way as it is used to swap 2 numbers. Following is the algorithm.
1) Move all bits of first set to rightmost side
set1 = (x >> p1) & ((1U << n) - 1)
Here the expression (1U << n) - 1 gives a number that
contains last n bits set and other bits as 0. We do &
with this expression so that bits other than the last
n bits become 0.
2) Move all bits of second set to rightmost side
set2 = (x >> p2) & ((1U << n) - 1)
3) XOR the two sets of bits
xor = (set1 ^ set2)
4) Put the xor bits back to their original positions.
xor = (xor << p1) | (xor << p2)
5) Finally, XOR the xor with original number so
that the two sets are swapped.
result = x ^ xor
Implementation:
#include<stdio.h>
int swapBits(unsigned int x, unsigned int p1, unsigned int p2, unsigned int n)
{
/* Move all bits of first set to rightmost side */
unsigned int set1 = (x >> p1) & ((1U << n) - 1);
/* Move all bits of second set to rightmost side */
unsigned int set2 = (x >> p2) & ((1U << n) - 1);
/* XOR the two sets */
unsigned int xor = (set1 ^ set2);
/* Put the xor bits back to their original positions */
xor = (xor << p1) | (xor << p2);
/* XOR the 'xor' with the original number so that the
two sets are swapped */
unsigned int result = x ^ xor;
return result;
}
/* Driver program to test above function*/
int main()
{
Output:
Result = 7
References:
Swapping individual bits with XOR
This method doesn't work for negative numbers. Method 2 works for negative numbers also.
Method 2
void changeToZero(int a[2])
{
a[ !a[0] ] = a[ !a[1] ];
}
Method 3
This method doesn't even need complement.
void changeToZero(int a[2])
{
a[ a[1] ] = a[ a[0] ];
}
Method 4
Thanks to purvi for suggesting this method.
void changeToZero(int a[2])
{
a[0] = a[a[0]];
a[1] = a[0];
}
2
3
3
4
3
4
4
5
---------------------------------------------------------
GROUP_A(0)
GROUP_A(1)
GROUP_A(1)
GROUP_A(2)
GROUP_A(1)
GROUP_A(2)
GROUP_A(2)
GROUP_A(3) ... so on
From the table, there is a pattern emerging in multiples of 4, both in the table and in the group parameter. The sequence can be generalized as
shown in the code.
Complexity:
All the operations take O(1) time except iterating over the array. The time complexity is O(n), where n is the size of the array. Space complexity depends on
the meta program that generates the lookup table.
Code:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
/* Size of array 64 K */
#define SIZE (1 << 16)
/* Meta program that generates set bit count
array of first 256 integers */
/* GROUP_A - When combined with META_LOOK_UP
generates count for 4x4 elements */
#define GROUP_A(x) x, x + 1, x + 1, x + 2
/* GROUP_B - When combined with META_LOOK_UP
generates count for 4x4x4 elements */
#define GROUP_B(x) GROUP_A(x), GROUP_A(x+1), GROUP_A(x+1), GROUP_A(x+2)
/* GROUP_C - When combined with META_LOOK_UP
generates count for 4x4x4x4 elements */
#define GROUP_C(x) GROUP_B(x), GROUP_B(x+1), GROUP_B(x+1), GROUP_B(x+2)
/* Provide appropriate letter to generate the table */
#define META_LOOK_UP(PARAMETER) \
GROUP_##PARAMETER(0), \
GROUP_##PARAMETER(1), \
GROUP_##PARAMETER(1), \
GROUP_##PARAMETER(2)
int countSetBits(int array[], size_t array_size)
{
int count = 0;
/* META_LOOK_UP(C) - generates a table of 256 integers whose
sequence will be number of bits in i-th position
where 0 <= i < 256
*/
/* A static table will be much faster to access */
static unsigned char const look_up[] = { META_LOOK_UP(C) };
/* No shifting funda (for better readability) */
unsigned char *pData = NULL;
for(size_t index = 0; index < array_size; index++)
{
/* It is fine, bypass the type system */
pData = (unsigned char *)&array[index];
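/* Count set bits in each of the four bytes of the int
(assumes sizeof(int) == 4) */
count += look_up[pData[0]];
count += look_up[pData[1]];
count += look_up[pData[2]];
count += look_up[pData[3]];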
}
return count;
}
/* Driver program, generates table of random 64 K numbers */
int main()
{
int index;
int random[SIZE];
/* Seed to the random-number generator */
srand((unsigned)time(0));
/* Generate random numbers. */
for( index = 0; index < SIZE; index++ )
{
random[index] = rand();
}
printf("Total number of bits = %d\n", countSetBits(random, SIZE));
return 0;
}
x = 10011100 (base 2)
10011100
00011100 - rightmost string of 1's in x
00000011 - right-shifted pattern except leftmost bit ------> [A]
00010000 - isolated leftmost bit of rightmost 1's pattern
00100000 - isolated bit shifted left by one position ------> [B]
10000000 - left part of x, excluding rightmost 1's pattern ------> [C]
10100000 - add B and C (OR operation) ------> [D]
10100011 - add A and D, which is the required number 163 (base 10)
After practicing with a few examples, it is easy to understand. Use the program given below to generate more sets.
Program Design:
We need to note a few facts about binary numbers. The expression x & -x isolates the rightmost set bit in x (provided x uses 2's complement form for
negative numbers). If we add the result to x, the rightmost string of 1's in x is reset, and the 0 immediately to the left of this pattern of 1's is set, which
is part [B] of the above explanation. For example, if x = 156, x & -x results in 00000100, and adding this result to x yields 10100000 (see part D). We are
left with the right-shifting part of the pattern of 1's (part A of the above explanation).
There are different ways to achieve part A. Right shifting is essentially a division operation. What should our divisor be? Clearly, it should be a
power of 2 (this avoids the 0.5 error in right shifting), and it should shift the rightmost 1's pattern to the right extreme. The expression (x & -x) serves the
purpose of the divisor. An XOR operation between the number x and the expression used to reset the rightmost bits isolates the rightmost 1's
pattern.
A Correction Factor:
Note that we are adding the rightmost set bit to the bit pattern. The addition causes a shift in the bit positions. The weight of the binary system is
2, so one shift causes an increase by a factor of 2. Since the increased number (rightOnesPattern in the code) is used twice, the error
propagates twice and needs to be corrected. A right shift by 2 positions corrects the result.
The popular name for this program is "same number of one bits" (snoob).
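The steps described above (isolate the rightmost set bit, add it, XOR to get the ones pattern, divide, then shift right by 2) can be sketched in plain C as follows. This is an illustrative sketch, not the original listing. It assumes unsigned arithmetic, returns 0 for an input of 0, and does not detect the case where no larger number with the same bit count fits in an unsigned int.

```c
/* Sketch: next higher number with the same number of set bits as x. */
unsigned int snoob(unsigned int x)
{
    unsigned int rightOne;          /* rightmost set bit of x */
    unsigned int nextHigherOneBit;  /* x with its rightmost run of 1's carried left */
    unsigned int rightOnesPattern;  /* the run of 1's, later right-adjusted */

    if (x == 0)
        return 0;

    /* 0u - x is the unsigned equivalent of -x */
    rightOne = x & (0u - x);

    /* Reset the rightmost run of 1's and set the 0 to its left (part D) */
    nextHigherOneBit = x + rightOne;

    /* Isolate the run of 1's together with the newly set bit */
    rightOnesPattern = x ^ nextHigherOneBit;

    /* Right-adjust the pattern, then drop two bits (the correction factor) */
    rightOnesPattern = (rightOnesPattern / rightOne) >> 2;

    /* Combine the left part with the right-adjusted 1's (part A) */
    return nextHigherOneBit | rightOnesPattern;
}
```

With x = 156 (10011100), the intermediate values are rightOne = 4, nextHigherOneBit = 160, and rightOnesPattern = 3, so the result is 163 (10100011), matching the worked example.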
#include<iostream>
using namespace std;
typedef unsigned int uint_t;
// this function returns next higher number with same number of set bits as x.
uint_t snoob(uint_t x)
{
uint_t rightOne;
uint_t nextHigherOneBit;
uint_t rightOnesPattern;
uint_t next = 0;
if(x)
{
// right most set bit
rightOne = x & -(signed)x;
A modulus operation by an exact power of 2 can be done with a simple and faster bitwise AND. This is the reason programmers usually make buffer lengths
powers of 2.
Note that the technique works only for divisors that are powers of 2.
An Example:
Implementation of a circular queue (ring buffer) using an array. Omitting one position in the circular buffer implementation makes it easy to
distinguish between the full and empty conditions. When the buffer index reaches SIZE-1, it needs to wrap back to the initial position. The wrap-back
operation can be a simple AND operation if the buffer size is a power of 2. If we use any other size, we need to use the modulus operation.
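The wrap-around described above might look like the following sketch. The type RingBuffer, the constants RB_SIZE and RB_MASK, and the rb_* function names are all invented for this illustration. One slot is left unused so that full and empty can be told apart, as the paragraph suggests.

```c
#define RB_SIZE 8u               /* must be a power of 2 */
#define RB_MASK (RB_SIZE - 1u)   /* x & RB_MASK equals x % RB_SIZE */

typedef struct
{
    int data[RB_SIZE];
    unsigned int head;   /* index of the next slot to write */
    unsigned int tail;   /* index of the next slot to read */
} RingBuffer;

static int rb_is_empty(const RingBuffer *rb)
{
    return rb->head == rb->tail;
}

/* Full when advancing head would make it collide with tail */
static int rb_is_full(const RingBuffer *rb)
{
    return ((rb->head + 1u) & RB_MASK) == rb->tail;
}

/* Returns 1 on success, 0 if the buffer is full */
static int rb_push(RingBuffer *rb, int value)
{
    if (rb_is_full(rb))
        return 0;
    rb->data[rb->head] = value;
    rb->head = (rb->head + 1u) & RB_MASK;   /* wrap with AND, no modulus */
    return 1;
}

/* Returns 1 on success, 0 if the buffer is empty */
static int rb_pop(RingBuffer *rb, int *out)
{
    if (rb_is_empty(rb))
        return 0;
    *out = rb->data[rb->tail];
    rb->tail = (rb->tail + 1u) & RB_MASK;
    return 1;
}
```

With RB_SIZE set to 8, the buffer holds at most 7 elements. Changing RB_SIZE to a value that is not a power of 2 would silently break the masking, which is why the sketch documents that requirement next to the constant.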
Note:
Per experts' comments, premature optimization is an evil. The optimization techniques provided are for fine-tuning your code
after finalizing the design strategy, algorithm, data structures and implementation. We recommend avoiding them at the start of code development. Code
readability is key for maintenance.
Method 2
We know that negative numbers are represented in 2's complement form on most architectures. The following lemma holds for the 2's
complement representation of signed numbers.
Say x is the numerical value of a number, then
~x = -(x+1) [ ~ is for bitwise complement ]
(x + 1) is due to the addition of 1 in 2's complement conversion
To get (x + 1), apply negation once again. So, the final expression becomes (-(~x)).
int addOne(int x)
{
return (-(~x));
}
/* Driver program to test above functions*/
int main()
{
printf("%d", addOne(13));
getchar();
return 0;
}
Example: assume the machine word length is one *nibble* for simplicity.
And x = 2 (0010),
~x = ~2 = 1101 (13 numerical)
-~x = -1101
Interpreting bits 1101 in 2's complement form yields the numerical value -(2^4 - 13) = -3. Applying - to the result leaves 3. The same analogy holds for
decrement. See this comment for implementation of decrement.
Note that this method works only if the numbers are stored in 2's complement form.
Thanks to Venki for suggesting this method.
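For the decrement case, one possible sketch applies the same lemma in the other order: since ~y = -(y + 1), substituting y = -x gives ~(-x) = x - 1. The function name subtractOne is chosen here for illustration. Like addOne(), it relies on 2's complement representation, and it is not meaningful for the most negative int, where -x overflows.

```c
/* Sketch: returns x - 1 using only negation and bitwise complement.
   Assumes 2's complement; undefined for x == INT_MIN because -x overflows. */
int subtractOne(int x)
{
    return ~(-x);
}
```

For example, subtractOne(13) negates 13 to get -13 and then complements it, which yields 12.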
2. Another way of doing this could be (8*x - x)/2 (see the code below). Thanks to ajaym for suggesting this.
#include <stdio.h>
int multiplyWith3Point5(int x)
{
return ((x<<3) - x)>>1;
}
Let the input number be n. n-1 would have all the bits flipped after the rightmost set bit (including the set bit). So, doing n&(n-1) would give us the
required result.
#include<stdio.h>
/* unsets the rightmost set bit of n and returns the result */
int fun(unsigned int n)
{
return n&(n-1);
}
/* Driver program to test above function */
int main()
{
int n = 7;
printf("The number after unsetting the rightmost set bit %d", fun(n));
getchar();
return 0;
}
2) For negative numbers, above step sets mask as 1 1 1 1 1 1 1 1 and 0 0 0 0 0 0 0 0 for positive numbers. Add the mask to the given number.
mask + n
Implementation:
#include <stdio.h>
#define CHAR_BIT 8
/* This function will return absolute value of n*/
unsigned int getAbs(int n)
{
int const mask = n >> (sizeof(int) * CHAR_BIT - 1);
return ((n + mask) ^ mask);
}
/* Driver program to test above function */
int main()
{
int n = -6;
printf("Absolute value of %d is %u", n, getAbs(n));
getchar();
return 0;
}
Method 2:
1) Set the mask as right shift of integer by 31 (assuming integers are stored using 32 bits).
mask = n>>31
Implementation:
/* This function will return absolute value of n*/
unsigned int getAbs(int n)
{
int const mask = n >> (sizeof(int) * CHAR_BIT - 1);
return ((n ^ mask) - mask);
}
On machines where branching is expensive, the above expression can be faster than the obvious approach, r = (v < 0) ? -(unsigned)v : v, even
though the number of operations is the same. Please see this for more details about the above two methods.
References:
http://graphics.stanford.edu/~seander/bithacks.html#IntegerAbs
References:
http://graphics.stanford.edu/~seander/bithacks.html#ModulusDivisionEasy
Below are the methods to get minimum(or maximum) without using branching. Typically, the obvious approach is best, though.
Method 1(Use XOR and comparison operator)
Minimum of x and y will be
y ^ ((x ^ y) & -(x < y))
It works because if x < y, then -(x < y) will be all ones, so r = y ^ (x ^ y) & ~0 = y ^ x ^ y = x. Otherwise, if x >= y, then -(x < y) will be all
zeros, so r = y ^ ((x ^ y) & 0) = y. On some machines, evaluating (x < y) as 0 or 1 requires a branch instruction, so there may be no advantage.
To find the maximum, use
x ^ ((x ^ y) & -(x < y));
#include<stdio.h>
/*Function to find minimum of x and y*/
int min(int x, int y)
{
return y ^ ((x ^ y) & -(x < y));
}
/*Function to find maximum of x and y*/
int max(int x, int y)
{
return x ^ ((x ^ y) & -(x < y));
}
/* Driver program to test above functions */
int main()
{
int x = 15;
int y = 6;
printf("Minimum of %d and %d is ", x, y);
printf("%d", min(x, y));
printf("\nMaximum of %d and %d is ", x, y);
printf("%d", max(x, y));
getchar();
}
, then we can use the following, which are faster because (x - y) only needs to be evaluated once.
Minimum of x and y will be
y + ((x - y) & ((x - y) >>(sizeof(int) * CHAR_BIT - 1)))
This method shifts the difference of x and y right by 31 (if the size of an integer is 32 bits). If (x-y) is smaller than 0, then (x-y)>>31 will be all ones (-1, assuming an arithmetic right shift). If (x-y) is greater
than or equal to 0, then (x-y)>>31 will be 0.
So if x >= y, we get the minimum as y + ((x-y) & 0), which is y.
If x < y, we get the minimum as y + ((x-y) & ~0), which is y + (x-y) = x. Similarly, to find the maximum use
x - ((x - y) & ((x - y) >> (sizeof(int) * CHAR_BIT - 1)))
#include<stdio.h>
#define CHAR_BIT 8
/*Function to find minimum of x and y*/
int min(int x, int y)
{
return y + ((x - y) & ((x - y) >>
(sizeof(int) * CHAR_BIT - 1)));
}
Note that the 1989 ANSI C specification doesn't specify the result of signed right-shift, so above method is not portable. If exceptions are thrown
on overflows, then the values of x and y should be unsigned or cast to unsigned for the subtractions to avoid unnecessarily throwing an exception,
however the right-shift needs a signed operand to produce all one bits when negative, so cast to signed there.
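The maximum expression given above can be sketched as a function in the same style as min(). The name maxNoBranch is made up for this illustration. The sketch takes CHAR_BIT from limits.h and assumes that right-shifting a negative int is an arithmetic shift (implementation-defined in C) and that x - y does not overflow.

```c
#include <limits.h>

/* Sketch: maximum of x and y without branching.
   Assumes arithmetic right shift of negative ints and no overflow in x - y. */
int maxNoBranch(int x, int y)
{
    int diff = x - y;
    /* mask is all ones when diff < 0, all zeros otherwise */
    int mask = diff >> (sizeof(int) * CHAR_BIT - 1);
    /* if x < y: x - (x - y) = y; otherwise: x - 0 = x */
    return x - (diff & mask);
}
```

For x = 6 and y = 15, diff is -9 and mask is all ones, so the function returns 6 - (-9) = 15.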
Source:
http://graphics.stanford.edu/~seander/bithacks.html#IntegerMinOrMax
All the bits that are set in xor will be set in one non-repeating element (x or y) and not in the other. So we take any set bit of xor and divide the
elements of the array into two sets: one set of elements with that bit set and another with that bit not set. By doing so, we get x in one set and
y in the other. Now if we XOR all the elements in the first set, we get the first non-repeating element, and by doing the same in the other set we
get the second non-repeating element.
Let us see an example.
arr[] = {2, 4, 7, 9, 2, 4}
1) Get the XOR of all the elements.
xor = 2^4^7^9^2^4 = 14 (1110)
2) Get a number which has only one set bit of the xor.
Since we can easily get the rightmost set bit, let us use it.
set_bit_no = xor & ~(xor-1) = (1110) & ~(1101) = 0010
Now set_bit_no will have only the rightmost set bit of xor set.
3) Now divide the elements in two sets and do xor of
elements in each set, and we get the non-repeating
elements 7 and 9. Please see implementation for this
step.
Implementation:
#include <stdio.h>
#include <stdlib.h>
/* This function sets the values of *x and *y to the non-repeating
elements in an array arr[] of size n*/
void get2NonRepeatingNos(int arr[], int n, int *x, int *y)
{
int xor = arr[0]; /* Will hold xor of all elements */
int set_bit_no; /* Will have only single set bit of xor */
int i;
*x = 0;
*y = 0;
/* Get the xor of all elements */
for(i = 1; i < n; i++)
xor ^= arr[i];
/* Get the rightmost set bit in set_bit_no */
set_bit_no = xor & ~(xor-1);
/* Now divide elements in two sets by comparing rightmost set
bit of xor with bit at same position in each element. */
for(i = 0; i < n; i++)
{
if(arr[i] & set_bit_no)
*x = *x ^ arr[i]; /*XOR of first set */
else
*y = *y ^ arr[i]; /*XOR of second set*/
}
}
/* Driver program to test above function */
int main()
{
int arr[] = {2, 3, 7, 9, 11, 2, 3, 11};
int *x = (int *)malloc(sizeof(int));
int *y = (int *)malloc(sizeof(int));
get2NonRepeatingNos(arr, 8, x, y);
printf("The non-repeating elements are %d and %d", *x, *y);
getchar();
}
C/C++
#include <stdio.h>
int getOddOccurrence(int ar[], int ar_size)
{
int i;
int res = 0;
for (i=0; i < ar_size; i++)
res = res ^ ar[i];
return res;
}
/* Driver function to test above function */
int main()
{
int ar[] = {2, 3, 5, 4, 5, 2, 4, 3, 5, 2, 4, 4, 2};
int n = sizeof(ar)/sizeof(ar[0]);
printf("%d", getOddOccurrence(ar, n));
return 0;
}
Python
# Python program to find the element occurring odd number of times
def getOddOccurrence(arr):
# Initialize result
res = 0
# Traverse the array
for element in arr:
# XOR with the result
res = res ^ element
return res
# Test array
arr = [ 2, 3, 5, 4, 5, 2, 4, 3, 5, 2, 4, 4, 2]
print("%d" % getOddOccurrence(arr))
Output:
5
Method 2
Thanks to Himanshu Aggarwal for adding this method. This method doesn't modify *result if there is an overflow.
#include<stdio.h>
#include<limits.h>
#include<stdlib.h>
int addOvf(int* result, int a, int b)
{
if( a > INT_MAX - b)
return -1;
else
{
*result = a + b;
return 0;
}
}
int main()
{
int res = 0; /* stays 0 if addOvf() reports an overflow */
int x = 2147483640;
int y = 10;
printf("%d", addOvf(&res, x, y));
printf("\n %d", res);
getchar();
return 0;
}
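The check a > INT_MAX - b in Method 2 is only safe when b is not negative, because INT_MAX - b itself overflows for negative b, and it does not detect overflow below INT_MIN. A possible variant that covers both directions is sketched below. The name addOvfChecked is hypothetical, and the return convention (0 on success, -1 on overflow) follows the listing above.

```c
#include <limits.h>

/* Sketch: stores a + b in *result and returns 0, or returns -1 and
   leaves *result untouched if the sum would not fit in an int. */
int addOvfChecked(int *result, int a, int b)
{
    /* Positive b can only overflow upward, negative b only downward */
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b))
        return -1;

    *result = a + b;
    return 0;
}
```

Each comparison is evaluated only for the sign of b for which INT_MAX - b or INT_MIN - b cannot itself overflow.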
Memory representation of integer 0x01234567 inside big and little endian machines
How to see memory representation of multibyte data types on your machine?
Here is a sample C code that shows the byte representation of int, float and pointer.
#include <stdio.h>
/* function to show bytes in memory, from location start to start+n*/
void show_mem_rep(char *start, int n)
{
int i;
for (i = 0; i < n; i++)
printf(" %.2x", (unsigned char)start[i]);
printf("\n");
}
/*Main function to call above function for 0x01234567*/
int main()
{
int i = 0x01234567;
show_mem_rep((char *)&i, sizeof(i));
getchar();
return 0;
}
When the above program is run on a little endian machine, it gives "67 45 23 01" as output, while if it is run on a big endian machine, it gives "01 23 45 67" as
output.
Is there a quick way to determine endianness of your machine?
There are several ways of determining the endianness of your machine. Here is one quick way of doing so.
#include <stdio.h>
int main()
{
unsigned int i = 1;
char *c = (char*)&i;
if (*c)
printf("Little endian");
else
printf("Big endian");
getchar();
return 0;
}
In the above program, a character pointer c points to an integer i. Since the size of a character is 1 byte, dereferencing the character pointer
yields only the first byte of the integer. If the machine is little endian then *c will be 1 (because the least significant byte is stored first), and if the machine is big endian then
*c will be 0.
Does endianness matter for programmers?
Most of the time the compiler takes care of endianness; however, endianness becomes an issue in the following cases.
It matters in network programming: suppose you write integers to a file on a little endian machine and you transfer this file to a big endian machine.
Unless there is a little endian to big endian transformation, the big endian machine will read the bytes of each integer in reverse order. You can find such a practical
example here.
The standard byte order for networks is big endian, also known as network byte order. Before transferring data on a network, the data is first converted to
network byte order (big endian).
Sometimes it matters when you are using type casting, below program is an example.
#include <stdio.h>
int main()
{
unsigned char arr[2] = {0x01, 0x00};
unsigned short int x = *(unsigned short int *) arr;
printf("%d", x);
getchar();
return 0;
}
In the above program, a char array is typecasted to an unsigned short integer type. When I run above program on little endian machine, I get 1 as
output, while if I run it on a big endian machine I get 256. To make programs endianness independent, above programming style should be
avoided.
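One endianness-independent alternative to the pointer cast is to assemble the value from its bytes with shifts, stating the intended byte order explicitly. The sketch below uses two made-up helper names, readLittleEndian16 and readBigEndian16, and gives the same answer on any host.

```c
#include <stdint.h>

/* Sketch: interpret two bytes as a 16-bit value, least significant byte first */
uint16_t readLittleEndian16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Sketch: interpret two bytes as a 16-bit value, most significant byte first */
uint16_t readBigEndian16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}
```

For the array {0x01, 0x00} from the program above, readLittleEndian16 returns 1 and readBigEndian16 returns 256 regardless of the machine the code runs on.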
What are bi-endians?
Bi-endian processors can run in both modes little and big endian.
What are the examples of little, big endian and bi-endian machines ?
Intel based processors are little endian. ARM processors were little endian. Current generation ARM processors are bi-endian.
Motorola 68K processors are big endian. PowerPC (by Motorola) and SPARC (by Sun) processors were big endian. Current versions of these
processors are bi-endian.
Does endianness affect file formats?
File formats which have 1 byte as a basic unit are independent of endianness, e.g., ASCII files. Other file formats use some fixed endianness
format, e.g., JPEG files are stored in big endian format.
Which one is better: little endian or big endian?
The terms little and big endian came from Gulliver's Travels by Jonathan Swift. Two groups could not agree on which end an egg should be opened, the little or the big. Just like the egg issue, there is no technological reason to choose one byte ordering convention over the other, hence the
arguments degenerate into bickering about sociopolitical issues. As long as one of the conventions is selected and adhered to consistently, the
choice is arbitrary.
Above program can be optimized by removing the use of variable temp. See below the modified code.
unsigned int reverseBits(unsigned int num)
{
unsigned int NO_OF_BITS = sizeof(num) * 8;
unsigned int reverse_num = 0;
unsigned int i;
for (i = 0; i < NO_OF_BITS; i++)
{
if((num & (1U << i)))
reverse_num |= 1U << ((NO_OF_BITS - 1) - i);
}
return reverse_num;
}
The XOR of two numbers has set bits only at those places where A differs from B.
Example:
A = 1001001
B = 0010101
a_xor_b = 1011100
No. of bits that need to be flipped = set bit count in a_xor_b, i.e., 4
To get the set bit count, please see another post on this portal: http://geeksforgeeks.org/?p=1176
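Put together, the idea can be sketched in a few lines of C. The function name countFlips is chosen here for illustration, and the set bits are counted by repeatedly clearing the rightmost set bit with x & (x - 1), which is one of several possible counting methods.

```c
/* Sketch: number of bit positions in which a and b differ */
unsigned int countFlips(unsigned int a, unsigned int b)
{
    unsigned int x = a ^ b;   /* set bits mark the differing positions */
    unsigned int count = 0;

    while (x)
    {
        x &= x - 1;   /* clear the rightmost set bit */
        count++;
    }
    return count;
}
```

For the example above, A = 1001001 is 73 and B = 0010101 is 21 in decimal, and countFlips(73, 21) returns 4.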
Next Power of 2
Write a function that, for a given no n, finds a number p which is greater than or equal to n and is a power of 2.
IP 5
OP 8
IP 17
OP 32
IP 32
OP 32
There are plenty of solutions for this. Let us take the example of 17 to explain some of them.
Method 1(Using Log of the number)
1. Calculate Position of set bit in p(next power of 2):
pos = ceil(lgn) (ceiling of log n with base 2)
2. Now calculate p:
p = pow(2, pos)
Example
Let us try for 17
pos = 5
p = 32
Example:
Let us try for 17
count = 5
p = 32
unsigned int nextPowerOf2(unsigned int n)
{
unsigned count = 0;
/* First n in the below condition is for the case where n is 0*/
if (n && !(n&(n-1)))
return n;
while( n != 0)
{
n >>= 1;
count += 1;
}
return 1<<count;
}
/* Driver program to test above function */
int main()
{
unsigned int n = 0;
printf("%u", nextPowerOf2(n));
getchar();
return 0;
}
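Example:
Steps 1 & 3 of above algorithm are to handle cases
of power of 2 numbers e.g., 1, 2, 4, 8, 16,
Let us try for 17 (10001)
step 1
n = n - 1 = 16 (10000)
step 2
n = n | n >> 1
n = 10000 | 01000
n = 11000
n = n | n >> 2
n = 11000 | 00110
n = 11110
n = n | n >> 4
n = 11110 | 00001
n = 11111
n = n | n >> 8
n = 11111 | 00000
n = 11111
n = n | n >> 16
n = 11111 | 00000
n = 11111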
Program:
# include <stdio.h>
/* Finds next power of two for n. If n itself
is a power of two then returns n*/
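The program is cut off at this point in this copy. A minimal sketch of the shift-and-OR steps from the example above follows. It assumes a 32-bit unsigned int, which is why the last shift is 16. For n = 0 the sketch returns 0, and inputs above 2^31 wrap around to 0, since no larger power of 2 fits.

```c
/* Sketch: smallest power of 2 that is >= n, assuming 32-bit unsigned int */
unsigned int nextPowerOf2(unsigned int n)
{
    n--;            /* step 1: so that exact powers of 2 map to themselves */

    n |= n >> 1;    /* step 2: copy the highest set bit into every */
    n |= n >> 2;    /* lower position, producing 0...011...1       */
    n |= n >> 4;
    n |= n >> 8;
    n |= n >> 16;

    return n + 1;   /* step 3: 0...011...1 + 1 is a power of 2 */
}
```

For n = 17, step 1 gives 10000, step 2 turns it into 11111, and step 3 returns 100000, which is 32.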
Proof:
Above can be proved by taking the example of 11 in decimal numbers. (In this context 11 in decimal plays the same role as 3 in binary, which is written 11.)
If the difference between the sum of digits at odd positions and the sum of digits at even positions is a multiple of 11, then the decimal number is a multiple of 11. Let's see how.
Let's take the example of 2 digit numbers in decimal
AB = 11A - A + B = 11A + (B - A)
So if (B - A) is a multiple of 11 then so is AB.
Let us take 3 digit numbers.
ABC = 99A + A + 11B - B + C = (99A + 11B) + (A + C - B)
So if (A + C - B) is a multiple of 11 then so is ABC.
Let us take 4 digit numbers now.
ABCD = 1001A + D + 11C - C + 99B + B - A
= (1001A + 99B + 11C) + (D + B - A - C)
So, if (B + D - A - C) is a multiple of 11 then so is ABCD.
This can be continued for all decimal numbers.
Above concept can be proved for 3 in binary numbers in the same way.
Time Complexity: O(logn)
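As an illustration of the rule proved above, here is an iterative sketch (the program that follows uses recursion and is cut off in this copy). It counts set bits at alternating positions and repeats the test on the difference of the two counts. The function takes an unsigned argument to sidestep negative inputs. The names odd_count and even_count mirror the program below, but which counter is called odd is an arbitrary choice here.

```c
/* Sketch: returns 1 if n is a multiple of 3, else 0 */
int isMultipleOf3(unsigned int n)
{
    for (;;)
    {
        unsigned int odd_count = 0;   /* set bits at positions 0, 2, 4, ... */
        unsigned int even_count = 0;  /* set bits at positions 1, 3, 5, ... */

        if (n == 0) return 1;
        if (n == 1) return 0;

        while (n)
        {
            if (n & 1) odd_count++;
            n >>= 1;
            if (n & 1) even_count++;
            n >>= 1;
        }

        /* n is a multiple of 3 exactly when |odd_count - even_count| is */
        n = (odd_count > even_count) ? odd_count - even_count
                                     : even_count - odd_count;
    }
}
```

For n = 21 (10101) the counts are 3 and 0, so the loop repeats with 3 (11), whose counts are 1 and 1, and the difference 0 gives the answer 1.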
Program:
#include<stdio.h>
/* Function to check if n is a multiple of 3*/
int isMultipleOf3(int n)
{
int odd_count = 0;
int even_count = 0;
/* Make no positive if +n is multiple of 3
then is -n. We are doing this to avoid
stack overflow in recursion*/
if(n < 0) n = -n;
if(n == 0) return 1;
if(n == 1) return 0;
while(n)
{
/* If odd bit is set then
parity = 0
Program:
# include <stdio.h>
# define bool int
/* Function to get parity of number n. It returns 1
if n has odd parity, and returns 0 if n has even
parity */
bool getParity(unsigned int n)
{
bool parity = 0;
while (n)
{
parity = !parity;
n = n & (n - 1);
}
return parity;
}
/* Driver program to test getParity() */
int main()
{
unsigned int n = 7;
printf("Parity of no %u = %s", n,
(getParity(n)? "odd": "even"));
getchar();
return 0;
}
The above solution can be optimized by using a lookup table. Please refer to Bit Twiddling Hacks [1st reference] for details.
Time Complexity: The time taken by the above algorithm is proportional to the number of bits set. Worst case complexity is O(Logn).
Uses: Parity is used in error detection and cryptography.
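One way the lookup-table idea might look is sketched below. It is not the variant from the referenced page: it uses a small 16-entry table of nibble parities and processes the number four bits at a time. The name getParityTable and the table layout are assumptions of this illustration.

```c
/* Sketch: returns 1 if n has odd parity, 0 if it has even parity */
int getParityTable(unsigned int n)
{
    /* nibble_parity[v] is 1 when the 4-bit value v has an odd number of set bits */
    static const unsigned char nibble_parity[16] =
        { 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0 };
    int parity = 0;

    while (n)
    {
        parity ^= nibble_parity[n & 0xF];   /* fold in the lowest 4 bits */
        n >>= 4;
    }
    return parity;
}
```

The loop runs once per non-zero nibble prefix, so at most 8 times for a 32-bit value, independent of how many bits are set.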
References:
http://graphics.stanford.edu/~seander/bithacks.html#ParityNaive last checked on 30 May 2009.
Output:
No
No
Yes
Yes
No
Yes
3. All power of two numbers have only one bit set. So count the no. of set bits and if you get 1 then the number is a power of 2. Please see
http://geeksforgeeks.org/?p=1176 for counting set bits.
4. If we subtract 1 from a power of 2, then all unset bits after the only set bit become set, and the set bit becomes unset.
For example, for 4 (100) and 16 (10000), we get the following after subtracting 1
3 -> 011
15 -> 01111
So, if a number n is a power of 2 then bitwise & of n and n-1 will be zero. We can say whether n is a power of 2 based on the value of n&(n-1). The
expression n&(n-1) alone will not work when n is 0. To handle this case also, our expression becomes n && !(n&(n-1)) (thanks to Mohammad for
adding this case).
Below is the implementation of this method.
#include<stdio.h>
#define bool int
/* Function to check if x is power of 2*/
bool isPowerOfTwo (int x)
{
/* First x in the below expression is for the case when x is 0 */
return x && (!(x&(x-1)));
}
/*Driver program to test above function*/
int main()
{
isPowerOfTwo(31)? printf("Yes\n"): printf("No\n");
isPowerOfTwo(17)? printf("Yes\n"): printf("No\n");
isPowerOfTwo(16)? printf("Yes\n"): printf("No\n");
isPowerOfTwo(2)? printf("Yes\n"): printf("No\n");
isPowerOfTwo(18)? printf("Yes\n"): printf("No\n");
Output:
No
No
Yes
Yes
No
Yes
18,
2
19,
1
Program:
#include<stdio.h>
#include<math.h>
unsigned int getFirstSetBitPos(int n)
{
return log2(n&-n)+1;
}
int main()
{
int n = 12;
printf("%u", getFirstSetBitPos(n));
getchar();
return 0;
}
Let us take an unsigned integer (32 bit), which consists of bits 0-31. To print the binary representation of an unsigned integer, start from the 31st bit and check
whether it is ON or OFF: if it is ON print 1, else print 0. Now check whether the 30th bit is ON or OFF: if it is ON print 1, else print 0. Do this
for all bits from 31 to 0, and we finally get the binary representation of the number.
void bin(unsigned n)
{
unsigned i;
for (i = 1U << 31; i > 0; i = i / 2)
(n & i)? printf("1"): printf("0");
}
int main(void)
{
bin(7);
printf("\n");
bin(4);
}
Method 2: Recursive
Following is recursive method to print binary representation of NUM.
step 1) if NUM > 1
a) push NUM on stack
b) recursively call function with 'NUM / 2'
step 2)
a) pop NUM from stack, divide it by 2 and print its remainder.
void bin(unsigned n)
{
/* step 1 */
if (n > 1)
bin(n/2);
/* step 2 */
printf("%d", n % 2);
}
int main(void)
{
bin(7);
printf("\n");
bin(4);
}
Output:
43
Output:
n = 16, Position 5
n = 12, Invalid number
n = 128, Position 8
Following is another method for this problem. The idea is to one by one right shift the set bit of given number n until n becomes 0. Count how
many times we shifted to make n zero. The final count is position of the set bit.
// C program to find position of only set bit in a given number
#include <stdio.h>
// A utility function to check whether n is power of 2 or not
int isPowerOfTwo(unsigned n)
{ return n && (! (n & (n-1)) ); }
// Returns position of the only set bit in 'n'
int findPosition(unsigned n)
{
if (!isPowerOfTwo(n))
return -1;
unsigned count = 0;
// One by one move the only set bit to right till it reaches end
while (n)
{
n = n >> 1;
// increment count of shifts
++count;
}
return count;
}
// Driver program to test above function
int main(void)
{
int n = 0;
int pos = findPosition(n);
(pos == -1)? printf("n = %d, Invalid number\n", n):
printf("n = %d, Position %d \n", n, pos);
n = 12;
pos = findPosition(n);
(pos == -1)? printf("n = %d, Invalid number\n", n):
printf("n = %d, Position %d \n", n, pos);
n = 128;
pos = findPosition(n);
(pos == -1)? printf("n = %d, Invalid number\n", n):
printf("n = %d, Position %d \n", n, pos);
return 0;
}
Output:
n = 0, Invalid number
n = 12, Invalid number
n = 128, Position 8
We can also use log base 2 to find the position. Thanks to Arunkumar for suggesting this solution.
#include <stdio.h>
unsigned int Log2n(unsigned int n)
{
return (n > 1)? 1 + Log2n(n/2): 0;
}
int isPowerOfTwo(unsigned n)
{
return n && (! (n & (n-1)) );
}
int findPosition(unsigned n)
{
if (!isPowerOfTwo(n))
return -1;
return Log2n(n) + 1;
}
// Driver program to test above function
int main(void)
{
int n = 0;
int pos = findPosition(n);
(pos == -1)? printf("n = %d, Invalid number\n", n):
printf("n = %d, Position %d \n", n, pos);
n = 12;
pos = findPosition(n);
Output:
n = 0, Invalid number
n = 12, Invalid number
n = 128, Position 8
Using Divide and Conquer, we can multiply two integers with lower time complexity. We divide the given numbers into two halves. Let the given
numbers be X and Y.
For simplicity let us assume that n is even
X = Xl*2^(n/2) + Xr
Y = Yl*2^(n/2) + Yr
The product XY can then be written as follows.
XY = (Xl*2^(n/2) + Xr)(Yl*2^(n/2) + Yr)
= 2^n XlYl + 2^(n/2)(XlYr + XrYl) + XrYr
If we take a look at the above formula, there are four multiplications of size n/2, so we basically divided the problem of size n into four subproblems of size n/2. But that doesn't help because the solution of the recurrence T(n) = 4T(n/2) + O(n) is O(n^2). The tricky part of this algorithm is to
change the middle two terms to some other form so that only one extra multiplication is sufficient. The following is the tricky expression for the
middle two terms.
XlYr + XrYl = (Xl + Xr)(Yl + Yr) - XlYl - XrYr
With the above trick, the recurrence becomes T(n) = 3T(n/2) + O(n) and the solution of this recurrence is O(n^1.59).
What if the lengths of the input strings are different and are not even? To handle the different length case, we append 0's at the beginning. To
handle odd length, we put floor(n/2) bits in the left half and ceil(n/2) bits in the right half. So the expression for XY changes to the following.
XY = 2^(2ceil(n/2)) XlYl + 2^(ceil(n/2)) * [(Xl + Xr)(Yl + Yr) - XlYl - XrYr] + XrYr
The above algorithm is called the Karatsuba algorithm and it can be used for any base.
Following is C++ implementation of above algorithm.
// C++ implementation of Karatsuba algorithm for bit string multiplication.
#include<iostream>
#include<stdio.h>
using namespace std;
// FOLLOWING TWO FUNCTIONS ARE COPIED FROM http://goo.gl/q0OhZ
// Helper method: given two unequal sized bit strings, converts them to
// same length by adding leading 0s in the smaller string. Returns the
// the new length
int makeEqualLength(string &str1, string &str2)
{
int len1 = str1.size();
int len2 = str2.size();
if (len1 < len2)
{
for (int i = 0 ; i < len2 - len1 ; i++)
str1 = '0' + str1;
return len2;
}
else if (len1 > len2)
{
for (int i = 0 ; i < len1 - len2 ; i++)
str2 = '0' + str2;
}
return len1; // If len1 >= len2
}
// The main function that adds two bit sequences and returns the addition
string addBitStrings( string first, string second )
{
string result; // To store the sum bits
// make the lengths same before adding
int length = makeEqualLength(first, second);
int carry = 0; // Initialize carry
// Add all bits one by one
for (int i = length-1 ; i >= 0 ; i--)
{
int firstBit = first.at(i) - '0';
int secondBit = second.at(i) - '0';
// boolean expression for sum of 3 bits
int sum = (firstBit ^ secondBit ^ carry)+'0';
result = (char)sum + result;
// boolean expression for 3-bit addition
carry = (firstBit&secondBit) | (secondBit&carry) | (firstBit&carry);
}
// if overflow, then add a leading 1
if (carry) result = '1' + result;
return result;
}
// A utility function to multiply single bits of strings a and b
int multiplyiSingleBit(string a, string b)
{ return (a[0] - '0')*(b[0] - '0'); }
// The main function that multiplies two bit strings X and Y and returns
// result as long integer
long int multiply(string X, string Y)
{
// Find the maximum of lengths of x and Y and make length
// of smaller string same as that of larger string
int n = makeEqualLength(X, Y);
// Base cases
if (n == 0) return 0;
if (n == 1) return multiplyiSingleBit(X, Y);
int fh = n/2; // First half of string, floor(n/2)
int sh = (n-fh); // Second half of string, ceil(n/2)
// Find the first half and second half of first string.
// Refer http://goo.gl/lLmgn for substr method
string Xl = X.substr(0, fh);
string Xr = X.substr(fh, sh);
// Find the first half and second half of second string
string Yl = Y.substr(0, fh);
string Yr = Y.substr(fh, sh);
// Recursively calculate the three products of inputs of size n/2
long int P1 = multiply(Xl, Yl);
long int P2 = multiply(Xr, Yr);
long int P3 = multiply(addBitStrings(Xl, Xr), addBitStrings(Yl, Yr));
// Combine the three products to get the final result.
return P1*(1<<(2*sh)) + (P3 - P1 - P2)*(1<<sh) + P2;
}
// Driver program to test above functions
int main()
{
printf ("%ld\n", multiply("1100", "1010"));
printf ("%ld\n", multiply("110", "1010"));
printf ("%ld\n", multiply("11", "1010"));
printf ("%ld\n", multiply("1", "1010"));
printf ("%ld\n", multiply("0", "1010"));
printf ("%ld\n", multiply("111", "111"));
printf ("%ld\n", multiply("11", "11"));
}
Output:
120
60
30
10
0
49
9
to
y;
y;
y;
Output:
After Swapping: x = 5, y = 10
3) When pointers are used to implement a swap function, all of the above methods fail when both pointers point to the same variable. Let us
look at what happens in that case.
// Bitwise XOR based method
x = x ^ x; // x becomes 0
x = x ^ x; // x remains 0
x = x ^ x; // x remains 0
// Arithmetic based method
x = x + x; // x becomes 2x
x = x - x; // x becomes 0
x = x - x; // x remains 0
Let us see the following program.
#include <stdio.h>
void swap(int *xp, int *yp)
{
*xp = *xp ^ *yp;
*yp = *xp ^ *yp;
*xp = *xp ^ *yp;
}
int main()
{
int x = 10;
swap(&x, &x);
printf("After swap(&x, &x): x = %d", x);
return 0;
}
Output:
After swap(&x, &x): x = 0
Swapping a variable with itself may be needed in many standard algorithms. For example, see this implementation of QuickSort, where we may swap a
variable with itself. The above problem can be avoided by checking whether the two addresses are the same before swapping.
#include <stdio.h>
void swap(int *xp, int *yp)
{
if (xp == yp) // Check if the two addresses are same
return;
*xp = *xp + *yp;
*yp = *xp - *yp;
*xp = *xp - *yp;
}
int main()
{
int x = 10;
swap(&x, &x);
printf("After swap(&x, &x): x = %d", x);
return 0;
}
Output:
After swap(&x, &x): x = 10
Output:
0 9 18 27 36 45 54 63 72 81 90 99
Since we need to use bitwise operators, we get the value of floor(n/8) using n>>3 and the value of n%8 using n&7. We need to write the above
expression (n/9 = n/8 - n/72) in terms of floor(n/8) and n%8.
n/8 is equal to floor(n/8) + (n%8)/8. Substituting this into the expression gives:
n/9 = floor(n/8) + (n%8)/8 - [floor(n/8) + (n%8)/8]/9
n/9 = floor(n/8) - [floor(n/8) - 9(n%8)/8 + (n%8)/8]/9
n/9 = floor(n/8) - [floor(n/8) - n%8]/9
From the above equation, n is a multiple of 9 only if the expression floor(n/8) - [floor(n/8) - n%8]/9 is an integer. This expression can only be an
integer if the sub-expression [floor(n/8) - n%8]/9 is an integer. The sub-expression can only be an integer if [floor(n/8) - n%8] is a multiple of 9. So
the problem reduces to a smaller value which can be written in terms of bitwise operators.
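The reduction above can be sketched as a short recursive function. This is a minimal sketch for non-negative n; the function name isMultipleOf9 is an assumption made for illustration, not a name taken from the text.

```cpp
// Sketch: test divisibility by 9 using only shifts, masks and subtraction.
// The name isMultipleOf9 is an illustrative assumption.
bool isMultipleOf9(int n)
{
    // Base cases: 0 and 9 are multiples of 9. Any other value below 9,
    // including a negative intermediate result, is not.
    if (n == 0 || n == 9) return true;
    if (n < 9) return false;

    // n is a multiple of 9 exactly when floor(n/8) - n%8 is one.
    // For non-negative n, n >> 3 is floor(n/8) and n & 7 is n%8.
    return isMultipleOf9((n >> 3) - (n & 7));
}
```

Each recursive call receives a value no larger than n/8, so the recursion depth grows only logarithmically with n.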
Output:
70
Explanation:
100 is 01100100 in binary. The operation can be split mainly in two parts
1) The expression x & 0x0F gives us the last 4 bits of x. For x = 100, the result is 00000100. Using the bitwise << operator, we shift the last four
bits to the left 4 times and make the new last four bits 0. The result after the shift is 01000000.
2) The expression x & 0xF0 gives us the first four bits of x. For x = 100, the result is 01100000. Using the bitwise >> operator, we shift the digits
to the right 4 times and make the first four bits 0. The result after the shift is 00000110.
At the end we apply the bitwise OR | operation to the two expressions explained above. The OR operation places the first nibble at the end and the
last nibble at the front. For x = 100, the value of (01000000) OR (00000110) gives the result 01000110, which is equal to 70 in decimal.
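The two steps described in the explanation can be sketched in a single expression. This is a minimal sketch, not the original listing; the function name swapNibbles is assumed for illustration.

```cpp
// Sketch: swap the two nibbles (4-bit halves) of one byte.
// The name swapNibbles is an illustrative assumption.
unsigned char swapNibbles(unsigned char x)
{
    // (x & 0x0F) << 4 moves the low nibble into the high position;
    // (x & 0xF0) >> 4 moves the high nibble into the low position.
    return static_cast<unsigned char>(((x & 0x0F) << 4) | ((x & 0xF0) >> 4));
}
```

Applying the function twice returns the original byte, which is a quick sanity check for an implementation of this kind.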
The idea is to use the bitwise <<, & and ~ operators. Using the expression ~(1 << (k - 1)), we get a number which has all bits set except the kth bit.
If we do a bitwise & of this expression with n, we get a number which has all bits the same as n except the kth bit, which is 0.
Following is C++ implementation of this.
#include <iostream>
using namespace std;
// Returns a number that has all bits same as n
// except the k'th bit which is made 0
int turnOffK(int n, int k)
{
// k must be greater than 0
if (k <= 0) return n;
// Do & of n with a number with all set bits except
// the k'th bit
return (n & ~(1 << (k - 1)));
}
// Driver program to test above function
int main()
{
int n = 15;
int k = 4;
cout << turnOffK(n, k);
return 0;
}
Output:
7
Exercise: Write a function turnOnK() that turns the kth bit on.
Output:
1
1
Adjacency Matrix Representation of the above graph
Pros: The representation is easier to implement and follow. Removing an edge takes O(1) time. Queries such as whether there is an edge from vertex
u to vertex v are efficient and can be answered in O(1) time.
Cons: It consumes more space, O(V^2). Even if the graph is sparse (contains few edges), it consumes the same space. Adding a vertex takes
O(V^2) time.
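The O(1) operations listed above can be illustrated with a small class. This is a minimal sketch for an undirected, unweighted graph; the class name MatrixGraph and its method names are assumptions, not part of the original article.

```cpp
#include <vector>

// Sketch of an adjacency matrix for an undirected, unweighted graph.
// MatrixGraph and its method names are illustrative assumptions.
class MatrixGraph
{
    std::vector<std::vector<bool>> adj;  // adj[u][v] is true if edge u-v exists
public:
    // Allocates an n x n matrix, which is where the O(V^2) space goes.
    explicit MatrixGraph(int n) : adj(n, std::vector<bool>(n, false)) {}

    // Adding an edge sets two cells: O(1).
    void addEdge(int u, int v)
    {
        adj[u][v] = true;
        adj[v][u] = true;
    }

    // Removing an edge clears two cells: O(1).
    void removeEdge(int u, int v)
    {
        adj[u][v] = false;
        adj[v][u] = false;
    }

    // Edge query reads one cell: O(1).
    bool hasEdge(int u, int v) const { return adj[u][v]; }
};
```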
Adjacency List:
An array of linked lists is used. Size of the array is equal to number of vertices. Let the array be array[]. An entry array[i] represents the linked list
of vertices adjacent to the ith vertex. This representation can also be used to represent a weighted graph. The weights of edges can be stored in
nodes of linked lists. Following is adjacency list representation of the above graph.
}
// A utility function to print the adjacency list representation of graph
void printGraph(struct Graph* graph)
{
int v;
for (v = 0; v < graph->V; ++v)
{
struct AdjListNode* pCrawl = graph->array[v].head;
printf("\n Adjacency list of vertex %d\n head ", v);
while (pCrawl)
{
printf("-> %d", pCrawl->dest);
pCrawl = pCrawl->next;
}
printf("\n");
}
}
// Driver program to test above functions
int main()
{
// create the graph given in above figure
int V = 5;
struct Graph* graph = createGraph(V);
addEdge(graph, 0, 1);
addEdge(graph, 0, 4);
addEdge(graph, 1, 2);
addEdge(graph, 1, 3);
addEdge(graph, 1, 4);
addEdge(graph, 2, 3);
addEdge(graph, 3, 4);
// print the adjacency list representation of the above graph
printGraph(graph);
return 0;
}
Output:
Adjacency list of vertex 0
head -> 4-> 1
Adjacency list of vertex 1
head -> 4-> 3-> 2-> 0
Adjacency list of vertex 2
head -> 3-> 1
Adjacency list of vertex 3
head -> 4-> 2-> 1
Adjacency list of vertex 4
head -> 3-> 1-> 0
Pros: Saves space: O(|V|+|E|). In the worst case, there can be C(V, 2) edges in a graph, thus consuming O(V^2) space. Adding a
vertex is easier.
Cons: Queries such as whether there is an edge from vertex u to vertex v are not efficient and can take O(V) time.
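The trade-off described above can be seen in a compact variant that uses std::vector instead of hand-built linked lists. This is a sketch under that assumption; the class name ListGraph and its methods are hypothetical and are not the structures used in the listing above.

```cpp
#include <vector>

// Sketch of an adjacency list for an undirected graph using std::vector.
// ListGraph and its method names are illustrative assumptions.
class ListGraph
{
    std::vector<std::vector<int>> adj;  // adj[u] holds the neighbours of u
public:
    explicit ListGraph(int n) : adj(n) {}

    // Undirected edge: record each endpoint in the other's list.
    void addEdge(int u, int v)
    {
        adj[u].push_back(v);
        adj[v].push_back(u);
    }

    // Edge query scans u's list, so it costs O(degree(u)), at most O(V).
    bool hasEdge(int u, int v) const
    {
        for (int w : adj[u])
            if (w == v) return true;
        return false;
    }

    // Adding a vertex only appends an empty list; returns the new vertex number.
    int addVertex()
    {
        adj.emplace_back();
        return static_cast<int>(adj.size()) - 1;
    }
};
```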
Reference:
http://en.wikipedia.org/wiki/Graph_%28abstract_data_type%29
Following are C++ and Java implementations of simple Breadth First Traversal from a given source.
The C++ implementation uses the adjacency list representation of graphs. STL's list container is used to store the lists of adjacent nodes and the
queue of nodes needed for BFS traversal.
C++
// Program to print BFS traversal from a given source vertex. BFS(int s)
// traverses vertices reachable from s.
#include<iostream>
#include <list>
using namespace std;
// This class represents a directed graph using adjacency list representation
class Graph
{
int V; // No. of vertices
list<int> *adj; // Pointer to an array containing adjacency lists
public:
Graph(int V); // Constructor
void addEdge(int v, int w); // function to add an edge to graph
void BFS(int s); // prints BFS traversal from a given source s
};
Graph::Graph(int V)
{
this->V = V;
adj = new list<int>[V];
}
void Graph::addEdge(int v, int w)
{
adj[v].push_back(w); // Add w to v's list.
}
void Graph::BFS(int s)
{
// Mark all the vertices as not visited
bool *visited = new bool[V];
for(int i = 0; i < V; i++)
visited[i] = false;
// Create a queue for BFS
list<int> queue;
// Mark the current node as visited and enqueue it
visited[s] = true;
queue.push_back(s);
// 'i' will be used to get all adjacent vertices of a vertex
list<int>::iterator i;
while(!queue.empty())
{
// Dequeue a vertex from queue and print it
s = queue.front();
cout << s << " ";
queue.pop_front();
// Get all adjacent vertices of the dequeued vertex s
// If a adjacent has not been visited, then mark it visited
// and enqueue it
for(i = adj[s].begin(); i != adj[s].end(); ++i)
{
if(!visited[*i])
{
visited[*i] = true;
queue.push_back(*i);
}
}
}
}
// Driver program to test methods of graph class
int main()
{
// Create a graph given in the above diagram
Graph g(4);
g.addEdge(0, 1);
g.addEdge(0, 2);
g.addEdge(1, 2);
g.addEdge(2, 0);
g.addEdge(2, 3);
g.addEdge(3, 3);
cout << "Following is Breadth First Traversal (starting from vertex 2) \n";
g.BFS(2);
return 0;
}
Java
// Java program to print BFS traversal from a given source vertex.
// BFS(int s) traverses vertices reachable from s.
import java.io.*;
import java.util.*;
// This class represents a directed graph using adjacency list
// representation
class Graph
{
private int V; // No. of vertices
private LinkedList<Integer> adj[]; //Adjacency Lists
// Constructor
Graph(int v)
{
V = v;
adj = new LinkedList[v];
for (int i=0; i<v; ++i)
adj[i] = new LinkedList();
}
// Function to add an edge into the graph
void addEdge(int v,int w)
{
adj[v].add(w);
}
// prints BFS traversal from a given source s
void BFS(int s)
{
// Mark all the vertices as not visited(By default
// set as false)
boolean visited[] = new boolean[V];
// Create a queue for BFS
LinkedList<Integer> queue = new LinkedList<Integer>();
// Mark the current node as visited and enqueue it
visited[s]=true;
queue.add(s);
while (queue.size() != 0)
{
// Dequeue a vertex from queue and print it
s = queue.poll();
System.out.print(s+" ");
// Get all adjacent vertices of the dequeued vertex s
// If a adjacent has not been visited, then mark it
// visited and enqueue it
Iterator<Integer> i = adj[s].listIterator();
while (i.hasNext())
{
int n = i.next();
if (!visited[n])
{
visited[n] = true;
queue.add(n);
}
}
}
}
// Driver method
public static void main(String args[])
{
Graph g = new Graph(4);
g.addEdge(0, 1);
g.addEdge(0, 2);
g.addEdge(1, 2);
g.addEdge(2, 0);
g.addEdge(2, 3);
g.addEdge(3, 3);
System.out.println("Following is Breadth First Traversal (starting from vertex 2)");
g.BFS(2);
}
}
Note that the above code traverses only the vertices reachable from a given source vertex. Not all vertices may be reachable from a given
vertex (for example, in a disconnected graph). To print all the vertices, we can modify the BFS function to do a traversal starting from every node
one by one (like the modified DFS version).
Time Complexity: O(V+E) where V is number of vertices in the graph and E is number of edges in the graph.
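The modification mentioned in the note, restarting BFS from every unvisited vertex, can be sketched as follows. The function name bfsAll and its vector-based signature are assumptions for illustration and differ from the Graph class above.

```cpp
#include <queue>
#include <vector>

// Sketch: BFS that restarts from every unvisited vertex, so vertices not
// reachable from the first source are still visited.
// bfsAll and its signature are illustrative assumptions.
std::vector<int> bfsAll(const std::vector<std::vector<int>>& adj)
{
    const int V = static_cast<int>(adj.size());
    std::vector<bool> visited(V, false);
    std::vector<int> order;  // vertices in the order they are dequeued

    for (int start = 0; start < V; ++start)
    {
        if (visited[start]) continue;  // already covered by an earlier BFS
        visited[start] = true;
        std::queue<int> q;
        q.push(start);
        while (!q.empty())
        {
            int u = q.front();
            q.pop();
            order.push_back(u);
            for (int w : adj[u])
            {
                if (!visited[w])
                {
                    visited[w] = true;
                    q.push(w);
                }
            }
        }
    }
    return order;
}
```

Because the visited array is shared across restarts, every vertex is enqueued once and every edge is examined once, so the total work stays O(V+E).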
Also see Depth First Traversal
C++
// C++ program to print DFS traversal from a given vertex in a given graph
#include<iostream>
#include <list>
using namespace std;
// Graph class represents a directed graph using adjacency list representation
class Graph
{
int V; // No. of vertices
list<int> *adj; // Pointer to an array containing adjacency lists
void DFSUtil(int v, bool visited[]); // A function used by DFS
public:
Graph(int V); // Constructor
void addEdge(int v, int w); // function to add an edge to graph
void DFS(int v); // DFS traversal of the vertices reachable from v
};
Graph::Graph(int V)
{
this->V = V;
adj = new list<int>[V];
}
void Graph::addEdge(int v, int w)
{
adj[v].push_back(w); // Add w to v's list.
}
void Graph::DFSUtil(int v, bool visited[])
{
// Mark the current node as visited and print it
visited[v] = true;
cout << v << " ";
// Recur for all the vertices adjacent to this vertex
list<int>::iterator i;
for (i = adj[v].begin(); i != adj[v].end(); ++i)
if (!visited[*i])
DFSUtil(*i, visited);
}
// DFS traversal of the vertices reachable from v. It uses recursive DFSUtil()
void Graph::DFS(int v)
{
// Mark all the vertices as not visited
bool *visited = new bool[V];
for (int i = 0; i < V; i++)
visited[i] = false;
// Call the recursive helper function to print DFS traversal
DFSUtil(v, visited);
}
int main()
{
// Create a graph given in the above diagram
Graph g(4);
g.addEdge(0, 1);
g.addEdge(0, 2);
g.addEdge(1, 2);
g.addEdge(2, 0);
g.addEdge(2, 3);
g.addEdge(3, 3);
cout << "Following is Depth First Traversal (starting from vertex 2) \n";
g.DFS(2);
return 0;
}
Java
// Java program to print DFS traversal from a given vertex in a given graph
import java.io.*;
import java.util.*;
// This class represents a directed graph using adjacency list
// representation
class Graph
{
private int V; // No. of vertices
// Array of lists for Adjacency List Representation
private LinkedList<Integer> adj[];
// Constructor
Graph(int v)
{
V = v;
adj = new LinkedList[v];
for (int i=0; i<v; ++i)
adj[i] = new LinkedList();
}
//Function to add an edge into the graph
void addEdge(int v, int w)
{
adj[v].add(w); // Add w to v's list.
}
// A function used by DFS
void DFSUtil(int v,boolean visited[])
{
// Mark the current node as visited and print it
visited[v] = true;
System.out.print(v+" ");
// Recur for all the vertices adjacent to this vertex
Iterator<Integer> i = adj[v].listIterator();
while (i.hasNext())
{
int n = i.next();
if (!visited[n])
DFSUtil(n, visited);
}
}
// The function to do DFS traversal. It uses recursive DFSUtil()
void DFS(int v)
{
// Mark all the vertices as not visited(set as
// false by default in java)
boolean visited[] = new boolean[V];
// Call the recursive helper function to print DFS traversal
DFSUtil(v, visited);
}
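public static void main(String args[])
{
Graph g = new Graph(4);
g.addEdge(0, 1);
g.addEdge(0, 2);
g.addEdge(1, 2);
g.addEdge(2, 0);
g.addEdge(2, 3);
g.addEdge(3, 3);
System.out.println("Following is Depth First Traversal (starting from vertex 2)");
g.DFS(2);
}
}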
Note that the above code traverses only the vertices reachable from a given source vertex. Not all vertices may be reachable from a given
vertex (for example, in a disconnected graph). To do a complete DFS traversal of such graphs, we must call DFSUtil() for every vertex. Before calling
DFSUtil(), we should check whether the vertex has already been printed by some other call of DFSUtil(). The following implementation traverses the
complete graph even if some nodes are unreachable. The differences from the above code are highlighted in the code below.
C++
// C++ program to print DFS traversal for a given graph
#include<iostream>
#include <list>
using namespace std;
class Graph
{
int V; // No. of vertices
list<int> *adj; // Pointer to an array containing adjacency lists
void DFSUtil(int v, bool visited[]); // A function used by DFS
public:
Graph(int V); // Constructor
void addEdge(int v, int w); // function to add an edge to graph
void DFS(); // prints DFS traversal of the complete graph
};
Graph::Graph(int V)
{
this->V = V;
adj = new list<int>[V];
}
void Graph::addEdge(int v, int w)
{
adj[v].push_back(w); // Add w to v's list.
}
void Graph::DFSUtil(int v, bool visited[])
{
// Mark the current node as visited and print it
visited[v] = true;
cout << v << " ";
// Recur for all the vertices adjacent to this vertex
list<int>::iterator i;
for(i = adj[v].begin(); i != adj[v].end(); ++i)
if(!visited[*i])
DFSUtil(*i, visited);
}
// The function to do DFS traversal. It uses recursive DFSUtil()
void Graph::DFS()
{
// Mark all the vertices as not visited
bool *visited = new bool[V];
for (int i = 0; i < V; i++)
visited[i] = false;
// Call the recursive helper function to print DFS traversal
// starting from all vertices one by one
for (int i = 0; i < V; i++)
if (visited[i] == false)
DFSUtil(i, visited);
}
Java
// Java program to print DFS traversal for a given graph
import java.io.*;
import java.util.*;
// This class represents a directed graph using adjacency list
// representation
class Graph
{
private int V; // No. of vertices
// Array of lists for Adjacency List Representation
private LinkedList<Integer> adj[];
// Constructor
Graph(int v)
{
V = v;
adj = new LinkedList[v];
for (int i=0; i<v; ++i)
adj[i] = new LinkedList();
}
//Function to add an edge into the graph
void addEdge(int v, int w)
{
adj[v].add(w); // Add w to v's list.
}
// A function used by DFS
void DFSUtil(int v,boolean visited[])
{
// Mark the current node as visited and print it
visited[v] = true;
System.out.print(v+" ");
// Recur for all the vertices adjacent to this vertex
Iterator<Integer> i = adj[v].listIterator();
while (i.hasNext())
{
int n = i.next();
if (!visited[n])
DFSUtil(n,visited);
}
}
// The function to do DFS traversal. It uses recursive DFSUtil()
void DFS()
{
// Mark all the vertices as not visited(set as
// false by default in java)
boolean visited[] = new boolean[V];
// Call the recursive helper function to print DFS traversal
// starting from all vertices one by one
for (int i=0; i<V; ++i)
if (visited[i] == false)
DFSUtil(i, visited);
}
public static void main(String args[])
{
Graph g = new Graph(4);
g.addEdge(0, 1);
g.addEdge(0, 2);
g.addEdge(1, 2);
g.addEdge(2, 0);
g.addEdge(2, 3);
g.addEdge(3, 3);
g.DFS();
}
}
Time Complexity: O(V+E) where V is number of vertices in the graph and E is number of edges in the graph.
Breadth First Traversal for a Graph
For a disconnected graph, we get a DFS forest as output. To detect a cycle, we can check each tree of the forest for back edges.
To detect a back edge, we keep track of the vertices currently in the recursion stack of the DFS traversal function. If we reach a vertex that is
already in the recursion stack, then there is a cycle. The edge that connects the current vertex to the vertex in the recursion stack is a back edge.
We have used the recStack[] array to keep track of the vertices in the recursion stack.
// A C++ Program to detect cycle in a graph
#include<iostream>
#include <list>
#include <limits.h>
using namespace std;
class Graph
{
int V; // No. of vertices
list<int> *adj; // Pointer to an array containing adjacency lists
bool isCyclicUtil(int v, bool visited[], bool *rs); // used by isCyclic()
public:
Graph(int V); // Constructor
void addEdge(int v, int w); // to add an edge to graph
bool isCyclic(); // returns true if there is a cycle in this graph
};
Graph::Graph(int V)
{
this->V = V;
adj = new list<int>[V];
}
void Graph::addEdge(int v, int w)
{
adj[v].push_back(w); // Add w to v's list.
}
// This function is a variation of DFSUtil() in http://www.geeksforgeeks.org/archives/18212
bool Graph::isCyclicUtil(int v, bool visited[], bool *recStack)
{
if(visited[v] == false)
{
// Mark the current node as visited and part of recursion stack
visited[v] = true;
recStack[v] = true;
// Recur for all the vertices adjacent to this vertex
list<int>::iterator i;
for(i = adj[v].begin(); i != adj[v].end(); ++i)
{
if ( !visited[*i] && isCyclicUtil(*i, visited, recStack) )
return true;
else if (recStack[*i])
return true;
}
}
recStack[v] = false; // remove the vertex from recursion stack
return false;
}
// Returns true if the graph contains a cycle, else false.
// This function is a variation of DFS() in http://www.geeksforgeeks.org/archives/18212
bool Graph::isCyclic()
{
// Mark all the vertices as not visited and not part of recursion
// stack
bool *visited = new bool[V];
bool *recStack = new bool[V];
for(int i = 0; i < V; i++)
{
visited[i] = false;
recStack[i] = false;
}
// Call the recursive helper function to detect cycle in different
// DFS trees
for(int i = 0; i < V; i++)
if (isCyclicUtil(i, visited, recStack))
return true;
return false;
}
int main()
{
// Create a graph given in the above diagram
Graph g(4);
g.addEdge(0, 1);
g.addEdge(0, 2);
g.addEdge(1, 2);
g.addEdge(2, 0);
g.addEdge(2, 3);
g.addEdge(3, 3);
if(g.isCyclic())
cout << "Graph contains cycle";
else
cout << "Graph doesn't contain cycle";
return 0;
}
Output:
Graph contains cycle
Time Complexity of this method is same as time complexity of DFS traversal which is O(V+E).
For each edge, make subsets using both the vertices of the edge. If both the vertices are in the same subset, a cycle is found.
Initially, all slots of parent array are initialized to -1 (means there is only one item in every subset).
0   1   2
-1  -1  -1
After processing edges 0-1 and 1-2, the parent array becomes:
0   1   2
1   2  -1
Edge 0-2: 0 is in subset 2 and 2 is also in subset 2. Hence, including this edge forms a cycle.
How subset of 0 is same as 2?
0->1->2 // 1 is parent of 0 and 2 is parent of 1
Based on the above explanation, below are implementations:
C/C++
// A union-find algorithm to detect cycle in a graph
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
// a structure to represent an edge in graph
struct Edge
{
int src, dest;
};
// a structure to represent a graph
struct Graph
{
// V-> Number of vertices, E-> Number of edges
int V, E;
// graph is represented as an array of edges
struct Edge* edge;
};
// Creates a graph with V vertices and E edges
struct Graph* createGraph(int V, int E)
{
Java
// Java Program for union-find algorithm to detect cycle in a graph
import java.util.*;
import java.lang.*;
import java.io.*;
class Graph
{
int V, E; // V-> no. of vertices & E-> no. of edges
Edge edge[]; // collection of all edges
class Edge
{
int src, dest;
};
// Creates a graph with V vertices and E edges
Graph(int v,int e)
{
V = v;
E = e;
edge = new Edge[E];
for (int i=0; i<e; ++i)
edge[i] = new Edge();
}
// A utility function to find the subset of an element i
int find(int parent[], int i)
{
if (parent[i] == -1)
return i;
return find(parent, parent[i]);
}
// A utility function to do union of two subsets
void Union(int parent[], int x, int y)
{
int xset = find(parent, x);
int yset = find(parent, y);
parent[xset] = yset;
}
// The main function to check whether a given graph
// contains cycle or not
int isCycle( Graph graph)
{
// Allocate memory for creating V subsets
int parent[] = new int[graph.V];
// Initialize all subsets as single element sets
for(int i=0; i<graph.V; ++i)
parent[i]=-1;
// Iterate through all edges of graph, find subset of both
// vertices of every edge, if both subsets are same, then
// there is cycle in graph.
for (int i = 0; i < graph.E; ++i)
{
int x = graph.find(parent, graph.edge[i].src);
int y = graph.find(parent, graph.edge[i].dest);
if (x == y)
return 1;
graph.Union(parent, x, y);
}
return 0;
}
// Driver Method
public static void main (String[] args)
{
/* Let us create following graph
   0
   |  \
   |    \
   1-----2 */
Graph graph = new Graph(3,3);
// add edge 0-1
graph.edge[0].src = 0;
graph.edge[0].dest = 1;
// add edge 1-2
graph.edge[1].src = 1;
graph.edge[1].dest = 2;
// add edge 0-2
graph.edge[2].src = 0;
graph.edge[2].dest = 2;
if (graph.isCycle(graph)==1)
System.out.println( "Graph contains cycle" );
else
System.out.println( "Graph doesn't contain cycle" );
}
}
Note that the implementation of union() and find() is naive and takes O(n) time in the worst case. These methods can be improved to O(log n) using
union by rank or height. We will soon be discussing union by rank in a separate post.
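As a rough illustration of the improvement mentioned above, the following sketch combines union by rank with path compression. It is not the implementation from the promised post: all names are assumptions, and a representative is marked by parent[i] == i here rather than by the -1 used in the listings above.

```cpp
#include <numeric>
#include <utility>
#include <vector>

// Sketch of a disjoint-set structure with union by rank and path compression.
// All names here are illustrative assumptions.
class DisjointSet
{
    std::vector<int> parent;  // parent[i] == i marks a subset representative
    std::vector<int> rnk;     // upper bound on the height of each tree
public:
    explicit DisjointSet(int n) : parent(n), rnk(n, 0)
    {
        std::iota(parent.begin(), parent.end(), 0);  // every item starts alone
    }

    int find(int i)
    {
        if (parent[i] != i)
            parent[i] = find(parent[i]);  // path compression
        return parent[i];
    }

    // Merges the subsets of x and y; returns false if they were already one.
    bool unite(int x, int y)
    {
        int a = find(x), b = find(y);
        if (a == b) return false;
        if (rnk[a] < rnk[b]) std::swap(a, b);
        parent[b] = a;                   // attach the shorter tree under the taller
        if (rnk[a] == rnk[b]) ++rnk[a];
        return true;
    }
};

// An undirected graph has a cycle if some edge joins two vertices that are
// already in the same subset.
bool hasCycle(int V, const std::vector<std::pair<int, int>>& edges)
{
    DisjointSet ds(V);
    for (const auto& e : edges)
        if (!ds.unite(e.first, e.second)) return true;
    return false;
}
```

Attaching the shorter tree under the taller one is what keeps the tree height logarithmic; path compression is an additional, optional optimization.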
We have discussed cycle detection for directed graphs. We have also discussed a union-find algorithm for cycle detection in undirected graphs. The
time complexity of the union-find algorithm is O(ELogV). As with directed graphs, we can use DFS to detect a cycle in an undirected graph in O(V+E)
time. We do a DFS traversal of the given graph. For every visited vertex v, if there is an adjacent vertex u such that u is already visited and u is not
the parent of v, then there is a cycle in the graph. If we don't find such an adjacent vertex for any vertex, we say that there is no cycle. This approach
assumes that there are no parallel edges between any two vertices.
C++
// A C++ Program to detect cycle in an undirected graph
#include<iostream>
#include <list>
#include <limits.h>
using namespace std;
// Class for an undirected graph
class Graph
{
int V; // No. of vertices
list<int> *adj; // Pointer to an array containing adjacency lists
bool isCyclicUtil(int v, bool visited[], int parent);
public:
Graph(int V); // Constructor
void addEdge(int v, int w); // to add an edge to graph
bool isCyclic(); // returns true if there is a cycle
};
Graph::Graph(int V)
{
this->V = V;
adj = new list<int>[V];
}
void Graph::addEdge(int v, int w)
{
adj[v].push_back(w); // Add w to v's list.
adj[w].push_back(v); // Add v to w's list.
}
// A recursive function that uses visited[] and parent to detect
// cycle in subgraph reachable from vertex v.
bool Graph::isCyclicUtil(int v, bool visited[], int parent)
{
// Mark the current node as visited
visited[v] = true;
// Recur for all the vertices adjacent to this vertex
list<int>::iterator i;
for (i = adj[v].begin(); i != adj[v].end(); ++i)
{
// If an adjacent is not visited, then recur for that adjacent
if (!visited[*i])
{
if (isCyclicUtil(*i, visited, v))
return true;
}
// If an adjacent is visited and not parent of current vertex,
// then there is a cycle.
else if (*i != parent)
return true;
}
return false;
}
// Returns true if the graph contains a cycle, else false.
bool Graph::isCyclic()
{
// Mark all the vertices as not visited
bool *visited = new bool[V];
for (int i = 0; i < V; i++)
visited[i] = false;
// Call the recursive helper function to detect cycle in different
// DFS trees
for (int u = 0; u < V; u++)
if (!visited[u]) // Don't recur for u if it is already visited
if (isCyclicUtil(u, visited, -1))
return true;
return false;
}
// Driver program to test above functions
int main()
{
Graph g1(5);
g1.addEdge(1, 0);
g1.addEdge(0, 2);
g1.addEdge(2, 0);
g1.addEdge(0, 3);
g1.addEdge(3, 4);
g1.isCyclic()? cout << "Graph contains cycle\n":
cout << "Graph doesn't contain cycle\n";
Graph g2(3);
g2.addEdge(0, 1);
g2.addEdge(1, 2);
g2.isCyclic()? cout << "Graph contains cycle\n":
cout << "Graph doesn't contain cycle\n";
return 0;
}
Java
// A Java Program to detect cycle in an undirected graph
import java.io.*;
import java.util.*;
// This class represents an undirected graph using adjacency list
// representation
class Graph
{
private int V; // No. of vertices
private LinkedList<Integer> adj[]; // Adjacency List Representation
// Constructor
Graph(int v) {
V = v;
adj = new LinkedList[v];
for(int i=0; i<v; ++i)
adj[i] = new LinkedList();
}
// Function to add an edge into the graph
void addEdge(int v,int w) {
adj[v].add(w);
adj[w].add(v);
}
// A recursive function that uses visited[] and parent to detect
// cycle in subgraph reachable from vertex v.
Boolean isCyclicUtil(int v, Boolean visited[], int parent)
{
// Mark the current node as visited
visited[v] = true;
Integer i;
// Recur for all the vertices adjacent to this vertex
Iterator<Integer> it = adj[v].iterator();
while (it.hasNext())
{
i = it.next();
Output:
Graph contains cycle
Graph doesn't contain cycle
Time Complexity: The program does a simple DFS Traversal of graph and graph is represented using adjacency list. So the time complexity is
O(V+E)
Exercise: Can we use BFS to detect cycle in an undirected graph in O(V+E) time? What about directed graphs?
public:
AdjListNode(int _v, int _w) { v = _v; weight = _w;}
int getV()
{ return v; }
int getWeight() { return weight; }
};
// Class to represent a graph using adjacency list representation
class Graph
{
int V; // No. of vertices
// Pointer to an array containing adjacency lists
list<AdjListNode> *adj;
// A function used by longestPath
void topologicalSortUtil(int v, bool visited[], stack<int> &Stack);
public:
Graph(int V); // Constructor
// function to add an edge to graph
void addEdge(int u, int v, int weight);
// Finds longest distances from given source vertex
void longestPath(int s);
};
Graph::Graph(int V) // Constructor
{
this->V = V;
adj = new list<AdjListNode>[V];
}
void Graph::addEdge(int u, int v, int weight)
{
AdjListNode node(v, weight);
adj[u].push_back(node); // Add v to u's list
}
// A recursive function used by longestPath. See below link for details
// http://www.geeksforgeeks.org/topological-sorting/
void Graph::topologicalSortUtil(int v, bool visited[], stack<int> &Stack)
{
// Mark the current node as visited
visited[v] = true;
// Recur for all the vertices adjacent to this vertex
list<AdjListNode>::iterator i;
for (i = adj[v].begin(); i != adj[v].end(); ++i)
{
AdjListNode node = *i;
if (!visited[node.getV()])
topologicalSortUtil(node.getV(), visited, Stack);
}
// Push current vertex to stack which stores topological sort
Stack.push(v);
}
// The function to find longest distances from a given vertex. It uses
// recursive topologicalSortUtil() to get topological sorting.
void Graph::longestPath(int s)
{
stack<int> Stack;
int dist[V];
// Mark all the vertices as not visited
bool *visited = new bool[V];
for (int i = 0; i < V; i++)
visited[i] = false;
// Call the recursive helper function to store Topological Sort
// starting from all vertices one by one
for (int i = 0; i < V; i++)
if (visited[i] == false)
topologicalSortUtil(i, visited, Stack);
// Initialize distances to all vertices as infinite and distance
// to source as 0
for (int i = 0; i < V; i++)
dist[i] = NINF;
dist[s] = 0;
Output:
Following are longest distances from source vertex 1
INF 0 2 9 8 10
Time Complexity: The time complexity of topological sorting is O(V+E). After finding the topological order, the algorithm processes all vertices and,
for every vertex, runs a loop over all adjacent vertices. The total number of adjacent vertices in a graph is O(E), so the inner loop runs O(V+E) times.
Therefore, the overall time complexity of this algorithm is O(V+E).
Exercise: The above solution prints the longest distances; extend the code to print the paths as well.
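The step analysed above, processing vertices in topological order and updating the distances of their adjacent vertices, can be sketched as follows. This is a self-contained sketch using std::vector adjacency lists rather than the Graph class from the listing; the helper name topoVisit and the WeightedAdj alias are assumptions, while NINF follows the listing's name for "minus infinity".

```cpp
#include <climits>
#include <stack>
#include <utility>
#include <vector>

const int NINF = INT_MIN;  // marks a vertex not reachable from the source

// adj[u] holds (v, weight) pairs for every edge u -> v.
using WeightedAdj = std::vector<std::vector<std::pair<int, int>>>;

// DFS helper: pushes v after all of its descendants, so popping the stack
// yields a topological order.
void topoVisit(int v, const WeightedAdj& adj,
               std::vector<bool>& visited, std::stack<int>& order)
{
    visited[v] = true;
    for (const auto& e : adj[v])
        if (!visited[e.first]) topoVisit(e.first, adj, visited, order);
    order.push(v);
}

// Returns the longest distance from s to every vertex of a DAG
// (NINF for vertices that cannot be reached from s).
std::vector<int> longestPath(const WeightedAdj& adj, int s)
{
    const int V = static_cast<int>(adj.size());
    std::vector<bool> visited(V, false);
    std::stack<int> order;
    for (int i = 0; i < V; ++i)
        if (!visited[i]) topoVisit(i, adj, visited, order);

    std::vector<int> dist(V, NINF);
    dist[s] = 0;

    // Process vertices in topological order, updating adjacent vertices.
    while (!order.empty())
    {
        int u = order.top();
        order.pop();
        if (dist[u] == NINF) continue;  // u cannot be reached from s
        for (const auto& e : adj[u])
            if (dist[e.first] < dist[u] + e.second)
                dist[e.first] = dist[u] + e.second;
    }
    return dist;
}
```

Skipping vertices whose distance is still NINF avoids adding a weight to INT_MIN, which would overflow.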
Topological Sorting
Topological sorting for a Directed Acyclic Graph (DAG) is a linear ordering of vertices such that for every directed edge uv, vertex u comes before
v in the ordering. Topological sorting for a graph is not possible if the graph is not a DAG.
For example, a topological sorting of the following graph is "5 4 2 3 1 0". There can be more than one topological sorting for a graph. For example,
another topological sorting of the following graph is "4 5 2 3 1 0". The first vertex in a topological sorting is always a vertex with in-degree 0 (a
vertex with no incoming edges).
C++
// A C++ program to print topological sorting of a DAG
#include<iostream>
#include <list>
#include <stack>
using namespace std;
// Class to represent a graph
class Graph
{
int V; // No. of vertices
// Pointer to an array containing adjacency lists
list<int> *adj;
// A function used by topologicalSort
void topologicalSortUtil(int v, bool visited[], stack<int> &Stack);
public:
Graph(int V); // Constructor
// function to add an edge to graph
void addEdge(int v, int w);
// prints a Topological Sort of the complete graph
void topologicalSort();
};
Graph::Graph(int V)
{
this->V = V;
adj = new list<int>[V];
}
void Graph::addEdge(int v, int w)
{
Java
// A Java program to print topological sorting of a DAG
import java.io.*;
import java.util.*;
// This class represents a directed graph using adjacency
// list representation
class Graph
{
private int V; // No. of vertices
private LinkedList<Integer> adj[]; // Adjacency List
//Constructor
Graph(int v)
{
V = v;
Time Complexity: The above algorithm is simply DFS with an extra stack. So time complexity is same as DFS which is O(V+E).
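The "DFS with an extra stack" idea can be sketched as below. This is a minimal sketch over std::vector adjacency lists, not the Graph class from the listing; the function names and signatures are assumptions, and a vector reversed at the end stands in for the stack.

```cpp
#include <algorithm>
#include <vector>

// DFS helper: appends v only after every vertex reachable from it.
// Names and signatures here are illustrative assumptions.
void topoSortUtil(int v, const std::vector<std::vector<int>>& adj,
                  std::vector<bool>& visited, std::vector<int>& finished)
{
    visited[v] = true;
    for (int w : adj[v])
        if (!visited[w]) topoSortUtil(w, adj, visited, finished);
    finished.push_back(v);
}

// Returns one topological order of a DAG given as adjacency lists.
std::vector<int> topologicalSort(const std::vector<std::vector<int>>& adj)
{
    const int V = static_cast<int>(adj.size());
    std::vector<bool> visited(V, false);
    std::vector<int> finished;
    for (int i = 0; i < V; ++i)
        if (!visited[i]) topoSortUtil(i, adj, visited, finished);

    // Reversing the finish order plays the role of popping the stack.
    std::reverse(finished.begin(), finished.end());
    return finished;
}
```

A vertex is recorded only after all vertices reachable from it, so in the reversed order every edge uv has u before v. The sketch assumes the input is a DAG and does not check for cycles.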
Applications:
Topological Sorting is mainly used for scheduling jobs from the given dependencies among jobs. In computer science, applications of this type
arise in instruction scheduling, ordering of formula cell evaluation when recomputing formula values in spreadsheets, logic synthesis, determining the
order of compilation tasks to perform in makefiles, data serialization, and resolving symbol dependencies in linkers [2].
References:
http://www.personal.kent.edu/~rmuhamma/Algorithms/MyAlgorithms/GraphAlgor/topoSort.htm
http://en.wikipedia.org/wiki/Topological_sorting
A graph is bipartite if it can be colored using two colors such that all vertices in the same set have the same color. Note that
it is possible to color a cycle graph with an even number of vertices using two colors. For example, see the following graph.
It is not possible to color a cycle graph with an odd number of vertices using two colors.
C++
// C++ program to find out whether a given graph is Bipartite or not
#include <iostream>
#include <queue>
#define V 4
using namespace std;
// This function returns true if graph G[V][V] is Bipartite, else false
bool isBipartite(int G[][V], int src)
{
// Create a color array to store colors assigned to all vertices. Vertex
// number is used as index in this array. The value '-1' of colorArr[i]
// is used to indicate that no color is assigned to vertex 'i'. The value
// 1 is used to indicate first color is assigned and value 0 indicates
// second color is assigned.
int colorArr[V];
for (int i = 0; i < V; ++i)
colorArr[i] = -1;
// Assign first color to source
colorArr[src] = 1;
// Create a queue (FIFO) of vertex numbers and enqueue source vertex
Java
// Java program to find out whether a given graph is Bipartite or not
import java.util.*;
import java.lang.*;
import java.io.*;
class Bipartite
{
final static int V = 4; // No. of Vertices
// This function returns true if graph G[V][V] is Bipartite, else false
boolean isBipartite(int G[][],int src)
{
// Create a color array to store colors assigned to all vertices.
// Vertex number is used as index in this array. The value '-1'
// of colorArr[i] is used to indicate that no color is assigned
// to vertex 'i'. The value 1 is used to indicate first color
// is assigned and value 0 indicates second color is assigned.
int colorArr[] = new int[V];
for (int i=0; i<V; ++i)
colorArr[i] = -1;
// Assign first color to source
colorArr[src] = 1;
// Create a queue (FIFO) of vertex numbers and enqueue
// source vertex for BFS traversal
LinkedList<Integer>q = new LinkedList<Integer>();
q.add(src);
// Run while there are vertices in queue (Similar to BFS)
while (q.size() != 0)
{
// Dequeue a vertex from queue
int u = q.poll();
// Find all non-colored adjacent vertices
for (int v=0; v<V; ++v)
{
// An edge from u to v exists and destination v is
// not colored
if (G[u][v]==1 && colorArr[v]==-1)
{
// Assign alternate color to this adjacent v of u
colorArr[v] = 1-colorArr[u];
q.add(v);
}
// An edge from u to v exists and destination v is
// colored with same color as u
else if (G[u][v]==1 && colorArr[v]==colorArr[u])
return false;
}
}
// If we reach here, then all adjacent vertices can
// be colored with alternate color
return true;
}
// Driver program to test above function
public static void main (String[] args)
{
int G[][] = {{0, 1, 0, 1},
{1, 0, 1, 0},
{0, 1, 0, 1},
{1, 0, 1, 0}
};
Bipartite b = new Bipartite();
if (b.isBipartite(G, 0))
System.out.println("Yes");
else
System.out.println("No");
}
}
// Contributed by Aakash Hasija
Output:
Yes
For example, consider the board shown on the right side (taken from here). The minimum number of dice throws required to reach cell 30 from cell 1 is
3. The steps are as follows.
a) First throw 2 to reach cell 3, then take the ladder to reach cell 22.
b) Then throw 6 to reach cell 28.
c) Finally throw 2 to reach cell 30.
There can be other solutions as well, such as (2, 2, 6), (2, 4, 4), (2, 3, 5), etc.
Following is a C++ implementation of the above idea. The input is represented by two things: the first is N, the number of cells on the given board, and
the second is an array move[0..N-1] of size N. An entry move[i] is -1 if there is no snake and no ladder from i; otherwise move[i] contains the index of the
destination cell for the snake or the ladder at i.
// C++ program to find minimum number of dice throws required to
// reach last cell from first cell of a given snake and ladder
// board
#include<iostream>
#include <queue>
using namespace std;
// An entry in queue used in BFS
struct queueEntry
{
int v; // Vertex number
int dist; // Distance of this vertex from source
};
// This function returns minimum number of dice throws required to
// Reach last cell from 0'th cell in a snake and ladder game.
// move[] is an array of size N where N is no. of cells on board
// If there is no snake or ladder from cell i, then move[i] is -1
// Otherwise move[i] contains cell to which snake or ladder at i
// takes to.
int getMinDiceThrows(int move[], int N)
{
// The graph has N vertices. Mark all the vertices as
// not visited
bool *visited = new bool[N];
for (int i = 0; i < N; i++)
visited[i] = false;
// Create a queue for BFS
queue<queueEntry> q;
// Mark the node 0 as visited and enqueue it.
visited[0] = true;
queueEntry s = {0, 0}; // distance of 0'th vertex is also 0
q.push(s); // Enqueue 0'th vertex
// Do a BFS starting from vertex at index 0
queueEntry qe; // A queue entry (qe)
while (!q.empty())
{
qe = q.front();
int v = qe.v; // vertex no. of queue entry
=
=
=
=
0;
8;
3;
6;
cout << "Min Dice throws required is " << getMinDiceThrows(moves, N);
return 0;
}
Output:
Min Dice throws required is 3
Time complexity of the above solution is O(N) as every cell is added and removed only once from queue. And a typical enqueue or dequeue
operation takes O(1) time.
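The BFS described above can be sketched in a self-contained form. This is an illustrative version, not the listing above: the function name minDiceThrows is hypothetical, the board is passed as a std::vector using the same move[] convention (-1 for a plain cell, otherwise the destination index), and -1 is returned when the last cell cannot be reached.

```cpp
#include <queue>
#include <utility>
#include <vector>

// move[i] == -1: no snake or ladder at cell i; otherwise move[i] is the
// cell that the snake or ladder at i leads to.
// Returns the minimum number of throws to go from cell 0 to cell N-1,
// or -1 if cell N-1 is unreachable.
int minDiceThrows(const std::vector<int>& move)
{
    int N = static_cast<int>(move.size());
    if (N == 0)
        return -1;
    std::vector<bool> visited(N, false);
    std::queue<std::pair<int, int>> q; // (cell, throws used so far)
    visited[0] = true;
    q.push({0, 0});
    while (!q.empty()) {
        std::pair<int, int> cur = q.front();
        q.pop();
        int v = cur.first, dist = cur.second;
        if (v == N - 1)
            return dist;
        // Try every dice outcome 1..6 that stays on the board
        for (int j = v + 1; j <= v + 6 && j < N; ++j) {
            if (visited[j])
                continue;
            visited[j] = true; // each landing cell is expanded once
            int dest = (move[j] != -1) ? move[j] : j;
            q.push({dest, dist + 1});
        }
    }
    return -1;
}
```

Because every cell is marked visited at most once and each dequeue tries at most six outcomes, the work stays linear in N, which agrees with the O(N) bound above.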
Biconnected Components
A biconnected component is a maximal biconnected subgraph.
Biconnected graphs have already been discussed here. In this article, we will see how to find the biconnected components of a graph using the algorithm by John
Hopcroft and Robert Tarjan.
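The edge-stack idea behind the algorithm can be sketched as follows. This is a simplified variant written for illustration, not the listing below: the names bccVisit, biconnectedComponents and EdgeList are hypothetical, the graph is assumed to be a simple undirected graph with symmetric adjacency lists, and a component is popped for every DFS child v of u with low[v] >= disc[u] (the root included), so no separate pass over leftover stack edges is needed.

```cpp
#include <algorithm>
#include <stack>
#include <utility>
#include <vector>

using EdgeList = std::vector<std::pair<int, int>>;

// Hypothetical helper: DFS from u that fills comps with one edge list
// per biconnected component. disc[x] == 0 means "x not visited yet".
static void bccVisit(int u, int parentV,
                     const std::vector<std::vector<int>>& adj,
                     std::vector<int>& disc, std::vector<int>& low,
                     int& timer, std::stack<std::pair<int, int>>& st,
                     std::vector<EdgeList>& comps)
{
    disc[u] = low[u] = ++timer;
    for (int v : adj[u]) {
        if (disc[v] == 0) { // tree edge u-v
            st.push({u, v});
            bccVisit(v, u, adj, disc, low, timer, st, comps);
            low[u] = std::min(low[u], low[v]);
            if (low[v] >= disc[u]) {
                // Nothing below v reaches above u: pop one component
                EdgeList comp;
                std::pair<int, int> e;
                do {
                    e = st.top();
                    st.pop();
                    comp.push_back(e);
                } while (e != std::make_pair(u, v));
                comps.push_back(comp);
            }
        } else if (v != parentV && disc[v] < disc[u]) { // back edge to an ancestor
            st.push({u, v});
            low[u] = std::min(low[u], disc[v]);
        }
    }
}

// Returns the biconnected components as lists of edges.
std::vector<EdgeList> biconnectedComponents(const std::vector<std::vector<int>>& adj)
{
    int V = static_cast<int>(adj.size());
    std::vector<int> disc(V, 0), low(V, 0);
    int timer = 0;
    std::stack<std::pair<int, int>> st;
    std::vector<EdgeList> comps;
    for (int u = 0; u < V; ++u)
        if (disc[u] == 0)
            bccVisit(u, -1, adj, disc, low, timer, st, comps);
    return comps;
}
```

For a triangle 0-1-2 with an extra pendant edge 2-3, this sketch yields two components: the single edge 2-3 and the three triangle edges.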
C++
// A C++ program to find biconnected components in a given undirected graph
#include<iostream>
#include <list>
#include <stack>
#define NIL -1
using namespace std;
int count = 0;
class Edge
{
public:
int u;
int v;
Edge(int u, int v);
};
Edge::Edge(int u, int v)
{
this->u = u;
this->v = v;
}
// A class that represents a directed graph
class Graph
{
int V; // No. of vertices
int E; // No. of edges
list<int> *adj; // A dynamic array of adjacency lists
// A Recursive DFS based function used by BCC()
void BCCUtil(int u, int disc[], int low[],
list<Edge> *st, int parent[]);
public:
Graph(int V); // Constructor
void addEdge(int v, int w); // function to add an edge to graph
void BCC(); // prints biconnected components
};
Graph::Graph(int V)
{
this->V = V;
this->E = 0;
Java
// A Java program to find biconnected components in a given
// undirected graph
import java.io.*;
import java.util.*;
// This class represents a directed graph using adjacency
// list representation
class Graph
{
private int V, E; // No. of vertices & Edges respectively
private LinkedList<Integer> adj[]; // Adjacency List
// Count is number of biconnected components. time is
// used to find discovery times
static int count = 0, time = 0;
class Edge
{
int u;
int v;
Edge(int u, int v)
{
this.u = u;
this.v = v;
}
};
//Constructor
Graph(int v)
{
V = v;
E = 0;
adj = new LinkedList[v];
for (int i=0; i<v; ++i)
adj[i] = new LinkedList();
}
//Function to add an edge into the graph
void addEdge(int v,int w)
{
adj[v].add(w);
E++;
}
// A recursive function that finds and prints biconnected
// components using DFS traversal
// u --> The vertex to be visited next
// disc[] --> Stores discovery times of visited vertices
// low[] --> earliest visited vertex (the vertex with minimum
//           discovery time) that can be reached from subtree
//           rooted with current vertex
// st --> To store visited edges
void BCCUtil(int u, int disc[], int low[], LinkedList<Edge>st,
int parent[])
{
// Initialize discovery time and low value
disc[u] = low[u] = ++time;
int children = 0;
// Go through all vertices adjacent to this
Iterator<Integer> it = adj[u].iterator();
while (it.hasNext())
{
int v = it.next(); // v is current adjacent of 'u'
// If v is not visited yet, then recur for it
if (disc[v] == -1)
{
children++;
parent[v] = u;
// store the edge in stack
st.add(new Edge(u,v));
BCCUtil(v, disc, low, st, parent);
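// Check if the subtree rooted with 'v' has a
// connection to one of the ancestors of 'u'
// Case 1 -- per Strongly Connected Components Article
if (low[u] > low[v])
low[u] = low[v];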
// If u is an articulation point,
// pop all edges from stack till u -- v
if ( (disc[u] == 1 && children > 1) ||
(disc[u] > 1 && low[v] >= disc[u]) )
{
while (st.getLast().u != u || st.getLast().v != v)
{
System.out.print(st.getLast().u + "--" +
st.getLast().v + " ");
st.removeLast();
}
System.out.println(st.getLast().u + "--" +
st.getLast().v + " ");
st.removeLast();
count++;
}
}
// Update low value of 'u' only if 'v' is still in stack
// (i.e. it's a back edge, not cross edge).
// Case 2 -- per Strongly Connected Components Article
Output:
4--2 3--4
8--9
8--5 7--8
6--0 5--6
10--11
Above are
C++
// A C++ Program to check whether a graph is tree or not
#include<iostream>
#include <list>
#include <limits.h>
using namespace std;
// Class for an undirected graph
class Graph
{
int V; // No. of vertices
list<int> *adj; // Pointer to an array for adjacency lists
bool isCyclicUtil(int v, bool visited[], int parent);
public:
Graph(int V); // Constructor
void addEdge(int v, int w); // to add an edge to graph
bool isTree(); // returns true if graph is tree
};
Graph::Graph(int V)
{
this->V = V;
adj = new list<int>[V];
}
void Graph::addEdge(int v, int w)
{
adj[v].push_back(w); // Add w to v's list.
adj[w].push_back(v); // Add v to w's list.
}
// A recursive function that uses visited[] and parent to
// detect cycle in subgraph reachable from vertex v.
bool Graph::isCyclicUtil(int v, bool visited[], int parent)
{
// Mark the current node as visited
visited[v] = true;
// Recur for all the vertices adjacent to this vertex
list<int>::iterator i;
for (i = adj[v].begin(); i != adj[v].end(); ++i)
{
// If an adjacent is not visited, then recur for
// that adjacent
if (!visited[*i])
{
if (isCyclicUtil(*i, visited, v))
return true;
}
// If an adjacent is visited and not parent of current
// vertex, then there is a cycle.
else if (*i != parent)
return true;
}
return false;
}
// Returns true if the graph is a tree, else false.
bool Graph::isTree()
{
// Mark all the vertices as not visited and not part of
// recursion stack
bool *visited = new bool[V];
for (int i = 0; i < V; i++)
visited[i] = false;
//
//
//
//
if
Java
// A Java Program to check whether a graph is tree or not
import java.io.*;
import java.util.*;
// This class represents an undirected graph using adjacency
// list representation
class Graph
{
private int V; // No. of vertices
private LinkedList<Integer> adj[]; //Adjacency List
// Constructor
Graph(int v)
{
V = v;
adj = new LinkedList[v];
for (int i=0; i<v; ++i)
adj[i] = new LinkedList();
}
// Function to add an edge into the graph
void addEdge(int v,int w)
{
adj[v].add(w);
adj[w].add(v);
}
// A recursive function that uses visited[] and parent
// to detect cycle in subgraph reachable from vertex v.
Boolean isCyclicUtil(int v, Boolean visited[], int parent)
{
// Mark the current node as visited
visited[v] = true;
Integer i;
// Recur for all the vertices adjacent to this vertex
Iterator<Integer> it = adj[v].iterator();
while (it.hasNext())
{
i = it.next();
// If an adjacent is not visited, then recur for
// that adjacent
if (!visited[i])
{
if (isCyclicUtil(i, visited, v))
return true;
}
// If an adjacent is visited and not parent of
// current vertex, then there is a cycle.
else if (i != parent)
return true;
}
return false;
}
// Returns true if the graph is a tree, else false.
Boolean isTree()
{
// Mark all the vertices as not visited and not part
// of recursion stack
Boolean visited[] = new Boolean[V];
for (int i = 0; i < V; i++)
visited[i] = false;
//
//
//
//
if
g1.addEdge(1, 0);
g1.addEdge(0, 2);
g1.addEdge(0, 3);
g1.addEdge(3, 4);
if (g1.isTree())
System.out.println("Graph is Tree");
else
System.out.println("Graph is not Tree");
Graph g2 = new Graph(5);
g2.addEdge(1, 0);
g2.addEdge(0, 2);
g2.addEdge(2, 1);
g2.addEdge(0, 3);
g2.addEdge(3, 4);
if (g2.isTree())
System.out.println("Graph is Tree");
else
System.out.println("Graph is not Tree");
}
}
// This code is contributed by Aakash Hasija
Graph is Tree
Graph is not Tree
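The tree check rests on two conditions: the graph has no cycle, and every vertex is reachable from vertex 0. A minimal C++ sketch of both conditions follows. The names hasCycleFrom and isTreeGraph are hypothetical, the graph is assumed to be a simple undirected graph (no parallel edges) with symmetric adjacency lists, and an empty graph is treated as not a tree by convention.

```cpp
#include <vector>

// Hypothetical helper: DFS from v that reports a cycle when it meets a
// visited neighbor other than the vertex it came from.
static bool hasCycleFrom(int v, int parent,
                         const std::vector<std::vector<int>>& adj,
                         std::vector<bool>& visited)
{
    visited[v] = true;
    for (int w : adj[v]) {
        if (!visited[w]) {
            if (hasCycleFrom(w, v, adj, visited))
                return true;
        } else if (w != parent) {
            return true; // visited and not the parent: a cycle exists
        }
    }
    return false;
}

// Returns true if the undirected graph is connected and acyclic.
bool isTreeGraph(const std::vector<std::vector<int>>& adj)
{
    int V = static_cast<int>(adj.size());
    if (V == 0)
        return false; // convention chosen for this sketch
    std::vector<bool> visited(V, false);
    // Condition 1: no cycle in the part reachable from vertex 0
    if (hasCycleFrom(0, -1, adj, visited))
        return false;
    // Condition 2: the DFS from vertex 0 must have reached every vertex
    for (int u = 0; u < V; ++u)
        if (!visited[u])
            return false;
    return true;
}
```

With the two example graphs above, the first (edges 1-0, 0-2, 0-3, 3-4) satisfies both conditions, while the second fails because edge 2-1 closes a cycle.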
The set mstSet is initially empty and the keys assigned to vertices are {0, INF, INF, INF, INF, INF, INF, INF, INF}, where INF indicates infinite. Now
pick the vertex with the minimum key value. Vertex 0 is picked and included in mstSet, so mstSet becomes {0}. After including it in mstSet, update the
key values of its adjacent vertices. The adjacent vertices of 0 are 1 and 7, and their key values are updated to 4 and 8. The following subgraph shows
vertices and their key values; only the vertices with finite key values are shown. The vertices included in the MST are shown in green.
Pick the vertex with the minimum key value that is not already included in the MST (not in mstSet). Vertex 1 is picked and added to mstSet, so mstSet
now becomes {0, 1}. Update the key values of the adjacent vertices of 1. The key value of vertex 2 becomes 8.
Pick the vertex with the minimum key value that is not already included in the MST (not in mstSet). We can pick either vertex 7 or vertex 2; let vertex 7 be
picked. So mstSet now becomes {0, 1, 7}. Update the key values of the adjacent vertices of 7. The key values of vertices 6 and 8 become finite (1
and 7 respectively).
Pick the vertex with the minimum key value that is not already included in the MST (not in mstSet). Vertex 6 is picked. So mstSet now becomes {0, 1, 7,
6}. Update the key values of the adjacent vertices of 6. The key values of vertices 5 and 8 are updated.
We repeat the above steps until mstSet includes all vertices of given graph. Finally, we get the following graph.
C/C++
// A C / C++ program for Prim's Minimum Spanning Tree (MST) algorithm.
// The program is for adjacency matrix representation of the graph
#include <stdio.h>
#include <limits.h>
#include <stdbool.h> // bool, true and false when compiled as C
// Number of vertices in the graph
#define V 5
// A utility function to find the vertex with minimum key value, from
// the set of vertices not yet included in MST
int minKey(int key[], bool mstSet[])
{
// Initialize min value
int min = INT_MAX, min_index;
for (int v = 0; v < V; v++)
if (mstSet[v] == false && key[v] < min)
min = key[v], min_index = v;
return min_index;
}
// A utility function to print the constructed MST stored in parent[]
void printMST(int parent[], int n, int graph[V][V])
{
printf("Edge   Weight\n");
for (int i = 1; i < V; i++)
printf("%d - %d    %d \n", parent[i], i, graph[i][parent[i]]);
}
// Function to construct and print MST for a graph represented using adjacency
// matrix representation
void primMST(int graph[V][V])
{
int parent[V]; // Array to store constructed MST
int key[V]; // Key values used to pick minimum weight edge in cut
bool mstSet[V]; // To represent set of vertices included in MST
Java
// A Java program for Prim's Minimum Spanning Tree (MST) algorithm.
// The program is for adjacency matrix representation of the graph
import java.util.*;
import java.lang.*;
import java.io.*;
class MST
{
// Number of vertices in the graph
private static final int V=5;
// A utility function to find the vertex with minimum key
// value, from the set of vertices not yet included in MST
int minKey(int key[], Boolean mstSet[])
{
// Initialize min value
int min = Integer.MAX_VALUE, min_index=-1;
{
parent[v] = u;
key[v] = graph[u][v];
}
}
// print the constructed MST
printMST(parent, V, graph);
}
public static void main (String[] args)
{
/* Let us create the following graph
      2    3
  (0)--(1)--(2)
   |   / \   |
  6| 8/   \5 |7
   | /     \ |
  (3)-------(4)
        9          */
MST t = new MST();
int graph[][] = new int[][] {{0, 2, 0, 6, 0},
{2, 0, 3, 8, 5},
{0, 3, 0, 0, 7},
{6, 8, 0, 0, 9},
{0, 5, 7, 9, 0},
};
// Print the solution
t.primMST(graph);
}
}
// This code is contributed by Aakash Hasija
Output:
Edge   Weight
0 - 1    2
1 - 2    3
0 - 3    6
1 - 4    5
The time complexity of the above program is O(V^2). If the input graph is represented using an adjacency list, then the time complexity of Prim's
algorithm can be reduced to O(E log V) with the help of a binary heap. Please see Prim's MST for Adjacency List Representation for more details.
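The heap-based bound can be illustrated with a short sketch built on std::priority_queue. Note the swapped-in technique: the standard priority queue has no decreaseKey(), so this sketch pushes a fresh entry whenever a key improves and skips stale entries when they are popped ("lazy deletion"). The heap may then hold up to E entries, which gives O(E log E) = O(E log V). The function name primMSTWeight and the (neighbor, weight) pair layout are assumptions made for the example, and the graph is assumed to be connected and undirected.

```cpp
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// adj[u] holds (neighbor, weight) pairs of an undirected connected graph.
// Returns the total MST weight; parent[v] receives the MST neighbor of v
// (parent[0] stays -1 because vertex 0 is the start vertex).
long long primMSTWeight(const std::vector<std::vector<std::pair<int, int>>>& adj,
                        std::vector<int>& parent)
{
    int V = static_cast<int>(adj.size());
    parent.assign(V, -1);
    if (V == 0)
        return 0;
    std::vector<int> key(V, std::numeric_limits<int>::max());
    std::vector<bool> inMST(V, false);
    using Item = std::pair<int, int>; // (key, vertex)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
    key[0] = 0;
    pq.push({0, 0});
    long long total = 0;
    while (!pq.empty()) {
        Item top = pq.top();
        pq.pop();
        int u = top.second;
        if (inMST[u])
            continue; // stale entry left over from an earlier, larger key
        inMST[u] = true;
        total += top.first;
        for (const std::pair<int, int>& e : adj[u]) {
            int v = e.first, w = e.second;
            if (!inMST[v] && w < key[v]) {
                key[v] = w;      // better edge into v found
                parent[v] = u;
                pq.push({w, v}); // stands in for decreaseKey()
            }
        }
    }
    return total;
}
```

The trade-off is a slightly larger heap in exchange for not having to maintain a position array, which the explicit min heap implementation needs for decreaseKey().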
Network design.
telephone, electrical, hydraulic, TV cable, computer, road
The standard application is to a problem like phone network design. You have a business with several offices; you want to lease phone lines to
connect them up with each other; and the phone company charges different amounts of money to connect different pairs of cities. You want a set
of lines that connects all your offices with a minimum total cost. It should be a spanning tree, since if a network isn't a tree you can always remove
some edges and save money.
Approximation algorithms for NP-hard problems.
traveling salesperson problem, Steiner tree
A less obvious application is that the minimum spanning tree can be used to approximately solve the traveling salesman problem. A convenient
formal way of defining this problem is to find the shortest path that visits each point at least once.
Note that if you have a path visiting all points exactly once, it's a special kind of tree. For instance, in the example above, twelve of sixteen spanning
trees are actually paths. If you have a path visiting some vertices more than once, you can always drop some edges to get a tree. So in general the
MST weight is at most the TSP weight, because it's a minimization over a strictly larger set.
On the other hand, if you draw a path tracing around the minimum spanning tree, you trace each edge twice and visit all points, so the TSP weight
is at most twice the MST weight. Therefore this tour is within a factor of two of optimal.
Indirect applications.
max bottleneck paths
LDPC codes for error correction
image registration with Renyi entropy
learning salient features for real-time face verification
reducing data storage in sequencing amino acids in a protein
model locality of particle interactions in turbulent fluid flows
autoconfig protocol for Ethernet bridging to avoid cycles in a network
Cluster analysis
k clustering problem can be viewed as finding an MST and deleting the k-1 most
expensive edges.
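The clustering remark can be made concrete with a small sketch. Deleting the k-1 most expensive MST edges leaves the same components as running Kruskal's algorithm and stopping as soon as only k components remain, which is the form used here. The names WeightedEdge, findRoot and kClusters are hypothetical, and the input graph is assumed to be connected with 1 <= k <= V.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

struct WeightedEdge { int u, v, w; }; // hypothetical edge type for this sketch

// Find with path halving on a plain parent array.
static int findRoot(std::vector<int>& parent, int x)
{
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}

// Returns one label per vertex; two vertices share a label exactly
// when they end up in the same cluster.
std::vector<int> kClusters(int V, std::vector<WeightedEdge> edges, int k)
{
    std::sort(edges.begin(), edges.end(),
              [](const WeightedEdge& a, const WeightedEdge& b) { return a.w < b.w; });
    std::vector<int> parent(V);
    std::iota(parent.begin(), parent.end(), 0);
    int components = V;
    for (const WeightedEdge& e : edges) {
        if (components <= k)
            break; // the remaining MST edges are the k-1 most expensive ones
        int ru = findRoot(parent, e.u), rv = findRoot(parent, e.v);
        if (ru != rv) {
            parent[ru] = rv;
            --components;
        }
    }
    std::vector<int> label(V);
    for (int i = 0; i < V; ++i)
        label[i] = findRoot(parent, i);
    return label;
}
```

For four points on a line with edge weights 1, 10 and 1, asking for k = 2 separates the points at the weight-10 edge.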
Sources:
http://www.cs.princeton.edu/courses/archive/spr07/cos226/lectures/mst.pdf
http://www.ics.uci.edu/~eppstein/161/960206.html
Initially, the key value of the first vertex is 0 and INF (infinite) for all other vertices. So vertex 0 is extracted from the Min Heap and the key values of the vertices
adjacent to 0 (1 and 7) are updated. The Min Heap now contains all vertices except vertex 0.
The vertices in green color are the vertices included in MST.
Since the key value of vertex 1 is the minimum among all nodes in the Min Heap, it is extracted from the Min Heap and the key values of the vertices adjacent to 1 are
updated (a key is updated if the vertex is still in the Min Heap and its previous key value is greater than the weight of the edge from 1 to that vertex). The Min
Heap now contains all vertices except vertices 0 and 1.
Since the key value of vertex 7 is the minimum among all nodes in the Min Heap, it is extracted from the Min Heap and the key values of the vertices adjacent to 7 are
updated (a key is updated if the vertex is still in the Min Heap and its previous key value is greater than the weight of the edge from 7 to that vertex). The Min
Heap now contains all vertices except vertices 0, 1 and 7.
Since the key value of vertex 6 is the minimum among all nodes in the Min Heap, it is extracted from the Min Heap and the key values of the vertices adjacent to 6 are
updated (a key is updated if the vertex is still in the Min Heap and its previous key value is greater than the weight of the edge from 6 to that vertex). The Min
Heap now contains all vertices except vertices 0, 1, 7 and 6.
The above steps are repeated for the rest of the nodes in the Min Heap until the Min Heap becomes empty.
// C / C++ program for Prim's MST for adjacency list representation of graph
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
#include <stdbool.h> // bool, true and false when compiled as C
// A structure to represent a node in adjacency list
struct AdjListNode
{
int dest;
int weight;
struct AdjListNode* next;
};
// A structure to represent an adjacency list
struct AdjList
{
struct AdjListNode *head; // pointer to head node of list
};
// A structure to represent a graph. A graph is an array of adjacency lists.
// Size of array will be V (number of vertices in graph)
struct Graph
{
int V;
struct AdjList* array;
};
// A utility function to create a new adjacency list node
struct AdjListNode* newAdjListNode(int dest, int weight)
{
struct AdjListNode* newNode =
(struct AdjListNode*) malloc(sizeof(struct AdjListNode));
newNode->dest = dest;
newNode->weight = weight;
newNode->next = NULL;
return newNode;
}
// A utility function that creates a graph of V vertices
struct Graph* createGraph(int V)
{
struct Graph* graph = (struct Graph*) malloc(sizeof(struct Graph));
graph->V = V;
// Create an array of adjacency lists. Size of array will be V
graph->array = (struct AdjList*) malloc(V * sizeof(struct AdjList));
// Initialize each adjacency list as empty by making head as NULL
for (int i = 0; i < V; ++i)
graph->array[i].head = NULL;
return graph;
}
// Adds an edge to an undirected graph
void addEdge(struct Graph* graph, int src, int dest, int weight)
{
// Add an edge from src to dest. A new node is added to the adjacency
// list of src. The node is added at the beginning
struct AdjListNode* newNode = newAdjListNode(dest, weight);
newNode->next = graph->array[src].head;
graph->array[src].head = newNode;
// Since graph is undirected, add an edge from dest to src also
newNode = newAdjListNode(src, weight);
newNode->next = graph->array[dest].head;
graph->array[dest].head = newNode;
}
// Structure to represent a min heap node
struct MinHeapNode
{
int v;
int key;
};
// Structure to represent a min heap
struct MinHeap
{
int size; // Number of heap nodes present currently
int capacity; // Capacity of min heap
int *pos; // This is needed for decreaseKey()
struct MinHeapNode **array;
};
// A utility function to create a new Min Heap Node
struct MinHeapNode* newMinHeapNode(int v, int key)
{
struct MinHeapNode* minHeapNode =
(struct MinHeapNode*) malloc(sizeof(struct MinHeapNode));
minHeapNode->v = v;
minHeapNode->key = key;
return minHeapNode;
}
// A utility function to create a Min Heap
struct MinHeap* createMinHeap(int capacity)
{
struct MinHeap* minHeap =
(struct MinHeap*) malloc(sizeof(struct MinHeap));
minHeap->pos = (int *)malloc(capacity * sizeof(int));
minHeap->size = 0;
minHeap->capacity = capacity;
minHeap->array =
(struct MinHeapNode**) malloc(capacity * sizeof(struct MinHeapNode*));
return minHeap;
}
// A utility function to swap two nodes of min heap. Needed for min heapify
void swapMinHeapNode(struct MinHeapNode** a, struct MinHeapNode** b)
{
struct MinHeapNode* t = *a;
*a = *b;
*b = t;
}
// A standard function to heapify at given idx
// This function also updates position of nodes when they are swapped.
// Position is needed for decreaseKey()
void minHeapify(struct MinHeap* minHeap, int idx)
{
int smallest, left, right;
smallest = idx;
left = 2 * idx + 1;
right = 2 * idx + 2;
if (left < minHeap->size &&
minHeap->array[left]->key < minHeap->array[smallest]->key )
smallest = left;
if (right < minHeap->size &&
minHeap->array[right]->key < minHeap->array[smallest]->key )
smallest = right;
if (smallest != idx)
{
// The nodes to be swapped in min heap
struct MinHeapNode *smallestNode = minHeap->array[smallest];
struct MinHeapNode *idxNode = minHeap->array[idx];
// Swap positions
minHeap->pos[smallestNode->v] = idx;
minHeap->pos[idxNode->v] = smallest;
// Swap nodes
swapMinHeapNode(&minHeap->array[smallest], &minHeap->array[idx]);
minHeapify(minHeap, smallest);
}
}
// A utility function to check if the given minHeap is empty or not
int isEmpty(struct MinHeap* minHeap)
{
return minHeap->size == 0;
}
// Standard function to extract minimum node from heap
struct MinHeapNode* extractMin(struct MinHeap* minHeap)
{
if (isEmpty(minHeap))
return NULL;
// Store the root node
struct MinHeapNode* root = minHeap->array[0];
// Replace root node with last node
struct MinHeapNode* lastNode = minHeap->array[minHeap->size - 1];
minHeap->array[0] = lastNode;
// Update position of last node
minHeap->pos[root->v] = minHeap->size-1;
minHeap->pos[lastNode->v] = 0;
// Reduce heap size and heapify root
--minHeap->size;
minHeapify(minHeap, 0);
return root;
}
// Function to decrease key value of a given vertex v. This function
// uses pos[] of min heap to get the current index of node in min heap
void decreaseKey(struct MinHeap* minHeap, int v, int key)
{
// Get the index of v in heap array
int i = minHeap->pos[v];
// Get the node and update its key value
minHeap->array[i]->key = key;
// Travel up while the complete tree is not heapified.
// This is a O(Logn) loop
while (i && minHeap->array[i]->key < minHeap->array[(i - 1) / 2]->key)
{
// Swap this node with its parent
minHeap->pos[minHeap->array[i]->v] = (i-1)/2;
minHeap->pos[minHeap->array[(i-1)/2]->v] = i;
swapMinHeapNode(&minHeap->array[i], &minHeap->array[(i - 1) / 2]);
// move to parent index
i = (i - 1) / 2;
}
}
// A utility function to check if a given vertex
// 'v' is in min heap or not
bool isInMinHeap(struct MinHeap *minHeap, int v)
{
if (minHeap->pos[v] < minHeap->size)
return true;
return false;
}
// A utility function used to print the constructed MST
void printArr(int arr[], int n)
{
for (int i = 1; i < n; ++i)
printf("%d - %d\n", arr[i], i);
}
// The main function that constructs Minimum Spanning Tree (MST)
// using Prim's algorithm
}
pCrawl = pCrawl->next;
}
}
// print edges of MST
printArr(parent, V);
}
// Driver program to test above functions
int main()
{
// Let us create the graph given in above figure
int V = 9;
struct Graph* graph = createGraph(V);
addEdge(graph, 0, 1, 4);
addEdge(graph, 0, 7, 8);
addEdge(graph, 1, 2, 8);
addEdge(graph, 1, 7, 11);
addEdge(graph, 2, 3, 7);
addEdge(graph, 2, 8, 2);
addEdge(graph, 2, 5, 4);
addEdge(graph, 3, 4, 9);
addEdge(graph, 3, 5, 14);
addEdge(graph, 4, 5, 10);
addEdge(graph, 5, 6, 2);
addEdge(graph, 6, 7, 1);
addEdge(graph, 6, 8, 6);
addEdge(graph, 7, 8, 7);
PrimMST(graph);
return 0;
}
Output:
0 - 1
5 - 2
2 - 3
3 - 4
6 - 5
7 - 6
0 - 7
2 - 8
Time Complexity: The time complexity of the above code/algorithm looks like O(V^2) as there are two nested while loops. If we take a closer look,
we can observe that the statements in the inner loop are executed O(V+E) times (similar to BFS). The inner loop has a decreaseKey() operation which
takes O(LogV) time. So the overall time complexity is O(E+V)*O(LogV), which is O((E+V)*LogV) = O(ELogV) (for a connected graph, V =
O(E)).
References:
Introduction to Algorithms by Clifford Stein, Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest
http://en.wikipedia.org/wiki/Prim%27s_algorithm
Step #2 uses the Union-Find algorithm to detect a cycle, so we recommend reading the following posts as a prerequisite.
Union-Find Algorithm | Set 1 (Detect Cycle in a Graph)
Union-Find Algorithm | Set 2 (Union By Rank and Path Compression)
The algorithm is a Greedy Algorithm. The Greedy Choice is to pick the smallest weight edge that does not cause a cycle in the MST constructed
so far. Let us understand it with an example: Consider the below input graph.
The graph contains 9 vertices and 14 edges. So, the minimum spanning tree formed will have (9 - 1) = 8 edges.
After sorting:
Weight   Src   Dest
1        7     6
2        8     2
2        6     5
4        0     1
4        2     5
6        8     6
7        2     3
7        7     8
8        0     7
8        1     2
9        3     4
10       5     4
11       1     7
14       3     5
Now pick all edges one by one from the sorted list of edges.
1. Pick edge 7-6: No cycle is formed, include it.
2. Pick edge 8-2: No cycle is formed, include it.
3. Pick edge 6-5: No cycle is formed, include it.
4. Pick edge 0-1: No cycle is formed, include it.
5. Pick edge 2-5: No cycle is formed, include it.
6. Pick edge 8-6: Since including this edge results in a cycle, discard it.
7. Pick edge 2-3: No cycle is formed, include it.
8. Pick edge 7-8: Since including this edge results in a cycle, discard it.
9. Pick edge 0-7: No cycle is formed, include it.
10. Pick edge 1-2: Since including this edge results in a cycle, discard it.
11. Pick edge 3-4: No cycle is formed, include it.
Since the number of edges included equals (V - 1), the algorithm stops here.
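The steps above can be sketched in a compact, self-contained form. This is an illustrative version, not the listing that follows: the names KEdge, DisjointSets and kruskalMST are hypothetical, edges are passed as a std::vector, and the union-find uses union by rank with path compression as recommended in the prerequisite posts.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

struct KEdge { int src, dest, weight; }; // hypothetical edge type

// Minimal union-find with path compression and union by rank.
struct DisjointSets {
    std::vector<int> parent, rank;
    explicit DisjointSets(int n) : parent(n), rank(n, 0)
    {
        for (int i = 0; i < n; ++i)
            parent[i] = i;
    }
    int find(int i)
    {
        if (parent[i] != i)
            parent[i] = find(parent[i]); // path compression
        return parent[i];
    }
    // Merges the sets of x and y; returns false if they were already one set.
    bool unite(int x, int y)
    {
        int xr = find(x), yr = find(y);
        if (xr == yr)
            return false;
        if (rank[xr] < rank[yr])
            std::swap(xr, yr);
        parent[yr] = xr; // attach the lower-rank root under the higher one
        if (rank[xr] == rank[yr])
            ++rank[xr];
        return true;
    }
};

// Returns the MST edges in the order in which they were picked.
std::vector<KEdge> kruskalMST(int V, std::vector<KEdge> edges)
{
    std::sort(edges.begin(), edges.end(),
              [](const KEdge& a, const KEdge& b) { return a.weight < b.weight; });
    DisjointSets sets(V);
    std::vector<KEdge> result;
    for (const KEdge& e : edges) {
        if (static_cast<int>(result.size()) == V - 1)
            break; // V - 1 edges picked: the tree is complete
        if (sets.unite(e.src, e.dest)) // false means the edge closes a cycle
            result.push_back(e);
    }
    return result;
}
```

Run on the 14 edges of the example graph, the sketch keeps 8 edges; their weights 1, 2, 2, 4, 4, 7, 8 and 9 add up to 37.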
C/C++
// C++ program for Kruskal's algorithm to find Minimum Spanning Tree
// of a given connected, undirected and weighted graph
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
// a structure to represent a weighted edge in graph
struct Edge
{
int src, dest, weight;
};
// a structure to represent a connected, undirected and weighted graph
struct Graph
{
// V-> Number of vertices, E-> Number of edges
int V, E;
// graph is represented as an array of edges. Since the graph is
// undirected, the edge from src to dest is also edge from dest
// to src. Both are counted as 1 edge here.
struct Edge* edge;
};
// Creates a graph with V vertices and E edges
struct Graph* createGraph(int V, int E)
{
struct Graph* graph = (struct Graph*) malloc( sizeof(struct Graph) );
graph->V = V;
graph->E = E;
graph->edge = (struct Edge*) malloc( graph->E * sizeof( struct Edge ) );
return graph;
}
// A structure to represent a subset for union-find
struct subset
{
int parent;
int rank;
};
// A utility function to find set of an element i
// (uses path compression technique)
int find(struct subset subsets[], int i)
{
// find root and make root as parent of i (path compression)
if (subsets[i].parent != i)
subsets[i].parent = find(subsets, subsets[i].parent);
return subsets[i].parent;
}
// A function that does union of two sets of x and y
// (uses union by rank)
void Union(struct subset subsets[], int x, int y)
{
int xroot = find(subsets, x);
int yroot = find(subsets, y);
// Attach smaller rank tree under root of high rank tree
// (Union by Rank)
if (subsets[xroot].rank < subsets[yroot].rank)
subsets[xroot].parent = yroot;
else if (subsets[xroot].rank > subsets[yroot].rank)
subsets[yroot].parent = xroot;
// If ranks are same, then make one as root and increment
// its rank by one
else
{
subsets[yroot].parent = xroot;
subsets[xroot].rank++;
}
}
// Compare two edges according to their weights.
// Used in qsort() for sorting an array of edges
int myComp(const void* a, const void* b)
{
struct Edge* a1 = (struct Edge*)a;
struct Edge* b1 = (struct Edge*)b;
// qsort() expects a negative, zero or positive result
return (a1->weight > b1->weight) - (a1->weight < b1->weight);
}
// The main function to construct MST using Kruskal's algorithm
void KruskalMST(struct Graph* graph)
{
int V = graph->V;
struct Edge result[V]; // This will store the resultant MST
graph->edge[3].weight = 15;
// add edge 2-3
graph->edge[4].src = 2;
graph->edge[4].dest = 3;
graph->edge[4].weight = 4;
KruskalMST(graph);
return 0;
}
Java
// Java program for Kruskal's algorithm to find Minimum Spanning Tree
// of a given connected, undirected and weighted graph
import java.util.*;
import java.lang.*;
import java.io.*;
class Graph
{
// A class to represent a graph edge
class Edge implements Comparable<Edge>
{
int src, dest, weight;
// Comparator function used for sorting edges based on
// their weight
public int compareTo(Edge compareEdge)
{
// Integer.compare avoids the overflow that subtraction can cause
return Integer.compare(this.weight, compareEdge.weight);
}
};
// A class to represent a subset for union-find
class subset
{
int parent, rank;
};
int V, E; // V-> no. of vertices & E->no.of edges
Edge edge[]; // collection of all edges
// Creates a graph with V vertices and E edges
Graph(int v, int e)
{
V = v;
E = e;
edge = new Edge[E];
for (int i=0; i<e; ++i)
edge[i] = new Edge();
}
// A utility function to find set of an element i
// (uses path compression technique)
int find(subset subsets[], int i)
{
// find root and make root as parent of i (path compression)
if (subsets[i].parent != i)
subsets[i].parent = find(subsets, subsets[i].parent);
return subsets[i].parent;
}
// A function that does union of two sets of x and y
// (uses union by rank)
void Union(subset subsets[], int x, int y)
{
int xroot = find(subsets, x);
int yroot = find(subsets, y);
// Attach smaller rank tree under root of high rank tree
// (Union by Rank)
if (subsets[xroot].rank < subsets[yroot].rank)
subsets[xroot].parent = yroot;
else if (subsets[xroot].rank > subsets[yroot].rank)
subsets[yroot].parent = xroot;
// If ranks are same, then make one as root and increment
graph.edge[0].src = 0;
graph.edge[0].dest = 1;
graph.edge[0].weight = 10;
// add edge 0-2
graph.edge[1].src = 0;
graph.edge[1].dest = 2;
graph.edge[1].weight = 6;
// add edge 0-3
graph.edge[2].src = 0;
graph.edge[2].dest = 3;
graph.edge[2].weight = 5;
// add edge 1-3
graph.edge[3].src = 1;
graph.edge[3].dest = 3;
graph.edge[3].weight = 15;
// add edge 2-3
graph.edge[4].src = 2;
graph.edge[4].dest = 3;
graph.edge[4].weight = 4;
graph.KruskalMST();
}
}
//This code is contributed by Aakash Hasija
Following
2 -- 3 == 4
0 -- 3 == 5
0 -- 1 == 10
Time Complexity: O(ElogE) or O(ElogV). Sorting the edges takes O(ELogE) time. After sorting, we iterate through all edges and apply the find-union algorithm. The find and union operations can take at most O(LogV) time. So the overall complexity is O(ELogE + ELogV) time. The value of E
can be at most V^2, so O(LogV) and O(LogE) are the same. Therefore, the overall time complexity is O(ElogE) or O(ElogV).
References:
http://www.ics.uci.edu/~eppstein/161/960206.html
http://en.wikipedia.org/wiki/Minimum_spanning_tree
Below is the idea behind the above algorithm (the idea is the same as in Prim's MST algorithm).
A spanning tree means all vertices must be connected. So the two disjoint subsets (discussed above) of vertices must be connected to
make a Spanning Tree. And they must be connected with the minimum weight edge to make it a Minimum Spanning Tree.
Let us understand the algorithm with below example.
Initially MST is empty. Every vertex is singe component as highlighted in blue color in below diagram.
For every component, find the cheapest edge that connects it to some other component.
Components: {0}, {1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}
The cheapest edges are highlighted with green color. Now MST becomes {0-1, 2-8, 2-3, 3-4, 5-6, 6-7}.
After above step, components are {{0,1}, {2,3,4,8}, {5,6,7}}. The components are encircled with blue color.
We again repeat the step, i.e., for every component, find the cheapest edge that connects it to some other component.
Components: {0,1}, {2,3,4,8}, {5,6,7}
The cheapest edges are highlighted with green color. Now MST becomes {0-1, 2-8, 2-3, 3-4, 5-6, 6-7, 1-2, 2-5}
At this stage, there is only one component {0, 1, 2, 3, 4, 5, 6, 7, 8}, which contains all the MST edges. Since there is only one component left, we stop and
return the MST.
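The rounds described above can be sketched in C++ as follows. This is a hedged illustration rather than the article's code: BEdge, findRoot and boruvkaWeight are names assumed here, the union-find uses path halving without ranks to stay short, and equal weights are resolved by edge index.

```cpp
#include <numeric>
#include <vector>

// Hypothetical edge type for this sketch.
struct BEdge { int src, dest, weight; };

// Finds the representative of i's component (with path halving).
static int findRoot(std::vector<int>& parent, int i) {
    while (parent[i] != i) {
        parent[i] = parent[parent[i]];
        i = parent[i];
    }
    return i;
}

// Returns the total weight of the minimum spanning tree (or forest, if
// the graph is not connected) using Boruvka's rounds.
int boruvkaWeight(int V, const std::vector<BEdge>& edges) {
    std::vector<int> parent(V);
    std::iota(parent.begin(), parent.end(), 0);  // each vertex is a component
    int components = V, total = 0;
    while (components > 1) {
        // cheapest[c] = index of the cheapest edge leaving component c
        std::vector<int> cheapest(V, -1);
        for (int e = 0; e < static_cast<int>(edges.size()); ++e) {
            int a = findRoot(parent, edges[e].src);
            int b = findRoot(parent, edges[e].dest);
            if (a == b) continue;  // edge lies inside one component
            if (cheapest[a] == -1 || edges[e].weight < edges[cheapest[a]].weight)
                cheapest[a] = e;
            if (cheapest[b] == -1 || edges[e].weight < edges[cheapest[b]].weight)
                cheapest[b] = e;
        }
        bool merged = false;
        for (int c = 0; c < V; ++c) {
            if (cheapest[c] == -1) continue;
            const BEdge& e = edges[cheapest[c]];
            int a = findRoot(parent, e.src);
            int b = findRoot(parent, e.dest);
            if (a == b) continue;  // already joined earlier in this round
            parent[a] = b;         // merge the two components
            total += e.weight;
            --components;
            merged = true;
        }
        if (!merged) break;  // no crossing edges left: graph is disconnected
    }
    return total;
}
```

Each round at least halves the number of components, so there are O(log V) rounds of O(E) edge scans each.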
Implementation:
Below is C++ implementation of above algorithm. The input graph is represented as a collection of edges and union-find data structure is used to
keep track of components.
// Boruvka's algorithm to find Minimum Spanning
// Tree of a given connected, undirected and
// weighted graph
#include <stdio.h>
// a structure to represent a weighted edge in graph
struct Edge
{
int src, dest, weight;
};
// a structure to represent a connected, undirected
// and weighted graph as a collection of edges.
struct Graph
{
// V-> Number of vertices, E-> Number of edges
int V, E;
// graph is represented as an array of edges.
// Since the graph is undirected, the edge
// from src to dest is also edge from dest
// to src. Both are counted as 1 edge here.
Edge* edge;
};
// A structure to represent a subset for union-find
struct subset
{
int parent;
int rank;
};
// Function prototypes for union-find (These functions are defined
// after boruvkaMST() )
int find(struct subset subsets[], int i);
void Union(struct subset subsets[], int x, int y);
// The main function for MST using Boruvka's algorithm
void boruvkaMST(struct Graph* graph)
{
// Get data of given graph
int V = graph->V, E = graph->E;
Edge *edge = graph->edge;
graph->edge[3].weight = 15;
// add edge 2-3
graph->edge[4].src = 2;
graph->edge[4].dest = 3;
graph->edge[4].weight = 4;
boruvkaMST(graph);
return 0;
}
Output:
Edge 0-3 included in MST
Edge 0-1 included in MST
Edge 2-3 included in MST
Weight of MST is 19
The set sptSet is initially empty and the distances assigned to vertices are {0, INF, INF, INF, INF, INF, INF, INF, INF}, where INF indicates infinite.
Now pick the vertex with minimum distance value. The vertex 0 is picked, include it in sptSet. So sptSet becomes {0}. After including 0 to
sptSet, update distance values of its adjacent vertices. Adjacent vertices of 0 are 1 and 7. The distance values of 1 and 7 are updated as 4 and 8.
Following subgraph shows vertices and their distance values, only the vertices with finite distance values are shown. The vertices included in SPT
are shown in green color.
Pick the vertex with the minimum distance value that is not already included in the SPT (not in sptSet). Vertex 1 is picked and added to sptSet, so
sptSet now becomes {0, 1}. Update the distance values of the adjacent vertices of 1. The distance value of vertex 2 becomes 12.
Pick the vertex with the minimum distance value that is not already included in the SPT (not in sptSet). Vertex 7 is picked, so sptSet now becomes {0, 1,
7}. Update the distance values of the adjacent vertices of 7. The distance values of vertices 6 and 8 become finite (9 and 15 respectively).
Pick the vertex with the minimum distance value that is not already included in the SPT (not in sptSet). Vertex 6 is picked, so sptSet now becomes {0, 1,
7, 6}. Update the distance values of the adjacent vertices of 6. The distance value of vertex 5 becomes 11; vertex 8 stays at 15 because the path through 6 (9 + 6 = 15) is not shorter.
We repeat the above steps until sptSet includes all vertices of the given graph. Finally, we get the following Shortest Path Tree (SPT).
C/C++
// A C / C++ program for Dijkstra's single source shortest path algorithm.
// The program is for adjacency matrix representation of the graph
#include <stdio.h>
#include <limits.h>
#include <stdbool.h> // needed for bool when compiled as C
// Number of vertices in the graph
#define V 9
// A utility function to find the vertex with minimum distance value, from
// the set of vertices not yet included in shortest path tree
int minDistance(int dist[], bool sptSet[])
{
// Initialize min value
int min = INT_MAX, min_index;
for (int v = 0; v < V; v++)
if (sptSet[v] == false && dist[v] <= min)
min = dist[v], min_index = v;
return min_index;
}
// A utility function to print the constructed distance array
void printSolution(int dist[], int n)
{
printf("Vertex Distance from Source\n");
for (int i = 0; i < V; i++)
printf("%d \t\t %d\n", i, dist[i]);
}
// Function that implements Dijkstra's single source shortest path algorithm
// for a graph represented using adjacency matrix representation
void dijkstra(int graph[V][V], int src)
{
int dist[V];
// The output array. dist[i] will hold the shortest
// distance from src to i
bool sptSet[V]; // sptSet[i] will be true if vertex i is included in shortest
// path tree or shortest distance from src to i is finalized
// Initialize all distances as INFINITE and sptSet[] as false
for (int i = 0; i < V; i++)
dist[i] = INT_MAX, sptSet[i] = false;
// Distance of source vertex from itself is always 0
dist[src] = 0;
// Find shortest path for all vertices
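for (int count = 0; count < V-1; count++)
{
// Pick the minimum distance vertex from the set of vertices not
// yet processed. u is always equal to src in the first iteration.
int u = minDistance(dist, sptSet);
// Mark the picked vertex as processed
sptSet[u] = true;
// Update dist[v] only if v is not in sptSet, there is an edge from
// u to v, and the path from src to v through u is shorter than the
// current value of dist[v]
for (int v = 0; v < V; v++)
if (!sptSet[v] && graph[u][v] && dist[u] != INT_MAX
&& dist[u] + graph[u][v] < dist[v])
dist[v] = dist[u] + graph[u][v];
}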
// print the constructed distance array
printSolution(dist, V);
}
// driver program to test above function
int main()
{
/* Let us create the example graph discussed above */
int graph[V][V] = {{0, 4, 0, 0, 0, 0, 0, 8, 0},
{4, 0, 8, 0, 0, 0, 0, 11, 0},
{0, 8, 0, 7, 0, 4, 0, 0, 2},
{0, 0, 7, 0, 9, 14, 0, 0, 0},
{0, 0, 0, 9, 0, 10, 0, 0, 0},
{0, 0, 4, 0, 10, 0, 2, 0, 0},
{0, 0, 0, 14, 0, 2, 0, 1, 6},
{8, 11, 0, 0, 0, 0, 1, 0, 7},
{0, 0, 2, 0, 0, 0, 6, 7, 0}
};
dijkstra(graph, 0);
return 0;
}
Java
// A Java program for Dijkstra's single source shortest path algorithm.
// The program is for adjacency matrix representation of the graph
import java.util.*;
import java.lang.*;
import java.io.*;
class ShortestPath
{
// A utility function to find the vertex with minimum distance value,
// from the set of vertices not yet included in shortest path tree
static final int V=9;
int minDistance(int dist[], Boolean sptSet[])
{
// Initialize min value
int min = Integer.MAX_VALUE, min_index=-1;
for (int v = 0; v < V; v++)
if (sptSet[v] == false && dist[v] <= min)
{
min = dist[v];
min_index = v;
}
return min_index;
}
// A utility function to print the constructed distance array
void printSolution(int dist[], int n)
{
System.out.println("Vertex Distance from Source");
for (int i = 0; i < V; i++)
System.out.println(i+" \t\t "+dist[i]);
}
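// Function that implements Dijkstra's single source shortest path
// algorithm for a graph represented using adjacency matrix
// representation
void dijkstra(int graph[][], int src)
{
// The output array. dist[i] will hold the shortest distance from src to i
int dist[] = new int[V];
// sptSet[i] will be true if vertex i is included in the shortest path tree
Boolean sptSet[] = new Boolean[V];
// Initialize all distances as INFINITE and sptSet[] as false
for (int i = 0; i < V; i++)
{
dist[i] = Integer.MAX_VALUE;
sptSet[i] = false;
}
// Distance of source vertex from itself is always 0
dist[src] = 0;
// Find shortest path for all vertices
for (int count = 0; count < V-1; count++)
{
// Pick the minimum distance vertex not yet processed and mark it
int u = minDistance(dist, sptSet);
sptSet[u] = true;
// Update dist values of the unprocessed neighbours of u
for (int v = 0; v < V; v++)
if (!sptSet[v] && graph[u][v] != 0 &&
dist[u] != Integer.MAX_VALUE &&
dist[u] + graph[u][v] < dist[v])
dist[v] = dist[u] + graph[u][v];
}
// print the constructed distance array
printSolution(dist, V);
}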
// Driver method
public static void main (String[] args)
{
/* Let us create the example graph discussed above */
int graph[][] = new int[][]{{0, 4, 0, 0, 0, 0, 0, 8, 0},
{4, 0, 8, 0, 0, 0, 0, 11, 0},
{0, 8, 0, 7, 0, 4, 0, 0, 2},
{0, 0, 7, 0, 9, 14, 0, 0, 0},
{0, 0, 0, 9, 0, 10, 0, 0, 0},
{0, 0, 4, 0, 10, 0, 2, 0, 0},
{0, 0, 0, 14, 0, 2, 0, 1, 6},
{8, 11, 0, 0, 0, 0, 1, 0, 7},
{0, 0, 2, 0, 0, 0, 6, 7, 0}
};
ShortestPath t = new ShortestPath();
t.dijkstra(graph, 0);
}
}
//This code is contributed by Aakash Hasija
Output:
Vertex   Distance from Source
0        0
1        4
2        12
3        19
4        21
5        11
6        9
7        8
8        14
Notes:
1) The code calculates the shortest distances, but doesn't calculate the path information. We can create a parent array, update the parent array when
a distance is updated (as in Prim's implementation), and use it to show the shortest path from the source to different vertices.
2) The code is for an undirected graph; the same dijkstra function can be used for directed graphs as well.
3) The code finds the shortest distances from the source to all vertices. If we are interested only in the shortest distance from the source to a single target, we can
break the for loop when the picked minimum distance vertex is equal to the target (Step 3.a of the algorithm).
4) The time complexity of the implementation is O(V^2). If the input graph is represented using an adjacency list, it can be reduced to O(E log V) with
the help of a binary heap. Please see
Dijkstra's Algorithm for Adjacency List Representation for more details.
5) Dijkstra's algorithm doesn't work for graphs with negative weight edges. For graphs with negative weight edges, the Bellman-Ford algorithm can be
used; we will discuss it in a separate post.
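Notes 1 and 3 can be combined in a short C++ sketch: record a parent for each vertex whenever its distance improves, stop once the target is finalized, and walk the parent links backwards. The function name shortestPathVertices and the use of std::vector instead of fixed-size arrays are assumptions made for this illustration; as in the programs above, a matrix entry of 0 means "no edge".

```cpp
#include <algorithm>
#include <climits>
#include <vector>

// Returns the vertices on a shortest path from src to target (both
// inclusive), or an empty vector if target cannot be reached.
// graph[u][v] == 0 means there is no edge between u and v.
std::vector<int> shortestPathVertices(const std::vector<std::vector<int>>& graph,
                                      int src, int target) {
    const int n = static_cast<int>(graph.size());
    std::vector<int> dist(n, INT_MAX), parent(n, -1);
    std::vector<bool> done(n, false);
    dist[src] = 0;
    for (int count = 0; count < n; ++count) {
        // Pick the unprocessed vertex with the smallest finite distance.
        int u = -1;
        for (int v = 0; v < n; ++v)
            if (!done[v] && dist[v] != INT_MAX && (u == -1 || dist[v] < dist[u]))
                u = v;
        if (u == -1) break;      // the remaining vertices are unreachable
        done[u] = true;
        if (u == target) break;  // Note 3: the target's distance is final
        for (int v = 0; v < n; ++v)
            if (graph[u][v] != 0 && !done[v] && dist[u] + graph[u][v] < dist[v]) {
                dist[v] = dist[u] + graph[u][v];
                parent[v] = u;   // Note 1: remember how we reached v
            }
    }
    std::vector<int> path;
    if (dist[target] == INT_MAX) return path;
    // Follow parent links from target back to src, then reverse.
    for (int v = target; v != -1; v = parent[v]) path.push_back(v);
    std::reverse(path.begin(), path.end());
    return path;
}
```

On the 9-vertex example graph with source 0 and target 4, the recovered path is 0, 7, 6, 5, 4, whose weights 8 + 1 + 2 + 10 add up to the distance 21.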
Dijkstra's Algorithm for Adjacency List Representation
Initially, the distance value of the source vertex is 0 and INF (infinite) for all other vertices. So the source vertex is extracted from the Min Heap and the distance
values of the vertices adjacent to 0 (1 and 7) are updated. The Min Heap contains all vertices except vertex 0.
The vertices in green are the vertices for which the minimum distances are finalized and which are no longer in the Min Heap.
Since the distance value of vertex 1 is the minimum among all nodes in the Min Heap, it is extracted from the Min Heap and the distance values of the vertices adjacent
to 1 are updated (a distance is updated if the vertex is still in the Min Heap and the distance through 1 is shorter than the previous distance). The Min Heap
contains all vertices except vertices 0 and 1.
Pick the vertex with the minimum distance value from the min heap. Vertex 7 is picked, so the min heap now contains all vertices except 0, 1 and 7. Update
the distance values of the adjacent vertices of 7. The distance values of vertices 6 and 8 become finite (9 and 15 respectively).
Pick the vertex with the minimum distance from the min heap. Vertex 6 is picked, so the min heap now contains all vertices except 0, 1, 7 and 6. Update the
distance values of the adjacent vertices of 6. The distance value of vertex 5 becomes 11; vertex 8 stays at 15 because the path through 6 (9 + 6 = 15) is not shorter.
The above steps are repeated until the min heap becomes empty. Finally, we get the following shortest path tree.
{
// The nodes to be swapped in min heap
MinHeapNode *smallestNode = minHeap->array[smallest];
MinHeapNode *idxNode = minHeap->array[idx];
// Swap positions
minHeap->pos[smallestNode->v] = idx;
minHeap->pos[idxNode->v] = smallest;
// Swap nodes
swapMinHeapNode(&minHeap->array[smallest], &minHeap->array[idx]);
minHeapify(minHeap, smallest);
}
}
// A utility function to check if the given minHeap is empty or not
int isEmpty(struct MinHeap* minHeap)
{
return minHeap->size == 0;
}
// Standard function to extract minimum node from heap
struct MinHeapNode* extractMin(struct MinHeap* minHeap)
{
if (isEmpty(minHeap))
return NULL;
// Store the root node
struct MinHeapNode* root = minHeap->array[0];
// Replace root node with last node
struct MinHeapNode* lastNode = minHeap->array[minHeap->size - 1];
minHeap->array[0] = lastNode;
// Update position of last node
minHeap->pos[root->v] = minHeap->size-1;
minHeap->pos[lastNode->v] = 0;
// Reduce heap size and heapify root
--minHeap->size;
minHeapify(minHeap, 0);
return root;
}
// Function to decrease dist value of a given vertex v. This function
// uses pos[] of min heap to get the current index of node in min heap
void decreaseKey(struct MinHeap* minHeap, int v, int dist)
{
// Get the index of v in heap array
int i = minHeap->pos[v];
// Get the node and update its dist value
minHeap->array[i]->dist = dist;
// Travel up while the complete tree is not heapified.
// This is a O(Logn) loop
while (i && minHeap->array[i]->dist < minHeap->array[(i - 1) / 2]->dist)
{
// Swap this node with its parent
minHeap->pos[minHeap->array[i]->v] = (i-1)/2;
minHeap->pos[minHeap->array[(i-1)/2]->v] = i;
swapMinHeapNode(&minHeap->array[i], &minHeap->array[(i - 1) / 2]);
// move to parent index
i = (i - 1) / 2;
}
}
// A utility function to check if a given vertex
// 'v' is in min heap or not
bool isInMinHeap(struct MinHeap *minHeap, int v)
{
if (minHeap->pos[v] < minHeap->size)
return true;
return false;
}
// A utility function used to print the solution
void printArr(int dist[], int n)
{
printf("Vertex Distance from Source\n");
for (int i = 0; i < n; ++i)
printf("%d \t\t %d\n", i, dist[i]);
}
// The main function that calculates distances of shortest paths from src to all
// vertices. It is a O(ELogV) function
void dijkstra(struct Graph* graph, int src)
{
int V = graph->V;// Get the number of vertices in graph
int dist[V];
// dist[v] holds the shortest known distance from src to v
// minHeap holds the vertices whose shortest distance is not yet finalized
struct MinHeap* minHeap = createMinHeap(V);
// Initialize min heap with all vertices. dist value of all vertices
for (int v = 0; v < V; ++v)
{
dist[v] = INT_MAX;
minHeap->array[v] = newMinHeapNode(v, dist[v]);
minHeap->pos[v] = v;
}
// Make dist value of src vertex as 0 so that it is extracted first
minHeap->array[src] = newMinHeapNode(src, dist[src]);
minHeap->pos[src] = src;
dist[src] = 0;
decreaseKey(minHeap, src, dist[src]);
// Initially size of min heap is equal to V
minHeap->size = V;
// In the following loop, min heap contains all nodes
// whose shortest distance is not yet finalized.
while (!isEmpty(minHeap))
{
// Extract the vertex with minimum distance value
struct MinHeapNode* minHeapNode = extractMin(minHeap);
int u = minHeapNode->v; // Store the extracted vertex number
// Traverse through all adjacent vertices of u (the extracted
// vertex) and update their distance values
struct AdjListNode* pCrawl = graph->array[u].head;
while (pCrawl != NULL)
{
int v = pCrawl->dest;
// If shortest distance to v is not finalized yet, and distance to v
// through u is less than its previously calculated distance
if (isInMinHeap(minHeap, v) && dist[u] != INT_MAX &&
pCrawl->weight + dist[u] < dist[v])
{
dist[v] = dist[u] + pCrawl->weight;
// update distance value in min heap also
decreaseKey(minHeap, v, dist[v]);
}
pCrawl = pCrawl->next;
}
}
// print the calculated shortest distances
printArr(dist, V);
}
// Driver program to test above functions
int main()
{
// create the graph given in above figure
int V = 9;
struct Graph* graph = createGraph(V);
addEdge(graph, 0, 1, 4);
addEdge(graph, 0, 7, 8);
addEdge(graph, 1, 2, 8);
addEdge(graph, 1, 7, 11);
addEdge(graph, 2, 3, 7);
addEdge(graph, 2, 8, 2);
addEdge(graph, 2, 5, 4);
addEdge(graph, 3, 4, 9);
addEdge(graph, 3, 5, 14);
addEdge(graph, 4, 5, 10);
addEdge(graph, 5, 6, 2);
addEdge(graph, 6, 7, 1);
addEdge(graph, 6, 8, 6);
addEdge(graph, 7, 8, 7);
dijkstra(graph, 0);
return 0;
}
Output:
Vertex   Distance from Source
0        0
1        4
2        12
3        19
4        21
5        11
6        9
7        8
8        14
Time Complexity: The time complexity of the above code/algorithm looks like O(V^2) because there are two nested while loops. If we take a closer look,
we can observe that the statements in the inner loop are executed O(V+E) times in total (similar to BFS). The inner loop has a decreaseKey() operation, which
takes O(log V) time. So the overall time complexity is O(E+V)*O(log V), which is O((E+V) log V) = O(E log V) for a connected graph (where E >= V-1).
Note that the above code uses a Binary Heap for the Priority Queue implementation. The time complexity can be reduced to O(E + V log V) using a
Fibonacci Heap. The reason is that a Fibonacci Heap takes O(1) amortized time for the decrease-key operation while a Binary Heap takes O(log n) time.
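For comparison, here is a minimal C++ sketch of the same O(E log V) idea using std::priority_queue. Note that this is a different technique from the code above: std::priority_queue has no decrease-key operation, so the sketch pushes a new entry whenever a distance improves and skips stale entries when they are popped. The function name dijkstraLazy and the adjacency-list type are assumptions made for this illustration.

```cpp
#include <climits>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// adj[u] holds (neighbour, weight) pairs; weights must be non-negative.
// Returns the shortest distance from src to every vertex
// (INT_MAX for unreachable vertices).
std::vector<int> dijkstraLazy(
        const std::vector<std::vector<std::pair<int, int>>>& adj, int src) {
    std::vector<int> dist(adj.size(), INT_MAX);
    using Item = std::pair<int, int>;  // (distance, vertex)
    // std::greater turns the default max-heap into a min-heap.
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> heap;
    dist[src] = 0;
    heap.push({0, src});
    while (!heap.empty()) {
        Item top = heap.top();
        heap.pop();
        int d = top.first, u = top.second;
        if (d > dist[u]) continue;  // stale entry: u was already improved
        for (const std::pair<int, int>& edge : adj[u]) {
            int v = edge.first, w = edge.second;
            if (d + w < dist[v]) {
                dist[v] = d + w;
                heap.push({dist[v], v});  // stands in for decreaseKey()
            }
        }
    }
    return dist;
}
```

The heap can hold up to O(E) entries instead of O(V), so each operation costs O(log E), which is still O(log V) because E is at most V^2.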
Notes:
1) The code calculates the shortest distances, but doesn't calculate the path information. We can create a parent array, update the parent array when
a distance is updated (as in Prim's implementation), and use it to show the shortest path from the source to different vertices.
2) The code is for an undirected graph; the same dijkstra function can be used for directed graphs as well.
3) The code finds the shortest distances from the source to all vertices. If we are interested only in the shortest distance from the source to a single target, we can
break the while loop when the extracted minimum distance vertex is equal to the target (Step 3.a of the algorithm).
4) Dijkstra's algorithm doesn't work for graphs with negative weight edges. For graphs with negative weight edges, the Bellman-Ford algorithm can be
used; we will discuss it in a separate post.
References:
Introduction to Algorithms by Clifford Stein, Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest
Algorithms by Sanjoy Dasgupta, Christos Papadimitriou, Umesh Vazirani
Let all edges be processed in the following order: (B,E), (D,B), (B,D), (A,B), (A,C), (D,C), (B,C), (E,D). We get the following distances when all edges
are processed the first time. The first row shows the initial distances. The second row shows the distances when edges (B,E), (D,B), (B,D) and (A,B) are
processed. The third row shows the distances when (A,C) is processed. The fourth row shows the distances when (D,C), (B,C) and (E,D) are processed.
The first iteration guarantees to give all shortest paths which are at most 1 edge long. We get the following distances when all edges are processed
a second time (the last row shows the final values).
The second iteration guarantees to give all shortest paths which are at most 2 edges long. The algorithm processes all edges 2 more times. The
distances are minimized after the second iteration, so the third and fourth iterations don't update the distances.
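The repeated passes over the edge list can be sketched in C++ as follows. This is a minimal illustration under stated assumptions: DEdge and bellmanFord are names introduced here, the function reports a negative-weight cycle through its return value instead of printing, and it stops early once a full pass changes nothing (the situation described above for the third and fourth iterations).

```cpp
#include <climits>
#include <vector>

// Hypothetical directed, weighted edge type for this sketch.
struct DEdge { int src, dest, weight; };

// Fills dist with the shortest distances from src (INT_MAX if unreachable).
// Returns false if a negative-weight cycle is reachable from src.
bool bellmanFord(int V, const std::vector<DEdge>& edges, int src,
                 std::vector<int>& dist) {
    dist.assign(V, INT_MAX);
    dist[src] = 0;
    // A simple shortest path has at most V-1 edges, so V-1 passes suffice.
    for (int pass = 1; pass < V; ++pass) {
        bool changed = false;
        for (const DEdge& e : edges) {
            if (dist[e.src] != INT_MAX && dist[e.src] + e.weight < dist[e.dest]) {
                dist[e.dest] = dist[e.src] + e.weight;
                changed = true;
            }
        }
        if (!changed) break;  // distances are already minimized
    }
    // If any edge can still be relaxed, there is a negative-weight cycle.
    for (const DEdge& e : edges)
        if (dist[e.src] != INT_MAX && dist[e.src] + e.weight < dist[e.dest])
            return false;
    return true;
}
```

The early exit does not change the O(VE) worst case; it only saves passes on inputs whose distances settle quickly.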
Implementation:
C++
// A C / C++ program for Bellman-Ford's single source
// shortest path algorithm.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>
BellmanFord(graph, 0);
return 0;
}
Java
// A Java program for Bellman-Ford's single source shortest path
// algorithm.
import java.util.*;
import java.lang.*;
import java.io.*;
// A class to represent a connected, directed and weighted graph
class Graph
{
// A class to represent a weighted edge in graph
class Edge {
int src, dest, weight;
Edge() {
src = dest = weight = 0;
}
};
int V, E;
Edge edge[];
// Creates a graph with V vertices and E edges
Graph(int v, int e)
{
V = v;
E = e;
edge = new Edge[e];
for (int i=0; i<e; ++i)
edge[i] = new Edge();
}
// The main function that finds shortest distances from src
// to all other vertices using Bellman-Ford algorithm. The
// function also detects negative weight cycle
void BellmanFord(Graph graph,int src)
{
int V = graph.V, E = graph.E;
int dist[] = new int[V];
// Step 1: Initialize distances from src to all other
// vertices as INFINITE
for (int i=0; i<V; ++i)
dist[i] = Integer.MAX_VALUE;
dist[src] = 0;
// Step 2: Relax all edges |V| - 1 times. A simple
// shortest path from src to any other vertex can
// have at-most |V| - 1 edges
for (int i=1; i<V; ++i)
{
for (int j=0; j<E; ++j)
{
int u = graph.edge[j].src;
int v = graph.edge[j].dest;
int weight = graph.edge[j].weight;
if (dist[u]!=Integer.MAX_VALUE &&
dist[u]+weight<dist[v])
dist[v]=dist[u]+weight;
}
}
// Step 3: check for negative-weight cycles. The above
// step guarantees shortest distances if graph doesn't
// contain negative weight cycle. If we get a shorter
// path, then there is a cycle.
for (int j=0; j<E; ++j)
{
int u = graph.edge[j].src;
int v = graph.edge[j].dest;
int weight = graph.edge[j].weight;
if (dist[u]!=Integer.MAX_VALUE &&
dist[u]+weight<dist[v])
System.out.println("Graph contains negative weight cycle");
}
printArr(dist, V);
}
// A utility function used to print the solution
void printArr(int dist[], int V)
{
System.out.println("Vertex Distance from Source");
for (int i=0; i<V; ++i)
System.out.println(i+"\t\t"+dist[i]);
}
// Driver method to test above function
public static void main(String[] args)
{
int V = 5; // Number of vertices in graph
int E = 8; // Number of edges in graph
Graph graph = new Graph(V, E);
// add edge 0-1 (or A-B in above figure)
graph.edge[0].src = 0;
graph.edge[0].dest = 1;
graph.edge[0].weight = -1;
// add edge 0-2 (or A-C in above figure)
graph.edge[1].src = 0;
graph.edge[1].dest = 2;
graph.edge[1].weight = 4;
// add edge 1-2 (or B-C in above figure)
graph.edge[2].src = 1;
graph.edge[2].dest = 2;
graph.edge[2].weight = 3;
// add edge 1-3 (or B-D in above figure)
graph.edge[3].src = 1;
graph.edge[3].dest = 3;
graph.edge[3].weight = 2;
// add edge 1-4 (or B-E in above figure)
graph.edge[4].src = 1;
graph.edge[4].dest = 4;
graph.edge[4].weight = 2;
// add edge 3-2 (or D-C in above figure)
graph.edge[5].src = 3;
graph.edge[5].dest = 2;
graph.edge[5].weight = 5;
// add edge 3-1 (or D-B in above figure)
graph.edge[6].src = 3;
graph.edge[6].dest = 1;
graph.edge[6].weight = 1;
// add edge 4-3 (or E-D in above figure)
graph.edge[7].src = 4;
graph.edge[7].dest = 3;
graph.edge[7].weight = -3;
graph.BellmanFord(graph, 0);
}
}
// Contributed by Aakash Hasija
Output:
Vertex   Distance from Source
0        0
1        -1
2        2
3        -2
4        1
Notes
1) Negative weights are found in various applications of graphs. For example, instead of paying a cost for a path, we may get some advantage if we
follow the path.
2) Bellman-Ford works better than Dijkstra's algorithm for distributed systems. Unlike Dijkstra's, where we need to find the minimum value over all vertices,
in Bellman-Ford the edges are considered one by one.
Exercise
1) The standard Bellman-Ford algorithm reports shortest paths only if there are no negative weight cycles. Modify it so that it reports minimum
distances even if there is a negative weight cycle.
2) Can we use Dijkstra's algorithm for shortest paths in graphs with negative weights? One idea is to calculate the minimum weight value, add a
positive value (equal to the absolute value of the minimum weight value) to all weights, and run Dijkstra's algorithm on the modified graph. Will this
algorithm work?
References:
http://www.youtube.com/watch?v=Ttezuzs39nk
http://en.wikipedia.org/wiki/Bellman%E2%80%93Ford_algorithm
http://www.cs.arizona.edu/classes/cs445/spring07/ShortestPath2.prn.pdf
C/C++
// C Program for Floyd Warshall Algorithm
#include<stdio.h>
// Number of vertices in the graph
#define V 4
/* Define Infinite as a large enough value. This value will be used
for vertices not connected to each other */
#define INF 99999
// A function to print the solution matrix
void printSolution(int dist[][V]);
// Solves the all-pairs shortest path problem using Floyd Warshall algorithm
void floydWarshell (int graph[][V])
{
/* dist[][] will be the output matrix that will finally have the shortest
distances between every pair of vertices */
int dist[V][V], i, j, k;
/* Initialize the solution matrix same as input graph matrix. Or
we can say the initial values of shortest distances are based
on shortest paths considering no intermediate vertex. */
Java
// A Java program for Floyd Warshall All Pairs Shortest
// Path algorithm.
import java.util.*;
import java.lang.*;
import java.io.*;
class AllPairShortestPath
{
final static int INF = 99999, V = 4;
void floydWarshall(int graph[][])
{
int dist[][] = new int[V][V];
int i, j, k;
/* Initialize the solution matrix same as input graph matrix.
Or we can say the initial values of shortest distances
are based on shortest paths considering no intermediate
vertex. */
for (i = 0; i < V; i++)
for (j = 0; j < V; j++)
dist[i][j] = graph[i][j];
/* Add all vertices one by one to the set of intermediate
vertices.
---> Before start of a iteration, we have shortest
distances between all pairs of vertices such that
the shortest distances consider only the vertices in
set {0, 1, 2, .. k-1} as intermediate vertices.
----> After the end of a iteration, vertex no. k is added
to the set of intermediate vertices and the set
becomes {0, 1, 2, .. k} */
for (k = 0; k < V; k++)
{
// Pick all vertices as source one by one
for (i = 0; i < V; i++)
{
// Pick all vertices as destination for the
// above picked source
for (j = 0; j < V; j++)
{
// If vertex k is on the shortest path from
// i to j, then update the value of dist[i][j]
if (dist[i][k] + dist[k][j] < dist[i][j])
dist[i][j] = dist[i][k] + dist[k][j];
}
}
}
// Print the shortest distance matrix
printSolution(dist);
}
void printSolution(int dist[][])
{
System.out.println("Following matrix shows the shortest "+
"distances between every pair of vertices");
for (int i=0; i<V; ++i)
{
for (int j=0; j<V; ++j)
{
if (dist[i][j]==INF)
System.out.print("INF ");
else
System.out.print(dist[i][j]+" ");
}
System.out.println();
}
}
// Driver program to test above function
public static void main (String[] args)
{
/* Let us create the following weighted graph
           10
      (0)------->(3)
       |         /|\
     5 |          |
       |          | 1
      \|/         |
      (1)------->(2)
           3
*/
int graph[][] = { {0, 5, INF, 10},
{INF, 0, 3, INF},
{INF, INF, 0, 1},
{INF, INF, INF, 0}
};
AllPairShortestPath a = new AllPairShortestPath();
// Print the solution
a.floydWarshall(graph);
}
}
// Contributed by Aakash Hasija
Output:
Following matrix shows the shortest distances between every pair of vertices
0     5     8     9
INF   0     3     4
INF   INF   0     1
INF   INF   INF   0
The property simply means that the shortest distance from s to v must be smaller than or equal to the shortest distance from s to u plus the weight of edge (u, v).
The new weights are w(u, v) + h[u] - h[v]. The new weights must be greater than or equal to zero because of the inequality "h[v] <=
h[u] + w(u, v)". Example:
Let us consider the following graph.
We add a source s and add edges from s to all vertices of the original graph. In the following diagram s is 4.
We calculate the shortest distances from 4 to all other vertices using Bellman-Ford algorithm. The shortest distances from 4 to 0, 1, 2 and 3 are 0,
-5, -1 and 0 respectively, i.e., h[] = {0, -5, -1, 0}. Once we get these distances, we remove the source vertex 4 and reweight the edges using
following formula. w(u, v) = w(u, v) + h[u] - h[v].
Since all weights are non-negative now, we can run Dijkstra's shortest path algorithm with every vertex as the source.
Time Complexity: The main steps in the algorithm are the Bellman-Ford algorithm called once and Dijkstra's algorithm called V times. The time complexity of
Bellman-Ford is O(VE) and the time complexity of Dijkstra's algorithm is O(V log V + E) when implemented with a Fibonacci heap. So the overall time complexity is O(V^2 log V + VE).
The time complexity of Johnson's algorithm becomes the same as that of Floyd Warshall when the graph is complete (for a complete graph, E = O(V^2)). But
for sparse graphs, the algorithm performs much better than Floyd Warshall.
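The reweighting step can be sketched in C++ as follows. This is a hedged illustration, not a full implementation of Johnson's algorithm: JEdge and johnsonReweight are names assumed here, and the added source s is not materialized. Because s has a zero-weight edge to every vertex, starting every h[v] at 0 is equivalent to having already relaxed those edges.

```cpp
#include <vector>

// Hypothetical directed, weighted edge type for this sketch.
struct JEdge { int src, dest, weight; };

// Computes h[v] = shortest distance from the added source s to v and
// rewrites every weight as w(u, v) + h[u] - h[v].
// Returns false (leaving the weights untouched) if the graph contains a
// negative-weight cycle.
bool johnsonReweight(int V, std::vector<JEdge>& edges, std::vector<int>& h) {
    // s has a 0-weight edge to every vertex, so every distance starts at 0.
    h.assign(V, 0);
    // Bellman-Ford relaxation over the original edges.
    for (int pass = 1; pass < V; ++pass)
        for (const JEdge& e : edges)
            if (h[e.src] + e.weight < h[e.dest])
                h[e.dest] = h[e.src] + e.weight;
    // One more improvement would mean a negative-weight cycle.
    for (const JEdge& e : edges)
        if (h[e.src] + e.weight < h[e.dest])
            return false;
    // h[v] <= h[u] + w(u, v) now holds for every edge, so each new
    // weight w(u, v) + h[u] - h[v] is non-negative.
    for (JEdge& e : edges)
        e.weight += h[e.src] - h[e.dest];
    return true;
}
```

After this step, Dijkstra's algorithm can be run from each vertex on the reweighted edges; a distance d'(u, v) found there maps back to the original graph as d'(u, v) - h[u] + h[v].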
References:
Introduction to Algorithms 3rd Edition by Clifford Stein, Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest
http://www.youtube.com/watch?v=b6LOHvCzmkI
http://www.youtube.com/watch?v=TV2Z6nbo1ic
http://en.wikipedia.org/wiki/Johnson%27s_algorithm
http://www.youtube.com/watch?v=Sygq1e0xWnM
C++
// C++ program to find single source shortest paths for Directed Acyclic Graphs
#include<iostream>
#include <list>
#include <stack>
#include <limits.h>
#define INF INT_MAX
using namespace std;
// Graph is represented using adjacency list. Every node of adjacency list
// contains vertex number of the vertex to which edge connects. It also
// contains weight of the edge
class AdjListNode
{
int v;
int weight;
public:
AdjListNode(int _v, int _w) { v = _v; weight = _w;}
int getV() { return v; }
int getWeight() { return weight; }
};
// Class to represent a graph using adjacency list representation
class Graph
{
int V; // No. of vertices
// Pointer to an array containing adjacency lists
list<AdjListNode> *adj;
// A function used by shortestPath
void topologicalSortUtil(int v, bool visited[], stack<int> &Stack);
public:
Graph(int V); // Constructor
// function to add an edge to graph
void addEdge(int u, int v, int weight);
// Finds shortest paths from given source vertex
void shortestPath(int s);
};
Graph::Graph(int V)
{
this->V = V;
adj = new list<AdjListNode>[V];
}
void Graph::addEdge(int u, int v, int weight)
{
AdjListNode node(v, weight);
adj[u].push_back(node); // Add v to u's list
}
// A recursive function used by shortestPath. See below link for details
// http://www.geeksforgeeks.org/topological-sorting/
void Graph::topologicalSortUtil(int v, bool visited[], stack<int> &Stack)
{
// Mark the current node as visited
visited[v] = true;
// Recur for all the vertices adjacent to this vertex
list<AdjListNode>::iterator i;
for (i = adj[v].begin(); i != adj[v].end(); ++i)
{
AdjListNode node = *i;
if (!visited[node.getV()])
topologicalSortUtil(node.getV(), visited, Stack);
}
// Push current vertex to stack which stores topological sort
Stack.push(v);
}
// The function to find shortest paths from given vertex. It uses recursive
// topologicalSortUtil() to get topological sorting of given graph.
void Graph::shortestPath(int s)
{
stack<int> Stack;
int dist[V];
// Mark all the vertices as not visited
bool *visited = new bool[V];
for (int i = 0; i < V; i++)
visited[i] = false;
// Call the recursive helper function to store Topological Sort
// starting from all vertices one by one
for (int i = 0; i < V; i++)
if (visited[i] == false)
topologicalSortUtil(i, visited, Stack);
// Initialize distances to all vertices as infinite and distance
// to source as 0
for (int i = 0; i < V; i++)
dist[i] = INF;
dist[s] = 0;
// Process vertices in topological order
while (Stack.empty() == false)
{
// Get the next vertex from topological order
int u = Stack.top();
Stack.pop();
// Update distances of all adjacent vertices
list<AdjListNode>::iterator i;
if (dist[u] != INF)
{
for (i = adj[u].begin(); i != adj[u].end(); ++i)
if (dist[i->getV()] > dist[u] + i->getWeight())
dist[i->getV()] = dist[u] + i->getWeight();
}
}
// Print the calculated shortest distances
for (int i = 0; i < V; i++)
(dist[i] == INF)? cout << "INF ": cout << dist[i] << " ";
}
// Driver program to test above functions
int main()
{
// Create a graph given in the above diagram. Here vertex numbers are
// 0, 1, 2, 3, 4, 5 with following mappings:
// 0=r, 1=s, 2=t, 3=x, 4=y, 5=z
Graph g(6);
g.addEdge(0, 1, 5);
g.addEdge(0, 2, 3);
g.addEdge(1, 3, 6);
g.addEdge(1, 2, 2);
g.addEdge(2, 4, 4);
g.addEdge(2, 5, 2);
g.addEdge(2, 3, 7);
g.addEdge(3, 4, -1);
g.addEdge(4, 5, -2);
int s = 1;
cout << "Following are shortest distances from source " << s <<" \n";
g.shortestPath(s);
return 0;
}
Java
// Java program to find single source shortest paths in Directed Acyclic Graphs
import java.io.*;
import java.util.*;
class ShortestPath
{
static final int INF=Integer.MAX_VALUE;
class AdjListNode
{
private int v;
private int weight;
AdjListNode(int _v, int _w) { v = _v; weight = _w; }
int getV() { return v; }
int getWeight() { return weight; }
}
// Class to represent graph as an adjacency list of
// nodes of type AdjListNode
class Graph
{
private int V;
private LinkedList<AdjListNode>adj[];
Graph(int v)
{
V=v;
adj = new LinkedList[V];
for (int i=0; i<v; ++i)
adj[i] = new LinkedList<AdjListNode>();
}
void addEdge(int u, int v, int weight)
{
AdjListNode node = new AdjListNode(v,weight);
adj[u].add(node);// Add v to u's list
}
// A recursive function used by shortestPath.
// See below link for details
void topologicalSortUtil(int v, Boolean visited[], Stack stack)
{
// Mark the current node as visited.
visited[v] = true;
Integer i;
// Recur for all the vertices adjacent to this vertex
Iterator<AdjListNode> it = adj[v].iterator();
while (it.hasNext())
{
AdjListNode node =it.next();
if (!visited[node.getV()])
topologicalSortUtil(node.getV(), visited, stack);
}
// Push current vertex to stack which stores result
stack.push(new Integer(v));
}
// The function to find shortest paths from given vertex. It
// uses recursive topologicalSortUtil() to get topological
// sorting of given graph.
void shortestPath(int s)
{
Stack stack = new Stack();
int dist[] = new int[V];
// Mark all the vertices as not visited
Boolean visited[] = new Boolean[V];
for (int i = 0; i < V; i++)
visited[i] = false;
// Call the recursive helper function to store Topological
// Sort starting from all vertices one by one
for (int i = 0; i < V; i++)
if (visited[i] == false)
topologicalSortUtil(i, visited, stack);
// Initialize distances to all vertices as infinite and
// distance to source as 0
for (int i = 0; i < V; i++)
dist[i] = INF;
dist[s] = 0;
// Process vertices in topological order
while (stack.empty() == false)
{
// Get the next vertex from topological order
int u = (int)stack.pop();
// Update distances of all adjacent vertices
Iterator<AdjListNode> it;
if (dist[u] != INF)
{
it = adj[u].iterator();
while (it.hasNext())
{
AdjListNode i= it.next();
if (dist[i.getV()] > dist[u] + i.getWeight())
dist[i.getV()] = dist[u] + i.getWeight();
}
}
}
// Print the calculated shortest distances
for (int i = 0; i < V; i++)
{
if (dist[i] == INF)
System.out.print( "INF ");
else
System.out.print( dist[i] + " ");
}
}
}
Time Complexity: Topological sorting takes O(V+E) time. After finding the topological order, the algorithm processes every vertex and, for
each vertex, runs a loop over its adjacent vertices. The total number of adjacent vertices in a graph is O(E), so the two loops together take
O(V+E) time. Therefore, the overall time complexity of this algorithm is O(V+E).
References:
http://www.utdallas.edu/~sizheng/CS4349.d/l-notes.d/L17.pdf
The idea is to explore all paths with exactly k edges from u to v, using the approach discussed in the previous post, and return the weight of
the lightest one. A simple solution is to start from u, go to every adjacent vertex, and recur from each adjacent vertex with k reduced to k-1,
the adjacent vertex as the new source, and v as the destination. Following are C++ and Java implementations of this simple solution.
C++
// C++ program to find shortest path with exactly k edges
#include <iostream>
#include <climits>
#include <algorithm> // for min()
using namespace std;
// Define number of vertices in the graph and infinite value
#define V 4
#define INF INT_MAX
// A naive recursive function to find the weight of the shortest path
// from u to v with exactly k edges
int shortestPath(int graph[][V], int u, int v, int k)
{
// Base cases
if (k == 0 && u == v)
return 0;
if (k == 1 && graph[u][v] != INF) return graph[u][v];
if (k <= 0)
return INF;
// Initialize result
int res = INF;
// Go to all adjacents of u and recur
for (int i = 0; i < V; i++)
{
if (graph[u][i] != INF && u != i && v != i)
{
int rec_res = shortestPath(graph, i, v, k-1);
if (rec_res != INF)
res = min(res, graph[u][i] + rec_res);
}
}
return res;
}
// driver program to test above function
int main()
{
/* Let us create the graph shown in above diagram*/
int graph[V][V] = { {0, 10, 3, 2},
{INF, 0, INF, 7},
{INF, INF, 0, 6},
{INF, INF, INF, 0}
};
int u = 0, v = 3, k = 2;
cout << "Weight of the shortest path is " <<
shortestPath(graph, u, v, k);
return 0;
}
Java
// Naive recursive Java program to find shortest path
// with exactly k edges
import java.util.*;
import java.lang.*;
import java.io.*;
class ShortestPath
{
// Define number of vertices in the graph and infinite value
static final int V = 4;
static final int INF = Integer.MAX_VALUE;
// A naive recursive function to find the weight of the shortest
// path from u to v with exactly k edges
int shortestPath(int graph[][], int u, int v, int k)
{
// Base cases
if (k == 0 && u == v)
return 0;
if (k == 1 && graph[u][v] != INF) return graph[u][v];
if (k <= 0)
return INF;
// Initialize result
int res = INF;
// Go to all adjacents of u and recur
for (int i = 0; i < V; i++)
{
if (graph[u][i] != INF && u != i && v != i)
{
int rec_res = shortestPath(graph, i, v, k-1);
if (rec_res != INF)
res = Math.min(res, graph[u][i] + rec_res);
}
}
return res;
}
public static void main (String[] args)
{
/* Let us create the graph shown in above diagram*/
int graph[][] = new int[][]{ {0, 10, 3, 2},
{INF, 0, INF, 7},
{INF, INF, 0, 6},
{INF, INF, INF, 0}
};
ShortestPath t = new ShortestPath();
int u = 0, v = 3, k = 2;
System.out.println("Weight of the shortest path is "+
t.shortestPath(graph, u, v, k));
}
}
The worst case time complexity of the above function is O(V^k), where V is the number of vertices in the given graph. We can see this by
drawing the recursion tree. The worst case occurs for a complete graph, where every internal node of the recursion tree has on the order of V
children and the tree has depth k.
We can optimize the above solution using Dynamic Programming. The idea is to build a 3D table where the first dimension is the source, the
second dimension is the destination, the third dimension is the number of edges from source to destination, and the value is the weight of the
shortest path with that many edges. Like other Dynamic Programming problems, we fill the 3D table in a bottom-up manner.
C++
// Dynamic Programming based C++ program to find shortest path with
// exactly k edges
#include <iostream>
#include <climits>
using namespace std;
// Define number of vertices in the graph and infinite value
#define V 4
#define INF INT_MAX
// A Dynamic programming based function to find the shortest path from
// u to v with exactly k edges.
Java
// Dynamic Programming based Java program to find shortest path with
// exactly k edges
import java.util.*;
import java.lang.*;
import java.io.*;
class ShortestPath
{
// Define number of vertices in the graph and infinite value
static final int V = 4;
static final int INF = Integer.MAX_VALUE;
// A Dynamic programming based function to find the shortest path
// from u to v with exactly k edges.
int shortestPath(int graph[][], int u, int v, int k)
{
// Table to be filled up using DP. The value sp[i][j][e] will
// store weight of the shortest path from i to j with exactly
// e edges
int sp[][][] = new int[V][V][k+1];
// Loop for number of edges from 0 to k
Time complexity of the above DP based solution is O(V^3 k), which is much better than the naive solution.
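The bottom-up idea can be sketched in C++ as follows. This is a minimal sketch under stated assumptions, not the original listing: the graph is an adjacency matrix with INF marking a missing edge (as in the programs above), the name shortestPathDP is illustrative, and only two layers of the table are kept at a time instead of the full 3D array, since the layer for e edges depends only on the layer for e-1 edges.

```cpp
#include <algorithm>
#include <climits>
#include <vector>

const int INF = INT_MAX; // marks "no edge" / "no path", as in the programs above

// Returns the weight of the shortest path from u to v that uses exactly
// k edges, or INF if no such path exists. Hypothetical helper name.
int shortestPathDP(const std::vector<std::vector<int>>& graph, int u, int v, int k)
{
    const int n = graph.size();
    // prev[i][j]: best weight from i to j using exactly e-1 edges.
    std::vector<std::vector<int>> prev(n, std::vector<int>(n, INF));
    for (int i = 0; i < n; i++)
        prev[i][i] = 0; // with zero edges we can only stay where we are

    for (int e = 1; e <= k; e++) {
        std::vector<std::vector<int>> cur(n, std::vector<int>(n, INF));
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                for (int a = 0; a < n; a++) {
                    // Take edge i -> a first, then e-1 edges from a to j.
                    // The diagonal (a == i) is not treated as an edge.
                    if (a == i || graph[i][a] == INF || prev[a][j] == INF)
                        continue;
                    cur[i][j] = std::min(cur[i][j], graph[i][a] + prev[a][j]);
                }
        prev = cur;
    }
    return prev[u][v];
}
```

For the 4-vertex example graph used in the programs above, with u = 0, v = 3 and k = 2, this sketch returns 9 (the path 0 -> 2 -> 3 with weight 3 + 6). The three nested loops over vertices, repeated k times, give the O(V^3 k) bound; the sketch assumes edge weights are small enough that sums do not overflow int.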
We can use either Breadth First Search (BFS) or Depth First Search (DFS) to find whether there is a path between two vertices. Take the first
vertex as the source and run the standard BFS (or DFS). If we see the second vertex during the traversal, return true; otherwise return false.
Following are C++ and Java codes that use BFS to check whether the second vertex is reachable from the first.
C++
// C++ program to check if there exists a path between two vertices
// of a graph.
#include <iostream>
#include <list>
using namespace std;
// This class represents a directed graph using adjacency list
// representation
class Graph
{
int V; // No. of vertices
list<int> *adj; // Pointer to an array containing adjacency lists
public:
Graph(int V); // Constructor
void addEdge(int v, int w); // function to add an edge to graph
bool isReachable(int s, int d);
};
Graph::Graph(int V)
{
this->V = V;
adj = new list<int>[V];
}
void Graph::addEdge(int v, int w)
{
adj[v].push_back(w); // Add w to v's list.
}
// A BFS based function to check whether d is reachable from s.
bool Graph::isReachable(int s, int d)
{
// Base case
if (s == d)
return true;
// Mark all the vertices as not visited
bool *visited = new bool[V];
for (int i = 0; i < V; i++)
visited[i] = false;
// Create a queue for BFS
list<int> queue;
// Mark the current node as visited and enqueue it
visited[s] = true;
queue.push_back(s);
// it will be used to get all adjacent vertices of a vertex
list<int>::iterator i;
while (!queue.empty())
{
// Dequeue a vertex from queue
s = queue.front();
queue.pop_front();
// Get all adjacent vertices of the dequeued vertex s
// If a adjacent has not been visited, then mark it visited
// and enqueue it
for (i = adj[s].begin(); i != adj[s].end(); ++i)
{
// If this adjacent node is the destination node, then
// return true
if (*i == d)
return true;
// Else, continue to do BFS
if (!visited[*i])
{
visited[*i] = true;
queue.push_back(*i);
}
}
}
// If BFS is complete without visiting d
return false;
}
// Driver program to test methods of graph class
int main()
{
// Create a graph given in the above diagram
Graph g(4);
g.addEdge(0, 1);
g.addEdge(0, 2);
g.addEdge(1, 2);
g.addEdge(2, 0);
g.addEdge(2, 3);
g.addEdge(3, 3);
int u = 1, v = 3;
if(g.isReachable(u, v))
cout<< "\n There is a path from " << u << " to " << v;
else
cout<< "\n There is no path from " << u << " to " << v;
u = 3, v = 1;
if(g.isReachable(u, v))
cout<< "\n There is a path from " << u << " to " << v;
else
cout<< "\n There is no path from " << u << " to " << v;
return 0;
}
Java
// Java program to check if there exists a path between two vertices
// of a graph.
import java.io.*;
import java.util.*;
import java.util.LinkedList;
// This class represents a directed graph using adjacency list
// representation
class Graph
{
private int V; // No. of vertices
private LinkedList<Integer> adj[]; //Adjacency List
//Constructor
Graph(int v)
{
V = v;
adj = new LinkedList[v];
for (int i=0; i<v; ++i)
adj[i] = new LinkedList();
}
//Function to add an edge into the graph
void addEdge(int v,int w) { adj[v].add(w); }
As an exercise, try an extended version of the problem where the complete path between two vertices is also needed.
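One possible approach to this exercise, sketched in C++ under assumptions of our own: the graph is passed as a vector of adjacency lists rather than the Graph class above, and the function name findPath is hypothetical. The sketch runs the same BFS but also records, for every vertex, the vertex from which it was first reached, and then walks those links back from the destination.

```cpp
#include <algorithm>
#include <queue>
#include <vector>

// Returns the vertices on one path from s to d (both inclusive), or an
// empty vector if d is not reachable from s. adj is a directed adjacency
// list. Because BFS is used, the path has the fewest possible edges.
std::vector<int> findPath(const std::vector<std::vector<int>>& adj, int s, int d)
{
    const int n = adj.size();
    std::vector<int> parent(n, -1);       // BFS-tree predecessor of each vertex
    std::vector<bool> visited(n, false);
    std::queue<int> q;

    visited[s] = true;
    q.push(s);
    while (!q.empty() && !visited[d]) {
        int u = q.front();
        q.pop();
        for (int w : adj[u]) {
            if (!visited[w]) {
                visited[w] = true;
                parent[w] = u;            // remember how we reached w
                q.push(w);
            }
        }
    }
    if (!visited[d])
        return {};

    // Walk the parent links back from d to s, then reverse.
    std::vector<int> path;
    for (int x = d; x != -1; x = parent[x])
        path.push_back(x);
    std::reverse(path.begin(), path.end());
    return path;
}
```

On the example graph from the driver program above (edges 0->1, 0->2, 1->2, 2->0, 2->3, 3->3), findPath from 1 to 3 yields 1, 2, 3, and findPath from 3 to 1 yields an empty vector.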
This is easy for an undirected graph: we can just do a BFS or DFS starting from any vertex. If the BFS or DFS visits all vertices, then the given
undirected graph is connected. This approach won't work for a directed graph. For example, consider the following graph, which is not strongly
connected. If we start DFS (or BFS) from vertex 0, we can reach all vertices, but if we start from any other vertex, we cannot reach all vertices.
C++
// C++ program to check if a given directed graph is strongly
// connected or not
#include <iostream>
#include <list>
#include <stack>
using namespace std;
class Graph
{
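int V; // No. of vertices
list<int> *adj; // An array of adjacency lists
// A recursive function that marks all vertices reachable from v
void DFSUtil(int v, bool visited[]);
public:
// Constructor and Destructor
Graph(int V) { this->V = V; adj = new list<int>[V];}
~Graph() { delete [] adj; }
// Method to add an edge
void addEdge(int v, int w);
// Returns true if the graph is strongly connected
bool isSC();
};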
// Driver program to test above functions
int main()
{
// Create graphs given in the above diagrams
Graph g1(5);
g1.addEdge(0, 1);
g1.addEdge(1, 2);
g1.addEdge(2, 3);
g1.addEdge(3, 0);
g1.addEdge(2, 4);
g1.addEdge(4, 2);
g1.isSC()? cout << "Yes\n" : cout << "No\n";
Graph g2(4);
g2.addEdge(0, 1);
g2.addEdge(1, 2);
g2.addEdge(2, 3);
g2.isSC()? cout << "Yes\n" : cout << "No\n";
return 0;
}
Java
// Java program to check if a given directed graph is strongly
// connected or not
import java.io.*;
import java.util.*;
import java.util.LinkedList;
// This class represents a directed graph using adjacency
// list representation
class Graph
{
private int V; // No. of vertices
private LinkedList<Integer> adj[]; //Adjacency List
//Constructor
Graph(int v)
{
V = v;
adj = new LinkedList[v];
for (int i=0; i<v; ++i)
adj[i] = new LinkedList();
}
//Function to add an edge into the graph
void addEdge(int v,int w) { adj[v].add(w); }
// A recursive function that marks all vertices reachable from v
void DFSUtil(int v,Boolean visited[])
{
// Mark the current node as visited
visited[v] = true;
int n;
// Recur for all the vertices adjacent to this vertex
Iterator<Integer> i = adj[v].iterator();
while (i.hasNext())
{
n = i.next();
if (!visited[n])
DFSUtil(n,visited);
}
}
// Function that returns transpose of this graph
Graph getTranspose()
{
Graph g = new Graph(V);
for (int v = 0; v < V; v++)
{
// Add the reversed edge for every vertex adjacent to v
Iterator<Integer> i = adj[v].listIterator();
while (i.hasNext())
g.adj[i.next()].add(v);
}
return g;
}
// The main function that returns true if graph is strongly
// connected
Boolean isSC()
{
// Step 1: Mark all the vertices as not visited
// (For first DFS)
Boolean visited[] = new Boolean[V];
for (int i = 0; i < V; i++)
visited[i] = false;
// Step 2: Do DFS traversal starting from first vertex.
DFSUtil(0, visited);
// If DFS traversal doesn't visit all vertices, then
// return false.
for (int i = 0; i < V; i++)
if (visited[i] == false)
return false;
// Step 3: Create a reversed graph
Graph gr = getTranspose();
// Step 4: Mark all the vertices as not visited (For
// second DFS)
for (int i = 0; i < V; i++)
visited[i] = false;
// Step 5: Do DFS for reversed graph starting from
// first vertex. The starting vertex must be the same
// as the starting point of the first DFS
gr.DFSUtil(0, visited);
// If all vertices are not visited in second DFS, then
// return false
for (int i = 0; i < V; i++)
if (visited[i] == false)
return false;
return true;
}
public static void main(String args[])
{
// Create graphs given in the above diagrams
Graph g1 = new Graph(5);
g1.addEdge(0, 1);
g1.addEdge(1, 2);
g1.addEdge(2, 3);
g1.addEdge(3, 0);
g1.addEdge(2, 4);
g1.addEdge(4, 2);
if (g1.isSC())
System.out.println("Yes");
else
System.out.println("No");
Graph g2 = new Graph(4);
g2.addEdge(0, 1);
g2.addEdge(1, 2);
g2.addEdge(2, 3);
if (g2.isSC())
System.out.println("Yes");
else
System.out.println("No");
}
}
// This code is contributed by Aakash Hasija
Yes
No
Time Complexity: Time complexity of the above implementation is the same as Depth First Search, which is O(V+E) if the graph is represented
using the adjacency list representation.
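The same two-traversal check can be sketched compactly in C++. This is a sketch under assumptions of our own, not the original listing: the graph is given as a vertex count and an edge list, and the names markReachable, reachesAll and isStronglyConnected are hypothetical. It runs one DFS from vertex 0 on the graph and one on its transpose, mirroring the Java isSC() above.

```cpp
#include <utility>
#include <vector>

// Marks every vertex reachable from v in the graph adj.
static void markReachable(const std::vector<std::vector<int>>& adj, int v,
                          std::vector<bool>& seen)
{
    seen[v] = true;
    for (int w : adj[v])
        if (!seen[w])
            markReachable(adj, w, seen);
}

// Returns true if a DFS from start visits every vertex of adj.
static bool reachesAll(const std::vector<std::vector<int>>& adj, int start)
{
    std::vector<bool> seen(adj.size(), false);
    markReachable(adj, start, seen);
    for (bool s : seen)
        if (!s)
            return false;
    return true;
}

// A directed graph is strongly connected if vertex 0 reaches every vertex
// and every vertex reaches vertex 0 (checked on the reversed graph).
bool isStronglyConnected(int n, const std::vector<std::pair<int, int>>& edges)
{
    if (n == 0)
        return true;
    std::vector<std::vector<int>> adj(n), radj(n);
    for (const auto& e : edges) {
        adj[e.first].push_back(e.second);   // original edge
        radj[e.second].push_back(e.first);  // reversed edge
    }
    return reachesAll(adj, 0) && reachesAll(radj, 0);
}
```

For the two graphs built in the driver programs above, this sketch returns true for the first (5 vertices) and false for the second (4 vertices), matching the printed Yes and No. Building both adjacency lists up front costs O(V+E) extra space but keeps each traversal a plain DFS.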
Exercise:
We do a DFS traversal of the given graph with additional code to find Articulation Points (APs). In the DFS traversal, we maintain a parent[]
array where parent[u] stores the parent of vertex u. A vertex u is an articulation point in one of two cases: (1) u is the root of the DFS tree
and has at least two children; (2) u is not the root and has a child v such that no vertex in the subtree rooted at v has a back edge to a
proper ancestor of u. The first case is simple to detect. For every vertex, count its children in the DFS tree. If the currently visited vertex u
is the root (parent[u] is NIL) and has two or more children, print it.
How do we handle the second case? The second case is trickier. We maintain an array disc[] to store the discovery time of vertices. For every
node u, we need to find the earliest visited vertex (the vertex with minimum discovery time) that can be reached from the subtree rooted at u.
So we maintain an additional array low[], which is defined as follows.
low[u] = min(disc[u], disc[w])
where w is an ancestor of u and there is a back edge from
some descendant of u to w.
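The two cases and the low[] values can be sketched in C++ as follows. This is a compact sketch under assumptions of our own rather than the full listing: the graph is a simple undirected graph (no parallel edges) given as a vector of adjacency lists, a discovery time of 0 means "not visited yet", and the names APFinder and articulationPoints are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Helper that carries the DFS state. All names here are illustrative.
struct APFinder {
    const std::vector<std::vector<int>>& adj;
    std::vector<int> disc, low, parent;
    std::vector<bool> isAP;
    int timer = 0;

    explicit APFinder(const std::vector<std::vector<int>>& g)
        : adj(g), disc(g.size(), 0), low(g.size(), 0),
          parent(g.size(), -1), isAP(g.size(), false) {}

    void visit(int u)
    {
        int children = 0;
        disc[u] = low[u] = ++timer;
        for (int v : adj[u]) {
            if (disc[v] == 0) {              // tree edge: v becomes a child of u
                ++children;
                parent[v] = u;
                visit(v);
                low[u] = std::min(low[u], low[v]);
                // Case 1: u is the root and has two or more children.
                if (parent[u] == -1 && children > 1)
                    isAP[u] = true;
                // Case 2: u is not the root and the subtree of v cannot
                // reach a proper ancestor of u.
                if (parent[u] != -1 && low[v] >= disc[u])
                    isAP[u] = true;
            } else if (v != parent[u]) {     // back edge to an ancestor
                low[u] = std::min(low[u], disc[v]);
            }
        }
    }
};

// Returns the articulation points in increasing vertex order.
std::vector<int> articulationPoints(const std::vector<std::vector<int>>& adj)
{
    APFinder f(adj);
    for (int u = 0; u < (int)adj.size(); ++u)
        if (f.disc[u] == 0)
            f.visit(u);
    std::vector<int> result;
    for (int u = 0; u < (int)adj.size(); ++u)
        if (f.isAP[u])
            result.push_back(u);
    return result;
}
```

For example, on the graph with edges 1-0, 0-2, 2-1, 0-3, 3-4 the sketch reports vertices 0 and 3, and on a triangle it reports none.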
Following are C++ and Java implementations of Tarjan's algorithm for finding articulation points.
C++
// A C++ program to find articulation points in an undirected graph
#include<iostream>
#include <list>
#define NIL -1
using namespace std;
// A class that represents an undirected graph
class Graph
{
int V; // No. of vertices
list<int> *adj; // A dynamic array of adjacency lists
void APUtil(int v, bool visited[], int disc[], int low[],
int parent[], bool ap[]);
public:
Graph(int V); // Constructor
void addEdge(int v, int w); // function to add an edge to graph
void AP(); // prints articulation points
};
Graph::Graph(int V)
{
this->V = V;
adj = new list<int>[V];
}
void Graph::addEdge(int v, int w)
{
adj[v].push_back(w);
adj[w].push_back(v); // Note: the graph is undirected
}
// A recursive function that finds articulation points using DFS traversal
// u --> The vertex to be visited next
// visited[] --> keeps track of visited vertices
// disc[] --> Stores discovery times of visited vertices
// parent[] --> Stores parent vertices in DFS tree
// ap[] --> Stores articulation points
void Graph::APUtil(int u, bool visited[], int disc[],
int low[], int parent[], bool ap[])
{
// A static variable is used for simplicity, we can avoid use of static
// variable by passing a pointer.
static int time = 0;
// Count of children in DFS Tree
int children = 0;
// Mark the current node as visited
visited[u] = true;
// Initialize discovery time and low value
disc[u] = low[u] = ++time;
// Go through all vertices adjacent to this
list<int>::iterator i;
for (i = adj[u].begin(); i != adj[u].end(); ++i)
{
int v = *i; // v is current adjacent of u
// If v is not visited yet, then make it a child of u
// in DFS tree and recur for it
if (!visited[v])
{
children++;
parent[v] = u;
APUtil(v, visited, disc, low, parent, ap);
g3.addEdge(3, 5);
g3.addEdge(4, 5);
g3.AP();
return 0;
}
Java
// A Java program to find articulation points in an undirected graph
import java.io.*;
import java.util.*;
import java.util.LinkedList;
// This class represents an undirected graph using adjacency list
// representation
class Graph
{
private int V; // No. of vertices
// Array of lists for Adjacency List Representation
private LinkedList<Integer> adj[];
int time = 0;
static final int NIL = -1;
// Constructor
Graph(int v)
{
V = v;
adj = new LinkedList[v];
for (int i=0; i<v; ++i)
adj[i] = new LinkedList();
}
//Function to add an edge into the graph
void addEdge(int v, int w)
{
adj[v].add(w); // Add w to v's list.
adj[w].add(v); //Add v to w's list
}
// A recursive function that finds articulation points using DFS
// u --> The vertex to be visited next
// visited[] --> keeps track of visited vertices
// disc[] --> Stores discovery times of visited vertices
// parent[] --> Stores parent vertices in DFS tree
// ap[] --> Stores articulation points
void APUtil(int u, boolean visited[], int disc[],
int low[], int parent[], boolean ap[])
{
// Count of children in DFS Tree
int children = 0;
// Mark the current node as visited
visited[u] = true;
// Initialize discovery time and low value
disc[u] = low[u] = ++time;
// Go through all vertices adjacent to this
Iterator<Integer> i = adj[u].iterator();
while (i.hasNext())
{
int v = i.next(); // v is current adjacent of u
// If v is not visited yet, then make it a child of u
// in DFS tree and recur for it
if (!visited[v])
{
children++;
parent[v] = u;
APUtil(v, visited, disc, low, parent, ap);
// Check if the subtree rooted with v has a connection to
// one of the ancestors of u
low[u] = Math.min(low[u], low[v]);
// u is an articulation point in following cases
}
// This code is contributed by Aakash Hasija
Time Complexity: The above function is simple DFS with additional arrays. So time complexity is same as DFS which is O(V+E) for adjacency
list representation of graph.
References:
https://www.cs.washington.edu/education/courses/421/04su/slides/artic.pdf
http://www.slideshare.net/TraianRebedea/algorithm-design-and-complexity-course-8
http://faculty.simpson.edu/lydia.sinapova/www/cmsc250/LN250_Weiss/L25-Connectivity.htm
Biconnected graph
An undirected graph is called Biconnected if there are two vertex-disjoint paths between any two vertices. In a Biconnected Graph, there is a
simple cycle through any two vertices.
By convention, two nodes connected by an edge form a biconnected graph, but this does not verify the above properties. For a graph with more
than two vertices, the above properties must be there for it to be Biconnected.
Following are some examples.
C++
// A C++ program to find if a given undirected graph is
// biconnected
#include<iostream>
#include <list>
#define NIL -1
using namespace std;
// A class that represents an undirected graph
class Graph
{
int V; // No. of vertices
list<int> *adj; // A dynamic array of adjacency lists
bool isBCUtil(int v, bool visited[], int disc[], int low[],
int parent[]);
public:
Graph(int V); // Constructor
void addEdge(int v, int w); // to add an edge to graph
bool isBC(); // returns true if graph is Biconnected
};
Graph::Graph(int V)
{
this->V = V;
adj = new list<int>[V];
}
void Graph::addEdge(int v, int w)
{
adj[v].push_back(w);
adj[w].push_back(v); // Note: the graph is undirected
}
// A recursive function that returns true if there is an articulation
// point in given graph, otherwise returns false.
// This function is almost same as isAPUtil() here ( http://goo.gl/Me9Fw )
// u --> The vertex to be visited next
// visited[] --> keeps track of visited vertices
// disc[] --> Stores discovery times of visited vertices
// parent[] --> Stores parent vertices in DFS tree
bool Graph::isBCUtil(int u, bool visited[], int disc[],int low[],int parent[])
{
// A static variable is used for simplicity, we can avoid use of static
// variable by passing a pointer.
static int time = 0;
// Count of children in DFS Tree
int children = 0;
// Mark the current node as visited
visited[u] = true;
// Initialize discovery time and low value
disc[u] = low[u] = ++time;
// Go through all vertices adjacent to this
list<int>::iterator i;
for (i = adj[u].begin(); i != adj[u].end(); ++i)
{
int v = *i; // v is current adjacent of u
// If v is not visited yet, then make it a child of u
// in DFS tree and recur for it
if (!visited[v])
{
children++;
parent[v] = u;
// check if subgraph rooted with v has an articulation point
if (isBCUtil(v, visited, disc, low, parent))
return true;
// Check if the subtree rooted with v has a connection to
// one of the ancestors of u
low[u] = min(low[u], low[v]);
// u is an articulation point in following cases
// (1) u is root of DFS tree and has two or more children.
if (parent[u] == NIL && children > 1)
return true;
// (2) If u is not root and low value of one of its children is
// not less than the discovery value of u.
if (parent[u] != NIL && low[v] >= disc[u])
return true;
}
// Update low value of u for parent function calls.
else if (v != parent[u])
low[u] = min(low[u], disc[v]);
}
return false;
}
// The main function that returns true if graph is Biconnected,
// otherwise false. It uses recursive function isBCUtil()
bool Graph::isBC()
{
// Mark all the vertices as not visited
bool *visited = new bool[V];
int *disc = new int[V];
int *low = new int[V];
int *parent = new int[V];
// Initialize parent and visited arrays
for (int i = 0; i < V; i++)
{
parent[i] = NIL;
visited[i] = false;
}
// Call the recursive helper function to find if there is an articulation
// point in given graph. We do DFS traversal starting from vertex 0
if (isBCUtil(0, visited, disc, low, parent) == true)
return false;
// Now check whether the given graph is connected or not. An undirected
// graph is connected if all vertices are reachable from any starting
// point (we have taken 0 as starting point)
for (int i = 0; i < V; i++)
if (visited[i] == false)
return false;
return true;
}
// Driver program to test above function
int main()
{
// Create graphs given in above diagrams
Graph g1(2);
g1.addEdge(0, 1);
g1.isBC()? cout << "Yes\n" : cout << "No\n";
Graph g2(5);
g2.addEdge(1, 0);
g2.addEdge(0, 2);
g2.addEdge(2, 1);
g2.addEdge(0, 3);
g2.addEdge(3, 4);
g2.addEdge(2, 4);
g2.isBC()? cout << "Yes\n" : cout << "No\n";
Graph g3(3);
g3.addEdge(0, 1);
g3.addEdge(1, 2);
g3.isBC()? cout << "Yes\n" : cout << "No\n";
Graph g4(5);
g4.addEdge(1, 0);
g4.addEdge(0, 2);
g4.addEdge(2, 1);
g4.addEdge(0, 3);
g4.addEdge(3, 4);
g4.isBC()? cout << "Yes\n" : cout << "No\n";
Graph g5(3);
g5.addEdge(0, 1);
g5.addEdge(1, 2);
g5.addEdge(2, 0);
g5.isBC()? cout << "Yes\n" : cout << "No\n";
return 0;
}
Java
// A Java program to find if a given undirected graph is
// biconnected
import java.io.*;
import java.util.*;
import java.util.LinkedList;
// This class represents an undirected graph using adjacency
// list representation
class Graph
{
private int V; // No. of vertices
// Array of lists for Adjacency List Representation
private LinkedList<Integer> adj[];
int time = 0;
static final int NIL = -1;
// Constructor
Graph(int v)
{
V = v;
adj = new LinkedList[v];
for (int i=0; i<v; ++i)
adj[i] = new LinkedList();
}
//Function to add an edge into the graph
void addEdge(int v, int w)
{
adj[v].add(w); //Note that the graph is undirected.
adj[w].add(v);
}
// A recursive function that returns true if there is an articulation
// point in given graph, otherwise returns false.
// This function is almost same as isAPUtil() @ http://goo.gl/Me9Fw
// u --> The vertex to be visited next
// visited[] --> keeps track of visited vertices
// disc[] --> Stores discovery times of visited vertices
// parent[] --> Stores parent vertices in DFS tree
boolean isBCUtil(int u, boolean visited[], int disc[],int low[],
int parent[])
{
// Count of children in DFS Tree
int children = 0;
// Mark the current node as visited
visited[u] = true;
// Initialize discovery time and low value
disc[u] = low[u] = ++time;
// Go through all vertices adjacent to this
Iterator<Integer> i = adj[u].iterator();
while (i.hasNext())
{
int v = i.next(); // v is current adjacent of u
// If v is not visited yet, then make it a child of u
// in DFS tree and recur for it
if (!visited[v])
{
children++;
parent[v] = u;
// check if subgraph rooted with v has an articulation point
if (isBCUtil(v, visited, disc, low, parent))
return true;
// Check if the subtree rooted with v has a connection to
// one of the ancestors of u
low[u] = Math.min(low[u], low[v]);
// u is an articulation point in following cases
// (1) u is root of DFS tree and has two or more children.
if (parent[u] == NIL && children > 1)
return true;
// (2) If u is not root and low value of one of its
// children is not less than the discovery value of u.
if (parent[u] != NIL && low[v] >= disc[u])
return true;
}
// Update low value of u for parent function calls.
else if (v != parent[u])
low[u] = Math.min(low[u], disc[v]);
}
return false;
}
// The main function that returns true if graph is Biconnected,
// otherwise false. It uses recursive function isBCUtil()
boolean isBC()
{
// Mark all the vertices as not visited
boolean visited[] = new boolean[V];
int disc[] = new int[V];
int low[] = new int[V];
int parent[] = new int[V];
// Initialize parent and visited arrays
for (int i = 0; i < V; i++)
{
parent[i] = NIL;
visited[i] = false;
}
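// Call the recursive helper function to find if there is an
// articulation point in given graph. We do DFS traversal
// starting from vertex 0
if (isBCUtil(0, visited, disc, low, parent) == true)
return false;
// Now check whether the given graph is connected or not.
// An undirected graph is connected if all vertices are
// reachable from any starting point (we have taken 0 as
// starting point)
for (int i = 0; i < V; i++)
if (visited[i] == false)
return false;
return true;
}
// Driver method
public static void main(String args[])
{
// Create graphs given in above diagrams
Graph g1 = new Graph(2);
g1.addEdge(0, 1);
if (g1.isBC())
System.out.println("Yes");
else
System.out.println("No");
Graph g2 = new Graph(5);
g2.addEdge(1, 0);
g2.addEdge(0, 2);
g2.addEdge(2, 1);
g2.addEdge(0, 3);
g2.addEdge(3, 4);
g2.addEdge(2, 4);
if (g2.isBC())
System.out.println("Yes");
else
System.out.println("No");
Graph g3 = new Graph(3);
g3.addEdge(0, 1);
g3.addEdge(1, 2);
if (g3.isBC())
System.out.println("Yes");
else
System.out.println("No");
Graph g4 = new Graph(5);
g4.addEdge(1, 0);
g4.addEdge(0, 2);
g4.addEdge(2, 1);
g4.addEdge(0, 3);
g4.addEdge(3, 4);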
if (g4.isBC())
System.out.println("Yes");
else
System.out.println("No");
Graph g5= new Graph(3);
g5.addEdge(0, 1);
g5.addEdge(1, 2);
g5.addEdge(2, 0);
if (g5.isBC())
System.out.println("Yes");
else
System.out.println("No");
}
}
// This code is contributed by Aakash Hasija
Yes
Yes
No
No
Yes
Time Complexity: The above function is a simple DFS with additional arrays. So time complexity is same as DFS which is O(V+E) for
adjacency list representation of graph.
References:
http://www.cs.purdue.edu/homes/ayg/CS251/slides/chap9d.pdf
Bridges in a graph
An edge in an undirected connected graph is a bridge iff removing it disconnects the graph. For a disconnected undirected graph, the definition
is similar: a bridge is an edge whose removal increases the number of connected components.
Like Articulation Points, bridges represent vulnerabilities in a connected network and are useful for designing reliable networks. For example,
in a wired computer network, an articulation point indicates a critical computer and a bridge indicates a critical wire or connection.
Following are some example graphs with bridges highlighted in red.
C++
// A C++ program to find bridges in a given undirected graph
#include<iostream>
#include <list>
#define NIL -1
using namespace std;
// A class that represents an undirected graph
class Graph
{
int V; // No. of vertices
list<int> *adj; // A dynamic array of adjacency lists
void bridgeUtil(int v, bool visited[], int disc[], int low[],
int parent[]);
public:
Graph(int V); // Constructor
void addEdge(int v, int w); // to add an edge to graph
void bridge(); // prints all bridges
};
Graph::Graph(int V)
{
this->V = V;
adj = new list<int>[V];
}
void Graph::addEdge(int v, int w)
{
adj[v].push_back(w);
adj[w].push_back(v); // Note: the graph is undirected
}
// A recursive function that finds and prints bridges using
// DFS traversal
// u --> The vertex to be visited next
Java
// A Java program to find bridges in a given undirected graph
import java.io.*;
import java.util.*;
import java.util.LinkedList;
// This class represents an undirected graph using adjacency list
// representation
class Graph
{
private int V; // No. of vertices
// Array of lists for Adjacency List Representation
private LinkedList<Integer> adj[];
int time = 0;
static final int NIL = -1;
// Constructor
Graph(int v)
{
V = v;
adj = new LinkedList[v];
for (int i=0; i<v; ++i)
adj[i] = new LinkedList();
}
// Function to add an edge into the graph
void addEdge(int v, int w)
{
adj[v].add(w); // Add w to v's list.
adj[w].add(v); //Add v to w's list
}
// A recursive function that finds and prints bridges
// using DFS traversal
// u --> The vertex to be visited next
// visited[] --> keeps track of visited vertices
// disc[] --> Stores discovery times of visited vertices
// parent[] --> Stores parent vertices in DFS tree
void bridgeUtil(int u, boolean visited[], int disc[],
int low[], int parent[])
{
// Count of children in DFS Tree
int children = 0;
// Mark the current node as visited
visited[u] = true;
// Initialize discovery time and low value
disc[u] = low[u] = ++time;
// Go through all vertices adjacent to this
Iterator<Integer> i = adj[u].iterator();
while (i.hasNext())
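{
int v = i.next(); // v is current adjacent of u
// If v is not visited yet, then make it a child
// of u in DFS tree and recur for it
if (!visited[v])
{
parent[v] = u;
bridgeUtil(v, visited, disc, low, parent);
// Check if the subtree rooted with v has a
// connection to one of the ancestors of u
low[u] = Math.min(low[u], low[v]);
// If the lowest vertex reachable from the subtree
// under v is below u in DFS tree, then u-v is
// a bridge
if (low[v] > disc[u])
System.out.println(u + " " + v);
}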
// Update low value of u for parent function calls.
else if (v != parent[u])
low[u] = Math.min(low[u], disc[v]);
}
}
// DFS based function to find all bridges. It uses recursive
// function bridgeUtil()
void bridge()
{
// Mark all the vertices as not visited
boolean visited[] = new boolean[V];
int disc[] = new int[V];
int low[] = new int[V];
int parent[] = new int[V];
// Initialize parent and visited arrays
for (int i = 0; i < V; i++)
{
parent[i] = NIL;
visited[i] = false;
}
// Call the recursive helper function to find Bridges
// in DFS tree rooted with vertex 'i'
for (int i = 0; i < V; i++)
if (visited[i] == false)
bridgeUtil(i, visited, disc, low, parent);
}
public static void main(String args[])
{
// Create graphs given in above diagrams
System.out.println("Bridges in first graph ");
Graph g1 = new Graph(5);
g1.addEdge(1, 0);
g1.addEdge(0, 2);
g1.addEdge(2, 1);
g1.addEdge(0, 3);
g1.addEdge(3, 4);
g1.bridge();
System.out.println();
System.out.println("Bridges in Second graph");
Graph g2 = new Graph(4);
g2.addEdge(0, 1);
g2.addEdge(1, 2);
g2.addEdge(2, 3);
g2.bridge();
System.out.println();
System.out.println("Bridges in Third graph ");
Graph g3 = new Graph(7);
g3.addEdge(0, 1);
g3.addEdge(1, 2);
g3.addEdge(2, 0);
g3.addEdge(1, 3);
g3.addEdge(1, 4);
g3.addEdge(1, 6);
g3.addEdge(3, 5);
g3.addEdge(4, 5);
g3.bridge();
}
}
// This code is contributed by Aakash Hasija
Time Complexity: The above function is simple DFS with additional arrays. So time complexity is same as DFS which is O(V+E) for adjacency
list representation of graph.
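The bridge test itself can be sketched in C++ as follows. This is a minimal sketch under assumptions of our own, not the original listing: the graph is a simple undirected graph (no parallel edges) stored as a vector of adjacency lists, a discovery time of 0 means "not visited yet", and the names bridgeDfs and findBridges are hypothetical. A tree edge u-v is reported as a bridge when low[v] > disc[u], that is, when nothing in the subtree of v can reach u or any ancestor of u without using that edge.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// DFS that fills disc[] and low[] and appends every bridge (u, v) to out.
static void bridgeDfs(const std::vector<std::vector<int>>& adj, int u, int parent,
                      int& timer, std::vector<int>& disc, std::vector<int>& low,
                      std::vector<std::pair<int, int>>& out)
{
    disc[u] = low[u] = ++timer;
    for (int v : adj[u]) {
        if (disc[v] == 0) {                       // tree edge u-v
            bridgeDfs(adj, v, u, timer, disc, low, out);
            low[u] = std::min(low[u], low[v]);
            if (low[v] > disc[u])                 // subtree of v cannot climb to u
                out.emplace_back(u, v);
        } else if (v != parent) {                 // back edge
            low[u] = std::min(low[u], disc[v]);
        }
    }
}

// Returns all bridges of the graph, in the order the DFS finishes them.
std::vector<std::pair<int, int>> findBridges(const std::vector<std::vector<int>>& adj)
{
    const int n = adj.size();
    std::vector<int> disc(n, 0), low(n, 0);
    std::vector<std::pair<int, int>> out;
    int timer = 0;
    for (int u = 0; u < n; ++u)
        if (disc[u] == 0)
            bridgeDfs(adj, u, -1, timer, disc, low, out);
    return out;
}
```

For the first graph in the driver program above (edges 1-0, 0-2, 2-1, 0-3, 3-4), the sketch returns the bridges (3, 4) and (0, 3). Note the strict inequality: articulation points use low[v] >= disc[u], while bridges need low[v] > disc[u].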
References:
https://www.cs.washington.edu/education/courses/421/04su/slides/artic.pdf
http://www.slideshare.net/TraianRebedea/algorithm-design-and-complexity-course-8
http://faculty.simpson.edu/lydia.sinapova/www/cmsc250/LN250_Weiss/L25-Connectivity.htm
http://www.youtube.com/watch?v=bmyyxNyZKzI
C++
// A C++ program to check if a given graph is Eulerian or not
#include<iostream>
#include <list>
using namespace std;
// A class that represents an undirected graph
class Graph
{
int V; // No. of vertices
list<int> *adj; // A dynamic array of adjacency lists
public:
// Constructor and destructor
Graph(int V) {this->V = V; adj = new list<int>[V]; }
~Graph() { delete [] adj; } // To avoid memory leak
// function to add an edge to graph
void addEdge(int v, int w);
// Method to check if this graph is Eulerian or not
int isEulerian();
// Method to check if all non-zero degree vertices are connected
bool isConnected();
// Function to do DFS starting from v. Used in isConnected();
void DFSUtil(int v, bool visited[]);
};
void Graph::addEdge(int v, int w)
{
adj[v].push_back(w);
adj[w].push_back(v); // Note: the graph is undirected
}
void Graph::DFSUtil(int v, bool visited[])
{
// Mark the current node as visited
visited[v] = true;
// Recur for all the vertices adjacent to this vertex
list<int>::iterator i;
for (i = adj[v].begin(); i != adj[v].end(); ++i)
if (!visited[*i])
DFSUtil(*i, visited);
}
// Method to check if all non-zero degree vertices are connected.
// It mainly does DFS traversal starting from a vertex with non-zero degree
bool Graph::isConnected()
{
// Mark all the vertices as not visited
bool visited[V];
int i;
for (i = 0; i < V; i++)
visited[i] = false;
// Find a vertex with non-zero degree
for (i = 0; i < V; i++)
if (adj[i].size() != 0)
break;
// If there are no edges in the graph, return true
if (i == V)
return true;
// Start DFS traversal from a vertex with non-zero degree
DFSUtil(i, visited);
// Check if all non-zero degree vertices are visited
for (i = 0; i < V; i++)
if (visited[i] == false && adj[i].size() > 0)
return false;
return true;
}
/* The function returns one of the following values
0 --> If graph is not Eulerian
1 --> If graph has an Euler path (Semi-Eulerian)
2 --> If graph has an Euler Circuit (Eulerian) */
int Graph::isEulerian()
{
// Check if all non-zero degree vertices are connected
if (isConnected() == false)
return 0;
// Count vertices with odd degree
int odd = 0;
for (int i = 0; i < V; i++)
if (adj[i].size() & 1)
odd++;
// If count is more than 2, then graph is not Eulerian
if (odd > 2)
return 0;
// If odd count is 2, then semi-eulerian.
// If odd count is 0, then eulerian
// Note that odd count can never be 1 for undirected graph
return (odd)? 1 : 2;
}
// Function to run test cases
void test(Graph &g)
{
int res = g.isEulerian();
if (res == 0)
cout << "graph is not Eulerian\n";
else if (res == 1)
cout << "graph has a Euler path\n";
else
cout << "graph has a Euler cycle\n";
}
// Driver program to test above function
int main()
{
// Let us create and test graphs shown in above figures
Graph g1(5);
g1.addEdge(1, 0);
g1.addEdge(0, 2);
g1.addEdge(2, 1);
g1.addEdge(0, 3);
g1.addEdge(3, 4);
test(g1);
Graph g2(5);
g2.addEdge(1, 0);
g2.addEdge(0, 2);
g2.addEdge(2, 1);
g2.addEdge(0, 3);
g2.addEdge(3, 4);
g2.addEdge(4, 0);
test(g2);
Graph g3(5);
g3.addEdge(1, 0);
g3.addEdge(0, 2);
g3.addEdge(2, 1);
g3.addEdge(0, 3);
g3.addEdge(3, 4);
g3.addEdge(1, 3);
test(g3);
return 0;
}
Java
// A Java program to check if a given graph is Eulerian or not
import java.io.*;
import java.util.*;
import java.util.LinkedList;
There are two vertices with odd degree, 2 and 3, so we can start the path from either of them. Let us start the tour from vertex 2.
There are three edges going out from vertex 2; which one should we pick? We don't pick the edge 2-3 because it is a bridge (after crossing it we
won't be able to come back to the remaining edges). We can pick either of the remaining two edges. Let us say we pick 2-0. We remove this edge and move to vertex 0.
There is only one edge from vertex 0, so we pick it, remove it and move to vertex 1. Euler tour becomes 2-0 0-1.
There is only one edge from vertex 1, so we pick it, remove it and move to vertex 2. Euler tour becomes 2-0 0-1 1-2.
Again there is only one edge from vertex 2, so we pick it, remove it and move to vertex 3. Euler tour becomes 2-0 0-1 1-2 2-3.
There are no more edges left, so we stop here. Final tour is 2-0 0-1 1-2 2-3.
See the references at the end of this section for more examples.
Following is a C++ implementation of the above algorithm. In the following code, it is assumed that the given graph has an Eulerian trail or circuit. The
main focus is to print an Eulerian trail or circuit. We can use isEulerian() to first check whether there is an Eulerian trail or circuit in the given
graph.
We first find the starting point, which must be an odd vertex (if there are odd vertices), and store it in variable u. If there are zero odd vertices, we
start from vertex 0. We call printEulerUtil() to print the Euler tour starting with u. We traverse all adjacent vertices of u. If there is only one adjacent
vertex, we immediately consider it. If there is more than one adjacent vertex, we consider an adjacent v only if edge u-v is not a bridge. How do we
find whether a given edge is a bridge? We count the number of vertices reachable from u. We remove edge u-v and again count the number of reachable
vertices from u. If the number of reachable vertices is reduced, then edge u-v is a bridge. To count reachable vertices, we can use either BFS or
DFS; the code uses DFS. The function DFSCount(u) returns the number of vertices reachable from u.
Once an edge is processed (included in the Euler tour), we remove it from the graph. To remove the edge, we replace the vertex entry with -1 in the
adjacency list. Note that simply deleting the node may not work, as the code is recursive and a parent call may be in the middle of the adjacency list.
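The bridge test and the edge-selection rule described above can be sketched compactly. This is only a minimal sketch under stated assumptions, not the listing that follows: it assumes a simple undirected graph without self-loops, stored as an adjacency matrix of edge counts, and, like the listing, assumes an Euler trail or circuit exists. The names Matrix, degreeOf, reachable and fleury are hypothetical; isValidNextEdge only mirrors the name used in the listing.

```cpp
#include <cassert>
#include <string>
#include <vector>
using namespace std;

// Hypothetical representation: g[u][v] is the number of u-v edges still present.
using Matrix = vector<vector<int>>;

// Number of edges currently incident to u.
int degreeOf(const Matrix &g, int u) {
    int d = 0;
    for (int w = 0; w < (int)g.size(); ++w) d += g[u][w];
    return d;
}

// Count the vertices reachable from v over the edges still present (DFS).
int reachable(const Matrix &g, int v, vector<bool> &seen) {
    seen[v] = true;
    int count = 1;
    for (int w = 0; w < (int)g.size(); ++w)
        if (g[v][w] > 0 && !seen[w])
            count += reachable(g, w, seen);
    return count;
}

// u-v is a valid next edge if it is the only edge left at u, or if removing
// it does not reduce the number of vertices reachable from u (not a bridge).
bool isValidNextEdge(Matrix &g, int u, int v) {
    if (degreeOf(g, u) == 1) return true;
    vector<bool> seen(g.size(), false);
    int before = reachable(g, u, seen);
    --g[u][v]; --g[v][u];            // temporarily remove u-v
    seen.assign(g.size(), false);
    int after = reachable(g, u, seen);
    ++g[u][v]; ++g[v][u];            // restore u-v
    return after == before;
}

// Returns the tour as text such as "2-0 0-1". The matrix is taken by value,
// so the caller's graph is left unmodified.
string fleury(Matrix g) {
    int n = g.size();
    for (const auto &row : g) assert((int)row.size() == n); // square matrix expected
    int u = 0;                       // start at an odd vertex if one exists, else at 0
    for (int i = 0; i < n; ++i)
        if (degreeOf(g, i) % 2 == 1) { u = i; break; }
    string tour;
    bool moved = true;
    while (moved) {
        moved = false;
        for (int v = 0; v < n; ++v) {
            if (g[u][v] > 0 && isValidNextEdge(g, u, v)) {
                if (!tour.empty()) tour += ' ';
                tour += to_string(u) + "-" + to_string(v);
                --g[u][v]; --g[v][u]; // the edge is now part of the tour
                u = v;
                moved = true;
                break;
            }
        }
    }
    return tour;
}
```

For the example graph walked through above (edges 0-1, 0-2, 1-2 and 2-3), calling fleury on its matrix starts at vertex 2 and yields the same tour as the walk-through. Passing the matrix by value is a design choice that sidesteps the "modifies the given graph" caveat at the cost of a copy.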
// A C++ program print Eulerian Trail in a given Eulerian or Semi-Eulerian Graph
#include <iostream>
#include <string.h>
#include <algorithm>
#include <list>
using namespace std;
// A class that represents an undirected graph
class Graph
{
int V; // No. of vertices
list<int> *adj; // A dynamic array of adjacency lists
public:
// Constructor and destructor
Graph(int V) { this->V = V; adj = new list<int>[V]; }
~Graph() { delete [] adj; }
// functions to add and remove edge
void addEdge(int u, int v) { adj[u].push_back(v); adj[v].push_back(u); }
void rmvEdge(int u, int v);
// Methods to print Eulerian tour
void printEulerTour();
void printEulerUtil(int s);
// This function returns count of vertices reachable from v. It does DFS
int DFSCount(int v, bool visited[]);
// Utility function to check if edge u-v is a valid next edge in
// Eulerian trail or circuit
bool isValidNextEdge(int u, int v);
};
/* The main function that print Eulerian Trail. It first finds an odd
degree vertex (if there is any) and then calls printEulerUtil()
to print the path */
void Graph::printEulerTour()
{
// Find a vertex with odd degree
int u = 0;
for (int i = 0; i < V; i++)
if (adj[i].size() & 1)
{ u = i; break; }
// Print tour starting from oddv
printEulerUtil(u);
cout << endl;
}
// Print Euler tour starting from vertex u
void Graph::printEulerUtil(int u)
{
Output:
2-0 0-1 1-2 2-3
0-1 1-2 2-0
0-1 1-2 2-0 0-3 3-4 4-2 2-3 3-1
Note that the above code modifies the given graph; we can create a copy of the graph if we don't want the given graph to be modified.
Time Complexity: Time complexity of the above implementation is O((V+E)^2). The function printEulerUtil() is like DFS and it calls
isValidNextEdge(), which also does DFS two times. Time complexity of DFS for adjacency list representation is O(V+E). Therefore the overall time
complexity is O((V+E)*(V+E)), which can be written as O(E^2) for a connected graph.
There are better algorithms to print an Euler tour; we will soon be covering them as separate posts.
References:
http://www.math.ku.edu/~jmartin/courses/math105-F11/Lectures/chapter5-part2.pdf
http://en.wikipedia.org/wiki/Eulerian_path#Fleury.27s_algorithm
We can find all strongly connected components in O(V+E) time using Kosaraju's algorithm. Following is the detailed algorithm.
1) Create an empty stack S and do a DFS traversal of the graph. In the DFS traversal, after calling recursive DFS for the adjacent vertices of a vertex, push
the vertex to the stack.
2) Reverse the directions of all arcs to obtain the transpose graph.
3) One by one pop a vertex from S while S is not empty. Let the popped vertex be v. If v has not been visited yet, take v as source and do DFS on the
transpose graph (call DFSUtil(v)). The DFS starting from v prints the strongly connected component of v.
How does this work?
The above algorithm is DFS based. It does DFS two times. DFS of a graph produces a single tree if all vertices are reachable from the DFS
starting point. Otherwise DFS produces a forest. So DFS of a graph with only one SCC always produces a tree. The important point to note is that
DFS may produce a tree or a forest when there is more than one SCC, depending upon the chosen starting point. For example, in the above
diagram, if we start DFS from vertices 0 or 1 or 2, we get a tree as output. And if we start from 3 or 4, we get a forest. To find and print all
SCCs, we would want to start DFS from vertex 4 (which is a sink vertex), then move to 3, which is a sink in the remaining set (the set excluding 4), and
finally any of the remaining vertices (0, 1, 2). So how do we find this sequence of picking vertices as starting points of DFS? Unfortunately, there is
no direct way of getting this sequence. However, if we do a DFS of the graph and store vertices according to their finish times, we make sure that the
finish time of a vertex that connects to other SCCs (other than its own SCC) will always be greater than the finish times of vertices in those other SCCs.
For example, in a DFS of the above example graph, the finish time of 0 is always greater than those of 3 and 4 (irrespective of the sequence of
vertices considered for DFS), and the finish time of 3 is always greater than that of 4. DFS gives no such guarantee about the other vertices; for example, the finish times of
1 and 2 may be smaller or greater than those of 3 and 4, depending upon the sequence of vertices considered for DFS. So to use this property, we do a DFS
traversal of the complete graph and push every finished vertex to a stack. In the stack, 3 is always pushed after 4, and 0 is pushed after both 3 and 4.
In the next step, we reverse the graph. Consider the graph of SCCs. In the reversed graph, the edges that connect two components are reversed.
So the SCC {0, 1, 2} becomes a sink and the SCC {4} becomes a source. As discussed above, 0 is always popped from the stack before 3 and 4. So if we
do a DFS of the reversed graph using the sequence of vertices popped from the stack, we process vertices from sink to source. That is what we wanted to achieve,
and that is all that is needed to print SCCs one by one.
C++
// C++ Implementation of Kosaraju's algorithm to print all SCCs
#include <iostream>
#include <list>
#include <stack>
using namespace std;
class Graph
{
int V; // No. of vertices
list<int> *adj; // An array of adjacency lists
// Fills Stack with vertices (in increasing order of finishing
// times). The top element of stack has the maximum finishing
// time
void fillOrder(int v, bool visited[], stack<int> &Stack);
// A recursive function to print DFS starting from v
void DFSUtil(int v, bool visited[]);
public:
Graph(int V);
void addEdge(int v, int w);
// The main function that finds and prints strongly connected
// components
void printSCCs();
// Function that returns reverse (or transpose) of this graph
Graph getTranspose();
};
Graph::Graph(int V)
{
this->V = V;
adj = new list<int>[V];
}
// A recursive function to print DFS starting from v
void Graph::DFSUtil(int v, bool visited[])
{
// Mark the current node as visited and print it
visited[v] = true;
cout << v << " ";
// Recur for all the vertices adjacent to this vertex
list<int>::iterator i;
for (i = adj[v].begin(); i != adj[v].end(); ++i)
if (!visited[*i])
DFSUtil(*i, visited);
}
Graph Graph::getTranspose()
{
Graph g(V);
for (int v = 0; v < V; v++)
{
// Recur for all the vertices adjacent to this vertex
list<int>::iterator i;
for(i = adj[v].begin(); i != adj[v].end(); ++i)
{
g.adj[*i].push_back(v);
}
}
return g;
}
void Graph::addEdge(int v, int w)
{
adj[v].push_back(w); // Add w to v's list.
}
void Graph::fillOrder(int v, bool visited[], stack<int> &Stack)
{
// Mark the current node as visited
visited[v] = true;
// Recur for all the vertices adjacent to this vertex
list<int>::iterator i;
for(i = adj[v].begin(); i != adj[v].end(); ++i)
if(!visited[*i])
fillOrder(*i, visited, Stack);
// All vertices reachable from v are processed by now, push v
Stack.push(v);
}
// The main function that finds and prints all strongly connected
// components
void Graph::printSCCs()
{
stack<int> Stack;
// Mark all the vertices as not visited (For first DFS)
bool *visited = new bool[V];
for(int i = 0; i < V; i++)
visited[i] = false;
// Fill vertices in stack according to their finishing times
for(int i = 0; i < V; i++)
if(visited[i] == false)
fillOrder(i, visited, Stack);
// Create a reversed graph
Graph gr = getTranspose();
// Mark all the vertices as not visited (For second DFS)
for(int i = 0; i < V; i++)
visited[i] = false;
// Now process all vertices in order defined by Stack
while (Stack.empty() == false)
{
// Pop a vertex from stack
int v = Stack.top();
Stack.pop();
// Print Strongly connected component of the popped vertex
if (visited[v] == false)
{
gr.DFSUtil(v, visited);
cout << endl;
}
}
}
// Driver program to test above functions
int main()
{
// Create a graph given in the above diagram
Graph g(5);
g.addEdge(1, 0);
g.addEdge(0, 2);
g.addEdge(2, 1);
g.addEdge(0, 3);
g.addEdge(3, 4);
cout << "Following are strongly connected components in "
"given graph \n";
g.printSCCs();
return 0;
}
Java
// Java implementation of Kosaraju's algorithm to print all SCCs
import java.io.*;
import java.util.*;
import java.util.LinkedList;
// This class represents a directed graph using adjacency list
// representation
class Graph
{
private int V; // No. of vertices
private LinkedList<Integer> adj[]; //Adjacency List
//Constructor
Graph(int v)
{
V = v;
adj = new LinkedList[v];
for (int i=0; i<v; ++i)
adj[i] = new LinkedList();
}
//Function to add an edge into the graph
void addEdge(int v, int w) { adj[v].add(w); }
// A recursive function to print DFS starting from v
Time Complexity: The above algorithm calls DFS, finds the reverse of the graph and again calls DFS. DFS takes O(V+E) for a graph represented
using adjacency lists. Reversing a graph also takes O(V+E) time. For reversing the graph, we simply traverse all adjacency lists.
The above algorithm is asymptotically optimal, but there are other algorithms, like Tarjan's algorithm and the path-based algorithm, which have the same time
complexity but find SCCs using a single DFS. Tarjan's algorithm is discussed in the following post.
Tarjan's Algorithm to find Strongly Connected Components
Applications:
SCC algorithms can be used as a first step in many graph algorithms that work only on strongly connected graph.
In social networks, a group of people is generally strongly connected (for example, students of a class or any other common place). Many
people in these groups generally like some common pages or play common games. The SCC algorithms can be used to find such groups and
suggest the commonly liked pages or games to the people in the group who have not yet liked such a page or played such a game.
References:
http://en.wikipedia.org/wiki/Kosaraju%27s_algorithm
https://www.youtube.com/watch?v=PZQ0Pdk15RA
C++
// Program for transitive closure using Floyd Warshall Algorithm
#include<stdio.h>
// Number of vertices in the graph
#define V 4
// A function to print the solution matrix
void printSolution(int reach[][V]);
// Prints transitive closure of graph[][] using Floyd Warshall algorithm
void transitiveClosure(int graph[][V])
{
/* reach[][] will be the output matrix that will finally have the
reachability values for every pair of vertices */
int reach[V][V], i, j, k;
/* Initialize the solution matrix same as input graph matrix. Or
we can say the initial reachability values are based on paths
that use no intermediate vertex. */
for (i = 0; i < V; i++)
for (j = 0; j < V; j++)
reach[i][j] = graph[i][j];
/* Add all vertices one by one to the set of intermediate vertices.
---> Before start of a iteration, we have reachability values for
all pairs of vertices such that the reachability values
consider only the vertices in set {0, 1, 2, .. k-1} as
intermediate vertices.
----> After the end of a iteration, vertex no. k is added to the
set of intermediate vertices and the set becomes {0, 1, .. k} */
for (k = 0; k < V; k++)
{
// Pick all vertices as source one by one
for (i = 0; i < V; i++)
{
// Pick all vertices as destination for the
// above picked source
for (j = 0; j < V; j++)
{
// If vertex k is on a path from i to j,
// then make sure that the value of reach[i][j] is 1
reach[i][j] = reach[i][j] || (reach[i][k] && reach[k][j]);
}
}
}
// Print the reachability matrix
printSolution(reach);
}
/* A utility function to print solution */
void printSolution(int reach[][V])
{
printf ("Following matrix is transitive closure of the given graph\n");
for (int i = 0; i < V; i++)
{
for (int j = 0; j < V; j++)
printf ("%d ", reach[i][j]);
printf("\n");
}
}
// driver program to test above function
int main()
{
/* Let us create the following weighted graph
            10
       (0)------->(3)
        |         /|\
      5 |          |
        |          | 1
       \|/         |
       (1)------->(2)
            3           */
int graph[V][V] = { {1, 1, 0, 1},
{0, 1, 1, 0},
{0, 0, 1, 1},
{0, 0, 0, 1}
};
// Print the solution
transitiveClosure(graph);
return 0;
}
Java
// Program for transitive closure using Floyd Warshall Algorithm
import java.util.*;
import java.lang.*;
import java.io.*;
class GraphClosure
{
final static int V = 4; //Number of vertices in a graph
// Prints transitive closure of graph[][] using Floyd
// Warshall algorithm
void transitiveClosure(int graph[][])
{
/* reach[][] will be the output matrix that will finally
have the reachability values for every pair of
vertices */
int reach[][] = new int[V][V];
int i, j, k;
/* Initialize the solution matrix same as input graph
matrix. Or we can say the initial reachability values
are based on paths that use no intermediate
vertex. */
for (i = 0; i < V; i++)
for (j = 0; j < V; j++)
reach[i][j] = graph[i][j];
/* Add all vertices one by one to the set of intermediate
vertices.
---> Before start of a iteration, we have reachability
values for all pairs of vertices such that the
reachability values consider only the vertices in
set {0, 1, 2, .. k-1} as intermediate vertices.
----> After the end of a iteration, vertex no. k is
added to the set of intermediate vertices and the
set becomes {0, 1, 2, .. k} */
for (k = 0; k < V; k++)
{
// Pick all vertices as source one by one
for (i = 0; i < V; i++)
{
// Pick all vertices as destination for the
// above picked source
for (j = 0; j < V; j++)
{
// If vertex k is on a path from i to j,
// then make sure that the value of reach[i][j] is 1
reach[i][j] = (reach[i][j]!=0) ||
((reach[i][k]!=0) && (reach[k][j]!=0))?1:0;
}
}
}
// Print the reachability matrix
printSolution(reach);
}
/* A utility function to print solution */
void printSolution(int reach[][])
{
System.out.println("Following matrix is transitive closure"+
" of the given graph");
for (int i = 0; i < V; i++)
{
for (int j = 0; j < V; j++)
System.out.print(reach[i][j]+" ");
System.out.println();
}
}
// Driver program to test above function
public static void main (String[] args)
{
/* Let us create the following weighted graph
            10
       (0)------->(3)
        |         /|\
      5 |          |
        |          | 1
       \|/         |
       (1)------->(2)
            3           */
int graph[][] = new int[][]{ {1, 1, 0, 1},
{0, 1, 1, 0},
{0, 0, 1, 1},
{0, 0, 0, 1}
};
// Print the solution
GraphClosure g = new GraphClosure();
g.transitiveClosure(graph);
}
}
A graph in which every vertex is reachable from every other vertex has exactly one connected component, consisting of the whole graph. Such an
undirected graph is called a connected graph; the term strongly connected graph applies to the directed case.
The problem can be easily solved by applying DFS() on each component. Each DFS() call visits one component (a sub-graph). We then call
DFS() on the next unvisited component. The number of calls to DFS() gives the number of connected components. BFS can also be used.
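The counting idea above can be sketched as follows. This is a minimal sketch, assuming an undirected graph given as a vertex count and an edge list; the function names dfs and countComponents and the edge-list input format are illustrative assumptions, not taken from the original listings.

```cpp
#include <cassert>
#include <utility>
#include <vector>
using namespace std;

// Visit every vertex reachable from v in an undirected graph.
void dfs(const vector<vector<int>> &adj, int v, vector<bool> &visited) {
    visited[v] = true;
    for (int w : adj[v])
        if (!visited[w])
            dfs(adj, w, visited);
}

// Each DFS started from an unvisited vertex covers exactly one component,
// so the number of such starts equals the number of connected components.
int countComponents(int n, const vector<pair<int, int>> &edges) {
    vector<vector<int>> adj(n);
    for (const auto &e : edges) {
        // Precondition: both endpoints are valid vertex numbers.
        assert(e.first >= 0 && e.first < n && e.second >= 0 && e.second < n);
        adj[e.first].push_back(e.second);
        adj[e.second].push_back(e.first); // undirected: store both directions
    }
    vector<bool> visited(n, false);
    int components = 0;
    for (int v = 0; v < n; ++v)
        if (!visited[v]) {
            dfs(adj, v, visited);
            ++components;
        }
    return components;
}
```

For instance, with 5 vertices and edges 0-1, 1-2 and 3-4, the outer loop starts a DFS at vertex 0 and again at vertex 3, so the count is 2.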
What is an island?
A group of connected 1s forms an island. For example, the below matrix contains 5 islands
{1, 1, 0, 0, 0},
{0, 1, 0, 0, 1},
{1, 0, 0, 1, 1},
{0, 0, 0, 0, 0},
{1, 0, 1, 0, 1}
A cell in a 2D matrix can be connected to up to 8 neighbours. So, unlike standard DFS(), where we recursively call for all adjacent vertices, here we
recursively call for the 8 neighbours only. We keep track of the visited 1s so that they are not visited again.
C/C++
// Program to count islands in boolean 2D matrix
#include <stdio.h>
#include <string.h>
#include <stdbool.h>
#define ROW 5
#define COL 5
// A function to check if a given cell (row, col) can be included in DFS
int isSafe(int M[][COL], int row, int col, bool visited[][COL])
{
// row number is in range, column number is in range and value is 1
// and not yet visited
return (row >= 0) && (row < ROW) &&
(col >= 0) && (col < COL) &&
(M[row][col] && !visited[row][col]);
}
// A utility function to do DFS for a 2D boolean matrix. It only considers
// the 8 neighbours as adjacent vertices
void DFS(int M[][COL], int row, int col, bool visited[][COL])
{
// These arrays are used to get row and column numbers of 8 neighbours
// of a given cell
static int rowNbr[] = {-1, -1, -1, 0, 0, 1, 1, 1};
static int colNbr[] = {-1, 0, 1, -1, 1, -1, 0, 1};
// Mark this cell as visited
visited[row][col] = true;
// Recur for all connected neighbours
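for (int k = 0; k < 8; ++k)
if (isSafe(M, row + rowNbr[k], col + colNbr[k], visited))
DFS(M, row + rowNbr[k], col + colNbr[k], visited);
}
// The main function that returns the number of islands in a given
// boolean 2D matrix
int countIslands(int M[][COL])
{
// Make a bool array to mark visited cells.
// Initially all cells are unvisited
bool visited[ROW][COL];
memset(visited, 0, sizeof(visited));
// Initialize count as 0 and traverse through all cells of the matrix
int count = 0;
for (int i = 0; i < ROW; ++i)
for (int j = 0; j < COL; ++j)
if (M[i][j] && !visited[i][j]) // a cell with value 1 that is not yet visited
{
// A new island is found: visit all of its cells and count it
DFS(M, i, j, visited);
++count;
}
return count;
}
// Driver program to test above function
int main()
{
int M[][COL] = { {1, 1, 0, 0, 0},
{0, 1, 0, 0, 1},
{1, 0, 0, 1, 1},
{0, 0, 0, 0, 0},
{1, 0, 1, 0, 1}
};
printf("Number of islands is: %d\n", countIslands(M));
return 0;
}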
Java
// Java program to count islands in boolean 2D matrix
import java.util.*;
import java.lang.*;
import java.io.*;
class Islands
{
//No of rows and columns
static final int ROW = 5, COL = 5;
// A function to check if a given cell (row, col) can
// be included in DFS
boolean isSafe(int M[][], int row, int col,
boolean visited[][])
{
// row number is in range, column number is in range
// and value is 1 and not yet visited
return (row >= 0) && (row < ROW) &&
(col >= 0) && (col < COL) &&
(M[row][col]==1 && !visited[row][col]);
}
// A utility function to do DFS for a 2D boolean matrix.
// It only considers the 8 neighbors as adjacent vertices
void DFS(int M[][], int row, int col, boolean visited[][])
{
// These arrays are used to get row and column numbers
// of 8 neighbors of a given cell
int rowNbr[] = new int[] {-1, -1, -1, 0, 0, 1, 1, 1};
int colNbr[] = new int[] {-1, 0, 1, -1, 1, -1, 0, 1};
// Mark this cell as visited
visited[row][col] = true;
// Recur for all connected neighbours
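for (int k = 0; k < 8; ++k)
if (isSafe(M, row + rowNbr[k], col + colNbr[k], visited))
DFS(M, row + rowNbr[k], col + colNbr[k], visited);
}
// The main function that returns the number of islands in a
// given boolean 2D matrix
int countIslands(int M[][])
{
// Make a boolean array to mark visited cells.
// Initially all cells are unvisited (false by default)
boolean visited[][] = new boolean[ROW][COL];
int count = 0;
for (int i = 0; i < ROW; ++i)
for (int j = 0; j < COL; ++j)
if (M[i][j] == 1 && !visited[i][j])
{
// A new island is found: visit all of its cells and count it
DFS(M, i, j, visited);
++count;
}
return count;
}
// Driver method
public static void main (String[] args)
{
int M[][] = new int[][] { {1, 1, 0, 0, 0},
{0, 1, 0, 0, 1},
{1, 0, 0, 1, 1},
{0, 0, 0, 0, 0},
{1, 0, 1, 0, 1}
};
Islands I = new Islands();
System.out.println("Number of islands is: " + I.countIslands(M));
}
}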
Output:
Number of islands is: 5
Count all possible walks from a source to a destination with exactly k edges
Given a directed graph and two vertices u and v in it, count all possible walks from u to v with exactly k edges on the walk.
The graph is given as adjacency matrix representation where value of graph[i][j] as 1 indicates that there is an edge from vertex i to vertex j and a
value 0 indicates no edge from i to j.
For example, consider the following graph. Let source u be vertex 0, destination v be 3 and k be 2. The output should be 2, as there are two walks
from 0 to 3 with exactly 2 edges. The walks are {0, 2, 3} and {0, 1, 3}.
A simple solution is to start from u, go to all adjacent vertices and recur for adjacent vertices with k as k-1, source as adjacent vertex and
destination as v. Following is C++ implementation of this simple solution.
C++
// C++ program to count walks from u to v with exactly k edges
#include <iostream>
using namespace std;
// Number of vertices in the graph
#define V 4
// A naive recursive function to count walks from u to v with k edges
int countwalks(int graph[][V], int u, int v, int k)
{
// Base cases
if (k == 0 && u == v)
return 1;
if (k == 1 && graph[u][v]) return 1;
if (k <= 0)
return 0;
// Initialize result
int count = 0;
// Go to all adjacents of u and recur
for (int i = 0; i < V; i++)
if (graph[u][i] == 1) // Check if is adjacent of u
count += countwalks(graph, i, v, k-1);
return count;
}
// driver program to test above function
int main()
{
/* Let us create the graph shown in above diagram*/
int graph[V][V] = { {0, 1, 1, 1},
{0, 0, 0, 1},
{0, 0, 0, 1},
{0, 0, 0, 0}
};
int u = 0, v = 3, k = 2;
cout << countwalks(graph, u, v, k);
return 0;
}
Java
// Java program to count walks from u to v with exactly k edges
import java.util.*;
import java.lang.*;
import java.io.*;
class KPaths
{
static final int V = 4; //Number of vertices
// A naive recursive function to count walks from u
// to v with k edges
int countwalks(int graph[][], int u, int v, int k)
{
// Base cases
if (k == 0 && u == v)
return 1;
if (k == 1 && graph[u][v] == 1) return 1;
if (k <= 0)
return 0;
// Initialize result
int count = 0;
// Go to all adjacents of u and recur
for (int i = 0; i < V; i++)
if (graph[u][i] == 1) // Check if is adjacent of u
count += countwalks(graph, i, v, k-1);
return count;
}
// Driver method
public static void main (String[] args) throws java.lang.Exception
{
/* Let us create the graph shown in above diagram*/
int graph[][] =new int[][] { {0, 1, 1, 1},
{0, 0, 0, 1},
{0, 0, 0, 1},
{0, 0, 0, 0}
};
int u = 0, v = 3, k = 2;
KPaths p = new KPaths();
System.out.println(p.countwalks(graph, u, v, k));
}
}//Contributed by Aakash Hasija
The worst case time complexity of the above function is O(V^k), where V is the number of vertices in the given graph. We can simply analyze the
time complexity by drawing the recursion tree. The worst case occurs for a complete graph. In the worst case, every internal node of the recursion tree would have
exactly V children.
We can optimize the above solution using Dynamic Programming. The idea is to build a 3D table where first dimension is source, second
dimension is destination, third dimension is number of edges from source to destination, and the value is count of walks. Like other Dynamic
Programming problems, we fill the 3D table in bottom up manner.
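The bottom-up table described above can be sketched as follows. This is a minimal sketch, assuming std::vector storage in place of a fixed-size array and long long counts; the function name countWalksDP is a hypothetical name chosen here so that it does not clash with countwalks in the listings.

```cpp
#include <cassert>
#include <vector>
using namespace std;

// count[i][j][e] = number of walks from i to j with exactly e edges.
long long countWalksDP(const vector<vector<int>> &graph, int u, int v, int k) {
    int n = graph.size();
    // Precondition: valid endpoints and a non-negative edge count.
    assert(k >= 0 && u >= 0 && u < n && v >= 0 && v < n);
    vector<vector<vector<long long>>> count(
        n, vector<vector<long long>>(n, vector<long long>(k + 1, 0)));
    // Base case: a walk with 0 edges exists only from a vertex to itself.
    for (int i = 0; i < n; ++i)
        count[i][i][0] = 1;
    // A walk of e edges from i to j is an edge i-a followed by a walk of
    // e-1 edges from a to j, so sum over all neighbours a of i.
    for (int e = 1; e <= k; ++e)
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < n; ++j)
                for (int a = 0; a < n; ++a)
                    if (graph[i][a] != 0)
                        count[i][j][e] += count[a][j][e - 1];
    return count[u][v][k];
}
```

Starting the recurrence at e = 1 folds the "single edge" base case into the general step, since count[a][j][0] is 1 only when a equals j. For the example graph with u = 0, v = 3 and k = 2, the function returns 2.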
C++
// C++ program to count walks from u to v with exactly k edges
#include <iostream>
using namespace std;
// Number of vertices in the graph
#define V 4
// A Dynamic programming based function to count walks from u
// to v with k edges
int countwalks(int graph[][V], int u, int v, int k)
{
// Table to be filled up using DP. The value count[i][j][e] will
// store count of possible walks from i to j with exactly e edges
int count[V][V][k+1];
Java
// Java program to count walks from u to v with exactly k edges
import java.util.*;
import java.lang.*;
import java.io.*;
class KPaths
{
static final int V = 4; //Number of vertices
// A Dynamic programming based function to count walks from u
// to v with k edges
int countwalks(int graph[][], int u, int v, int k)
{
// Table to be filled up using DP. The value count[i][j][e]
// will store count of possible walks from i to j with
// exactly e edges
int count[][][] = new int[V][V][k+1];
// Loop for number of edges from 0 to k
for (int e = 0; e <= k; e++)
{
for (int i = 0; i < V; i++) // for source
{
for (int j = 0; j < V; j++) // for destination
{
// initialize value
count[i][j][e] = 0;
// from base cases
if (e == 0 && i == j)
count[i][j][e] = 1;
if (e == 1 && graph[i][j]!=0)
count[i][j][e] = 1;
// go to adjacent only when number of edges
// is more than 1
if (e > 1)
{
for (int a = 0; a < V; a++) // adjacent of i
if (graph[i][a]!=0)
count[i][j][e] += count[a][j][e-1];
}
}
}
}
return count[u][v][k];
}
// Driver method
public static void main (String[] args) throws java.lang.Exception
{
/* Let us create the graph shown in above diagram*/
int graph[][] =new int[][] { {0, 1, 1, 1},
{0, 0, 0, 1},
{0, 0, 0, 1},
{0, 0, 0, 0}
};
int u = 0, v = 3, k = 2;
KPaths p = new KPaths();
System.out.println(p.countwalks(graph, u, v, k));
}
}//Contributed by Aakash Hasija
Time complexity of the above DP based solution is O(V^3 k), which is much better than the naive solution.
We can also use Divide and Conquer to solve the above problem in O(V^3 Log k) time. The count of walks of length k from u to v is the [u][v]th
entry in (graph[V][V])^k. We can calculate the power by doing O(Log k) multiplications, using the divide and conquer technique to calculate power.
A multiplication between two matrices of size V x V takes O(V^3) time. Therefore the overall time complexity of this method is O(V^3 Log k).
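The matrix-power approach can be sketched as follows. This is a minimal sketch, assuming the iterative form of repeated squaring and that the counts fit in long long; the names Matrix, multiply, power and countWalksPower are hypothetical and do not come from the original article.

```cpp
#include <cassert>
#include <vector>
using namespace std;

using Matrix = vector<vector<long long>>;

// Plain O(V^3) product of two square matrices of the same size.
Matrix multiply(const Matrix &a, const Matrix &b) {
    int n = a.size();
    Matrix c(n, vector<long long>(n, 0));
    for (int i = 0; i < n; ++i)
        for (int x = 0; x < n; ++x)
            for (int j = 0; j < n; ++j)
                c[i][j] += a[i][x] * b[x][j];
    return c;
}

// Computes base^k with O(Log k) multiplications by repeated squaring.
Matrix power(Matrix base, int k) {
    assert(k >= 0); // precondition: non-negative exponent
    int n = base.size();
    Matrix result(n, vector<long long>(n, 0));
    for (int i = 0; i < n; ++i)
        result[i][i] = 1; // identity matrix = base^0
    while (k > 0) {
        if (k & 1)
            result = multiply(result, base); // use this bit of k
        base = multiply(base, base);         // square for the next bit
        k >>= 1;
    }
    return result;
}

// The [u][v] entry of graph^k is the number of walks from u to v with k edges.
long long countWalksPower(const Matrix &graph, int u, int v, int k) {
    return power(graph, k)[u][v];
}
```

For the example graph with u = 0, v = 3 and k = 2, countWalksPower returns 2, agreeing with the recursive and DP solutions.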
C++
// A C++ program to check if a given directed graph is Eulerian or not
#include<iostream>
#include <list>
#define CHARS 26
using namespace std;
// A class that represents a directed graph
class Graph
{
int V; // No. of vertices
list<int> *adj; // A dynamic array of adjacency lists
int *in; // In-degree of every vertex
public:
// Constructor and destructor
Graph(int V);
~Graph() { delete [] adj; delete [] in; }
// function to add an edge to graph
void addEdge(int v, int w) { adj[v].push_back(w); (in[w])++; }
// Method to check if this graph is Eulerian or not
bool isEulerianCycle();
// Method to check if all non-zero degree vertices are connected
bool isSC();
// Function to do DFS starting from v. Used in isConnected();
void DFSUtil(int v, bool visited[]);
Graph getTranspose();
};
Graph::Graph(int V)
{
this->V = V;
adj = new list<int>[V];
in = new int[V];
for (int i = 0; i < V; i++)
in[i] = 0;
}
/* This function returns true if the directed graph has an eulerian
cycle, otherwise returns false */
bool Graph::isEulerianCycle()
{
Java
// A Java program to check if a given directed graph is Eulerian or not
import java.io.*;
import java.util.*;
import java.util.LinkedList;
// This class represents a directed graph using adjacency list
class Graph
{
private int V; // No. of vertices
private LinkedList<Integer> adj[];//Adjacency List
private int in[]; //maintaining in degree
//Constructor
Graph(int v)
{
V = v;
adj = new LinkedList[v];
in = new int[V];
for (int i=0; i<v; ++i)
{
adj[i] = new LinkedList();
in[i] = 0;
}
}
//Function to add an edge into the graph
void addEdge(int v,int w)
{
adj[v].add(w);
in[w]++;
}
// A recursive function to print DFS starting from v
void DFSUtil(int v,Boolean visited[])
{
// Mark the current node as visited
visited[v] = true;
int n;
// Recur for all the vertices adjacent to this vertex
Iterator<Integer> i =adj[v].iterator();
while (i.hasNext())
{
n = i.next();
if (!visited[n])
DFSUtil(n,visited);
}
}
// Function that returns reverse (or transpose) of this graph
Graph getTranspose()
{
Graph g = new Graph(V);
for (int v = 0; v < V; v++)
{
// Recur for all the vertices adjacent to this vertex
Iterator<Integer> i = adj[v].listIterator();
while (i.hasNext())
{
g.adj[i.next()].add(v);
(g.in[v])++;
}
}
return g;
}
// The main function that returns true if graph is strongly
// connected
Boolean isSC()
{
// Step 1: Mark all the vertices as not visited (For
// first DFS)
Boolean visited[] = new Boolean[V];
for (int i = 0; i < V; i++)
visited[i] = false;
// Step 2: Do DFS traversal starting from first vertex.
DFSUtil(0, visited);
// If DFS traversal doesn't visit all vertices, then return false.
for (int i = 0; i < V; i++)
if (visited[i] == false)
return false;
// Step 3: Create a reversed graph
Graph gr = getTranspose();
// Step 4: Mark all the vertices as not visited (For second DFS)
for (int i = 0; i < V; i++)
visited[i] = false;
// Step 5: Do DFS for reversed graph starting from first vertex.
// Staring Vertex must be same starting point of first DFS
gr.DFSUtil(0, visited);
// If all vertices are not visited in second DFS, then
// return false
for (int i = 0; i < V; i++)
if (visited[i] == false)
return false;
return true;
}
/* This function returns true if the directed graph has an eulerian
cycle, otherwise returns false */
Boolean isEulerianCycle()
{
// Check if all non-zero degree vertices are connected
if (isSC() == false)
return false;
// Check if in degree and out degree of every vertex is same
for (int i = 0; i < V; i++)
if (adj[i].size() != in[i])
return false;
return true;
}
public static void main (String[] args) throws java.lang.Exception
{
Graph g = new Graph(5);
g.addEdge(1, 0);
g.addEdge(0, 2);
g.addEdge(2, 1);
g.addEdge(0, 3);
g.addEdge(3, 4);
g.addEdge(4, 0);
if (g.isEulerianCycle())
Time complexity of the above implementation is O(V + E), as Kosaraju's algorithm takes O(V + E) time. After running Kosaraju's algorithm we
traverse all vertices and compare in-degree with out-degree, which takes O(V) time.
See following as an application of this.
Find if the given array of strings can be chained to form a circle.
Biconnected Components
A biconnected component is a maximal biconnected subgraph.
Biconnected graphs are discussed in a separate article. In this article, we will see how to find the biconnected components of a graph using the algorithm by John
Hopcroft and Robert Tarjan.
C++
// A C++ program to find biconnected components in a given undirected graph
#include<iostream>
#include <list>
#include <stack>
#define NIL -1
using namespace std;
int count = 0;
class Edge
{
public:
int u;
int v;
Edge(int u, int v);
};
Edge::Edge(int u, int v)
{
this->u = u;
this->v = v;
}
// A class that represents a directed graph
class Graph
{
int V; // No. of vertices
int E; // No. of edges
list<int> *adj; // A dynamic array of adjacency lists
// A Recursive DFS based function used by BCC()
void BCCUtil(int u, int disc[], int low[],
list<Edge> *st, int parent[]);
public:
Graph(int V); // Constructor
void addEdge(int v, int w); // function to add an edge to graph
void BCC(); // prints biconnected components
};
Graph::Graph(int V)
{
this->V = V;
this->E = 0;
Java
// A Java program to find biconnected components in a given
// undirected graph
import java.io.*;
import java.util.*;
// This class represents a directed graph using adjacency
// list representation
class Graph
{
private int V, E; // No. of vertices & Edges respectively
private LinkedList<Integer> adj[]; // Adjacency List
// Count is number of biconnected components. time is
// used to find discovery times
static int count = 0, time = 0;
class Edge
{
int u;
int v;
Edge(int u, int v)
{
this.u = u;
this.v = v;
}
};
//Constructor
Graph(int v)
{
V = v;
E = 0;
adj = new LinkedList[v];
for (int i=0; i<v; ++i)
adj[i] = new LinkedList();
}
//Function to add an edge into the graph
void addEdge(int v,int w)
{
adj[v].add(w);
E++;
}
// A recursive function that finds and prints biconnected
// components using DFS traversal
// u --> The vertex to be visited next
// disc[] --> Stores discovery times of visited vertices
// low[] --> earliest visited vertex (the vertex with minimum
//           discovery time) that can be reached from subtree
//           rooted with current vertex
// st --> To store visited edges
void BCCUtil(int u, int disc[], int low[], LinkedList<Edge>st,
int parent[])
{
// Initialize discovery time and low value
disc[u] = low[u] = ++time;
int children = 0;
// Go through all vertices adjacent to this
Iterator<Integer> it = adj[u].iterator();
while (it.hasNext())
{
int v = it.next(); // v is current adjacent of 'u'
// If v is not visited yet, then recur for it
if (disc[v] == -1)
{
children++;
parent[v] = u;
// store the edge in stack
st.add(new Edge(u,v));
BCCUtil(v, disc, low, st, parent);
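// Case 1 (tree edge): after the DFS of 'v' completes, update
// low[u] if the subtree rooted at 'v' reaches an earlier vertex
if (low[u] > low[v])
low[u] = low[v];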
// If u is an articulation point,
// pop all edges from stack till u -- v
if ( (disc[u] == 1 && children > 1) ||
(disc[u] > 1 && low[v] >= disc[u]) )
{
while (st.getLast().u != u || st.getLast().v != v)
{
System.out.print(st.getLast().u + "--" +
st.getLast().v + " ");
st.removeLast();
}
System.out.println(st.getLast().u + "--" +
st.getLast().v + " ");
st.removeLast();
count++;
}
}
// Update low value of 'u' only if 'v' is still in stack
// (i.e. it's a back edge, not cross edge).
// Case 2 -- per Strongly Connected Components Article
Output:
4--2 3--4
8--9
8--5 7--8
6--0 5--6
10--11
Above are
We have discussed Kosaraju's algorithm for strongly connected components. The previously discussed algorithm requires two DFS traversals of a
graph. In this post, Tarjan's algorithm is discussed, which requires only one DFS traversal.
Tarjan's algorithm is based on the following facts:
1. DFS search produces a DFS tree/forest.
2. Strongly Connected Components form subtrees of the DFS tree.
3. If we can find the head of such a subtree, we can print/store all the nodes in that subtree (including the head), and that will be one SCC.
4. There is no back edge from one SCC to another (there can be cross edges, but cross edges will not be used while processing the graph).
To find the head of an SCC, we calculate the disc and low arrays (as done for articulation points, bridges, and biconnected components). As discussed in the
previous posts, low[u] indicates the earliest visited vertex (the vertex with minimum discovery time) that can be reached from the subtree rooted at u. A
node u is a head if disc[u] = low[u].
Disc and Low Values
Strongly connected components relate to directed graphs only, but Disc and Low values relate to both directed and undirected graphs, so in the above
picture we have taken an undirected graph.
The above figure shows a graph and one of its DFS trees (there could be different DFS trees on the same graph depending on the order in which
edges are traversed).
In the DFS tree, continuous arrows are tree edges and dashed arrows are back edges.
Disc and Low values are shown in the figure for every node as (Disc/Low).
Disc: This is the time at which a node is first visited during the DFS traversal. For nodes A, B, C, ..., J in the DFS tree, the Disc values are 1, 2, 3, ..., 10.
Low: In the DFS tree, tree edges take us forward, from an ancestor node to one of its descendants. For example, from node C, tree edges can take us
to node G, node I, etc. Back edges take us backward, from a descendant node to one of its ancestors. For example, from node G, back
edges take us to E or C. If we look at tree and back edges together, we can see that if we start a traversal from one node, we may go
down the tree via tree edges and then go up via back edges. For example, from node E, we can go down to G and then go up to C. Similarly,
from E, we can go down to I or J and then go up to F. The Low value of a node tells the topmost reachable ancestor (the one with the minimum possible Disc
value) via the subtree of that node. So for any node, the Low value is initially equal to its Disc value (a node is an ancestor of itself). Then we look into its
subtree and see whether any node there can take us to one of its ancestors. If there are multiple back edges in the subtree that take us to different
ancestors, we take the one with the minimum Disc value (i.e., the topmost one). If we look at node F, it has two subtrees. The subtree with node G
takes us to E and C. The other subtree takes us back to F only. Here the topmost ancestor that F can reach is C, so the Low value of F is 3 (the
Disc value of C).
Based on the above discussion, it should be clear that the Low values of B, C, and D are 1 (as A is the topmost node that B, C, and D can reach). In
the same way, the Low values of E, F, and G are 3, and the Low values of H, I, and J are 6.
For any node u, when DFS starts, Low is first set to its Disc value.
Then DFS is performed on each of its children v one by one. The Low value of u can change in two cases:
Case 1 (Tree Edge): If node v is not visited already, then after the DFS of v is complete, low[u] is updated to the minimum of low[u] and low[v].
low[u] = min(low[u], low[v]);
Case 2 (Back Edge): When child v is already visited, low[u] is updated to the minimum of low[u] and disc[v].
low[u] = min(low[u], disc[v]);
In case two, can we take low[v] instead of disc[v]? The answer is NO. If you can work out why the answer is NO, you have probably understood the Low and
Disc concept.
Same Low and Disc values help to solve other graph problems like articulation point, bridge and biconnected component.
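As one illustration of reusing these values, the two update cases can be sketched for bridge finding in an undirected graph. This is a minimal sketch and not the article's implementation: the names LowLink, addEdge, dfs, run and bridges are assumptions made for this example. An edge (u, v) of the DFS tree is reported as a bridge when low[v] > disc[u], i.e. when nothing in the subtree of v reaches u or anything above it.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper (not from the article): computes disc[] and low[]
// for an undirected graph with one DFS and collects its bridges.
struct LowLink {
    int n, timer = 0;
    std::vector<std::vector<int>> adj;
    std::vector<int> disc, low;
    std::vector<std::pair<int, int>> bridges;

    explicit LowLink(int vertices)
        : n(vertices), adj(vertices), disc(vertices, -1), low(vertices, -1) {
        assert(vertices >= 0);
    }

    void addEdge(int u, int v) { adj[u].push_back(v); adj[v].push_back(u); }

    void dfs(int u, int parent) {
        disc[u] = low[u] = ++timer;      // Low starts equal to Disc
        bool skippedParent = false;      // ignore one copy of the edge back to the parent
        for (int v : adj[u]) {
            if (v == parent && !skippedParent) { skippedParent = true; continue; }
            if (disc[v] == -1) {
                dfs(v, u);
                low[u] = std::min(low[u], low[v]);   // Case 1: tree edge
                if (low[v] > disc[u]) bridges.push_back({u, v});
            } else {
                low[u] = std::min(low[u], disc[v]);  // Case 2: back edge
            }
        }
    }

    // Runs the DFS from every unvisited vertex, so a forest is handled too.
    void run() {
        for (int u = 0; u < n; ++u)
            if (disc[u] == -1) dfs(u, -1);
    }
};
```

A caller would build the graph with addEdge, call run(), and then read disc, low and bridges. Because each vertex and edge is handled a constant number of times, the sketch runs in O(V + E).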
To track the subtree rooted at a head, we can use a stack (keep pushing nodes while visiting). When a head node is found, pop all nodes from the stack until
the head itself is popped.
To make sure we don't consider cross edges, when we reach a node which is already visited, we should process the visited node only if it is
present in the stack; otherwise we ignore the node.
Following is a C++ implementation of Tarjan's algorithm to print all SCCs.
// A C++ program to find strongly connected components in a given
// directed graph using Tarjan's algorithm (single DFS)
#include<iostream>
#include <list>
#include <stack>
#define NIL -1
using namespace std;
// A class that represents a directed graph
class Graph
{
int V;
// No. of vertices
list<int> *adj;
// A dynamic array of adjacency lists
// A Recursive DFS based function used by SCC()
void SCCUtil(int u, int disc[], int low[],
stack<int> *st, bool stackMember[]);
public:
Graph(int V); // Constructor
void addEdge(int v, int w); // function to add an edge to graph
void SCC();
// prints strongly connected components
};
Graph::Graph(int V)
{
this->V = V;
adj = new list<int>[V];
}
void Graph::addEdge(int v, int w)
{
adj[v].push_back(w);
}
// A recursive function that finds and prints strongly connected
// components using DFS traversal
// u --> The vertex to be visited next
// disc[] --> Stores discovery times of visited vertices
// low[] --> earliest visited vertex (the vertex with minimum
//           discovery time) that can be reached from subtree
//           rooted with current vertex
// *st --> To store all the connected ancestors (could be part
//         of SCC)
// stackMember[] --> bit/index array for faster check whether
//                   a node is in stack
void Graph::SCCUtil(int u, int disc[], int low[], stack<int> *st,
bool stackMember[])
{
// A static variable is used for simplicity, we can avoid use
// of static variable by passing a pointer.
g1.addEdge(3, 4);
g1.SCC();
cout << "\nSCCs in second graph \n";
Graph g2(4);
g2.addEdge(0, 1);
g2.addEdge(1, 2);
g2.addEdge(2, 3);
g2.SCC();
cout << "\nSCCs in third graph \n";
Graph g3(7);
g3.addEdge(0, 1);
g3.addEdge(1, 2);
g3.addEdge(2, 0);
g3.addEdge(1, 3);
g3.addEdge(1, 4);
g3.addEdge(1, 6);
g3.addEdge(3, 5);
g3.addEdge(4, 5);
g3.SCC();
cout << "\nSCCs in fourth graph \n";
Graph g4(11);
g4.addEdge(0,1);g4.addEdge(0,3);
g4.addEdge(1,2);g4.addEdge(1,4);
g4.addEdge(2,0);g4.addEdge(2,6);
g4.addEdge(3,2);
g4.addEdge(4,5);g4.addEdge(4,6);
g4.addEdge(5,6);g4.addEdge(5,7);g4.addEdge(5,8);g4.addEdge(5,9);
g4.addEdge(6,4);
g4.addEdge(7,9);
g4.addEdge(8,9);
g4.addEdge(9,8);
g4.SCC();
cout << "\nSCCs in fifth graph \n";
Graph g5(5);
g5.addEdge(0,1);
g5.addEdge(1,2);
g5.addEdge(2,3);
g5.addEdge(2,4);
g5.addEdge(3,0);
g5.addEdge(4,2);
g5.SCC();
return 0;
}
Output:
SCCs in first graph
4
3
1 2 0
SCCs in second graph
3
2
1
0
SCCs in third graph
5
3
4
6
2 1 0
SCCs in fourth graph
8 9
7
5 4 6
3 2 1 0
10
SCCs in fifth graph
4 3 2 1 0
Time Complexity: The above algorithm is essentially a single DFS, which takes O(V+E) for a graph represented using adjacency lists.
References:
http://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm
http://www.ics.uci.edu/~eppstein/161/960220.html#sca
Following are C++ and Java implementations of the above Greedy Algorithm.
C++
// A C++ program to implement greedy algorithm for graph coloring
#include <iostream>
#include <list>
using namespace std;
// A class that represents an undirected graph
class Graph
{
int V;
// No. of vertices
list<int> *adj;
// A dynamic array of adjacency lists
public:
// Constructor and destructor
Graph(int V) { this->V = V; adj = new list<int>[V]; }
~Graph()
{ delete [] adj; }
// function to add an edge to graph
void addEdge(int v, int w);
// Prints greedy coloring of the vertices
void greedyColoring();
};
void Graph::addEdge(int v, int w)
{
adj[v].push_back(w);
adj[w].push_back(v); // Note: the graph is undirected
}
// Assigns colors (starting from 0) to all vertices and prints
// the assignment of colors
void Graph::greedyColoring()
{
int result[V];
// Assign the first color to first vertex
result[0] = 0;
// Initialize remaining V-1 vertices as unassigned
for (int u = 1; u < V; u++)
result[u] = -1; // no color is assigned to u
// A temporary array to store the available colors. True
// value of available[cr] would mean that the color cr is
// assigned to one of its adjacent vertices
bool available[V];
for (int cr = 0; cr < V; cr++)
available[cr] = false;
// Assign colors to remaining V-1 vertices
for (int u = 1; u < V; u++)
{
// Process all adjacent vertices and flag their colors
// as unavailable
list<int>::iterator i;
for (i = adj[u].begin(); i != adj[u].end(); ++i)
if (result[*i] != -1)
available[result[*i]] = true;
Java
// A Java program to implement greedy algorithm for graph coloring
import java.io.*;
import java.util.*;
import java.util.LinkedList;
// This class represents an undirected graph using adjacency list
class Graph
{
private int V; // No. of vertices
private LinkedList<Integer> adj[]; //Adjacency List
//Constructor
Graph(int v)
{
V = v;
adj = new LinkedList[v];
for (int i=0; i<v; ++i)
adj[i] = new LinkedList();
}
//Function to add an edge into the graph
void addEdge(int v,int w)
{
adj[v].add(w);
adj[w].add(v); //Graph is undirected
}
// Assigns colors (starting from 0) to all vertices and
// prints the assignment of colors
void greedyColoring()
{
Output:
Coloring of graph 1
Vertex 0 ---> Color 0
Vertex 1 ---> Color 1
Vertex 2 ---> Color 2
Vertex 3 ---> Color 0
Vertex 4 ---> Color 1
Coloring of graph 2
Vertex 0 ---> Color 0
Vertex 1 ---> Color 1
Vertex 2 ---> Color 2
Vertex 3 ---> Color 0
Vertex 4 ---> Color 3
So the order in which the vertices are picked is important. Many people have suggested different ways to find an ordering that works better than the
basic algorithm on average. The most common is the Welsh-Powell algorithm, which considers vertices in descending order of degree.
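The effect of the ordering can be tried with a small sketch that runs the basic first-fit coloring over the vertices sorted by descending degree. This is a simplified illustration of the degree-ordering idea, not a reference implementation of Welsh-Powell; the function name colorByDegree and the vector-based adjacency list are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Illustrative sketch: first-fit greedy coloring that visits vertices in
// descending order of degree. adj is an undirected adjacency list.
std::vector<int> colorByDegree(const std::vector<std::vector<int>>& adj) {
    const int n = static_cast<int>(adj.size());
    std::vector<int> order(n);
    std::iota(order.begin(), order.end(), 0);
    // stable_sort keeps lower-numbered vertices first among equal degrees
    std::stable_sort(order.begin(), order.end(), [&](int a, int b) {
        return adj[a].size() > adj[b].size();
    });

    std::vector<int> color(n, -1);          // -1 means "not colored yet"
    std::vector<bool> used(n + 1, false);   // colors taken by neighbors of u
    for (int u : order) {
        for (int v : adj[u])
            if (color[v] != -1) used[color[v]] = true;
        int c = 0;
        while (used[c]) ++c;                // smallest color not used nearby
        assert(c <= static_cast<int>(adj[u].size()));  // at most deg(u) colors are blocked
        color[u] = c;
        for (int v : adj[u])                // reset the flags for the next vertex
            if (color[v] != -1) used[color[v]] = false;
    }
    return color;
}
```

The returned vector maps each vertex to its color, numbered from 0. Sorting costs O(V log V) on top of the O(V + E) coloring pass; the ordering is a heuristic and does not guarantee an optimal coloring.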
How does the basic algorithm guarantee an upper bound of d+1?
Here d is the maximum degree in the given graph. Since d is the maximum degree, a vertex cannot be adjacent to more than d vertices. When we
color a vertex, at most d colors could have already been used by its adjacent vertices. To color this vertex, we need to pick the smallest numbered color
that is not used by the adjacent vertices. If colors are numbered 1, 2, ..., then the value of such a smallest number must be between 1 and d+1
(note that at most d numbers are already picked by adjacent vertices).
This can also be proved using induction. See this video lecture for proof.
We will soon be discussing some interesting facts about chromatic number and graph coloring.
For example, consider the graph shown in figure on right side. A TSP tour in the graph is 1-2-4-3-1. The cost of the tour is 10+25+30+15 which
is 80.
The problem is a famous NP-hard problem. There is no known polynomial-time solution for this problem.
Following are different solutions for the traveling salesman problem.
Naive Solution:
1) Consider city 1 as the starting and ending point.
2) Generate all (n-1)! permutations of cities.
3) Calculate cost of every permutation and keep track of minimum cost permutation.
4) Return the permutation with minimum cost.
Time Complexity: Θ(n!)
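The naive steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name tspBruteForce is made up for this example, cities are numbered from 0 rather than 1, and dist is a full distance matrix.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <numeric>
#include <vector>

// Illustrative brute force: city 0 is fixed as start and end, and every
// ordering of the remaining n-1 cities is tried.
int tspBruteForce(const std::vector<std::vector<int>>& dist) {
    const int n = static_cast<int>(dist.size());
    assert(n >= 2);
    std::vector<int> perm(n - 1);
    std::iota(perm.begin(), perm.end(), 1);   // cities 1..n-1, in sorted order
    int best = INT_MAX;
    do {
        int cost = 0, prev = 0;
        for (int city : perm) {               // walk the tour in this order
            cost += dist[prev][city];
            prev = city;
        }
        cost += dist[prev][0];                // close the cycle
        best = std::min(best, cost);
    } while (std::next_permutation(perm.begin(), perm.end()));
    return best;
}
```

The sketch returns only the minimum cost; keeping a copy of perm whenever best improves would also recover the tour itself.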
Dynamic Programming:
Let the given set of vertices be {1, 2, 3, 4, ..., n}. Let us consider 1 as the starting and ending point of the output. For every other vertex i (other than 1), we
find the minimum cost path with 1 as the starting point, i as the ending point, and all vertices appearing exactly once. Let the cost of this path be
cost(i); the cost of the corresponding cycle would be cost(i) + dist(i, 1), where dist(i, 1) is the distance from i to 1. Finally, we return the minimum of
all [cost(i) + dist(i, 1)] values. This looks simple so far. Now the question is how to get cost(i).
To calculate cost(i) using Dynamic Programming, we need a recursive relation in terms of sub-problems. Let us define C(S, i)
to be the cost of the minimum cost path visiting each vertex in set S exactly once, starting at 1 and ending at i.
We start with all subsets of size 2 and calculate C(S, i) for each such subset S, then we calculate C(S, i) for all subsets S of size 3, and
so on. Note that 1 must be present in every subset.
If size of S is 2, then S must be {1, i},
C(S, i) = dist(1, i)
Else if size of S is greater than 2.
C(S, i) = min { C(S-{i}, j) + dist(j, i) } where j belongs to S, j != i and j != 1.
For a set of size n, we consider n-2 subsets, each of size n-1, such that none of the subsets contains the nth vertex.
Using the above recurrence relation, we can write a dynamic programming based solution. There are at most O(n*2^n) subproblems, and each one
takes linear time to solve. The total running time is therefore O(n^2*2^n). The time complexity is much less than O(n!), but still exponential. The space
required is also exponential. So this approach is also infeasible even for a slightly higher number of vertices.
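The recurrence can be sketched with subsets encoded as bitmasks. This is a minimal sketch under stated assumptions: the function name tspDP is made up for this example, cities are numbered from 0 instead of 1, and the table is filled by extending each solved state forward to one more city, which computes the same values as the recurrence written above.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <vector>

// Illustrative bitmask DP. C[mask][i] is the cheapest path that starts at
// city 0, visits exactly the cities in mask, and ends at city i.
int tspDP(const std::vector<std::vector<int>>& dist) {
    const int n = static_cast<int>(dist.size());
    assert(n >= 2 && n <= 16);               // 2^n states, so keep n small
    const int FULL = 1 << n;
    const int INF = INT_MAX / 2;
    std::vector<std::vector<int>> C(FULL, std::vector<int>(n, INF));
    for (int i = 1; i < n; ++i)
        C[1 | (1 << i)][i] = dist[0][i];     // base case: S = {0, i}

    for (int mask = 1; mask < FULL; ++mask) {
        if (!(mask & 1)) continue;           // city 0 must be in every subset
        for (int i = 1; i < n; ++i) {
            if (!(mask & (1 << i)) || C[mask][i] >= INF) continue;
            for (int k = 1; k < n; ++k) {    // extend the path to an unvisited city k
                if (mask & (1 << k)) continue;
                const int next = mask | (1 << k);
                C[next][k] = std::min(C[next][k], C[mask][i] + dist[i][k]);
            }
        }
    }

    int best = INF;
    for (int i = 1; i < n; ++i)              // close the cycle back to city 0
        best = std::min(best, C[FULL - 1][i] + dist[i][0]);
    return best;
}
```

The table has n*2^n entries and each entry is extended in O(n) time, which matches the O(n^2*2^n) time and exponential space discussed above.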
We will soon be discussing approximate algorithms for travelling salesman problem.
References:
http://www.lsi.upc.edu/~mjserna/docencia/algofib/P07/dynprog.pdf
http://www.cs.berkeley.edu/~vazirani/algorithms/chap6.pdf
In this case, the approximate algorithm produces the optimal tour, but it may not produce optimal tour in all cases.
How is this algorithm 2-approximate? The cost of the output produced by the above algorithm is never more than twice the cost of the best
possible output. Let us see how this is guaranteed by the above algorithm.
Let us define the term full walk to understand this. A full walk lists all vertices when they are first visited in preorder; it also lists vertices when they
are returned to after a subtree is visited in preorder. The full walk of the above tree would be 1-2-1-4-1-3-1.
Following are some important facts that prove the 2-approximateness.
1) The cost of the best possible Travelling Salesman tour is never less than the cost of the MST. (Removing any one edge from a tour leaves a spanning tree,
and by definition the MST is a minimum cost tree that connects all vertices.)
2) The total cost of the full walk is at most twice the cost of the MST (every edge of the MST is visited at most twice).
3) The cost of the output of the above algorithm is at most the cost of the full walk. In the above algorithm, we print the preorder walk as output. In the preorder walk,
two or more edges of the full walk are replaced with a single edge. For example, 2-1 and 1-4 are replaced by the single edge 2-4. So if the graph follows
the triangle inequality, then this is always true.
From the above three statements, we can conclude that the cost of the output produced by the approximate algorithm is never more than twice the
cost of the best possible solution.
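The MST-based tour analysed above can be sketched as follows. This is a minimal sketch under stated assumptions: the names approxTour and tourCost are made up for this example, vertices are numbered from 0, the MST is built with a simple O(V^2) version of Prim's algorithm, and dist is a symmetric matrix that satisfies the triangle inequality.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Illustrative sketch: build an MST with Prim's algorithm from vertex 0,
// then list the vertices in preorder and return to the start.
std::vector<int> approxTour(const std::vector<std::vector<double>>& dist,
                            double* mstCost = nullptr) {
    const int n = static_cast<int>(dist.size());
    assert(n >= 1);
    const double INF = std::numeric_limits<double>::infinity();
    std::vector<double> key(n, INF);
    std::vector<int> parent(n, -1);
    std::vector<bool> inTree(n, false);
    std::vector<std::vector<int>> children(n);
    key[0] = 0.0;
    double total = 0.0;
    for (int iter = 0; iter < n; ++iter) {
        int u = -1;                          // cheapest vertex not yet in the tree
        for (int v = 0; v < n; ++v)
            if (!inTree[v] && (u == -1 || key[v] < key[u])) u = v;
        inTree[u] = true;
        total += key[u];
        if (parent[u] != -1) children[parent[u]].push_back(u);
        for (int v = 0; v < n; ++v)
            if (!inTree[v] && dist[u][v] < key[v]) { key[v] = dist[u][v]; parent[v] = u; }
    }
    if (mstCost) *mstCost = total;

    std::vector<int> tour, stack{0};         // iterative preorder walk of the MST
    while (!stack.empty()) {
        int u = stack.back();
        stack.pop_back();
        tour.push_back(u);
        for (auto it = children[u].rbegin(); it != children[u].rend(); ++it)
            stack.push_back(*it);
    }
    tour.push_back(0);                       // return to the starting vertex
    return tour;
}

// Sums the edge costs along consecutive vertices of the tour.
double tourCost(const std::vector<std::vector<double>>& dist,
                const std::vector<int>& tour) {
    double c = 0.0;
    for (std::size_t i = 0; i + 1 < tour.size(); ++i)
        c += dist[tour[i]][tour[i + 1]];
    return c;
}
```

Passing a pointer for mstCost lets a caller compare tourCost(dist, tour) against twice the MST cost, which is the bound argued in the three facts above.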
We have discussed a very simple 2-approximate algorithm for the travelling salesman problem. There are other, better approximate algorithms for
the problem. For example, Christofides' algorithm is a 1.5-approximate algorithm. We will soon be discussing these algorithms as separate posts.
References:
Introduction to Algorithms 3rd Edition by Clifford Stein, Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest
http://www.personal.kent.edu/~rmuhamma/Algorithms/MyAlgorithms/AproxAlgor/TSP/tsp.htm
Naive Algorithm
Generate all possible configurations of vertices and print a configuration that satisfies the given constraints. There will be n! (n factorial)
configurations.
while there are untried configurations
{
generate the next configuration
if ( there are edges between two consecutive vertices of this
configuration and there is an edge from the last vertex to
the first )
{
print this configuration;
break;
}
}
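The pseudocode above can be sketched in C++ with std::next_permutation generating the configurations. This is a minimal illustration: the function name hamCycleNaive and the vector-based adjacency matrix are assumptions made for this example, and the function returns the cycle instead of printing it.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Illustrative brute force: returns one Hamiltonian cycle as a vertex
// order, or an empty vector if none exists. graph is an adjacency matrix.
std::vector<int> hamCycleNaive(const std::vector<std::vector<int>>& graph) {
    const int n = static_cast<int>(graph.size());
    assert(n >= 3);
    std::vector<int> perm(n);
    std::iota(perm.begin(), perm.end(), 0);   // first configuration: 0, 1, ..., n-1
    do {
        bool ok = true;
        // Check every consecutive pair, including last vertex -> first vertex
        for (int i = 0; i < n && ok; ++i)
            ok = graph[perm[i]][perm[(i + 1) % n]] != 0;
        if (ok) return perm;
    } while (std::next_permutation(perm.begin(), perm.end()));
    return {};
}
```

Each of the n! configurations takes O(n) time to check, so this is only practical for very small graphs.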
Backtracking Algorithm
Create an empty path array and add vertex 0 to it. Add other vertices, starting from the vertex 1. Before adding a vertex, check for whether it is
adjacent to the previously added vertex and not already added. If we find such a vertex, we add the vertex as part of the solution. If we do not
find a vertex then we return false.
Implementation of Backtracking solution
Following are implementations of the Backtracking solution.
C/C++
/* C/C++ program for solution of Hamiltonian Cycle problem
using backtracking */
#include<stdio.h>
// Number of vertices in the graph
#define V 5
void printSolution(int path[]);
/* A utility function to check if the vertex v can be added at
index 'pos' in the Hamiltonian Cycle constructed so far (stored
in 'path[]') */
bool isSafe(int v, bool graph[V][V], int path[], int pos)
{
/* Check if this vertex is an adjacent vertex of the previously
added vertex. */
if (graph [ path[pos-1] ][ v ] == 0)
return false;
/* Check if the vertex has already been included.
This step can be optimized by creating an array of size V */
for (int i = 0; i < pos; i++)
if (path[i] == v)
return false;
return true;
}
/* A recursive utility function to solve hamiltonian cycle problem */
bool hamCycleUtil(bool graph[V][V], int path[], int pos)
{
/* base case: If all vertices are included in Hamiltonian Cycle */
if (pos == V)
{
// And if there is an edge from the last included vertex to the
// first vertex
if ( graph[ path[pos-1] ][ path[0] ] == 1 )
return true;
else
return false;
}
// Try different vertices as a next candidate in Hamiltonian Cycle.
// We don't try for 0 as we included 0 as starting point in hamCycle()
for (int v = 1; v < V; v++)
{
/* Check if this vertex can be added to Hamiltonian Cycle */
if (isSafe(v, graph, path, pos))
{
path[pos] = v;
/* recur to construct rest of the path */
if (hamCycleUtil (graph, path, pos+1) == true)
return true;
/* If adding vertex v doesn't lead to a solution,
then remove it */
path[pos] = -1;
}
}
/* If no vertex can be added to Hamiltonian Cycle constructed so far,
then return false */
return false;
}
/* This function solves the Hamiltonian Cycle problem using Backtracking.
It mainly uses hamCycleUtil() to solve the problem. It returns false
if there is no Hamiltonian Cycle possible, otherwise return true and
prints the path. Please note that there may be more than one solutions,
this function prints one of the feasible solutions. */
bool hamCycle(bool graph[V][V])
{
int *path = new int[V];
for (int i = 0; i < V; i++)
path[i] = -1;
/* Let us put vertex 0 as the first vertex in the path. If there is
a Hamiltonian Cycle, then the path can be started from any point
of the cycle as the graph is undirected */
path[0] = 0;
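if ( hamCycleUtil(graph, path, 1) == false )
{
printf("\nSolution does not exist");
delete[] path; // release the path buffer before returning
return false;
}
printSolution(path);
delete[] path; // release the path buffer before returning
return true;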
}
/* A utility function to print solution */
void printSolution(int path[])
{
printf ("Solution Exists:"
" Following is one Hamiltonian Cycle \n");
for (int i = 0; i < V; i++)
printf(" %d ", path[i]);
// Let us print the first vertex again to show the complete cycle
printf(" %d ", path[0]);
printf("\n");
}
// driver program to test above function
int main()
{
/* Let us create the following graph
(0)--(1)--(2)
 |   / \   |
 |  /   \  |
 | /     \ |
(3)-------(4)
*/
bool graph1[V][V] = {{0, 1, 0, 1, 0},
{1, 0, 1, 1, 1},
{0, 1, 0, 0, 1},
{1, 1, 0, 0, 1},
{0, 1, 1, 1, 0},
};
// Print the solution
hamCycle(graph1);
/* Let us create the following graph
(0)--(1)--(2)
 |   / \   |
 |  /   \  |
 | /     \ |
(3)       (4)
*/
bool graph2[V][V] = {{0, 1, 0, 1, 0},
{1, 0, 1, 1, 1},
{0, 1, 0, 0, 1},
{1, 1, 0, 0, 0},
{0, 1, 1, 0, 0},
};
// Print the solution
hamCycle(graph2);
return 0;
}
Java
/* Java program for solution of Hamiltonian Cycle problem
using backtracking */
class HamiltonianCycle
{
final int V = 5;
int path[];
/* A utility function to check if the vertex v can be
added at index 'pos'in the Hamiltonian Cycle
constructed so far (stored in 'path[]') */
boolean isSafe(int v, int graph[][], int path[], int pos)
{
/* Check if this vertex is an adjacent vertex of
the previously added vertex. */
if (graph[path[pos - 1]][v] == 0)
return false;
/* Check if the vertex has already been included.
This step can be optimized by creating an array
of size V */
for (int i = 0; i < pos; i++)
if (path[i] == v)
return false;
return true;
}
/* A recursive utility function to solve hamiltonian
cycle problem */
boolean hamCycleUtil(int graph[][], int path[], int pos)
{
/* base case: If all vertices are included in
Hamiltonian Cycle */
if (pos == V)
{
// And if there is an edge from the last included
// vertex to the first vertex
if (graph[path[pos - 1]][path[0]] == 1)
return true;
else
return false;
}
// Try different vertices as a next candidate in
// Hamiltonian Cycle. We don't try for 0 as we
// included 0 as starting point in hamCycle()
for (int v = 1; v < V; v++)
{
/* Check if this vertex can be added to Hamiltonian
Cycle */
if (isSafe(v, graph, path, pos))
{
path[pos] = v;
/* recur to construct rest of the path */
if (hamCycleUtil(graph, path, pos + 1) == true)
return true;
/* If adding vertex v doesn't lead to a solution,
then remove it */
path[pos] = -1;
}
}
/* If no vertex can be added to Hamiltonian Cycle
constructed so far, then return false */
return false;
}
/* This function solves the Hamiltonian Cycle problem using
Backtracking. It mainly uses hamCycleUtil() to solve the
problem. It returns false if there is no Hamiltonian Cycle
possible, otherwise return true and prints the path.
Please note that there may be more than one solutions,
this function prints one of the feasible solutions. */
int hamCycle(int graph[][])
{
path = new int[V];
for (int i = 0; i < V; i++)
path[i] = -1;
/* Let us put vertex 0 as the first vertex in the path.
If there is a Hamiltonian Cycle, then the path can be
started from any point of the cycle as the graph is
undirected */
path[0] = 0;
if (hamCycleUtil(graph, path, 1) == false)
{
System.out.println("\nSolution does not exist");
return 0;
}
printSolution(path);
return 1;
}
/* A utility function to print solution */
void printSolution(int path[])
{
System.out.println("Solution Exists: Following" +
" is one Hamiltonian Cycle");
for (int i = 0; i < V; i++)
System.out.print(" " + path[i] + " ");
// Let us print the first vertex again to show the
// complete cycle
System.out.println(" " + path[0] + " ");
}
// driver program to test above function
public static void main(String args[])
{
HamiltonianCycle hamiltonian =
new HamiltonianCycle();
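/* Let us create the following graph
(0)--(1)--(2)
 |   / \   |
 |  /   \  |
 | /     \ |
(3)-------(4)
*/
int graph1[][] = {{0, 1, 0, 1, 0},
{1, 0, 1, 1, 1},
{0, 1, 0, 0, 1},
{1, 1, 0, 0, 1},
{0, 1, 1, 1, 0},
};
// Print the solution
hamiltonian.hamCycle(graph1);
/* Let us create the following graph
(0)--(1)--(2)
 |   / \   |
 |  /   \  |
 | /     \ |
(3)       (4)
*/
int graph2[][] = {{0, 1, 0, 1, 0},
{1, 0, 1, 1, 1},
{0, 1, 0, 0, 1},
{1, 1, 0, 0, 0},
{0, 1, 1, 0, 0},
};
// Print the solution
hamiltonian.hamCycle(graph2);
}
}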
The Vertex Cover Problem is a known NP-complete problem, i.e., there is no polynomial-time solution for it unless P = NP. There are approximate
polynomial-time algorithms to solve the problem, though. Following is a simple approximate algorithm adapted from the CLRS book.
Approximate Algorithm for Vertex Cover:
1) Initialize the result as {}
2) Consider a set of all edges in given graph. Let the set be E.
3) Do following while E is not empty
...a) Pick an arbitrary edge (u, v) from set E and add 'u' and 'v' to result
...b) Remove all edges from E which are either incident on u or v.
4) Return result
Following diagram taken from CLRS book shows execution of above approximate algorithm.
C++
// Program to print Vertex Cover of a given undirected graph
#include<iostream>
#include <list>
using namespace std;
// This class represents a undirected graph using adjacency list
class Graph
{
int V;
// No. of vertices
list<int> *adj; // Pointer to an array containing adjacency lists
public:
Graph(int V); // Constructor
void addEdge(int v, int w); // function to add an edge to graph
void printVertexCover(); // prints vertex cover
};
Graph::Graph(int V)
{
this->V = V;
adj = new list<int>[V];
}
void Graph::addEdge(int v, int w)
{
adj[v].push_back(w); // Add w to v's list.
adj[w].push_back(v); // Since the graph is undirected
}
// The function to print vertex cover
void Graph::printVertexCover()
{
// Initialize all vertices as not visited.
bool visited[V];
for (int i=0; i<V; i++)
visited[i] = false;
list<int>::iterator i;
// Consider all edges one by one
for (int u=0; u<V; u++)
{
// An edge is only picked when both visited[u] and visited[v]
// are false
if (visited[u] == false)
{
// Go through all adjacents of u and pick the first not
// yet visited vertex (We are basically picking an edge
// (u, v) from remaining edges.
for (i= adj[u].begin(); i != adj[u].end(); ++i)
{
int v = *i;
if (visited[v] == false)
{
// Add the vertices (u, v) to the result set.
// We make the vertex u and v visited so that
// all edges from/to them would be ignored
visited[v] = true;
visited[u] = true;
break;
}
}
}
}
// Print the vertex cover
for (int i=0; i<V; i++)
if (visited[i])
cout << i << " ";
}
// Driver program to test methods of graph class
int main()
{
// Create a graph given in the above diagram
Graph g(7);
g.addEdge(0, 1);
g.addEdge(0, 2);
g.addEdge(1, 3);
g.addEdge(3, 4);
g.addEdge(4, 5);
g.addEdge(5, 6);
g.printVertexCover();
return 0;
}
Java
// Java Program to print Vertex Cover of a given undirected graph
import java.io.*;
import java.util.*;
import java.util.LinkedList;
// This class represents an undirected graph using adjacency list
class Graph
{
private int V; // No. of vertices
// Array of lists for Adjacency List Representation
private LinkedList<Integer> adj[];
// Constructor
Graph(int v)
{
V = v;
adj = new LinkedList[v];
for (int i=0; i<v; ++i)
adj[i] = new LinkedList();
}
//Function to add an edge into the graph
void addEdge(int v, int w)
{
adj[v].add(w); // Add w to v's list.
adj[w].add(v); //Graph is undirected
}
// The function to print vertex cover
void printVertexCover()
{
// Initialize all vertices as not visited.
boolean visited[] = new boolean[V];
for (int i=0; i<V; i++)
visited[i] = false;
Iterator<Integer> i;
// Consider all edges one by one
for (int u=0; u<V; u++)
{
// An edge is only picked when both visited[u]
// and visited[v] are false
if (visited[u] == false)
{
// Go through all adjacents of u and pick the
// first not yet visited vertex (We are basically
// picking an edge (u, v) from remaining edges.
i = adj[u].iterator();
while (i.hasNext())
{
int v = i.next();
if (visited[v] == false)
{
// Add the vertices (u, v) to the result
// set. We make the vertex u and v visited
// so that all edges from/to them would
// be ignored
visited[v] = true;
visited[u] = true;
break;
}
}
}
}
// Print the vertex cover
for (int j=0; j<V; j++)
if (visited[j])
System.out.print(j+" ");
}
// Driver method
0 1 3 4 5 6
There is no known polynomial-time solution for this problem, as the problem is known to be NP-hard. There is a polynomial-time greedy
approximate algorithm that provides a solution which is never worse than twice the optimal solution. The greedy solution works
only if the distances between cities follow the triangle inequality (the distance between two points is never larger than the sum of the distances through a
third point).
The 2-Approximate Greedy Algorithm:
1) Choose the first center arbitrarily.
2) Choose remaining k-1 centers using the following criteria.
Let c1, c2, c3, ..., ci be the already chosen centers. Choose the
(i+1)th center by picking the city which is farthest from the already
selected centers, i.e., the point p which has the following value as maximum
Min[dist(p, c1), dist(p, c2), dist(p, c3), ..., dist(p, ci)]
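The two steps above can be sketched as a farthest-first selection. This is a minimal sketch under stated assumptions: the function name greedyKCenters is made up for this example, cities are indexed from 0, city 0 stands in for the arbitrary first center, and dist is a symmetric distance matrix.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Illustrative farthest-first sketch. Returns the indices of the k chosen
// centers in the order in which they were picked.
std::vector<int> greedyKCenters(const std::vector<std::vector<double>>& dist, int k) {
    const int n = static_cast<int>(dist.size());
    assert(k >= 1 && k <= n);
    std::vector<int> centers{0};             // step 1: arbitrary first center
    // nearest[p] = Min[dist(p, c1), ..., dist(p, ci)] over the chosen centers
    std::vector<double> nearest(dist[0]);
    while (static_cast<int>(centers.size()) < k) {
        // step 2: pick the city whose nearest center is farthest away
        const int far = static_cast<int>(
            std::max_element(nearest.begin(), nearest.end()) - nearest.begin());
        centers.push_back(far);
        for (int p = 0; p < n; ++p)
            nearest[p] = std::min(nearest[p], dist[far][p]);
    }
    return centers;
}
```

Maintaining the nearest array avoids recomputing the minimum over all chosen centers for every city, so the whole selection takes O(n*k) time.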
The following diagram illustrates the above algorithm.
b) This means that distances between all centers are also > 2OPT.
c) We have k + 1 points with distances > 2OPT between every pair.
d) Each point has a center of the optimal solution within distance <= OPT of it.
e) There exists a pair of points with the same center X in the optimal solution (pigeonhole principle: k optimal centers, k+1 points)
f) The distance between them is at most 2OPT (triangle inequality) which is a contradiction.
Source:
http://algo2.iti.kit.edu/vanstee/courses/kcenter.pdf
Ford-Fulkerson Algorithm
The following is the simple idea of the Ford-Fulkerson algorithm:
1) Start with initial flow as 0.
2) While there is an augmenting path from source to sink.
Add this path-flow to flow.
3) Return flow.
Time Complexity: The time complexity of the above algorithm is O(max_flow * E). We run a loop while there is an augmenting path. In the worst case,
we may add 1 unit of flow in every iteration, and finding each augmenting path takes O(E) time. Therefore the time complexity becomes O(max_flow * E).
How to implement the above simple algorithm?
Let us first define the concept of the Residual Graph, which is needed for understanding the implementation.
The Residual Graph of a flow network is a graph which indicates additional possible flow. If there is a path from source to sink in the residual graph, then
it is possible to add flow. Every edge of a residual graph has a value called residual capacity, which is equal to the original capacity of the edge minus
the current flow. Residual capacity is basically the current capacity of the edge.
Let us now talk about implementation details. Residual capacity is 0 if there is no edge between two vertices of the residual graph. We can initialize the
residual graph as the original graph, as there is no initial flow and initially residual capacity is equal to original capacity. To find an augmenting path, we
can do either a BFS or a DFS of the residual graph. We have used BFS in the implementation below. Using BFS, we can find out if there is a path from
source to sink. BFS also builds the parent[] array. Using the parent[] array, we traverse the found path and find the possible flow through this path
by finding the minimum residual capacity along the path. We later add the found path flow to the overall flow.
The important thing is that we need to update residual capacities in the residual graph. We subtract the path flow from all edges along the path and we add
the path flow along the reverse edges. We need to add the path flow along reverse edges because we may later need to send flow in the reverse direction (see
the following video for an example).
http://www.youtube.com/watch?v=-8MwfgB-lyM
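The residual update described above can be sketched in isolation. This is only an illustrative sketch: the helper name augmentAlongPath and the use of std::vector instead of a fixed-size matrix are assumptions, not part of the original program. It assumes parent[] already describes a valid path from s to t, as BFS would produce.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <vector>

// Hypothetical helper (not in the original program): pushes flow along the
// s-t path recorded in parent[] and returns the amount pushed.
// residual[u][v] holds the residual capacity of edge u->v.
int augmentAlongPath(std::vector<std::vector<int>>& residual,
                     const std::vector<int>& parent, int s, int t)
{
    assert(s != t);

    // Bottleneck: the smallest residual capacity on the path.
    int pathFlow = INT_MAX;
    for (int v = t; v != s; v = parent[v])
        pathFlow = std::min(pathFlow, residual[parent[v]][v]);

    // Forward edges lose capacity; reverse edges gain it, so that a later
    // augmenting path can cancel part of this flow if needed.
    for (int v = t; v != s; v = parent[v]) {
        int u = parent[v];
        residual[u][v] -= pathFlow;
        residual[v][u] += pathFlow;
    }
    return pathFlow;
}
```

For a path 0 -> 1 -> 2 with capacities 3 and 2, the bottleneck is 2, so the forward capacities drop by 2 and both reverse edges receive capacity 2.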
Following are C++ and Java implementations of Ford-Fulkerson algorithm. To keep things simple, graph is represented as a 2D matrix.
C++
// C++ program for implementation of Ford Fulkerson algorithm
#include <iostream>
#include <limits.h>
#include <string.h>
#include <queue>
using namespace std;
// Number of vertices in given graph
#define V 6
/* Returns true if there is a path from source 's' to sink 't' in
residual graph. Also fills parent[] to store the path */
bool bfs(int rGraph[V][V], int s, int t, int parent[])
{
// Create a visited array and mark all vertices as not visited
bool visited[V];
memset(visited, 0, sizeof(visited));
// Create a queue, enqueue source vertex and mark source vertex
// as visited
queue <int> q;
q.push(s);
visited[s] = true;
parent[s] = -1;
// Standard BFS Loop
while (!q.empty())
{
int u = q.front();
q.pop();
for (int v=0; v<V; v++)
{
if (visited[v]==false && rGraph[u][v] > 0)
{
q.push(v);
parent[v] = u;
visited[v] = true;
}
}
}
// If we reached sink in BFS starting from source, then return
// true, else false
return (visited[t] == true);
}
// Returns the maximum flow from s to t in the given graph
int fordFulkerson(int graph[V][V], int s, int t)
{
int u, v;
// Create a residual graph and fill the residual graph with
// given capacities in the original graph as residual capacities
// in residual graph
int rGraph[V][V]; // Residual graph where rGraph[i][j] indicates
// residual capacity of edge from i to j (if there
// is an edge. If rGraph[i][j] is 0, then there is not)
for (u = 0; u < V; u++)
for (v = 0; v < V; v++)
rGraph[u][v] = graph[u][v];
int parent[V]; // This array is filled by BFS and to store path
int max_flow = 0; // There is no flow initially
// Augment the flow while there is a path from source to sink
while (bfs(rGraph, s, t, parent))
{
// Find minimum residual capacity of the edges along the
// path filled by BFS. Or we can say find the maximum flow
// through the path found.
int path_flow = INT_MAX;
for (v=t; v!=s; v=parent[v])
{
u = parent[v];
path_flow = min(path_flow, rGraph[u][v]);
}
// update residual capacities of the edges and reverse edges
// along the path
for (v=t; v != s; v=parent[v])
{
u = parent[v];
rGraph[u][v] -= path_flow;
rGraph[v][u] += path_flow;
}
// Add path flow to overall flow
max_flow += path_flow;
}
// Return the overall flow
return max_flow;
}
// Driver program to test above functions
int main()
{
// Let us create a graph shown in the above example
int graph[V][V] = { {0, 16, 13, 0, 0, 0},
{0, 0, 10, 12, 0, 0},
{0, 4, 0, 0, 14, 0},
{0, 0, 9, 0, 0, 20},
{0, 0, 0, 7, 0, 4},
{0, 0, 0, 0, 0, 0}
};
cout << "The maximum possible flow is " << fordFulkerson(graph, 0, 5);
return 0;
}
Java
// Java program for implementation of Ford Fulkerson algorithm
import java.util.*;
import java.lang.*;
import java.io.*;
import java.util.LinkedList;
class MaxFlow
{
static final int V = 6; //Number of vertices in graph
/* Returns true if there is a path from source 's' to sink
't' in residual graph. Also fills parent[] to store the
path */
boolean bfs(int rGraph[][], int s, int t, int parent[])
{
// Create a visited array and mark all vertices as not
// visited
boolean visited[] = new boolean[V];
for(int i=0; i<V; ++i)
visited[i]=false;
// Create a queue, enqueue source vertex and mark
// source vertex as visited
LinkedList<Integer> queue = new LinkedList<Integer>();
queue.add(s);
visited[s] = true;
parent[s]=-1;
// Standard BFS Loop
while (queue.size()!=0)
{
int u = queue.poll();
for (int v=0; v<V; v++)
{
if (visited[v]==false && rGraph[u][v] > 0)
{
queue.add(v);
parent[v] = u;
visited[v] = true;
}
}
}
// If we reached sink in BFS starting from source, then
// return true, else false
return (visited[t] == true);
}
// Returns the maximum flow from s to t in the given graph
int fordFulkerson(int graph[][], int s, int t)
{
int u, v;
The above implementation of the Ford-Fulkerson algorithm is called the Edmonds-Karp algorithm. The idea of Edmonds-Karp is to use BFS in the
Ford-Fulkerson implementation, as BFS always picks a path with the minimum number of edges. When BFS is used, the worst-case time complexity
can be reduced to O(VE^2). The above implementation uses an adjacency matrix representation, though, where BFS takes O(V^2) time, so the time
complexity of the above implementation is O(EV^3) (refer to the CLRS book for a proof of the time complexity).
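To reach the O(VE^2) bound, BFS has to run in O(V + E) time, which needs adjacency lists instead of a matrix. The following is a minimal sketch under that assumption. The names FlowEdge, FlowNetwork, addEdge and maxFlow are hypothetical and do not appear in the original program. Each edge is stored next to its reverse edge, so that index i ^ 1 is always the partner of index i.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <queue>
#include <vector>

// Hypothetical adjacency-list flow network (illustrative names only).
struct FlowEdge { int to; int cap; };

struct FlowNetwork {
    std::vector<FlowEdge> edges;        // edges[i ^ 1] is the reverse of edges[i]
    std::vector<std::vector<int>> adj;  // adj[u] = indices into edges

    explicit FlowNetwork(int n) : adj(n) {}

    void addEdge(int u, int v, int cap) {
        adj[u].push_back((int)edges.size()); edges.push_back({v, cap});
        adj[v].push_back((int)edges.size()); edges.push_back({u, 0});
    }

    int maxFlow(int s, int t) {
        assert(s != t);
        int flow = 0;
        while (true) {
            // BFS over edges with positive residual capacity: O(V + E).
            std::vector<int> parentEdge(adj.size(), -1);
            std::vector<bool> visited(adj.size(), false);
            std::queue<int> q;
            q.push(s);
            visited[s] = true;
            while (!q.empty() && !visited[t]) {
                int u = q.front(); q.pop();
                for (int id : adj[u]) {
                    int v = edges[id].to;
                    if (!visited[v] && edges[id].cap > 0) {
                        visited[v] = true;
                        parentEdge[v] = id;
                        q.push(v);
                    }
                }
            }
            if (!visited[t]) break;  // no augmenting path is left

            // Walk back from t to s: find the bottleneck, then update.
            int pathFlow = INT_MAX;
            for (int v = t; v != s; v = edges[parentEdge[v] ^ 1].to)
                pathFlow = std::min(pathFlow, edges[parentEdge[v]].cap);
            for (int v = t; v != s; v = edges[parentEdge[v] ^ 1].to) {
                edges[parentEdge[v]].cap -= pathFlow;
                edges[parentEdge[v] ^ 1].cap += pathFlow;
            }
            flow += pathFlow;
        }
        return flow;
    }
};
```

A caller would create FlowNetwork g(V), call g.addEdge(u, v, capacity) once per directed edge, and then call g.maxFlow(s, t). Pairing each edge with its reverse avoids a separate lookup when the reverse capacity has to be increased.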
This is an important problem as it arises in many practical situations. Examples include, maximizing the transportation with given traffic limits,
There can be a maximum of two edge-disjoint paths from source 0 to destination 7 in the above graph. The two edge-disjoint paths highlighted below in
red and blue are 0-2-6-7 and 0-3-6-5-7.
Note that the paths may be different, but the maximum number is the same. For example, in the above diagram, another possible set of paths is 0-1-2-6-7 and 0-3-6-5-7.
This problem can be solved by reducing it to maximum flow problem. Following are steps.
1) Consider the given source and destination as source and sink in flow network. Assign unit capacity to each edge.
2) Run Ford-Fulkerson algorithm to find the maximum flow from source to sink.
3) The maximum flow is equal to the maximum number of edge-disjoint paths.
When we run Ford-Fulkerson, we reduce the capacity by a unit. Therefore, the edge can not be used again. So the maximum flow is equal to the
maximum number of edge-disjoint paths.
Following is C++ implementation of the above algorithm. Most of the code is taken from here.
// C++ program to find maximum number of edge disjoint paths
#include <iostream>
#include <limits.h>
#include <string.h>
#include <queue>
using namespace std;
// Number of vertices in given graph
#define V 8
/* Returns true if there is a path from source 's' to sink 't' in
residual graph. Also fills parent[] to store the path */
bool bfs(int rGraph[V][V], int s, int t, int parent[])
{
// Create a visited array and mark all vertices as not visited
bool visited[V];
memset(visited, 0, sizeof(visited));
// Create a queue, enqueue source vertex and mark source vertex
// as visited
queue <int> q;
q.push(s);
visited[s] = true;
parent[s] = -1;
// Standard BFS Loop
while (!q.empty())
{
int u = q.front();
q.pop();
for (int v=0; v<V; v++)
{
if (visited[v]==false && rGraph[u][v] > 0)
{
q.push(v);
parent[v] = u;
visited[v] = true;
}
}
}
// If we reached sink in BFS starting from source, then return
// true, else false
return (visited[t] == true);
}
// Returns the maximum number of edge-disjoint paths from s to t.
// This function is a copy of fordFulkerson() discussed at http://goo.gl/wtQ4Ks
int findDisjointPaths(int graph[V][V], int s, int t)
{
int u, v;
// Create a residual graph and fill the residual graph with
// given capacities in the original graph as residual capacities
// in residual graph
int rGraph[V][V]; // Residual graph where rGraph[i][j] indicates
// residual capacity of edge from i to j (if there
// is an edge. If rGraph[i][j] is 0, then there is not)
for (u = 0; u < V; u++)
for (v = 0; v < V; v++)
rGraph[u][v] = graph[u][v];
int parent[V]; // This array is filled by BFS and to store path
int max_flow = 0; // There is no flow initially
// Augment the flow while there is a path from source to sink
while (bfs(rGraph, s, t, parent))
{
// Find minimum residual capacity of the edges along the
// path filled by BFS. Or we can say find the maximum flow
// through the path found.
int path_flow = INT_MAX;
for (v=t; v!=s; v=parent[v])
{
u = parent[v];
path_flow = min(path_flow, rGraph[u][v]);
}
// update residual capacities of the edges and reverse edges
// along the path
for (v=t; v != s; v=parent[v])
{
u = parent[v];
rGraph[u][v] -= path_flow;
rGraph[v][u] += path_flow;
}
// Add path flow to overall flow
max_flow += path_flow;
}
// Return the overall flow (max_flow is equal to maximum
// number of edge-disjoint paths)
return max_flow;
}
// Driver program to test above functions
int main()
{
// Let us create a graph shown in the above example
int graph[V][V] = { {0, 1, 1, 1, 0, 0, 0, 0},
{0, 0, 1, 0, 0, 0, 0, 0},
{0, 0, 0, 1, 0, 0, 1, 0},
{0, 0, 0, 0, 0, 0, 1, 0},
{0, 0, 1, 0, 0, 0, 0, 1},
{0, 1, 0, 0, 0, 0, 0, 1},
{0, 0, 0, 0, 0, 1, 0, 1},
{0, 0, 0, 0, 0, 0, 0, 0}
};
int s = 0;
int t = 7;
cout << "There can be maximum " << findDisjointPaths(graph, s, t)
<< " edge-disjoint paths from " << s <<" to "<< t ;
return 0;
}
Output:
There can be maximum 2 edge-disjoint paths from 0 to 7
Time Complexity: Same as time complexity of Edmonds-Karp implementation of Ford-Fulkerson (See time complexity discussed here)
References:
http://www.win.tue.nl/~nikhil/courses/2012/2WO08/max-flow-applications-4up.pdf
C++
// C++ program for finding minimum cut using Ford-Fulkerson
#include <iostream>
#include <limits.h>
#include <string.h>
#include <queue>
using namespace std;
// Number of vertices in given graph
#define V 6
/* Returns true if there is a path from source 's' to sink 't' in
residual graph. Also fills parent[] to store the path */
bool bfs(int rGraph[V][V], int s, int t, int parent[])
{
// Create a visited array and mark all vertices as not visited
bool visited[V];
memset(visited, 0, sizeof(visited));
// Create a queue, enqueue source vertex and mark source vertex
// as visited
queue <int> q;
q.push(s);
visited[s] = true;
parent[s] = -1;
// Standard BFS Loop
while (!q.empty())
{
int u = q.front();
q.pop();
for (int v=0; v<V; v++)
{
if (visited[v]==false && rGraph[u][v] > 0)
{
q.push(v);
parent[v] = u;
visited[v] = true;
}
}
}
// If we reached sink in BFS starting from source, then return
// true, else false
return (visited[t] == true);
}
// A DFS based function to find all reachable vertices from s. The function
// marks visited[i] as true if i is reachable from s. The initial values in
// visited[] must be false. We can also use BFS to find reachable vertices
void dfs(int rGraph[V][V], int s, bool visited[])
{
visited[s] = true;
for (int i = 0; i < V; i++)
if (rGraph[s][i] && !visited[i])
dfs(rGraph, i, visited);
}
// Prints the minimum s-t cut
void minCut(int graph[V][V], int s, int t)
{
int u, v;
// Create a residual graph and fill the residual graph with
// given capacities in the original graph as residual capacities
// in residual graph
int rGraph[V][V]; // rGraph[i][j] indicates residual capacity of edge i-j
for (u = 0; u < V; u++)
for (v = 0; v < V; v++)
rGraph[u][v] = graph[u][v];
int parent[V]; // This array is filled by BFS and to store path
// Augment the flow while there is a path from source to sink
while (bfs(rGraph, s, t, parent))
{
// Find minimum residual capacity of the edges along the
// path filled by BFS. Or we can say find the maximum flow
// through the path found.
int path_flow = INT_MAX;
for (v=t; v!=s; v=parent[v])
{
u = parent[v];
path_flow = min(path_flow, rGraph[u][v]);
}
// update residual capacities of the edges and reverse edges
// along the path
for (v=t; v != s; v=parent[v])
{
u = parent[v];
rGraph[u][v] -= path_flow;
rGraph[v][u] += path_flow;
}
}
// Flow is maximum now, find vertices reachable from s
bool visited[V];
memset(visited, false, sizeof(visited));
dfs(rGraph, s, visited);
// Print all edges that are from a reachable vertex to
// non-reachable vertex in the original graph
for (int i = 0; i < V; i++)
for (int j = 0; j < V; j++)
if (visited[i] && !visited[j] && graph[i][j])
cout << i << " - " << j << endl;
return;
}
Java
// Java program for implementation of Ford Fulkerson algorithm
import java.util.*;
import java.lang.*;
import java.io.*;
import java.util.LinkedList;
class MaxFlow
{
static final int V=6;
1 - 3
4 - 3
4 - 5
References:
http://www.stanford.edu/class/cs97si/08-network-flow-problems.pdf
http://www.cs.princeton.edu/courses/archive/spring06/cos226/lectures/maxflow.pdf
C++
// A C++ program to find maximal Bipartite matching.
#include <iostream>
#include <string.h>
using namespace std;
// M is number of applicants and N is number of jobs
#define M 6
#define N 6
// A DFS based recursive function that returns true if a
// matching for vertex u is possible
bool bpm(bool bpGraph[M][N], int u, bool seen[], int matchR[])
{
// Try every job one by one
for (int v = 0; v < N; v++)
{
// If applicant u is interested in job v and v is
// not visited
if (bpGraph[u][v] && !seen[v])
{
seen[v] = true; // Mark v as visited
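// If job 'v' is not assigned to an applicant OR the
// previously assigned applicant for job v (which is
// matchR[v]) has an alternate job available.
// Since v is marked as visited in the above line, matchR[v]
// in the following recursive call will not get job 'v' again
if (matchR[v] < 0 || bpm(bpGraph, matchR[v], seen, matchR))
{
matchR[v] = u;
return true;
}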
}
}
return false;
}
// Returns maximum number of matching from M to N
int maxBPM(bool bpGraph[M][N])
{
// An array to keep track of the applicants assigned to
// jobs. The value of matchR[i] is the applicant number
// assigned to job i, the value -1 indicates nobody is
// assigned.
int matchR[N];
// Initially all jobs are available
memset(matchR, -1, sizeof(matchR));
Java
// A Java program to find maximal Bipartite matching.
import java.util.*;
import java.lang.*;
import java.io.*;
class MaxBipartite
{
// M is number of applicants and N is number of jobs
static final int M = 6;
static final int N = 6;
// A DFS based recursive function that returns true if a
// matching for vertex u is possible
boolean bpm(boolean bpGraph[][], int u, boolean seen[],
int matchR[])
{
// Try every job one by one
for (int v = 0; v < N; v++)
{
// If applicant u is interested in job v and v
// is not visited
if (bpGraph[u][v] && !seen[v])
{
seen[v] = true; // Mark v as visited
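// If job 'v' is not assigned to an applicant OR the
// previously assigned applicant for job v (which
// is matchR[v]) has an alternate job available.
// Since v is marked as visited in the above line,
// matchR[v] in the following recursive call will
// not get job 'v' again
if (matchR[v] < 0 || bpm(bpGraph, matchR[v], seen, matchR))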
{
matchR[v] = u;
return true;
}
}
}
return false;
}
// Returns maximum number of matching from M to N
int maxBPM(boolean bpGraph[][])
{
References:
http://www.cs.cornell.edu/~wdtseng/icpc/notes/graph_part5.pdf
http://www.youtube.com/watch?v=NlQqmEXuiC8
http://en.wikipedia.org/wiki/Maximum_matching
http://www.stanford.edu/class/cs97si/08-network-flow-problems.pdf
http://www.cs.princeton.edu/courses/archive/spring13/cos423/lectures/07NetworkFlowII-22.pdf
http://www.ise.ncsu.edu/fangroup/or766.dir/or766_ch7.pdf
The above is the input format. We call the above matrix M. Each value M[i][j] represents the number of packets Transmitter i has to send to
Receiver j. The output should be:
The number of maximum packets sent in the time slot is 3
T1 -> R2
T2 -> R3
T3 -> R1
Note that the maximum number of packets that can be transferred in any slot is min(M, N).
Algorithm:
The channel assignment problem between sender and receiver can be easily transformed into Maximum Bipartite Matching(MBP) problem that
can be solved by converting it into a flow network.
Step 1: Build a Flow Network
There must be a source and sink in a flow network. So we add a dummy source and add edges from source to all senders. Similarly, add edges
from all receivers to dummy sink. The capacity of all added edges is marked as 1 unit.
Step 2: Find the maximum flow.
We use Ford-Fulkerson algorithm to find the maximum flow in the flow network built in step 1. The maximum flow is actually the maximum number
of packets that can be transmitted without bandwidth interference in a time slot.
Implementation:
Let us first define input and output forms. Input is in the form of Edmonds matrix which is a 2D array table[M][N] with M rows (for M senders)
and N columns (for N receivers). The value table[i][j] is the number of packets that has to be sent from transmitter i to receiver j. Output is the
maximum number of packets that can be transmitted without bandwidth interference in a time slot.
A simple way to implement this is to create a matrix that represents adjacency matrix representation of a directed graph with M+N+2 vertices.
Call the fordFulkerson() for the matrix. This implementation requires O((M+N)*(M+N)) extra space.
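The matrix-based construction just described could look like the sketch below. The helper name buildFlowNetwork and the vertex numbering (0 for the dummy source, 1..M for senders, M+1..M+N for receivers, M+N+1 for the dummy sink) are assumptions made for illustration. The sender-to-receiver edges are also given unit capacity here, on the assumption that a sender can transmit at most one packet per time slot.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: builds the (M+N+2) x (M+N+2) capacity matrix.
// Vertex 0 is the dummy source, vertices 1..M are the senders,
// vertices M+1..M+N are the receivers and vertex M+N+1 is the dummy sink.
std::vector<std::vector<int>> buildFlowNetwork(
        const std::vector<std::vector<int>>& table)
{
    assert(!table.empty());
    int M = (int)table.size();
    int N = (int)table[0].size();
    int source = 0, sink = M + N + 1;

    std::vector<std::vector<int>> cap(M + N + 2, std::vector<int>(M + N + 2, 0));
    for (int i = 0; i < M; i++) {
        cap[source][1 + i] = 1;             // source -> sender i
        for (int j = 0; j < N; j++)
            if (table[i][j] > 0)
                cap[1 + i][1 + M + j] = 1;  // sender i -> receiver j
    }
    for (int j = 0; j < N; j++)
        cap[1 + M + j][sink] = 1;           // receiver j -> sink
    return cap;
}
```

The maximum flow from vertex 0 to vertex M+N+1 in this matrix is then the number of packets that can be sent in one slot. The matrix has (M+N+2)^2 entries, which is the extra space that the DFS-based approach avoids.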
Extra space can be reduced and the code can be simplified using the fact that the graph is bipartite. The idea is to use DFS traversal to find a receiver
for a transmitter (similar to an augmenting path in Ford-Fulkerson). We call bpm() for every sender; bpm() is the DFS-based function that tries all
possibilities to assign a receiver to the sender. In bpm(), we try one by one all the receivers that a sender u is interested in until we find a receiver, or
all receivers are tried without luck.
For every receiver we try, we do following:
If a receiver is not assigned to anybody, we simply assign it to the sender and return true. If a receiver is assigned to somebody else, say x, then we
recursively check whether x can be assigned some other receiver. To make sure that x doesn't get the same receiver again, we mark the receiver v
as seen before we make the recursive call for x. If x can get another receiver, we change the sender for receiver v and return true. We use an array
matchR[0..N-1] that stores the senders assigned to different receivers.
If bpm() returns true, then it means that there is an augmenting path in the flow network and 1 unit of flow is added to the result in maxBPM().
Time and space complexity analysis:
In the case of the bipartite matching problem, F ≤ |V|, since there can be only |V| possible edges coming out of the source node. So the total running time is
O(mn) = O((m + n)n). The space complexity is also substantially reduced from O((M+N)*(M+N)) to just a single-dimensional array of size N
(matchR[]), which stores the mapping between receivers and senders.
#include <iostream>
#include <string.h>
#include <vector>
#define M 3
#define N 4
using namespace std;
// A Depth First Search based recursive function that returns true
// if a matching for vertex u is possible
bool bpm(int table[M][N], int u, bool seen[], int matchR[])
{
// Try every receiver one by one
for (int v = 0; v < N; v++)
{
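// If sender u has packets to send to receiver v and
// receiver v has not been tried yet in this search.
// It is enough to check that the packet count is greater
// than 0, as only one packet can be sent per time slot
if (table[u][v] > 0 && !seen[v])
{
seen[v] = true; // Mark v as visited
// If receiver v is not assigned to any sender OR the
// previously assigned sender matchR[v] can be given
// some other receiver
if (matchR[v] < 0 || bpm(table, matchR[v], seen, matchR))
{
matchR[v] = u;
return true;
}
}
}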
return false;
}
// Returns maximum number of packets that can be sent in one
// time slot from sender to receiver
int maxBPM(int table[M][N])
{
// An array to keep track of the senders assigned to receivers.
// The value of matchR[i] is the sender assigned to receiver i;
// the value -1 indicates nobody is assigned.
int matchR[N];
Output:
The number of maximum packets sent in the time slot is 3
T3-> R1
T1-> R2
T2-> R3
Given an array of strings, find if the strings can be chained to form a circle
Given an array of strings, find if the given strings can be chained to form a circle. A string X can be put before another string Y in circle if the last
character of X is same as first character of Y.
Examples:
Input: arr[] = {"geek", "king"}
Output: Yes, the given strings can be chained.
Note that the last character of first string is same
as first character of second string and vice versa is
also true.
Input: arr[] = {"for", "geek", "rig", "kaf"}
Output: Yes, the given strings can be chained.
The strings can be chained as "for", "rig", "geek"
and "kaf"
Input: arr[] = {"aab", "bac", "aaa", "cda"}
Output: Yes, the given strings can be chained.
The strings can be chained as "aaa", "aab", "bac"
and "cda"
Input: arr[] = {"aaa", "bbb", "baa", "aab"};
Output: Yes, the given strings can be chained.
The strings can be chained as "aaa", "aab", "bbb"
and "baa"
Input: arr[] = {"aaa"};
Output: Yes
Input: arr[] = {"aaa", "bbb"};
Output: No
The idea is to create a directed graph of all characters and then find if there is an Eulerian circuit in the graph or not. If there is an Eulerian circuit,
then a chain can be formed, otherwise not.
Note that a directed graph has an Eulerian circuit only if the in-degree and out-degree of every vertex are the same, and all non-zero degree vertices form a
single strongly connected component.
Following are detailed steps of the algorithm.
1) Create a directed graph g with number of vertices equal to the size of alphabet. We have created a graph with 26 vertices in the below
program.
2) Do following for every string in the given array of strings.
..a) Add an edge to the graph from the first character to the last character of the given string.
3) If the created graph has eulerian circuit, then return true, else return false.
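The steps above can be sketched compactly as follows. This is not the implementation given below: instead of a strongly-connected-component check, it uses a small union-find, relying on the fact that when every vertex has equal in-degree and out-degree, being connected (ignoring edge direction) already implies being strongly connected. The function name canChainInCircle is hypothetical, and the sketch assumes non-empty strings of lowercase letters 'a' to 'z'.

```cpp
#include <cassert>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical helper: returns true if the strings can be chained in a circle.
// Assumes every string is non-empty and uses only 'a'..'z'.
bool canChainInCircle(const std::vector<std::string>& words)
{
    const int CHARS = 26;
    std::vector<int> in(CHARS, 0), out(CHARS, 0), root(CHARS);
    std::iota(root.begin(), root.end(), 0);

    // Union-find lookup with path halving.
    auto find = [&](int x) {
        while (root[x] != x) { root[x] = root[root[x]]; x = root[x]; }
        return x;
    };

    // Each string contributes one edge: first character -> last character.
    for (const std::string& w : words) {
        assert(!w.empty());
        int a = w.front() - 'a', b = w.back() - 'a';
        assert(a >= 0 && a < CHARS && b >= 0 && b < CHARS);
        out[a]++;
        in[b]++;
        root[find(a)] = find(b);
    }

    // Eulerian circuit test: equal in/out degree everywhere, and all
    // vertices with non-zero degree in one connected component.
    int component = -1;
    for (int c = 0; c < CHARS; c++) {
        if (in[c] != out[c]) return false;
        if (in[c] + out[c] == 0) continue;
        if (component == -1) component = find(c);
        else if (find(c) != component) return false;
    }
    return true;
}
```

On the examples above, {"geek", "king"} gives true, while {"aaa", "bbb"} gives false because the two self-loops lie in different components.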
Following is C++ implementation of the above algorithm.
// A C++ program to check if a given directed graph is Eulerian or not
#include<iostream>
#include <list>
#define CHARS 26
using namespace std;
// A class that represents a directed graph
class Graph
{
int V; // No. of vertices
list<int> *adj; // A dynamic array of adjacency lists
int *in; // in[i] is the in-degree of vertex i
public:
// Constructor and destructor
Graph(int V);
~Graph() { delete [] adj; delete [] in; }
// function to add an edge to graph
void addEdge(int v, int w) { adj[v].push_back(w); (in[w])++; }
// Method to check if this graph is Eulerian or not
bool isEulerianCycle();
// Method to check if all non-zero degree vertices are connected
bool isSC();
Output:
Can be chained
Can't be chained
The idea is to create a graph of characters and then find topological sorting of the created graph. Following are the detailed steps.
1) Create a graph g with number of vertices equal to the size of alphabet in the given alien language. For example, if the alphabet size is 5, then
there can be 5 characters in words. Initially there are no edges in graph.
2) Do following for every pair of adjacent words in given sorted array.
..a) Let the current pair of words be word1 and word2. One by one compare characters of both words and find the first mismatching characters.
..b) Create an edge in g from mismatching character of word1 to that of word2.
3) Print topological sorting of the above created graph.
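The steps above can also be sketched with Kahn's queue-based topological sort in place of the DFS-based sort used in the implementation below. This is an alternative technique, chosen here because it detects a cycle (inconsistent input) without extra work. The function name alienOrder is hypothetical, and the sketch assumes the alphabet is the first alpha lowercase letters.

```cpp
#include <algorithm>
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Hypothetical helper: returns the character order of the alien alphabet,
// or an empty string if the sorted word list is inconsistent (cycle).
// Assumes the alphabet is the first `alpha` lowercase letters.
std::string alienOrder(const std::vector<std::string>& words, int alpha)
{
    assert(alpha > 0 && alpha <= 26);
    std::vector<std::vector<int>> adj(alpha);
    std::vector<int> indegree(alpha, 0);

    // Step 2: the first mismatch of each adjacent pair gives one edge.
    for (size_t i = 0; i + 1 < words.size(); i++) {
        const std::string& w1 = words[i];
        const std::string& w2 = words[i + 1];
        size_t len = std::min(w1.size(), w2.size());
        for (size_t j = 0; j < len; j++) {
            if (w1[j] != w2[j]) {
                adj[w1[j] - 'a'].push_back(w2[j] - 'a');
                indegree[w2[j] - 'a']++;
                break;
            }
        }
    }

    // Step 3 (Kahn's algorithm): repeatedly output a vertex of in-degree 0.
    std::queue<int> ready;
    for (int c = 0; c < alpha; c++)
        if (indegree[c] == 0) ready.push(c);

    std::string order;
    while (!ready.empty()) {
        int u = ready.front(); ready.pop();
        order += char('a' + u);
        for (int v : adj[u])
            if (--indegree[v] == 0) ready.push(v);
    }

    // If some vertex was never output, the graph has a cycle.
    if (order.size() != (size_t)alpha) return "";
    return order;
}
```

When several characters have in-degree 0 at the same time, any of them may come first, so the order returned for under-constrained input is one valid order among possibly many.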
Following is C++ implementation of the above algorithm.
// A C++ program to find the order of characters in an alien language
#include<iostream>
#include <list>
#include <stack>
#include <cstring>
using namespace std;
// Class to represent a graph
class Graph
{
int V; // No. of vertices
// Pointer to an array containing adjacency lists
list<int> *adj;
// A function used by topologicalSort
void topologicalSortUtil(int v, bool visited[], stack<int> &Stack);
public:
Graph(int V); // Constructor
// function to add an edge to graph
void addEdge(int v, int w);
// prints a Topological Sort of the complete graph
void topologicalSort();
};
Graph::Graph(int V)
{
this->V = V;
adj = new list<int>[V];
}
void Graph::addEdge(int v, int w)
{
adj[v].push_back(w); // Add w to v's list.
}
// A recursive function used by topologicalSort
void Graph::topologicalSortUtil(int v, bool visited[], stack<int> &Stack)
{
// Mark the current node as visited.
visited[v] = true;
// Recur for all the vertices adjacent to this vertex
list<int>::iterator i;
for (i = adj[v].begin(); i != adj[v].end(); ++i)
if (!visited[*i])
topologicalSortUtil(*i, visited, Stack);
Output:
c a b
Time Complexity: The first step, creating the graph, takes O(n + alpha) time, where n is the number of given words and alpha is the number of characters
in the given alphabet. The second step is topological sorting. Note that there would be alpha vertices and at most (n-1) edges in the graph. The
time complexity of topological sorting is O(V+E), which is O(n + alpha) here. So the overall time complexity is O(n + alpha) + O(n + alpha), which is
O(n + alpha).
Exercise:
The above code doesn't work when the input is not valid. For example, {"aba", "bba", "aaa"} is not valid, because from the first two words we can deduce
that a should appear before b, but from the last two words we can deduce that b should appear before a, which is not possible. Extend the above program to
handle invalid inputs and generate the output as "Not valid".
A simple solution is to use a max-flow based s-t cut algorithm to find the minimum cut. Consider every pair of vertices as source s and sink t, and call the
minimum s-t cut algorithm to find the s-t cut. Return the minimum of all s-t cuts. The best possible time complexity of this algorithm is O(V^5) for a graph.
[How? There are V^2 possible pairs in total, the s-t cut algorithm for one pair takes O(V*E) time, and E = O(V^2).]
Below is the simple Karger's algorithm for this purpose. Karger's algorithm below can be implemented in O(E) = O(V^2) time.
1) Initialize contracted graph CG as copy of original graph
2) While there are more than 2 vertices.
a) Pick a random edge (u, v) in the contracted graph.
b) Merge (or contract) u and v into a single vertex (update
the contracted graph).
c) Remove self-loops
3) Return cut represented by two vertices.
Let the next randomly picked edge be d. We remove this edge and combine vertices (0,1) and 3.
Now the graph has two vertices, so we stop. The number of edges in the resultant graph is the cut produced by Karger's algorithm.
Karger's algorithm is a Monte Carlo algorithm and the cut produced by it may not be minimum. For example, the following diagram shows
that a different order of picking random edges produces a cut of size 3, which is not a minimum cut.
Below is C++ implementation of above algorithm. The input graph is represented as a collection of edges and union-find data structure is used to
keep track of components.
// Karger's algorithm to find Minimum Cut in an
// undirected, unweighted and connected graph.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
// a structure to represent an unweighted edge in graph
struct Edge
{
int src, dest;
};
// a structure to represent a connected, undirected
// and unweighted graph as a collection of edges.
struct Graph
{
// V-> Number of vertices, E-> Number of edges
int V, E;
// graph is represented as an array of edges.
// Since the graph is undirected, the edge
// from src to dest is also edge from dest
// to src. Both are counted as 1 edge here.
Edge* edge;
};
// A structure to represent a subset for union-find
struct subset
{
int parent;
int rank;
};
// Function prototypes for union-find (These functions are defined
// after kargerMinCut() )
int find(struct subset subsets[], int i);
void Union(struct subset subsets[], int x, int y);
// A very basic implementation of Karger's randomized
// algorithm for finding the minimum cut. Please note
// that Karger's algorithm is a Monte Carlo Randomized algo
// and the cut returned by the algorithm may not be
// minimum always
int kargerMinCut(struct Graph* graph)
{
// Get data of given graph
int V = graph->V, E = graph->E;
Edge *edge = graph->edge;
// Allocate memory for creating V subsets.
struct subset *subsets = new subset[V];
// Create V subsets with single elements
for (int v = 0; v < V; ++v)
{
subsets[v].parent = v;
subsets[v].rank = 0;
}
// Initially there are V vertices in
// contracted graph
int vertices = V;
// Keep contracting vertices until there are
// 2 vertices.
while (vertices > 2)
{
// Pick a random edge
int i = rand() % E;
// Find vertices (or sets) of two corners
// of current edge
int subset1 = find(subsets, edge[i].src);
int subset2 = find(subsets, edge[i].dest);
// If two corners belong to same subset,
// then no point considering this edge
if (subset1 == subset2)
continue;
// Else contract the edge (or combine the
// corners of edge into one vertex)
else
{
printf("Contracting edge %d-%d\n",
edge[i].src, edge[i].dest);
vertices--;
Union(subsets, subset1, subset2);
}
}
// Now we have two vertices (or subsets) left in
// the contracted graph, so count the edges between
// two components and return the count.
int cutedges = 0;
for (int i=0; i<E; i++)
{
int subset1 = find(subsets, edge[i].src);
int subset2 = find(subsets, edge[i].dest);
if (subset1 != subset2)
cutedges++;
}
return cutedges;
}
// A utility function to find set of an element i
// (uses path compression technique)
int find(struct subset subsets[], int i)
{
// find root and make root as parent of i
// (path compression)
if (subsets[i].parent != i)
subsets[i].parent =
find(subsets, subsets[i].parent);
return subsets[i].parent;
}
// A function that does union of two sets of x and y
// (uses union by rank)
void Union(struct subset subsets[], int x, int y)
{
int xroot = find(subsets, x);
int yroot = find(subsets, y);
// Attach smaller rank tree under root of high
// rank tree (Union by Rank)
if (subsets[xroot].rank < subsets[yroot].rank)
subsets[xroot].parent = yroot;
else if (subsets[xroot].rank > subsets[yroot].rank)
subsets[yroot].parent = xroot;
// If ranks are same, then make one as root and
// increment its rank by one
else
{
subsets[yroot].parent = xroot;
subsets[xroot].rank++;
}
}
// Creates a graph with V vertices and E edges
struct Graph* createGraph(int V, int E)
{
Graph* graph = new Graph;
graph->V = V;
graph->E = E;
graph->edge = new Edge[E];
return graph;
}
// Driver program to test above functions
int main()
{
/* Let us create following unweighted graph
0------1
| \    |
|   \  |
|     \|
2------3 */
int V = 4; // Number of vertices in graph
int E = 5; // Number of edges in graph
struct Graph* graph = createGraph(V, E);
// add edge 0-1
graph->edge[0].src = 0;
graph->edge[0].dest = 1;
// add edge 0-2
graph->edge[1].src = 0;
graph->edge[1].dest = 2;
// add edge 0-3
graph->edge[2].src = 0;
graph->edge[2].dest = 3;
// add edge 1-3
graph->edge[3].src = 1;
graph->edge[3].dest = 3;
// add edge 2-3
graph->edge[4].src = 2;
graph->edge[4].dest = 3;
// Use a different seed value for every run.
srand(time(NULL));
printf("\nCut found by Karger's randomized algo is %d\n",
kargerMinCut(graph));
return 0;
}
Output:
Contracting edge 0-2
Contracting edge 0-3
Cut found by Karger's randomized algo is 2
Note that the above program is based on outcome of a random function and may produce different output.
In this post, we have discussed the simple Karger's algorithm and have seen that the algorithm doesn't always produce a min-cut. The above algorithm
produces a min-cut with probability greater than or equal to 1/n^2. See the next post, Analysis and Applications of Karger's Algorithm, where applications,
a proof of this probability, and improvements are discussed.
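Because a single run succeeds with probability at least 1/n^2, a common way to use the algorithm is to repeat it independently and keep the smallest cut found. After about n^2 * ln(n) runs, the chance that every run missed the min-cut is at most (1 - 1/n^2)^(n^2 ln n) ≤ 1/n. The sketch below illustrates this under stated assumptions: kargerTrial is a compact stand-in for kargerMinCut() (a simplified union-find, no printing), repeatedKarger and its seed parameter are hypothetical names, and the graph must be connected or the contraction loop would not terminate.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// One contraction trial on an edge list; a compact stand-in for kargerMinCut().
// Assumes a connected, undirected, unweighted graph with V >= 2.
int kargerTrial(int V, const std::vector<std::pair<int, int>>& edges,
                std::mt19937& rng)
{
    std::vector<int> parent(V);
    std::iota(parent.begin(), parent.end(), 0);
    auto find = [&](int x) {
        while (parent[x] != x) { parent[x] = parent[parent[x]]; x = parent[x]; }
        return x;
    };

    std::uniform_int_distribution<size_t> pick(0, edges.size() - 1);
    int vertices = V;
    while (vertices > 2) {
        const std::pair<int, int>& e = edges[pick(rng)];
        int a = find(e.first), b = find(e.second);
        if (a == b) continue;  // self-loop in the contracted graph, skip it
        parent[a] = b;         // contract the edge
        vertices--;
    }

    // Count the edges that cross between the two remaining super-vertices.
    int cut = 0;
    for (const std::pair<int, int>& e : edges)
        if (find(e.first) != find(e.second)) cut++;
    return cut;
}

// Hypothetical wrapper: runs `trials` independent trials, returns the best cut.
int repeatedKarger(int V, const std::vector<std::pair<int, int>>& edges,
                   int trials, unsigned seed)
{
    assert(V >= 2 && !edges.empty() && trials > 0);
    std::mt19937 rng(seed);
    int best = (int)edges.size();
    for (int i = 0; i < trials; i++)
        best = std::min(best, kargerTrial(V, edges, rng));
    return best;
}
```

The result is still not guaranteed to be the minimum cut. Repetition only makes a wrong answer unlikely, at the cost of multiplying the running time by the number of trials.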
References:
http://en.wikipedia.org/wiki/Karger%27s_algorithm
https://www.youtube.com/watch?v=P0l8jMDQTEQ
https://www.cs.princeton.edu/courses/archive/fall13/cos521/lecnotes/lec2final.pdf
http://web.stanford.edu/class/archive/cs/cs161/cs161.1138/lectures/11/Small11.pdf
As discussed in the previous post, Karger's algorithm doesn't always find the min-cut. In this post, the probability of finding the min-cut is discussed.
The probability that the cut produced by Karger's algorithm is a min-cut is greater than or equal to 1/n^2.
Proof:
Let there be a unique min-cut of the given graph, let there be C edges in the min-cut, and let the edges be {e1, e2, e3, .. ec}. Karger's algorithm
would produce this min-cut if and only if none of the edges in the set {e1, e2, e3, .. ec} is removed in the iterations of the main while loop of the above
algorithm.
..................
..................
The cut produced by Karger's algorithm would be a min-cut if none of the above
events happen.
So the required probability is P[S1' ∩ S2' ∩ S3' ∩ ............]
In the initial graph all single edges are augmenting paths and we can pick in any order. In the middle stage, there is only one augmenting path. We
remove matching edges of this path from M and add not-matching edges. In final matching, there are no augmenting paths so the matching is
maximum.
Implementation of the Hopcroft-Karp algorithm is discussed in set 2.
Hopcroft-Karp Algorithm for Maximum Matching | Set 2 (Implementation)
References:
https://en.wikipedia.org/wiki/Hopcroft%E2%80%93Karp_algorithm
http://www.dis.uniroma1.it/~leon/tcs/lecture2.pdf
Linearity of Expectation
This post is about a mathematical concept, but it covers one of the required topics to understand Randomized Algorithms.
Let us consider the following simple problem.
Given a fair die with 6 faces that is thrown n times, find the expected value of the sum of all results.
For example, if n = 2, there are total 36 possible outcomes.
(1, 1), (1, 2), .... (1, 6)
(2, 1), (2, 2), .... (2, 6)
................
................
(6, 1), (6, 2), ..... (6, 6)
The expected value of a discrete random variable R is defined as follows. Suppose R can take value r1 with probability p1, value r2 with probability
p2, and so on, up to value rk with probability pk. Then the expectation of this random variable R is defined as
E[R] = r1*p1 + r2*p2 + ... + rk*pk
The above way to solve the problem becomes difficult when there are more dice throws.
If we know about linearity of expectation, then we can quickly solve the above problem for any number of throws.
Linearity of Expectation: Let R1 and R2 be two discrete random variables on some probability space, then
E[R1 + R2] = E[R1] + E[R2]
Using the above formula, we can quickly solve the dice problem.
Expected value of the sum of 2 dice throws = E[R1 + R2]
                                           = E[R1] + E[R2]
                                           = 3.5 + 3.5
                                           = 7
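As a quick check of the dice example, the sketch below computes the expected sum of two throws directly from the definition (all 36 outcomes) and compares it with the value obtained from linearity of expectation. The function names expectedSumByEnumeration and expectedSumByLinearity are illustrative and not part of the original post.

```cpp
#include <cassert>
#include <cmath>

// Expected sum of two fair dice, computed directly from the definition
// E[R] = sum of (value * probability) over all 36 equally likely outcomes.
double expectedSumByEnumeration() {
    double expected = 0.0;
    for (int a = 1; a <= 6; ++a)
        for (int b = 1; b <= 6; ++b)
            expected += (a + b) * (1.0 / 36.0);
    return expected;
}

// Expected sum of n fair dice using linearity of expectation:
// E[R1 + ... + Rn] = n * E[single throw].
double expectedSumByLinearity(int n) {
    assert(n >= 0);
    double single = 0.0;
    for (int face = 1; face <= 6; ++face)
        single += face / 6.0;   // E[single throw] = 3.5
    return n * single;
}
```

Both functions agree for n = 2, but only the second one stays cheap as n grows, since enumeration would have to visit 6^n outcomes.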
2) Balls and Bins: Suppose we have m balls, labeled i = 1, ..., m, and n bins, labeled j = 1, ..., n. Each ball is thrown into one of the bins independently
and uniformly at random.
a) What is the expected number of balls in every bin?
b) What is the expected number of empty bins?
3) Coupon Collector: Suppose there are n types of coupons in a lottery and each lot contains one coupon (each type with probability 1/n). How
many lots have to be bought (in expectation) until we have at least one coupon of each type?
See the following for the solution of Coupon Collector.
Expected Number of Trials until Success
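The questions above can be answered with indicator variables and linearity of expectation. The sketch below evaluates the standard closed forms: m/n balls per bin, n*(1 - 1/n)^m empty bins, and n*(1 + 1/2 + ... + 1/n) lots for the coupon collector. The function names are hypothetical helpers introduced here for illustration.

```cpp
#include <cassert>
#include <cmath>

// Expected number of balls in one fixed bin: each of the m balls lands
// there with probability 1/n, so the m indicator variables sum to m/n.
double expectedBallsPerBin(int m, int n) {
    assert(n > 0 && m >= 0);
    return static_cast<double>(m) / n;
}

// Expected number of empty bins: a fixed bin stays empty with probability
// (1 - 1/n)^m, and there are n such bins.
double expectedEmptyBins(int m, int n) {
    assert(n > 0 && m >= 0);
    return n * std::pow(1.0 - 1.0 / n, m);
}

// Expected number of lots needed to see all n coupon types:
// once i types are collected, a new type appears with probability (n - i)/n,
// so that stage takes n/(n - i) lots in expectation.
double expectedLotsForAllCoupons(int n) {
    assert(n > 0);
    double total = 0.0;
    for (int i = 0; i < n; ++i)
        total += static_cast<double>(n) / (n - i);
    return total;
}
```

None of these formulas needs independence between the indicator variables, which is what makes linearity of expectation convenient here.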
Linearity of expectation is useful in algorithms. For example, the expected time complexity of randomized algorithms like randomized quick sort is
evaluated using linearity of expectation.
References:
http://www.cse.iitd.ac.in/~mohanty/col106/Resources/linearity_expectation.pdf
http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-042j-mathematics-for-computer-science-fall-2010/videolectures/lecture-22-expectation-i/
Exercise:
1) A 6 faced fair dice is thrown until a '5' is seen as result of dice throw. What is the expected number of throws?
2) What is the ratio of boys and girls in above puzzle if probability of a baby boy is 1/3?
The important thing in our analysis is, time taken by step 2 is O(n).
How many times does the while loop run before finding a central pivot?
The probability that a randomly chosen element is a central pivot is 1/2.
Therefore, the expected number of times the while loop runs is 2.
Thus, the expected time complexity of step 2 is O(n).
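A minimal sketch of the pivot search analysed above, under two stated assumptions: the elements are distinct, and a "central pivot" is an element with at least n/4 and at most 3n/4 smaller elements in the range. The name findCentralPivot and the use of std::mt19937 are illustrative choices, not the original listing.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Repeatedly picks a random index in [low, high] until the element there
// is a central pivot. Each attempt counts smaller elements in O(n) time,
// and roughly half of the elements qualify, so about 2 attempts are
// expected. Assumes distinct elements (otherwise the loop may not end).
int findCentralPivot(const std::vector<int>& arr, int low, int high,
                     std::mt19937& rng) {
    assert(low >= 0 && low <= high && high < static_cast<int>(arr.size()));
    int n = high - low + 1;
    std::uniform_int_distribution<int> pick(low, high);
    while (true) {
        int candidate = pick(rng);
        int smaller = 0;
        for (int i = low; i <= high; ++i)
            if (arr[i] < arr[candidate])
                ++smaller;
        if (smaller >= n / 4 && smaller <= 3 * n / 4)
            return candidate;   // index of a central pivot
    }
}
```

The returned index can then be handed to any standard partition routine; the loop itself is what contributes the expected O(n) cost of step 2.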
What is the overall time complexity in the worst case?
In the worst case, each partition divides the array such that one side has n/4 elements and the other side has 3n/4 elements. The worst case height of the recursion
tree is Log_{4/3} n, which is O(Log n).
T(n) < T(n/4) + T(3n/4) + O(n)
Each level of the recursion tree does at most O(n) work in total and there are O(Log n) levels, so
the solution of the above recurrence is O(n Log n).
Note that the above randomized algorithm is not the best way to implement randomized Quick Sort. The idea here is to keep the analysis
simple.
Typically, randomized Quick Sort is implemented by randomly picking a pivot (no loop), or by shuffling the array elements. The expected worst case time
complexity of this algorithm is also O(n Log n), but the analysis is complex; the MIT professor mentions the same in his lecture.
Randomized Algorithms | Set 2 (Classification and Applications)
References:
http://www.tcs.tifr.res.in/~workshop/nitrkl_igga/randomized-lecture.pdf
Classification
Randomized algorithms are classified in two categories.
Las Vegas: These algorithms always produce a correct or optimum result. The time complexity of these algorithms depends on a random value and
is evaluated as an expected value. For example, Randomized QuickSort always sorts an input array and the expected worst case time
complexity of QuickSort is O(nLogn).
Monte Carlo: These algorithms produce a correct or optimum result with some probability. They have deterministic running time and it is generally easier
to find out their worst case time complexity. For example, this implementation of Karger's Algorithm produces a minimum cut with probability greater than
or equal to 1/n^2 (n is the number of vertices) and has worst case time complexity O(E).
Example to Understand Classification:
Consider a binary array where exactly half elements are 0
and half are 1. The task is to find index of any 1.
A Las Vegas algorithm for this task is to keep picking a random element until we find a 1. A Monte Carlo algorithm for the same is to keep picking
a random element until we either find 1 or we have tried maximum allowed times say k.
The Las Vegas algorithm always finds an index of 1, but its time complexity is determined as an expected value. The expected number of trials before
success is 2, therefore the expected time complexity is O(1).
The Monte Carlo algorithm finds a 1 with probability [1 - (1/2)^k]. The time complexity of Monte Carlo is O(k), which is deterministic.
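The two strategies for the binary-array example can be sketched as follows. The function names findOneLasVegas and findOneMonteCarlo, the -1 failure value, and the use of std::mt19937 are assumptions made for this illustration.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Las Vegas: keeps sampling until a 1 is found. The answer is always
// correct; only the running time is random. Assumes the array contains
// at least one 1, otherwise the loop never ends.
int findOneLasVegas(const std::vector<int>& bits, std::mt19937& rng) {
    assert(!bits.empty());
    std::uniform_int_distribution<int> pick(0, static_cast<int>(bits.size()) - 1);
    while (true) {
        int i = pick(rng);
        if (bits[i] == 1)
            return i;
    }
}

// Monte Carlo: tries at most k random positions. The running time is
// bounded by O(k), but the search may fail; -1 signals failure.
int findOneMonteCarlo(const std::vector<int>& bits, int k, std::mt19937& rng) {
    assert(!bits.empty() && k >= 0);
    std::uniform_int_distribution<int> pick(0, static_cast<int>(bits.size()) - 1);
    for (int attempt = 0; attempt < k; ++attempt) {
        int i = pick(rng);
        if (bits[i] == 1)
            return i;
    }
    return -1;
}
```

With half of the entries equal to 1, the Monte Carlo version fails with probability (1/2)^k, which matches the success probability stated above.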
A Simple Solution uses a Max-Flow based s-t cut algorithm to find the minimum cut. Consider every pair of vertices as source s and sink t, and call
the minimum s-t cut algorithm to find the s-t cut. Return the minimum of all s-t cuts. The best possible time complexity of this algorithm is O(V^5) for a graph.
[How? There are V^2 possible pairs in total, the s-t cut algorithm for one pair takes O(V*E) time, and E = O(V^2)].
Below is the simple Karger's Algorithm for this purpose. Karger's algorithm below can be implemented in O(E) = O(V^2) time.
1) Initialize contracted graph CG as copy of original graph
2) While there are more than 2 vertices.
   a) Pick a random edge (u, v) in the contracted graph.
   b) Merge (or contract) u and v into a single vertex (update
      the contracted graph).
   c) Remove self-loops
3) Return cut represented by two vertices.
Let the next randomly picked edge be d. We remove this edge and combine vertices (0,1) and 3.
Now the graph has two vertices, so we stop. The number of edges in the resultant graph is the cut produced by Karger's algorithm.
Karger's algorithm is a Monte Carlo algorithm and the cut produced by it may not be minimum. For example, the following diagram shows
that a different order of picking random edges produces a cut of size 3, which is not a min-cut.
Below is C++ implementation of above algorithm. The input graph is represented as a collection of edges and union-find data structure is used to
keep track of components.
// Karger's algorithm to find Minimum Cut in an
// undirected, unweighted and connected graph.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
// a structure to represent a unweighted edge in graph
struct Edge
{
int src, dest;
};
// a structure to represent a connected, undirected
// and unweighted graph as a collection of edges.
struct Graph
{
// V-> Number of vertices, E-> Number of edges
int V, E;
// graph is represented as an array of edges.
// Since the graph is undirected, the edge
// from src to dest is also edge from dest
// to src. Both are counted as 1 edge here.
Edge* edge;
};
// A structure to represent a subset for union-find
struct subset
{
int parent;
int rank;
};
// Function prototypes for union-find (These functions are defined
// after kargerMinCut() )
int find(struct subset subsets[], int i);
void Union(struct subset subsets[], int x, int y);
// A very basic implementation of Karger's randomized
// algorithm for finding the minimum cut. Please note
// that Karger's algorithm is a Monte Carlo Randomized algo
// and the cut returned by the algorithm may not be
// minimum always
int kargerMinCut(struct Graph* graph)
{
// Get data of given graph
int V = graph->V, E = graph->E;
Edge *edge = graph->edge;
// Allocate memory for creating V subsets.
struct subset *subsets = new subset[V];
// Create V subsets with single elements
for (int v = 0; v < V; ++v)
{
subsets[v].parent = v;
subsets[v].rank = 0;
}
// Initially there are V vertices in
// contracted graph
int vertices = V;
// Keep contracting vertices until there are
// 2 vertices.
while (vertices > 2)
{
// Pick a random edge
int i = rand() % E;
// Find vertices (or sets) of two corners
// of current edge
int subset1 = find(subsets, edge[i].src);
int subset2 = find(subsets, edge[i].dest);
// If two corners belong to same subset,
// then no point considering this edge
if (subset1 == subset2)
continue;
// Else contract the edge (or combine the
// corners of edge into one vertex)
else
{
printf("Contracting edge %d-%d\n",
edge[i].src, edge[i].dest);
vertices--;
Union(subsets, subset1, subset2);
}
}
// Now we have two vertices (or subsets) left in
// the contracted graph, so count the edges between
// two components and return the count.
int cutedges = 0;
for (int i=0; i<E; i++)
{
int subset1 = find(subsets, edge[i].src);
int subset2 = find(subsets, edge[i].dest);
if (subset1 != subset2)
cutedges++;
}
return cutedges;
}
// A utility function to find set of an element i
// (uses path compression technique)
int find(struct subset subsets[], int i)
{
// find root and make root as parent of i
// (path compression)
if (subsets[i].parent != i)
subsets[i].parent =
find(subsets, subsets[i].parent);
return subsets[i].parent;
}
// A function that does union of two sets of x and y
// (uses union by rank)
void Union(struct subset subsets[], int x, int y)
{
int xroot = find(subsets, x);
int yroot = find(subsets, y);
// Attach smaller rank tree under root of high
// rank tree (Union by Rank)
if (subsets[xroot].rank < subsets[yroot].rank)
subsets[xroot].parent = yroot;
else if (subsets[xroot].rank > subsets[yroot].rank)
subsets[yroot].parent = xroot;
// If ranks are same, then make one as root and
// increment its rank by one
else
{
subsets[yroot].parent = xroot;
subsets[xroot].rank++;
}
}
// Creates a graph with V vertices and E edges
struct Graph* createGraph(int V, int E)
{
Graph* graph = new Graph;
graph->V = V;
graph->E = E;
graph->edge = new Edge[E];
return graph;
}
// Driver program to test above functions
int main()
{
/* Let us create following unweighted graph
    0------1
    | \    |
    |   \  |
    |     \|
    2------3   */
int V = 4; // Number of vertices in graph
int E = 5; // Number of edges in graph
struct Graph* graph = createGraph(V, E);
// add edge 0-1
graph->edge[0].src = 0;
graph->edge[0].dest = 1;
// add edge 0-2
graph->edge[1].src = 0;
graph->edge[1].dest = 2;
// add edge 0-3
graph->edge[2].src = 0;
graph->edge[2].dest = 3;
// add edge 1-3
graph->edge[3].src = 1;
graph->edge[3].dest = 3;
// add edge 2-3
graph->edge[4].src = 2;
graph->edge[4].dest = 3;
// Use a different seed value for every run.
srand(time(NULL));
printf("\nCut found by Karger's randomized algo is %d\n",
kargerMinCut(graph));
return 0;
}
Output:
Contracting edge 0-2
Contracting edge 0-3
Cut found by Karger's randomized algo is 2
Note that the above program is based on outcome of a random function and may produce different output.
In this post, we have discussed the simple Karger's algorithm and have seen that the algorithm doesn't always produce a min-cut. The above algorithm
produces a min-cut with probability greater than or equal to 1/n^2. See the next post on Analysis and Applications of Karger's Algorithm, where applications,
a proof of this probability and improvements are discussed.
i++;
}
}
swap(&arr[i], &arr[r]);
return i;
}
// Picks a random pivot element between l and r and partitions
// arr[l..r] around the randomly picked element using partition()
int randomPartition(int arr[], int l, int r)
{
int n = r-l+1;
int pivot = rand() % n;
swap(&arr[l + pivot], &arr[r]);
return partition(arr, l, r);
}
// Driver program to test above methods
int main()
{
int arr[] = {12, 3, 5, 7, 4, 19, 26};
int n = sizeof(arr)/sizeof(arr[0]), k = 3;
cout << "K'th smallest element is " << kthSmallest(arr, 0, n-1, k);
return 0;
}
Output:
K'th smallest element is 5
Time Complexity:
The worst case time complexity of the above solution is still O(n^2). In the worst case, the randomized function may always pick a corner element. The
expected time complexity of the above randomized QuickSelect is Θ(n); see the CLRS book or the MIT video lecture for a proof. The assumption in the
analysis is that the random number generator is equally likely to generate any number in the input range.
Sources:
MIT Video Lecture on Order Statistics, Median
Introduction to Algorithms by Clifford Stein, Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest
Reservoir Sampling
Reservoir sampling is a family of randomized algorithms for randomly choosing k samples from a list of n items, where n is either a very large or
unknown number. Typically n is large enough that the list doesn't fit into main memory. An example is a list of search queries in Google and
Facebook.
So we are given a big array (or stream) of numbers (to simplify), and we need to write an efficient function to randomly select k numbers where 1
<= k <= n. Let the input array be stream[].
A simple solution is to create an array reservoir[] of maximum size k. One by one randomly select an item from stream[0..n-1]. If the selected
item is not previously selected, then put it in reservoir[]. To check if an item is previously selected or not, we need to search the item in
reservoir[]. The time complexity of this algorithm will be O(k^2). This can be costly if k is big. Also, this is not efficient if the input is in the form of
a stream.
The problem can be solved in O(n) time, and the solution also suits input in the form of a stream. Following are the steps.
1) Create an array reservoir[0..k-1] and copy the first k items of stream[] to it.
2) Now one by one consider all items from the (k+1)th item to the nth item.
   a) Generate a random number from 0 to i, where i is the index of the current item in stream[]. Let the generated random number be j.
   b) If j is in the range 0 to k-1, replace reservoir[j] with stream[i].
Following is C implementation of the above algorithm.
// An efficient program to randomly select k items from a stream of items
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
// A utility function to print an array
void printArray(int stream[], int n)
{
for (int i = 0; i < n; i++)
printf("%d ", stream[i]);
printf("\n");
}
// A function to randomly select k items from stream[0..n-1].
void selectKItems(int stream[], int n, int k)
{
int i; // index for elements in stream[]
// reservoir[] is the output array. Initialize it with
// first k elements from stream[]
int reservoir[k];
for (i = 0; i < k; i++)
reservoir[i] = stream[i];
// Use a different seed value so that we don't get
// same result each time we run this program
srand(time(NULL));
// Iterate from the (k+1)th element to nth element
for (; i < n; i++)
{
// Pick a random index from 0 to i.
int j = rand() % (i+1);
// If the randomly picked index is smaller than k, then replace
// the element present at the index with new element from stream
if (j < k)
reservoir[j] = stream[i];
}
printf("Following are k randomly selected items \n");
printArray(reservoir, k);
}
// Driver program to test above function.
int main()
{
int stream[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};
int n = sizeof(stream)/sizeof(stream[0]);
int k = 5;
selectKItems(stream, n, k);
return 0;
}
Output:
Following are k randomly selected items
6 2 11 8 12
Let the given array be arr[]. A simple solution is to create an auxiliary array temp[] which is initially a copy of arr[]. Randomly select an element
from temp[], copy the randomly selected element to arr[0] and remove the selected element from temp[]. Repeat the same process n times and
keep copying elements to arr[1], arr[2], and so on. The time complexity of this solution will be O(n^2).
The Fisher-Yates shuffle algorithm works in O(n) time complexity. The assumption here is that we are given a function rand() that generates a random
number in O(1) time.
The idea is to start from the last element and swap it with a randomly selected element from the whole array (including the last). Now consider the array
from 0 to n-2 (size reduced by 1), and repeat the process till we hit the first element.
Following is the detailed algorithm
To shuffle an array a of n elements (indices 0..n-1):
  for i from n - 1 downto 1 do
       j = random integer with 0 <= j <= i
       exchange a[j] and a[i]
Output:
7 8 4 6 3 1 2 5
Note that the above program is based on outcome of a random function and may produce different output.
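The Fisher-Yates pseudocode can be sketched in C++ as follows. This is a minimal illustration that uses std::mt19937 and std::uniform_int_distribution in place of rand(); the function name fisherYatesShuffle is an assumption, and the sketch is not the program that produced the sample output shown earlier.

```cpp
#include <cassert>
#include <random>
#include <utility>
#include <vector>

// Shuffles a in place. Walks from the last index down to 1 and swaps
// a[i] with a[j], where j is chosen uniformly from 0..i (inclusive).
// Runs in O(n) time, assuming each random draw takes O(1).
void fisherYatesShuffle(std::vector<int>& a, std::mt19937& rng) {
    for (int i = static_cast<int>(a.size()) - 1; i >= 1; --i) {
        std::uniform_int_distribution<int> pick(0, i);
        int j = pick(rng);
        std::swap(a[i], a[j]);
    }
}
```

Seeding the generator with a fixed value makes a run reproducible, while seeding it from std::random_device gives a different permutation on each run.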
How does this work?
Let there be N nodes in total in the list. It is easier to understand from the last node.
The probability that the last node is the result is simply 1/N [For the last or Nth node, we generate a random number between 0 and N-1 and make the last node the
result if the generated number is 0 (or any other fixed number)].
The probability that the second last node is the result should also be 1/N.
The probability that the second last node is the result
   = [Probability that the second last node replaces result] X
     [Probability that the last node doesn't replace the result]
   = [1 / (N-1)] * [(N-1)/N]
   = 1/N
Similarly we can show the probability for the 3rd last node and other nodes.
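The selection rule analysed above (reservoir sampling with a reservoir of size 1) can be sketched as below. The Node layout, the function name randomNodeKey and the use of std::mt19937 are assumptions for illustration, since the original listing is not shown here.

```cpp
#include <cassert>
#include <random>

// A minimal singly linked list node (hypothetical layout).
struct Node {
    int key;
    Node* next;
};

// Returns the key of a node chosen uniformly at random in one pass,
// without knowing the list length in advance. When the i-th node is
// visited, it replaces the current result with probability 1/i.
int randomNodeKey(const Node* head, std::mt19937& rng) {
    assert(head != nullptr);
    int result = head->key;   // the first node is kept with probability 1
    int seen = 1;
    for (const Node* cur = head->next; cur != nullptr; cur = cur->next) {
        ++seen;
        std::uniform_int_distribution<int> pick(0, seen - 1);
        if (pick(rng) == 0)   // happens with probability 1/seen
            result = cur->key;
    }
    return result;
}
```

For a list of N nodes, the last node replaces the result with probability 1/N, which is exactly the first case of the argument above.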
You are given an array of sorted words in an arbitrary language, and you need to find the order (or precedence) of characters in the
language. For example, if the given array is {baa, abcd, abca, cab, cad}, then the order of characters is b, d, a, c. Note that the words are sorted and in
the given language baa comes before abcd, therefore b is before a in the output. Similarly we can find other orders.
This can be solved in two steps: first create a graph by processing the given set of words, then do a topological sort of the created
graph.
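The two steps can be sketched as follows. This illustration builds one edge from the first differing character of each adjacent word pair and then uses Kahn's algorithm (a queue-based topological sort) rather than a DFS-based one; the function name alienOrder and the empty-string result for inconsistent input are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Returns the characters in precedence order, or an empty string if the
// word list is inconsistent (the precedence graph contains a cycle).
std::string alienOrder(const std::vector<std::string>& words) {
    std::map<char, std::set<char>> successors;  // u -> v means u precedes v
    std::map<char, int> indegree;
    for (const std::string& w : words)
        for (char c : w)
            indegree.insert({c, 0});            // register every character

    // Step 1: the first mismatch between adjacent words gives one edge.
    for (std::size_t i = 0; i + 1 < words.size(); ++i) {
        const std::string& a = words[i];
        const std::string& b = words[i + 1];
        std::size_t len = a.size() < b.size() ? a.size() : b.size();
        for (std::size_t j = 0; j < len; ++j) {
            if (a[j] != b[j]) {
                if (successors[a[j]].insert(b[j]).second)
                    ++indegree[b[j]];
                break;
            }
        }
    }

    // Step 2: Kahn's algorithm, repeatedly emitting characters with no
    // remaining predecessors.
    std::queue<char> ready;
    for (const auto& entry : indegree)
        if (entry.second == 0)
            ready.push(entry.first);
    std::string order;
    while (!ready.empty()) {
        char u = ready.front();
        ready.pop();
        order.push_back(u);
        for (char v : successors[u])
            if (--indegree[v] == 0)
                ready.push(v);
    }
    return order.size() == indegree.size() ? order : std::string();
}
```

For the example words {baa, abcd, abca, cab, cad}, the edges are b->a, d->a, a->c and b->d, so the function returns "bdac".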
'X',
'X',
'X',
'X',
'O',
'X',
'X',
'O',
'X',
'O',
'X'},
'X'},
'O'},
'X'},
'O'},
1) is the largest
'X',
'O',
'O',
'X',
'X',
'X'},
'X'},
'X'},
'X'},
'O'},
2) is the largest
A Simple Solution is to consider every square submatrix and check whether it has all four boundary edges filled with X. The time complexity of this
solution is O(N^4).
We can solve this problem in O(N^3) time using extra space. The idea is to create two auxiliary arrays hor[N][N] and ver[N][N]. The value
stored in hor[i][j] is the number of horizontal continuous X characters till mat[i][j] in mat[][]. Similarly, the value stored in ver[i][j] is the number of
vertical continuous X characters till mat[i][j] in mat[][]. Following is an example.
mat[6][6] =  X  O  X  X  X  X
             X  O  X  X  O  X
             X  X  X  O  O  X
             O  X  X  X  X  X
             X  X  X  O  X  O
             O  O  X  O  O  O

hor[6][6] =  1  0  1  2  3  4
             1  0  1  2  0  1
             1  2  3  0  0  1
             0  1  2  3  4  5
             1  2  3  0  1  0
             0  0  1  0  0  0

ver[6][6] =  1  0  1  1  1  1
             2  0  2  2  0  2
             3  1  3  0  0  3
             0  2  4  1  1  4
             1  3  5  0  2  0
             0  0  6  0  0  0
Once we have filled values in hor[][] and ver[][], we start from the bottommost-rightmost corner of the matrix and move toward the leftmost-topmost
in a row by row manner. For every visited entry mat[i][j], we compare the values of hor[i][j] and ver[i][j], and pick the smaller of the two as we need a
square. Let the smaller of the two be small. After picking the smaller of the two, we check ver[][] for the left edge and hor[][] for the top edge. If both
have at least small continuous X characters, then we have found a subsquare. Otherwise we try small-1.
Below is C++ implementation of the above idea.
// A C++ program to find the largest subsquare
// surrounded by 'X' in a given matrix of 'O' and 'X'
#include<iostream>
using namespace std;
// Size of given matrix is N X N
#define N 6
// A utility function to find minimum of two numbers
int getMin(int x, int y) { return (x<y)? x: y; }
// Returns size of maximum size subsquare matrix
// surrounded by 'X'
int findSubSquare(int mat[][N])
{
int max = 1; // Initialize result
// Initialize the left-top value in hor[][] and ver[][]
int hor[N][N], ver[N][N];
hor[0][0] = ver[0][0] = (mat[0][0] == 'X');
// Fill values in hor[][] and ver[][]
for (int i=0; i<N; i++)
{
for (int j=0; j<N; j++)
{
if (mat[i][j] == 'O')
ver[i][j] = hor[i][j] = 0;
else
{
hor[i][j] = (j==0)? 1: hor[i][j-1] + 1;
ver[i][j] = (i==0)? 1: ver[i-1][j] + 1;
}
}
}
// Start from the rightmost-bottommost corner element and find
// the largest subsquare with the help of hor[][] and ver[][]
for (int i = N-1; i>=1; i--)
{
for (int j = N-1; j>=1; j--)
{
// Find smaller of values in hor[][] and ver[][]
// A Square can only be made by taking smaller
// value
int small = getMin(hor[i][j], ver[i][j]);
// At this point, we are sure that there is a right
// vertical line and bottom horizontal line of length
// at least 'small'.
// We found a bigger square if following conditions
// are met:
// 1)If side of square is greater than max.
// 2)There is a left vertical line of length >= 'small'
// 3)There is a top horizontal line of length >= 'small'
while (small > max)
{
if (ver[i][j-small+1] >= small &&
hor[i-small+1][j] >= small)
{
max = small;
}
small--;
}
}
}
return max;
}
// Driver program to test above
int main()
{
    int mat[][N] = {{'X', 'O', 'X', 'X', 'X', 'X'},
                    {'X', 'O', 'X', 'X', 'O', 'X'},
                    {'X', 'X', 'X', 'O', 'O', 'X'},
                    {'O', 'X', 'X', 'X', 'X', 'X'},
                    {'X', 'X', 'X', 'O', 'X', 'O'},
                    {'O', 'O', 'X', 'O', 'O', 'O'},
                   };
    cout << findSubSquare(mat);
    return 0;
}
Output:
4
}
}
// Similar to standard partition method. Here we pass the pivot element
// too instead of choosing it inside the method.
private static int partition(char[] arr, int low, int high, char pivot)
{
int i = low;
char temp1, temp2;
for (int j = low; j < high; j++)
{
if (arr[j] < pivot){
temp1 = arr[i];
arr[i] = arr[j];
arr[j] = temp1;
i++;
} else if(arr[j] == pivot){
temp1 = arr[j];
arr[j] = arr[high];
arr[high] = temp1;
j--;
}
}
temp2 = arr[i];
arr[i] = arr[high];
arr[high] = temp2;
// Return the partition index of an array based on the pivot
// element of other array.
return i;
}
}
Output:
Matched nuts and bolts are :
# $ % & @ ^
# $ % & @ ^
Output:
Updated
1 1 1 1
1 1 1 1
1 0 0 1
1 3 3 3
1 1 1 3
1 1 1 3
1 1 1 1
1 1 1 1
References:
http://en.wikipedia.org/wiki/Flood_fill
Output:
Following are conflicting intervals
[3,7] Conflicts with [1,5]
[2,6] Conflicts with [1,5]
[5,6] Conflicts with [3,7]
Note that the above implementation uses simple Binary Search Tree insert operations. Therefore, time complexity of the above implementation is
more than O(nLogn). We can use Red-Black Tree or AVL Tree balancing techniques to make the above implementation O(nLogn).
Question: Given a sentence, validate the given sentence for above given rules.
Algorithm :
1. Check for the corner cases
..1.a) Check if the first character is uppercase or not in the sentence.
..1.b) Check if the last character is a full stop or not.
2. For rest of the string, this problem could be solved by following a state diagram. Please refer to the below state diagram for that.
3. We need to maintain previous and current state of different characters in the string. Based on that we can always validate the sentence of every
character traversed.
A C based implementation is below. (By the way this sentence is also correct according to the rule and code)
C++
// C program to validate a given sentence for a set of rules
#include<stdio.h>
#include<string.h>
#include<stdbool.h>
// Method to check a given sentence for given rules
bool checkSentence(char str[])
{
// Calculate the length of the string.
int len = strlen(str);
// Check that the first character lies in [A-Z].
// Otherwise return false.
if (str[0] < 'A' || str[0] > 'Z')
return false;
//If the last character is not a full stop(.) no
//need to check further.
if (str[len - 1] != '.')
return false;
// Maintain 2 states. Previous and current state based
// on which vertex state you are. Initialise both with
// 0 = start state.
int prev_state = 0, curr_state = 0;
//Keep the index to the next character in the string.
int index = 1;
//Loop to go over the string.
while (str[index])
{
// Set states according to the input characters in the
// string and the rule defined in the description.
// If current character is [A-Z]. Set current state as 0.
if (str[index] >= 'A' && str[index] <= 'Z')
curr_state = 0;
// If current character is a space. Set current state as 1.
else if (str[index] == ' ')
curr_state = 1;
// If current character is [a-z]. Set current state as 2.
else if (str[index] >= 'a' && str[index] <= 'z')
curr_state = 2;
// If current state is a dot(.). Set current state as 3.
else if (str[index] == '.')
curr_state = 3;
// Validates all current state with previous state for the
// rules in the description of the problem.
if (prev_state == curr_state && curr_state != 2)
return false;
if (prev_state == 2 && curr_state == 0)
return false;
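// If we have reached the last state (a full stop) and the
// previous state is not a space, the sentence is valid only
// if the full stop is the last character of the string.
if (curr_state == 3 && prev_state != 1)
return (str[index + 1] == '\0');
index++;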
// Set previous state as current state before going over
// to the next character.
prev_state = curr_state;
}
return false;
}
// Driver program
int main()
{
char *str[] = { "I love cinema.", "The vertex is S.",
"I am single.", "My name is KG.",
"I lovE cinema.", "GeeksQuiz. is a quiz site.",
"I love Geeksquiz and Geeksforgeeks.",
" You are my friend.", "I love cinema" };
int str_size = sizeof(str) / sizeof(str[0]);
int i = 0;
for (i = 0; i < str_size; i++)
checkSentence(str[i])? printf("\"%s\" is correct \n", str[i]):
printf("\"%s\" is incorrect \n", str[i]);
return 0;
}
Time complexity: O(n) in the worst case, as we have to traverse the full sentence, where n is the length of the sentence.
Auxiliary space: O(1)
A Simple Solution is to traverse the array and, for every 0, count the number of 1s on both sides of it. Keep track of the maximum count for any 0.
Finally return the index of the 0 with the maximum number of 1s around it. The time complexity of this solution is O(n^2).
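For comparison with the efficient method, the simple solution can be sketched as below. The function name maxOnesIndexSimple and the choice of returning the first index on ties are assumptions made for this illustration.

```cpp
#include <cassert>
#include <vector>

// For every 0, counts the continuous 1s immediately to its left and
// right, and returns the index of the 0 with the largest total.
// Returns -1 if the array has no 0. Worst case O(n^2) time.
int maxOnesIndexSimple(const std::vector<int>& arr) {
    int n = static_cast<int>(arr.size());
    int best_index = -1;
    int best_count = -1;
    for (int i = 0; i < n; ++i) {
        if (arr[i] != 0)
            continue;
        int count = 0;
        for (int l = i - 1; l >= 0 && arr[l] == 1; --l)
            ++count;   // 1s directly to the left of this 0
        for (int r = i + 1; r < n && arr[r] == 1; ++r)
            ++count;   // 1s directly to the right of this 0
        if (count > best_count) {
            best_count = count;
            best_index = i;
        }
    }
    return best_index;
}
```

On the array {1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1}, the zero at index 9 has three 1s on each side, so the function returns 9.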
Using an Efficient Solution, the problem can be solved in O(n) time. The idea is to keep track of three indexes: the current index (curr), the previous zero
index (prev_zero) and the previous to previous zero index (prev_prev_zero). Traverse the array; if the current element is 0, calculate the difference
between curr and prev_prev_zero (this difference minus one is the number of 1s around prev_zero). If the difference between curr and
prev_prev_zero is more than the maximum so far, then update the maximum. Finally return the index of the prev_zero with the maximum difference.
Following are C++ and Java implementations of the above algorithm.
C++
// C++ program to find Index of 0 to be replaced with 1 to get
// longest continuous sequence of 1s in a binary array
#include<iostream>
using namespace std;
// Returns index of 0 to be replaced with 1 to get longest
// continuous sequence of 1s. If there is no 0 in array, then
// it returns -1.
int maxOnesIndex(bool arr[], int n)
{
int max_count = 0; // for maximum number of 1 around a zero
int max_index; // for storing result
int prev_zero = -1; // index of previous zero
int prev_prev_zero = -1; // index of previous to previous zero
// Traverse the input array
for (int curr=0; curr<n; ++curr)
{
// If current element is 0, then calculate the difference
// between curr and prev_prev_zero
if (arr[curr] == 0)
{
// Update result if count of 1s around prev_zero is more
if (curr - prev_prev_zero > max_count)
{
max_count = curr - prev_prev_zero;
max_index = prev_zero;
}
// Update for next iteration
prev_prev_zero = prev_zero;
prev_zero = curr;
}
}
// Check for the last encountered zero
if (n-prev_prev_zero > max_count)
max_index = prev_zero;
return max_index;
}
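// Driver program (same input as the Java version below)
int main()
{
    bool arr[] = {1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1};
    int n = sizeof(arr)/sizeof(arr[0]);
    cout << "Index of 0 to be replaced is "
         << maxOnesIndex(arr, n);
    return 0;
}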
Java
// Java program to find Index of 0 to be replaced with 1 to get
// longest continuous sequence of 1s in a binary array
import java.io.*;
class Binary
{
// Returns index of 0 to be replaced with 1 to get longest
// continuous sequence of 1s. If there is no 0 in array, then
// it returns -1.
static int maxOnesIndex(int arr[], int n)
{
int max_count = 0; // for maximum number of 1 around a zero
int max_index=0; // for storing result
int prev_zero = -1; // index of previous zero
int prev_prev_zero = -1; // index of previous to previous zero
// Traverse the input array
for (int curr=0; curr<n; ++curr)
{
// If current element is 0, then calculate the difference
// between curr and prev_prev_zero
if (arr[curr] == 0)
{
// Update result if count of 1s around prev_zero is more
if (curr - prev_prev_zero > max_count)
{
max_count = curr - prev_prev_zero;
max_index = prev_zero;
}
// Update for next iteration
prev_prev_zero = prev_zero;
prev_zero = curr;
}
}
// Check for the last encountered zero
if (n-prev_prev_zero > max_count)
max_index = prev_zero;
return max_index;
}
// Driver program to test above function
public static void main(String[] args)
{
int arr[] = {1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1};
int n = arr.length;
System.out.println("Index of 0 to be replaced is "+
maxOnesIndex(arr, n));
}
}
/* This code is contributed by Devesh Agrawal */
Output:
Index of 0 to be replaced is 9
Method 1 (Simple)
Iterate through every element of the first set and search for it in the other set. If any element is found, return false. If no element is found, return true. The time
complexity of this method is O(mn).
Following is C++ implementation of above idea.
// A Simple C++ program to check if two sets are disjoint
#include<iostream>
using namespace std;
// Returns true if set1[] and set2[] are disjoint, else false
bool areDisjoint(int set1[], int set2[], int m, int n)
{
// Take every element of set1[] and search it in set2
for (int i=0; i<m; i++)
for (int j=0; j<n; j++)
if (set1[i] == set2[j])
return false;
// If no element of set1 is present in set2
return true;
}
// Driver program to test above function
int main()
{
int set1[] = {12, 34, 11, 9, 3};
int set2[] = {7, 2, 1, 5};
int m = sizeof(set1)/sizeof(set1[0]);
int n = sizeof(set2)/sizeof(set2[0]);
areDisjoint(set1, set2, m, n)? cout << "Yes" : cout << "No";
return 0;
}
Output:
Yes
Output:
Yes
}
// Driver method to test above method
public static void main (String[] args)
{
int set1[] = {10, 5, 3, 4, 6};
int set2[] = {8, 7, 9, 3};
if (areDisjoint(set1, set2))
System.out.println("Yes");
else
System.out.println("No");
}
}
Output:
No
Time complexity of the above implementation is O(m+n) under the assumption that hash set operations like add() and contains() work in O(1)
time.
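The hashing idea can also be sketched in C++ with std::unordered_set. The function name areDisjointHashed is an assumption for this illustration, and the O(m+n) bound holds only in the average case for hash set operations.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Returns true if set1 and set2 have no element in common.
// Inserts all of set1 into a hash set, then probes with each element of
// set2, giving O(m + n) expected time and O(m) extra space.
bool areDisjointHashed(const std::vector<int>& set1,
                       const std::vector<int>& set2) {
    std::unordered_set<int> seen(set1.begin(), set1.end());
    for (int x : set2)
        if (seen.count(x) > 0)
            return false;   // x occurs in both sets
    return true;
}
```

For the sets {10, 5, 3, 4, 6} and {8, 7, 9, 3}, the element 3 is shared, so the function returns false.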
We can solve the above problem in O(nLogn) time. The idea is to consider all events in sorted order. Once we have all events in sorted order,
we can trace the number of trains at any time by keeping track of trains that have arrived but not departed.
For example consider the above example.
arr[] = {9:00, 9:40, 9:50, 11:00, 15:00, 18:00}
dep[] = {9:10, 12:00, 11:20, 11:30, 19:00, 20:00}
All events sorted by time.
Total platforms at any time can be obtained by subtracting total
departures from total arrivals by that time.
Time     Event Type    Total Platforms Needed at this Time
9:00     Arrival       1
9:10     Departure     0
9:40     Arrival       1
9:50     Arrival       2
11:00    Arrival       3
11:20    Departure     2
11:30    Departure     1
12:00    Departure     0
15:00    Arrival       1
18:00    Arrival       2
19:00    Departure     1
20:00    Departure     0
Minimum Platforms needed on railway station = Maximum platforms needed at any time = 3
Following is C++ implementation of above approach. Note that the implementation doesn't create a single sorted list of all events; rather, it
sorts the arr[] and dep[] arrays individually and then uses the merge process of merge sort to process them together as a single sorted array.
// Program to find minimum number of platforms required on a railway station
#include<iostream>
#include<algorithm>
using namespace std;
// Returns minimum number of platforms required
int findPlatform(int arr[], int dep[], int n)
{
// Sort arrival and departure arrays
sort(arr, arr+n);
sort(dep, dep+n);
// plat_needed indicates number of platforms needed at a time
int plat_needed = 1, result = 1;
int i = 1, j = 0;
// Similar to merge in merge sort to process all events in sorted order
while (i < n && j < n)
{
// If next event in sorted order is arrival, increment count of
// platforms needed
if (arr[i] < dep[j])
{
plat_needed++;
i++;
if (plat_needed > result) // Update result if needed
result = plat_needed;
}
else // Else decrement count of platforms needed
{
plat_needed--;
j++;
}
}
return result;
}
// Driver program to test above function
int main()
{
int arr[] = {900, 940, 950, 1100, 1500, 1800};
int dep[] = {910, 1200, 1120, 1130, 1900, 2000};
int n = sizeof(arr)/sizeof(arr[0]);
cout << "Minimum Number of Platforms Required = "
<< findPlatform(arr, dep, n);
return 0;
}
Output:
Minimum Number of Platforms Required = 3
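For comparison, the same count can be obtained by literally building the single sorted event list that the table describes. The following Python sketch is an illustration under that assumption; the name `find_platforms` and the `(time, delta)` tuple encoding are not from the original.

```python
def find_platforms(arrivals, departures):
    """Return the largest number of trains in the station at any one time."""
    # +1 for an arrival, -1 for a departure. Tuples sort by time first and
    # then by delta, so at equal times a departure (-1) is processed before
    # an arrival (+1), matching the strict arr[i] < dep[j] test above.
    events = [(t, 1) for t in arrivals] + [(t, -1) for t in departures]
    events.sort()

    in_station = 0
    result = 0
    for _, delta in events:
        in_station += delta
        result = max(result, in_station)
    return result


arr = [900, 940, 950, 1100, 1500, 1800]
dep = [910, 1200, 1120, 1130, 1900, 2000]
print("Minimum Number of Platforms Required =", find_platforms(arr, dep))
```

Sorting one list of 2n events costs O(nLogn), the same bound as sorting the two arrays separately; the merge-style loop in the C++ version simply avoids allocating the combined list.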
The important thing to note in the question is that all elements are given to be distinct. If all elements are distinct, then a subarray has contiguous
elements if and only if the difference between the maximum and minimum elements in the subarray is equal to the difference between the last and first indexes
of the subarray. So the idea is to keep track of the minimum and maximum element in every subarray.
The following is C++ implementation of above idea.
#include<iostream>
using namespace std;
// Utility functions to find minimum and maximum of
// two elements
int min(int x, int y) { return (x < y)? x : y; }
int max(int x, int y) { return (x > y)? x : y; }
// Returns length of the longest contiguous subarray
int findLength(int arr[], int n)
{
int max_len = 1; // Initialize result
for (int i=0; i<n-1; i++)
{
// Initialize min and max for all subarrays starting with i
int mn = arr[i], mx = arr[i];
// Consider all subarrays starting with i and ending with j
for (int j=i+1; j<n; j++)
{
// Update min and max in this subarray if needed
mn = min(mn, arr[j]);
mx = max(mx, arr[j]);
// If current subarray has all contiguous elements
if ((mx - mn) == j-i)
max_len = max(max_len, mx-mn+1);
}
}
return max_len; // Return result
}
// Driver program to test above function
int main()
{
int arr[] = {1, 56, 58, 57, 90, 92, 94, 93, 91, 45};
int n = sizeof(arr)/sizeof(arr[0]);
cout << "Length of the longest contiguous subarray is "
<< findLength(arr, n);
return 0;
}
Output:
Length of the longest contiguous subarray is 5
The idea is similar to previous post. In the previous post, we checked whether maximum value minus minimum value is equal to ending index minus
starting index or not. Since duplicate elements are allowed, we also need to check if the subarray contains duplicate elements or not. For example,
the array {12, 14, 12} follows the first property, but numbers in it are not contiguous elements.
To check for duplicate elements in a subarray, we create a hash set for every subarray, and if we find an element already in the hash set, we don't consider the
current subarray.
Following is Java implementation of the above idea.
/* Java program to find length of the largest subarray which has
all contiguous elements */
import java.util.*;
class Main
{
// Returns length of the largest subarray with contiguous elements
static int findLength(int arr[])
{
int n = arr.length;
int max_len = 1; // Initialize result
// One by one fix the starting points
for (int i=0; i<n-1; i++)
{
// Create an empty hash set and add i'th element
// to it.
HashSet<Integer> set = new HashSet<>();
set.add(arr[i]);
// Initialize max and min in current subarray
int mn = arr[i], mx = arr[i];
// One by one fix ending points
for (int j=i+1; j<n; j++)
{
// If current element is already in hash set, then
// this subarray cannot contain contiguous elements
if (set.contains(arr[j]))
break;
// Else add current element to hash set and update
// min, max if required.
set.add(arr[j]);
mn = Math.min(mn, arr[j]);
mx = Math.max(mx, arr[j]);
// We have already checked for duplicates, now check
// for other property and update max_len if needed
if (mx-mn == j-i)
max_len = Math.max(max_len, mx-mn+1);
}
}
return max_len; // Return result
}
// Driver method to test above method
public static void main (String[] args)
{
int arr[] = {10, 12, 12, 10, 10, 11, 10};
System.out.println("Length of the longest contiguous subarray is " +
findLength(arr));
}
}
Output:
Length of the longest contiguous subarray is 2
Time complexity of the above solution is O(n^2) under the assumption that hash set operations like add() and contains() work in O(1) time.
It's a good recursion question. The idea is to create an array of length k. The array stores the current sequence. For every position in the array, we check
the previous element and one by one put all elements greater than the previous element. If there is no previous element (first position), we put all
numbers from 1 to n.
Following is C++ implementation of above idea.
// C++ program to print all increasing sequences of
// length 'k' such that the elements in every sequence
// are from first 'n' natural numbers.
#include<iostream>
using namespace std;
// A utility function to print contents of arr[0..k-1]
void printArr(int arr[], int k)
{
for (int i=0; i<k; i++)
cout << arr[i] << " ";
cout << endl;
}
// A recursive function to print all increasing sequences
// of first n natural numbers. Every sequence should be
// length k. The array arr[] is used to store current
// sequence.
void printSeqUtil(int n, int k, int &len, int arr[])
{
// If length of current increasing sequence becomes k,
// print it
if (len == k)
{
printArr(arr, k);
return;
}
// Decide the starting number to put at current position:
// If length is 0, then there are no previous elements
// in arr[]. So start putting new numbers with 1.
// If length is not 0, then start from value of
// previous element plus 1.
int i = (len == 0)? 1 : arr[len-1] + 1;
// Increase length
len++;
// Put all numbers (which are greater than the previous
// element) at new position.
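while (i<=n)
{
arr[len-1] = i;
printSeqUtil(n, k, len, arr);
i++;
}
// 'len' is passed by reference and shared by all recursive
// calls, so restore it before returning to the caller
len--;
}
// Prints all increasing sequences of length k made of the
// first n natural numbers. It uses printSeqUtil().
void printSeq(int n, int k)
{
int *arr = new int[k]; // Holds the current sequence
int len = 0; // Current length of the sequence
printSeqUtil(n, k, len, arr);
delete[] arr;
}
// Driver program to test above functions. The values n = 7
// and k = 3 reproduce the output listed below.
int main()
{
int k = 3, n = 7;
printSeq(n, k);
return 0;
}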
Output:
1 2 3
1 2 4
1 2 5
1 2 6
1 2 7
1 3 4
1 3 5
1 3 6
1 3 7
1 4 5
1 4 6
1 4 7
1 5 6
1 5 7
1 6 7
2 3 4
2 3 5
2 3 6
2 3 7
2 4 5
2 4 6
2 4 7
2 5 6
2 5 7
2 6 7
3 4 5
3 4 6
3 4 7
3 5 6
3 5 7
3 6 7
4 5 6
4 5 7
4 6 7
5 6 7
The idea is simple: we traverse both strings from one side to the other (say from the rightmost character to the leftmost). If we find a matching character,
we move ahead in both strings. Otherwise, we move ahead only in str2.
Following is Recursive Implementation in C++ and Python of the above idea.
C/C++
// Recursive C++ program to check if a string is subsequence of another string
#include<iostream>
#include<cstring>
using namespace std;
// Returns true if str1[] is a subsequence of str2[]. m is
// length of str1 and n is length of str2
bool isSubSequence(char str1[], char str2[], int m, int n)
{
// Base Cases
if (m == 0) return true;
if (n == 0) return false;
// If last characters of two strings are matching
if (str1[m-1] == str2[n-1])
return isSubSequence(str1, str2, m-1, n-1);
// If last characters are not matching
return isSubSequence(str1, str2, m, n-1);
}
// Driver program to test above function
int main()
{
char str1[] = "gksrek";
char str2[] = "geeksforgeeks";
int m = strlen(str1);
int n = strlen(str2);
isSubSequence(str1, str2, m, n)? cout << "Yes ":
cout << "No";
return 0;
}
Python
# Recursive Python program to check if a string is subsequence
# of another string
# Returns True if string1 is a subsequence of string2. m is
# length of string1 and n is length of string2
def isSubSequence(string1, string2, m, n):
    # Base Cases
    if m == 0:
        return True
    if n == 0:
        return False
    # If last characters of two strings are matching
    if string1[m-1] == string2[n-1]:
        return isSubSequence(string1, string2, m-1, n-1)
    # If last characters are not matching
    return isSubSequence(string1, string2, m, n-1)
# Driver program to test the above function
string1 = "gksrek"
string2 = "geeksforgeeks"
m = len(string1)
n = len(string2)
if isSubSequence(string1, string2, m, n):
    print("Yes")
else:
    print("No")
# This code is contributed by BHAVYA JAIN
Output:
Yes
Time Complexity of both implementations above is O(n) where n is the length of str2.
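The same right-to-left scan can also be written without recursion. The sketch below is an illustrative iterative variant, not part of the original post; the name `is_subsequence` is an assumption. It keeps one index per string and performs at most n iterations, and it avoids the recursion depth limit that the recursive Python version can hit on long inputs.

```python
def is_subsequence(str1, str2):
    """Return True if str1 is a subsequence of str2."""
    i = len(str1) - 1    # next character of str1 still to be matched
    j = len(str2) - 1    # current position in str2
    while i >= 0 and j >= 0:
        if str1[i] == str2[j]:
            i -= 1       # matched: move ahead in str1 as well
        j -= 1           # always move ahead in str2
    return i < 0         # True when every character of str1 was matched


print("Yes" if is_subsequence("gksrek", "geeksforgeeks") else "No")
```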
For example, consider the board shown on the right side (taken from here). The minimum number of dice throws required to reach cell 30 from cell 1 is
3. Following are the steps.
a) First throw a two on the dice to reach cell number 3, and then take the ladder to reach 22.
b) Then throw a 6 to reach 28.
c) Finally, throw a 2 to reach 30.
There can be other solutions as well, like (2, 2, 6), (2, 4, 4), (2, 3, 5), etc.
Following is C++ implementation of the above idea. The input is represented by two things: the first is N, the number of cells in the given board, and the
second is an array move[0..N-1] of size N. An entry move[i] is -1 if there is no snake and no ladder from i; otherwise move[i] contains the index of the
destination cell for the snake or the ladder at i.
// C++ program to find minimum number of dice throws required to
// reach last cell from first cell of a given snake and ladder
// board
#include<iostream>
#include <queue>
using namespace std;
// An entry in queue used in BFS
struct queueEntry
{
int v; // Vertex number
int dist; // Distance of this vertex from source
};
// This function returns minimum number of dice throws required to
// Reach last cell from 0'th cell in a snake and ladder game.
// move[] is an array of size N where N is no. of cells on board
// If there is no snake or ladder from cell i, then move[i] is -1
// Otherwise move[i] contains cell to which snake or ladder at i
// takes to.
int getMinDiceThrows(int move[], int N)
{
// The graph has N vertices. Mark all the vertices as
// not visited
bool *visited = new bool[N];
for (int i = 0; i < N; i++)
visited[i] = false;
// Create a queue for BFS
queue<queueEntry> q;
// Mark the node 0 as visited and enqueue it.
visited[0] = true;
queueEntry s = {0, 0}; // distance of 0'th vertex is also 0
q.push(s); // Enqueue 0'th vertex
// Do a BFS starting from vertex at index 0
queueEntry qe; // A queue entry (qe)
while (!q.empty())
{
qe = q.front();
int v = qe.v; // vertex no. of queue entry
=
=
=
=
0;
8;
3;
6;
cout << "Min Dice throws required is " << getMinDiceThrows(moves, N);
return 0;
}
Output:
Min Dice throws required is 3
Time complexity of the above solution is O(N) as every cell is added and removed only once from queue. And a typical enqueue or dequeue
operation takes O(1) time.
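Because the C++ listing above is incomplete, a compact Python sketch of the same BFS may help. It is an illustration under stated assumptions: the function name `min_dice_throws` is hypothetical, and the board used at the end is not the one from the original figure. It has 30 cells and only the single ladder mentioned in the example (cell 3 to cell 22, that is, index 2 to index 21).

```python
from collections import deque


def min_dice_throws(move, n):
    """Return the minimum number of throws to reach cell n-1 from cell 0,
    or -1 if the last cell cannot be reached."""
    visited = [False] * n
    visited[0] = True
    queue = deque([(0, 0)])              # (cell index, throws used so far)
    while queue:
        cell, dist = queue.popleft()
        if cell == n - 1:
            return dist
        # Try every dice outcome 1..6 that stays on the board.
        for nxt in range(cell + 1, min(cell + 6, n - 1) + 1):
            if visited[nxt]:
                continue
            visited[nxt] = True
            # Follow the snake or ladder if one starts at nxt.
            dest = move[nxt] if move[nxt] != -1 else nxt
            queue.append((dest, dist + 1))
    return -1


# Assumed board: 30 cells, one ladder from index 2 to index 21.
moves = [-1] * 30
moves[2] = 21
print("Min Dice throws required is", min_dice_throws(moves, 30))
```

Each cell is marked visited the first time a throw lands on it, so it is enqueued at most once, which is where the O(N) bound comes from.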
Output:
Total cost for connecting ropes is 29
Time Complexity: Time complexity of the algorithm is O(nLogn) assuming that we use a O(nLogn) sorting algorithm. Note that heap operations
like insert and extract take O(Logn) time.
Algorithmic Paradigm: Greedy Algorithm
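A minimal sketch of the greedy heap idea, using Python's `heapq` module: repeatedly join the two shortest ropes and push the joined rope back. The function name `min_cost` and the sample lengths `[4, 3, 2, 6]` are assumptions; the lengths were chosen because they give the total of 29 reported above.

```python
import heapq


def min_cost(lengths):
    """Return the minimum total cost of joining all ropes into one,
    where joining two ropes costs the sum of their lengths."""
    heap = list(lengths)
    heapq.heapify(heap)                  # build the min heap in O(n)
    total = 0
    while len(heap) > 1:
        first = heapq.heappop(heap)      # shortest rope
        second = heapq.heappop(heap)     # second shortest rope
        total += first + second
        heapq.heappush(heap, first + second)
    return total


print("Total cost for connecting ropes is", min_cost([4, 3, 2, 6]))
```

For the assumed input the joins are 2+3=5, 4+5=9 and 6+9=15, which add up to 29. Each of the n-1 joins does a constant number of O(Logn) heap operations.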
This is mainly an application of Catalan Numbers. The total number of valid expressions for input n is the (n/2)th Catalan Number if n is even and 0 if n is odd.
Following is C++ implementation of the above idea.
// C++ program to find valid parenthesizations of length n
// The majority of code is taken from method 3 of
// http://www.geeksforgeeks.org/program-nth-catalan-number/
#include<iostream>
using namespace std;
// Returns value of Binomial Coefficient C(n, k)
unsigned long int binomialCoeff(unsigned int n, unsigned int k)
{
unsigned long int res = 1;
// Since C(n, k) = C(n, n-k)
if (k > n - k)
k = n - k;
// Calculate value of [n*(n-1)*---*(n-k+1)] / [k*(k-1)*---*1]
for (int i = 0; i < k; ++i)
{
res *= (n - i);
res /= (i + 1);
}
return res;
}
// A Binomial coefficient based function to find nth catalan
// number in O(n) time
unsigned long int catalan(unsigned int n)
{
// Calculate value of 2nCn
unsigned long int c = binomialCoeff(2*n, n);
// return 2nCn/(n+1)
return c/(n+1);
}
// Function to find possible ways to put balanced parenthesis
// in an expression of length n
unsigned long int findWays(unsigned n)
{
// If n is odd, not possible to create any valid parentheses
if (n & 1) return 0;
// Otherwise return n/2'th Catalan Number
return catalan(n/2);
}
// Driver program to test above functions
int main()
{
int n = 6;
cout << "Total possible expressions of length "
<< n << " is " << findWays(n);
return 0;
}
Output:
Total possible expressions of length 6 is 5
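The closed form used above, Catalan(m) = C(2m, m) / (m + 1), is short enough to check directly. The Python sketch below is an illustration; the name `find_ways` is an assumption, and it relies on `math.comb`, which is available from Python 3.8 onward.

```python
from math import comb


def find_ways(n):
    """Return the number of balanced parenthesis expressions of length n."""
    if n % 2 == 1:
        return 0                       # odd length can never be balanced
    m = n // 2                         # number of '(' and ')' pairs
    return comb(2 * m, m) // (m + 1)   # m-th Catalan number


for length in (2, 4, 6, 8):
    print(length, find_ways(length))
```

Python integers do not overflow, so this version stays exact for large n, unlike the fixed-width unsigned long arithmetic in the C++ version.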
Output:
4
Generate all binary permutations such that there are more or equal 1s than 0s before every point in all permutations
Generate all binary permutations of a given length such that every prefix of each permutation has at least as many 1s as 0s.
Examples:
Input: len = 4
Output: 1111 1110 1101 1100 1011 1010
Note that a permutation like 0101 can not be in output because
there are more 0's from index 0 to 2 in this permutation.
Input: len = 3
Output: 111 110 101
Input: len = 2
Output: 11 10
As with other permutation generation problems, recursion is the simplest approach to solve this. We start with an empty string, append a 1 to it and recur.
While recurring, if the current string has more 1s than 0s, we also append a 0 and make one more recursive call.
// C++ program to generate all permutations of 1's and 0's such that
// every prefix of every permutation has at least as many 1's as 0's.
#include <iostream>
#include <string>
using namespace std;
// ones & zeroes --> counts of 1's and 0's in current string 'str'
// len ---> desired length of every permutation
void generate(int ones, int zeroes, string str, int len)
{
// If length of current string becomes same as desired length
if (len == str.length())
{
cout << str << " ";
return;
}
// Append a 1 and recur
generate(ones+1, zeroes, str+"1", len);
// If there are more 1's, append a 0 as well, and recur
if (ones > zeroes)
generate(ones, zeroes+1, str+"0", len);
}
// Driver program to test above function
int main()
{
string str = "";
generate(0, 0, str, 4);
return 0;
}
Output:
1111 1110 1101 1100 1011 1010
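The same recursion can be sketched in Python so that the results are collected in a list instead of printed, which makes them easy to count or check. The names `generate` and `extend` here are illustrative assumptions.

```python
def generate(length):
    """Return all binary strings of the given length in which every
    prefix has at least as many 1s as 0s."""
    results = []

    def extend(prefix, ones, zeroes):
        if len(prefix) == length:
            results.append(prefix)
            return
        # Appending a 1 never breaks the prefix condition.
        extend(prefix + "1", ones + 1, zeroes)
        # A 0 is allowed only if 1s are strictly ahead before it.
        if ones > zeroes:
            extend(prefix + "0", ones, zeroes + 1)

    extend("", 0, 0)
    return results


print(" ".join(generate(4)))
```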
Output:
EEKSFORGEEKSG
EEKSQUIZG
ABBCABDAD
Time complexity of the above solution is O(n^2 Logn) under the assumption that we have used an O(nLogn) sorting algorithm.
This problem can be solved using more efficient methods like Booth's Algorithm, which solves the problem in O(n) time. We will soon be covering
these methods as separate posts.
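A minimal sketch of the sorting-based approach whose complexity is discussed above: build every rotation, sort them, and take the first. The function name `min_lex_rotation` is an assumption, and so are the input strings, which were picked because they produce the three outputs listed above.

```python
def min_lex_rotation(s):
    """Return the lexicographically smallest rotation of s."""
    if not s:
        return s
    # n rotations of length n each; sorting does O(n Log n) comparisons
    # and each comparison can take O(n) time.
    rotations = [s[i:] + s[:i] for i in range(len(s))]
    rotations.sort()
    return rotations[0]


for text in ("GEEKSFORGEEKS", "GEEKSQUIZ", "BCABDADAB"):
    print(min_lex_rotation(text))
```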
C++
#include <iostream>
using namespace std;
// Fills element in arr[] from its pair sum array pair[].
// n is size of arr[]
void constructArr(int arr[], int pair[], int n)
{
arr[0] = (pair[0]+pair[1]-pair[n-1]) / 2;
for (int i=1; i<n; i++)
arr[i] = pair[i-1]-arr[0];
}
// Driver program to test above function
int main()
{
int pair[] = {15, 13, 11, 10, 12, 10, 9, 8, 7, 5};
int n = 5;
int arr[n];
constructArr(arr, pair, n);
for (int i = 0; i < n; i++)
cout << arr[i] << " ";
return 0;
}
Java
import java.io.*;
class PairSum {
// Fills element in arr[] from its pair sum array pair[].
// n is size of arr[]
static void constructArr(int arr[], int pair[], int n)
{
arr[0] = (pair[0]+pair[1]-pair[n-1]) / 2;
for (int i=1; i<n; i++)
arr[i] = pair[i-1]-arr[0];
}
// Driver program to test above function
public static void main(String[] args)
{
int pair[] = {15, 13, 11, 10, 12, 10, 9, 8, 7, 5};
int n = 5;
int[] arr = new int[n];
constructArr(arr, pair, n);
for (int i = 0; i < n; i++)
System.out.print(arr[i]+" ");
}
}
/* This code is contributed by Devesh Agrawal */
Output:
8 7 5 3 2
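The formula for arr[0] follows from three entries of the pair-sum array: pair[0] = arr[0]+arr[1], pair[1] = arr[0]+arr[2] and pair[n-1] = arr[1]+arr[2], so pair[0]+pair[1]-pair[n-1] = 2*arr[0]. The Python sketch below illustrates this with a round-trip check; the helper `pair_sums` is a hypothetical addition used only to rebuild the pair-sum array, and the sketch assumes n is at least 3.

```python
def construct_arr(pair, n):
    """Rebuild the original array of size n (n >= 3) from its pair-sum array."""
    # (a0 + a1) + (a0 + a2) - (a1 + a2) = 2 * a0
    first = (pair[0] + pair[1] - pair[n - 1]) // 2
    # pair[i-1] = a0 + a_i for i = 1..n-1
    return [first] + [pair[i - 1] - first for i in range(1, n)]


def pair_sums(arr):
    """Return sums of all pairs (i, j) with i < j, in row-major order."""
    return [arr[i] + arr[j]
            for i in range(len(arr))
            for j in range(i + 1, len(arr))]


pair = [15, 13, 11, 10, 12, 10, 9, 8, 7, 5]
arr = construct_arr(pair, 5)
print(arr)
print(pair_sums(arr) == pair)
```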
Output:
Value of 1+2*5+3 is 18
Value of 1+2*3 is 9
Value of 4-2+6*3 is 24
1++2 is Invalid
The above code doesn't handle spaces. We can handle spaces by first removing all spaces from the given string. A better solution is to handle
spaces in a single traversal. This is left as an exercise.
Time Complexity is O(n) where n is length of the given expression.
Output:
No
Yes
We can find whether two strings are anagrams or not in linear time using a count array (see method 2 of this).
One simple way to find all anagram pairs is to run two nested loops. The outer loop picks all strings one by one. The inner loop checks
whether the remaining strings are anagrams of the string picked by the outer loop. Following is C++ implementation of this simple approach.
#include <iostream>
#include <string>
using namespace std;
#define NO_OF_CHARS 256
/* function to check whether two strings are anagram of each other */
bool areAnagram(string str1, string str2)
{
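// Create a count array and initialize all values as 0
int count[NO_OF_CHARS] = {0};
int i;
// For each position, increment the count of the character from
// str1 and decrement the count of the character from str2
for (i = 0; str1[i] && str2[i]; i++)
{
// Cast to unsigned char so that characters with negative
// char values do not index before the start of count[]
count[(unsigned char)str1[i]]++;
count[(unsigned char)str2[i]]--;
}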
// If both strings are of different length. Removing this condition
// will make the program fail for strings like "aaca" and "aca"
if (str1[i] || str2[i])
return false;
// See if there is any non-zero value in count array
for (i = 0; i < NO_OF_CHARS; i++)
if (count[i])
return false;
return true;
}
// This function prints all anagram pairs in a given array of strings
void findAllAnagrams(string arr[], int n)
{
for (int i = 0; i < n; i++)
for (int j = i+1; j < n; j++)
if (areAnagram(arr[i], arr[j]))
cout << arr[i] << " is anagram of " << arr[j] << endl;
}
/* Driver program to test findAllAnagrams */
int main()
{
string arr[] = {"geeksquiz", "geeksforgeeks", "abcd",
"forgeeksgeeks", "zuiqkeegs"};
int n = sizeof(arr)/sizeof(arr[0]);
findAllAnagrams(arr, n);
return 0;
}
Output:
geeksquiz is anagram of zuiqkeegs
geeksforgeeks is anagram of forgeeksgeeks
The time complexity of the above solution is O(n^2 * m) where n is the number of strings and m is the maximum length of a string.
Optimizations:
We can optimize the above solution using following approaches.
1) Using sorting: We can sort the array of strings so that all anagrams come together, then print all anagrams by linearly traversing the sorted array.
The time complexity of this solution is O(mnLogn) (we would be doing O(nLogn) comparisons in sorting, and a comparison would take O(m)
time).
2) Using hashing: We can build a hash function like XOR or sum of ASCII values of all characters for a string. Using such a hash function, we
can build a hash table. While building the hash table, we can check if a value is already hashed. If yes, we can call areAnagram() to check if the two
strings are actually anagrams (note that XOR or sum of ASCII values alone is not sufficient, see Kaushik Lele's comment here).
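The hashing idea can be sketched as follows. This is an illustrative Python version, not the original implementation: the name `find_all_anagrams` is an assumption, the sum of character codes plays the role of the weak hash, and a `Counter` comparison stands in for areAnagram() as the exact check on strings that land in the same bucket.

```python
from collections import Counter, defaultdict


def find_all_anagrams(words):
    """Return all (earlier, later) pairs of words that are anagrams."""
    buckets = defaultdict(list)    # hash value -> words seen with that hash
    pairs = []
    for word in words:
        # Weak hash: anagrams always collide, but so can non-anagrams
        # (for example "ad" and "bc"), so an exact check is still needed.
        key = sum(ord(ch) for ch in word)
        for earlier in buckets[key]:
            if Counter(earlier) == Counter(word):
                pairs.append((earlier, word))
        buckets[key].append(word)
    return pairs


words = ["geeksquiz", "geeksforgeeks", "abcd", "forgeeksgeeks", "zuiqkeegs"]
for first, second in find_all_anagrams(words):
    print(first, "is anagram of", second)
```

Only strings with equal hash values are compared, so the exact check runs far less often than in the nested-loop version when few strings collide.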