Sorting
SEQUENTIAL BUBBLESORT (A)
    for i ← 1 to length[A] do
        for j ← length[A] downto i + 1 do
            if A[j] < A[j-1] then
                exchange A[j] ↔ A[j-1]
Clearly, bubble sort makes Θ(n^2) comparisons. The number of comparisons is the same irrespective of the data set, i.e., whether the input is a best case or a worst case.
Memory Requirement
Clearly, bubble sort does not require extra memory.
Implementation
void bubbleSort(int numbers[], int array_size)
{
    int i, j, temp;

    for (i = (array_size - 1); i >= 0; i--) {
        for (j = 1; j <= i; j++) {
            if (numbers[j-1] > numbers[j]) {
                temp = numbers[j-1];
                numbers[j-1] = numbers[j];
                numbers[j] = temp;
            }
        }
    }
}
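A minimal driver, as a sketch of how the sorting routines in this text can be called when compiled together with the bubbleSort definition above (the main function and the sample array are illustrative, not part of the original text):

#include <stdio.h>

int main(void)
{
    int a[] = {5, 1, 4, 2, 8};
    int n = sizeof(a) / sizeof(a[0]);
    int i;

    bubbleSort(a, n);
    for (i = 0; i < n; i++)
        printf("%d ", a[i]);      /* prints: 1 2 4 5 8 */
    printf("\n");
    return 0;
}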
PARALLEL BUBBLE SORT (A)
1.  for k = 0 to n-1
2.      if k is even then
3.          for i = 0 to (n/2)-1 do in parallel
4.              if A[2i] > A[2i+1] then
5.                  exchange A[2i] ↔ A[2i+1]
6.      else
7.          for i = 0 to (n/2)-2 do in parallel
8.              if A[2i+1] > A[2i+2] then
9.                  exchange A[2i+1] ↔ A[2i+2]
10. next k
Parallel Analysis
Steps 1-10 form one big loop that is repeated n times, so the parallel time complexity is O(n). In each iteration, the even-numbered phases use n/2 processors and the odd-numbered phases use (n/2) - 1 processors; therefore the algorithm needs O(n) processors.
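As a rough sketch (not from the original text), the "do in parallel" exchange phases can be expressed in C with OpenMP, assuming an OpenMP-capable compiler and an even number of elements n; the function name oddEvenSort and the pragmas are illustrative:

/* Odd-even (parallel bubble) sort sketch: each phase compares disjoint pairs,
   so the pairs of one phase can be exchanged in parallel. */
void oddEvenSort(int a[], int n)
{
    int k, i, tmp;

    for (k = 0; k < n; k++) {
        if (k % 2 == 0) {
            #pragma omp parallel for private(tmp)
            for (i = 0; i <= n / 2 - 1; i++)            /* even phase: pairs (2i, 2i+1) */
                if (a[2*i] > a[2*i + 1]) {
                    tmp = a[2*i]; a[2*i] = a[2*i + 1]; a[2*i + 1] = tmp;
                }
        } else {
            #pragma omp parallel for private(tmp)
            for (i = 0; i <= n / 2 - 2; i++)            /* odd phase: pairs (2i+1, 2i+2) */
                if (a[2*i + 1] > a[2*i + 2]) {
                    tmp = a[2*i + 1]; a[2*i + 1] = a[2*i + 2]; a[2*i + 2] = tmp;
                }
        }
    }
}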
Divide-and-Conquer Algorithm
Divide-and-conquer is a top-down technique for designing algorithms: divide the problem into smaller subproblems in the hope that the solutions of the subproblems are easier to find, and then compose the partial solutions into a solution of the original problem.
In other words:
1. Break the problem into several subproblems that are similar to the original problem but smaller in size,
2. Solve the subproblems recursively (successively and independently), and then
3. Combine the solutions to the subproblems to create a solution to the original problem.
Problem: Let A[1 . . n] be an array sorted in non-decreasing order; that is, A[i] ≤ A[j] whenever 1 ≤ i ≤ j ≤ n. Let 'q' be the query point. The problem consists of finding 'q' in the array A. If q is not in A, then find the position where 'q' might be inserted.
Sequential Search
Look sequentially at each element of A until either we reach the end of the array A or find an item no smaller than 'q'.
Analysis
This algorithm clearly takes Θ(r) time, where r is the index returned. This is Θ(n) in the worst case and O(1) in the best case. If the elements of the array A are distinct and the query point q is indeed in the array, then the loop executes (n + 1)/2 times on average. On average (as well as in the worst case), sequential search therefore takes Θ(n) time.
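A minimal C sketch of this search (the function name seq_search and the convention of returning the insertion index n + 1 when every element is smaller than q are illustrative assumptions):

/* Return the first index i (1-based, as in the text) such that A[i] >= q,
   or n + 1 if every element of A[1 . . n] is smaller than q. */
int seq_search(const int A[], int n, int q)
{
    int i;

    for (i = 1; i <= n; i++)
        if (A[i] >= q)
            return i;
    return n + 1;
}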
Binary Search
Look for 'q' either in the first half or in the second half of the array A. Compare 'q' to the element in the middle, A[k] where k = ⌊n/2⌋. If q ≤ A[k], then search for 'q' in A[1 . . k]; otherwise search A[k+1 . . n].

Binary search for q in a subarray A[i . . j], with the promise that q belongs to this subarray if it occurs in A at all:

BINARY-SEARCH (A[i . . j], q)
    if i = j then return i
    k ← ⌊(i + j)/2⌋
    if q ≤ A[k]
        then return BINARY-SEARCH (A[i . . k], q)
        else return BINARY-SEARCH (A[k+1 . . j], q)
Analysis
Binary search can be accomplished in logarithmic time in the worst case, i.e., T(n) = Θ(log n). This version of binary search also takes logarithmic time in the best case.
Iterative Version of Binary Search

BINARY-SEARCH (A[1 . . n], q)
    if q > A[n] then return n + 1
    i = 1; j = n
    while i < j do
        k = ⌊(i + j)/2⌋
        if q ≤ A[k] then j = k
        else i = k + 1
    return i (the index)
Analysis
The analysis of the iterative algorithm is identical to that of its recursive counterpart.
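A minimal C sketch of the iterative version above (the function name binary_search and the 1-based indexing convention follow the pseudocode and are illustrative):

/* Return the 1-based position at which q occurs in the sorted array A[1 . . n],
   or the position at which q could be inserted if it is not present. */
int binary_search(const int A[], int n, int q)
{
    int i, j, k;

    if (q > A[n])
        return n + 1;
    i = 1;
    j = n;
    while (i < j) {
        k = (i + j) / 2;          /* floor of the midpoint */
        if (q <= A[k])
            j = k;
        else
            i = k + 1;
    }
    return i;
}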
Dynamic Programming

Dynamic programming takes advantage of the duplication of subproblems: it arranges to solve each subproblem only once, saving the solution (in a table or similar structure) for later use. The underlying idea of dynamic programming is to avoid calculating the same thing twice, usually by keeping a table of known results of subproblems. Unlike divide-and-conquer, which solves the subproblems top-down, dynamic programming is a bottom-up technique.
Bottom-up means:
i. Start with the smallest subproblems.
ii. By combining their solutions, obtain the solutions to subproblems of increasing size,
iii. until we arrive at the solution of the original problem.
The Knapsack Problem

Problem Statement

A thief robbing a store can carry a maximal weight of W in his knapsack. There are n items; the ith item weighs wi and is worth vi dollars. Which items should the thief take?
There are two versions of the problem:

Fractional knapsack problem: The setup is the same, but the thief can take fractions of items, meaning that the items can be broken into smaller pieces, so the thief may decide to carry only a fraction xi of item i, where 0 ≤ xi ≤ 1.

0-1 knapsack problem: The setup is the same, but the items may not be broken into smaller pieces, so the thief may decide either to take an item or to leave it (binary choice), but may not take a fraction of an item.
Fractional knapsack problem: Exhibits the greedy-choice property, so a greedy algorithm exists. Exhibits the optimal-substructure property.

0-1 knapsack problem: Does not exhibit the greedy-choice property, so no greedy algorithm exists. Exhibits the optimal-substructure property; only a dynamic-programming algorithm exists.
The 0-1 knapsack problem exhibits optimal substructure: if i is the highest-numbered item in an optimal solution S for weight limit W, then S' = S - {i} is an optimal solution for weight limit W - wi and items 1, . . . , i - 1, and the value of the solution S is vi plus the value of the subproblem solution S'.
We can express this fact in the following formula: define c[i, w] to be the value of the solution for items 1, 2, . . . , i and maximum weight w. Then

    c[i, w] = 0                                          if i = 0 or w = 0
    c[i, w] = c[i-1, w]                                  if i > 0 and wi > w
    c[i, w] = max{vi + c[i-1, w - wi], c[i-1, w]}        if i > 0 and w - wi ≥ 0
This says that the value of the solution to i items either includes the ith item, in which case it is vi plus a subproblem solution for (i - 1) items and the weight excluding wi, or does not include the ith item, in which case it is a subproblem solution for (i - 1) items and the same weight. That is, if the thief picks item i, the thief gains vi value and can choose from items 1, 2, . . . , i - 1 up to the weight limit w - wi, obtaining c[i - 1, w - wi] additional value. On the other hand, if the thief decides not to take item i, the thief can choose from items 1, 2, . . . , i - 1 up to the weight limit w, obtaining c[i - 1, w] value. The better of these two choices should be made.

For the 0-1 knapsack problem, the above formula for c is similar to the LCS formula: boundary values are 0, and other values are computed from the input and "earlier" values of c. So the 0-1 knapsack algorithm is like the LCS-length algorithm given in CLR for finding a longest common subsequence of two sequences.

The algorithm takes as input the maximum weight W, the number of items n, and the two sequences v = <v1, v2, . . . , vn> and w = <w1, w2, . . . , wn>. It stores the c[i, j] values in a table, that is, a two-dimensional array c[0 . . n, 0 . . W] whose entries are computed in row-major order. That is, the first row of c is filled in from left to right, then the second row, and so on. At the end of the computation, c[n, W] contains the maximum value the thief can put into the knapsack.
DYNAMIC-0-1-KNAPSACK (v, w, n, W)
    for w = 0 to W do
        c[0, w] = 0
    for i = 1 to n do
        c[i, 0] = 0
        for w = 1 to W do
            if wi ≤ w then
                if vi + c[i-1, w - wi] > c[i-1, w] then
                    c[i, w] = vi + c[i-1, w - wi]
                else
                    c[i, w] = c[i-1, w]
            else
                c[i, w] = c[i-1, w]
The set of items to take can be deduced from the table, starting at c[n, W] and tracing backwards where the optimal values came from. If c[i, w] = c[i-1, w], item i is not part of the solution, and we continue tracing with c[i-1, w]. Otherwise item i is part of the solution, and we continue tracing with c[i-1, w - wi].
Analysis
This dynamic-0-1-knapsack algorithm takes Θ(nW) time, broken up as follows: Θ(nW) time to fill the c-table, which has (n + 1)·(W + 1) entries, each requiring Θ(1) time to compute, and O(n) time to trace the solution, because the tracing process starts in row n of the table and moves up one row at each step.
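A minimal C sketch of the table fill and trace-back described above, under illustrative assumptions (the fixed bounds MAX_N and MAX_W, the 1-based item indexing, and the example data in main are not from the original text):

#include <stdio.h>

#define MAX_N 100
#define MAX_W 1000

/* v[1..n] are values, w[1..n] are weights, W is the capacity.
   take[i] is set to 1 for items that belong to the chosen set.
   Returns the maximum total value c[n][W]. */
int knapsack(const int v[], const int w[], int n, int W, int take[])
{
    static int c[MAX_N + 1][MAX_W + 1];
    int i, j;

    for (j = 0; j <= W; j++) c[0][j] = 0;               /* no items: value 0    */
    for (i = 1; i <= n; i++) {
        c[i][0] = 0;                                    /* no capacity: value 0 */
        for (j = 1; j <= W; j++) {
            if (w[i] <= j && v[i] + c[i-1][j - w[i]] > c[i-1][j])
                c[i][j] = v[i] + c[i-1][j - w[i]];      /* item i is taken      */
            else
                c[i][j] = c[i-1][j];                    /* item i is skipped    */
        }
    }

    /* Trace back from c[n][W] to recover the chosen items. */
    for (i = n, j = W; i > 0; i--) {
        if (c[i][j] != c[i-1][j]) { take[i] = 1; j -= w[i]; }
        else take[i] = 0;
    }
    return c[n][W];
}

int main(void)
{
    int v[] = {0, 60, 100, 120};     /* index 0 unused */
    int w[] = {0, 10, 20, 30};
    int take[4];
    int best = knapsack(v, w, 3, 50, take);

    printf("best value = %d\n", best);   /* prints 220: items 2 and 3 */
    return 0;
}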
Activity Selection Problem

Problem Statement

Given a set S of n activities, with si the start time and fi the finish time of the ith activity, find a maximum-size set of mutually compatible activities.
Compatible Activities
Activities i and j are compatible if the half-open intervals [si, fi) and [sj, fj) do not overlap, that is, i and j are compatible if si ≥ fj or sj ≥ fi.
Dynamic-Programming Algorithm
The finishing times are in a sorted array f[i] and the starting times are in an array s[i]. The array m[i] will store the value mi, where mi is the size of the largest set of mutually compatible activities among activities {1, 2, . . . , i}. Let BINARY-SEARCH(f, s) return the index of a number i in the sorted array f such that f[i] ≤ s < f[i + 1].
m[0] = 0
for i = 1 to n do
    m[i] = max(m[i-1], 1 + m[BINARY-SEARCH(f, s[i])])
We have P[i] = 1 if activity i is in the optimal selection, and P[i] = 0 otherwise:
i = n
while i > 0 do
    if m[i] = m[i-1] then
        P[i] = 0
        i = i - 1
    else
        P[i] = 1
        i = BINARY-SEARCH(f, s[i])
Analysis
The running time of this algorithm is O(n lg n), because each binary search takes lg n time, as opposed to the O(n) running time of the greedy algorithm. The greedy algorithm assumes that the activities are already sorted by increasing finishing time.
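A minimal C sketch of the dynamic-programming recurrence above, assuming the activities are already sorted by finishing time; the function names latest_compatible and max_compatible, the sentinel f[0] = 0, the fixed table size, and the example data are illustrative assumptions:

#include <stdio.h>

/* Largest index i in 1..n with f[i] <= s, or 0 if none (f[0] acts as a sentinel). */
static int latest_compatible(const int f[], int n, int s)
{
    int lo = 0, hi = n, mid;

    while (lo < hi) {
        mid = (lo + hi + 1) / 2;      /* upper middle, so the loop always shrinks */
        if (f[mid] <= s)
            lo = mid;
        else
            hi = mid - 1;
    }
    return lo;
}

/* m[i] = size of a largest compatible subset of activities 1..i. */
int max_compatible(const int s[], const int f[], int n)
{
    int m[64];                        /* assume n < 64 for this sketch */
    int i;

    m[0] = 0;
    for (i = 1; i <= n; i++) {
        int j = latest_compatible(f, i - 1, s[i]);
        int with_i = 1 + m[j];
        m[i] = (m[i - 1] > with_i) ? m[i - 1] : with_i;
    }
    return m[n];
}

int main(void)
{
    /* activities sorted by finish time; index 0 unused, f[0] = 0 is the sentinel */
    int s[] = {0, 1, 3, 0, 5, 3, 5};
    int f[] = {0, 4, 5, 6, 7, 8, 9};

    printf("%d\n", max_compatible(s, f, 6));   /* prints 2 for this data */
    return 0;
}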
Heap Sort
The binary heap data structure is an array that can be viewed as a complete binary tree. Each node of the binary tree corresponds to an element of the array. The array is completely filled on all levels except possibly the lowest.
We represent heaps in level order, going from left to right. The array corresponding to the heap above is [25, 13, 17, 5, 8, 3].
The root of the tree is A[1], and given the index i of a node, the indices of its parent, left child and right child can be computed as follows:

PARENT (i)  return ⌊i/2⌋
LEFT (i)    return 2i
RIGHT (i)   return 2i + 1
Let's try these out on a heap to make sure we believe they are correct. Take this heap,
which is represented by the array [20, 14, 17, 8, 6, 9, 4, 1]. We'll go from the 20 to the 6 first. The index of the 20 is 1. To find the index of the left child, we calculate 1 * 2 = 2. This takes us (correctly) to the 14. Now, we go right, so we calculate 2 * 2 + 1 = 5. This takes us (again, correctly) to the 6. Now let's try going from the 4 to the 20. 4's index is 7. We want to go to the parent, so we calculate 7 / 2 = 3, which takes us to the 17. Now, to get 17's parent, we calculate 3 / 2 = 1, which takes us to the 20.
Heap Property
In a (max-)heap, for every node i other than the root, the value of the node is at most the value of its parent; that is, A[PARENT(i)] ≥ A[i].
By the definition of a heap, all the tree levels are completely filled except possibly the lowest level, which is filled from the left up to a point. Clearly a heap of height h has the minimum number of elements when it has just one node at the lowest level. The levels above the lowest level form a complete binary tree of height h - 1 with 2^h - 1 nodes. Hence the minimum number of nodes possible in a heap of height h is 2^h. Clearly a heap of height h has the maximum number of elements when its lowest level is completely filled. In this case the heap is a complete binary tree of height h and hence has 2^(h+1) - 1 nodes.
The following is not a heap: although it has the heap property, it is not a complete binary tree. Recall that to be complete, a binary tree has to fill up all of its levels, with the possible exception of the last one, which must be filled in from the left side.
Height of a node
We define the height of a node in a tree to be the number of edges on the longest simple downward path from the node to a leaf.
Height of a tree
The height of a tree is the number of edges on the longest simple downward path from the root to a leaf. Note that the height of a tree with n nodes is ⌊lg n⌋, which is Θ(lg n). This implies that an n-element heap has height ⌊lg n⌋.
In order to show this, let the height of the n-element heap be h. From the bounds obtained above on the maximum and minimum number of elements in a heap, we get

    2^h ≤ n ≤ 2^(h+1) - 1

where n is the number of elements in the heap. Hence

    2^h ≤ n < 2^(h+1)

Taking logarithms to the base 2,

    h ≤ lg n < h + 1

It follows that h = ⌊lg n⌋.
We know from the above that the largest element resides in the root, A[1]. The natural question to ask is: where in a heap might the smallest element reside? Consider any path from the root of the tree to a leaf. Because of the heap property, as we follow that path the elements are either decreasing or staying the same. If it happens that all elements in the heap are distinct, then the above implies that the smallest element is in a leaf of the tree. It could also be that an entire subtree of the heap consists of the smallest element, or indeed that there is only one element in the heap, which is then the smallest element, so the smallest element is everywhere. Note that anything below the smallest element must equal the smallest element, so in general, only entire subtrees of the heap can contain the smallest element.
Inserting an Element into the Heap

Let's suppose we want to add a node with key 15 to the heap. First, we add the node to the tree at the next spot available at the lowest level of the tree. This is to ensure that the tree remains complete. We then compare the new node to its parent; if the new key is larger, we swap the two.
Now we do the same thing again, comparing the new node to its parent. Since 14 < 15, we have to do another swap. Finally we compare the new node with its new parent; since 15 < 20, the heap property holds and we stop, leaving the new key in place below the 20.
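A minimal C sketch of this insertion ("sift up") procedure, assuming a max-heap stored 1-based in A[1 . . heap_size] with room for one more element; the function name heap_insert is illustrative:

void heap_insert(int A[], int *heap_size, int key)
{
    int i;

    (*heap_size)++;
    i = *heap_size;              /* next free spot at the lowest level */
    A[i] = key;

    /* Sift up: while the new key is larger than its parent, swap them. */
    while (i > 1 && A[i / 2] < A[i]) {
        int tmp = A[i / 2];
        A[i / 2] = A[i];
        A[i] = tmp;
        i = i / 2;               /* continue from the parent's position */
    }
}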
Heapify

When Heapify (A, i) is called, the binary trees rooted at the left and right children of node i are assumed to be heaps, but A[i] itself may be smaller than its children, i.e., A[i] < A[2i] or A[i] < A[2i+1], violating the heap property. The procedure 'Heapify' manipulates the tree rooted at A[i] so that it becomes a heap. In other words, 'Heapify' lets the value at A[i] "float down" in the heap so that the subtree rooted at index i becomes a heap.
Heapify (A, i)
1.  l ← LEFT (i)
2.  r ← RIGHT (i)
3.  if l ≤ heap-size[A] and A[l] > A[i]
4.      then largest ← l
5.      else largest ← i
6.  if r ≤ heap-size[A] and A[r] > A[largest]
7.      then largest ← r
8.  if largest ≠ i
9.      then exchange A[i] ↔ A[largest]
10.          Heapify (A, largest)
Analysis
If we put a value at the root that is less than every value in the left and right subtrees, then 'Heapify' will be called recursively until a leaf is reached. To make the recursive calls traverse the longest path to a leaf, choose values that make 'Heapify' always recurse on the left child. It follows the left branch when the left child is greater than or equal to the right child, so putting 0 at the root and 1 at all other nodes, for example, will accomplish this task. With such values, 'Heapify' will be called h times, where h is the heap height, so its running time will be Θ(h) (since each call does Θ(1) work), which is Θ(lg n). Since we have a case in which Heapify's running time is Ω(lg n), its worst-case running time is Θ(lg n).
Example of Heapify
Suppose we have a complete binary tree somewhere whose subtrees are heaps. In the following complete binary tree, the subtrees of 6 are heaps:
The Heapify procedure alters the heap so that the tree rooted at 6's position is a heap. Here's how it works. First, we look at the root of our tree and its two children.
We then determine which of the three nodes is the greatest. If it is the root, we are done, because we have a heap. If not, we exchange the appropriate child with the root, and continue recursively down the tree. In this case, we exchange 6 and 8, and continue.
Building a Heap
We can use the procedure 'Heapify' in a bottom-up fashion to convert an array A[1 . . n] into a heap. Since the elements in the subarray A[⌊n/2⌋ + 1 . . n] are all leaves, the procedure BUILD_HEAP goes through the remaining nodes of the tree and runs 'Heapify' on each one. The bottom-up order of processing nodes guarantees that the subtrees rooted at the children of a node are heaps before 'Heapify' is run on that node.
BUILD_HEAP (A)
1.  heap-size[A] ← length[A]
2.  for i ← ⌊length[A]/2⌋ downto 1 do
3.      Heapify (A, i)
HEAPSORT (A)
1.  BUILD_HEAP (A)
2.  for i ← length[A] downto 2 do
3.      exchange A[1] ↔ A[i]
4.      heap-size[A] ← heap-size[A] - 1
5.      Heapify (A, 1)
Analysis

The HEAPSORT procedure takes time O(n lg n), since the call to BUILD_HEAP takes time O(n) and each of the n - 1 calls to Heapify takes time O(lg n).
Now we show that there are at most ⌈n/2^(h+1)⌉ nodes of height h in any n-element heap. We need two observations to show this. The first is that the subtrees rooted at nodes of height h are disjoint; in other words, we cannot have two nodes of height h with one being an ancestor of the other. The second is that all of these subtrees are complete binary trees except possibly for one. Let Xh be the number of nodes of height h. Since Xh - 1 of these subtrees are full, they each contain exactly 2^(h+1) - 1 nodes. One of the height-h subtrees may not be full, but it contains at least one node at its lowest level and so has at least 1 + 2 + 4 + . . . + 2^(h-1) + 1 = 2^h nodes. The remaining nodes have height strictly more than h; to connect all the subtrees rooted at nodes of height h, there must be at least Xh - 1 such nodes. The total number of nodes, which is at most n, is therefore at least (Xh - 1)(2^(h+1) - 1) + 2^h + Xh - 1. Simplifying gives Xh ≤ n/2^(h+1) + 1/2.
In conclusion, it is a property of complete binary trees that the number of nodes at the lowest level is about half of the total number of nodes. The number of leaves in a binary heap is n/2 when n, the total number of nodes in the tree, is even, and ⌈n/2⌉ when n is odd. If these leaves are removed, the number of new leaves will be about n/4. If this process is continued for h levels, the number of leaves at that level will be ⌈n/2^(h+1)⌉.
Implementation
void siftDown(int numbers[], int root, int bottom);

void heapSort(int numbers[], int array_size)
{
    int i, temp;

    /* Build a max-heap by sifting down every internal node. */
    for (i = (array_size / 2) - 1; i >= 0; i--)
        siftDown(numbers, i, array_size - 1);

    /* Repeatedly move the maximum to the end and restore the heap. */
    for (i = array_size - 1; i >= 1; i--) {
        temp = numbers[0];
        numbers[0] = numbers[i];
        numbers[i] = temp;
        siftDown(numbers, 0, i - 1);
    }
}

void siftDown(int numbers[], int root, int bottom)
{
    int maxChild, temp;

    /* bottom is the index of the last element still inside the heap. */
    while (root * 2 + 1 <= bottom) {
        maxChild = root * 2 + 1;                         /* left child (0-based)  */
        if (maxChild + 1 <= bottom && numbers[maxChild + 1] > numbers[maxChild])
            maxChild = maxChild + 1;                     /* right child is larger */

        if (numbers[root] < numbers[maxChild]) {
            temp = numbers[root];
            numbers[root] = numbers[maxChild];
            numbers[maxChild] = temp;
            root = maxChild;                             /* keep sifting down     */
        } else {
            break;                                       /* heap property holds   */
        }
    }
}
Insertion Sort
If the first few objects are already sorted, an unsorted object can be inserted into the sorted set in its proper place. This is called insertion sort. The algorithm considers the elements one at a time, inserting each into its suitable place among those already considered (keeping them sorted). Insertion sort is an example of an incremental algorithm; it builds the sorted sequence one number at a time.
INSERTION_SORT (A)
1.  for j = 2 to length[A] do
2.      key = A[j]
3.      {Put A[j] into the sorted sequence A[1 . . j-1]}
4.      i = j - 1
5.      while i > 0 and A[i] > key do
6.          A[i+1] = A[i]
7.          i = i - 1
8.      A[i+1] = key
Analysis
Best-Case: The while-loop in line 5 is executed only once for each j. This happens if the given array A is already sorted.
T(n) = an + b = O(n)
It is a linear function of n.
Worst-Case: The worst case occurs when line 5 is executed j times for each j. This can happen if the array A starts out in reverse order; T(n) is then a quadratic function of n, i.e., T(n) = O(n^2).
Stability
Since multiple keys with the same value are placed in the sorted array in the same order that they appear in the input array, Insertion sort is stable.
Extra Memory
This algorithm does not require extra memory.
For insertion sort we say that the worst-case running time is Θ(n^2) and the best-case running time is Θ(n). Insertion sort uses no extra memory: it sorts in place. The running time of insertion sort depends on the original order of the input. It takes Θ(n^2) time in the worst case, despite the fact that time on the order of n is sufficient to solve large instances in which the items are already sorted.
Implementation

void insertionSort(int numbers[], int array_size)
{
    int i, j, index;

    for (i = 1; i < array_size; i++) {
        index = numbers[i];
        j = i;
        while ((j > 0) && (numbers[j-1] > index)) {
            numbers[j] = numbers[j-1];
            j = j - 1;
        }
        numbers[j] = index;
    }
}
Merge Sort
We can visualize merge sort by means of a binary tree where each internal node of the tree represents a recursive call and each external node represents an individual element of the given array A. Such a tree is called a merge-sort tree. The heart of the merge-sort algorithm is the conquer step, which merges two sorted sequences into a single sorted sequence.

To begin, suppose that we have two sorted arrays A1[1], A1[2], . . . , A1[m] and A2[1], A2[2], . . . , A2[n]. The following is a direct algorithm for the obvious strategy of successively choosing the smallest remaining element from A1 and A2 and putting it in A.
MERGE (A1, A2, A)
    i ← 1; j ← 1
    A1[m+1] ← ∞; A2[n+1] ← ∞      (sentinels, e.g. INT_MAX)
    for k ← 1 to m + n do
        if A1[i] < A2[j]
            then A[k] ← A1[i]
                 i ← i + 1
            else A[k] ← A2[j]
                 j ← j + 1
MERGE_SORT (A)
    if n = 1 then return
    A1[1 . . ⌈n/2⌉] ← A[1 . . ⌈n/2⌉]
    A2[1 . . ⌊n/2⌋] ← A[⌈n/2⌉ + 1 . . n]
    MERGE_SORT (A1)
    MERGE_SORT (A2)
    MERGE (A1, A2, A)
Analysis
Let T(n) be the time taken by this algorithm to sort an array of n elements. Dividing A into the subarrays A1 and A2 takes linear time, and it is easy to see that MERGE (A1, A2, A) also takes linear time. Consequently,

    T(n) = T(⌈n/2⌉) + T(⌊n/2⌋) + Θ(n)

whose solution is T(n) = Θ(n lg n), which is asymptotically optimal for comparison sorts. Like heap sort, merge sort has a guaranteed n lg n running time. Merge sort requires Θ(n) extra space; MERGE is not an in-place algorithm. The only known ways to merge in-place (without any extra space) are too complex to be reduced to a practical program.
Implementation
void mergeSort(int numbers[], int temp[], int array_size)
{
    m_sort(numbers, temp, 0, array_size - 1);
}

void m_sort(int numbers[], int temp[], int left, int right)
{
    int mid;

    if (right > left) {
        mid = (right + left) / 2;
        m_sort(numbers, temp, left, mid);
        m_sort(numbers, temp, mid + 1, right);
        merge(numbers, temp, left, mid + 1, right);
    }
}
void merge(int numbers[], int temp[], int left, int mid, int right)
{
    int i, left_end, num_elements, tmp_pos;

    left_end = mid - 1;
    tmp_pos = left;
    num_elements = right - left + 1;

    while ((left <= left_end) && (mid <= right)) {
        if (numbers[left] <= numbers[mid]) {
            temp[tmp_pos] = numbers[left];
            tmp_pos = tmp_pos + 1;
            left = left + 1;
        } else {
            temp[tmp_pos] = numbers[mid];
            tmp_pos = tmp_pos + 1;
            mid = mid + 1;
        }
    }

    while (left <= left_end) {
        temp[tmp_pos] = numbers[left];
        left = left + 1;
        tmp_pos = tmp_pos + 1;
    }

    while (mid <= right) {
        temp[tmp_pos] = numbers[mid];
        mid = mid + 1;
        tmp_pos = tmp_pos + 1;
    }

    /* Copy the merged run from temp back into numbers. */
    for (i = 0; i < num_elements; i++) {
        numbers[right] = temp[right];
        right = right - 1;
    }
}
Quick Sort
The basic version of the quick sort algorithm was invented by C. A. R. Hoare in 1960 and formally introduced in 1962. It is based on the principle of divide-and-conquer. Quick sort is the algorithm of choice in many situations because it is not difficult to implement, it is a good "general purpose" sort, and it consumes relatively few resources during execution.
Good points
It is in-place, since it uses only a small auxiliary stack.
It requires only n log(n) time on average to sort n items.
It has an extremely short inner loop.
The algorithm has been subjected to a thorough mathematical analysis, so very precise statements can be made about performance issues.
Bad Points
It is recursive; if recursion is not available, the implementation is extremely complicated.
It requires quadratic (i.e., n^2) time in the worst case.
It is fragile, i.e., a simple mistake in the implementation can go unnoticed and cause it to perform badly.
Quick sort works by partitioning a given array A[p . . r] into two non-empty subarrays A[p . . q] and A[q+1 . . r] such that every key in A[p . . q] is less than or equal to every key in A[q+1 . . r]. Then the two subarrays are sorted by recursive calls to quick sort. The exact position of the partition depends on the given array, and the index q is computed as a part of the partitioning procedure.
QuickSort
1.  if p < r then
2.      q ← PARTITION (A, p, r)
3.      Recursive call to QUICKSORT (A, p, q)
4.      Recursive call to QUICKSORT (A, q + 1, r)

Note that to sort the entire array, the initial call is QUICKSORT (A, 1, length[A]).
As a first step, quick sort chooses one of the items in the array to be sorted as the pivot. The array is then partitioned on either side of the pivot: elements that are less than or equal to the pivot move toward the left and elements that are greater than or equal to the pivot move toward the right.
PARTITION (A, p, r)
1.  x ← A[p]
2.  i ← p - 1
3.  j ← r + 1
4.  while TRUE do
5.      repeat j ← j - 1
6.      until A[j] ≤ x
7.      repeat i ← i + 1
8.      until A[i] ≥ x
9.      if i < j
10.         then exchange A[i] ↔ A[j]
11.         else return j
PARTITION selects the first key, A[p], as the pivot key about which the array will be partitioned:

Keys ≤ A[p] are moved towards the left.
Keys ≥ A[p] are moved towards the right.

The running time of the partition procedure is Θ(n), where n is the number of keys in the array.
Another argument that the running time of PARTITION on a subarray of size n is Θ(n) is as follows: pointer i and pointer j start at the two ends of the subarray and move towards each other, converging somewhere in the middle. The total number of times that i can be incremented and j can be decremented is therefore O(n). Associated with each increment or decrement there are O(1) comparisons and swaps. Hence, the total time is O(n).

If all the elements have the same value, each pass of the repeat loops moves i and j towards the middle by one position. They meet in the middle, so q = ⌊(p + r)/2⌋. Therefore, when all elements in the array A[p . . r] have the same value, PARTITION returns q = ⌊(p + r)/2⌋.
Best Case
The best thing that could happen in quick sort would be that each partitioning stage divides the array exactly in half. In other words, the best case is for the pivot to be the median of the keys in A[p . . r] every time the procedure PARTITION is called, so that PARTITION always splits the array into two equal-sized pieces. If the procedure PARTITION produces two regions of size n/2, the recurrence is then

    T(n) = 2T(n/2) + Θ(n)

whose solution is T(n) = Θ(n lg n).
Worst Case

The worst case occurs when the array is already sorted: the PARTITION (A, p, r) call then always returns p, so successive calls to PARTITION split arrays of length n, n-1, n-2, . . . , 2, and the running time is proportional to n + (n-1) + (n-2) + . . . + 2 = [(n+2)(n-1)]/2 = Θ(n^2). The worst case also occurs if A[1 . . n] starts out in reverse order.
Randomized Quick Sort

In this version we choose a random key as the pivot. Assume that the procedure RANDOM (a, b) returns a random integer in the range [a, b]; there are b - a + 1 integers in the range, and the procedure is equally likely to return any one of them. The new partition procedure simply performs a swap before actually partitioning.
RANDOMIZED_PARTITION (A, p, r)
    i ← RANDOM (p, r)
    exchange A[p] ↔ A[i]
    return PARTITION (A, p, r)
Now randomized quick sort calls the above procedure in place of PARTITION:

RANDOMIZED_QUICKSORT (A, p, r)
    if p < r then
        q ← RANDOMIZED_PARTITION (A, p, r)
        RANDOMIZED_QUICKSORT (A, p, q)
        RANDOMIZED_QUICKSORT (A, q+1, r)
Like other randomized algorithms, RANDOMIZED_QUICKSORT has the property that no particular input elicits its worst-case behavior; the behavior of the algorithm depends only on the random-number generator. Even intentionally, we cannot produce a bad input for RANDOMIZED_QUICKSORT unless we can predict what the random-number generator will produce next.
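A minimal C sketch of random pivot selection for an array-based quick sort such as the implementation given later in this text, which reads its pivot from numbers[left]; the function name randomize_pivot is illustrative, and rand() is assumed to have been seeded elsewhere (e.g. with srand()):

#include <stdlib.h>

/* Swap a uniformly chosen element of numbers[left..right] into the pivot position. */
void randomize_pivot(int numbers[], int left, int right)
{
    int i = left + rand() % (right - left + 1);   /* random index in [left, right] */
    int tmp = numbers[left];

    numbers[left] = numbers[i];
    numbers[i] = tmp;
}

Calling randomize_pivot(numbers, left, right) at the start of q_sort, before the pivot is read from numbers[left], yields the randomized variant.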
Worst-case Analysis

Let T(n) be the worst-case time for QUICKSORT on an input of size n. We have the recurrence

    T(n) = max_{1≤q≤n-1} (T(q) + T(n - q)) + Θ(n)    --------- (1)

where q runs from 1 to n-1, since the partition produces two regions, each having size at least 1. Now we guess that T(n) ≤ cn^2 for some constant c. Substituting this guess into (1), we get

    T(n) ≤ max_{1≤q≤n-1} (cq^2 + c(n - q)^2) + Θ(n) = c · max_{1≤q≤n-1} (q^2 + (n - q)^2) + Θ(n)
Since the second derivative of the expression q^2 + (n - q)^2 with respect to q is positive, the expression achieves its maximum over the range 1 ≤ q ≤ n - 1 at one of the endpoints. This gives the bound

    max_{1≤q≤n-1} (q^2 + (n - q)^2) ≤ 1 + (n - 1)^2 = n^2 - 2(n - 1)

Continuing with our bounding of T(n), we get

    T(n) ≤ cn^2 - 2c(n - 1) + Θ(n) ≤ cn^2

provided we pick the constant c large enough that the 2c(n - 1) term dominates the Θ(n) term.
Thus the worst-case running time of quick sort is Θ(n^2).
Average-case Analysis
If the split induced by RANDOMIZED_PARTITION puts a constant fraction of the elements on one side of the partition, then the recurrence tree has depth Θ(lg n) and Θ(n) work is performed at each of these Θ(lg n) levels. This is an intuitive argument for why the average-case running time of RANDOMIZED_QUICKSORT is Θ(n lg n).
Let T(n) denote the average time required to sort an array of n elements. A call to RANDOMIZED_QUICKSORT with a one-element array takes constant time, so we have T(1) = Θ(1). After the split, RANDOMIZED_QUICKSORT calls itself to sort two subarrays. The average time to sort an array A[1 . . q] is T(q) and the average time to sort an array A[q+1 . . n] is T(n - q). We have

    T(n) = 1/n (T(1) + T(n-1) + Σ_{q=1}^{n-1} (T(q) + T(n - q))) + Θ(n)    ---- (1)
We know from the worst-case analysis that T(1) = Θ(1) and T(n - 1) = O(n^2), so

    T(n) = 1/n (Θ(1) + O(n^2)) + 1/n Σ_{q=1}^{n-1} (T(q) + T(n - q)) + Θ(n)
         = 1/n Σ_{q=1}^{n-1} (T(q) + T(n - q)) + Θ(n)    ------- (2)
         = 1/n [2 Σ_{k=1}^{n-1} T(k)] + Θ(n)
         = 2/n Σ_{k=1}^{n-1} T(k) + Θ(n)    -------- (3)
We solve this recurrence using the substitution method. Assume inductively that T(k) ≤ ak lg k + b for all k < n, for some constants a > 0 and b > 0 to be determined. Substituting this into (3),

    T(n) ≤ 2/n Σ_{k=1}^{n-1} (ak lg k + b) + Θ(n)
         = 2a/n Σ_{k=1}^{n-1} k lg k + 2b/n (n - 1) + Θ(n)    ------- (4)

Using the bound Σ_{k=1}^{n-1} k lg k ≤ (1/2) n^2 lg n - (1/8) n^2, we get from (4)

    T(n) ≤ 2a/n [(1/2) n^2 lg n - (1/8) n^2] + 2b/n (n - 1) + Θ(n)
         ≤ an lg n - an/4 + 2b + Θ(n)    ---------- (5)

In equation (5), the term Θ(n) + b is linear in n, and we can certainly choose 'a' large enough so that an/4 dominates Θ(n) + b. Then T(n) ≤ an lg n + b, which completes the induction. We conclude that QUICKSORT's average running time is Θ(n lg n).
Conclusion
Quick sort is an in-place sorting algorithm whose worst-case running time is Θ(n^2) and whose expected running time is Θ(n lg n), where the constants hidden in Θ(n lg n) are small.
Implementation
void quickSort(int numbers[], int array_size)
{
    q_sort(numbers, 0, array_size - 1);
}

void q_sort(int numbers[], int left, int right)
{
    int pivot, l_hold, r_hold;

    l_hold = left;
    r_hold = right;
    pivot = numbers[left];
    while (left < right) {
        while ((numbers[right] >= pivot) && (left < right))
            right--;
        if (left != right) {
            numbers[left] = numbers[right];
            left++;
        }
        while ((numbers[left] <= pivot) && (left < right))
            left++;
        if (left != right) {
            numbers[right] = numbers[left];
            right--;
        }
    }
    numbers[left] = pivot;
    pivot = left;
    left = l_hold;
    right = r_hold;
    if (left < pivot)
        q_sort(numbers, left, pivot - 1);
    if (right > pivot)
        q_sort(numbers, pivot + 1, right);
}
Selection Sort
This type of sorting is called "Selection Sort" because it works by repeatedly selecting the smallest remaining element.
It works as follows: first find the smallest in the array and exchange it with the element in the first position, then find the second smallest element and exchange it with the element in the second position, and continue in this way until the entire array is sorted.
SELECTION_SORT (A)
    for i ← 1 to n-1 do
        min_j ← i; min_x ← A[i]
        for j ← i + 1 to n do
            if A[j] < min_x then
                min_j ← j
                min_x ← A[j]
        A[min_j] ← A[i]
        A[i] ← min_x
Selection sort is among the simplest of sorting techniques, and it works very well for small files. Furthermore, despite its evidently naïve approach, selection sort has a quite important application: because each item is actually moved at most once, selection sort is a method of choice for sorting files with very large objects (records) and small keys.
The worst case occurs if the array is already sorted in descending order. Nonetheless, the time required by the selection sort algorithm is not very sensitive to the original order of the array to be sorted: the test "if A[j] < min_x" is executed exactly the same number of times in every case. The variation in time is only due to the number of times the "then" part (i.e., min_j ← j; min_x ← A[j]) of this test is executed.

Selection sort spends most of its time trying to find the minimum element in the "unsorted" part of the array. This clearly shows the similarity between selection sort and bubble sort: bubble sort "selects" the maximum remaining element at each stage, but wastes some effort imparting some order to the "unsorted" part of the array. Selection sort is quadratic in both the worst and the average case, and requires no extra memory. For each i from 1 to n - 1, there is one exchange and n - i comparisons, so there is a total of n - 1 exchanges and (n - 1) + (n - 2) + . . . + 2 + 1 = n(n - 1)/2 comparisons. These observations hold no matter what the input data is. In the worst case the number of "then"-part executions could be quadratic, but in the average case it is O(n log n). It implies that the running time of selection sort is quite insensitive to the input.
Implementation
void selectionSort(int numbers[], int array_size)
{
    int i, j;
    int min, temp;

    for (i = 0; i < array_size - 1; i++) {
        min = i;
        for (j = i + 1; j < array_size; j++) {
            if (numbers[j] < numbers[min])
                min = j;
        }
        temp = numbers[i];
        numbers[i] = numbers[min];
        numbers[min] = temp;
    }
}
Sorting
The objective of the sorting algorithm is to rearrange the records so that their keys are
ordered according to some well-defined ordering rule.
Problem: Given an array of n real numbers A[1 . . n]. Objective: Sort the elements of A in ascending order of their values.
Internal Sort
If the file to be sorted will fit into memory or equivalently if it will fit into an array, then the sorting method is called internal. In this method, any record can be accessed easily.
External Sort
Sorting files from tape or disk. In this method, an external sort algorithm must access records sequentially, or at least in blocks.
Memory Requirement
1. Algorithms that sort in place and use no extra memory except perhaps for a small stack or table.
2. Algorithms that use a linked-list representation and so use n extra words of memory for list pointers.
3. Algorithms that need enough extra memory space to hold another copy of the array to be sorted.
Stability
A sorting algorithm is called stable if it preserves the relative order of equal keys in the file. Most of the simple algorithms are stable, but most of the well-known sophisticated algorithms are not.
There are two classes of sorting algorithms, namely O(n^2)-algorithms and O(n log n)-algorithms. The O(n^2) class includes bubble sort, insertion sort, selection sort and shell sort. The O(n log n) class includes heap sort, merge sort and quick sort.
Now we show that any comparison-based sorting algorithm has an Ω(n log n) worst-case lower bound on its running time: if comparison is the only operation we may use to gain order information, then this is the best we can do. Note that in a comparison sort, we use only comparisons between elements to gain information about an input sequence <a1, a2, . . . , an>. That is, given two elements ai and aj, we perform one of the tests ai < aj, ai ≤ aj, ai = aj, ai ≥ aj or ai > aj to determine their relative order. Since the tests ai ≥ aj, ai ≤ aj, ai > aj and ai < aj all yield equivalent information, we may assume that all comparisons have the form ai ≤ aj.
A comparison sort can be modeled by a decision tree, in which each internal node corresponds to one comparison and each leaf corresponds to one possible ordering of the input. In general, there are n! possible permutations of the n input elements, so the decision tree must have at least n! leaves.
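For example, for n = 3 there are 3! = 6 possible orderings, so any decision tree must have at least 6 leaves and therefore height at least ⌈lg 6⌉ = 3; that is, at least 3 comparisons are needed in the worst case to sort 3 elements.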
Theorem: The running time of any comparison-based algorithm for sorting an n-element sequence is Ω(n lg n) in the worst case.
Examples of comparison-based algorithms (in CLR) are insertion sort, selection sort, merge sort, quicksort, heapsort, and treesort.
Proof: Consider a decision tree of height h that sorts n elements. Since there are n! permutations of n elements, the tree must have at least n! leaves. A binary tree of height h has at most 2^h leaves, so

    n! ≤ 2^h

Taking logarithms on both sides,

    lg(n!) ≤ h,  i.e.,  h ≥ lg(n!)

By Stirling's approximation,

    n! > (n/e)^n,  where e = 2.71828 . . .

Therefore

    h ≥ lg((n/e)^n) = n lg n - n lg e

which is Ω(n lg n).