Quick Sort Notes
The quick sort algorithm was developed by computer scientist Tony Hoare in 1959.
It quickly became popular due to its efficiency and simplicity, and it remains one
of the most widely used sorting algorithms.
Efficiency: Quick sort is often the best choice for sorting because it
operates efficiently in O(n log n) average time.
The partition process divides the array around a pivot value so that:
All values in the left part are less than the pivot.
All values in the right part are greater than the pivot.
After the partition, the pivot will be placed in the correct position in the sorted
order. Now we have a situation similar to a sorted array with one significant
difference: the left and right halves are still unsorted! If we observe further, both
unsorted halves are smaller subproblems of the sorting problem.
If we sort both halves recursively using the same function, the entire array will
get sorted! The idea is simple: All values on the left side are less than the
pivot, the pivot is at the correct position, and all values on the right side are
greater than the pivot.
Combine Part: This is a trivial case because after sorting both smaller arrays, the
entire array will be sorted. In other words, there is no need to write code for the
combine part.
Divide part
We define the partition(X, l, r) function to divide the array around the pivot
and return the pivotIndex. We will select the pivot value inside the partition
function. The critical question is: Why are we choosing the pivot value inside the
partition function? Think and explore!
pivotIndex = partition(X, l, r)
Conquer part
We recursively sort the left subarray by calling the same function with l
and pivotIndex - 1 as the left and right ends, i.e., quickSort(X, l, pivotIndex - 1).
We recursively sort the right subarray by calling the same function with
pivotIndex + 1 and r as the left and right ends,
i.e., quickSort(X, pivotIndex + 1, r).
Base case
In recursive algorithms, the base case is the smallest version of the problem where
recursion will terminate. So, in the case of quicksort, the base case occurs when the
sub-array size is either 0 (empty) or 1. The critical question is: When do we
reach the empty subarray scenario?
Think and explore!
In other words, l >= r is the condition for the base case in quicksort. At this
point, no further partitioning is needed, and the recursion will terminate.
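Putting the divide, conquer, and base-case parts together, here is a minimal sketch of the recursive structure in C. The partition function is assumed to follow the scheme described next; the names X, l, r, and pivotIndex match the notes above.

int partition(int X[], int l, int r);   /* defined in the next section */

/* Recursive quick sort: sorts X[l..r] in place. */
void quickSort(int X[], int l, int r)
{
    /* Base case: l >= r means the subarray size is 0 or 1, already sorted. */
    if (l < r)
    {
        /* Divide: partition around a pivot; the pivot lands at pivotIndex. */
        int pivotIndex = partition(X, l, r);
        /* Conquer: recursively sort the two unsorted parts. */
        quickSort(X, l, pivotIndex - 1);
        quickSort(X, pivotIndex + 1, r);
        /* Combine: nothing to do; the array is sorted in place. */
    }
}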
By the end of the partition loop, values in the subarray X[l to i] will be less than
the pivot, and values in the subarray X[i + 1 to r - 1] will be greater than the
pivot.
Now we need to place the pivot at its correct position, i.e., i + 1. For this,
we swap X[r] with X[i + 1] and return the position of the pivot, i.e., i + 1.
Pseudocode of Quick sort partition algorithm
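The following is a sketch of this process in C (the Lomuto-style partition described above, with the rightmost element X[r] assumed as the pivot):

int partition(int X[], int l, int r)
{
    int pivot = X[r];   /* rightmost element chosen as the pivot */
    int i = l - 1;      /* boundary of the less-than-pivot region */
    for (int j = l; j < r; j = j + 1)
    {
        /* If X[j] belongs to the left part, grow that region and swap X[j] into it. */
        if (X[j] < pivot)
        {
            i = i + 1;
            int temp = X[i];
            X[i] = X[j];
            X[j] = temp;
        }
    }
    /* Place the pivot at its correct sorted position, i + 1. */
    int temp = X[i + 1];
    X[i + 1] = X[r];
    X[r] = temp;
    return i + 1;
}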
Does the above code work correctly when values are repeated? If not, then
how to make it work for repeated values?
By the end of the loop, why is the correct position of the pivot i + 1?
Can you think of implementing the partition process using some other idea?
We are running a single loop and doing constant operations at each iteration. In the
worst or best case, we are doing one comparison at each iteration. So time
complexity = O(n). We are using constant extra space, so space complexity =
O(1). Note: Here, whether a swap happens at each iteration depends on the
comparison if (X[j] < pivot).
Quick sort algorithm visualization
Divide part: Time complexity of the divide part is equal to the time complexity
of the partition algorithm, which is O(n).
Conquer part: Sorting the two subarrays takes T(i) + T(n - i - 1) time, where i is
the number of elements in the left subarray. So:
T(n) = O(n) + T(i) + T(n - i - 1) + O(1)
     = T(i) + T(n - i - 1) + O(n)
     = T(i) + T(n - i - 1) + cn
When we always choose the rightmost element as the pivot, the worst-case
scenario will arise when the array is already sorted in either increasing or
decreasing order. In this case, each recursive call will create an unbalanced
partition.
For calculating the time complexity in the worst case, we put i = n - 1 in the
above formula of T(n).
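Putting i = n - 1 and noting that T(0) = O(1), the recurrence becomes:
T(n) = T(n - 1) + T(0) + cn = T(n - 1) + cn
Unrolling it once per level gives T(n) = cn + c(n - 1) + c(n - 2) + ... + c, which is exactly the level-by-level cost computed by the recursion tree method below.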
The recursion tree method is one of the popular techniques for recursion time
complexity analysis. Here we draw a tree structure of recursive calls and highlight
the extra cost associated with each recursive call. To get the overall time
complexity, we add the cost level by level.
We sum the total partitioning cost for each level => cn + c(n - 1) + c(n - 2)
+ ... + 2c + c = c(n + (n - 1) + (n - 2) + ... + 2 + 1) = c[n(n + 1)/2] = O(n^2).
Best-case analysis of quick sort
The best-case scenario of quick sort will occur when the partition process
always picks the median element as the pivot. In other words, this is a case of
balanced partition, where both sub-arrays are approximately n/2 in size.
T(n) = 2T(n/2) + cn
This recurrence is similar to the recurrence for merge sort, for which the solution
is O(n log n). So the best-case time complexity of quicksort = O(n log n). Note:
We can solve this using the recursion tree method or the master method.
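As a quick cross-check with the master method (standard form T(n) = aT(n/b) + f(n)): here a = 2, b = 2, and f(n) = cn = Theta(n^(log2 2)) = Theta(n), which matches case 2 of the master theorem and gives T(n) = Theta(n log n).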
Average-case analysis of quick sort
Suppose all permutations of input are equally likely. Now, when we run the
quicksort algorithm on random input, partitioning is highly unlikely to happen
in the same way at each level of recursion. The idea is simple: The behaviour of
quicksort will depend on the relative order of values in the input.
Here some of the splits will be reasonably well balanced and some of the splits
will be pretty unbalanced. So the partition process will generate a mix of good
(balanced partition) and bad (unbalanced partition) splits in the average case.
These good and bad splits will be distributed randomly throughout the recursion
tree.
For a better understanding of the analysis, suppose good and bad splits appear at
alternate levels of the tree.
Suppose, at the root, the partition generates a bad split, and at the next level,
the partition generates a good split. The cost of the partition process will be O(n)
at both levels. Therefore, the combined partitioning cost of the bad split
followed by the good split is O(n).
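To see this concretely (a hypothetical extreme case): a bad split of n elements into subproblems of size 0 and n - 1 costs cn, and a good split of the resulting n - 1 elements into two parts of about (n - 1)/2 each costs c(n - 1). The combined partitioning cost is cn + c(n - 1) < 2cn = O(n), and the result is two subproblems of roughly n/2 size, just as a single good split would produce, only with a slightly larger constant.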
For better visualization, let’s assume that the partition algorithm always
produces a partially unbalanced split in the ratio of 9 to 1. The recurrence
relation for this would be: T(n) = T(9n/10) + T(n/10) + cn.
Image source: CLRS Book.
We can notice the following things from the above diagram:
The left subtree is decreasing fast with a factor of 1/10. So the depth of the left
subtree is equal to log10(n).
The right subtree is decreasing slowly with a factor of 9/10. So the depth of the
right subtree is equal to log10/9(n). Note: By the change of base formula,
log10/9(n) = log(n)/log(10/9) = O(log n).
At each level of recursion, the cost of partition is at most cn. After doing the
sum of cost at each level of the recursion tree, quick sort cost is O(n log n).
In the best-case scenario, the partition will always be balanced, resulting in two
recursive calls at each level of recursion. In such a scenario, the generated
recursion tree will be balanced in nature. So, the height of the recursion tree will
be O(log n), and recursion will require a call stack of size O(log n). Therefore,
the best-case space complexity of quicksort is O(log n).
We can modify the partition algorithm and separate input values into three
groups: values less than the pivot, values equal to the pivot, and values greater
than the pivot. Values equal to the pivot are already sorted, so only the less-than
and greater-than partitions need to be recursively sorted. Because of this, the
modified partition algorithm will return two indices, leftIndex and rightIndex.
void quickSort(X[], l, r)
{
if (l < r)
{
[leftIndex, rightIndex] = partition(X, l, r)
quickSort(X, l, leftIndex - 1)
quickSort(X, rightIndex + 1, r)
}
}
Critical question: How can we implement the partition algorithm for this idea? One possible approach is sketched below.
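The following is one possible sketch in C (not necessarily the intended answer): the Dutch national flag scheme, which maintains three regions and returns the boundaries of the equal-to-pivot block. The names leftIndex and rightIndex match the quickSort code above; since C cannot return two values directly, they are assumed to be output parameters here.

void threeWayPartition(int X[], int l, int r, int *leftIndex, int *rightIndex)
{
    int pivot = X[r];
    int lt = l;     /* X[l..lt-1] holds values less than the pivot */
    int gt = r;     /* X[gt+1..r] holds values greater than the pivot */
    int i = l;      /* X[lt..i-1] equals the pivot; X[i..gt] is unexamined */
    while (i <= gt)
    {
        if (X[i] < pivot)
        {
            int temp = X[lt]; X[lt] = X[i]; X[i] = temp;
            lt = lt + 1;
            i = i + 1;
        }
        else if (X[i] > pivot)
        {
            int temp = X[gt]; X[gt] = X[i]; X[i] = temp;
            gt = gt - 1;    /* do not advance i: the swapped-in value is unexamined */
        }
        else
        {
            i = i + 1;      /* X[i] equals the pivot; leave it in place */
        }
    }
    *leftIndex = lt;    /* first index of the equal-to-pivot block */
    *rightIndex = gt;   /* last index of the equal-to-pivot block */
}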
In the above implementation, we have chosen the rightmost element as the pivot.
This can result in a worst-case situation when the input array is sorted. The best
idea would be to choose a random pivot that minimizes the chances of a worst-case
at each level of recursion. Selecting the median element as the pivot can also
be acceptable in the majority of cases.
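A minimal sketch of random pivot selection in C, assuming the rightmost-pivot partition above (rand() from <stdlib.h> is used for illustration; seed it once with srand() at program start):

#include <stdlib.h>

int randomizedPartition(int X[], int l, int r)
{
    /* Pick a random index in [l, r] and move that element to X[r],
       so the existing rightmost-pivot partition can be reused. */
    int randomIndex = l + rand() % (r - l + 1);
    int temp = X[randomIndex];
    X[randomIndex] = X[r];
    X[r] = temp;
    return partition(X, l, r);
}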
Pseudocode snippet for median-of-three pivot selection
mid = l + (r - l) / 2
if (X[mid] < X[l])
    swap(X[l], X[mid])
if (X[r] < X[l])
    swap(X[l], X[r])
if (X[r] < X[mid])
    swap(X[mid], X[r])
// X[mid] now holds the median of X[l], X[mid], and X[r]; move it to X[r]
// so that the rightmost-pivot partition can be used unchanged.
swap(X[mid], X[r])