Sorting and Searching II

The document explains the Quicksort algorithm, which involves selecting a pivot element, partitioning the array into elements less than and greater than the pivot, and recursively sorting the partitions. It discusses the importance of choosing an effective pivot to optimize performance, as poor pivot choices can lead to worst-case scenarios with O(n^2) complexity. Additionally, it introduces Bin Sort (Bucket Sort) as a linear time sorting algorithm applicable under specific conditions and outlines basic searching techniques like linear and binary search.


Quicksort: idea

Merge sort splits the array into sub-lists and sorts them

The larger problem is split into two sub-problems
based on location in the array

Consider the following alternative:
– Choose an object in the array and partition the
remaining objects into two groups relative to the
chosen object => Quicksort
– The chosen object is called the pivot
Quicksort: idea (2)
1. Choose a pivot element in the list.
2. Partition the array into two parts.
3. One part contains the elements smaller than the pivot; the other
contains the elements larger than the pivot.
4. Recursively sort and merge the two partitions.
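
As a first illustration, here is a minimal out-of-place sketch in Python (the name quicksort_simple is ours, not from the slides); it builds new lists for the two groups, whereas the in-place partitioning described on the following slides needs no extra space:

def quicksort_simple(items):
    # Base case: lists of length 0 or 1 are already sorted.
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    pivot = items[mid]                            # e.g., take the middle entry as the pivot
    rest = items[:mid] + items[mid + 1:]
    smaller = [x for x in rest if x <= pivot]     # elements not greater than the pivot
    larger = [x for x in rest if x > pivot]       # elements greater than the pivot
    # The "merge" is just concatenation: sorted smaller part, pivot, sorted larger part.
    return quicksort_simple(smaller) + [pivot] + quicksort_simple(larger)

# Example: quicksort_simple([80, 38, 95, 84, 66, 10, 79, 44]) -> [10, 38, 44, 66, 79, 80, 84, 95]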

How do we merge?

We do not need any additional space for merging, so the
additional storage is O(1)
Quicksort
For example, given

80 38 95 84 66 10 79 44 26 87 96 12 43 81 3

we can select the middle entry, 44, and partition the
remaining entries into two groups: those less than 44
and those greater than 44:

38 10 26 12 43 3 44 80 95 84 66 79 87 96 81

Notice that 44 is now in the location it would occupy if the
list were sorted
– Proceed by applying the algorithm to the first six
and last eight entries
Worst-case scenario
Moral of the story: We would like the partitions to be
as equal as possible.
This depends on the choice of the pivot => Quicksort
running time depends on the pivot
Equal partitions are assured if we can choose our
pivot to be the element that lies in the middle of the
sorted order, i.e., the median.

It is not easy to find the median, so in practice
pivots are often chosen randomly.

Worst case complexity?


Worst-case scenario
Suppose our pivot turns out to be the smallest entry in the list,
as happens, for example, when we always choose the first element
and the list is already sorted:

80 38 95 84 66 10 79 2 26 87 96 12 43 81 3

Using 2 (the smallest entry) as the pivot, we partition into

2 80 38 95 84 66 10 79 26 87 96 12 43 81 3

One partition is empty, and we still have to sort a list of size n – 1

The run time is T(n) = T(n – 1) + Θ(n) = Θ(n²)
– Thus, the run time degrades from n log(n) to n²
Worst-case scenario

Our goal is to choose the median element in the list as our pivot:

80 38 95 84 66 10 79 2 26 87 96 12 43 81 3

Unfortunately, the median is difficult to find, so the pivot is
usually chosen randomly.

Alternate strategy: take the median of a subset of entries
– For example, take the median of the first, middle, and last
entries
Median-of-three
It is difficult to find the true median, so consider another
strategy:
– Choose the median of the first, middle, and last
entries in the list

This will usually give a better approximation of the
actual median
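
A minimal Python sketch of this strategy (the helper name median_of_three is ours):

def median_of_three(A, left, right):
    # Return the index of the median of A[left], A[middle], A[right].
    middle = (left + right) // 2
    candidates = [(A[left], left), (A[middle], middle), (A[right], right)]
    candidates.sort()                  # only three entries, so this is O(1) work
    return candidates[1][1]            # index of the middle value

# Example: for [8, 1, 4, 9, 6, 3, 5, 2, 7, 0], first = 8, middle = 6, last = 0,
# so the median is 6 and median_of_three returns index 4.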
Partitioning process
Interchange the pivot with the last element
Have a pointer at the first element (P1), and one at the second last
element (P2).

Move P1 to the right skipping elements which are less than the pivot.
Move P2 to the left skipping elements which are more than the pivot.

Stop P1 when we encounter an element greater than or equal to the pivot.
Stop P2 when we encounter an element less than or equal to the pivot.
Partitioning process (2)
Interchange the elements pointed to by P1 and P2.
If P1 is right of P2, stop, otherwise move P1 and P2 as before
till we stop again.

When we stop, swap the element at P1 with the last element, which is the pivot

8 1 4 9 6 3 5 2 7 0

First = 8 Last = 0, Median = 6, Pivot = 6

8 1 4 9 0 3 5 2 7 6
P1 P2
8 1 4 9 0 3 5 2 7 6
P1 P2

2 1 4 9 0 3 5 8 7 6
P1 P2

2 1 4 9 0 3 5 8 7 6
P1 P2

2 1 4 5 0 3 9 8 7 6
P2 P1

2 1 4 5 0 3 6 8 7 9

Partition 1 Partition 2
At any time can you say anything about the
elements to the left of P1?
Elements to the left of P1 are less than or
equal to the pivot.

Also, what about the elements to the right of P2?

Elements to the right of P2 are greater than or equal
to the pivot.

When P1 and P2 cross, what can you say about the
elements in between P1 and P2?
They are all equal to the pivot.
Suppose P1 and P2 have crossed and stopped, and
the pivot is interchanged with the element at P1.
How do we form the partition?

Everything from P1 to the right is in one partition
(the greater-than-or-equal partition, with the pivot at P1).
The remaining elements form the left partition.
Procedure Summary
Partition the array
Sort the partitions recursively.

8 1 4 9 6 3 5 2 7 0

2 1 4 5 0 3 6 8 7 9
Partition 1 Partition 2

0 1 2 3 4 5 6 7 8 9
Partition 1 Sorted Partition 2 Sorted
Do we need to do anything more? No, the merge is automatic
Pseudocode
Quicksort(A, left, right)
{ If (left >= right) return;         /* base case: 0 or 1 element */
Find pivot;                          /* e.g., median-of-three */
Interchange pivot and A[right];
P1 = left; P2 = right - 1;
newP1 = Partition(A, P1, P2, pivot);
Interchange A[newP1] and A[right];   /* pivot moves to its final position */
Quicksort(A, left, newP1 - 1);
Quicksort(A, newP1 + 1, right); }

Partition(A, P1, P2, pivot)
{
While (P1 <= P2)
{
While (A[P1] < pivot) increment P1;
While (P2 >= P1 and A[P2] > pivot) decrement P2;
If (P1 >= P2) break;
Swap A[P1] and A[P2];
increment P1; decrement P2;
}
newP1 = P1;
return(newP1);
}
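
A runnable Python sketch of the same scheme (names are ours; median_of_three is the helper sketched earlier):

def partition(A, left, right, pivot):
    # A[right] holds the pivot; scan towards the middle with two pointers.
    P1, P2 = left, right - 1
    while P1 <= P2:
        while A[P1] < pivot:                 # skip elements smaller than the pivot
            P1 += 1
        while P2 >= P1 and A[P2] > pivot:    # skip elements larger than the pivot
            P2 -= 1
        if P1 >= P2:
            break
        A[P1], A[P2] = A[P2], A[P1]
        P1 += 1
        P2 -= 1
    return P1                                # first index of the right partition

def quicksort(A, left=0, right=None):
    if right is None:
        right = len(A) - 1
    if left >= right:                        # 0 or 1 element: already sorted
        return
    p = median_of_three(A, left, right)      # pivot index (helper sketched earlier)
    A[p], A[right] = A[right], A[p]          # move the pivot to the end
    newP1 = partition(A, left, right, A[right])
    A[newP1], A[right] = A[right], A[newP1]  # pivot into its final position
    quicksort(A, left, newP1 - 1)
    quicksort(A, newP1 + 1, right)

# Example: L = [8, 1, 4, 9, 6, 3, 5, 2, 7, 0]; quicksort(L) leaves L sorted in place.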
Run time

T(n) = T(n1) + T(n2) + cn
T(1) = 1
n1 + n2 = n (approximately; the pivot itself is excluded)

In the good case, n1 = n2 = n/2 at every level

Thus T(n) = O(n log n)
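
A brief expansion of the balanced-case recurrence, assuming for simplicity that n is a power of two and T(1) = c:

T(n) = 2T(n/2) + cn
     = 4T(n/4) + 2cn
     = 8T(n/8) + 3cn
     = ...
     = 2^k T(n/2^k) + k·cn

With k = log2(n), this gives T(n) = n·T(1) + cn·log2(n) = O(n log n).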
Run time (2)

In the worst case, n1 = 1 and n2 = n – 1 at every level

T(n) = cn + T(n – 1)
     = cn + c(n – 1) + T(n – 2)
     = ...
     = c(n + (n – 1) + ... + 1)
     = c·n(n + 1)/2

Thus T(n) is O(n²) in the worst case.

The average case complexity is O(n log n)


Memory Requirements

The additional memory required is Θ(log(n)) on average

– Each recursive function call places its local
variables, parameters, etc., on a stack
• The depth of the recursion tree is Θ(log(n))

– Unfortunately, if the run time is Θ(n²), the
memory use is Θ(n)
Run-time Summary
To summarize all three algorithms

                  Best         Average      Worst-case   Average     Worst-case
                  Run Time     Run Time     Run Time     Memory      Memory

Merge Sort        Θ(n log(n))  Θ(n log(n))  Θ(n log(n))  Θ(n)        Θ(n)
Quicksort         Θ(n log(n))  Θ(n log(n))  Θ(n²)        Θ(log(n))   Θ(n)
Insertion Sort    Θ(n)         Θ(n + d)     Θ(n²)        Θ(1)        Θ(1)

(d = number of inversions in the input)


Comments

Quicksort performs well for large inputs, but not so well
for small inputs.
When the partitions become small, we can switch to insertion
sort to sort the small partitions instead (see the sketch below).
A typical cutoff is in the range 5 <= cutoff <= 20.
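
A minimal sketch of this hybrid, reusing the quicksort helpers sketched earlier; the CUTOFF value of 10 is an illustrative choice within the suggested range:

CUTOFF = 10   # illustrative value in the suggested 5..20 range

def insertion_sort(A, left, right):
    # Sort A[left..right] in place by inserting each element into the sorted prefix.
    for i in range(left + 1, right + 1):
        value = A[i]
        j = i - 1
        while j >= left and A[j] > value:
            A[j + 1] = A[j]
            j -= 1
        A[j + 1] = value

def hybrid_quicksort(A, left=0, right=None):
    if right is None:
        right = len(A) - 1
    if right - left + 1 <= CUTOFF:
        insertion_sort(A, left, right)           # small partition: insertion sort
        return
    p = median_of_three(A, left, right)          # helpers sketched earlier
    A[p], A[right] = A[right], A[p]
    newP1 = partition(A, left, right, A[right])
    A[newP1], A[right] = A[right], A[newP1]
    hybrid_quicksort(A, left, newP1 - 1)
    hybrid_quicksort(A, newP1 + 1, right)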
Is This The Best We Can Do?
• Sorting by Comparison
– Only information available to us is the set
of N items to be sorted
– Only operation available to us is pairwise
comparison between 2 items

General lower bound for sorting by comparison is Ω(n log n)
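
A brief justification (a standard decision-tree argument, not spelled out on these slides): a comparison sort must distinguish all n! possible orderings of the input, so its decision tree has at least n! leaves and therefore height at least log2(n!). Since

log2(n!) = log2(n) + log2(n - 1) + ... + log2(1) >= (n/2)·log2(n/2)

every comparison sort needs Ω(n log n) comparisons in the worst case.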


Is This The Best We Can Do?
• Sorting by Comparison
– Only information available to us is the set
of N items to be sorted
– Only operation available to us is pairwise
comparison between 2 items

What happens if we relax these constraints?


Special Case Sorting

Now we will present a linear time sorting algorithm: Bin sort,
also known as Bucket sort.

It applies only when the input has special constraints, e.g., the
inputs are integers and we already know the range (the min and
max) of the values.
BinSort (a.k.a. BucketSort)
Requires:
– Having an array with N elements
– Each element is in {1, …, K}

Works by:
Putting items into correct bin (cell) of array,
based on key
BinSort example
K=5 list=(5,1,3,4,3,2,1,1,5,4,5)

Bins in array:
key = 1: 1,1,1
key = 2: 2
key = 3: 3,3
key = 4: 4,4
key = 5: 5,5,5

Sorted list: 1,1,1,2,3,3,4,4,5,5,5
BinSort example (2)
• K=5 list=(5,1,3,4,3,2,1,1,5,4,5)

Bins in array (storing counts instead of the elements):
key = 1: 3
key = 2: 1
key = 3: 2
key = 4: 2
key = 5: 3

Sorted list: 1,1,1,2,3,3,4,4,5,5,5
BinSort Pseudocode
procedure BinSort(List L, K)

bins[1..K][]
// Each element of the array bins is a linked list.
// Could also BinSort with an array of arrays.

For Each number x in L
    bins[x].Append(x)
End For

For i = 1..K
    For Each number x in bins[i]
        Print x
    End For
End For
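
A minimal Python sketch of this procedure; returning the sorted list instead of printing is an illustrative choice:

def bin_sort(L, K):
    # One bin (list) per possible key value 1..K.
    bins = [[] for _ in range(K + 1)]        # index 0 unused so keys map directly
    for x in L:
        bins[x].append(x)                    # put each item into its bin
    result = []
    for key in range(1, K + 1):
        result.extend(bins[key])             # read the bins back in key order
    return result

# Example: bin_sort([5, 1, 3, 4, 3, 2, 1, 1, 5, 4, 5], 5) -> [1, 1, 1, 2, 3, 3, 4, 4, 5, 5, 5]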
BinSort Running Time
• K is a constant
– BinSort is linear time
– O(n + K) = O(n)
• K is variable
– Not simply linear time
• K is very large (e.g. 2^32)
– Impractical
• Storage: O(K) or O(n)
– Storage could be large for large K.
Searching

• Searching is a common task computers perform

• Two parameters affect search algorithm selection:
  • Whether the array is sorted
  • Whether all the elements in the array are unique or have
    duplicate values
• For now, our implementations will assume there are no duplicates
  in the array
• We will use two types of searches:
  • Linear search for unsorted arrays
  • Binary search for sorted arrays
Searching: Linear Searching

• The simplest way to find an element in an array is to check each
  element in turn to see if it matches the sought-after value
• Worst case: O(n): the entire array must be searched. This
  occurs when the value is in the last position or is not present
• Best case: ?
• Average case: ?

• Code: your homework


Searching: Binary Searching

• For binary search, begin by examining the middle element of the array
• If the sought value is less than the middle element, examine the middle
  element of the section between the first item and the middle
• If it is more than the middle element, examine the middle element of the
  section between the middle and the last item
• The process stops when the value is found or when the remaining
  section to search consists of one value

• Time complexity: O(log n)
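
A minimal iterative Python sketch, assuming a sorted array with no duplicates as stated above:

def binary_search(A, target):
    # Return the index of target in the sorted array A, or -1 if it is not present.
    low, high = 0, len(A) - 1
    while low <= high:
        mid = (low + high) // 2
        if A[mid] == target:
            return mid
        elif target < A[mid]:
            high = mid - 1                   # continue in the left half
        else:
            low = mid + 1                    # continue in the right half
    return -1

# Example: binary_search([3, 10, 12, 26, 38, 43, 44, 80], 26) -> 3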


Proof of correctness of an
Algorithm: loop invariants
To use this technique, we need to prove three conditions:

1. Initialization: The loop invariant is satisfied at the beginning
of the for loop.
2. Maintenance: If the loop invariant is true before the ith
iteration, then the loop invariant will be true before the i + 1st
iteration.
3. Termination: When the loop terminates, the invariant gives
us a useful property that helps show that the algorithm is correct.

Note that this is basically mathematical induction (the
initialization is the base case, and the maintenance is the
inductive step).
Proof of correctness of Insertion sort
Initialization: Before the first iteration (which is when P = 1),
the subarray A[0..P - 1] is just the first element of the array,
A[0]. This subarray is sorted, and consists of the elements that
were originally in A[0].

Maintenance: Suppose A[0..P - 1] is sorted. Informally, the
body of the for loop works by moving A[P - 1], A[P - 2], A[P - 3],
and so on, one position to the right until it finds the proper
position for A[P] (line 4), at which point it inserts the value of
A[P] (line 5). The subarray A[0..P] then consists of the
elements originally in A[0..P], but in sorted order. Incrementing
P for the next iteration of the for loop then preserves the loop
invariant.

Termination: The condition causing the for loop to terminate
is that P > N. Because each loop iteration increases P by 1, we
must have P = N + 1 at that time. Substituting N + 1 for P in the
loop invariant, the subarray A[0..N + 1 - 1] = A[0..N] consists of
the elements originally in A[0..N], but in sorted order. Hence
the algorithm is correct.
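
A minimal Python sketch of insertion sort with the loop invariant checked at the top of each iteration; the assertions and the saved copy of the input are for illustration only and are not part of the algorithm:

from collections import Counter

def insertion_sort_checked(A):
    N = len(A)
    original = list(A)                       # kept only to check the invariant
    for P in range(1, N):
        # Loop invariant: A[0..P-1] is sorted and contains exactly
        # the elements originally in A[0..P-1].
        assert all(A[i] <= A[i + 1] for i in range(P - 1))
        assert Counter(A[:P]) == Counter(original[:P])
        value = A[P]
        j = P - 1
        while j >= 0 and A[j] > value:       # shift larger elements one position right
            A[j + 1] = A[j]
            j -= 1
        A[j + 1] = value                     # insert the old A[P] into its proper position
    return A

# Example: insertion_sort_checked([5, 2, 4, 6, 1, 3]) -> [1, 2, 3, 4, 5, 6]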
Summary
• Quick Sort
• Bin Sort (Bucket Sort)
• Searching
• Next week: Online Assignment 1
– Content: Algorithm analysis and Sort Algorithms
– Duration: 45 minutes
– Open-book and no electronic devices
