CMPU 102, Spring 2006 | Lecture 14
Quick sort is another divide-and-conquer sorting algorithm based on the partition operation. For each subsequence to be sorted using quick sort, a pivot element is selected. (The pivot can be any element of the subsequence.) The subsequence is arranged so that all elements less than the pivot are placed before the pivot, and all elements greater than the pivot are placed after it. This can be done in O(n) time for n elements. After partitioning, the subsequence on each side of the pivot is sorted recursively.
Here is an example of a single step in executing quicksort:
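As a concrete sketch in Java (the class and method names here are our own, and we use a Lomuto-style partition with the last element as the pivot; the lecture allows any element to be the pivot):

```java
public class QuickSortDemo {
    // Rearranges a[lo..hi] so that elements less than the pivot come
    // before it and elements greater than the pivot come after it.
    // Returns the pivot's final index.
    static int partition(int[] a, int lo, int hi) {
        int pivot = a[hi];   // choose the last element as the pivot
        int i = lo;          // next slot for an element less than the pivot
        for (int j = lo; j < hi; j++) {
            if (a[j] < pivot) {
                int tmp = a[i]; a[i] = a[j]; a[j] = tmp;
                i++;
            }
        }
        // Put the pivot into its final position.
        int tmp = a[i]; a[i] = a[hi]; a[hi] = tmp;
        return i;
    }

    // After partitioning, recursively sort each side of the pivot.
    static void quickSort(int[] a, int lo, int hi) {
        if (lo < hi) {
            int p = partition(a, lo, hi);
            quickSort(a, lo, p - 1);
            quickSort(a, p + 1, hi);
        }
    }

    public static void main(String[] args) {
        int[] a = {5, 2, 9, 1, 7, 3};
        quickSort(a, 0, a.length - 1);
        System.out.println(java.util.Arrays.toString(a)); // prints [1, 2, 3, 5, 7, 9]
    }
}
```

Note that the partition pass is a single O(n) loop, matching the running time claimed above; the recursion then handles each side of the pivot independently.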
Analysis: if the choice of pivot always results in equal-sized (or close to equal-sized) partitions before and after the pivot, then quick sort will run in O(n log n) time. Unfortunately, there is no way to guarantee this outcome, short of sorting the entire sequence! Choosing a bad pivot---either the min or max element in the subsequence---results in subproblems of size 0 and size n-1. If a bad pivot is chosen at every step, then the total running time will be O(n^2). (The problem is that we are only eliminating one element, the pivot, at each step!)
One way to address the problem of pivot selection is to sample a small number of elements from the subsequence and choose the median of the sample as the pivot. While this does not guarantee that the pivot will result in equal-sized subproblems, it makes it extremely unlikely that the subproblems will differ greatly in size. For this reason, the expected (average) case running time of quick sort is O(n log n).
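A common version of this idea is "median of three": sample the first, middle, and last elements of the subsequence and use their median as the pivot. A minimal sketch (the helper name is our own invention):

```java
public class MedianOfThree {
    // Returns the index (lo, mid, or hi) of the median of
    // a[lo], a[mid], and a[hi] -- a cheap pivot-selection heuristic.
    static int medianOfThreeIndex(int[] a, int lo, int hi) {
        int mid = lo + (hi - lo) / 2;
        int x = a[lo], y = a[mid], z = a[hi];
        if ((x <= y && y <= z) || (z <= y && y <= x)) return mid;
        if ((y <= x && x <= z) || (z <= x && x <= y)) return lo;
        return hi;
    }

    public static void main(String[] args) {
        int[] a = {9, 4, 1, 7, 3};
        // Samples a[0]=9, a[2]=1, a[4]=3; the median is 3, at index 4.
        System.out.println(medianOfThreeIndex(a, 0, a.length - 1)); // prints 4
    }
}
```

On an already-sorted subsequence, median-of-three picks the middle element, avoiding exactly the worst case described above.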
Many algorithms can be expressed naturally using recursion. We have seen two sorting algorithms (merge sort and quick sort) that use recursion. To use recursion correctly and effectively, a few fundamental rules must be followed: every recursive method needs a base case that it handles without recursing, and every recursive call must make progress toward a base case.
One potentially confusing aspect of recursive methods is how a method can call itself without disturbing the values of its own parameters and local variables.
The answer is that each recursive call creates a new stack frame, containing its own private set of parameters and local variables. (We covered stack frames as part of the discussion of exceptions in Lecture 5.)
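A minimal illustration (this factorial method is our own example, not from the lecture): each call to factorial gets a fresh stack frame with its own copy of n, so when a deeper call returns, the current frame's n is unchanged.

```java
public class Factorial {
    static long factorial(long n) {
        if (n <= 1) {
            return 1;                      // base case: no further recursion
        }
        long rest = factorial(n - 1);      // new frame, with its own n
        // Back in this frame: our n still holds its original value,
        // even though the deeper calls each had a smaller n.
        return n * rest;
    }

    public static void main(String[] args) {
        System.out.println(factorial(5)); // prints 120
    }
}
```

When factorial(5) runs, five frames are stacked up, holding n = 5, 4, 3, 2, 1; each frame's multiplication uses its own private n as the calls return.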
Proof by induction is a very useful technique for proving that a property is true for the integers 1, 2, ..., n, regardless of how big n is. Since n can be arbitrarily large, we can't prove each case 1..n individually. Instead of requiring us to prove an arbitrary number of cases, proof by induction lets us prove every case with just two easy steps:

1. Base case: prove that the property holds for n = 1.
2. Induction step: assume the property holds for n, and use that assumption to prove it holds for n+1.

It is easy to see how these two steps "cover" all values of n starting from n=1, up to any arbitrary n.
Example: let's prove that the sum of integers 1..n is (n(n+1)) / 2. In the proof, we will refer to this formula as f(n).
Base Case. For n = 1, f(1) = (1(1+1))/2 = 1, which is indeed the sum of the integers 1..1.

Induction Step. We assume that the formula holds for n, i.e., that f(n) = (n(n+1))/2, and based on this assumption, prove that it also holds for n+1. Substituting n+1 into the formula, the result we expect is f(n+1) = ((n+1)(n+2))/2.
We apply the induction step by adding (n+1) to f(n). Because f(n) is the sum of the integers 1..n, adding (n+1) to this sum obviously results in the sum of the integers 1..(n+1).
f(n+1) = f(n) + (n+1)
       = (n(n+1))/2 + (n+1)        [expand f(n)]
       = (n(n+1))/2 + (2(n+1))/2   [multiply (n+1) by 2/2]
       = (n^2 + n)/2 + (2n+2)/2    [expand terms]
       = (n^2 + 3n + 2)/2          [combine terms]
       = ((n+1)(n+2))/2            [factor polynomial]
By "plugging in" the formula for f(n), we arrived at the expected result for f(n+1). This proves the induction step.