Transcript Wednesday

CSE 3101: Introduction to the Design and Analysis of Algorithms
Suprakash Datta
datta[at]cse.yorku.ca
4/8/2015
1
Quick Sort
• Characteristics
– sorts “almost” in place, i.e., does not require an additional array, like insertion sort
– divide-and-conquer, like merge sort
– very practical: average-case running time O(n log n) (with small constant factors), but worst case O(n^2) [CAVEAT: this is true for the CLRS version]
2
Quick Sort – the main idea
• To understand quick-sort, let’s look at a high-level description of the algorithm
• A divide-and-conquer algorithm
– Divide: partition array into 2 subarrays such that
elements in the lower part <= elements in the
higher part
– Conquer: recursively sort the 2 subarrays
– Combine: trivial since sorting is done in place
3
Partitioning
• Linear time partitioning procedure
Partition(A,p,r)
    x ← A[r]
    i ← p-1
    j ← r+1
    while TRUE
        repeat j ← j-1
            until A[j] ≤ x
        repeat i ← i+1
            until A[i] ≥ x
        if i<j
            then exchange A[i] ↔ A[j]
            else return j

Example with x = 10. Each row shows the array after the next exchange; in the last row i and j have crossed (j stops at 8, i at 23):
17 12  6 19 23  8  5 10
10 12  6 19 23  8  5 17
10  5  6 19 23  8 12 17
10  5  6  8 23 19 12 17
4
Quick Sort Algorithm
• Initial call Quicksort(A, 1, length[A])
Quicksort(A,p,r)
    if p<r
        then q ← Partition(A,p,r)
            Quicksort(A,p,q)
            Quicksort(A,q+1,r)
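A minimal Python sketch of Partition and Quicksort, using 0-based indices. One deliberate difference, flagged as an assumption: the pivot here is taken from a[p] rather than A[r], because with the recursive calls on (p, q) and (q+1, r) the returned index must be strictly less than r for the subproblems to shrink. The function names are illustrative only.

```python
def partition(a, p, r):
    """Hoare-style partition of a[p..r] (inclusive); returns j with p <= j < r."""
    x = a[p]              # assumption: pivot is the first element, not a[r]
    i = p - 1
    j = r + 1
    while True:
        j -= 1
        while a[j] > x:   # repeat j <- j-1 until a[j] <= x
            j -= 1
        i += 1
        while a[i] < x:   # repeat i <- i+1 until a[i] >= x
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j


def quicksort(a, p=0, r=None):
    """Sort a[p..r] in place; called as quicksort(a) for the whole list."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = partition(a, p, r)
        quicksort(a, p, q)
        quicksort(a, q + 1, r)
```

After partition returns q, every element of a[p..q] is less than or equal to every element of a[q+1..r], which is all the recursive calls rely on.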
5
Analysis of Quicksort
• Assume that all input elements are distinct
• The running time depends on the distribution
of splits
6
Best Case
• If we are lucky, Partition splits the array evenly: $T(n) = 2T(n/2) + \Theta(n) = \Theta(n \log n)$
7
Using the median as a pivot
The recurrence in the previous slide works out,
BUT……
Q: Can we find the median in linear-time?
A: Yes! Chapter 9 of the text
Note: Most implementations do not use the median as the pivot.
8
Worst Case
• What is the worst case?
• One side of the partition has only one element
$$
\begin{aligned}
T(n) &= T(1) + T(n-1) + \Theta(n) \\
     &= T(n-1) + \Theta(n) \\
     &= \sum_{k=1}^{n} \Theta(k) \\
     &= \Theta\Bigl(\sum_{k=1}^{n} k\Bigr) \\
     &= \Theta(n^2)
\end{aligned}
$$
9
Worst Case (2)
10
Worst Case (3)
• When does the worst case appear?
– input is sorted
– input reverse sorted
• Same recurrence for the worst case of
insertion sort
• However, sorted input yields the best case for
insertion sort!
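To see the quadratic behaviour on sorted input concretely, the sketch below counts element comparisons. It is an illustration under stated assumptions, not the Partition procedure from the earlier slide: it swaps in a simpler last-element-pivot (Lomuto-style) partition, because its comparison count is easy to state exactly. The name count_comparisons is hypothetical.

```python
def count_comparisons(values):
    """Sort a copy of values with last-element-pivot quicksort and
    return how many element comparisons the partition steps made."""
    a = list(values)
    comparisons = 0

    def partition(p, r):
        nonlocal comparisons
        x = a[r]                      # pivot: last element (Lomuto-style, an assumption)
        i = p - 1
        for j in range(p, r):
            comparisons += 1          # one comparison of a[j] against the pivot
            if a[j] <= x:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[i + 1], a[r] = a[r], a[i + 1]
        return i + 1

    def sort(p, r):
        if p < r:
            q = partition(p, r)
            sort(p, q - 1)
            sort(q + 1, r)

    sort(0, len(a) - 1)
    return comparisons
```

On an already sorted list of n elements every split is (n-1) : 0, so the count is (n-1) + (n-2) + ... + 1 = n(n-1)/2; for n = 100 that is 4950.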
11
Analysis of Quicksort
• Suppose the split is 1/10 : 9/10
$T(n) = T(n/10) + T(9n/10) + \Theta(n) = \Theta(n \log n)$!
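One way to see why even a 1/10 : 9/10 split gives this bound is a recursion-tree argument. The sketch below writes the Θ(n) term as cn for some constant c > 0 (an assumption made for readability) and ignores floors and ceilings.

```latex
\begin{align*}
T(n) &= T(n/10) + T(9n/10) + cn \\
\text{cost of each level of the recursion tree} &\le cn \\
\text{depth of the tree} &= \log_{10/9} n = \Theta(\log n) \\
T(n) &\le cn \cdot (\log_{10/9} n + 1) = O(n \log n) \\
T(n) &\ge cn \cdot \log_{10} n = \Omega(n \log n)
\end{align*}
```

The lower bound uses the fact that the first $\log_{10} n$ levels of the tree are complete, so each of them costs exactly cn.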
12
An Average Case Scenario
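• Suppose we alternate lucky and unlucky cases to get an average behavior
$$
\begin{aligned}
L(n) &= 2U(n/2) + \Theta(n) && \text{(lucky)} \\
U(n) &= L(n-1) + \Theta(n) && \text{(unlucky)}
\end{aligned}
$$
we consequently get
$$
\begin{aligned}
L(n) &= 2\bigl(L(n/2 - 1) + \Theta(n/2)\bigr) + \Theta(n) \\
     &= 2L(n/2 - 1) + \Theta(n) \\
     &= \Theta(n \log n)
\end{aligned}
$$
[Figure: an unlucky split of n into 1 and n-1 followed by a lucky split of n-1 into two parts of size (n-1)/2, at total partitioning cost Θ(n), next to a single split of n into (n-1)/2+1 and (n-1)/2, also at cost Θ(n).]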
13
An Average Case Scenario (2)
• How can we make sure that we are usually
lucky?
– Partition around the “middle” (n/2-th) element?
– Partition around a random element (works well in
practice)
• Randomized algorithm
– expected running time is independent of the input ordering
– no specific input triggers worst-case behavior
– the worst-case is only determined by the output of
the random-number generator
14
Randomized Quicksort
• Assume all elements are distinct
• Partition around a random element
• Randomization is often used to design
algorithms with good average-case
complexity (the worst-case complexity may
not be as good)
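A minimal sketch of partitioning around a random element, assuming the Hoare-style partition with the pivot in the first position: a uniformly random element of a[p..r] is swapped to the front before partitioning. The names hoare_partition and randomized_quicksort, and the rng parameter, are illustrative assumptions.

```python
import random


def hoare_partition(a, p, r):
    """Partition a[p..r] around x = a[p]; return j with p <= j < r."""
    x = a[p]
    i, j = p - 1, r + 1
    while True:
        j -= 1
        while a[j] > x:
            j -= 1
        i += 1
        while a[i] < x:
            i += 1
        if i >= j:
            return j
        a[i], a[j] = a[j], a[i]


def randomized_quicksort(a, p=0, r=None, rng=random):
    """Sort a[p..r] in place, choosing each pivot uniformly at random."""
    if r is None:
        r = len(a) - 1
    if p < r:
        k = rng.randint(p, r)        # random index in [p, r], both ends inclusive
        a[p], a[k] = a[k], a[p]      # move the random pivot to the front
        q = hoare_partition(a, p, r)
        randomized_quicksort(a, p, q, rng)
        randomized_quicksort(a, q + 1, r, rng)
```

With this change an already sorted input is no more likely to produce bad splits than any other ordering; passing a seeded random.Random instance as rng makes a run reproducible.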
15
The optimality question
Q: Can we do better than worst-case Θ(n log n) time for sorting?
A: In general no, but in some special cases yes!
Q: Why not?
A: The well-known Ω(n log n) lower bound.
16
On Lower Bounds
• “the best any algorithm can do” for a problem
• The proof must be algorithm independent
• In general, lower bound proofs are difficult
• Must make some assumptions – the sorting
lower bound assumes that sorting is
comparison based.
This will be covered later today, or by Prof.
Ruppert next week
If we relax the “comparison-based” assumption,
we can sort in linear time!
17
Next: Linear sorting
Q: How do we beat the Ω(n log n) lower bound for sorting?
A: By making extra assumptions about the input
18
Non-Comparison Sort – Bucket Sort
• Assumption: uniform distribution
– Input numbers are uniformly distributed in [0,1).
– Suppose input size is n.
• Idea:
– Divide [0,1) into n equal-sized subintervals (buckets).
– Distribute the n numbers into the buckets.
– Expect that each bucket contains few numbers.
– Sort the numbers in each bucket (insertion sort as default).
– Then go through the buckets in order, listing elements.
Can be shown to run in linear-time on average
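The steps above can be sketched in Python as follows. This is a minimal illustration assuming every input value lies in [0,1); the names bucket_sort and insertion_sort are illustrative.

```python
def insertion_sort(b):
    """Sort the list b in place."""
    for i in range(1, len(b)):
        key = b[i]
        j = i - 1
        while j >= 0 and b[j] > key:
            b[j + 1] = b[j]
            j -= 1
        b[j + 1] = key


def bucket_sort(values):
    """Return a sorted copy of values, which are assumed to lie in [0, 1)."""
    n = len(values)
    if n == 0:
        return []
    buckets = [[] for _ in range(n)]          # bucket i covers [i/n, (i+1)/n)
    for v in values:
        index = min(int(n * v), n - 1)        # min() guards against floating-point rounding
        buckets[index].append(v)
    result = []
    for b in buckets:                         # visit buckets in increasing order
        insertion_sort(b)
        result.extend(b)
    return result
```

If the inputs really are uniformly distributed, each bucket holds a constant number of elements on average, so the insertion sorts together take linear expected time.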
19
Example of BUCKET-SORT
20
Generalizing Bucket Sort
Q: What if the input numbers are NOT uniformly
distributed in [0,1)?
A: Can be generalized in different ways, e.g. if the
distribution is known we can design (unequal
sized) bins that will have roughly equal number
of numbers on average.
21
Non-Comparison Sort – Counting Sort
• Assumption: n input numbers are integers in the
range [0,k], k=O(n).
• Idea:
– Determine the number of elements less than
x, for each input x.
– Place x directly in its position.
22
Counting Sort - pseudocode
Counting-Sort(A,B,k)
    for i ← 0 to k
        do C[i] ← 0
    for j ← 1 to length[A]
        do C[A[j]] ← C[A[j]]+1
    // C[i] contains number of elements equal to i.
    for i ← 1 to k
        do C[i] ← C[i]+C[i-1]
    // C[i] contains number of elements ≤ i.
    for j ← length[A] downto 1
        do B[C[A[j]]] ← A[j]
           C[A[j]] ← C[A[j]]-1
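A minimal Python rendering of the pseudocode, assuming 0-based lists and integer inputs in the range [0,k]. Because Python indices start at 0, the count is decremented before it is used as an output position. The function name counting_sort is illustrative.

```python
def counting_sort(a, k):
    """Return a stably sorted copy of a, whose elements are integers in [0, k]."""
    count = [0] * (k + 1)
    for x in a:
        count[x] += 1                 # count[i] = number of elements equal to i
    for i in range(1, k + 1):
        count[i] += count[i - 1]      # count[i] = number of elements <= i
    out = [None] * len(a)
    for x in reversed(a):             # scanning right to left keeps the sort stable
        count[x] -= 1                 # 0-based output: decrement first, then place
        out[count[x]] = x
    return out
```

For example, counting_sort([2, 5, 3, 0, 2, 3, 0, 3], 5) returns [0, 0, 2, 2, 3, 3, 3, 5].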
23
Counting Sort - example
24
Counting Sort - analysis
Statement                                           Cost
for i ← 0 to k                                      Θ(k)
    do C[i] ← 0                                     Θ(1) (Θ(1)·Θ(k) = Θ(k))
for j ← 1 to length[A]                              Θ(n)
    do C[A[j]] ← C[A[j]]+1                          Θ(1) (Θ(1)·Θ(n) = Θ(n))
// C[i] contains number of elements equal to i.    Θ(0)
for i ← 1 to k                                      Θ(k)
    do C[i] ← C[i]+C[i-1]                           Θ(1) (Θ(1)·Θ(k) = Θ(k))
// C[i] contains number of elements ≤ i.           Θ(0)
for j ← length[A] downto 1                          Θ(n)
    do B[C[A[j]]] ← A[j]                            Θ(1) (Θ(1)·Θ(n) = Θ(n))
       C[A[j]] ← C[A[j]]-1                          Θ(1) (Θ(1)·Θ(n) = Θ(n))
Total cost is Θ(k+n); if k = O(n), then the total cost is Θ(n).
So, it beats the Ω(n log n) lower bound!
25
Stable sort
• Preserves order of elements with the same
key.
• Counting sort is stable.
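A small illustration of what stability means, using Python's built-in sorted, which is documented to be stable. The records are made-up (name, key) pairs.

```python
records = [("b", 2), ("a", 1), ("c", 2), ("d", 1)]  # hypothetical (name, key) pairs
by_key = sorted(records, key=lambda rec: rec[1])    # sort on the key only
# Records with equal keys keep their input order:
# "a" stays before "d", and "b" stays before "c".
```

Here by_key is [("a", 1), ("d", 1), ("b", 2), ("c", 2)].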
Crucial question: can counting sort be used to
sort large integers efficiently?
26
Radix sort
Radix-Sort(A,d)
    for i ← 1 to d
        do use a stable sort to sort A on digit i
Analysis:
Given n d-digit numbers where each digit takes on
up to k values, Radix-Sort sorts these numbers
correctly in Θ(d(n+k)) time.
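A minimal Python sketch of Radix-Sort, assuming non-negative integers, digit 1 being the least significant digit, and a counting sort on one digit as the stable sort. The helper name stable_sort_on_digit and the default base k=10 are assumptions for illustration.

```python
def stable_sort_on_digit(a, i, k):
    """Stable counting sort of a on digit i (0 = least significant), base k."""
    def digit(x):
        return (x // k ** i) % k

    count = [0] * k
    for x in a:
        count[digit(x)] += 1
    for v in range(1, k):
        count[v] += count[v - 1]      # count[v] = number of elements with digit <= v
    out = [None] * len(a)
    for x in reversed(a):             # right-to-left scan preserves stability
        v = digit(x)
        count[v] -= 1
        out[count[v]] = x
    return out


def radix_sort(a, d, k=10):
    """Return a sorted copy of a, whose elements have at most d base-k digits."""
    for i in range(d):                # least significant digit first
        a = stable_sort_on_digit(a, i, k)
    return a
```

Each of the d passes costs Θ(n+k), which gives the Θ(d(n+k)) total. For example, radix_sort([1019, 3075, 2225, 2231], 4) returns [1019, 2225, 2231, 3075].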
27
Radix sort - example
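Stable sort on each digit, least significant digit first:
Input   Digit 1   Digit 2   Digit 3   Digit 4
1019    2231      1019      1019      1019
3075    3075      2225      3075      2225
2225    2225      2231      2225      2231
2231    1019      3075      2231      3075
Sorted!

If the sort on digit 3 is not stable (2231 placed before 2225):
Digit 3   Digit 4
1019      1019
3075      2231
2231      2225
2225      3075
Not sorted!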
28