Lecture 2: Unix at CS@NIU - Zhejiang Normal University

Sorting
7/17/2015
CSCI440
Introduction


Sorting is the most common operation in programs besides searching
Sorting by comparison:
1. Bubble Sort
2. Selection Sort
3. Merge Sort
4. QuickSort
Bubble Sort Idea


We want A[0] ≤ A[1] ≤ … ≤ A[N-1]
Bubble sort idea:
• If A[i-1] > A[i] then swap A[i-1] and A[i]
• Do this for i = 1, …, N-1
• Repeat until a full pass makes no swap; the array is then sorted
Bubble Sort
procedure BubbleSort (Array A, int N)
repeat {
    isSorted = true;
    for (i=1 to N-1) {
        if ( A[i-1] > A[i] ) {
            swap( A[i-1], A[i] );
            isSorted = false;   /* a swap happened, so another pass is needed */
        }
    }
} until isSorted
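The BubbleSort procedure can be written in C roughly as follows. This is a minimal sketch, assuming a 0-based array of int; the name bubble_sort and the use of size_t for the length are choices made for this example, not part of the slides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bubble sort on a 0-based int array (illustrative name and signature). */
void bubble_sort(int *a, size_t n)
{
    assert(a != NULL || n == 0);
    bool is_sorted;
    do {
        is_sorted = true;
        for (size_t i = 1; i < n; i++) {
            if (a[i - 1] > a[i]) {
                /* neighbours are out of order: swap them */
                int tmp = a[i - 1];
                a[i - 1] = a[i];
                a[i] = tmp;
                is_sorted = false;
            }
        }
    } while (!is_sorted);   /* stop after a pass with no swap */
}
```

Calling bubble_sort(a, 5) on {5, 1, 4, 2, 8} leaves the array as {1, 2, 4, 5, 8}.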
Selection Sort Idea


We want A[0] ≤ A[1] ≤ … ≤ A[N-1]
Selection sort idea:
• Find the largest of the first i elements and swap it into the last of those i positions
• Do this for i = N, N-1, …, 2
Selection Sort
procedure SelectSort (Array A, int N)
for (i=N-1 down to 1) {
    /* find the largest among A[0],...,A[i] */
    /* place it in A[i] */
    m = 0;
    for (j=1 to i)
        if ( A[m] < A[j] ) m = j;
    swap(A[i], A[m]);
}
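A possible C rendering of SelectSort is sketched below. The name selection_sort and the size_t length are assumptions of this example; the loop counts the size of the unsorted prefix rather than its last index, which avoids an unsigned counter running below zero.

```c
#include <assert.h>
#include <stddef.h>

/* Selection sort on a 0-based int array (illustrative name and signature). */
void selection_sort(int *a, size_t n)
{
    assert(a != NULL || n == 0);
    /* len is the size of the still-unsorted prefix a[0..len-1] */
    for (size_t len = n; len > 1; len--) {
        size_t m = 0;                       /* index of the largest so far */
        for (size_t j = 1; j < len; j++)
            if (a[m] < a[j])
                m = j;
        /* move the largest element to the end of the prefix */
        int tmp = a[len - 1];
        a[len - 1] = a[m];
        a[m] = tmp;
    }
}
```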
Analysis of Bubble & Selection

Worst case running time:
• T(n) = O( ?? )
Insertion Sort
procedure InsertSort (Array A, int N)
for (i=1 to N-1) {
    /* A[0], A[1], ..., A[i-1] is sorted */
    /* now insert A[i] in the right place */
    x = A[i];
    for (j=i-1; j>=0 && A[j] > x; j--)
        A[j+1] = A[j];   /* shift larger elements one slot right */
    A[j+1] = x;
}
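InsertSort can be sketched in C as below. The function name insertion_sort is an assumption of this example. The inner index j marks the free slot rather than the element being compared, so it can stay unsigned without ever dropping below zero.

```c
#include <assert.h>
#include <stddef.h>

/* Insertion sort on a 0-based int array (illustrative name and signature). */
void insertion_sort(int *a, size_t n)
{
    assert(a != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        /* a[0..i-1] is sorted; insert a[i] in the right place */
        int x = a[i];
        size_t j = i;                 /* j is the currently free slot */
        while (j > 0 && a[j - 1] > x) {
            a[j] = a[j - 1];          /* shift larger element right */
            j--;
        }
        a[j] = x;
    }
}
```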
Analysis of Insertion Sort

Worst case running time:
• T(n) = O( ?? )
Merge Sort
The Merge Operation: given two sorted sequences:
A[0] ≤ A[1] ≤ ... ≤ A[m-1]
B[0] ≤ B[1] ≤ ... ≤ B[n-1]
Construct another sorted sequence that is their union
Complete the following function:
void Merge (int *A, int aSize, int *B, int bSize)
A is a sorted (in ascending order) array of aSize integers and B is a sorted (in ascending order) array of bSize integers. Print the elements of both arrays in ascending order (merged).
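One way the exercise might be completed is sketched below. Merge keeps the prototype from the slide and prints the merged order; MergeInto is a hypothetical helper added for this example that writes the merged sequence into a caller-supplied array, which makes the result easier to check.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: merge sorted A and B into out, which must have
   room for aSize + bSize ints. Returns the number of elements written. */
int MergeInto(const int *A, int aSize, const int *B, int bSize, int *out)
{
    assert(aSize >= 0 && bSize >= 0);
    int i = 0, j = 0, k = 0;
    while (i < aSize && j < bSize)            /* take the smaller head */
        out[k++] = (A[i] <= B[j]) ? A[i++] : B[j++];
    while (i < aSize) out[k++] = A[i++];      /* leftovers from A */
    while (j < bSize) out[k++] = B[j++];      /* leftovers from B */
    return k;
}

/* The function from the slide: print both arrays in merged order. */
void Merge(int *A, int aSize, int *B, int bSize)
{
    int i = 0, j = 0;
    while (i < aSize && j < bSize) {
        if (A[i] <= B[j]) printf("%d ", A[i++]);
        else              printf("%d ", B[j++]);
    }
    while (i < aSize) printf("%d ", A[i++]);
    while (j < bSize) printf("%d ", B[j++]);
    printf("\n");
}
```

With A = {1, 4, 9} and B = {2, 3, 10, 11}, Merge prints the values 1 2 3 4 9 10 11. Each element is examined once, so the merge takes time proportional to aSize + bSize.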
Photo from http://www.nrma.com.au/inside-nrma/m-h-m/road-rage.html
Merge Sort
Basic Ideas
If there are fewer than two elements, stop.
Otherwise, divide the list into two halves of (nearly) equal size, sort the left half and the right half independently, and then merge them.
Function MergeSort (Array A[0..n-1])
    if n ≤ 1 return A
    return Merge(MergeSort(A[0..n/2-1]), MergeSort(A[n/2..n-1]))
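For arrays, the MergeSort function might look like the C sketch below. The names merge_sort and merge_sort_rec, the half-open index ranges, and the single scratch buffer allocated up front are all assumptions of this example rather than anything prescribed by the slides.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Sort a[lo..hi-1], using tmp (same length as a) as scratch space. */
static void merge_sort_rec(int *a, int *tmp, size_t lo, size_t hi)
{
    if (hi - lo < 2) return;                 /* fewer than two elements */
    size_t mid = lo + (hi - lo) / 2;
    merge_sort_rec(a, tmp, lo, mid);         /* sort left half */
    merge_sort_rec(a, tmp, mid, hi);         /* sort right half */

    /* merge the two sorted halves into tmp, then copy back */
    size_t i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i < mid) tmp[k++] = a[i++];
    while (j < hi)  tmp[k++] = a[j++];
    memcpy(a + lo, tmp + lo, (hi - lo) * sizeof *a);
}

/* Returns 0 on success, -1 if the scratch array cannot be allocated. */
int merge_sort(int *a, size_t n)
{
    assert(a != NULL || n == 0);
    if (n < 2) return 0;
    int *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL) return -1;
    merge_sort_rec(a, tmp, 0, n);
    free(tmp);
    return 0;
}
```

Taking from the left half on ties (the `<=` in the merge loop) keeps equal elements in their original order.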
Analysis of Merge Sort
The Recurrence Relation
T(1) = b
T(n) = 2T(n/2) + cn    for n > 1
Analysis of Merge Sort
T(n) = 2T(n/2) + cn
T(n) = 4T(n/4) + cn + cn            (substitute)
T(n) = 8T(n/8) + cn + cn + cn       (substitute)
T(n) = 2^k T(n/2^k) + kcn           (inductive leap)
T(n) = nT(1) + cn log n             (select value for k: k = log n, base 2)
T(n) = Θ(n log n)                   (simplify)
Merge Sort


Works great with lists, or files
Problems with arrays:
• We need a scratch array, cannot sort in place
Heap Sort


Recall: a heap is a tree where the min is at the root
A heap is stored in an array A[1], ..., A[n]
Heap Sort


Start with an unsorted array A[1], ..., A[n]
Build a heap (buildHeap function)
• How much time does it take? O(N)
Get minimum, store in out array; repeat n times
Heap Sort

Input: unordered array A[1..N]
1. Build a max heap (largest element is A[1])
2. For i = 1 to N-1:
   A[N-i+1] = Delete_Max()
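The two steps can be sketched in C as below. Unlike the slides, this example uses 0-based indexing, so the children of node i are at 2i+1 and 2i+2; the names heap_sort and sift_down are illustrative. Delete_Max is done in place by swapping the root with the last heap element and shrinking the heap by one.

```c
#include <assert.h>
#include <stddef.h>

/* Restore the max-heap property for the subtree rooted at root,
   considering only a[0..n-1]. */
static void sift_down(int *a, size_t root, size_t n)
{
    for (;;) {
        size_t largest = root;
        size_t l = 2 * root + 1, r = l + 1;
        if (l < n && a[l] > a[largest]) largest = l;
        if (r < n && a[r] > a[largest]) largest = r;
        if (largest == root) return;
        int t = a[root]; a[root] = a[largest]; a[largest] = t;
        root = largest;
    }
}

void heap_sort(int *a, size_t n)
{
    assert(a != NULL || n == 0);
    /* Step 1: build a max heap bottom-up, from the last internal node */
    for (size_t i = n / 2; i-- > 0; )
        sift_down(a, i, n);
    /* Step 2: repeatedly move the max to the end of the shrinking heap */
    for (size_t end = n; end > 1; end--) {
        int t = a[0]; a[0] = a[end - 1]; a[end - 1] = t;
        sift_down(a, 0, end - 1);
    }
}
```

No second array is needed: the sorted suffix grows in the space the heap gives up.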
Heap Sort
Input:                  7 50 22 15 4 40 20 10 35 25
Max heap:               50 40 20 25 35 15 10 22 4 7
After 1st Delete_Max:   40 35 20 25 7 15 10 22 4 | 50
After 2nd Delete_Max:   35 25 20 22 7 15 10 4 | 40 50
Properties of Heap Sort

Worst case time complexity O(n log n)
• Build_heap O(n)
• n Delete_Max's for O(n log n)
In-place sort
QuickSort
1. Pick a “pivot”.
2. Divide list into two lists:
• One less-than-or-equal-to pivot value
• One greater than pivot
3. Sort each sub-problem recursively
4. Answer is the concatenation of the two solutions
QuickSort - Illustration
Sort 8 1 4 9 0 3 5 2 7 6 in ascending order
QuickSort
procedure quickSort (Array A, int N) {
    quickSortRecursive(A, 0, N-1);
}
procedure quickSortRecursive (Array A, int left, int right) {
    if (left >= right) return;   /* zero or one element: already sorted */
    int pivot = choosePivot(A, left, right);
    /* partition A s.t., for some i with left ≤ i < right:
       A[left], A[left+1], …, A[i] ≤ pivot
       A[i+1], A[i+2], …, A[right] ≥ pivot
    */
    quickSortRecursive(A, left, i);
    quickSortRecursive(A, i+1, right);
}
QuickSort: The Partition
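i = left - 1; j = right + 1;
repeat {
    do i++; while (A[i] < pivot);   /* stop at an element ≥ pivot */
    do j--; while (A[j] > pivot);   /* stop at an element ≤ pivot */
    if (i < j) swap(A[i], A[j]);
    else break;
}
if (j == right) j--;   /* pivot is the largest element and sits at A[right] */
i = j;                 /* split point, left ≤ i < right */
quickSortRecursive(A, left, i);
quickSortRecursive(A, i+1, right);
At the top of each pass of the repeat loop:
A[left] … A[i-1] ≤ pivot        A[j] … A[right] ≥ pivot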
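A runnable C sketch of the recursive procedure and its partition step follows. It assumes the pivot is the middle element of the range, which is only one possible choosePivot, and it uses a Hoare-style scan in which both indices stop on elements equal to the pivot. With a middle pivot the left part always ends before A[right], so both recursive calls get strictly smaller ranges. The names quick_sort, quick_sort_recursive and swap_int are illustrative.

```c
#include <assert.h>
#include <stddef.h>

static void swap_int(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Sorts A[left..right], bounds inclusive as on the slides. */
static void quick_sort_recursive(int *A, ptrdiff_t left, ptrdiff_t right)
{
    if (left >= right) return;               /* zero or one element */

    /* assumption of this sketch: pivot = middle element of the range */
    int pivot = A[left + (right - left) / 2];

    ptrdiff_t i = left - 1, j = right + 1;
    for (;;) {
        do { i++; } while (A[i] < pivot);    /* stop at element >= pivot */
        do { j--; } while (A[j] > pivot);    /* stop at element <= pivot */
        if (i >= j) break;
        swap_int(&A[i], &A[j]);
    }
    /* now A[left..j] <= pivot <= A[j+1..right], with left <= j < right */
    quick_sort_recursive(A, left, j);
    quick_sort_recursive(A, j + 1, right);
}

void quick_sort(int *A, size_t n)
{
    assert(A != NULL || n == 0);
    if (n > 1)
        quick_sort_recursive(A, 0, (ptrdiff_t)n - 1);
}
```

Each element is visited a constant number of times by the two scanning indices, which is why one partition pass over A[left..right] costs O(right-left+1).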
QuickSort: The Partition

Running time: T = O(right-left+1)
• Why?
Analyzing QuickSort

Picking pivot: constant time
• Will discuss later
Partitioning: linear time
Recursion: suppose there are i elements ≤ pivot:
T(1) = b
T(N) = T(i) + T(N-i) + cN
• Can't solve, it depends on i
QuickSort: Worst Case
Pivot is always smallest element, so i=1:
T(N) = T(i) + T(N-i) + cN
T(N) = T(N-1) + cN + b
     = T(N-2) + cN + c(N-1) + b + b
     = T(N-3) + cN + c(N-1) + c(N-2) + b + b + b
     = ...
     = cN + c(N-1) + … + 2c + b + b + … + b    (N copies of b, including T(1) = b)
     = O(N^2)
QuickSort: Best Case
Pivot is always the median.
T(N) = T(i) + T(N-i) + cN
T(N) = 2T(N/2) + cN
T(N) = 4T(N/4) + cN + cN
T(N) = 8T(N/8) + cN + cN + cN
...
T(N) = 2^(log N) T(1) + cN log N
T(N) = O(N log N)
Choosing the Right Pivot

pivot = A[left] or A[right]
• Not a very good idea. Why?
Randomly choose pivot
• Very good, but random number generator is slow
"Median-of-3" rule:
• pivot = Median(A[left], A[middle], A[right])
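The median-of-3 rule might be coded as in the sketch below. The function name median_of_3 is illustrative, and taking the middle index as left + (right - left) / 2 is an assumption of this example; the function returns the pivot value and does not rearrange the array.

```c
#include <assert.h>
#include <stddef.h>

/* Return the median of A[left], A[middle], A[right] as the pivot value.
   Assumes left <= right; middle is rounded down. */
int median_of_3(const int *A, size_t left, size_t right)
{
    assert(A != NULL && left <= right);
    size_t middle = left + (right - left) / 2;
    int a = A[left], b = A[middle], c = A[right];
    if (a > b) { int t = a; a = b; b = t; }  /* now a <= b */
    if (b > c) b = c;                        /* b = min(b, c) */
    return (a > b) ? a : b;                  /* max(a, min(b, c)) */
}
```

For the array 8 1 4 9 0 3 5 2 7 6 with left = 0 and right = 9, the three candidates are 8, 0 and 6, so the pivot is 6. On an already sorted array the rule picks the true median of the range, which avoids the worst case that pivot = A[left] runs into there.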
QuickSort: Average Case


Suppose pivot is picked at random
All the following cases are equally likely:
• Pivot is smallest value in list: i=1
• Pivot is 2nd smallest value in list: i=2
• Pivot is 3rd smallest value in list: i=3
• …
• Pivot is largest value in list: i=N-1
Same is true if pivot is e.g. always first element, but the input itself is perfectly random
QuickSort Avg Case, cont.
Expected running time:
T(N) = 1/N (T(1)+T(N-1) + T(2)+T(N-2) + … + T(N-1)+T(1)) + cN
     = 2/N (T(1) + T(2) + … + T(N-1)) + cN
N T(N) = 2 T(1) + 2 T(2) + … + 2 T(N-2) + 2 T(N-1) + cN^2
(N-1) T(N-1) = 2 T(1) + 2 T(2) + … + 2 T(N-2) + c(N-1)^2
N T(N) - (N-1) T(N-1) = 2 T(N-1) + 2cN - c
N T(N) = (N+1) T(N-1) + 2cN - c
T(N)/(N+1) = T(N-1)/N + 2c/(N+1) - c/(N(N+1))
T(N)/(N+1) = T(1)/2 + 2c(1/(N+1) + 1/N + … + 1/3) - c(1/(N(N+1)) + … + 1/(2·3))
           = O(log N)
T(N) = O(N log N)
Summary

Naïve sorting algorithms:
• Bubble sort, insertion sort, selection sort
• Time = O(N^2)
Clever sorting algorithms:
• Merge sort, heap sort, quick sort
• Time = O(N log N) (for quick sort on average; its worst case is O(N^2))
I want to sort in O(N)!
• Is this possible?