Chapter 7 Sorting

Part II
7.3 QUICK SORT
Example
[Figure: partitioning with pivot = a[left]. Index i moves right and eventually stops at a position where a[i] ≥ pivot (such an element should go to the other side); index j moves left and stops at a position where a[j] < pivot. While i < j, a[i] and a[j] are interchanged. The scans stop when i ≥ j; finally a[j] and the pivot are interchanged.]
Algorithm
void QuickSort(int a[], int left, int right)
{
    if (left < right)
    {
        int pivot = a[left];
        int i = left;
        int j = right + 1;
        while (i < j)
        {
            for (i++; i < j && a[i] < pivot; i++) ;
            for (j--; i <= j && a[j] >= pivot; j--) ;
            if (i < j)
                swap(a[i], a[j]);    // interchange a[i] and a[j]
        }
        swap(a[left], a[j]);         // interchange a[j] and the pivot
        QuickSort(a, left, j - 1);
        QuickSort(a, j + 1, right);
    }
}
Analysis of QuickSort()

Worst case:
◦ Consider an already sorted list, e.g. [1, 2, 4, 5, 6].
 The smallest element is always chosen as the pivot.
 In the 1st iteration, n−1 elements are examined; in the 2nd iteration, n−2 elements are examined; ...
 In total, the execution steps are
(n−1) + (n−2) + … + 1 = O(n²)
 The time complexity is O(n²).
Lemma 7.1

Let Tavg(n) be the expected time for function
QuickSort() to sort a list with n records.
Then there exists a constant k such that
Tavg(n) ≤ k·n·logₑ(n) for n ≥ 2.
◦ In other words, Tavg(n) = O(n log n).
Variations
The position of the pivot determines the time
complexity of QuickSort().
 The best choice for the pivot is the median.

◦ Variations:
 Median-of-three: select the median among three
records: the leftmost, the rightmost, and the
middle one.
 Random: select the pivot randomly.
7.4 HOW FAST CAN WE SORT?
(DECISION TREE)
Consideration

What is the best computing time for
sorting that we can hope for?
◦ Suppose the only operations permitted on
keys are comparisons and exchanges.
 In this section, we shall prove O(n logn) is the best
possible time.
 The proof uses a decision tree.
Example 7.4

Decision tree for Insertion Sort working
on [K1, K2, K3]
[Figure: decision tree. The root compares K1 ≤ K2; on the "Yes" branch the order so far is [1, 2, 3], on the "No" branch it is [2, 1, 3]. Each internal node compares two keys (K2 ≤ K3, K1 ≤ K3, ...), and each of the six leaves I–VI stops with one of the permutations [1, 2, 3], [1, 3, 2], [3, 1, 2], [2, 1, 3], [2, 3, 1], [3, 2, 1].]
Observations
The leaf nodes denote the states to terminate.
◦ The number of permutations is 3! = 6.
◦ There are n! possible permutations for n records to sort.
 A path from the root to some leaf node represents one of the n! possibilities.

The maximum depth of the tree is 3.
◦ The depth represents the number of comparisons.
Theorem 7.1

Any decision tree that sorts n distinct
elements has a height of at least
⌈log₂(n!)⌉ + 1.
◦ When sorting n elements, there are n! different possible results.
 Every decision tree for the sorting must have at least n! leaves.
◦ A decision tree is a binary tree; therefore it has at most 2^(k−1) leaves if its height is k.
n! ≤ 2^(k−1)  ⟹  log₂(n!) ≤ k − 1  ⟹  k ≥ ⌈log₂(n!)⌉ + 1 (since k is an integer)
Corollary
Any algorithm that sorts only by
comparisons must have a worst case
computing time of Ω(n logn).
 Proof
◦ By Theorem 7.1, there is a path of length at least log₂(n!).
n! = n(n−1)(n−2)⋯(3)(2)(1) ≥ (n/2)^(n/2)
So,
log₂(n!) ≥ (n/2)·log₂(n/2) = Ω(n log n)
7.5 MERGE SORT
Merging

Consider how to merge two ordered lists.
[Figure: initList[l..m] and initList[m+1..n] are each sorted; Merge() combines them into mergeList[l..n], which is sorted.]
Example
[Figure: merging initList[l..m] = [4, 5, 7, 11] with initList[m+1..n] = [2, 3, 6, 8, 14]. Pointers i1 and i2 scan the two sorted parts; the smaller of initList[i1] and initList[i2] is copied to mergeList[iResult] (starting with iResult = l) and the corresponding pointer advances. When one part is exhausted, the rest of the other part is copied to mergeList, giving [2, 3, 4, 5, 6, 7, 8, 11, 14].]
void Merge(int *initList, int *mergeList, int l, int m, int n)
{
    int i1 = l, iResult = l, i2 = m + 1;
    while (i1 <= m && i2 <= n)
    {
        if (initList[i1] <= initList[i2])
        {
            mergeList[iResult++] = initList[i1];
            i1++;
        }
        else
        {
            mergeList[iResult++] = initList[i2];
            i2++;
        }
    }
    for (; i1 <= m; i1++)      // copy the rest of the first part
        mergeList[iResult++] = initList[i1];
    for (; i2 <= n; i2++)      // copy the rest of the second part
        mergeList[iResult++] = initList[i2];
}
Analysis of Merge()

Time complexity:
◦ The while-loop and two for-loops examine
each element in initList exactly once.
◦ The time complexity is O(n-l+1).

Space complexity:
◦ The additional array mergeList is required to
store the merged result.
◦ Space complexity is O(n-l+1).
7.5.2 Iterative Merge Sort

L: the maximum number of records in a block.
[Figure: iterative merge sort of [26, 5, 77, 1, 61, 11, 59, 15, 48, 19].
L = 1:  26 | 5 | 77 | 1 | 61 | 11 | 59 | 15 | 48 | 19
L = 2:  5 26 | 1 77 | 11 61 | 15 59 | 19 48
L = 4:  1 5 26 77 | 11 15 59 61 | 19 48
L = 8:  1 5 11 15 26 59 61 77 | 19 48
L = 16: 1 5 11 15 19 26 48 59 61 77]

Merging blocks with length of L:
◦ Blocks initList[i..i+L−1] and initList[i+L..i+2L−1] are merged.
 Adjacent pairs of blocks of size L are merged
from initList to resultList.
 n is the number of records in initList.
void MergePass(int *initList, int *resultList, int n, int L)
{
    int i;
    for (i = 0; i <= n - 2*L; i += 2*L)    // merge adjacent blocks of size L
        Merge(initList, resultList, i, i + L - 1, i + 2*L - 1);
    if (i + L - 1 < n - 1)                 // merge remaining blocks of L < size < 2L
        Merge(initList, resultList, i, i + L - 1, n - 1);
    else
    {
        for (; i < n; i++)                 // fewer than L records remain
            resultList[i] = initList[i];   // copy the rest
    }
}
[Figure: the three cases of MergePass() on a small list.
◦ i ≤ n − 2L: merge adjacent blocks of size L.
◦ i + L − 1 < n − 1: merge the remaining records, from i to n − 1 (blocks of L < size < 2L).
◦ i + L − 1 ≥ n − 1: fewer than L records remain; copy the rest.]
Merge Sort

L denotes the length of block currently
being merged.
void MergeSort(int *a, int n)
{
int *tempList = new int[n];
for (int L=1; L<n; L*=2)
{
MergePass(a, tempList, n, L);
L *= 2;
MergePass(tempList, a, n, L);
}
delete [] tempList;
}
The first pass puts the result into tempList; the next pass is then done directly in the other direction, merging records from tempList back into a.
Analysis of MergeSort()
Suppose there are n records.
 Space complexity: O(n).
 Time Complexity

◦ MergePass(): one pass examines all n records, so one pass is O(n).
◦ MergeSort(): L doubles on every pass, from 1 until L ≥ n.
 A total of ⌈log₂ n⌉ passes are made over the data.
 Therefore, the time complexity is O(n log n).
7.5.3 Recursive Merge Sort

We divide the list into two roughly equal
parts and sort them recursively.
[Figure: recursive merge sort of [26, 5, 77, 1, 61, 11, 59, 15, 48, 19]. The list is split into a left half and a right half; each half is sorted recursively and the two sorted halves are merged. The left half becomes [1, 5, 26, 61, 77], the right half becomes [11, 15, 19, 48, 59], and the final merge produces [1, 5, 11, 15, 19, 26, 48, 59, 61, 77].]
void Merge(int *initList, int s, int m, int e)
{
    int *temp = new int[e - s + 1];
    int i1 = s, iResult = 0, i2 = m + 1;
    while (i1 <= m && i2 <= e)
    {
        if (initList[i1] <= initList[i2])
        {
            temp[iResult++] = initList[i1];
            i1++;
        }
        else
        {
            temp[iResult++] = initList[i2];
            i2++;
        }
    }
    for (; i1 <= m; i1++) temp[iResult++] = initList[i1];
    for (; i2 <= e; i2++) temp[iResult++] = initList[i2];
    for (int i = 0; i < iResult; i++)
        initList[s + i] = temp[i];   // copy the merged result back
    delete [] temp;
}
MergeSort()

start and end respectively denote the left
end and the right end of the range to be
sorted in the array a.
void MergeSort(int *a, int start, int end)
{
    if (end <= start) return;
    int middle = (start + end) / 2;
    MergeSort(a, start, middle);
    MergeSort(a, middle + 1, end);
    Merge(a, start, middle, end);
}
Analysis of Recursive Merge Sort
Suppose there are n records to be
sorted.
 Time complexity
 O(nlogn)

T (n)  cn  2T (n / 2)
 cn  2( n  2T ( n ))  3cn  4T ( n )
2
4
4
...
 k  cn  2 k T ( n
2
k
)
...

 cn  2(
)
T (1)
Variation

Natural Merge Sort
◦ Make an initial pass over the data
 to determine the sublists (runs) of records that are
already in order, and start merging from those runs.
7.6 HEAP SORT
Discussion

Merge Sort
◦ In the worst case and the average case, the time
complexity is O(n log n).
◦ However, additional storage is required.
 There is an O(1)-space merge algorithm, but it is much
slower than the original one.

Heap Sort
◦ Only a fixed amount of additional storage is
required.
◦ The time complexity is also O(nlogn).
◦ It uses a max heap.
Selection Sort
Suppose there are n records in the list.
 How do we sort the list?
◦ First, find the largest record and put it at position n−1.
◦ Second, find the largest among positions 0 to n−2 (the
second largest), and put it at position n−2.
…
◦ In the i-th iteration, the i-th largest is selected
and put at position n−i.

Consider how to sort [3, 4, 1, 5, 2].
[Figure: selection sort steps.
Select 5: [3, 4, 1, 2, 5]
Select 4: [3, 2, 1, 4, 5]
Select 3: [1, 2, 3, 4, 5]
Select 2: [1, 2, 3, 4, 5]
Select 1: [1, 2, 3, 4, 5]]
Analysis of Selection Sort

 In the i-th iteration, O(n−i+1) computing time is
required to select the i-th largest.
◦ O(n) + O(n−1) + … + O(1) = O(n²).
 How do we decrease the time complexity?
◦ Improve the approach of selecting the maximum: use a max heap.
 The deletion of the maximum from a max heap is O(log n),
when there are n elements in the heap.
 Note: when using heap sort, the data is stored in a[1..n].
Example
[Figure: heap sort of [26, 5, 77, 1, 61, 11, 59, 15, 48, 19], stored in a[1..10].
Initial array: 26, 5, 77, 1, 61, 11, 59, 15, 48, 19
Initial heap (after heapify): 77, 61, 59, 48, 19, 11, 26, 15, 1, 5
Delete 77: heap size = 9, heap = 61, 48, 59, 15, 19, 11, 26, 5, 1; sorted = [77]
Delete 61: heap size = 8, heap = 59, 48, 26, 15, 19, 11, 1, 5; sorted = [61, 77]
Delete 59: heap size = 7, heap = 48, 19, 26, 15, 5, 11, 1; sorted = [59, 61, 77]
Delete 48: heap size = 6, heap = 26, 19, 11, 15, 5, 1; sorted = [48, 59, 61, 77]]
Adjust()

Adjust binary tree with root to satisfy
heap property.
void Adjust(int *a, int root, int n)
{
    int e = a[root];
    int j;
    for (j = 2*root; j <= n; j *= 2)
    {
        if (j < n && a[j] < a[j+1])   // let j point to the larger child
            j++;
        if (e >= a[j])                // heap property holds; stop
            break;
        a[j/2] = a[j];                // move the larger child up
    }
    a[j/2] = e;
}
Heap Sort
void HeapSort(int *a, int n)
{
    for (int i = n/2; i >= 1; i--)    // heapify
        Adjust(a, i, n);
    for (int i = n-1; i >= 1; i--)    // sort
    {
        swap(a[1], a[i+1]);           // move the maximum to the end
        Adjust(a, 1, i);
    }
}
Analysis of HeapSort()
Space complexity: O(1).
 Time complexity:

◦ Suppose the tree has k levels.
 The number of nodes on level i is ≤ 2^(i−1).
◦ In the first loop, Adjust() is called once for
each node that has a child.
 Time complexity:
2
i 1
(k  i)  20 (k  1)  21 (k  2)  ...  2k (0)
1i  k

2
k i
i  2k
0i  k 1
n

0i  k 1

0i  k 1
i
2i
i
2i
 2n  O ( n)
Analysis of HeapSort()
◦ In the next loop, Adjust() is called n−1 times with
maximum depth k = ⌈log₂(n+1)⌉,
and swap is invoked n−1 times.
◦ Consequently, the computing time for the
loop is O(n log n).

Overall, the time complexity of
HeapSort() is O(n log n).