Chapter 7 Sorting
Part II
7.3 QUICK SORT
Example
◦ The pivot is a[left]. Pointer i moves right and stops at the first position where a[i] ≧ pivot; pointer j moves left and will eventually stop at a position where a[j] < pivot.
◦ While i < j, interchange a[i] and a[j]: an element smaller than the pivot should go to the left side, and a larger one to the right side.
◦ Stop when i meets j (i ≧ j); then interchange a[j] and the pivot, which lands in its final position.
Algorithm
void QuickSort(int a[], int left, int right)
{
  if (left < right)
  {
    int pivot = a[left];
    int i = left;
    int j = right + 1;
    while (i < j)
    {
      for (i++; i < j && a[i] < pivot; i++) ;
      for (j--; i <= j && a[j] >= pivot; j--) ;
      if (i < j)
        swap(a[i], a[j]);   // interchange a[i] and a[j]
    }
    swap(a[j], a[left]);    // interchange a[j] and the pivot
    QuickSort(a, left, j-1);
    QuickSort(a, j+1, right);
  }
}
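As a sanity check, the routine can be exercised end to end. The following self-contained sketch restates the algorithm with std::swap standing in for the interchanges (same partition logic):

```cpp
#include <algorithm>  // std::swap
#include <cassert>

// Partition-based quick sort as in the algorithm above:
// pivot = a[left]; i scans right, j scans left; stop when they cross.
void QuickSort(int a[], int left, int right)
{
    if (left >= right) return;
    int pivot = a[left];
    int i = left, j = right + 1;
    while (i < j)
    {
        for (i++; i < j && a[i] < pivot; i++) ;   // a[i] >= pivot stops i
        for (j--; i <= j && a[j] >= pivot; j--) ; // a[j] < pivot stops j
        if (i < j)
            std::swap(a[i], a[j]);
    }
    std::swap(a[j], a[left]);  // pivot moves to its final position j
    QuickSort(a, left, j - 1);
    QuickSort(a, j + 1, right);
}
```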
Analysis of QuickSort()
Worst case:
◦ Consider a list that is already sorted, e.g. 1 2 4 5 6.
The smallest element is always chosen as the pivot.
In the 1st iteration, n-1 elements are examined; in the 2nd iteration, n-2 elements are examined, and so on.
In total, the number of execution steps is
n-1 + n-2 + … + 1 = O(n²)
The worst-case time complexity is therefore O(n²).
Lemma 7.1
Let Tavg(n) be the expected time for function QuickSort() to sort a list with n records. Then there exists a constant k such that Tavg(n) ≦ kn logₑn for n ≧ 2.
◦ In other words, Tavg(n) = O(n log n).
Variations
The position of the pivot determines the time complexity of QuickSort().
The best choice for the pivot is the median.
◦ Variations:
Median-of-three: select the median among three records: the leftmost, the rightmost, and the middle one.
Random: select the pivot randomly.
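The median-of-three variation can be sketched as follows; the helper name medianOfThree is illustrative, not from the text. It orders the three candidate records and moves the median into a[left], where the partitioning code expects the pivot:

```cpp
#include <algorithm>  // std::swap
#include <cassert>

// Pick the median of a[left], a[mid], a[right] and move it to a[left],
// where the partitioning code looks for the pivot.
void medianOfThree(int a[], int left, int right)
{
    int mid = (left + right) / 2;
    if (a[mid] < a[left])   std::swap(a[mid], a[left]);
    if (a[right] < a[left]) std::swap(a[right], a[left]);
    if (a[right] < a[mid])  std::swap(a[right], a[mid]);
    // now a[left] <= a[mid] <= a[right]: the median sits at a[mid]
    std::swap(a[left], a[mid]);  // put the median into the pivot slot
}
```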
7.4 HOW FAST CAN WE SORT?
(DECISION TREE)
Consideration
What is the best computing time for sorting that we can hope for?
◦ Suppose the only operations permitted on keys are comparisons and exchanges.
In this section, we shall prove that O(n log n) is the best possible time, using a decision tree.
Example 7.4
Decision tree for Insertion Sort working on [K1, K2, K3]:
◦ The root compares K1 ≦ K2; each internal node compares two keys (K2 ≦ K3 or K1 ≦ K3) and branches on Y/N.
◦ The six leaves I–VI are the stopping states, one for each of the 3! = 6 orderings: [1,2,3], [1,3,2], [3,1,2], [2,1,3], [2,3,1], [3,2,1].
Observations
The leaf nodes denote the states at which the algorithm terminates.
The number of permutations is 3! = 6.
◦ There are n! possible permutations for n records to sort.
A path from the root to some leaf node represents one of the n! possibilities.
The maximum depth of the tree is 3.
◦ The depth represents the number of comparisons.
Theorem 7.1
Any decision tree that sorts n distinct elements has a height of at least log₂(n!) + 1.
◦ When sorting n elements, there are n! different possible results.
Every decision tree for sorting must therefore have at least n! leaves.
◦ A decision tree is a binary tree; therefore it has at most 2^(k-1) leaves if its height is k.
n! ≦ 2^(k-1)  ⟹  log₂(n!) ≦ k-1  ⟹  k ≧ log₂(n!) + 1
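The bound is easy to evaluate for small n. A minimal sketch (the function name minComparisons is an assumption, not from the text) computes ⌈log₂(n!)⌉, the minimum worst-case number of comparisons implied by the theorem:

```cpp
#include <cassert>
#include <cmath>

// Minimum worst-case comparisons to sort n distinct keys:
// ceil(log2(n!)), computed as ceil of the sum of log2(i) for i = 2..n.
int minComparisons(int n)
{
    double bits = 0.0;
    for (int i = 2; i <= n; i++)
        bits += std::log2((double)i);
    return (int)std::ceil(bits);
}
```

For n = 5 this gives 7, and seven comparisons are in fact achievable (Ford–Johnson merge insertion), so the bound is tight there.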
Corollary
Any algorithm that sorts only by comparisons must have a worst-case computing time of Ω(n log n).
Proof
◦ By the theorem, there is a path of length at least log₂(n!).
n! = n(n-1)(n-2)…(3)(2)(1) ≧ (n/2)^(n/2)
So,
log₂(n!) ≧ (n/2) log₂(n/2) = Ω(n log n)
7.5 MERGE SORT
Merging
Consider how to merge two ordered lists.
◦ initList[l..m] and initList[m+1..n] are each sorted; merging them produces mergeList[l..n], which is sorted.
Example
◦ Merge initList[l..m] = [4, 5, 7, 11] with initList[m+1..n] = [2, 3, 6, 8, 14].
◦ Starting with iResult = l, repeatedly copy the smaller of initList[i1] and initList[i2] to mergeList[iResult]; when one list is exhausted, copy the rest to mergeList.
◦ Result: mergeList = [2, 3, 4, 5, 6, 7, 8, 11, 14].
void Merge(int *initList, int *mergeList, int l, int m, int n)
{
int i1=l, iResult=l, i2=m+1;
while (i1 <= m && i2 <= n)
{
if (initList[i1] <= initList[i2])
{
mergeList[iResult++] = initList[i1];
i1++;
}
else
{
mergeList[iResult++] = initList[i2];
i2++;
}
}
for (; i1<=m; i1++)
mergeList[iResult++] = initList[i1];
for (; i2<=n; i2++)
mergeList[iResult++] = initList[i2];
}
Analysis of Merge()
Time complexity:
◦ The while-loop and two for-loops examine
each element in initList exactly once.
◦ The time complexity is O(n-l+1).
Space complexity:
◦ The additional array mergeList is required to
store the merged result.
◦ Space complexity is O(n-l+1).
7.5.2 Iterative Merge Sort
L: the maximum number of records in a block.
◦ Example with the 10 records [26, 5, 77, 1, 61, 11, 59, 15, 48, 19]:
L=1: [26] [5] [77] [1] [61] [11] [59] [15] [48] [19]
L=2: [5 26] [1 77] [11 61] [15 59] [19 48]
L=4: [1 5 26 77] [11 15 59 61] [19 48]
L=8: [1 5 11 15 26 59 61 77] [19 48]
L=16: [1 5 11 15 19 26 48 59 61 77]
Merging blocks with length of L:
◦ For each starting index i, the block initList[i..i+L-1] is merged with the adjacent block initList[i+L..i+2L-1].
◦ Adjacent pairs of blocks of size L are merged from initList to resultList.
◦ n is the number of records in initList.
void MergePass(int *initList, int *resultList, int n, int L)
{
  int i;
  for (i=0; i<=n-2*L; i+=2*L)
    Merge(initList, resultList, i, i+L-1, i+2*L-1);
  if (i+L-1 < n-1)            // merge remaining blocks of L < size < 2L
    Merge(initList, resultList, i, i+L-1, n-1);
  else {
    for (; i<n; i++)          // copy the rest
      resultList[i] = initList[i];
  }
}
Three cases in MergePass():
◦ While i ≦ n-2L, two full adjacent blocks remain: merge them.
◦ If i+L-1 < n-1 afterwards, more than L but fewer than 2L records remain: merge records i to n-1.
◦ If i+L-1 ≧ n-1, at most one block remains: copy the rest unchanged.
Merge Sort
L denotes the length of the blocks currently being merged.
void MergeSort(int *a, int n)
{
  int *tempList = new int[n];
  for (int L=1; L<n; L*=2)
  {
    MergePass(a, tempList, n, L);    // the result is put into tempList
    L *= 2;
    MergePass(tempList, a, n, L);    // do the next pass directly: merge records from tempList back to a
  }
  delete [] tempList;
}
Analysis of MergeSort()
Suppose there are n records.
Space complexity: O(n).
Time Complexity
◦ MergePass(): each pass examines all n records once, so it is O(n).
◦ MergeSort(): L doubles each pass (1, 2, 4, …), so a total of ⌈log₂ n⌉ passes are made over the data.
Therefore, the time complexity is O(n log n).
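Putting the pieces together, the iterative version can be exercised as one self-contained unit. Function names follow the text; the compact merge loops are a stylistic condensation, not a change of logic:

```cpp
#include <cassert>

// Merge sorted runs initList[l..m] and initList[m+1..n] into mergeList[l..n].
void Merge(int *initList, int *mergeList, int l, int m, int n)
{
    int i1 = l, iResult = l, i2 = m + 1;
    while (i1 <= m && i2 <= n)
        mergeList[iResult++] = (initList[i1] <= initList[i2]) ? initList[i1++]
                                                              : initList[i2++];
    while (i1 <= m) mergeList[iResult++] = initList[i1++];
    while (i2 <= n) mergeList[iResult++] = initList[i2++];
}

// One pass: merge adjacent blocks of size L from initList to resultList.
void MergePass(int *initList, int *resultList, int n, int L)
{
    int i;
    for (i = 0; i <= n - 2*L; i += 2*L)
        Merge(initList, resultList, i, i + L - 1, i + 2*L - 1);
    if (i + L - 1 < n - 1)                 // an uneven pair of blocks remains
        Merge(initList, resultList, i, i + L - 1, n - 1);
    else
        for (; i < n; i++)                 // at most one block: copy it
            resultList[i] = initList[i];
}

// Double L each round; two passes per iteration keep the result ending in a.
void MergeSort(int *a, int n)
{
    int *tempList = new int[n];
    for (int L = 1; L < n; L *= 2)
    {
        MergePass(a, tempList, n, L);
        L *= 2;
        MergePass(tempList, a, n, L);
    }
    delete [] tempList;
}
```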
7.5.3 Recursive Merge Sort
We divide the list into two roughly equal parts, sort them recursively, and then merge the two sorted halves.
◦ Example with [26, 5, 77, 1, 61, 11, 59, 15, 48, 19]: the left half [26, 5, 77, 1, 61] and the right half [11, 59, 15, 48, 19] are sorted recursively into [1, 5, 26, 61, 77] and [11, 15, 19, 48, 59], and then merged into [1, 5, 11, 15, 19, 26, 48, 59, 61, 77].
void Merge(int *initList, int s, int m, int e)
{
  int *temp = new int[e-s+1];
  int i1=s, iResult=0, i2=m+1;
  while (i1 <= m && i2 <= e)
  {
    if (initList[i1] <= initList[i2])
    {
      temp[iResult++] = initList[i1];
      i1++;
    }
    else
    {
      temp[iResult++] = initList[i2];
      i2++;
    }
  }
  for (; i1<=m; i1++) temp[iResult++] = initList[i1];
  for (; i2<=e; i2++) temp[iResult++] = initList[i2];
  for (int i=0; i<iResult; i++) initList[s+i] = temp[i];  // copy back, offset by s
  delete [] temp;
}
MergeSort()
start and end respectively denote the left
end and right end to be sorted in the
array a.
void MergeSort(int *a, int start, int end)
{
  if (end <= start) return;
  int middle = (start + end) / 2;
  MergeSort(a, start, middle);
  MergeSort(a, middle+1, end);
  Merge(a, start, middle, end);
}
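A self-contained sketch of the recursive version (names follow the text; the copy-back loop offsets by s because the temporary buffer is 0-based):

```cpp
#include <cassert>

// Merge the sorted runs initList[s..m] and initList[m+1..e] in place,
// via a temporary buffer.
void Merge(int *initList, int s, int m, int e)
{
    int *temp = new int[e - s + 1];
    int i1 = s, iResult = 0, i2 = m + 1;
    while (i1 <= m && i2 <= e)
        temp[iResult++] = (initList[i1] <= initList[i2]) ? initList[i1++]
                                                         : initList[i2++];
    while (i1 <= m) temp[iResult++] = initList[i1++];
    while (i2 <= e) temp[iResult++] = initList[i2++];
    for (int i = 0; i < iResult; i++) initList[s + i] = temp[i];
    delete [] temp;
}

// Sort a[start..end] by halving, recursing, and merging.
void MergeSort(int *a, int start, int end)
{
    if (end <= start) return;
    int middle = (start + end) / 2;
    MergeSort(a, start, middle);
    MergeSort(a, middle + 1, end);
    Merge(a, start, middle, end);
}
```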
Analysis of Recursive Merge Sort
Suppose there are n records to be
sorted.
Time complexity
◦ O(n log n), by expanding the recurrence (c a constant):
T(n) = cn + 2T(n/2)
     = cn + 2(c(n/2) + 2T(n/4)) = 2cn + 4T(n/4)
     = …
     = k·cn + 2^k·T(n/2^k)
     = …
     = cn·log₂n + n·T(1) = O(n log n)
Variation
Natural Merge Sort
◦ Make an initial pass over the data to determine the sublists (runs) of records that are already in order, then merge those runs instead of starting from blocks of size 1.
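The initial pass can be sketched as a run-detection helper; the name runEnd is illustrative, not from the text:

```cpp
#include <cassert>

// Return the index of the last record of the naturally sorted run
// that starts at position start in a[0..n-1].
int runEnd(const int *a, int n, int start)
{
    int i = start;
    while (i + 1 < n && a[i] <= a[i + 1])
        i++;
    return i;
}
```

Scanning the array with repeated calls to runEnd yields the initial blocks that natural merge sort merges.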
7.6 HEAP SORT
Discussion
Merge Sort
◦ In both the worst case and the average case, the time complexity is O(n log n).
◦ However, additional storage is required.
There is an O(1)-space merge algorithm, but it is much slower than the original one.
Heap Sort
◦ Only a fixed amount of additional storage is required.
◦ The time complexity is also O(n log n).
◦ It uses a max heap.
Selection Sort
Suppose there are n records in the list. How do we sort the list?
◦ First, find the largest and put it at position n-1.
◦ Second, find the largest among positions 0 to n-2 (the second largest), and put it at position n-2.
…
◦ In the i-th iteration, the i-th largest is selected and put at position n-i.
Consider how to sort [3, 4, 1, 5, 2].
◦ Select 5: [3, 4, 1, 2, 5]
◦ Select 4: [3, 2, 1, 4, 5]
◦ Select 3: [1, 2, 3, 4, 5]
◦ Select 2: [1, 2, 3, 4, 5]
◦ Select 1: [1, 2, 3, 4, 5]
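The selection steps above can be sketched as follows (0-based indexing, so the i-th largest lands at position n-i):

```cpp
#include <algorithm>  // std::swap
#include <cassert>

// In each iteration, find the largest of a[0..last]
// and interchange it with a[last].
void SelectionSort(int *a, int n)
{
    for (int last = n - 1; last > 0; last--)
    {
        int maxPos = 0;
        for (int j = 1; j <= last; j++)
            if (a[j] > a[maxPos]) maxPos = j;
        std::swap(a[maxPos], a[last]);
    }
}
```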
Analysis of Selection
Analysis of Selection Sort
In the i-th iteration, O(n-i+1) computing time is required to select the i-th largest.
◦ O(n) + O(n-1) + … + O(1) = O(n²).
How do we decrease the time complexity?
◦ Improve the approach of selecting the maximum, using a max heap:
The deletion of the maximum from a max heap is O(log n) when there are n elements in the heap.
Note: when using heap sort, the data is stored in a[1..n].
Example
◦ Initial array: a[1..10] = [26, 5, 77, 1, 61, 11, 59, 15, 48, 19].
◦ Initial heap (after heapifying): a[1..10] = [77, 61, 59, 48, 19, 11, 26, 15, 1, 5], with 77 at the root a[1].
◦ Each pass deletes the maximum and places it at the end of the current heap:
Heap Size = 9, Sorted = [77]
Heap Size = 8, Sorted = [61, 77]
Heap Size = 7, Sorted = [59, 61, 77]
Heap Size = 6, Sorted = [48, 59, 61, 77]
Adjust()
Adjust the binary tree with root root to satisfy the heap property, assuming both subtrees of root already satisfy it.
void Adjust(int *a, int root, int n)
{
  int e = a[root];
  int j;
  for (j=2*root; j<=n; j*=2)
  {
    if (j < n && a[j] < a[j+1]) // j is the max child of its parent
      j++;
    if (e >= a[j])
      break;
    a[j/2] = a[j];
  }
  a[j/2] = e;
}
Heap Sort
void HeapSort(int *a, int n)
{
for (int i=n/2; i>=1; i--) //heapify
Adjust(a, i, n);
for (int i=n-1; i>=1; i--) //sort
{
swap(a[1], a[i+1]);
Adjust(a, 1, i);
}
}
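A self-contained usage sketch of the two routines above; since the text stores records in a[1..n], index 0 of the sample array is unused:

```cpp
#include <algorithm>  // std::swap
#include <cassert>

// Sift a[root] down so the subtree rooted at root is a max heap (a[1..n]).
void Adjust(int *a, int root, int n)
{
    int e = a[root];
    int j;
    for (j = 2 * root; j <= n; j *= 2)
    {
        if (j < n && a[j] < a[j + 1]) j++;  // j: the larger child
        if (e >= a[j]) break;
        a[j / 2] = a[j];                    // move the child up
    }
    a[j / 2] = e;
}

// Heap sort on a[1..n]: heapify, then repeatedly move the max to the end.
void HeapSort(int *a, int n)
{
    for (int i = n / 2; i >= 1; i--)        // build the initial max heap
        Adjust(a, i, n);
    for (int i = n - 1; i >= 1; i--)
    {
        std::swap(a[1], a[i + 1]);          // max goes to its final slot
        Adjust(a, 1, i);                    // restore the heap on a[1..i]
    }
}
```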
Analysis of HeapSort()
Space complexity: O(1).
Time complexity:
◦ Suppose the tree has k levels.
The number of nodes on level i is ≦ 2^(i-1).
◦ In the first loop, Adjust() is called once for each node that has a child.
Time complexity of the first loop:
Σ_{1≦i≦k} 2^(i-1)·(k-i) = 2^0·(k-1) + 2^1·(k-2) + … + 2^(k-1)·0
= Σ_{0≦i≦k-1} 2^(k-i-1)·i ≦ n·Σ_{0≦i≦k-1} i/2^i ≦ 2n = O(n)
(using 2^(k-1) ≦ n)
Analysis of HeapSort()
◦ In the next loop, Adjust() is called n-1 times with maximum depth k = ⌈log₂(n+1)⌉, and swap is invoked n-1 times.
◦ Consequently, the computing time for the loop is O(n log n).
Overall, the time complexity of HeapSort() is O(n log n).