Chapter 7 Sorting

Part I
7.1 Motivation
List: a collection of records.
Keys: the fields used to distinguish among the records.
One way to search for a record with a specified key is to examine the list in left-to-right or right-to-left order.

◦ This is called sequential search.
Sequential Search
/* Returns the index of key in a[0..n-1], or -1 if key is absent. */
int SeqSearch(int a[], int n, int key)
{
    int i;
    for (i = 0; i < n && a[i] != key; i++)
        ;
    if (i >= n)
        return -1;
    return i;
}

To search for the key 22:

index:  0   1   2   3   4   5
a:     23   2   7  15  42  12
The search makes n key comparisons when it is unsuccessful.
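As a minimal sketch of the unsuccessful case, the following variant of SeqSearch() also reports how many key comparisons it made. The name SeqSearchCount and the comparisons out-parameter are illustrative assumptions, not part of the original slides.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical variant of SeqSearch(): returns the index of key in
   a[0..n-1], or -1 if absent, and stores the number of key
   comparisons performed in *comparisons. */
int SeqSearchCount(const int a[], int n, int key, int *comparisons)
{
    assert(a != NULL && n >= 0 && comparisons != NULL);
    *comparisons = 0;
    for (int i = 0; i < n; i++) {
        (*comparisons)++;          /* one comparison of a[i] with key */
        if (a[i] == key)
            return i;
    }
    return -1;                     /* all n elements were examined */
}
```

Called on the list 23, 2, 7, 15, 42, 12 with key 22, this sketch returns -1 after 6 comparisons, one per element.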
Analysis of Time Complexity

Worst case:
◦ O(n) when the search is unsuccessful.
◦ Each element is examined exactly once.

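Average case:
◦ When the search is successful, the number of comparisons depends on the position of the search key.
◦ If each of the n positions is equally likely, the average number of comparisons is

$$\left(\sum_{1 \le i \le n} i\right) / n = \frac{n(n+1)}{2} \cdot \frac{1}{n} = \frac{n+1}{2}$$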
Binary Search
/* Returns the index of key in the sorted array a[0..n-1], or -1 if absent. */
int BinarySearch(int a[], int n, int key)
{
    int left = 0, right = n-1;
    while (left <= right)
    {
        int middle = (left + right) / 2;
        if (key < a[middle])
            right = middle - 1;
        else if (key > a[middle])
            left = middle + 1;
        else return middle;
    }
    return -1;
}
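To find 23:

index:  0   1   2   3   4   5
a:      2   7  12  15  23  42

Step 1: left = 0, right = 5, middle = 2; 23 > a[2] = 12, so left = 3.
Step 2: left = 3, right = 5, middle = 4; a[4] = 23, found.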
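To make the trace reproducible, here is a sketch of a binary search that also counts how many elements it probes. BinarySearchProbes and its probes out-parameter are hypothetical names introduced for this example; the midpoint is computed as left + (right - left) / 2, which gives the same value as (left + right) / 2 but cannot overflow.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical variant of BinarySearch(): returns the index of key in
   the sorted array a[0..n-1], or -1 if absent, and stores the number
   of elements examined in *probes. */
int BinarySearchProbes(const int a[], int n, int key, int *probes)
{
    assert(a != NULL && n >= 0 && probes != NULL);
    int left = 0, right = n - 1;
    *probes = 0;
    while (left <= right) {
        int middle = left + (right - left) / 2;
        (*probes)++;               /* a[middle] is examined once per pass */
        if (key < a[middle])
            right = middle - 1;
        else if (key > a[middle])
            left = middle + 1;
        else
            return middle;
    }
    return -1;
}
```

On the list 2, 7, 12, 15, 23, 42, searching for 23 returns index 4 after 2 probes, and an unsuccessful search for 22 returns -1 after 3 probes.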
Even when the search is unsuccessful, the time complexity is still O(log n).
◦ Something is to be gained by maintaining the list in an ordered manner.
Sorting Methods

Internal:
◦ Can be carried out entirely in memory.
◦ Insertion sort.
◦ Quick sort.
◦ Merge sort.
◦ Heap sort.
◦ Radix sort.

External:
◦ The dataset is too large to fit in memory, so the sort cannot be carried out entirely in memory.
7.2 INSERTION SORT
Idea

Consider how to insert a new integer
into a sorted array so that all the
elements in the array are sorted.
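[Figure: inserting 8 into the sorted array 0, 1, 5, 6, 7, 9, 11, 12, 23. The elements 9, 11, 12, and 23 each move one position to the right, and 8 is stored at index 5.]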
Insertion into a Sorted List
Suppose a is an integer array with n elements.
void Insert(int key, int a[], int i);

◦ Inserts a new value, key, into the first i integers of a, which must already be sorted. Afterwards the first i+1 integers are sorted.
void Insert(int key, int a[], int i)
{
    int j;
    /* Shift every element greater than key one position to the right. */
    for (j = i-1; j >= 0 && key < a[j]; j--)
        a[j+1] = a[j];
    a[j+1] = key;
}
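The contract of Insert() can be made explicit with assertions. In this sketch, IsSorted and CheckedInsert are hypothetical helpers added for illustration; Insert() itself follows the definition given above.

```c
#include <assert.h>

/* Insert() as defined above: a[0..i-1] must be sorted on entry,
   and a must have room for one more element at a[i]. */
void Insert(int key, int a[], int i)
{
    int j;
    for (j = i - 1; j >= 0 && key < a[j]; j--)
        a[j + 1] = a[j];           /* shift larger elements right */
    a[j + 1] = key;
}

/* Hypothetical helper: returns 1 if a[0..n-1] is in nondecreasing order. */
int IsSorted(const int a[], int n)
{
    for (int k = 1; k < n; k++)
        if (a[k - 1] > a[k])
            return 0;
    return 1;
}

/* Hypothetical wrapper that checks the contract around one insertion. */
void CheckedInsert(int key, int a[], int i)
{
    assert(IsSorted(a, i));        /* precondition: first i are sorted */
    Insert(key, a, i);
    assert(IsSorted(a, i + 1));    /* postcondition: first i+1 are sorted */
}
```

For example, with a holding 0, 1, 5, 6, 7, 9, 11, 12, 23 in a ten-element array, CheckedInsert(8, a, 9) stores 8 at index 5 and moves 9, 11, 12, and 23 one position to the right.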
Insertion Sort

At the ith iteration of this algorithm, the first i
elements in the original array will be sorted.
void InsertionSort(int a[], int n)
{
    int j;
    for (j = 1; j < n; j++)
    {
        int temp = a[j];
        Insert(temp, a, j);   /* a[0..j-1] is already sorted */
    }
}
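To watch the sorted prefix grow, the sketch below prints the array after every pass. It inlines the shifting loop of Insert() rather than calling it, and the names InsertionSortTrace and PrintState are assumptions introduced for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Prints a[0..n-1], marking the end of the sorted prefix with '|'. */
static void PrintState(const int a[], int n, int sorted)
{
    for (int k = 0; k < n; k++) {
        printf("%d ", a[k]);
        if (k == sorted - 1)
            printf("| ");
    }
    printf("\n");
}

/* Insertion sort that prints the array after each pass. */
void InsertionSortTrace(int a[], int n)
{
    assert(n >= 0);
    for (int j = 1; j < n; j++) {
        int temp = a[j];
        int k = j - 1;
        while (k >= 0 && temp < a[k]) {   /* same shifting as Insert() */
            a[k + 1] = a[k];
            k--;
        }
        a[k + 1] = temp;
        PrintState(a, n, j + 1);          /* a[0..j] is now sorted */
    }
}
```

Each printed line shows the elements to the left of the bar in sorted order, while the elements to the right are still in their original positions.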
[Figure: step-by-step trace of InsertionSort() on a nine-element array a (indices 0 to 8), showing the value held in temp at each step.]
Analysis of InsertionSort()

Worst case: the input is in reverse order.

j   [0] [1] [2] [3] [4]
-    5   4   3   2   1
1    4   5   3   2   1
2    3   4   5   2   1
3    2   3   4   5   1
4    1   2   3   4   5
◦ As each new record is inserted into the
sorted part of the list, the entire sorted part
is shifted right by one position.
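The amount of shifting in the worst case can be counted directly. This is a minimal sketch; InsertionSortMoves is a hypothetical name, and a "move" here means one shift of an element one position to the right.

```c
#include <assert.h>

/* Sorts a[0..n-1] by insertion and returns the number of record moves
   (shifts of an element one position to the right). */
long InsertionSortMoves(int a[], int n)
{
    assert(n >= 0);
    long moves = 0;
    for (int j = 1; j < n; j++) {
        int temp = a[j];
        int k = j - 1;
        while (k >= 0 && temp < a[k]) {
            a[k + 1] = a[k];       /* shift one record to the right */
            moves++;
            k--;
        }
        a[k + 1] = temp;
    }
    return moves;
}
```

For the reversed input 5, 4, 3, 2, 1 the count is 1 + 2 + 3 + 4 = 10, that is n(n-1)/2 for n = 5, while an already sorted input needs no moves at all.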
In the worst case, Insert() makes O(i) comparisons before a[i] is inserted, for i = 1, 2, …, n-1. The time complexity of InsertionSort() is therefore

$$O\left(\sum_{i=1}^{n-1} i\right) = O\left(\frac{n(n-1)}{2}\right) = O(n^2)$$

◦ Even so, insertion sort is about the fastest sorting method for small n (n ≦ 30).
Variations

Binary Insertion Sort:
◦ Reduces the number of comparisons.
◦ To do so, apply binary search in Insert().

Linked Insertion Sort:
◦ The elements of the list are represented as a linked list.
◦ The number of record moves becomes zero because only the link fields require adjustment.
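The binary insertion sort variation might look like the sketch below. It is one possible reading of "apply binary search in Insert()", not code from the slides: the insertion point is found by binary search, which cuts the comparisons per insertion to O(log j), but the elements still have to be shifted, so the number of record moves is unchanged.

```c
#include <assert.h>

/* Binary insertion sort: binary search locates the insertion point
   in the sorted prefix a[0..j-1], then the tail is shifted right. */
void BinaryInsertionSort(int a[], int n)
{
    assert(n >= 0);
    for (int j = 1; j < n; j++) {
        int temp = a[j];
        int lo = 0, hi = j;        /* insertion point lies in [lo, hi] */
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            if (temp < a[mid])
                hi = mid;
            else
                lo = mid + 1;      /* equal keys go to the right: stable */
        }
        for (int k = j; k > lo; k--)
            a[k] = a[k - 1];       /* shift a[lo..j-1] one position right */
        a[lo] = temp;
    }
}
```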
7.3 QUICK SORT
Introduction
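Quick sort has the best average behavior among the sorting methods.
Concept:
◦ Find a pivot from a.
◦ Rearrange a into a’ so that every element to the left of the pivot is ≦ pivot and every element to the right of it is ≧ pivot.
[Figure: array a with its pivot, and the rearranged array a’ laid out as (≦ pivot), pivot, (≧ pivot).]
◦ Quick sort is essentially a recursive approach: the two sides are then sorted in the same way.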
Definition

left and right indicate the range of the array to be sorted.
◦ left should be less than right; if not, return directly.

a[left] is initially chosen as the pivot.

[Figure: the array 4, 1, 5, 1, 2, 6, 4 with pivot = a[left] = 4; i starts at left and j starts just past right.]

i: index used to scan the array from left to right.
j: index used to scan the array from right to left.
Initially, i = left and j = right+1.
Example
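pivot = 4, a = 4, 1, 5, 1, 2, 6, 4

Step 1: i scans to the right and stops at 5. 5 > pivot and should go to the other side.
Step 2: j scans to the left and stops at 2. 2 < pivot and should go to the other side.
Step 3: Interchange a[i] and a[j]. Now a = 4, 1, 2, 1, 5, 6, 4.
Step 4: i stops at 5 again. j will eventually stop at a position where a[j] < pivot (the second 1). i and j have crossed, so stop.
Step 5: Interchange a[j] and the pivot. Now a = 1, 1, 2, 4, 5, 6, 4.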
Algorithm
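void QuickSort(int a[], int left, int right)
{
    if (left < right)
    {
        int pivot = a[left];
        int i = left;
        int j = right + 1;
        int temp;
        while (i < j)
        {
            /* i stops at an element >= pivot, or when it reaches j */
            for (i++; i < j && a[i] < pivot; i++) ;
            /* j stops at an element < pivot, or when it passes i */
            for (j--; j >= i && a[j] >= pivot; j--) ;
            if (i < j)
            {
                /* interchange a[i] and a[j] */
                temp = a[i]; a[i] = a[j]; a[j] = temp;
            }
        }
        /* interchange a[j] and the pivot */
        temp = a[left]; a[left] = a[j]; a[j] = temp;
        QuickSort(a, left, j-1);
        QuickSort(a, j+1, right);
    }
}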
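The partitioning step can be isolated so that it is easy to check against the worked example. In this sketch, Partition and Swap are hypothetical helpers that do not appear in the slides; the scanning rules follow the description of i and j given earlier, and the final interchange of a[j] with the pivot is written out explicitly.

```c
#include <assert.h>

static void Swap(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Hypothetical helper: partitions a[left..right] around pivot = a[left]
   and returns the pivot's final index. Requires left < right. */
int Partition(int a[], int left, int right)
{
    assert(left < right);
    int pivot = a[left];
    int i = left, j = right + 1;
    while (i < j) {
        for (i++; i < j && a[i] < pivot; i++) ;    /* stop at a[i] >= pivot */
        for (j--; j >= i && a[j] >= pivot; j--) ;  /* stop at a[j] <  pivot */
        if (i < j)
            Swap(&a[i], &a[j]);
    }
    Swap(&a[left], &a[j]);         /* put the pivot in its final place */
    return j;
}

void QuickSort(int a[], int left, int right)
{
    if (left < right) {
        int j = Partition(a, left, right);
        QuickSort(a, left, j - 1);
        QuickSort(a, j + 1, right);
    }
}
```

Applied to 4, 1, 5, 1, 2, 6, 4, Partition() returns 3 and leaves the array as 1, 1, 2, 4, 5, 6, 4, with the pivot 4 at index 3.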
Analysis of QuickSort()

Best case
◦ If the pivot always ends up in the middle (left size = right size), the time complexity is

$$\begin{aligned}
T(n) &\le cn + 2T(n/2), \quad \text{for some constant } c \\
     &\le cn + 2(cn/2 + 2T(n/4)) \le 2cn + 4T(n/4) \\
     &\;\;\vdots \\
     &\le cn\log_2 n + nT(1) \\
     &= O(n \log n)
\end{aligned}$$
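The elided middle steps can be filled in as follows. This sketch assumes that n is a power of two, n = 2^p, purely to keep the halving exact; the symbol p is introduced here and is not in the original.

```latex
\begin{aligned}
T(n) &\le cn + 2T(n/2) \\
     &\le 2cn + 4T(n/4) \\
     &\le 3cn + 8T(n/8) \\
     &\;\;\vdots \\
     &\le p\,cn + 2^{p}\,T(n/2^{p}) \\
     &= cn\log_2 n + nT(1) \qquad \text{since } p = \log_2 n \text{ and } 2^{p} = n.
\end{aligned}
```

Each level of the recursion contributes another cn, and there are log_2 n levels before the sublists reach size 1.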
Worst case:
◦ Consider a list that is already sorted, for example 1, 2, 4, 5, 6.
◦ The smallest element is always chosen as the pivot.
◦ n iterations are required to reach the base case (left >= right).
◦ Each iteration takes O(n) time.
◦ The time complexity is O(n^2).
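The quadratic growth on sorted input can be observed by counting key comparisons. The sketch below is an instrumented quick sort written for this purpose; QuickSortCount, Less, and the count parameter are hypothetical names, and the exact count depends on this particular way of scanning.

```c
#include <assert.h>
#include <stddef.h>

/* Counts one key comparison and reports whether x < y. */
static int Less(int x, int y, long *count)
{
    (*count)++;
    return x < y;
}

/* Quick sort with a[left] as pivot; adds the number of key
   comparisons it performs to *count. */
void QuickSortCount(int a[], int left, int right, long *count)
{
    assert(count != NULL);
    if (left >= right)
        return;
    int pivot = a[left];
    int i = left, j = right + 1, t;
    while (i < j) {
        for (i++; i < j && Less(a[i], pivot, count); i++) ;
        for (j--; j >= i && !Less(a[j], pivot, count); j--) ;
        if (i < j) { t = a[i]; a[i] = a[j]; a[j] = t; }
    }
    t = a[left]; a[left] = a[j]; a[j] = t;   /* pivot to its final place */
    QuickSortCount(a, left, j - 1, count);
    QuickSortCount(a, j + 1, right, count);
}
```

On a sorted list of n distinct keys, a sublist of size s costs s comparisons in this sketch and shrinks by only one element per call, so the total is 2 + 3 + ... + n = n(n+1)/2 - 1. For the five-element list 1, 2, 4, 5, 6 that is 14 comparisons.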
Lemma 7.1

Let $T_{avg}(n)$ be the expected time for function QuickSort() to sort a list with n records. Then there exists a constant k such that $T_{avg}(n) \le kn\log_e n$ for $n \ge 2$.
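Proof:
◦ Assume the pivot ends up at position j, leaving j-1 records on its left and n-j records on its right.
◦ The expected time to sort the two sides is $T_{avg}(j-1) + T_{avg}(n-j)$.
◦ Averaging over the n possible positions of the pivot, the average time is

$$T_{avg}(n) \le cn + \frac{1}{n}\sum_{j=1}^{n}\left(T_{avg}(j-1) + T_{avg}(n-j)\right)$$

where c is a constant.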
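The sum simplifies as follows:

$$\frac{1}{n}\sum_{j=1}^{n}\left(T_{avg}(j-1) + T_{avg}(n-j)\right)
= \frac{1}{n}\Big[\big(T_{avg}(0) + T_{avg}(1) + \dots + T_{avg}(n-1)\big) + \big(T_{avg}(n-1) + T_{avg}(n-2) + \dots + T_{avg}(0)\big)\Big]
= \frac{2}{n}\sum_{j=0}^{n-1} T_{avg}(j)$$

Therefore

$$T_{avg}(n) \le cn + \frac{1}{n}\sum_{j=1}^{n}\left(T_{avg}(j-1) + T_{avg}(n-j)\right) = cn + \frac{2}{n}\sum_{j=0}^{n-1} T_{avg}(j) \quad \text{for } n \ge 2.$$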
Assume $T_{avg}(0) \le b$ and $T_{avg}(1) \le b$ for some constant b.
Use induction to prove $T_{avg}(n) \le kn\ln n$ for $n \ge 2$ and $k = 2(b+c)$.

Induction base: for n = 2,

$$T_{avg}(2) \le 2c + 2b \le k \cdot 2\ln 2$$

Induction hypothesis:
◦ Assume that $T_{avg}(n) \le kn\ln n$ for $2 \le n < m$.

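Induction step:

$$T_{avg}(m) \le cm + \frac{2}{m}\sum_{j=0}^{m-1} T_{avg}(j) \le cm + \frac{4b}{m} + \frac{2}{m}\sum_{j=2}^{m-1} T_{avg}(j) \le cm + \frac{4b}{m} + \frac{2k}{m}\sum_{j=2}^{m-1} j\ln j$$

Since $j\ln j$ is an increasing function of j,

$$T_{avg}(m) \le cm + \frac{4b}{m} + \frac{2k}{m}\int_{2}^{m} x\ln x\,dx$$

Use $\int u\,dv = uv - \int v\,du$.
Let $u = \ln x$, so $du = \frac{1}{x}\,dx$. Let $dv = x\,dx$, so $v = \frac{1}{2}x^2$.

$$\int x\ln x\,dx = \frac{1}{2}x^2\ln x - \frac{1}{2}\int x\,dx = \frac{1}{2}x^2\ln x - \frac{x^2}{4}$$

$$\int_{2}^{m} x\ln x\,dx = \left[\frac{1}{2}x^2\ln x - \frac{x^2}{4}\right]_{2}^{m} \le \frac{m^2\ln m}{2} - \frac{m^2}{4}$$

$$T_{avg}(m) \le cm + \frac{4b}{m} + \frac{2k}{m}\int_{2}^{m} x\ln x\,dx \le cm + \frac{4b}{m} + km\ln m - \frac{km}{2} \le km\ln m, \quad \text{for } m \ge 2$$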
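The last inequality can be checked by substituting k = 2(b + c). This short derivation is added as a reading aid and assumes b ≥ 0, which is reasonable because b bounds a running time.

```latex
\begin{aligned}
cm + \frac{4b}{m} - \frac{km}{2}
  &= cm + \frac{4b}{m} - (b + c)\,m \\
  &= \frac{4b}{m} - bm \\
  &= \frac{b\,(4 - m^{2})}{m} \;\le\; 0 \qquad \text{for } m \ge 2.
\end{aligned}
```

The terms added to km ln m are therefore nonpositive, which completes the induction step.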