CSS 342 - University of Washington

CSS 342
DATA STRUCTURES, ALGORITHMS, AND DISCRETE MATHEMATICS I
LECTURE 14. 150302.
CARRANO CHAPT 12
Agenda
• Lab 3: how to run the test cases
• Lab 5: Preview
• Sorts
• Shell Sort
• Radix Sort
• Sort Summary
• Complexity theory example
Lab 5
• The Jolly Banker
• Purpose:
• Learn Queues
• Learn Binary Search Trees
• Practice class design
• Two phase project
• Design Review: Wednesday, 3/4.
• Lab turned in 3/13.
Sorts and sorting
Sorting the Sorts
Selection Sort: worst/average O(n^2)
Bubble Sort: worst/average O(n^2)
Insertion Sort: worst/average O(n^2)
Shell Sort: worst O(n^2) / average O(n^(3/2))
Merge Sort: worst/average O(n log n)
Quick Sort: worst O(n^2) / average O(n log n)
Radix Sort: worst/average O(n)
The Quick Sort
Data Structures and Problem Solving with C++: Walls and Mirrors, Carrano and Henry, © 2013
Quicksort: Recursive Overview
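void quicksort(vector<int> &a, int first, int last)
{
    int pivotIndex;
    if (first < last)
    {
        // Choose a pivot, then partition a[first..last] around it;
        // partition is assumed to update pivotIndex to the pivot's final position.
        pivotIndex = choosePivot(a, first, last);
        partition(a, first, last, pivotIndex);
        // Recursively sort S1 (left of the pivot) and S2 (right of the pivot).
        quicksort(a, first, pivotIndex - 1);
        quicksort(a, pivotIndex + 1, last);
    }
}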
Quicksort: Efficiency Analysis
Worst case: If the pivot is the smallest item in the array segment, S1 will remain empty.
◦ S2 decreases in size by only 1 at each recursive call.
◦ Level 1 requires n-1 comparisons.
◦ Level 2 requires n-2 comparisons.
◦ Thus, (n-1) + (n-2) + ... + 2 + 1 = n(n-1)/2 = O(n^2)
Average case: S1 and S2 contain the same number of items.
◦ log n or log n + 1 levels of recursion occur.
◦ Each level requires at most n-1 comparisons.
◦ Thus, at most (n-1) * (log n + 1) = O(n log n)
Shell Sort
• Generalization of the Insertion Sort
• Optimized to reduce data movement
• Developed 1959 by Donald Shell
• Choose an interleave/gap size and insertion-sort the subarrays formed by the elements that are that many positions apart
• This moves data large distances quickly
• Complexity has not been fully determined
• Depends on gap size (see appendix)
• Works best on partially sorted data
Shell Sort Example
Using gaps of size 5, 3, 1.
Computer Scientist of last week
Andrew Tanenbaum
• Professor, Vrije Universiteit, Amsterdam
• Minix, father of Microkernel
• Famous Tanenbaum-Torvalds debate
• Recognized for creating computer science textbooks
Original array (indices 0 to 16):
81 94 11 96 12 35 17 95 28 58 41 75 15 85 87 38 20

gap = 17/2 = 8
81 94 11 96 12 35 17 95
28 58 41 75 15 85 87 38
20
sort each column:
20 58 11 75 12 35 17 38
28 94 41 96 15 85 87 95
81

gap = 8/2.2 = 3
20 58 11
75 12 35
17 38 28
94 41 96
15 85 87
95 81
sort each column:
15 12 11
17 38 28
20 41 35
75 58 87
94 81 96
95 85

gap = 3/2.2 = 1
15 12 11 17 38 28 20 41 35 75 58 87 94 81 96 95 85
sort:
11 12 15 17 20 28 35 38 41 58 75 81 85 87 94 95 96
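// Start with gap = size / 2, then divide by 2.2 each pass; a gap of 2 is
// forced to 1 so the final pass is always a plain insertion sort.
for (int gap = size / 2; gap > 0; gap = (gap == 2) ? 1 : int(gap / 2.2))
{
    // Insertion sort over elements that are gap positions apart.
    for (int i = gap; i < size; i++)
    {
        int tmp = arr[i];
        int j = i;
        // Shift larger elements gap positions to the right.
        for ( ; (j >= gap) && (tmp < arr[j - gap]); j -= gap)
        {
            arr[j] = arr[j - gap];
        }
        arr[j] = tmp;
    }
}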
http://en.wikipedia.org/wiki/Shellsort#mediaviewer/File:Sorting_shellsort_anim.gif
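Mind the Gap
Gap sequences by author and year of publication (general term for k ≥ 1; concrete gaps; worst-case time complexity):
• Shell, 1959: floor(N / 2^k); N/2, N/4, ..., 1; Θ(N^2) [when N = 2^p]
• Frank & Lazarus, 1960: 2 * floor(N / 2^(k+1)) + 1; 2 * floor(N/4) + 1, ..., 3, 1; Θ(N^(3/2))
• Hibbard, 1963: 2^k - 1; 1, 3, 7, 15, 31, 63, ...; Θ(N^(3/2))
• Papernov & Stasevich, 1965: 2^k + 1, prefixed with 1; 1, 3, 5, 9, 17, 33, 65, ...; Θ(N^(3/2))
Mind the Gap (more)
• Pratt, 1971: successive numbers of the form 2^p * 3^q; 1, 2, 3, 4, 6, 8, 9, 12, ...; Θ(N log^2 N)
• Knuth, 1973: (3^k - 1) / 2, not greater than ceil(N / 3); 1, 4, 13, 40, 121, ...; Θ(N^(3/2))
• Incerpi & Sedgewick, 1985; Knuth: 1, 3, 7, 21, 48, 112, ...; O(N^(1 + sqrt(8 ln(5/2) / ln N)))
• Sedgewick, 1986: 4^k + 3 * 2^(k-1) + 1, prefixed with 1; 1, 8, 23, 77, 281, ...; O(N^(4/3))
• Sedgewick, 1986: 1, 5, 19, 41, 109, ...; O(N^(4/3))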
The Radix Sort
Uses the idea of forming groups, then combining them to sort a collection of
data.
Consider collection of three letter groups
ABC, XYZ, BWZ, AAC, RLT, JBX, RDT, KLT, AEO, TLJ
Group strings by rightmost letter
(ABC, AAC) (TLJ) (AEO) (RLT, RDT, KLT) (JBX) (XYZ, BWZ)
Combine groups
ABC, AAC, TLJ, AEO, RLT, RDT, KLT, JBX, XYZ, BWZ
The Radix Sort
Group strings by middle letter
(AAC) (ABC, JBX) (RDT) (AEO) (TLJ, RLT, KLT) (BWZ) (XYZ)
Combine groups
AAC, ABC, JBX, RDT, AEO, TLJ, RLT, KLT, BWZ, XYZ
Group by first letter, combine again
(AAC, ABC, AEO) (BWZ) (JBX) (KLT) (RDT, RLT) (TLJ) (XYZ)
Sorted strings
AAC, ABC, AEO, BWZ, JBX, KLT, RDT, RLT, TLJ, XYZ
Radix Sort: Overview
0123 2154 0222 0004 0283 1560 1061 2150                  Original integers
(1560 2150) (1061) (0222) (0123 0283) (2154 0004)        Grouped by 4th digit
1560 2150 1061 0222 0123 0283 2154 0004                  Combined
(0004) (0222 0123) (2150 2154) (1560 1061) (0283)        Grouped by 3rd digit
0004 0222 0123 2150 2154 1560 1061 0283                  Combined
(0004 1061) (0123 2150 2154) (0222 0283) (1560)          Grouped by 2nd digit
0004 1061 0123 2150 2154 0222 0283 1560                  Combined
(0004 0123 0222 0283) (1061 1560) (2150 2154)            Grouped by 1st digit
0004 0123 0222 0283 1061 1560 2150 2154                  Combined (sorted)
Radix Sort (Efficiency Analysis)
Each grouping pass requires n shuffles.
The number of grouping and combining steps equals the number of digits.
◦ In the previous example it is 4.
Thus, for k-digit numbers, the performance is:
◦ k * n = O(n), where k does not depend on n
Disadvantages:
◦ Memory inefficient
◦ k is not really a constant
◦ Need to compare digits in the same order rather than items
◦ Need to accommodate 10 groups for numbers
◦ Need to accommodate 27 groups for strings (alphabet + blank)
1. Assume a List class which is a singly linked list of nodes. The node and class are defined as follows:
class List
{
public:
    List();
    ~List();
    … member functions …
    void Delete(Item it);
private:
    struct Node
    {
        Item *pItem;
        Node *next;
    };
    Node *head;
};
head -> Item a -> Item b -> Item c -> Item d -> Item e
Write a member function, MoveToEnd, which takes in an item by value, finds the first occurrence, and moves it to the end of the list. Return false if the item could not be found; otherwise return true.
Efficiency comparisons
http://www.sorting-algorithms.com/
https://www.youtube.com/watch?v=kPRA0W1kECg
STL
WHY REINVENT THE WHEEL?
STL Sequence Containers: the Big 3 (recap)
• Vector
  • Flexibly sized array
  • Access any element in constant time (index into array)
  • Add/Remove from the end of array
  • Data kept contiguous in memory
• Deque
  • Double ended queue
  • Can add/look from front or back
  • Access any element in constant time
  • Not guaranteed to be contiguous in memory
• List
  • Linked list
  • Need iterator to traverse
  • Can add anywhere in list in constant time
Recall the stack
• Last In First Out (LIFO)
• We implemented with following structures:
  • Array
  • Linked List
• (aStack.push(newItem)).pop() is equal to aStack
• STL has a stack implementation as a Container Adapter
  • Container adapter on vector, deque, or list
  • Default is deque
  • Functions: empty, size, push, pop, top
#include <iostream>
#include <vector>
#include <stack>
#include <deque>
using namespace std;

int main( )
{
    stack<int> aStack;                 // default underlying container is deque
    stack<int, vector<int>> bStack;    // stack adapted onto a vector
    for (int i = 0; i < 3; i++)
    {
        bStack.push(i);
    }
    cout << "stack size is: " << bStack.size() << endl;
    cout << "top element is: " << bStack.top() << endl;
    bStack.pop();                      // removes the top element; returns nothing
    cout << "Popped! " << endl;
    cout << "top element is: " << bStack.top() << endl;
    return 0;
}