Searching & Sorting
Manipulating lists of data
for quick retrieval
CMSC 104
Common Problems
There are some very common problems that
computers are asked to solve:
Searching through a lot of records for a
specific record.
Who uses this ?
o Airlines
o Companies that take phone orders
o Credit Card Companies
o ... almost any company
Searching
Does this search have to be fast ?
How can we make the search faster ?
o By keeping the records in some order
o By using an efficient search algorithm
Search algorithms
o Sequential search
o Binary search
Sequential Search
on an Unordered File
Get the search criterion from the user
Get the first record from the file
While the record doesn’t match the
criterion && there are still more records
in the file, get the next record
When do we know that there wasn’t a
record in the file that matched ?
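The steps above can be sketched in C. This is only an illustration under some assumptions: the "file" is an array of ints already in memory, and the names sequential_search, records, count, and key are made up for this example.

```c
/* Sequential search of an unordered list.
   Returns the index of the first record equal to key,
   or -1 if every record was examined without a match. */
int sequential_search(const int records[], int count, int key)
{
    int i = 0;

    /* Keep going while records remain AND the current one doesn't match. */
    while (i < count && records[i] != key)
    {
        i++;
    }

    /* If we ran off the end, no record matched. */
    return (i < count) ? i : -1;
}
```

With an unordered list, we only know there was no match after looking at every record.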
Sequential Search
of an Ordered File
Get the search criterion from the user
Get the first record from the file
While the record is less than the
criterion && there are still more records
in the file, get the next record
If the record matches the criterion then
success, else there is no match in the
file.
When do we know that there wasn’t a
record in the file that matched ?
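A possible C version of the ordered search, again assuming an in-memory array of ints sorted in ascending order (the name ordered_sequential_search and its parameters are hypothetical):

```c
/* Sequential search of a list sorted in ascending order.
   Returns the index of the matching record, or -1 if there is none. */
int ordered_sequential_search(const int records[], int count, int key)
{
    int i = 0;

    /* Skip every record that is smaller than what we want. */
    while (i < count && records[i] < key)
    {
        i++;
    }

    /* We stopped at the first record that is not smaller than key.
       Either it is the match, or the key cannot be in the list. */
    if (i < count && records[i] == key)
    {
        return i;
    }
    return -1;
}
```

Because the list is ordered, the search can stop as soon as it reaches a record larger than the one it wants, instead of reading to the end.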
Sequential Search of
Ordered vs. Unordered List
If the order was ascending alphabetical
on customer’s last names, how would
the search for John Adams on the
ordered list compare to the search on
the unordered list
o if John Adams was in the list ?
o if John Adams was not in the list ?
Ordered vs Unordered
(continued)
How about George Washington ?
o unordered
• in the file
• not in the file
o ordered
• in the file
• not in the file
James Madison ?
More Searching
Overall, we don’t really see much
improvement if we’re using the
sequential search.
Maybe we need a better search
algorithm.
How else could we search an ordered
file ?
Binary Search
If we have an ordered list and we know
how many things are in the list (i.e. # of
records in a file), we can use a different
strategy.
Binary Search gets its name because
we are always going to divide things
into two parts.
How Binary Search Works
Always look at the
center value.
Each time, you get to
discard half of
the remaining list.
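One way this idea might look in C, assuming the records are ints in an array sorted in ascending order (binary_search, low, high, and mid are names chosen for this sketch):

```c
/* Binary search of a list sorted in ascending order.
   Returns the index of key, or -1 if key is not in the list. */
int binary_search(const int records[], int count, int key)
{
    int low = 0;            /* first position still in play */
    int high = count - 1;   /* last position still in play  */

    while (low <= high)
    {
        /* Always look at the center value of what remains. */
        int mid = low + (high - low) / 2;

        if (records[mid] == key)
        {
            return mid;
        }
        else if (records[mid] < key)
        {
            low = mid + 1;  /* discard the lower half */
        }
        else
        {
            high = mid - 1; /* discard the upper half */
        }
    }
    return -1;              /* nothing left to search */
}
```

For example, in a sorted list of 11 items, finding the very first or very last item takes 4 looks with this version.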
Is this fast ?
How fast is Binary Search ?
Worst case : 11 Items in the list took 4
tries
How about a list with 32 items ?
o 1st try - list has 16 items
o 2nd try - list has 8 items
o 3rd try - list has 4 items
o 4th try - list has 2 items
o 5th try - list has 1 item
More examples
List has 250 items
1st try - 125 items
2nd try - 63 items
3rd try - 32 items
4th try - 16 items
5th try - 8 items
6th try - 4 items
7th try - 2 items
8th try - 1 item
List has 512 items
1st try - 256 items
2nd try - 128 items
3rd try - 64 items
4th try - 32 items
5th try - 16 items
6th try - 8 items
7th try - 4 items
8th try - 2 items
9th try - 1 item
What’s the pattern ?
List of 11 took 4 tries
List of 32 took 5 tries
List of 250 took 8 tries
List of 512 took 9 tries
32 = 2^5 and 512 = 2^9
8 < 11 < 16, so 2^3 < 11 < 2^4
128 < 250 < 256, so 2^7 < 250 < 2^8
The fastest !
How long (worst case) will it take to find
an item in a list 30,000 items long ?
2^10 = 1024
2^11 = 2048
2^12 = 4096
2^13 = 8192
2^14 = 16384
2^15 = 32768
So it will take 15 tries.
It only takes 15 tries to find what we
want out of 30,000 items - that’s
awesome !!!
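The counting used in these slides can be checked with a small C function. This sketch follows the slides' way of counting: halve the list, rounding up, until one item remains. The name tries_to_one_item is made up for this example. A stricter count of comparisons also has to look at that last remaining item, so it can come out one higher when the list size is an exact power of 2 (32 or 512, for instance).

```c
/* Counts how many times a list of n items must be cut in half
   (rounding up) before only one item remains. */
int tries_to_one_item(long n)
{
    int tries = 0;

    while (n > 1)
    {
        n = (n + 1) / 2;   /* keep the larger half, rounded up */
        tries++;
    }
    return tries;
}
```

Called with 11, 32, 250, 512, and 30000, this function returns 4, 5, 8, 9, and 15, which agrees with the numbers worked out by hand.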
Lg n
We say that the binary search algorithm
runs in lg n time.
Lg n means the log to the base 2 of
some value of n
8 = 2^3, so lg 8 = 3
16 = 2^4, so lg 16 = 4
No search that works by comparing items
in an ordered list runs faster than
lg n time in the worst case.
Searching and Sorting
(continued)
We have a very fast search algorithm:
binary search.
But, the list has to be sorted, before we
can search it with binary search.
To be really efficient, we also need a
fast sort algorithm.
Some Sort Algorithms
Bubble Sort
Selection Sort
Insertion Sort
Heap Sort
Merge Sort
Quick Sort
Many sorting algorithms have been
developed in an effort to find a very fast
one. Bubble sort is among the slowest,
running in n^2 time. Too slow !
Speed of Sorting Algorithms
The faster sorting algorithms, such as
heap sort and merge sort, run in n lg n
time for the worst case.
Quick Sort can take n^2 time in the worst
case, but it runs in n lg n time for the
average case and is usually a little faster
in practice. So it is usually the sort
that’s used. The algorithm for Quick
Sort is quite complicated. It is shown in
your book. There is a pre-written
function called qsort in the C standard
library.
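A minimal sketch of calling qsort on an array of ints. The helper names compare_ints and sort_ints are made up for this example; qsort itself and its four arguments come from <stdlib.h>. Note that the C standard does not promise which algorithm qsort uses internally.

```c
#include <stdlib.h>

/* Comparison function that qsort calls on pairs of elements.
   It must return a negative value, zero, or a positive value when
   the first element is less than, equal to, or greater than the second. */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;

    return (x > y) - (x < y);
}

/* Sorts an array of ints into ascending order with the library qsort. */
void sort_ints(int a[], size_t size)
{
    /* Arguments: the array, the number of elements,
       the size of one element, and the comparison function. */
    qsort(a, size, sizeof a[0], compare_ints);
}
```

The comparison avoids the tempting shortcut of returning x - y, which can overflow for very large or very negative values.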
Bubble Sort
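void BubbleSort (int a[ ], int size)
{
    int i, j, temp;

    /* After pass i, the i + 1 largest values are in their final places. */
    for (i = 0; i < size - 1; i++)
    {
        /* Compare each neighboring pair in the part not yet sorted. */
        for (j = 0; j < size - 1 - i; j++)
        {
            if (a[j] > a[j + 1])
            {
                /* Swap the out-of-order pair. */
                temp = a[j];
                a[j] = a[j + 1];
                a[j + 1] = temp;
            }
        }
    }
}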
Insertion Sort
Insertion sort is slower than quick sort,
but not as slow as bubble sort, and it is
easy to understand.
Insertion sort works the same way as
arranging your hand when playing
cards.
o Out of the pile of unsorted cards that were
dealt to you, you pick up a card and place
it in your hand in the correct position
relative to the cards you’re already holding.
Arranging Your Hand
[Figure: the cards 7, 5, 6, K, 8 are picked up one at a time and placed in order. The hand grows from 7 to 5 7, then 5 6 7, then 5 6 7 K, and finally 5 6 7 8 K.]
Insertion Sort
[Figure legend: unsorted cards are shaded]
Look at 2nd item - 5
1 Compare 5 to 7
5 is smaller, so move 5
to temp, leaving
an empty slot in
position 2
2 Move 7 into the empty
slot, leaving position 1
open
3 Move 5 into the open
position
Insertion Sort
Look at next item - 6
1 Compare to 1st - 5
6 is larger, so leave 5
Compare to next - 7,
6 is smaller, so move
6 to temp, leaving an
empty slot
2 Move 7 into the empty
slot, leaving position 2
open
3 Move 6 to the open
2nd position
Insertion Sort
Look at next item - King
Compare to 1st - 5
King is larger, so
leave 5 where it is
Compare to next - 6,
King is larger, so
leave 6 where it is
Compare to next - 7
King is larger, so
leave 7 where it is
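Insertion Sort
[Figure: the next item, 8, is compared with each card in turn. It is larger than 5, 6, and 7 but smaller than King, so 8 moves to temp (step 1), King moves into the empty slot (step 2), and 8 moves into the open position (step 3), giving 5 6 7 8 K.]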
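The card walkthrough translates fairly directly into C. This is one possible sketch, written in the same style as BubbleSort; the name InsertionSort is an assumption, and temp plays the same role as in the walkthrough.

```c
/* Insertion sort: sorts a[0..size-1] into ascending order. */
void InsertionSort(int a[], int size)
{
    int i, j, temp;

    /* a[0..i-1] is the sorted "hand"; a[i] is the next card picked up. */
    for (i = 1; i < size; i++)
    {
        /* Step 1: move the new item to temp, leaving an empty slot. */
        temp = a[i];
        j = i - 1;

        /* Step 2: slide each larger item one position to the right. */
        while (j >= 0 && a[j] > temp)
        {
            a[j + 1] = a[j];
            j--;
        }

        /* Step 3: move the new item into the open position. */
        a[j + 1] = temp;
    }
}
```

Treating the King as 13, calling InsertionSort on the array 7, 5, 6, 13, 8 leaves it as 5, 6, 7, 8, 13, the same final hand as in the card example.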
Courses at UMBC
Algorithms - CMSC 441
o Studies algorithms and their speed
Cryptology - CMSC 443
o The study of making & breaking codes
o Write programs that can break a code like
Pdeo eo pda yknnayp wjosan
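As a taste of what such a program might look like, here is a small C sketch. It assumes the message was made by shifting every letter the same number of places through the alphabet (a guess about this particular code, not something the slide states). The name caesar_shift is made up for this example.

```c
#include <ctype.h>
#include <stddef.h>

/* Copies in to out, moving every letter forward by shift places
   (0 to 25) and wrapping around from z back to a. Spaces and other
   characters are copied unchanged. out must have room for the whole
   string plus the terminating '\0'. */
void caesar_shift(const char *in, char *out, int shift)
{
    size_t i;

    for (i = 0; in[i] != '\0'; i++)
    {
        unsigned char c = (unsigned char)in[i];

        if (isupper(c))
        {
            out[i] = (char)('A' + (c - 'A' + shift) % 26);
        }
        else if (islower(c))
        {
            out[i] = (char)('a' + (c - 'a' + shift) % 26);
        }
        else
        {
            out[i] = (char)c;
        }
    }
    out[i] = '\0';
}
```

A code breaker could call this with every shift from 0 to 25 and look for the one output that reads as English.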