Algorithms and data structures
7.11.2015.
Protected by http://creativecommons.org/licenses/by-nc-sa/3.0/hr/
Creative Commons
You are free to:
share — copy and redistribute the material in any medium or format
 adapt — remix, transform, and build upon the material

Under the following terms:



Attribution — You must give appropriate credit, provide a link to the license, and
indicate if changes were made. You may do so in any reasonable manner, but
not in any way that suggests the licensor endorses you or your use.
NonCommercial — You may not use the material for commercial purposes.
ShareAlike — If you remix, transform, or build upon the material, you must
distribute your contributions under the same license as the original.
No additional restrictions — You may not apply legal terms or technological
measures that legally restrict others from doing anything the license permits.
Notices:
You do not have to comply with the license for elements of the material in the public domain or where your use is
permitted by an applicable exception or limitation.
No warranties are given. The license may not give you all of the permissions necessary for your intended use. For
example, other rights such as publicity, privacy, or moral rights may limit how you use the material.
Text copied from http://creativecommons.org/licenses/by-nc-sa/3.0/
Algorithms and data structures, FER
7.11.2015.
Sorting algorithms
Algorithms

Selected for illustration:
 Selection sort
 Bubble sort
 Insertion sort
 Shell sort
 Merge sort
 Quick sort
 Heap sort - later!
 Sortovi (Sorting)
http://www.solidware.com/sort/
Selection sort


Find the smallest element in the array and swap it with the first array element
Repeat it with the rest of the array, while decreasing the unsorted portion of the
array
Array contents initially and after each pass of the outer loop:
6 4 1 8 7 5 3 2
1 4 6 8 7 5 3 2
1 2 6 8 7 5 3 4
1 2 3 8 7 5 6 4
1 2 3 4 7 5 6 8
1 2 3 4 5 7 6 8
1 2 3 4 5 6 7 8
1 2 3 4 5 6 7 8
1 2 3 4 5 6 7 8
Algorithm and complexity

Implementation - 2 loops:


The outer loop determines the range of the sorted portion of the array
The inner loop finds the minimum array element
void SelectionSort (int A [], int N) {
    int i, j, min;
    for (i = 0; i < N; i++) {
        min = i;                        /* index of the smallest element found so far */
        for (j = i+1; j < N; j++) {
            if (A[j] < A[min]) min = j;
        }
        Swap(&A[i], &A[min]);           /* Swap exchanges two ints through pointers */
    }
}
Inner loop: $O(n-i-1)$; in total: $\sum_{i=0}^{n-1} O(n-i-1)$
Analysis of the execution time



Comparisons dominate; there are far fewer swaps
The execution time does not depend on the initial arrangement, only on the number of array elements
 Execution is not essentially faster if the input elements are already close to their final positions
O(n^2): roughly n^2/2 comparisons and n swaps, in both the average and the worst case
Acceleration:
 Start sorting simultaneously from both ends
The worst case: a reversely sorted sequence
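The idea of sorting from both ends at once can be sketched as follows. This is only one possible reading of that remark, not the lecturer's implementation: the name DoubleSelectionSort and the local Swap helper are assumptions made for this illustration. Each pass finds both the minimum and the maximum of the unsorted middle part, so the number of passes is halved, while the complexity stays O(n^2).

```c
/* Hypothetical helper: exchanges two ints through pointers. */
void Swap(int *a, int *b) {
    int t = *a;
    *a = *b;
    *b = t;
}

/* Sketch: selection sort that places the minimum and the maximum in one pass. */
void DoubleSelectionSort(int A[], int N) {
    int left = 0, right = N - 1;
    while (left < right) {
        int min = left, max = left, j;
        for (j = left + 1; j <= right; j++) {
            if (A[j] < A[min]) min = j;
            if (A[j] > A[max]) max = j;
        }
        Swap(&A[left], &A[min]);
        /* If the maximum was sitting at 'left', the swap above moved it to 'min'. */
        if (max == left) max = min;
        Swap(&A[right], &A[max]);
        left++;
        right--;
    }
}
```

The correction of max after the first swap is the easy detail to miss: without it, an array whose largest element starts at the left end would be sorted incorrectly.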
Bubble sort

The main idea: neighbouring elements are swapped if they are in the wrong order
 Start from the beginning of the array and progress towards the end
 Swap 2 elements if the first is larger than the second one
 Acceleration: if no swap has occurred during a traversal of the whole array, the array is sorted
Bubble sort - example
Array contents at the start of each traversal and after each comparison:
1. traversal: 64187532 46187532 41687532 41687532 41678532 41675832 41675382 41675328
2. traversal: 41675328 14675328 14675328 14675328 14657328 14653728 14653278
3. traversal: 14653278 14653278 14653278 14563278 14536278 14532678
4. traversal: 14532678 14532678 14532678 14352678 14325678
5. traversal: 14325678 14325678 13425678 13245678
6. traversal: 13245678 13245678 12345678
7. traversal: 12345678 12345678
Algorithm
void BubbleSort (int A [], int N) {
    int i, j;
    for (i = 0; i < N-1; i++) {
        for (j = 0; j < N-1-i; j++) {
            if (A[j+1] < A[j])
                Swap(&A[j], &A[j+1]);
        }
    }
}
Inner loop: $O(n-1-i)$; in total: $\sum_{i=0}^{n-2} O(n-i-1)$
Improved bubble sort

If in a traversal no swapping has occurred, the array is sorted
void BubbleSort (int A [], int N) {
    int i, j, SwapOccurred;
    /* stop as soon as a whole traversal passes without a swap */
    for (i = 0, SwapOccurred = 1; SwapOccurred; i++) {
        SwapOccurred = 0;
        for (j = 0; j < N-1-i; j++) {
            if (A[j+1] < A[j]) {
                Swap(&A[j], &A[j+1]);
                SwapOccurred = 1;
            }
        }
    }
}
Analysis of the execution time

O(n^2): roughly n^2/2 comparisons in the average and the worst case; roughly n^2/2 swaps in the worst case and about half as many on average
 The worst case: a reversely sorted sequence
 The best case: an already sorted sequence
If the input elements are close to their final positions, sorting can be completed quickly
The position of elements is essential for efficiency
 Large elements at the beginning of the array are not a problem: they quickly move towards the end (hares)
 Small elements at the end of the array are a problem: they slowly progress towards the beginning (turtles)
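The difference between hares and turtles can be made visible by counting traversals. The sketch below is the improved bubble sort with one extra counter; the name BubbleSortTraversals and the local Swap helper are assumptions introduced only for this illustration.

```c
/* Hypothetical helper: exchanges two ints through pointers. */
void Swap(int *a, int *b) {
    int t = *a;
    *a = *b;
    *b = t;
}

/* Improved bubble sort that also reports how many traversals
   (outer-loop iterations, including the final swap-free one) were made. */
int BubbleSortTraversals(int A[], int N) {
    int i, j, SwapOccurred, traversals = 0;
    for (i = 0, SwapOccurred = 1; SwapOccurred; i++) {
        SwapOccurred = 0;
        traversals++;
        for (j = 0; j < N - 1 - i; j++) {
            if (A[j + 1] < A[j]) {
                Swap(&A[j], &A[j + 1]);
                SwapOccurred = 1;
            }
        }
    }
    return traversals;
}
```

For the array 8 1 2 3 4 5 6 7 (one hare at the beginning) the function returns 2, while for 2 3 4 5 6 7 8 1 (one turtle at the end) it returns 8, because the turtle moves only one position per traversal.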
Insertion sort


The idea: the array has two parts, the sorted and the non-sorted one
 In each step, the first element of the non-sorted part is inserted into the sorted part at the right place
This is the way cards are (usually) sorted in card games
[Figure: before a step the array is laid out as | <=x | >x | x | non-sorted |, after it as | <=x | x | >x | non-sorted |]
Insertion sort - example
Array contents initially and after each insertion step:
6 4 1 8 7 5 3 2
6 4 1 8 7 5 3 2
4 6 1 8 7 5 3 2
1 4 6 8 7 5 3 2
1 4 6 8 7 5 3 2
1 4 6 7 8 5 3 2
1 4 5 6 7 8 3 2
1 3 4 5 6 7 8 2
1 2 3 4 5 6 7 8
Algorithm and complexity

Implementation - 2 loops:


The outer loop determines the range of the sorted portion of the array
The inner loop inserts an element into the sorted part and shifts the rest of
elements
void InsertionSort (int A [], int N) {
    int i, j;
    int aux;
    for (i = 1; i < N; i++) {
        aux = A[i];                     /* element to be inserted */
        /* shift larger elements of the sorted part one place to the right */
        for (j = i; j >= 1 && A[j-1] > aux; j--)
            A[j] = A[j-1];
        A[j] = aux;
    }
}
Inner loop: $O(i)$; in total: $O\left(\sum_{i=0}^{n-1} O(i)\right)$
Analysis of the execution time



O(n^2): roughly n^2/4 comparisons and n^2/4 element moves in the average case, and twice as many in the worst case
When the input elements are close to their final positions, sorting is completed quickly
Sorting is stable
 Elements with the same key value are not swapped
 If a and b have the same key value and a was positioned before b, then after a stable sort a is still positioned before b
Best and worst cases?
 The worst case: a reversely sorted array
 The best case: an already sorted array
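Stability becomes observable only when elements carry more than their key. The following sketch applies the same insertion sort to small records; the Record type, its tag field and the name InsertionSortRecords are assumptions made for this illustration.

```c
typedef struct {
    int key;     /* value the records are sorted by */
    char tag;    /* hypothetical label, used only to tell equal keys apart */
} Record;

/* Insertion sort on records, comparing keys only. */
void InsertionSortRecords(Record A[], int N) {
    int i, j;
    Record aux;
    for (i = 1; i < N; i++) {
        aux = A[i];
        /* strict '>' means a record is never moved past a record with an equal key */
        for (j = i; j >= 1 && A[j - 1].key > aux.key; j--)
            A[j] = A[j - 1];
        A[j] = aux;
    }
}
```

Sorting (2,a) (1,b) (2,c) (1,d) this way gives (1,b) (1,d) (2,a) (2,c): records with equal keys keep their original relative order. Replacing `>` with `>=` in the inner loop would still sort the keys but would lose this property.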
Shell sort

The oldest among the fast sorting algorithms, a modified insertion sort
 author: Donald Shell
Idea:
 An array A is k-sorted if A[i] <= A[i + k] for every i such that both i and i + k are valid indices
 If an array is k-sorted and is then additionally t-sorted (t < k), it also remains k-sorted
 A completely sorted array is 1-sorted
 Generally, an increasing sequence of increments h1, h2, h3, …, ht is used in reverse order
 The choice of the sequence is crucial for the efficiency of the algorithm
 Animation: http://www.cis.fiu.edu/~weiss/Shellsort.html
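The definition of a k-sorted array translates directly into a small check. This is a sketch for experimenting with the definition; the function name IsKSorted is an assumption and is not part of the lecture code.

```c
/* Returns 1 if A[i] <= A[i + k] holds for every valid pair of indices, otherwise 0. */
int IsKSorted(const int A[], int N, int k) {
    int i;
    for (i = 0; i + k < N; i++) {
        if (A[i] > A[i + k]) return 0;
    }
    return 1;
}
```

For example, 6 4 1 2 7 5 3 8 is 4-sorted but neither 2-sorted nor 1-sorted, while 1 2 3 4 6 5 7 8 is both 2-sorted and 4-sorted, yet still not 1-sorted.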
Shell sort – example
Array contents after each step of the middle loop:
step = 4: 64187532 64187532 64187532 64127538
step = 2: 14627538 12647538 12647538 12647538 12346578 12346578
step = 1: 12346578 12346578 12346578 12346578 12345678 12345678 12345678
Algorithm
void ShellSort (int A [], int N) {
    int i, j, step, aux;
    for (step = N / 2; step > 0; step /= 2) {
        /* insertion sort over elements that are 'step' apart */
        for (i = step; i < N; i++) {
            aux = A [i];
            for (j = i; j >= step && A[j-step] > aux; j -= step) {
                A [j] = A [j - step];
            }
            A [j] = aux;
        }
    }
}
Complexity analysis


The average execution time was an open (unsolved) problem for a long time
 Worst case O(n^2)
Hibbard's sequence {1, 3, 7, …, 2^k - 1} gives a worst case of O(n^(3/2))
 The average O(n^(5/4)) has been determined by simulation; there is no proof for it yet!
Sedgewick's sequence {1, 5, 19, 41, 109, …}, i.e. the terms 9*4^i - 9*2^i + 1 alternating with 4^i - 3*2^i + 1
 Worst case O(n^(4/3)); the average is conjectured to be O(n^(7/6))
It is not known whether a better sequence can be found
A simple algorithm with an extremely complicated complexity analysis
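Changing the increment sequence only changes the outer loop. As a sketch, here is Shell sort with Hibbard's increments 2^k - 1 used in reverse order; the name ShellSortHibbard and the way the starting increment is chosen (the largest 2^k - 1 smaller than N) are assumptions of this illustration.

```c
/* Shell sort with Hibbard's increments ..., 15, 7, 3, 1. */
void ShellSortHibbard(int A[], int N) {
    int i, j, step, aux;

    /* largest increment of the form 2^k - 1 that is smaller than N (at least 1) */
    step = 1;
    while (2 * step + 1 < N) step = 2 * step + 1;

    /* (step - 1) / 2 maps 2^k - 1 to 2^(k-1) - 1, and finally 1 to 0 */
    for (; step > 0; step = (step - 1) / 2) {
        for (i = step; i < N; i++) {
            aux = A[i];
            for (j = i; j >= step && A[j - step] > aux; j -= step) {
                A[j] = A[j - step];
            }
            A[j] = aux;
        }
    }
}
```

For N = 8 the increments are 7, 3, 1. The two inner loops are unchanged from the halving version, which is why experimenting with different sequences is cheap.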

[Figure: comparison of sorting algorithms with complexity O(n^2): Bubble, Selection, Insertion, Shell]
Mergesort



The divide-and-conquer strategy is applied, with recursion
author: John von Neumann, 1945
Idea:
 The unsorted sequence is divided into two approximately equal parts
 Each sub-array is sorted recursively, until the sub-array is reduced to a single element
  – A single-element array is already sorted!
 Two sorted sub-arrays are merged into a sorted array
  – From two sorted arrays (A and B) a third array (C) is formed
 Branching creates log2 n levels, and at each level a process of complexity O(n) is performed
 The execution time is O(n log2 n)
Mergesort - example
Division:
31 24 47 1 6 78 12 65
31 24 47 1 | 6 78 12 65
31 24 | 47 1 | 6 78 | 12 65
31 | 24 | 47 | 1 | 6 | 78 | 12 | 65
Merging:
24 31 | 1 47 | 6 78 | 12 65
1 24 31 47 | 6 12 65 78
1 6 12 24 31 47 65 78
Algorithm

Exercise: write the function Merge

The function merges the left and the right (approximate) half into a sorted
array
void MSort(int A [], int AuxArray[], int left, int right) {
    int middle;
    if (left < right) {
        middle = left + (right - left) / 2;
        MSort (A, AuxArray, left, middle);                /* sort the left half */
        MSort (A, AuxArray, middle + 1, right);           /* sort the right half */
        Merge (A, AuxArray, left, middle + 1, right);     /* merge the two sorted halves */
    }
}
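One possible solution of the Merge exercise is sketched below, together with MSort so that the block compiles on its own. The parameter names rightStart and rightEnd, the interpretation of the fourth argument as the first index of the right half, and the wrapper MergeSort are assumptions of this sketch rather than the course's reference solution.

```c
#include <stdlib.h>

/* Merges the sorted runs A[left..rightStart-1] and A[rightStart..rightEnd]. */
void Merge(int A[], int AuxArray[], int left, int rightStart, int rightEnd) {
    int leftEnd = rightStart - 1;
    int i = left, j = rightStart, k = left;

    /* '<=' takes the left element on ties, which keeps the sort stable */
    while (i <= leftEnd && j <= rightEnd)
        AuxArray[k++] = (A[i] <= A[j]) ? A[i++] : A[j++];
    while (i <= leftEnd) AuxArray[k++] = A[i++];    /* rest of the left run */
    while (j <= rightEnd) AuxArray[k++] = A[j++];   /* rest of the right run */

    for (k = left; k <= rightEnd; k++) A[k] = AuxArray[k];  /* copy back */
}

void MSort(int A[], int AuxArray[], int left, int right) {
    int middle;
    if (left < right) {
        middle = left + (right - left) / 2;
        MSort(A, AuxArray, left, middle);
        MSort(A, AuxArray, middle + 1, right);
        Merge(A, AuxArray, left, middle + 1, right);
    }
}

/* Hypothetical wrapper: allocates the auxiliary array once. Returns 0 on allocation failure. */
int MergeSort(int A[], int N) {
    int *AuxArray;
    if (N < 2) return 1;
    AuxArray = malloc((size_t)N * sizeof *AuxArray);
    if (AuxArray == NULL) return 0;
    MSort(A, AuxArray, 0, N - 1);
    free(AuxArray);
    return 1;
}
```

Allocating the auxiliary array once in the wrapper, instead of inside every Merge call, keeps the additional memory at n elements and avoids repeated allocation.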
Remarks

Price of faster sorting: memory
 An auxiliary array is created!
 Increased demand for additional memory and copying
Rarely used for sorting in main memory
 It is the principal algorithm for sorting in external memory
Complexity in the average and the worst case: O(n log2 n)
Not faster if the input sequence is already sorted!
Quicksort


In practice, the fastest sorting algorithm known so far
Recursion: "divide-and-conquer"
http://euler.slu.edu/~goldwasser/demos/quicksort/
http://www.cs.queensu.ca/home/cisc121/2004f/lecturenotes/malamb/SortingDemos/QuickSortDemo.html
4 steps: quicksort (S)
 If the number of members of the array S equals 0 or 1, return to the calling program
 Choose any member v of the array S. Let it be the pivot.
 Distribute the remaining members of the array S, i.e. S \ {v}, into two disjoint sets:
  – S1 = { x ∈ S \ {v} | x ≤ v } (everything less than the pivot moves to the left)
  – S2 = { x ∈ S \ {v} | x ≥ v } (everything greater than the pivot moves to the right)
 Return the array constructed from {quicksort (S1), v, quicksort (S2)}
Pivot selection


Not uniquely determined
Neither is it uniquely determined what to do with array members equal to the pivot
 A question of algorithm implementation
 A part of a good implementation is an efficient solution to this question, see:
  – Weiss: "Data Structures and Algorithm Analysis in C".
Possible methods for pivot selection:
 Estimation of the median, based on 3 elements (the first element, the last element, the element in the middle of the array)
  – During the estimation of the median, these three elements are immediately sorted
 Other possibilities: a randomly chosen element, the first element, the last element
E.g. the array: 8 1 4 9 6 3 5 2 7 0
  – pivot = med3 (8, 6, 0) = 6
  – What would be the worst pivot?
  – What is the actual median?
Quicksort - example
Selection of the pivot (^ marks the first, the middle and the last element)
8 1 4 9 6 3 5 2 7 0
^       ^         ^
The three elements are sorted
0 1 4 9 6 3 5 2 7 8
Pivot moved to the penultimate position
0 1 4 9 7 3 5 2 6 8
i->           <-j
i stops at 9, j stops at 2; they are swapped
0 1 4 9 7 3 5 2 6 8
      i       j
i stops at 7, j stops at 5; they are swapped
0 1 4 2 7 3 5 9 6 8
        i   j
i and j have passed each other
0 1 4 2 5 3 7 9 6 8
          j i
Returning the pivot to the position i
0 1 4 2 5 3 6 9 7 8
Selection of pivot in each of the two parts
0 1 4 2 5 3 | 6 | 9 7 8
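The partitioning traced in this example can be sketched in C as follows. It follows the median-of-three scheme described in Weiss's book, but the function names Median3, QSort and QuickSort, the local Swap helper and the choice to finish sub-arrays of fewer than 4 elements with insertion sort are assumptions of this illustration.

```c
/* Hypothetical helper: exchanges two ints through pointers. */
void Swap(int *a, int *b) {
    int t = *a;
    *a = *b;
    *b = t;
}

/* Sorts A[left], A[center], A[right], moves the median to A[right-1] and returns it. */
int Median3(int A[], int left, int right) {
    int center = left + (right - left) / 2;
    if (A[center] < A[left]) Swap(&A[left], &A[center]);
    if (A[right] < A[left]) Swap(&A[left], &A[right]);
    if (A[right] < A[center]) Swap(&A[center], &A[right]);
    Swap(&A[center], &A[right - 1]);    /* pivot to the penultimate position */
    return A[right - 1];
}

void QSort(int A[], int left, int right) {
    int i, j, pivot;
    if (right - left < 3) {             /* fewer than 4 elements: insertion sort */
        for (i = left + 1; i <= right; i++) {
            int aux = A[i];
            for (j = i; j > left && A[j - 1] > aux; j--) A[j] = A[j - 1];
            A[j] = aux;
        }
        return;
    }
    pivot = Median3(A, left, right);
    i = left;
    j = right - 1;
    for (;;) {
        while (A[++i] < pivot) { }      /* stops at the pivot at the latest */
        while (A[--j] > pivot) { }      /* stops at A[left] at the latest */
        if (i < j) Swap(&A[i], &A[j]);
        else break;                     /* i and j have passed each other */
    }
    Swap(&A[i], &A[right - 1]);         /* return the pivot to position i */
    QSort(A, left, i - 1);
    QSort(A, i + 1, right);
}

void QuickSort(int A[], int N) {
    QSort(A, 0, N - 1);
}
```

Because Median3 leaves a value not greater than the pivot at A[left] and the pivot itself at A[right-1], both inner while loops need no explicit bounds checks, which is part of what makes the inner loop so efficient.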
Algorithm complexity

Average execution time is O(n log n)
 Sorting is very quick, mostly due to a very efficient inner loop
The worst case is O(n^2)
 With a bad choice of pivot (the min or max member each time), n partitions are obtained, and for each of them the execution time is O(n)
 It can be arranged that the probability of such a case decreases exponentially
[Figure: comparison of sorts with complexity O(n log n): Heap, Merge, Quick]
Sorting procedures



Sorting about a million records is not rare in practice
If a single execution of a loop takes 1 μs:
 A classical sort would need about 10^6 s (i.e. more than 11 days)
 Quick sort takes about 20 s
The solution should not always be sought in acquiring faster and more expensive computers
 Investment in the development and application of better algorithms may pay off
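The two estimates can be reproduced with a short calculation, assuming n = 10^6 records, one loop execution per microsecond, about n^2 loop executions for a classical sort and about n log2 n for quick sort (constant factors are ignored, so these are order-of-magnitude figures only):

```latex
\begin{align*}
n^2 \cdot 1\,\mu\mathrm{s} &= 10^{12}\,\mu\mathrm{s} = 10^{6}\,\mathrm{s} \approx 11.6\ \text{days} \\
n \log_2 n \cdot 1\,\mu\mathrm{s} &\approx 10^{6} \cdot 20\,\mu\mathrm{s} = 2 \cdot 10^{7}\,\mu\mathrm{s} = 20\,\mathrm{s}
\end{align*}
```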
Indirect sorting

For sorting large data structures it would not be efficient to swap whole records
 Examples of such structures
  – Student's ID, family name, name, address, enrolled courses and grades
 If the data are sorted e.g. by ID, a separate array of IDs is formed, with adjoined pointers to the rest of the data
 Only this array is sorted (using any appropriate algorithm)
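A minimal sketch of indirect sorting is given below. The types Student and IndexEntry, their fields and the function names are hypothetical, and the standard library function qsort stands in for "any appropriate algorithm".

```c
#include <stdlib.h>

typedef struct {
    int id;
    char familyName[32];
    char name[32];
    /* address, enrolled courses and grades are omitted in this sketch */
} Student;

/* One entry of the separate array: the key plus a pointer to the full record. */
typedef struct {
    int id;
    const Student *record;
} IndexEntry;

/* Comparison function for qsort: orders index entries by ascending ID. */
int CompareById(const void *a, const void *b) {
    const IndexEntry *x = a;
    const IndexEntry *y = b;
    return (x->id > y->id) - (x->id < y->id);
}

/* Fills index[0..n-1] from students and sorts only the index; the records stay in place. */
void BuildSortedIndex(const Student students[], IndexEntry index[], int n) {
    int i;
    for (i = 0; i < n; i++) {
        index[i].id = students[i].id;
        index[i].record = &students[i];
    }
    qsort(index, (size_t)n, sizeof index[0], CompareById);
}
```

Each swap during sorting now moves one small IndexEntry instead of a whole Student record, and the records can afterwards be visited in sorted order through index[i].record.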
Comparison
Sorting algorithm   best        average     worst       stable   method
selection sort      O(n^2)      O(n^2)      O(n^2)      no       selection
insertion sort      O(n)        O(n^2)      O(n^2)      yes      insertion
bubble sort         O(n)        -           O(n^2)      yes      swapping
shell sort          -           -           ?           no       insertion
merge sort          O(n log n)  O(n log n)  O(n log n)  yes      merging
quick sort          O(n log n)  O(n log n)  O(n^2)      no       division
heap sort           O(n log n)  O(n log n)  O(n log n)  no       selection
Animations of algorithms

http://www.geocities.com/SiliconValley/Network/1854/Sort1.html
http://www.solidware.com/sort/
http://www.cs.hope.edu/~dershem/ccaa/animator/Animator.html
http://cg.scs.carleton.ca/~morin/misc/sortalg/
http://homepages.dcc.ufmg.br/~dorgival/applets/SortingPoints/SortingPoints.html
http://www.cis.fiu.edu/~weiss/Shellsort.html
http://www.educypedia.be/education/mathematicsjavasorting.htm

Google...






Exercises

Write a program to find the kth largest member in an integer array of n members.
a) Sort the input array in descending order and print out the member with index k-1.
b) Enter k array members and sort them in descending order. Enter the remaining array members. If a member is smaller than the one with index k-1, ignore it; if it is larger, insert it into the appropriate place and throw out the array member that would now obtain the index k.
Apply various sorting algorithms, determine the corresponding a priori execution times and measure the a posteriori execution times.

Exercises
Write a program to merge two sequential sorted files into a third
sorted sequential file.
 UpariDatoteke (MergeFiles)
