#### Transcript N - jgypk

```A.E. Csallner
Department of Applied Informatics
University of Szeged
Hungary


Algorithm:
• Finite sequence of finite steps
• Provides the solution to a given problem
Properties:
• Finiteness
• Definiteness
• Executability

Communication:
• Input
• Output
Algorithms and Data Structures I

Design strategies:
• Bottom-up: synthesize smaller algorithmic parts into bigger ones
• Top-down: formulate the problem and repeatedly break it up into smaller and smaller parts
Structured programming
Example (top-down design): shoe a horse
• shoe a horse
  • a horse has four hooves ⇒ shoe a hoof
    • need a horseshoe ⇒ hammer a horseshoe
    • need to fasten the horseshoe to the hoof ⇒ drive a cog into a hoof
      • need cogs ⇒ hammer a cog
Basic elements of structured programming
• Sequence: series of actions
• Selection: branching on a decision
• Iteration: conditional repetition
All structured algorithms can be defined using only these three elements (E.W. DIJKSTRA, 1960s)
Algorithm description
An algorithm description method defines an algorithm so that the description code should be
• unambiguous;
• programming language independent;
• still easy to implement;
• state-of-the-art
Some possible types of classification:
• Age (when the description method was invented)
• Purpose (e.g. structural or object-oriented)
• Formulation (graphical or text code, etc.)
• ...
Most popular and useful description methods
• Flow diagram
  • old
  • not definitely structured(!)
  • graphical
  • very intuitive and easy to use
A possible notation of flow diagrams
• Circle: START and STOP
• Rectangle: any action execution can be given here
• Diamond: any yes/no question, with one outgoing branch labeled yes and one labeled no
A possible notation of flow diagrams
An example:
[Flow diagram: START; decision "Need more horseshoes?"; on yes: "Hammer a horseshoe"; on no: "Shoe a hoof"; STOP. The diagram marks its sequence, selection and iteration parts.]
Most popular and useful description methods
• Pseudocode
  • old
  • definitely structured
  • text based
  • very easy to implement
Properties of a possible pseudocode
• Assignment instruction: ←
• Looping constructs as in Pascal:
  • for-do instruction (counting loop)
      for variable ← initial value to/downto final value
        do body of the loop
  • while-do instruction (pre-test loop)
      while stay-in test
        do body of the loop
  • repeat-until instruction (post-test loop)
      repeat body of the loop
      until exit test
• Conditional constructs as in Pascal:
  • if-then-else instruction (else clause is optional)
      if test
        then test passed clause
        else test failed clause
• Blocks are denoted by indentation
• Object identifiers are references
• Field of an object separator is a dot:
    object.field
    object.method
    object.method(formal parameter list)
• Empty reference is NIL
• Arrays are objects
• Parameters are passed by value
Properties of a possible pseudocode
An example:
ShoeAHorse(Hooves)
  hoof ← 1
  while hoof ≤ Hooves.Count
    do horseshoe ← HammerAHorseshoe
       Hooves[hoof] ← horseshoe
       hoof ← hoof + 1
(The while loop is an iteration; its body is a sequence.)
Type algorithms
Algorithm classification on the I/O structure
• Sequence → Value
• Sequence → Sequence
• More sequences → Sequence
• Sequence → More sequences
Sequence → Value
• sequence calculations (e.g. summation, product of a sequence),
• decision (e.g. checking whether a sequence contains any element with a given property),
• selection (e.g. determining the first element in a sequence with a given property provided we know that there exists at least one),
• search (e.g. finding a given element),
• counting (e.g. counting the elements having a given property),
• minimum or maximum search (e.g. finding the least or the largest element).
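As an illustration of the Sequence → Value type, counting can be sketched in the pseudocode of these slides as follows. This is only a sketch: the names Count and HasProperty are assumptions made here, they do not come from the original slides.
Count(A)
  count ← 0
  for i ← 1 to A.Length
    do if HasProperty(A[i])
         then count ← count + 1
  return count
Decision and selection follow the same pattern, but they can stop at the first element having the property.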
Sequence → Sequence
• selection (e.g. collect the elements with a given property of a sequence),
• copying (e.g. copy the elements of a sequence to create a second sequence),
• sorting (e.g. arrange elements into an increasing order).
More sequences → Sequence
• union (e.g. set union of sequences),
• intersection (e.g. set intersection of sequences),
• difference (e.g. set difference of sequences),
• uniting sorted sequences (merging / combing two ordered sequences).
Sequence → More sequences
• filtering (e.g. filtering out elements of a sequence having given properties).

Special algorithms
Iterative algorithm
Consists of two parts:
• Initialization (usually initializing data)
• Iteration (repeated part)

Recursive algorithms
Basic types:
• direct (self-reference)
• indirect (mutual references)
Two alternative parts depending on the base criterion:
• Base case (if the problem is small enough)
• Recurrences (direct or indirect self-reference)
An example of recursive algorithms: Towers of Hanoi
Aim:
Move n disks from a rod to another, using a third one
Rules:
• One disk moved at a time
• No disk on top of a smaller one
Recursive solution of the problem
1st step: move n–1 disks
2nd step: move 1 disk
3rd step: move n–1 disks
Pseudocode of the recursive solution
TowersOfHanoi(n,FirstRod,SecondRod,ThirdRod)
1 if n > 0
2   then TowersOfHanoi(n – 1,FirstRod,ThirdRod,SecondRod)
3        write “Move a disk from ” FirstRod “ to ” SecondRod
4        TowersOfHanoi(n – 1,ThirdRod,SecondRod,FirstRod)
Lines 2, 3 and 4 carry out the 1st, 2nd and 3rd step of the recursive solution, respectively.

Backtracking algorithms
Backtracking algorithm:
• Sequence of systematic trials
• Builds a tree of decision branches
• Steps back (backtracking) in the tree if no branch at a point is effective
An example of the backtracking algorithms
Eight Queens Puzzle: eight chess queens to be placed on a chessboard so that no two queens attack each other
Pseudocode of the iterative solution
EightQueens
  column ← 1
  RowInColumn[column] ← 0
  repeat
    repeat inc(RowInColumn[column])
    until IsSafe(column,RowInColumn)
    if RowInColumn[column] > 8
      then column ← column – 1
      else if column < 8
             then column ← column + 1
                  RowInColumn[column] ← 0
             else draw chessboard
  until column = 0
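IsSafe is used above but it is not spelled out on the slide. A possible version is sketched below, under the assumption (made here, not stated in the original) that a row index greater than 8 must also end the inner loop, so that EightQueens can step back to the previous column:
IsSafe(column,RowInColumn)
  row ← RowInColumn[column]
  if row > 8
    then return true      ▷ no row is left in this column
  for c ← 1 to column − 1
    do if RowInColumn[c] = row or |RowInColumn[c] − row| = column − c
         then return false      ▷ same row or same diagonal as an earlier queen
  return true
Only the columns to the left are checked because the queens are placed column by column.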
Analysis of algorithms
Questions regarding an algorithm:
• Does it solve the problem?
• How fast does it solve the problem?
• How much storage place does it occupy to solve the problem?
The last two questions are the complexity issues of the algorithm.
Elementary storage or time: independent from the size of the input.
Example 1
If an algorithm needs 500 kilobytes to store some internal data, this can be considered as elementary.
Example 2
If an algorithm contains a loop whose body is executed 1000 times, it counts as an elementary algorithmic step.
Hence a block of instructions counts as a single elementary step if none of the particular instructions depends on the size of the input.
A looping construct counts as a single elementary step if the number of iterations it executes does not depend on the size of the input and its body is an elementary step.
⇒ to shoe a horse can be considered as an elementary step ⇔ it takes constant time (one step) to shoe a horse
The time complexity of an algorithm is a function
depending on the size of the input.
Notation: T(n) where n is the size of the input
Function T can depend on more than one
variable, e.g. T(n,m) if the input of the
algorithm is an n⨯m matrix.
Example: Find the minimum of an array.
Minimum(A)
1 min ← A[1]
2 i ← 1
3 repeat
4   i ← i + 1
5   if A[i] < min
6     then min ← A[i]
7 until i ≥ A.Length
8 return min
Lines 1-2 count as 1 elementary step; the loop in lines 3-7 is executed n−1 times.
Hence T(n) = n (where n = A.Length)
Does this change if line 8 (return min) is considered as an extra step?
In other words: n ≈ n + 1?
It does not change!
Proof:
n + 1 = (n − 1) + 2 ≈ (n − 1) + 1 = n
(the constant 2 counts as a single elementary step)
This so-called asymptotic behavior can be formulated rigorously in the following way:
We say that f(x) = O(g(x)) (big O notation) if
(∃C, x₀ > 0) (∀x ≥ x₀) 0 ≤ f(x) ≤ C∙g(x)
This means that g is an asymptotic upper bound of f.
[Figure: graphs of f(x), g(x) and C∙g(x); from x₀ on, f(x) stays below C∙g(x)]
The O notation denotes an upper bound.
If g is also a lower bound of f then we say that f(x) = Θ(g(x)) if
(∃c, C, x₀ > 0) (∀x ≥ x₀) 0 ≤ c∙g(x) ≤ f(x) ≤ C∙g(x)
This means that f asymptotically equals g.
[Figure: graphs of f(x), g(x), c∙g(x) and C∙g(x); from x₀ on, f(x) stays between c∙g(x) and C∙g(x)]
What does the asymptotic notation show us?
We have seen:
T(n) = Θ(n) for the procedure Minimum(A), where n = A.Length
However, due to the definition of the Θ function
T(n) = Θ(n),
T(2n) = Θ(n),
T(3n) = Θ(n), ...
Minimum does not run slower on more data?
Asymptotic notation shows us the tendency:
• T(n) = Θ(n), linear tendency:
  n data → a certain amount of time t
  2n data → time ≈ 2t
  3n data → time ≈ 3t
• T(n) = Θ(n²), quadratic tendency:
  n data → a certain amount of time t
  2n data → time ≈ 2²t = 4t
  3n data → time ≈ 3²t = 9t
Recursive algorithm – recursive function T
Example: Towers of Hanoi
TowersOfHanoi(n,FirstRod,SecondRod,ThirdRod)
1 if n > 0
2   then TowersOfHanoi(n – 1,FirstRod,ThirdRod,SecondRod)
3        write “Move a disk from ” FirstRod “ to ” SecondRod
4        TowersOfHanoi(n – 1,ThirdRod,SecondRod,FirstRod)
Line 2 costs T(n−1), line 3 costs 1 and line 4 costs T(n−1), hence
T(n) = T(n−1) + 1 + T(n−1) = 2T(n−1) + 1
T(n) = 2T(n−1) + 1 is a recursive function.
In general it is very difficult (sometimes impossible) to determine the explicit form of an implicit (recursive) formula.
If the algorithm is recursive, the solution can be achieved using recursion trees.
Recursion tree of TowersOfHanoi:
level 0: 1 call of size n, cost 1
level 1: 2 calls of size n−1, cost 2
level 2: 4 calls of size n−2, cost 4
...
last level: 2^(n−1) calls of size 1, cost 2^(n−1)
Total: 1 + 2 + 4 + ... + 2^(n−1) = 2^n − 1
Time complexity:
T(n) = 2^n − 1 = Θ(2^n), exponential time (very slow)
Example: n = 64 (from the original legend), assuming one disk move per second
T(n) = 2^n − 1 = 2^64 − 1
≈ 1.8∙10^19 seconds
≈ 3∙10^17 minutes
≈ 5.1∙10^15 hours
≈ 2.1∙10^14 days
≈ 5.8∙10^11 years > half a trillion years
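The closed form can also be checked by unrolling the recurrence. The sketch below assumes that a call with n = 0 costs nothing, i.e. T(0) = 0 (an assumption made here for the derivation):
T(n) = 2T(n−1) + 1
     = 4T(n−2) + 2 + 1
     = 8T(n−3) + 4 + 2 + 1
     = ...
     = 2^k∙T(n−k) + 2^(k−1) + ... + 2 + 1
     = 2^n∙T(0) + 2^n − 1      (taking k = n)
     = 2^n − 1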
Problem (example): search a given element in a sequence (array).
LinearSearch(A,w)
  i ← 0
  repeat i ← i + 1
  until A[i] = w or i = A.Length
  if A[i] = w
    then return i
    else return NIL
Array: 8 1 3 9 5 6 2
Best case
Element wanted: 8
Time complexity: T(n) = 1 = Θ(1)
Worst case
Element wanted: 2
Time complexity: T(n) = n = Θ(n)
Average case
The mean value of the time complexities on all possible inputs (the wanted element is found at position 1, 2, ..., n):
T(n) = (1 + 2 + 3 + 4 + ... + n) / n = n∙(n + 1) / 2n = (n + 1) / 2 = Θ(n)
(The same as in the worst case)
To store a set of data of the same type in a linear structure, two basic solutions exist:
• Arrays: physical sequence in the memory
  18 29 22
• Linked lists: the elements are linked together using links (pointers or indices)
  18 → 29 → 22 (each element stores a key and a link)
Time complexity of some operations on arrays and linked lists in the worst case
Operation     Array   Linked list
Search        O(n)    O(n)
Insert        O(n)    O(1)
Delete        O(n)    O(1)
Minimum       O(n)    O(n)
Maximum       O(n)    O(n)
Successor     O(n)    O(n)
Predecessor   O(n)    O(n)
[Figure: a linked list with a dummy head element followed by the keys 18, 29 and 22; the last link is empty (X); a pointer refers to an element of the list]
Indirection (indirect reference): pointer.key
[Figure: the same list stored in arrays with cell indices 1 to 8; each used cell holds a key (18, 29 or 22) and the index of the next cell, and the last link is empty (X)]
Problem: a lot of garbage
Garbage collection for array-represented lists
The empty cells are linked to a separate garbage list.
[Figure: the arrays of the previous example; the cells holding no key are chained into a list that starts at the index named garbage]
To allocate place for a new key and use it:
• the first element of the garbage list is linked out from the garbage
• and linked into the proper list with a new key (33 here) if necessary.
[Figure: the arrays after the allocation; the cell referred to by new holds the new key 33]
Pseudocode for garbage management
Allocation of a cell (only fragments of the listing survive):
1 ...
2   then return 0
3 ...
4 ...
5 return new
Deletion of a key (only fragments of the listing survive; the surviving notes are "extra case: the first element is to be deleted" and "to step forward"):
1-6 ...
7 while toDelete ≠ 0 and key[toDelete] ≠ toFind
8   do pointer ← toDelete
9      ...
10 if toDelete ≠ 0
11   ...
12   ...
Stacks and queues
Common properties:
• only two operations are defined:
  • store a new key (called push and enqueue, resp.)
  • extract a key (called pop and dequeue, resp.)
• all (both) operations work in constant time
Different properties:
• stacks are LIFO structures
• queues are FIFO (or pipeline) structures
Two erroneous cases:
• an empty data structure is intended to be extracted from: underflow
• no more space but insertion attempted: overflow
Stack management using arrays
Example with a stack of three cells:
push(8), push(1), push(3): the stack holds 8, 1, 3 with 3 on the top
push(9): stack overflow
pop, pop, pop: the stack becomes empty
pop: stack underflow
Push(key,Stack)
  if Stack.top = Stack.Length      ▷ stack overflow
    then return Overflow error
    else Stack.top ← Stack.top + 1
         Stack[Stack.top] ← key
Pop(Stack)
  if Stack.top = 0      ▷ stack underflow
    then return Underflow error
    else Stack.top ← Stack.top − 1
         return Stack[Stack.top + 1]
Queue management using arrays
[Figure: a queue stored in an array of n cells; the positions beginning and end are marked]
Empty queue:
• beginning = n
• end = 0
Enqueue(key,Queue)
  if Queue.beginning = Queue.end      ▷ queue overflow
    then return Overflow error
    else if Queue.end = Queue.Length
           then Queue.end ← 1
           else Queue.end ← Queue.end + 1
         Queue[Queue.end] ← key
Dequeue(Queue)
  if Queue.end = 0      ▷ queue underflow
    then return Underflow error
    else if Queue.beginning = Queue.Length
           then Queue.beginning ← 1
           else inc(Queue.beginning)
         key ← Queue[Queue.beginning]
         if Queue.beginning = Queue.end
           then Queue.beginning ← Queue.Length
                Queue.end ← 0
         return key
Binary search trees
Linear data structures cannot provide better time complexity than n in some cases
Idea: let us use another kind of structure
Solution:
• rooted trees (especially binary trees)
• special order of keys (‘search trees’)
A binary tree:
Notions: root, vertex (node), edge, parent - child, twins (siblings), leaf, levels, depth (height)
A binary search tree:
for all vertices, all keys in the left subtree are smaller and all keys in the right subtree are greater
Example tree, written as key (left subtree, right subtree):
28 (12 (7, 21 (14, 26)), 30 (NIL, 49 (NIL, 50)))
Implementation of binary search trees:
each vertex stores the key and other data, a link to the left child and a link to the right child
Binary search tree operations:
• tree walk
  • inorder: 1. left 2. root 3. right
  The inorder walk visits the keys in increasing order: 7 12 14 21 26 28 30 49 50
InorderWalk(Tree)
1 if Tree ≠ NIL
2   then InorderWalk(Tree.Left)
3        visit Tree, e.g. check it or list it
4        InorderWalk(Tree.Right)
The so-called preorder and postorder tree walks only differ by the order of lines 2-4:
• preorder: root → left → right
• postorder: left → right → root
Binary search tree operations:
• tree search
  TreeSearch(14): the path is 28 → 12 → 21 → 14 (found)
  TreeSearch(45): the path is 28 → 30 → 49, then an empty link (not found)
TreeSearch(toFind,Tree)
  while Tree ≠ NIL and Tree.key ≠ toFind
    do if toFind < Tree.key
         then Tree ← Tree.Left
         else Tree ← Tree.Right
  return Tree
Binary search tree operations:
• insert
  TreeInsert(14): the path is 28 → 12 → 21, and 14 becomes the left child of 21
  new vertices are always inserted as leaves
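The slides give no listing for the insertion. A possible version in the same pseudocode is sketched below; the names TreeInsert, NewElement, Above and Below are assumptions made here, and the (possibly new) root is returned because parameters are passed by value:
TreeInsert(NewElement,Tree)
  Above ← NIL
  Below ← Tree
  while Below ≠ NIL      ▷ walk down as in TreeSearch
    do Above ← Below
       if NewElement.key < Below.key
         then Below ← Below.Left
         else Below ← Below.Right
  NewElement.Parent ← Above
  NewElement.Left ← NIL
  NewElement.Right ← NIL
  if Above = NIL
    then Tree ← NewElement      ▷ the tree was empty
    else if NewElement.key < Above.key
           then Above.Left ← NewElement
           else Above.Right ← NewElement
  return Tree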
Binary search tree operations:
• tree minimum
• tree maximum
TreeMinimum(Tree)
  while Tree.Left ≠ NIL
    do Tree ← Tree.Left
  return Tree
TreeMaximum(Tree)
  while Tree.Right ≠ NIL
    do Tree ← Tree.Right
  return Tree
Binary search tree operations:
• successor of an element
  TreeSuccessor(12) = 14: the tree minimum of the right subtree
  if the element has no right child: TreeSuccessor(26) = 28, the first ancestor reached through a parent-left child relation
TreeSuccessor(Element)
  if Element.Right ≠ NIL
    then return TreeMinimum(Element.Right)
    else Above ← Element.Parent
         while Above ≠ NIL and Element = Above.Right
           do Element ← Above
              Above ← Above.Parent
         return Above
Finding the predecessor is similar.
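As a sketch, the mirrored version could look like this (the name TreePredecessor is an assumption made here; Left and Right and the minimum and maximum simply change roles):
TreePredecessor(Element)
  if Element.Left ≠ NIL
    then return TreeMaximum(Element.Left)
    else Above ← Element.Parent
         while Above ≠ NIL and Element = Above.Left
           do Element ← Above
              Above ← Above.Parent
         return Above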
Binary search tree operations:
• delete
  1. if the element has no children, e.g. TreeDelete(26): the element is simply linked out
  2. if the element has only one child, e.g. TreeDelete(30): its child takes its place
  3. if the element has two children, e.g. TreeDelete(12): 12 is replaced by a close key, e.g. the successor, 14; the successor, found in the right subtree as its tree minimum, has at most one child
TreeDelete(Element,Tree)
The case if Element has no children:
  if Element.Left = NIL and Element.Right = NIL
    then if Element.Parent = NIL
           then Tree ← NIL
           else if Element = (Element.Parent).Left
                  then (Element.Parent).Left ← NIL
                  else (Element.Parent).Right ← NIL
         Free(Element)
         return Tree
The case if Element has only a right child:
  if Element.Left = NIL and Element.Right ≠ NIL
    then if Element.Parent = NIL
           then Tree ← Element.Right
                (Element.Right).Parent ← NIL
           else (Element.Right).Parent ← Element.Parent
                if Element = (Element.Parent).Left
                  then (Element.Parent).Left ← Element.Right
                  else (Element.Parent).Right ← Element.Right
         Free(Element)
         return Tree
The case if Element has only a left child:
  if Element.Left ≠ NIL and Element.Right = NIL
    then if Element.Parent = NIL
           then Tree ← Element.Left
                (Element.Left).Parent ← NIL
           else (Element.Left).Parent ← Element.Parent
                if Element = (Element.Parent).Left
                  then (Element.Parent).Left ← Element.Left
                  else (Element.Parent).Right ← Element.Left
         Free(Element)
         return Tree
The case if Element has two children:
  if Element.Left ≠ NIL and Element.Right ≠ NIL
    then Substitute ← TreeSuccessor(Element)
         ▷ link Substitute out from its place
         if Substitute.Right ≠ NIL
           then (Substitute.Right).Parent ← Substitute.Parent
         if Substitute = (Substitute.Parent).Left
           then (Substitute.Parent).Left ← Substitute.Right
           else (Substitute.Parent).Right ← Substitute.Right
         ▷ link Substitute into Element's place
         Substitute.Parent ← Element.Parent
         if Element.Parent = NIL
           then Tree ← Substitute
           else if Element = (Element.Parent).Left
                  then (Element.Parent).Left ← Substitute
                  else (Element.Parent).Right ← Substitute
         Substitute.Left ← Element.Left
         (Substitute.Left).Parent ← Substitute
         Substitute.Right ← Element.Right
         if Substitute.Right ≠ NIL      ▷ the right subtree may have become empty
           then (Substitute.Right).Parent ← Substitute
         Free(Element)
         return Tree
Time complexity of binary search tree operations
• T(n) = O(d) for all operations (except for the walk), where d denotes the depth of the tree
• The expected depth of a randomly built binary search tree is d = O(log n)
• Hence the time complexity of the search tree operations in the average case is T(n) = O(log n)
Binary search
If insert and delete are used rarely then it is more convenient and faster to use an ordered array instead of a binary search tree.
Faster: the following operations have T(n) = O(1) constant time complexity:
• minimum, maximum,
• successor, predecessor.
Search has the same T(n) = O(log n) time complexity as on binary search trees:
Let us search key 29 in the ordered array below:
2 3 7 12 29 31 45
1. the central element is 12; 12 < 29, so the search continues in 29 31 45
2. the central element is 31; 29 < 31, so the search continues in 29
3. the central element is 29; 29 = 29, found!
This result can also be derived from:
if we halve n elements k times, we get 1 ⇔
n / 2^k = 1 ⇔ k = log₂ n = O(log n)
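The search illustrated above can be sketched in the pseudocode of these slides. This is a sketch only: the names BinarySearch, first, last and center are assumptions made here, and NIL is returned for a missing key as in LinearSearch:
BinarySearch(A,w)
  first ← 1
  last ← A.Length
  while first ≤ last
    do center ← ⌊(first + last) / 2⌋
       if A[center] = w
         then return center
         else if A[center] < w
                then first ← center + 1      ▷ search in the right part
                else last ← center − 1      ▷ search in the left part
  return NIL
Each iteration halves the part still to be searched, which gives the O(log n) bound.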
Sorting
Problem
There is a set of data from a base set with a given order over it (e.g. numbers, texts). Arrange them according to the order of the base set.
Example
12 2 7 3 → sorting → 2 3 7 12
Sorting sequences
We sort sequences in a lexicographical order: from two sequences the sequence is ‘smaller’ which has a smaller value at the first position where they differ.
Example (texts)
gone <? good
The first position where they differ is the third one, and n < o in the alphabet, so gone < good.
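The comparison can be sketched in the pseudocode of these slides. The name LexSmaller is an assumption made here, and so is the rule for the case the slide does not cover: if one sequence is a proper prefix of the other, the shorter one is taken to be smaller (the usual dictionary convention). The and in the loop test is evaluated from left to right.
LexSmaller(a,b)
  i ← 1
  while i ≤ a.Length and i ≤ b.Length and a[i] = b[i]
    do i ← i + 1
  if i > b.Length
    then return false      ▷ a equals b, or b is a proper prefix of a
  if i > a.Length
    then return true      ▷ a is a proper prefix of b
  return a[i] < b[i]      ▷ the first position where they differ decides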
Insertion sort
Principle
[Figure: the keys 14, 8, 69, 22, 75 are taken one by one, and each is inserted into its proper place among the keys already sorted]
Implementation of insertion sort with arrays
• insertion step:
  22 69 75 | 38 14
  (sorted part | unsorted part)
InsertionSort(A)
  for i ← 2 to A.Length
    do ins ← A[i]
       j ← i – 1
       while j > 0 and ins < A[j]
         do A[j + 1] ← A[j]
            j ← j – 1
       A[j + 1] ← ins
Time complexity of insertion sort
Best case
In each step the new element is inserted to the end of the sorted part:
T(n) = 1 + 1 + 1 + ... + 1 = n − 1 = Θ(n)
Worst case
In each step the new element is inserted to the beginning of the sorted part:
T(n) = 2 + 3 + 4 + ... + n = n(n + 1)/2 − 1 = Θ(n²)
Average case
In each step the new element is inserted somewhere in the middle of the sorted part:
T(n) = 2/2 + 3/2 + 4/2 + ... + n/2 = (n(n + 1)/2 − 1) / 2 = Θ(n²)
The same as in the worst case
Another implementation of insertion sort
• The input is providing elements continually (e.g. file, net)
• The sorted part is a linked list where the elements are inserted one by one
The time complexity is the same in every case.
The linked list implementation delivers an on-line algorithm:
• after each step the subproblem is completely solved
• the algorithm does not need the whole input to partially solve the problem
Cf. off-line algorithm:
• the whole input has to be known prior to the substantive procedure
Merge sort
Principle
• halve the array
• sort the parts recursively: 8 14 69 75 and 2 22 25 36
• merge (comb) the two sorted parts: 8 14 69 75 + 2 22 25 36
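The slides give no listing for merge sort. The principle above can be sketched in the same pseudocode as follows; the names MergeSort, Merge, center and the auxiliary array B (of size last − first + 1) are assumptions made here, and the or / and tests are evaluated from left to right:
MergeSort(A,first,last)
  if first < last
    then center ← ⌊(first + last) / 2⌋
         MergeSort(A,first,center)
         MergeSort(A,center + 1,last)
         Merge(A,first,center,last)
Merge(A,first,center,last)
  i ← first
  j ← center + 1
  for k ← 1 to last − first + 1
    do if j > last or (i ≤ center and A[i] ≤ A[j])
         then B[k] ← A[i]      ▷ take the next key of the left part
              i ← i + 1
         else B[k] ← A[j]      ▷ take the next key of the right part
              j ← j + 1
  for k ← 1 to last − first + 1
    do A[first + k − 1] ← B[k]      ▷ copy the merged keys back
The whole array is sorted by MergeSort(A,1,A.Length).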
Time complexity of merge sort
Merge sort is a recursive algorithm, and so is its time complexity function T(n)
What it does:
• First it halves the actual (sub)array: O(1)
• Then calls itself for the two halves: 2T(n/2)
• Last it merges the two ordered parts: O(n)
Hence T(n) = 2T(n/2) + O(n) = ?
Recursion tree of merge sort:
level 0: 1 subarray of size n, cost n
level 1: 2 subarrays of size n/2, cost 2(n/2) = n
level 2: 4 subarrays of size n/4, cost 4(n/4) = n
...
last level: n subarrays of size 1, cost n
There are about log n levels, each costing n, so the total is n∙log n.
Time complexity of merge sort is
T(n) = Θ(n∙logn)
This worst case time complexity is optimal among comparison sorts (using only pair comparisons) ⇒ fast
but unfortunately merge sort does not sort in-place, i.e. it uses auxiliary storage of a size comparable with the input
Heapsort
An array A is called a heap if for all its elements
A[i] ≥ A[2i] and A[i] ≥ A[2i + 1]
(whenever the indices 2i and 2i + 1 exist in the array)
45 27 34 20 23 31 18 19 3 14
This property is called the heap property.
It is easier to understand if a binary tree is built from the elements filling the levels row by row:
level 1: 45
level 2: 27 34
level 3: 20 23 31 18
level 4: 19 3 14
The heap property turns into a simple parent-child relation in the tree representation.
An important application of heaps is realizing priority queues:
A data structure supporting the operations
• insert
• maximum (or minimum)
• extract maximum (or extract minimum)
First we have to build a heap from an array.
Let us suppose that only the kth element infringes the heap property.
In this case it is sunk level by level to a place where it fits. In the example k = 1 (the root):
15 37 34 20 23 31 18 19 3 14
• k = 1: the key 15 and its children (37, 34) are compared; it is exchanged for the greater child, 37
• k = 2: the key 15 and its children (20, 23) are compared; it is exchanged for the greater child, 23
• k = 5: the key 15 and its child (14) are compared; it is the greatest ⇒ ready
37 23 34 20 15 31 18 19 3 14
Sink(k,A)
  if 2*k ≤ A.HeapSize and A[2*k] > A[k]
    then greatest ← 2*k
    else greatest ← k
  if 2*k + 1 ≤ A.HeapSize and A[2*k + 1] > A[greatest]
    then greatest ← 2*k + 1
  if greatest ≠ k
    then Exchange(A[greatest],A[k])
         Sink(greatest,A)
To build a heap from an arbitrary array, all elements are mended by sinking them:
BuildHeap(A)
  A.HeapSize ← A.Length
  for k ← ⌊A.Length / 2⌋ downto 1      ▷ ⌊A.Length / 2⌋ is the array's last element that has any children
    do Sink(k,A)
We are stepping backwards; this way every visited element has only descendants which fulfill the heap property.
Time complexity of building a heap
• To sink an element costs O(logn) in the worst case
• Since n/2 elements have to be sunk, an upper bound for the BuildHeap procedure is T(n) = O(n∙logn)
• It can be proven that the sharp bound is T(n) = Θ(n)
Time complexity of the priority queue operations if the queue is realized using heaps
• insert
  • append the new element to the array O(1)
  • exchange it for its parent as long as the parent is smaller (let it float up) O(logn)
  The time complexity is
  T(n) = O(logn)
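A common way to realize insert is to let the new key float up from the end of the heap. A sketch in the pseudocode of these slides follows; the name HeapInsert is an assumption made here, and the sketch assumes that A.HeapSize < A.Length, i.e. there is a free cell in the array:
HeapInsert(key,A)
  A.HeapSize ← A.HeapSize + 1
  k ← A.HeapSize
  A[k] ← key      ▷ append the new element
  while k > 1 and A[⌊k / 2⌋] < A[k]
    do Exchange(A[⌊k / 2⌋],A[k])      ▷ the parent is smaller: step up one level
       k ← ⌊k / 2⌋
The loop climbs at most one level per iteration, hence the O(logn) bound.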
• maximum
  • read out the key of the root O(1)
  The time complexity is
  T(n) = O(1)
• extract maximum
  • exchange the root for the array’s last element O(1)
  • extract the last element O(1)
  • sink the root O(logn)
  The time complexity is
  T(n) = O(logn)
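The three steps of extract maximum can be sketched with the Sink procedure as follows. The name ExtractMaximum and the underflow check are assumptions made here, in the style of the stack and queue listings:
ExtractMaximum(A)
  if A.HeapSize = 0      ▷ underflow
    then return Underflow error
  max ← A[1]
  A[1] ← A[A.HeapSize]      ▷ the last element takes the place of the root
  A.HeapSize ← A.HeapSize − 1
  Sink(1,A)
  return max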
The heapsort algorithm
• build a heap Θ(n)
• iterate the following (n−1)∙O(logn) = O(n∙logn):
  • exchange the root for the heap’s last element O(1)
  • exclude the heap’s last element from the heap O(1)
  • sink the root O(logn)
The time complexity is
T(n) = O(n∙logn)
HeapSort(A)
  BuildHeap(A)
  for k ← A.Length downto 2
    do Exchange(A[1],A[A.HeapSize])
       A.HeapSize ← A.HeapSize – 1
       Sink(1,A)
Quicksort
Principle
22 69 8 75 25 12 14 36
Rearrange and part the elements so that every key in the first part is smaller than any in the second part:
14 12 8 | 75 25 69 22 36
Sort each part recursively, this will result in the whole array being sorted:
8 12 14 | 22 25 36 69 75
The partition algorithm


choose any of the keys stored in the array; this
will be the so-called pivot key
exchange the large elements at the beginning of
the array to the small ones at the end of it
not lesspivot
thankey
the pivot key
22
Quicksort
69
8
not greater than
the pivot key
75
25
Algorithms and Data Structures I
12
14
36
129
Partition(A,first,last)
  left ← first − 1
  right ← last + 1
  pivotKey ← A[RandomInteger(first,last − 1)]
  repeat
    repeat left ← left + 1
    until A[left] ≥ pivotKey
    repeat right ← right − 1
    until A[right] ≤ pivotKey
    if left < right
      then Exchange(A[left],A[right])
      else return right
  until false
The pivot must not be taken from the last position: otherwise Partition
may return last, and QuickSort would be called again on the same subarray.
Quicksort
The time complexity of the partition algorithm is
T(n) = Θ(n)
because each element is examined only a constant number of times.
The sorting is then:
QuickSort(A,first,last)
  if first < last
    then border ← Partition(A,first,last)
         QuickSort(A,first,border)
         QuickSort(A,border+1,last)
Quicksort
Quicksort is a divide and conquer algorithm
like merge sort; however, its partition may be
unbalanced (merge sort always halves the
subarray).
The time complexity of a divide and conquer
algorithm highly depends on the balance of the
partition.
In the best case the quicksort algorithm halves
the subarrays at every step ⇒
T(n) = Θ(n∙log n)
Quicksort
Recursion tree of the worst case
[Figure: recursion tree in which every partition cuts off a single element: n splits into n−1 and 1, n−1 into n−2 and 1, and so on; total: n∙(n + 1) / 2]
Quicksort
Thus, the worst case time complexity of
quicksort is
T(n) = Θ(n²)
The average case time complexity is
T(n) = Θ(n∙log n)
the same as in the best case!
The proof is difficult, but let’s see a special case
to understand quicksort better.
Quicksort
Let λ be a positive number smaller than 1:
0<λ<1
Assumption: the partition algorithm never
provides a worse partition ratio than
(1− λ) : λ
Example 1: Let λ := 0.99
The assumption demands that the partition
algorithm does not leave less than 1% as the
smaller part.
Quicksort
Example 2: Let λ := 0.999 999 999
If we have at most one billion(!) elements, then
the assumption is fulfilled no matter how the
partition algorithm behaves
(even if it always cuts off only one element
from the others).
In the following it is assumed for the sake of
simplicity that λ ≥ 0.5, i.e. the λ part is always
the bigger one.
Quicksort
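Recursion tree of the λ ratio case
[Figure: recursion tree in which n splits into (1 − λ)n and λn, and λn again into (1 − λ)λn and λ²n; the largest part at depth d has size λ^d∙n; every level sums to at most n, so the total is ≤ n∙log n]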
Quicksort
In the special case when none of the parts arising at
the partitions is bigger than a given λ ratio
(0.5 ≤ λ < 1), the time complexity of quicksort is
T(n) = O(n∙log n)
The time complexity of quicksort is practically
optimal because the number of elements to be
sorted is always bounded by a number N
(finite storage). Using the value λ = 1 − 1/N it
can be proven that quicksort finishes in
O(n∙log n) time in every possible case.
(The constant hidden in the O notation depends on λ, and thus on N.)
Quicksort
Problem


Optimization problem: Let a function f(x) be
given. Find an x where f is optimal (minimal or
maximal) ‘under given circumstances’
‘Given circumstances’: An optimization
problem is constrained if functional constraints
have to be fulfilled such as g(x) ≤ 0
Greedy algorithms


Feasible set: the set of those x values where the
given constraints are fulfilled
Constrained optimization problem:
minimize f(x)
subject to g(x) ≤ 0
Greedy algorithms
Example
Problem: There is a city A and other cities
B1,B2,...,Bn which can be reached from A by bus
directly. Find the farthest of these cities that
you can afford to travel to.
[Figure: city A connected directly to each of the cities B1, B2, ..., Bn]
Greedy algorithms
Model:
  Let x denote any of the cities: x ∊ {B1,B2,...,Bn},
  f(x) the distance between A and x,
  t(x) the price of the bus ticket from A to x,
  m the money you have, and
  g(x) = t(x) − m the constraint function.
The constrained optimization problem to solve:
minimize (− f(x))
subject to g(x) ≤ 0
Greedy algorithms
In general, optimization problems are much more
difficult!
However, there is a class of optimization
problems which can be solved using a simple, straightforward step-by-step principle:
greedy algorithms:
  at each step the same kind of decision is made,
  striving for a local optimum, and
  decisions of the past are never revisited.
Greedy algorithms
Question: Which problems can be solved using
greedy algorithms?
Problems which obey the following two rules:
  Greedy choice property: If a greedy choice is
  made first, it can always be completed to
  achieve an optimal solution to the problem.
  Optimal substructure property: Any
  substructure of an optimal solution provides an
  optimal solution to the corresponding subproblem.
Greedy algorithms
Counterexample
Find the shortest route from Szeged to Budapest.
The greedy choice property is violated:
You cannot simply choose the closest town first
Greedy algorithms
[Figure: map showing Budapest, Szeged and Deszk]
Deszk is the closest to Szeged but
situated in the opposite direction
Greedy algorithms
Proper example
Activity-selection problem:
Let’s spend a day watching TV.
Aim: Watch as many programs (in full) as
you can.
Greedy strategy:
Watch the program ending first, then the next one
you can watch in full that ends first, etc.
Activity-selection problem
[Figure: timetable of TV programs; the steps shown are:]
  Let’s sort the programs by their ending times
  Include the first one
  Exclude those programs which have begun
  No more programs left:
The optimum is 4 (TV programs)
Activity-selection problem
Check the greedy choice property:
  The first choice of any optimal solution can
  be exchanged for the greedy one
  (the greedy choice ends no later, so it does not
  overlap with the rest of that solution)
Activity-selection problem
Check the optimal substructure property:
  The part of an optimal solution is optimal
  also for the subproblem
If this were not optimal for the subproblem,
the whole solution could be improved
by improving the subproblem’s solution
Activity-selection problem
Notions


C is an alphabet if it is a set of symbols
F is a file over C if it is a text built up of the
characters of C
Huffman codes
Assume we have the following alphabet
C = {a, b, c, d, e}
Code it with binary codewords of equal length
How many bits per codeword do we need at least?
2 are not enough (only four codewords: 00, 01, 10, 11)
Build codewords using 3 bit coding
a = 000
b = 001
c = 010
d = 011
e = 100
Huffman codes
a = 000
b = 001
c = 010
d = 011
e = 100
Build the binary tree T of the coding
[Figure: binary tree T with edges labeled 0 and 1; each leaf a, b, c, d, e is reached from the root along the path spelled by its codeword]
Huffman codes
Further notation
  For each character c ∊ C its frequency in the file
  is denoted by f(c)
  For each character c ∊ C the length of its codeword is defined by
  its depth in the tree T of the coding, d_T(c)
  Hence the length of the file (in bits) equals
  B(T) = Σ_{c ∊ C} f(c)∙d_T(c)
Huffman codes
Problem
Let an alphabet C and a file over it be given. Find a
coding T of the alphabet with minimal B(T)
Huffman codes
Example
Consider an F file of 20,000 characters over the
alphabet C = {a, b, c, d, e}
Assume the frequencies of the particular
characters in the file are
f(a) = 5,000
f(b) = 2,000
f(c) = 6,000
f(d) = 3,000
f(e) = 4,000
Huffman codes
Using the 3 bit coding defined previously, the bit-length of the file equals
B(T) = Σ_{c ∊ C} f(c)∙d_T(c) =
5,000∙3 + 2,000∙3 + 6,000∙3 + 3,000∙3 + 4,000∙3 =
(5,000 + 2,000 + 6,000 + 3,000 + 4,000)∙3 =
20,000∙3 = 60,000
This is a so-called fixed-length code since for all
x, y ∊ C, d_T(x) = d_T(y) holds
Huffman codes
The fixed-length code is not always optimal
[Figure: the tree T’ of a coding in which the codeword of e is one bit shorter than in T]
B(T’) = B(T) − f(e)∙1 =
60,000 − 4,000∙1 =
56,000
Huffman codes
Idea


Construct a variable-length code, i.e., where the
code-lengths for different characters can differ
from each other
We expect that if more frequent characters get
shorter codewords then the resulting file will
become shorter
Huffman codes
Problem: How do we recognize where a
codeword ends and a new one begins? Using
delimiters is too “expensive”
Solution: Use prefix codes, i.e., codes in which
no codeword is a prefix of any other
codeword
Result: The codewords can be decoded without
using delimiters
Huffman codes
For instance if
a = 10
b = 010
c = 00
then the following bit sequence decodes unambiguously:
1000010000010010 = a c b c c a b
However, what if a variable-length code was not
prefix-free?
Huffman codes
Then if
a = 10
b = 100
c = 0
then
100 = b
or 10 0 = a c ?
An extra delimiter would be needed
Huffman codes
Realize the original idea with prefix codes
f(a) = 5,000 (frequent)
f(b) = 2,000 (rare)
f(c) = 6,000 (frequent)
f(d) = 3,000 (rare)
f(e) = 4,000 (frequent)
Codewords of frequent characters should be shorter, e.g.,
a = 00, c = 01, e = 10
Codewords of rare characters can be longer, e.g.,
b = 110, d = 111
Huffman codes
Question: How can such a coding be done
algorithmically?
Answer: The Huffman codes provide exactly this
solution
Huffman codes
The bit-length of the file using this prefix code K is
B(K) = Σ_{c ∊ C} f(c)∙d_K(c) =
5,000∙2 + 2,000∙3 + 6,000∙2 + 3,000∙3 + 4,000∙2 =
(5,000 + 6,000 + 4,000)∙2 + (2,000 + 3,000)∙3 =
30,000 + 15,000 = 45,000
(cf. the fixed-length code gave 60,000,
the improved one 56,000)
Huffman codes
The greedy method producing Huffman codes
1.
Sort the characters of the alphabet C in
increasing order according to their frequency
in the file and put them into a list
2.
Delete the two leading nodes, some x and
y, from the list and connect them with a
common parent node z. Let f(z) = f(x) + f(y),
insert z into the list at the position given by its frequency,
and repeat step 2 until only a single node (the root of the tree) is left in the list.
Huffman codes
Example
character: frequency (thousands)
List: a:5  b:2  c:6  d:3  e:4
Huffman codes
Example
1. Sort
List: b:2  d:3  e:4  a:5  c:6
Huffman codes
Example
2. Merge and rearrange
List: e:4  5(b:2, d:3)  a:5  c:6
Huffman codes
Example
2. Merge and rearrange
List: a:5  c:6  9(e:4, 5(b:2, d:3))
Huffman codes
Example
2. Merge and rearrange
List: 9(e:4, 5(b:2, d:3))  11(a:5, c:6)
Huffman codes
Example
2. Merge and rearrange
List: 20(9(e:4, 5(b:2, d:3)), 11(a:5, c:6))
At every node the edge to the first child is labeled 0 and the edge to the second child 1
Huffman codes
Example
Tree: 20(9(e:4, 5(b:2, d:3)), 11(a:5, c:6))
a = 10
b = 010
c = 11
d = 011
e = 00
Huffman codes
Example
Length of file in bits
f(a) = 5,000   a = 10
f(b) = 2,000   b = 010
f(c) = 6,000   c = 11
f(d) = 3,000   d = 011
f(e) = 4,000   e = 00
B(H) = Σ_{c ∊ C} f(c)∙d_H(c) =
5,000∙2 + 2,000∙3 + 6,000∙2 + 3,000∙3 + 4,000∙2 =
(5,000 + 6,000 + 4,000)∙2 + (2,000 + 3,000)∙3 =
30,000 + 15,000 = 45,000
Huffman codes
Optimality of the Huffman codes
Assertion 1. There exists an optimal solution
where the two rarest characters are deepest
twins in the tree of the coding
Assertion 2. Merging two (twin) characters leads
to a problem similar to the original one
Corollary. The Huffman codes provide an
optimal character coding
Huffman codes
Proof of Assertion 1 (There exists an optimal solution where the
two rarest characters are deepest twins in the tree of the coding).
[Figure: the two rarest characters are exchanged with the deepest twin leaves of the tree]
Changing nodes this way, the total length does not increase
Huffman codes
Proof of Assertion 2 (Merging two (twin) characters leads to a
problem similar to the original one).
[Figure: the twin characters are merged into a single node]
The new problem is smaller than the original one but similar to it
Huffman codes


Graphs can represent different structures,
connections and relations
Weighted graphs can
represent capacities or
actual flow rates
[Figure: a weighted graph]
Graphs
[Figure: a graph and its adjacency matrix]
1: there is an edge from ‘row’ to ‘column’
0: there is no such edge
Drawback 1: redundant elements
Drawback 2: superfluous elements
Graphs
[Figure: a graph and its list representation (each vertex with the list of its neighbors)]
Optimal storage usage
Drawback: slow search operations



Graphs
Problem: find the shortest path between two
vertices in a graph
Source: the starting point (vertex)
Single-source shortest path method: algorithm
to find the shortest paths from the source to all
vertices of the graph
Walk a graph:
  choose an initial vertex as the source
  visit all vertices starting from the source
Graph walk methods:
  depth-first search
  breadth-first search
Graph walk
Depth-first search


Backtrack algorithm
It goes as far as it can without revisiting any
vertex, then backtracks
source
Graph walk
Breadth-first search
  Like an explosion in a mine
  The shockwave reaches the adjacent vertices
  first, and starts over from them
Graph walk
The breadth-first search is not only simpler to
implement but it is also the basis for several
important graph algorithms (e.g. Dijkstra)
Notation in the following pseudocode:
  A is the adjacency matrix of the graph
  s is the source
  D is an array containing the distances from the
  source
  P is an array containing the predecessor along a
  path
  Q is the queue containing the unprocessed
  vertices
Graph walk
for i ← 1 to A.CountRows
  do P[i] ← 0
     D[i] ← ∞
D[s] ← 0
Q.Enqueue(s)
repeat
  v ← Q.Dequeue
  for j ← 1 to A.CountColumns
    do if A[v,j] > 0 and D[j] = ∞
         then D[j] ← D[v] + 1
              P[j] ← v
              Q.Enqueue(j)
until Q.IsEmpty
Graph walk
The D,P pairs are displayed in the figure.
[Figure: a graph on the vertices 1 to 10 with source 4; every vertex is labeled with its D,P pair (0,0 at the source; 1,4; 2,6; 3,9)]
D is the shortest distance from the source
The shortest paths can be reconstructed using P
Graph walk
Problem: find the shortest path between two
vertices in a weighted graph
Idea: extend the breadth-first search for graphs
having integer weights:
[Figure: an edge of weight 3 is replaced by 3 unweighted edges (total weight = 3∙1 = 3) connected through virtual vertices]
Dijkstra’s algorithm
Dijkstra(A,s,D,P)
  for i ← 1 to A.CountRows
    do P[i] ← 0
       D[i] ← ∞
  D[s] ← 0
  for i ← 1 to A.CountRows
    do M.Enqueue(i)
  repeat
    v ← M.ExtractMinimum
    for j ← 1 to A.CountColumns
      do if A[v,j] > 0
           then if D[j] > D[v] + A[v,j]
                  then D[j] ← D[v] + A[v,j]
                       P[j] ← v
  until M.IsEmpty
M is a minimum priority queue (ordered by the D values)
Dijkstra’s algorithm
Time complexity of Dijkstra’s algorithm
  Initialization of D and P: O(n)
  Building a heap for the priority queue: O(n)
  Search: n∙O(log n + n) = O(n(log n + n)) = O(n²)
    n: number of loop executions
    log n: extracting the minimum
    n: checking all neighbors
Grand total: T(n) = O(n²)
Dijkstra’s algorithm
```
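
The Partition and QuickSort pseudocode from the Quicksort slides can be sketched in Python roughly as follows. This is a minimal illustration, not the lecture's reference implementation: the names `partition` and `quick_sort` and the 0-based indexing are choices made here, and the pivot index is drawn from `first..last-1` (an assumption of this sketch) so that the returned border is always smaller than `last` and both recursive calls work on strictly smaller subarrays.

```python
import random


def partition(a, first, last):
    """Split a[first..last] so that no key in a[first..border]
    is greater than any key in a[border+1..last]; return border."""
    # Assumption of this sketch: the pivot never comes from position `last`.
    pivot_key = a[random.randint(first, last - 1)]
    left, right = first - 1, last + 1
    while True:
        # repeat left <- left + 1 until a[left] >= pivot_key
        left += 1
        while a[left] < pivot_key:
            left += 1
        # repeat right <- right - 1 until a[right] <= pivot_key
        right -= 1
        while a[right] > pivot_key:
            right -= 1
        if left < right:
            a[left], a[right] = a[right], a[left]
        else:
            return right


def quick_sort(a, first=0, last=None):
    """Sort the list a in place and return it."""
    if last is None:
        last = len(a) - 1
    if first < last:
        border = partition(a, first, last)
        quick_sort(a, first, border)
        quick_sort(a, border + 1, last)
    return a


print(quick_sort([22, 69, 8, 75, 25, 12, 14, 36]))
# [8, 12, 14, 22, 25, 36, 69, 75]
```

The example array is the one shown on the "Principle" slides.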
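
The greedy strategy of the activity-selection slides (watch the program that ends first, then the next one that can still be watched in full, and so on) might look like this in Python. The function name `select_programs` and the sample timetable are hypothetical; the slides do not give the actual start and end times, so the data below is invented only to illustrate the method.

```python
def select_programs(programs):
    """programs: list of (start, end) pairs.
    Returns a largest possible list of programs that do not overlap."""
    chosen = []
    last_end = float("-inf")
    # Sort the programs by their ending times.
    for start, end in sorted(programs, key=lambda p: p[1]):
        # Skip programs that have begun before the last chosen one ended.
        if start >= last_end:
            chosen.append((start, end))
            last_end = end
    return chosen


# Hypothetical timetable (hours of the day), not taken from the slides.
timetable = [(8, 10), (9, 11), (10, 12), (11, 14), (12, 13), (13, 15), (14, 16)]
print(select_programs(timetable))
# [(8, 10), (10, 12), (12, 13), (13, 15)]
```

Sorting dominates the running time, so the sketch needs O(n∙log n) steps for n programs.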
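
The greedy construction of the Huffman code can be sketched as below. Instead of the sorted list used on the slides, this sketch keeps the nodes in a binary heap (Python's `heapq`), which is a swapped-in data structure; the function name `huffman_codes` and the tie-breaking rule (among equal frequencies the most recently created node is taken first) are assumptions chosen here so that the result matches the tree of the slide example.

```python
import heapq
from itertools import count


def huffman_codes(freq):
    """freq: dict mapping each character to its frequency.
    Returns a dict mapping each character to its codeword (a string of 0s and 1s)."""
    tie = count(0, -1)  # decreasing counter: newer nodes win ties
    heap = [(f, next(tie), ch) for ch, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Delete the two rarest nodes x and y and join them under a parent z.
        fx, _, x = heapq.heappop(heap)
        fy, _, y = heapq.heappop(heap)
        heapq.heappush(heap, (fx + fy, next(tie), (x, y)))

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):      # internal node: (first child, second child)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                            # leaf: a character
            codes[node] = prefix or "0"  # one-character alphabet gets "0"

    walk(heap[0][2], "")
    return codes


freq = {"a": 5000, "b": 2000, "c": 6000, "d": 3000, "e": 4000}
codes = huffman_codes(freq)
print(codes)
# {'e': '00', 'b': '010', 'd': '011', 'a': '10', 'c': '11'}
print(sum(freq[ch] * len(codes[ch]) for ch in freq))
# 45000
```

With a different tie-breaking rule the codewords may differ, but the total bit-length B stays minimal.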
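
The breadth-first search pseudocode translates almost line by line into Python. The sketch below assumes 0-based vertex numbers, uses `None` instead of 0 for "no predecessor", and takes the adjacency matrix as a list of lists; the function name and the small sample graph are hypothetical and not taken from the slides.

```python
import math
from collections import deque


def breadth_first_search(A, s):
    """A: adjacency matrix (A[v][j] > 0 means there is an edge v -> j), s: source.
    Returns (D, P): distances in edges from s, and predecessors along the paths."""
    n = len(A)
    D = [math.inf] * n
    P = [None] * n
    D[s] = 0
    Q = deque([s])  # queue of the unprocessed vertices
    while Q:
        v = Q.popleft()
        for j in range(n):
            if A[v][j] > 0 and D[j] == math.inf:
                D[j] = D[v] + 1
                P[j] = v
                Q.append(j)
    return D, P


# Hypothetical undirected graph with the edges 0-1, 0-2 and 1-3.
A = [
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
]
print(breadth_first_search(A, 0))
# ([0, 1, 1, 2], [None, 0, 0, 1])
```

Following `P` backwards from any vertex reconstructs a shortest path to the source.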
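
Dijkstra's algorithm on an adjacency matrix can be sketched in a similar way. To keep the example short, the minimum priority queue of the slides is replaced here by a plain linear search for the unprocessed vertex with the smallest distance (a swapped-in simplification that still gives O(n²) steps in total). The 0-based indexing, the use of `None` for "no predecessor" and the sample graph are assumptions of this sketch.

```python
import math


def dijkstra(A, s):
    """A: matrix of positive edge weights (0 means no edge), s: source.
    Returns (D, P): shortest distances from s, and predecessors along the paths."""
    n = len(A)
    D = [math.inf] * n
    P = [None] * n
    D[s] = 0
    unprocessed = set(range(n))
    while unprocessed:
        # Extract the unprocessed vertex with minimal distance.
        v = min(unprocessed, key=lambda i: D[i])
        unprocessed.remove(v)
        if D[v] == math.inf:
            break  # the remaining vertices cannot be reached from s
        # Check all neighbors of v.
        for j in range(n):
            if A[v][j] > 0 and D[j] > D[v] + A[v][j]:
                D[j] = D[v] + A[v][j]
                P[j] = v
    return D, P


# Hypothetical weighted undirected graph, not the one drawn on the slides.
A = [
    [0, 7, 2, 0],
    [7, 0, 4, 1],
    [2, 4, 0, 8],
    [0, 1, 8, 0],
]
print(dijkstra(A, 0))
# ([0, 6, 2, 7], [None, 2, 0, 1])
```

In the example the direct edge 0-1 of weight 7 is beaten by the path 0-2-1 of total weight 6.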