ALGORITHMICS - West University of Timișoara

LECTURE 5:
Analysis of Algorithms Efficiency (II)
Algorithmics - Lecture 5
In the previous lecture we saw …
The basic steps in analyzing the time efficiency of an algorithm:
• Identify the input size
• Identify the dominant operation
• Count how many times the dominant operation is executed (estimate the running time)
• If this number depends on the properties of the input data, analyze:
– Best case => lower bound of the running time
– Worst case => upper bound of the running time
– Average case => averaged running time
Today we will see that …
… the main aim of efficiency analysis is to find out how the running time increases when the problem size increases
… we don't need very detailed expressions of the running time, but we need to identify:
– The order of growth of the running time
– The efficiency class to which an algorithm belongs
Outline
• What is the order of growth ?
• What is asymptotic analysis ?
• Some asymptotic notations
• Efficiency analysis of basic processing structures
• Efficiency classes
• Empirical analysis of the algorithms
What is the order of growth?
In the expression of the running time, one of the terms becomes significantly larger than the others when n becomes large: this is the so-called dominant term.
T1(n) = a n + b                 Dominant term: a n
T2(n) = a log n + b             Dominant term: a log n
T3(n) = a n^2 + b n + c         Dominant term: a n^2
T4(n) = a^n + b n + c  (a>1)    Dominant term: a^n
What is the order of growth?
Let us analyze what happens with the dominant term when the input size is multiplied by k:
T'1(n) = a n        T'1(kn) = a k n = k T'1(n)
T'2(n) = a log n    T'2(kn) = a log(kn) = T'2(n) + a log k
T'3(n) = a n^2      T'3(kn) = a (kn)^2 = k^2 T'3(n)
T'4(n) = a^n        T'4(kn) = a^(kn) = (a^n)^k = (T'4(n))^k
What is the order of growth?
The order of growth expresses how the dominant term of the running time increases with the input size.
Order of growth:
Linear:       T'1(kn) = a k n = k T'1(n)
Logarithmic:  T'2(kn) = a log(kn) = T'2(n) + a log k
Quadratic:    T'3(kn) = a (kn)^2 = k^2 T'3(n)
Exponential:  T'4(kn) = a^(kn) = (a^n)^k = (T'4(n))^k
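The four scaling rules can be checked numerically. The sketch below is only an illustration: the constant A = 2, the function name scaled_vs_predicted and the sample values n = 8, k = 4 are assumptions made for the example, not part of the lecture.

```python
import math

A = 2  # arbitrary constant a > 1, chosen only for illustration


def scaled_vs_predicted(n, k):
    """For each dominant term, pair T(kn) with the value predicted from T(n)."""
    return {
        "linear": (A * (k * n), k * (A * n)),
        "logarithmic": (A * math.log2(k * n), A * math.log2(n) + A * math.log2(k)),
        "quadratic": (A * (k * n) ** 2, k ** 2 * (A * n ** 2)),
        "exponential": (A ** (k * n), (A ** n) ** k),
    }


for name, (actual, predicted) in scaled_vs_predicted(n=8, k=4).items():
    print(f"{name:12s} T(kn) = {actual}, predicted from T(n) = {predicted}")
```

Multiplying the input size by k multiplies the linear term by k, adds a constant to the logarithmic term, multiplies the quadratic term by k^2, and raises the exponential term to the power k.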
How can the order of growth be interpreted?
Between two algorithms, the one having the smaller order of growth is considered more efficient.
However, this is true only for large enough input sizes.
Example. Let us consider
T1(n) = 10n + 10 (linear order of growth)
T2(n) = n^2 (quadratic order of growth)
If n <= 10 then T1(n) > T2(n)
In this case the order of growth is relevant only for n > 10
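The crossover point in this example can be found by direct search. This is a minimal sketch; the function names and the search limit of 1000 are assumptions made for the illustration.

```python
def t1(n):
    """Linear running time from the example."""
    return 10 * n + 10


def t2(n):
    """Quadratic running time from the example."""
    return n ** 2


def first_n_where_quadratic_is_slower(limit=1000):
    """Smallest n in 1..limit with t2(n) > t1(n), or None if there is none."""
    for n in range(1, limit + 1):
        if t2(n) > t1(n):
            return n
    return None


print(first_n_where_quadratic_is_slower())  # prints 11
```

Up to n = 10 the quadratic algorithm performs fewer operations; from n = 11 onward the linear one wins.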
A comparison of orders of growth
The multiplicative constants in the dominant term can be ignored
n        log2 n   n log2 n   n^2          2^n
10       3.3      33         100          1024
100      6.6      664        10000        10^30
1000     10       9965       1000000      10^301
10000    13       132877     100000000    10^3010
Comparing orders of growth
The order of growth of two running times T1(n) and T2(n) can be
compared by computing the limit of T1(n)/T2(n) when n goes to
infinity.
- If the limit is 0 then T1(n) has a smaller order of growth than
T2(n)
- If the limit is a finite constant c (c>0) then T1(n) and T2(n) have
the same order of growth
- If the limit is infinity then T1(n) has a larger order of growth than
T2(n)
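The three cases of the limit test can be observed numerically by evaluating the ratio T1(n)/T2(n) for growing n. The sketch below gives numerical evidence only, not a proof; the helper name ratios, the sample sizes and the three pairs of functions are assumptions chosen for illustration.

```python
import math


def ratios(t1, t2, sizes=(10, 100, 1000, 10_000, 100_000)):
    """Evaluate t1(n) / t2(n) on increasing input sizes."""
    return [t1(n) / t2(n) for n in sizes]


# Ratio tends to 0: 10n + 10 has a smaller order of growth than n^2.
smaller = ratios(lambda n: 10 * n + 10, lambda n: n ** 2)

# Ratio tends to the constant 3: 3n + 3 and n have the same order of growth.
same = ratios(lambda n: 3 * n + 3, lambda n: n)

# Ratio grows without bound: n log2 n has a larger order of growth than n.
larger = ratios(lambda n: n * math.log2(n), lambda n: n)

for label, values in (("smaller", smaller), ("same", same), ("larger", larger)):
    print(label, [round(v, 4) for v in values])
```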
Outline
• What is the order of growth ?
• What is asymptotic analysis ?
• Some asymptotic notations
• Efficiency analysis of basic processing structures
• Efficiency classes
• Empirical analysis of the algorithms
What is asymptotic analysis ?
• The differences between orders of growth are more significant for larger input sizes
• Analyzing the running times on small inputs does not allow us to distinguish between efficient and inefficient algorithms
• Asymptotic analysis deals with analyzing the properties of the running time when the input size goes to infinity (in practice, when the input size is large)
What is asymptotic analysis ?
• Depending on the behavior of the running time when the input size
becomes large, the algorithm can belong to different classes of
efficiency
• There are several standard notations used in the efficiency analysis of algorithms:
Θ (big-Theta)
O (big-O)
Ω (big-Omega)
Outline
• What is the order of growth ?
• What is asymptotic analysis ?
• Some asymptotic notations
• Efficiency analysis of basic processing structures
• Efficiency classes
• Empirical analysis of the algorithms
Θ-notation
Let f, g: N -> R+
Definition.
f(n) ∈ Θ(g(n)) iff there exist c1, c2 > 0 and n0 ∈ N such that
c1 g(n) ≤ f(n) ≤ c2 g(n) for all n ≥ n0
Notation. f(n) = Θ(g(n)) (same order of growth as g(n))
Examples.
1. T(n) = 3n + 3 => T(n) ∈ Θ(n)
   c1=2, c2=4, n0=3, g(n)=n
2. T(n) = n^2 + 10 n lg n + 5 => T(n) ∈ Θ(n^2)
   c1=1, c2=2, n0=59, g(n)=n^2 (with lg denoting the base-2 logarithm)
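Candidate constants for a big-Theta bound can be sanity-checked on a finite range. The sketch below assumes that lg denotes the base-2 logarithm; the helper names holds_from and smallest_n0 and the upper limit of 10000 are hypothetical choices made for this illustration. A finite check is evidence, not a proof.

```python
import math


def holds_from(f, g, c1, c2, n0, limit=10_000):
    """Check c1*g(n) <= f(n) <= c2*g(n) for every n in [n0, limit]."""
    return all(c1 * g(n) <= f(n) <= c2 * g(n) for n in range(n0, limit + 1))


def smallest_n0(f, g, c1, c2, limit=10_000):
    """Smallest n0 such that the bounds hold on all of [n0, limit], or None."""
    n0 = None
    for n in range(limit, 0, -1):  # walk down from the limit until the bounds break
        if c1 * g(n) <= f(n) <= c2 * g(n):
            n0 = n
        else:
            break
    return n0


# Example 1: T(n) = 3n + 3 with g(n) = n, c1 = 2, c2 = 4.
print(smallest_n0(lambda n: 3 * n + 3, lambda n: n, 2, 4))

# Example 2: T(n) = n^2 + 10 n lg n + 5 with g(n) = n^2, c1 = 1, c2 = 2.
print(smallest_n0(lambda n: n ** 2 + 10 * n * math.log2(n) + 5,
                  lambda n: n ** 2, 1, 2))
```

With base-2 logarithms the second bound first holds at n = 59, because 10 n lg n + 5 still exceeds n^2 at n = 58.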
Θ-notation
Graphical illustration. f(n) is bounded, for large values of n, both above and below by g(n) multiplied by some positive constants.
[Figure: plot for n from 0 to 100 of c1 g(n) = n^2, f(n) = n^2 + 10 n lg n + 5 and c2 g(n) = 2 n^2. To the right of the point n0 marked on the horizontal axis, c1 g(n) ≤ f(n) ≤ c2 g(n) holds; the behavior to the left of n0 doesn't matter. f(n) ∈ Θ(n^2).]
Θ-notation. Properties
1. If T(n) = a_k n^k + a_(k-1) n^(k-1) + … + a_1 n + a_0 then T(n) ∈ Θ(n^k)
Proof. Since T(n) > 0 for all n it follows that a_k > 0.
Then T(n)/n^k -> a_k (as n -> ∞).
Thus for all ε > 0 there exists N(ε) such that
|T(n)/n^k - a_k| < ε => a_k - ε < T(n)/n^k < a_k + ε for all n > N(ε)
Let us suppose that a_k - ε > 0.
Then by taking c1 = a_k - ε, c2 = a_k + ε and n0 = N(ε) one obtains
c1 n^k < T(n) < c2 n^k for all n > n0, i.e. T(n) ∈ Θ(n^k)
Θ-notation. Properties
2. Θ(c g(n)) = Θ(g(n)) for every constant c > 0
Proof. Let f(n) ∈ Θ(c g(n)).
Then c1 c g(n) ≤ f(n) ≤ c2 c g(n) for all n ≥ n0.
By taking c'1 = c c1 and c'2 = c c2 we obtain that f(n) ∈ Θ(g(n)).
Thus Θ(c g(n)) ⊆ Θ(g(n)).
Similarly we can prove that Θ(g(n)) ⊆ Θ(c g(n)), so Θ(c g(n)) = Θ(g(n)).
Particular cases:
a) Θ(c) = Θ(1)
b) Θ(log_a h(n)) = Θ(log_b h(n)) for all a, b > 1
The base of the logarithm is not significant in specifying the efficiency class.
We will use logarithms in base 2.
Θ-notation. Properties
3. f(n) ∈ Θ(f(n)) (reflexivity)
4. f(n) ∈ Θ(g(n)) => g(n) ∈ Θ(f(n)) (symmetry)
5. f(n) ∈ Θ(g(n)), g(n) ∈ Θ(h(n)) => f(n) ∈ Θ(h(n)) (transitivity)
6. Θ(f(n) + g(n)) = Θ(max{f(n), g(n)})
Θ-notation. More examples
3. 3n <= T(n) <= 4n-1 => T(n) ∈ Θ(n)
   c1=3, c2=4, n0=1
4. Multiplying two matrices: T(m,n,p) = 4mnp + 5mp + 4m + 2
   Extension of the definition:
   f(m,n,p) ∈ Θ(g(m,n,p)) iff there exist c1, c2 > 0 and m0, n0, p0 ∈ N such that
   c1 g(m,n,p) <= f(m,n,p) <= c2 g(m,n,p) for all m >= m0, n >= n0, p >= p0
   Thus T(m,n,p) ∈ Θ(mnp)
5. Sequential search: 6 <= T(n) <= 3(n+1)
   If T(n) = 6 then we cannot find c1 > 0 such that 6 >= c1 n for large values of n.
   Thus T(n) does not belong to Θ(n).
   There exist running times which do not belong to a big-theta class.
O-notation
Definition.
f(n) ∈ O(g(n)) iff there exist c > 0 and n0 ∈ N such that
f(n) <= c g(n) for all n >= n0
Notation. f(n) = O(g(n)) (an order of growth at most that of g(n))
Examples.
1. T(n) = 3n+3 => T(n) ∈ O(n)
   c=4, n0=3, g(n)=n
2. 6 <= T(n) <= 3(n+1) => T(n) ∈ O(n)
   c=4, n0=3, g(n)=n
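O-notation
Graphical illustration. f(n) is bounded above, for large values of n, by g(n) multiplied by a positive constant.
[Figure: plot for n from 0 to 100 of f(n) = 10 n lg n + 5 and c g(n) = n^2. To the right of n0 the inequality f(n) <= c g(n) holds; the behavior to the left of n0 doesn't matter. The plot marks n0 = 36, which matches the crossing point when lg is evaluated as the natural logarithm; with base-2 logarithms the crossing is at n = 59. f(n) ∈ O(n^2).]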
O-notation. Properties
1. If T(n) = a_k n^k + a_(k-1) n^(k-1) + … + a_1 n + a_0
   then T(n) ∈ O(n^d) for all d >= k
Proof. Since T(n) > 0 for all n it follows that a_k > 0.
Then T(n)/n^k -> a_k (as n -> ∞).
Thus for all ε > 0 there exists N(ε) such that
T(n)/n^k <= a_k + ε for all n > N(ε)
Hence T(n) <= (a_k + ε) n^k <= (a_k + ε) n^d
Then by taking c = a_k + ε and n0 = N(ε) one obtains
T(n) <= c n^d for all n > n0, i.e. T(n) ∈ O(n^d)
Example.
n ∈ O(n^2) (this is correct, but it is more useful to write n ∈ O(n))
O-notation. Properties
2. f(n) ∈ O(f(n)) (reflexivity)
3. f(n) ∈ O(g(n)), g(n) ∈ O(h(n)) => f(n) ∈ O(h(n)) (transitivity)
4. Θ(g(n)) is a subset of O(g(n))
Remark. The inclusion is a strict one: there exist elements of O(g(n)) which do not belong to Θ(g(n))
Example:
f(n) = 10 n lg n + 5, g(n) = n^2
f(n) <= g(n) for all n >= 59 (lg in base 2) => f(n) ∈ O(g(n))
But there do not exist constants c > 0 and n0 such that:
c n^2 <= 10 n lg n + 5 for all n >= n0
O-notation. Properties
When a worst-case analysis yields T(n) <= g(n), we can say that the running time of the algorithm belongs to O(g(n)).
Example. Sequential search: 6 <= T(n) <= 3(n+1)
Thus the running time of sequential search belongs to O(n)
Ω-notation
Definition.
f(n) ∈ Ω(g(n)) iff there exist c > 0 and n0 ∈ N such that
c g(n) <= f(n) for all n >= n0
Notation. f(n) = Ω(g(n)) (an order of growth at least that of g(n))
Examples.
1. T(n) = 3n+3 => T(n) ∈ Ω(n)
   c=3, n0=1, g(n)=n
2. 6 <= T(n) <= 3(n+1) => T(n) ∈ Ω(1)
   c=6, n0=1, g(n)=1
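Ω-notation
Graphical illustration. f(n) is bounded below by g(n) multiplied by a positive constant.
[Figure: plot for n from 0 to 100 of f(n) = 10 n lg n + 5 and c g(n) = 20n. To the right of n0 the inequality c g(n) <= f(n) holds; the behavior to the left of n0 doesn't matter. The plot marks n0 = 7, which matches the crossing point when lg is evaluated as the natural logarithm; with base-2 logarithms the bound already holds from n = 4. f(n) ∈ Ω(n).]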
Ω-notation. Properties
1. If T(n) = a_k n^k + a_(k-1) n^(k-1) + … + a_1 n + a_0
   then T(n) ∈ Ω(n^d) for all d <= k
Proof. Since T(n) > 0 for all n it follows that a_k > 0.
Then T(n)/n^k -> a_k (as n -> ∞).
Thus for all ε > 0 there exists N(ε) such that
a_k - ε <= T(n)/n^k for all n > N(ε)
Choose ε such that a_k - ε > 0.
Hence (a_k - ε) n^d <= (a_k - ε) n^k <= T(n)
Then by taking c = a_k - ε and n0 = N(ε) one obtains
c n^d <= T(n) for all n > n0, i.e. T(n) ∈ Ω(n^d)
Example.
n^2 ∈ Ω(n)
Ω-notation. Properties
2. Θ(g(n)) ⊆ Ω(g(n))
Proof. It suffices to consider only the lower bound from the big-theta definition.
Remark. The inclusion is a strict one: there exist elements of Ω(g(n)) which do not belong to Θ(g(n))
Example:
f(n) = 10 n lg n + 5, g(n) = n
f(n) >= 10 g(n) for all n >= 2 (lg in base 2) => f(n) ∈ Ω(g(n))
But there do not exist constants c and n0 such that:
10 n lg n + 5 <= c n for all n >= n0
3. Θ(g(n)) = O(g(n)) ∩ Ω(g(n))
Outline
• What is the order of growth ?
• What is asymptotic analysis ?
• Some asymptotic notations
• Efficiency analysis of basic processing structures
• Efficiency classes
• Empirical analysis of the algorithms
Efficiency analysis of basic processing structures
• Sequential structure
P:
  P1    Θ(g1(n))    O(g1(n))    Ω(g1(n))
  P2    Θ(g2(n))    O(g2(n))    Ω(g2(n))
  …     …           …           …
  Pk    Θ(gk(n))    O(gk(n))    Ω(gk(n))
  ---------------------------------------------------
  P     Θ(max{g1(n),g2(n), …, gk(n)})
        O(max{g1(n),g2(n), …, gk(n)})
        Ω(max{g1(n),g2(n), …, gk(n)})
Efficiency analysis of basic processing structures
• Conditional statement
P:
  IF <condition>
  THEN P1    Θ(g1(n))    O(g1(n))    Ω(g1(n))
  ELSE P2    Θ(g2(n))    O(g2(n))    Ω(g2(n))
  ---------------------------------------------------
  P     O(max{g1(n),g2(n)})
        Ω(min{g1(n),g2(n)})
Efficiency analysis of basic processing structures
• Loop statement
P:
  FOR i ← 1,n DO
    P1              Θ(1)
  => P belongs to Θ(n)
P:
  FOR i ← 1,n DO
    FOR j ← 1,n DO
      P1            Θ(1)
  => P belongs to Θ(n^2)
Remark: If each counting variable varies between 1 and n, the complexity is n^k (k is the number of nested loops)
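The loop rule can be illustrated by counting how often the innermost step runs. This is a minimal sketch in which the constant-time step P1 is represented by incrementing a counter; the function names are assumptions made for the example.

```python
def single_loop(n):
    """One loop over 1..n: the Θ(1) body runs n times."""
    steps = 0
    for i in range(1, n + 1):
        steps += 1  # stands for the constant-time step P1
    return steps


def double_loop(n):
    """Two nested loops over 1..n: the Θ(1) body runs n^2 times."""
    steps = 0
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            steps += 1  # stands for the constant-time step P1
    return steps


for n in (10, 20, 40):
    print(n, single_loop(n), double_loop(n))
```

Doubling n doubles the count for the single loop and quadruples it for the nested loops, as expected for linear and quadratic orders of growth.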
Efficiency analysis of basic processing structures
Remark.
If the limits of the counters are modified inside the loop body then the analysis needs to be modified.
Example:
m ← 1
FOR i ← 1,n DO
  m ← 3*m {m=3^i}
  FOR j ← 1,m DO
    processing step from Θ(1)
The complexity of the sequence is:
3 + 3^2 + … + 3^n = (3^(n+1) - 1)/2 - 1
The complexity is Θ(3^n)
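The closed form for this example can be confirmed by running the loops and counting the processing steps. The sketch below mirrors the pseudocode; the function names count_steps and closed_form are assumptions made for the illustration.

```python
def count_steps(n):
    """Run the two loops from the example and count the inner processing steps."""
    steps = 0
    m = 1
    for i in range(1, n + 1):
        m = 3 * m  # after this assignment m == 3**i
        for j in range(1, m + 1):
            steps += 1  # the Θ(1) processing step
    return steps


def closed_form(n):
    """3 + 3^2 + ... + 3^n written as (3^(n+1) - 1)/2 - 1."""
    return (3 ** (n + 1) - 1) // 2 - 1


for n in range(1, 6):
    print(n, count_steps(n), closed_form(n))
```

Although the code has the shape of two nested loops, the inner bound grows geometrically, so the count is exponential in n rather than quadratic.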
Outline
• What is the order of growth ?
• What is asymptotic analysis ?
• Some asymptotic notations
• Efficiency analysis of basic processing structures
• Efficiency classes
• Empirical analysis of the algorithms
Efficiency classes
Some frequently encountered efficiency classes:
Name of the class   Asymptotic notation   Example
logarithmic         O(lg n)               Binary search
linear              O(n)                  Sequential search
quadratic           O(n^2)                Insertion sort
cubic               O(n^3)                Multiplication of two n*n matrices
exponential         O(2^n)                Processing all subsets of a set with n elements
factorial           O(n!)                 Processing all permutations of order n
Example
Let x[1..n] be an array with values from the set {1,…,n}. Suppose that this array either has distinct elements or there is a unique pair of indices (i,j) such that i<>j and x[i]=x[j].
Particular case: n=8
x=[2,1,4,5,3,8,7,6]: all elements are distinct
x=[2,1,4,5,3,8,5,6]: there is a pair of identical elements
Find an efficient algorithm (with respect to both the running time and the memory space) to check whether all elements are distinct.
Example
Variant 1:
check1(x[1..n])
  i ← 1
  d ← True
  while (d=True) and (i<n) do
    d ← NOT (search(x[i+1..n],x[i]))
    i ← i+1
  endwhile
  return d

search(x[left..right],v)
  i ← left
  while x[i]<>v AND i<right do
    i ← i+1
  endwhile
  if x[i]=v then return True
  else return False
  endif

Analysis of search. Subproblem size: k=right-left+1
1 <= T'(k) <= k
Analysis of check1. Problem size: n
1 <= T(n) <= T'(n-1)+T'(n-2)+…+T'(1)
1 <= T(n) <= n(n-1)/2
T(n) ∈ Ω(1), T(n) ∈ O(n^2)
Best case: x[1]=x[2]
Worst case: distinct elements
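Variant 1 can be sketched in Python with a counter for the number of array elements examined, which plays the role of the dominant operation. The translation to 0-based indices, the explicit left/right parameters and the returned counter are assumptions made for this illustration.

```python
def search(x, left, right, v):
    """Sequential search for v in x[left..right] (0-based, inclusive bounds).

    Returns (found, examined), where examined counts the elements inspected.
    """
    i = left
    while x[i] != v and i < right:
        i += 1
    return x[i] == v, i - left + 1


def check1(x):
    """Return (all_distinct, examined) following Variant 1."""
    n = len(x)
    i = 0
    distinct = True
    examined = 0
    while distinct and i < n - 1:
        found, cost = search(x, i + 1, n - 1, x[i])
        distinct = not found
        examined += cost
        i += 1
    return distinct, examined


print(check1([2, 1, 4, 5, 3, 8, 7, 6]))  # worst case: 8*7/2 = 28 elements examined
print(check1([2, 1, 4, 5, 3, 8, 5, 6]))  # the duplicate 5 stops the search early
print(check1([3, 3, 1, 2]))              # best case: x[1] = x[2] in 1-based terms
```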
Example
Variant 2:
check2(x[1..n])
  Integer f[1..n] // frequencies
  f[1..n] ← 0
  for i ← 1 to n do
    f[x[i]] ← f[x[i]]+1
  endfor
  i ← 1
  while f[i]<2 AND i<n do i ← i+1 endwhile
  if f[i]>=2 then return False
  else return True
  endif
Problem size: n
n+3 <= T(n) <= 2n
T(n) ∈ Θ(n)

Variant 3:
check3(x[1..n])
  Integer f[1..n] // frequencies
  f[1..n] ← 0
  i ← 1
  while i<=n do
    f[x[i]] ← f[x[i]]+1
    if f[x[i]]>=2 then return False
    endif
    i ← i+1
  endwhile
  return True
Problem size: n
4 <= T(n) <= 2n
T(n) ∈ O(n)
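The two frequency-based variants can be written in Python as follows. This is a sketch under the assumption stated in the problem that all values lie in {1,…,n}; the frequency list has an unused slot at index 0 so that values can be used directly as indices.

```python
def check2(x):
    """Variant 2: build the whole frequency table, then scan it."""
    n = len(x)
    freq = [0] * (n + 1)  # index 0 is unused; values are assumed to be in 1..n
    for value in x:
        freq[value] += 1
    return all(count < 2 for count in freq[1:])


def check3(x):
    """Variant 3: stop as soon as a value is seen for the second time."""
    n = len(x)
    freq = [0] * (n + 1)  # index 0 is unused; values are assumed to be in 1..n
    for value in x:
        freq[value] += 1
        if freq[value] >= 2:
            return False
    return True


print(check2([2, 1, 4, 5, 3, 8, 7, 6]), check3([2, 1, 4, 5, 3, 8, 7, 6]))
print(check2([2, 1, 4, 5, 3, 8, 5, 6]), check3([2, 1, 4, 5, 3, 8, 5, 6]))
```

Both functions use O(n) additional memory; check3 can return before reading the whole array, which is why its running time is only bounded above by a linear function.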
Example
Variant 4:
Variants 2 and 3 need an additional memory space of size O(n).
Can we solve the problem in linear time by using an additional memory space of size O(1)?
Idea: the elements are distinct if and only if the array contains all elements of the set {1,2,…,n}. When at most one value is duplicated, it is enough to check that the sum of all elements is n(n+1)/2.
check4(x[1..n])
  s ← 0
  for i ← 1 to n do s ← s+x[i] endfor
  if s=n(n+1)/2 then return True
  else return False
  endif
Problem size: n
T(n) = n
T(n) ∈ Θ(n)
Remark.
Variant 4 is better than variant 3 with respect to the size of the memory space, but the average running time is smaller in variant 3 than in variant 4.
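The sum-based test is short enough to state directly in Python. The sketch relies on the assumptions of the problem statement (values in {1,…,n} and at most one duplicated pair); the last call shows an input that violates the second assumption and is therefore misclassified.

```python
def check4(x):
    """Variant 4: compare the sum of the elements with n(n+1)/2.

    Valid only if the values are in 1..n and at most one pair is duplicated.
    """
    n = len(x)
    return sum(x) == n * (n + 1) // 2


print(check4([2, 1, 4, 5, 3, 8, 7, 6]))  # distinct elements, sum is 36
print(check4([2, 1, 4, 5, 3, 8, 5, 6]))  # one duplicated pair, sum is 34
print(check4([1, 1, 4, 4]))              # two duplicated pairs: assumption violated
```

With exactly one duplicated value, one element of {1,…,n} is missing and the sum changes by the difference between the duplicated and the missing value, which is never zero. With several duplicates these differences can cancel, as in the last call.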
Outline
• What is the order of growth ?
• What is asymptotic analysis ?
• Some asymptotic notations
• Efficiency analysis of basic processing structures
• Efficiency classes
• Empirical analysis of the algorithms
Empirical analysis of the algorithms
Sometimes the mathematical analysis of efficiency is too difficult to apply; in these cases empirical analysis can be useful.
It can be used to:
• Develop a hypothesis about the algorithm's efficiency
• Compare the efficiency of several algorithms designed to solve the same problem
• Establish the efficiency of an algorithm's implementation
• Check the accuracy of a theoretical assertion about the algorithm's efficiency
General plan for empirical analysis
1. Establish the aim of the analysis
2. Choose an efficiency measure (e.g. number of executions of some operations or time needed to execute a sequence of processing steps)
3. Decide on the characteristics of the input sample (size, range …)
4. Implement the algorithm in a programming language
5. Generate a set of data inputs
6. Execute the program for each data sample and record the results
7. Analyze the obtained results
General plan for empirical analysis
Efficiency measure. It is chosen depending on the aim of the empirical analysis:
• If the aim is to estimate the efficiency class, an adequate efficiency measure is the number of operations
• If the aim is to analyze/compare the implementation of an algorithm on a given machine, an adequate efficiency measure is the physical time
General plan for empirical analysis
Set of input data. Different input data must be generated in order to conduct a useful empirical analysis.
Some rules for generating input data:
• The input data in the set should be of different sizes and values (the entire range of values should be represented)
• All characteristics of input data should be represented in the sample set (different configurations)
• The data should be typical (not only exceptions)
General plan for empirical analysis
Algorithm's implementation. Some monitoring processing steps should be included:
• Counting variables (when the efficiency measure is the number of executions)
• Calls of some functions which return the current time (in order to estimate the time needed to execute a processing sequence)
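Both kinds of monitoring can be sketched on sequential search. The code below is an illustration only: the function names, the number of trials, the fixed random seed and the choice of random inputs are assumptions, and time.perf_counter from the Python standard library is used as the clock.

```python
import random
import time


def sequential_search(x, v):
    """Linear scan of x; returns (found, comparisons) using a counting variable."""
    comparisons = 0
    for item in x:
        comparisons += 1
        if item == v:
            return True, comparisons
    return False, comparisons


def measure(n, trials=100, seed=0):
    """Average comparison count and average search time for random inputs of size n."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    total_comparisons = 0
    total_time = 0.0
    for _ in range(trials):
        data = [rng.randint(1, n) for _ in range(n)]
        target = rng.randint(1, n)
        start = time.perf_counter()  # time only the search, not the data generation
        _, comparisons = sequential_search(data, target)
        total_time += time.perf_counter() - start
        total_comparisons += comparisons
    return total_comparisons / trials, total_time / trials


for n in (100, 1000, 10_000):
    avg_comparisons, avg_seconds = measure(n)
    print(n, avg_comparisons, avg_seconds)
```

The comparison count is independent of the machine and is therefore suited to estimating the efficiency class, while the measured time characterizes this implementation on the machine where it runs.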
Next lecture will be on …
… basic sorting algorithms
… on their correctness and
… on their efficiency