Software Design III - University of Wisconsin–La Crosse


Algorithm Analysis
Examining code and making the right choice using lots of math stuff!
1
Algorithm Analysis
- also called “Asymptotic Analysis”
- why & when?
* justify the choice of one algorithm over another
* justify the need for developing a new algorithm
2
Quantities:
• number of processors
• the size of the memory
• the running time
Applications require massive processing of large
data sets: molecular modeling, weather
forecasting, image analysis, neural network
training, and simulation.
3
Efficiency
• A solution is efficient if it solves
the problem within its resource constraints.
– Space
– Time
• The cost of a solution is the amount of
resources that the solution consumes.
4
Complexity
• An algorithm can be characterized in terms of its
complexity
– Time complexity: how much time will the algorithm take to
complete?
– Space complexity: how much storage will the algorithm require?
• Sometimes time complexity must be minimized:
– Real time systems or interactive interfaces
• Sometimes space complexity must be minimized:
– Mobile devices with limited memory
5
How Fast is an Algorithm?
• Experimental
– Must actually implement the algorithm
– Results are system dependent (language, OS, hardware)
– Results are for the tested input only. Testing does not “prove”
anything about data that hasn’t been tested.
• Analytical
– Do not need to implement the algorithm
– Results are not dependent on a particular system
– Conclusions can be determined for all cases without performing
any tests!
6
Estimation Techniques
Known as “back of the envelope” or
“back of the napkin” calculation
1. Determine the major parameters that affect the
problem.
2. Derive an equation that relates the parameters
to the problem.
3. Select values for the parameters, and apply
the equation to yield an estimated solution.
7
How to Measure Efficiency?
Factors affecting running time:
For most algorithms, running time depends on
“size” of the input.
Running time is expressed as T(n) for some
function T on input size n.
8
Analysis is Fun!
• Each CPU operation takes time:
– arithmetic operations (+, -, /, *, %)
– logical operations (!, &&, ||, ^)
– comparisons (<, ==, >, <=, >=, !=)
– assignment
– method calls and method returns
– array access
– member access
• These operations may actually take different lengths of time (depending on the
CPU and operating system) but assume (for simplicity) that each operation
takes the same amount of time to complete.
• The length of time a program will take to execute is given by the total number
of operations multiplied by the length of time per operation:
TotalTime_program = N_op * Time_op
9
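As a rough illustration of this operation-counting model (a sketch of my own, not from the slides), the following method sums the first n integers while tallying each basic operation under the "every operation costs the same" assumption above:

```java
// Illustrative sketch: counting the operations in a simple loop
// under the uniform-cost assumption from the slide.
public class OpCount {
    // Sums the integers 1..n, tallying each basic operation as it happens.
    static long countOps(int n) {
        long ops = 0;
        int sum = 0;
        ops++;                        // assignment: sum = 0
        for (int i = 1; i <= n; i++) {
            ops += 2;                 // comparison i <= n and increment i++, per iteration
            sum += i;
            ops++;                    // addition/assignment sum += i
        }
        ops++;                        // final failing comparison i <= n
        return ops;
    }

    public static void main(String[] args) {
        // The tally grows linearly in n: roughly 3n + 2 under this count.
        System.out.println(countOps(10));   // prints 32
        System.out.println(countOps(100));  // prints 302
    }
}
```

The exact tally depends on which operations you choose to count; what matters for the analysis is that it grows linearly with n.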
Asymptotic Notation
• A theoretical means of ordering functions
• Hides unnecessary details
• Compares relative growth rates of functions
- not the functional values
Asymptotic Notation   Name        Meaning
f(n) is O(g(n))       Big-oh      f <= g
f(n) is Θ(g(n))       Big-theta   f = g
f(n) is Ω(g(n))       Big-omega   f >= g
f(n) is o(g(n))       Little-oh   f < g
10
Examples of Growth Rate
Example 1.
// Find the index of the largest value
public int largest(int[] arr) {
    int currlarge = 0;
    for (int i = 1; i < arr.length; i++) {
        if (arr[currlarge] < arr[i])
            currlarge = i;
    }
    return currlarge;
}
13
Examples (cont)
Example 2: Assignment statement.
Example 3:
sum = 0;
for (i = 1; i <= n; i++) {
    for (j = 1; j < n; j++) {
        sum++;
    }
}
14
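As a quick check of Example 3 (an illustrative sketch, not part of the slides): the inner loop body runs n-1 times for each of the n outer iterations, so sum ends at n(n-1), which grows quadratically in n.

```java
// Sketch: the nested loops of Example 3 increment sum exactly n*(n-1) times.
public class NestedLoopCount {
    static int run(int n) {
        int sum = 0;
        for (int i = 1; i <= n; i++) {      // n iterations
            for (int j = 1; j < n; j++) {   // n-1 iterations each
                sum++;
            }
        }
        return sum;                         // n * (n - 1)
    }

    public static void main(String[] args) {
        System.out.println(run(5));   // prints 20 (= 5 * 4)
        System.out.println(run(100)); // prints 9900, growth on the order of n^2
    }
}
```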
Growth Rate Graph
15
Best, Worst, Average Cases
Not all inputs of a given size take the same
time to run.
Sequential search for K in an array of n
integers:
• Begin at first element in array and look at
each element in turn until K is found
Best case: T(n) = 1
Worst case: T(n) = n
Average case: T(n) = n/2
16
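A minimal sequential-search sketch (the method name and signature are my own, not from the slides) makes the three cases concrete:

```java
// Sequential search for key K in an array of n integers.
public class SeqSearch {
    // Returns the index of key, or -1 if key is absent.
    static int sequentialSearch(int[] arr, int key) {
        for (int i = 0; i < arr.length; i++) {
            if (arr[i] == key) return i; // best case: hit on the first element, T(n) = 1
        }
        return -1;                       // worst case: all n elements examined, T(n) = n
    }

    public static void main(String[] args) {
        int[] a = {4, 7, 2, 9, 5};
        System.out.println(sequentialSearch(a, 4)); // prints 0 (best case)
        System.out.println(sequentialSearch(a, 5)); // prints 4 (worst case)
    }
}
```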
Which Analysis to Use?
While average time appears to be the fairest
measure, it may be difficult to determine.
When is the worst case time important?
17
A comparison of growth-rate functions: (a) in tabular form
18
A comparison of growth-rate functions: (b) in graphical form
19
Asymptotic Analysis: Big-oh
Definition: For T(n) a non-negatively valued
function, T(n) is in the set O(f(n)) if there
exist two positive constants c and n0
such that T(n) <= cf(n) for all n > n0.
Usage: The algorithm is in O(n²) in [best, average,
worst] case.
Meaning: For all data sets big enough (i.e., n>n0),
the algorithm always executes in less than
cf(n) steps in [best, average, worst] case.
20
f(n) is in O(g(n)): f(n) is bounded
from above by g(n) for all n >= n0
21
Big-oh Notation (cont)
Big-oh notation indicates an upper bound.
Example: If T(n) = 3n² then T(n) is in O(n²).
tightest upper bound:
While T(n) = 3n² is in O(n³), we prefer O(n²).
22
Big-Oh Examples
Example 1: Finding value X in an array
(average cost).
T(n) = c_s·n/2.
For all values of n > 1, c_s·n/2 <= c_s·n.
Therefore, by the definition, T(n) is in O(n)
for n0 = 1 and c = c_s.
23
Big Oh Example
T(n) = 7n - 3
f(n) = n
[Plot of T(n) and f(n) for n from 0 to 10; function values from -10 to 80]
Question: Is T(n) in O(f(n))?
Or:
Do positive constants c and n0 exist such that T(n) <= c*f(n) for all n >= n0?
24
Big Oh Example
T(n) = n²/2
f(n) = 20n
[Plot of T(n) and f(n) for n from 0 to 50; the curves cross at n = 40]
Question: Is T(n) in O(f(n))?
Or:
Do positive constants c and n0 exist such that T(n) <= c*f(n) for all n >= n0?
25
Common Growth Rates
Name          Big Oh Notation
Constant      O(1)
Logarithmic   O(log N)
Log-squared   O(log² N)
Linear        O(N)
Linearithmic  O(N log N)
Quadratic     O(N²)
Cubic         O(N³)
Exponential   O(2^N)
26
Plot of Common Functions
(n ranges from 1 to 2)
[Plot of log n, (log n)², n, n log n, n², n³, and 2^n; function values from 0 to 8]
27
Plot of Common Functions
(n ranges from 1 to 25)
[Plot of log n, (log n)², n, n log n, n², n³, and 2^n; function values from 0 to 14000]
28
Table of Common Functions
Growth Rates
n     log n   n^.5   n     n log n   n²       n³          2^n
2     1       1.4    2     2         4        8           4
4     2       2      4     8         16       64          16
8     3       2.8    8     24        64       512         256
16    4       4      16    64        256      4096        65536
32    5       5.7    32    160       1024     32768       4294967296
64    6       8      64    384       4096     262144      1.84E19
128   7       11     128   896       16384    2097152     3.4E38
256   8       16     256   2048      65536    16777216    1.15E77
512   9       23     512   4608      262144   134217728   1.34E154
29
Other Common Asymptotic
Notations
• O - “Big-Oh” notation expresses a less-than-or-equal-to
relation between two functions
– If T(n) is O(f(n)) then T(n) is less-than-or-equal-to f(n)
• Ω – "Big-Omega"
– If T(n) is Ω(f(n)) then T(n) is greater-than-or-equal-to f(n)
• Θ – "Big-Theta"
– If T(n) is Θ(f(n)) then T(n) is equal-to f(n)
• o – "Little-Oh"
– If T(n) is o(f(n)) then T(n) is strictly less-than f(n)
30
Big-Omega
Definition: For T(n) a non-negatively valued
function, T(n) is in the set Ω(g(n)) if there
exist two positive constants c and n0
such that T(n) >= cg(n) for all n > n0.
Meaning: For all data sets big enough (i.e.,
n > n0), the algorithm always executes in
more than cg(n) steps.
Lower bound.
31
f(n) is in Ω(g(n)): f(n) is bounded
from below by g(n) for all n >= n0
32
Big-Omega Example
T(n) = c1n² + c2n.
c1n² + c2n >= c1n² for all n > 1.
T(n) >= cn² for c = c1 and n0 = 1.
Therefore, T(n) is in Ω(n²) by the definition.
We want the greatest lower bound.
33
Theta Notation
When big-Oh and Ω meet, we indicate this
by using Θ (big-Theta) notation.
Definition: An algorithm is said to be Θ(h(n))
if it is in O(h(n)) and it is in Ω(h(n)).
34
f(n) = Θ(g(n)): f(n) and g(n) grow at
the same rate for all n >= n0
35
Simplifying Rules
1. If f(n) is in O(g(n)) and g(n) is in O(h(n)),
then f(n) is in O(h(n)).
2. If f(n) is in O(kg(n)) for any constant k >
0, then f(n) is in O(g(n)).
3. If f1(n) is in O(g1(n)) and f2(n) is in
O(g2(n)), then (f1 + f2)(n) is in
O(max(g1(n), g2(n))).
4. If f1(n) is in O(g1(n)) and f2(n) is in
O(g2(n)) then f1(n)f2(n) is in O(g1(n)g2(n)).
36
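A short worked application of these rules (a sketch of my own, using the rule numbers above):

```latex
% Applying the simplifying rules to T(n) = 3n^2 + 20n:
%   3n^2 is in O(n^2) by rule 2 (drop the constant factor 3),
%   20n  is in O(n)   by rule 2,
% so by rule 3 their sum is in O(max(n^2, n)) = O(n^2).
T(n) = 3n^2 + 20n \in O\!\left(\max(n^2, n)\right) = O(n^2)
```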
Rules for Asymptotic Notation
• It is bad style to include constants or lower-order terms when using asymptotic notation
• Example:
– If T(n) is O(10n³ + 5n - 12) we should write that T(n) is O(n³)
37
More
• If T(n) is a polynomial of degree k then
– T(n) is Θ(n^k)
• log^k n is O(n) for any constant k
38
Analysis Example
Problem: Write an efficient method to compute xn where x is
a real number and n is a non-negative integer.
public double power(double x, int n) {
    double result = 1;
    for (int i = 1; i <= n; i++) {
        result *= x;
    }
    return result;
}
Is this fast
enough?
39
Analysis Example
Problem: Write an efficient method to compute xn where x is
a real number and n is a non-negative integer.
public double power(double x, int n) {
    if (n == 0) return 1;
    else if (n == 1) return x;
    else if (n % 2 == 0) return power(x*x, n/2);
    else return power(x*x, n/2) * x;
}
Am I good
or what?
Notice the following:
x³ = (x²)x
x⁴ = (x²)²
x⁵ = (x²)²x
x⁶ = (x²)³
x⁷ = (x²)³x
x⁸ = (x²)⁴
Show me the money!
40
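To see why the halving version is fast, here is an instrumented copy of the slide's method (the call counter is my addition, for illustration only). Each call halves n, so the number of recursive calls grows like log₂ n rather than n:

```java
// Instrumented copy of the divide-and-conquer power method.
public class FastPower {
    static int calls = 0; // counts invocations (added for illustration)

    static double power(double x, int n) {
        calls++;
        if (n == 0) return 1;
        else if (n == 1) return x;
        else if (n % 2 == 0) return power(x*x, n/2);  // n halves each call
        else return power(x*x, n/2) * x;
    }

    public static void main(String[] args) {
        calls = 0;
        System.out.println(power(2, 10)); // prints 1024.0
        System.out.println(calls);        // prints 4: n went 10 -> 5 -> 2 -> 1
    }
}
```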
Analysis Example
Problem: Write an efficient method to compute xn where x is
a real number and n is a non-negative integer.
public double power(double x, int n) {
    if (n == 0) return 1;
    else if (n == 1) return x;
    else if (n % 2 == 0) return power(x*x, n/2);
    else return power(x*x, n/2) * x;
}
Think, think,
think, ….!
Are the following modifications OK?
• return power(power(x, 2), n/2);
• return power(power(x, n/2), 2);
• return power(x, n/2) * power(x, n/2);
41
Analysis Example
Maximum Subsequence Sum Problem
Problem: Given an ordered set of (possibly negative)
integers A1, A2, A3, …, AN write an efficient method to find
the maximum value of
max over 1 ≤ i ≤ j ≤ N of  Σ_{k=i}^{j} A_k
For convenience, the value is defined to be 0 if all the numbers are negative.
Example.
Given [-2, 11, -4, 13, -5, -2] what is the correct answer?
42
First Idea
public int maxSubSum1(int[] a) {
    int maxSum = 0;
    for (int i = 0; i < a.length; i++) {
        for (int j = i; j < a.length; j++) {
            int thisSum = 0;
            for (int k = i; k <= j; k++) {
                thisSum += a[k];
            }
            if (thisSum > maxSum) {
                maxSum = thisSum;
            }
        }
    }
    return maxSum;
}
How “good” is this solution?
43
Second Idea
public int maxSubSum2(int[] a) {
    int maxSum = 0;
    for (int i = 0; i < a.length; i++) {
        int thisSum = 0;
        for (int j = i; j < a.length; j++) {
            thisSum += a[j];
            if (thisSum > maxSum) {
                maxSum = thisSum;
            }
        }
    }
    return maxSum;
}
How “good” is this solution?
44
Third Idea
public int maxSubSum3(int[] a) {
    int maxSum = 0, thisSum = 0;
    for (int i = 0; i < a.length; i++) {
        thisSum += a[i];
        if (thisSum > maxSum) {
            maxSum = thisSum;
        } else if (thisSum < 0) {
            thisSum = 0;
        }
    }
    return maxSum;
}
How “good” is this solution?
45
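Running the slides' example through the third idea (a self-contained copy, for illustration) confirms the expected answer: the best subsequence of [-2, 11, -4, 13, -5, -2] is 11 + (-4) + 13 = 20. All three ideas return the same value; they differ in running time, with triple, double, and single loops respectively.

```java
// Usage sketch: the third idea applied to the slides' example array.
public class MaxSubSumDemo {
    // Copy of maxSubSum3 from the slide (single pass over the array).
    static int maxSubSum3(int[] a) {
        int maxSum = 0, thisSum = 0;
        for (int i = 0; i < a.length; i++) {
            thisSum += a[i];
            if (thisSum > maxSum) {
                maxSum = thisSum;
            } else if (thisSum < 0) {
                thisSum = 0;        // a negative prefix can never help; restart
            }
        }
        return maxSum;
    }

    public static void main(String[] args) {
        int[] a = {-2, 11, -4, 13, -5, -2};
        System.out.println(maxSubSum3(a)); // prints 20 (= 11 - 4 + 13)
    }
}
```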
Greatest Common Divisor
Problem: Compute the largest integer D that evenly divides two non-negative
integers M and N.
Example: Compute the greatest common divisor of 1989 and 1590
1989 = 1 * 3 * 3 * 13 * 17
1590 = 1 * 2 * 3 * 5 * 53
gcd(m, n) = m            if n = 0
            gcd(n, m%n)  otherwise
Example
Mr. Euclid
gcd(1440, 408) = gcd(408, 216)
gcd(408, 216) = gcd(216, 192)
gcd(216,192) = gcd(192, 24)
gcd(192, 24) = gcd(24, 0)
gcd(24, 0) = 24
46
Greatest Common Divisor
Problem: Compute the largest integer D that evenly divides two non-negative
integers M and N.
gcd(m, n) = m            if n = 0
            gcd(n, m%n)  otherwise
Implementation
int gcd(int m, int n) {
    while (n != 0) {
        int rem = m % n;
        m = n;
        n = rem;
    }
    return m;
}
Mr. Euclid
47
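A quick usage sketch of the implementation above (a self-contained copy, with the slide's traced example as a check):

```java
// Euclid's algorithm, iterative form as on the slide.
public class GcdDemo {
    static int gcd(int m, int n) {
        while (n != 0) {
            int rem = m % n;  // remainder replaces the larger argument each pass
            m = n;
            n = rem;
        }
        return m;
    }

    public static void main(String[] args) {
        System.out.println(gcd(1440, 408));  // prints 24, matching the traced example
        System.out.println(gcd(1989, 1590)); // prints 3, the shared factor in the factorizations
    }
}
```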
Summary
• Algorithmic analysis usually focuses on
– Determining the efficiency of an algorithm
• Quantified by growth rate using Big Oh notation
– Proving the correctness of an algorithm
• Detailed Examples of Efficiency Analysis
– Power
– Maximum subsequence
– Greatest common divisor
48