Algorithm Efficiency

There are often many approaches (algorithms) to solve a problem. How do we choose between them?

At the heart of computer program design are two (sometimes conflicting) goals:

1. To design an algorithm that is easy to understand, code, and debug.
2. To design an algorithm that makes efficient use of the computer's resources.

Goal 1 is the concern of Software Engineering.

Goal 2 is the concern of data structures and algorithm analysis.

When goal 2 is important, how do we measure an algorithm's cost?

How to Measure Efficiency?

Empirical comparison (run the programs):
- Only valid for that machine.
- Only valid for that compiler.
- Only valid for that coding of the algorithm.

Asymptotic algorithm analysis:
- Must identify the critical resources: time (where we will concentrate) and space.
- Identify the factors affecting that resource. For most algorithms, running time depends on the "size" of the input.

Running time is expressed as T(n) for some function T of input size n.

Examples of Growth Rate

Example 1 (the loop examines each of the n elements once, so the running time grows as n):

    int largest(int* array, int n) {
        int currlarge = array[0];        // assumes n >= 1
        for (int i = 1; i < n; i++)
            if (array[i] > currlarge)
                currlarge = array[i];
        return currlarge;
    }

Example 2 (the innermost statement executes n * n times, so the running time grows as n^2):

    sum = 0;
    for (i = 1; i <= n; i++)
        for (j = 1; j <= n; j++)
            sum++;
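As a quick sanity check (a minimal sketch added here; the driver and the sizes are illustrative), we can count the operations of Example 2 empirically and watch the n^2 growth:

    #include <iostream>

    // Count how many times the innermost statement of Example 2
    // executes for a given n; the answer is exactly n * n.
    long long count_ops(int n) {
        long long sum = 0;
        for (int i = 1; i <= n; i++)
            for (int j = 1; j <= n; j++)
                sum++;
        return sum;
    }

    int main() {
        for (int n : {10, 100, 1000})
            std::cout << "n = " << n << ", ops = " << count_ops(n) << "\n";
        // Prints 100, 10000, 1000000: a tenfold increase in n
        // yields a hundredfold increase in work.
        return 0;
    }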

Growth Rate Graphs

[Figure: growth-rate curves for 10n, 20n, 5n log n, 2n^2, and 2^n as functions of input size n, together with an expanded view near the origin showing the same curves for small n.]

Best, Worst and Average Cases

Not all inputs of a given size take the same time.

Sequential search for K in an array of n integers: begin at the first element and look at each element in turn until K is found.

- Best case: 1 element examined (K is in the first position).
- Worst case: n elements examined (K is in the last position, or absent).
- Average case: about n/2 elements examined, if K is equally likely to be in any position.

While average time seems to be the fairest measure, it may be difficult to determine.

When is the worst case time important?

- Time-critical events (real-time processing).

Faster Computer or Faster Algorithm?

What happens when we buy a computer 10 times faster?

f(n)        n       n'      change                 n'/n
10n         1,000   10,000  n' = 10n               10
20n         500     5,000   n' = 10n               10
5n log n    250     1,842   sqrt(10)n < n' < 10n   7.37
2n^2        70      223     n' = sqrt(10)n         3.16
2^n         13      16      n' = n + 3             > 1

n: size of input that can be processed in one hour on the old machine (10,000 steps).
n': size of input that can be processed in one hour on the new machine (100,000 steps).

Asymptotic Analysis: Big-oh

Definition: T(n) is in the set O(f(n)) if there exist two positive constants c and n_0 such that |T(n)| <= c|f(n)| for all n > n_0.

Usage: the algorithm is in O(n^2) in the [best, average, worst] case.

Meaning: for all data sets big enough (i.e., n > n_0), the algorithm always executes in less than c|f(n)| steps in the [best, average, worst] case.

Upper bound example: if T(n) = 3n^2, then T(n) is in O(n^2).

Tightest upper bound: T(n) = 3n^2 is also in O(n^3), but we prefer O(n^2).

Big-oh Example

Example 1: finding the value X in an array.

T(n) = c_s n/2 on average, where c_s is the cost of examining one element. For all values of n > 1, |c_s n/2| <= c_s|n|. Therefore, by the definition, T(n) is in O(n) for n_0 = 1 and c = c_s.
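A minimal sequential-search sketch (the function name and comments are added for illustration; the cost model matches Example 1): when X is present and equally likely to be in any slot, about n/2 elements are examined on average, each at cost c_s.

    // Return the index of X in array[0..n-1], or -1 if X is absent.
    int sequential_search(const int* array, int n, int X) {
        for (int i = 0; i < n; i++)   // each iteration costs c_s
            if (array[i] == X)
                return i;             // found after i+1 examinations
        return -1;                    // worst case: all n elements examined
    }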

Example 2: T(n) = c_1 n^2 + c_2 n in the average case.

|c_1 n^2 + c_2 n| <= |c_1 n^2 + c_2 n^2| = (c_1 + c_2)|n^2| for all n > 1. Therefore, T(n) is in O(n^2).

Example 3: T(n) = c. This is in O(1).

Big-Omega

Definition: T(n) is in the set Ω(g(n)) if there exist two positive constants c and n_0 such that |T(n)| >= c|g(n)| for all n > n_0.

Meaning: for all data sets big enough (i.e., n > n_0), the algorithm always executes in more than c|g(n)| steps.

It is a LOWER bound.

Example: T(n) = c_1 n^2 + c_2 n.

|c_1 n^2 + c_2 n| >= |c_1 n^2| for all n > 1, so |T(n)| >= c|n^2| for c = c_1 and n_0 = 1. Therefore, T(n) is in Ω(n^2) by the definition.

We want the greatest lower bound.

Theta Notation

When big-Oh and Ω are the same for an algorithm, we indicate this by using Θ (big-Theta) notation.

Definition: an algorithm is said to be Θ(h(n)) if it is in O(h(n)) and it is in Ω(h(n)). For instance, T(n) = c_1 n^2 + c_2 n from the examples above is in both O(n^2) and Ω(n^2), so it is Θ(n^2).

Simplifying rules (a worked example follows the list):

1. If f(n) is in O(g(n)) and g(n) is in O(h(n)), then f(n) is in O(h(n)).
2. If f(n) is in O(kg(n)) for any constant k > 0, then f(n) is in O(g(n)).
3. If f_1(n) is in O(g_1(n)) and f_2(n) is in O(g_2(n)), then (f_1 + f_2)(n) is in O(max(g_1(n), g_2(n))).
4. If f_1(n) is in O(g_1(n)) and f_2(n) is in O(g_2(n)), then f_1(n)f_2(n) is in O(g_1(n)g_2(n)).
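As a worked illustration (an added example, not from the slides): let f(n) = 3n^2 + 5n log n. By rule 2, 3n^2 is in O(n^2) and 5n log n is in O(n log n); by rule 3, f(n) is in O(max(n^2, n log n)) = O(n^2).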

Big-oh Rules

- If T_1(n) = O(f(n)) and T_2(n) = O(g(n)), then:
  - T_1(n) + T_2(n) = max(O(f(n)), O(g(n)))
  - T_1(n) * T_2(n) = O(f(n)) * O(g(n))
- If T(n) is a polynomial of degree k, then T(n) = Θ(n^k).
- log^k n = O(n) for any constant k. Logarithms grow very slowly.

General Algorithm Analysis Rules

- The running time of a for loop is at most the running time of the statements inside the loop times the number of iterations.
- Analyze nested loops inside out, then apply the previous rule.
- Consecutive statements just add (so apply the max rule).
- The running time of an if/else statement is never more than the running time of the test plus the larger of the running times of the true and false cases.

Running Time of a Program

Example 1: the assignment a = b; takes constant time, so it is Θ(1).

Example 2: Θ(n), since the loop body executes n times.

    sum = 0;
    for (i = 1; i <= n; i++)
        sum += n;

Example 3: Θ(n^2). The nested loops execute 1 + 2 + ... + n = n(n+1)/2 times, and the final loop adds only Θ(n).

    sum = 0;
    for (j = 1; j <= n; j++)
        for (i = 1; i <= j; i++)
            sum++;
    for (k = 1; k <= n; k++)
        a[k] = k - 1;

More Examples

Example 4: sum1's inner statement executes n * n times, so it is Θ(n^2); sum2's inner statement executes 1 + 2 + ... + n = n(n+1)/2 times, also Θ(n^2) but with a smaller constant.

    sum1 = 0;
    for (i = 1; i <= n; i++)
        for (j = 1; j <= n; j++)
            sum1++;
    sum2 = 0;
    for (i = 1; i <= n; i++)
        for (j = 1; j <= i; j++)
            sum2++;

Example 5: the outer loops double k each pass, so they run about log n times. sum1's inner statement executes n(floor(log2 n) + 1) times, which is Θ(n log n); sum2's executes 1 + 2 + 4 + ... + n < 2n times, which is Θ(n).

    sum1 = 0;
    for (k = 1; k <= n; k *= 2)
        for (j = 1; j <= n; j++)
            sum1++;
    sum2 = 0;
    for (k = 1; k <= n; k *= 2)
        for (j = 1; j <= k; j++)
            sum2++;

Binary Search

    // Return the position of value in sorted array[0..size-1], or -1 if absent.
    int binary(int value, int* array, int size) {
        int left = -1;                // everything at or left of left is < value
        int right = size;             // everything at or right of right is > value
        while (left + 1 != right) {   // stop when no positions remain between them
            int mid = (left + right) / 2;
            if (value < array[mid])
                right = mid;          // value, if present, is left of mid
            else if (value > array[mid])
                left = mid;           // value, if present, is right of mid
            else
                return mid;           // found it: return the position
        }
        return -1;                    // value is not in the array
    }

Binary Search Example

Position:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
Key:      11 13 21 26 29 36 40 41 45 51 54 56 65 72 77 83

Now let's search for the value 45: the probes examine positions 7 (41 < 45), 11 (56 > 45), 9 (51 > 45), and finally 8, where 45 is found.

Unsuccessful search: now let's search for the value 24. The probes examine positions 7, 3, 1, and 2, after which the interval is empty and -1 is returned.

How many elements are examined in the worst case? Each probe halves the remaining interval, so at most floor(log2 n) + 1 elements: 5 for n = 16.
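A small driver for the binary() function above (added for illustration; it assumes the definition from the previous slide is in scope), run on this 16-key array:

    #include <iostream>

    int binary(int value, int* array, int size);  // defined on the previous slide

    int main() {
        int keys[] = {11, 13, 21, 26, 29, 36, 40, 41,
                      45, 51, 54, 56, 65, 72, 77, 83};
        std::cout << binary(45, keys, 16) << "\n";  // prints 8: position of 45
        std::cout << binary(24, keys, 16) << "\n";  // prints -1: 24 is absent
        return 0;
    }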

Case Study – Maximum Subsequence

Given a sequence of integers a_1, a_2, ..., a_n, find the contiguous subsequence that gives the largest sum.

Since there is no size limit on the subsequence, if the sequence is all positive (take everything) or all negative (take the single largest element, or the empty subsequence with sum 0), the solution is trivial.

Simple Solution

Look at all possible combinations of start and stop positions of the subsequence:

    maxsum = 0;
    for (i = 0; i < n; i++)
        for (j = i; j < n; j++) {
            thissum = 0;
            for (k = i; k <= j; k++)
                thissum += a[k];
            if (thissum > maxsum)
                maxsum = thissum;
        }

Analysis of Simple Solution

The inner loop executes j - i + 1 times. The middle loop runs j from i to n-1, so for a fixed i the count j - i + 1 takes the values 1 + 2 + ... + (n-i). This sums to (n-i+1)(n-i)/2 = (n^2 - 2ni + i^2 + n - i)/2.

More Analysis

The outer loop runs i from 0 to n-1. Summing each term of (n^2 - 2ni + i^2 + n - i)/2 over these n values of i:

- n^2 summed n times is n^3.
- The sum of 2ni is 2n(n-1)(n)/2 = n^3 - n^2.
- The sum of i^2 is (n-1)(n)(2n-1)/6 = (2n^3 - 3n^2 + n)/6.
- The sum of n is n^2.
- The sum of i is (n-1)(n)/2 = (n^2 - n)/2.

Total: (n^3 - (n^3 - n^2) + (2n^3 - 3n^2 + n)/6 + n^2 - (n^2 - n)/2) / 2 = (n^3 + 3n^2 + 2n)/6. This is O(n^3).
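The same total can be reached more directly (an added cross-check, not from the slides), by summing the per-i counts in closed form:

\[
\sum_{i=0}^{n-1}\frac{(n-i)(n-i+1)}{2}
= \sum_{k=1}^{n}\frac{k(k+1)}{2}
= \frac{n(n+1)(n+2)}{6} = \Theta(n^3).
\]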

An Improved Algorithm

Start at position i and find the sum of all subsequences that start at position i; then repeat for all starting positions:

    maxsum = 0;
    for (i = 0; i < n; i++) {
        thissum = 0;
        for (j = i; j < n; j++) {
            thissum += a[j];
            if (thissum > maxsum)
                maxsum = thissum;
        }
    }

Analysis of Improved Algorithm

The inner loop runs from j = i to n-1: n times when i = 0, n-1 times when i = 1, and so on down to 1 time when i = n-1. Summing this up backwards, we get 1 + 2 + ... + n = n(n+1)/2 = (n^2 + n)/2, so this algorithm is O(n^2).

Final Great Algorithm

    maxsum = 0; thissum = 0;
    for (j = 0; j < n; j++) {
        thissum += a[j];
        if (thissum > maxsum)
            maxsum = thissum;
        else if (thissum < 0)
            thissum = 0;
    }

This is O(n).
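Packaged as a self-contained program (a sketch; the wrapper function name and the sample data are illustrative, not from the slides):

    #include <iostream>

    // Largest sum over all contiguous subsequences of a[0..n-1].
    // A running sum that goes negative can never help a later
    // subsequence, so it is reset to 0; one pass suffices.
    int max_subsequence_sum(const int* a, int n) {
        int maxsum = 0, thissum = 0;
        for (int j = 0; j < n; j++) {
            thissum += a[j];
            if (thissum > maxsum)
                maxsum = thissum;
            else if (thissum < 0)
                thissum = 0;
        }
        return maxsum;
    }

    int main() {
        int a[] = {4, -3, 5, -2, -1, 2, 6, -2};
        std::cout << max_subsequence_sum(a, 8) << "\n";  // prints 11 (4-3+5-2-1+2+6)
        return 0;
    }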


Analyzing Problems

Upper bound: the upper bound of the best known algorithm to solve the problem.

Lower bound: the lower bound for every possible algorithm to solve that problem, even unknown algorithms.

Example: sorting.
- Cost of I/O: Ω(n).
- Bubble or insertion sort: O(n^2).
- A better sort (Quicksort, Mergesort, Heapsort): O(n log n).
- We prove in Chapter 8 that sorting is Ω(n log n).

Multiple Parameters

Compute the rank ordering for all C pixel values in a picture of P pixels.

Monitors have a fixed number of colors (256, 16M, 64M). We need to count the number of each color and determine the most used and least used colors.

    for (i = 0; i < C; i++)      // initialize the count array
        count[i] = 0;
    for (i = 0; i < P; i++)      // examine every pixel
        count[value(i)]++;
    sort(count);                 // sort the C color counts

If we use P as the measure, then the time is O(P log P). Which is bigger, C or P? 600x400 = 240,000; 1024x1024 = 1M. More accurate is O(P + C log C): O(P) to count the pixels plus O(C log C) to sort the color counts.
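To make the difference concrete (numbers added for illustration), take the 600x400 picture with C = 256 colors:

\[
P\log P \approx 240{,}000 \times \log_2 240{,}000 \approx 4.3\times 10^6,
\qquad
P + C\log C = 240{,}000 + 256\times 8 \approx 2.4\times 10^5 .
\]

Counting the pixels and sorting only the C color counts is roughly an order of magnitude less work than sorting something of size P.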


Space Bounds

Space bounds can also be analyzed with asymptotic complexity analysis.

- Time: algorithm. Space: data structure.
- Space/Time Tradeoff Principle: one can often achieve a reduction in time if one is willing to sacrifice space, or vice versa.
  - Encoding or packing information: a Boolean flag takes one bit, but a byte is the smallest unit of storage, so pack 8 booleans into 1 byte (see the sketch after this list). Takes more time, less space.
  - Table lookup: factorials, for example, can be computed once and used many times.
- Disk-Based Space/Time Tradeoff Principle: the smaller you can make your disk storage requirements, the faster your program will run, because disk is slow.
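A minimal sketch of the flag-packing idea (the struct name is illustrative, not from the slides): eight Boolean flags stored in one byte, trading a shift and a mask on each access (more time) for an eightfold space saving:

    #include <cstdint>

    // Eight Boolean flags packed into a single byte.
    struct PackedFlags {
        std::uint8_t bits = 0;
        bool get(int i) const {               // read flag i (0..7)
            return (bits >> i) & 1u;
        }
        void set(int i, bool v) {             // write flag i (0..7)
            if (v) bits |= (std::uint8_t)(1u << i);
            else   bits &= (std::uint8_t)~(1u << i);
        }
    };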