Data Structures and Algorithms


Week 2 Dr. Ken Cosh

Week 1 Review

• Introduction to Data Structures and Algorithms • Background – Computer Programming in C++ – Mathematical Background

• Arrays • Vectors • Strings

Week 2 Review

Week 2 Topics

• Complexity Analysis – Computational and Asymptotic Complexity – Big-O Notation – Properties of Big-O Notation – Amortized Complexity – NP-Completeness

Computational Complexity

• The same problem can be solved using many different algorithms; – Factorials can be calculated iteratively or recursively – Sorting can be done using shellsort, heapsort, quicksort etc.
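As a small illustration of "same problem, different algorithms", here is a sketch of the factorial computed both ways. The function names are made up for this example, and the 64-bit result type is an assumption that overflows beyond 20!.

```cpp
#include <cstdint>

// Iterative factorial: a single loop and constant extra space.
std::uint64_t factorialIterative(unsigned n) {
    std::uint64_t result = 1;
    for (unsigned i = 2; i <= n; ++i)
        result *= i;
    return result;
}

// Recursive factorial: same answer, but n nested calls sit on the stack.
std::uint64_t factorialRecursive(unsigned n) {
    return n <= 1 ? 1 : n * factorialRecursive(n - 1);
}
```

Both return 120 for n = 5; they differ only in how much work and memory they spend getting there.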

• So how do we know what the best algorithm is? And why do we need to know?

Why do we need to know?

• If searching for a value amongst 10 values, as with many of the exercises we have encountered while learning computer programming, the efficiency of the program is maybe not as significant as getting the job done. • However, if we are looking for a value amongst several trillion values, as only one step in a longer algorithm, establishing the most efficient searching algorithm is very significant.

How do we find the most efficient algorithm?

• To compare the efficiency of algorithms, computational complexity can be used.

• Computational Complexity is a measure of how much effort is needed to apply an algorithm, or how much it costs.

• An algorithm’s cost can be considered in different ways, but for our purposes Time and Space are critical, Time being the most significant.

Computational Complexity Considerations

• Computational Complexity is both platform / system and language dependent; – An algorithm will run faster on my PC at home than on the PCs in the lab.

– A precompiled program written in C++ is likely to be much faster than the same program written in Basic.

• Therefore to compare algorithms all should be run on the same machine.

Computational Complexity Considerations II

• When comparing algorithm efficiencies, real-time units such as nanoseconds need not be used.

• Instead logical units representing the relationship between ‘n’ the size of a file, and ‘t’ the time taken to process the data should be used.

Time / Size relationships

• Linear – If t = cn, then an increase in the size of data increases the execution time by the same factor. • Logarithmic – If t = log_2 n, then doubling the size ‘n’ increases ‘t’ by one time unit.
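A quick way to see the logarithmic relationship is to count how often n can be halved. This is only a sketch; `halvingSteps` is a name invented for this example.

```cpp
#include <cstdint>

// Number of times n can be halved before reaching 1, i.e. floor(log_2 n).
unsigned halvingSteps(std::uint64_t n) {
    unsigned steps = 0;
    while (n > 1) {
        n /= 2;
        ++steps;
    }
    return steps;
}
```

Going from n = 1024 to n = 2048 doubles the data, but the step count only rises from 10 to 11.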

Asymptotic Complexity

• Functions representing ‘n’ and ‘t’ are normally much more complex, but calculating such a function is only important when considering large bodies of data, large ‘n’.

• Ergo, any terms which don’t significantly affect the outcome of the function can be eliminated, producing a function which approximates the original function’s efficiency. This is called Asymptotic Complexity.

Example I

• Consider this example; – F(n) = n^2 + 100n + log_{10} n + 1000 • For small values of n, the final term is the most significant.

• However as n grows, the first term becomes most significant. Hence for large ‘n’ it isn’t worth considering the final term – how about the penultimate term?

Example II

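Each term of F(n) is shown with its value and its share (%) of F(n).

n | F(n) | n^2 value | n^2 % | 100n value | 100n % | log_{10} n value | log_{10} n % | 1000 value | 1000 %
1 | 1,101 | 1 | 0.1 | 100 | 9.1 | 0 | 0.0 | 1,000 | 90.83
10 | 2,101 | 100 | 4.76 | 1,000 | 47.6 | 1 | 0.05 | 1,000 | 47.6
100 | 21,002 | 10,000 | 47.6 | 10,000 | 47.6 | 2 | 0.001 | 1,000 | 4.76
1,000 | 1,101,003 | 1,000,000 | 90.8 | 100,000 | 9.1 | 3 | 0.0003 | 1,000 | 0.09
10,000 | 101,001,004 | 100,000,000 | 99.0 | 1,000,000 | 0.99 | 4 | 0.0 | 1,000 | 0.001
100,000 | 10,010,001,005 | 10,000,000,000 | 99.9 | 10,000,000 | 0.099 | 5 | 0.0 | 1,000 | 0.00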
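The percentages for F(n) = n^2 + 100n + log_{10} n + 1000 can be recomputed with a few lines of code. This is a sketch; the helper names `F`, `quadraticShare` and `constantShare` are assumptions made for this example.

```cpp
#include <cmath>

// F(n) = n^2 + 100n + log10(n) + 1000, the function from the example.
double F(double n) {
    return n * n + 100.0 * n + std::log10(n) + 1000.0;
}

// Share (in percent) that the n^2 term contributes to F(n).
double quadraticShare(double n) {
    return 100.0 * n * n / F(n);
}

// Share (in percent) that the constant term 1000 contributes to F(n).
double constantShare(double n) {
    return 100.0 * 1000.0 / F(n);
}
```

At n = 1 the constant term accounts for roughly 90.8% of F(n); at n = 1,000 it is the n^2 term that accounts for roughly 90.8%.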

Big-O Notation

• Given 2 positively-valued functions (f() and g()); – f(n) is O(g(n)) if (c>0 and N>0) exist such that f(n) ≤ cg(n) for all n ≥ N.

(In other words, f is big-O of g if there is a positive number c such that f is not larger than cg for sufficiently large n; that is, for all n larger than some number N.)

– The relationship between f and g is that g(n) is an upper bound of f(n), or that in the long run f grows at most as fast as g.

Big-O Notation problems

• The problem with the definition is that while c and N must exist, no help is given towards calculating them • No restrictions are given for these values.

• No guidance for choosing values when more than one exist.

• The choice for g() is infinite! (So when dealing with Big-O the smallest g() is chosen.)

Example I

• Consider; • f(n) = 2n^2 + 3n + 1 = O(n^2) • When g(n) = n^2, candidates for c and N can be calculated using the following inequality;
– 2n^2 + 3n + 1 ≤ cn^2
– 2 + (3/n) + (1/n^2) ≤ c
• If n = 1, c ≥ 6. If n = 2, c ≥ 3.75. If n = 3, c ≥ 3.111. If n = 4, c ≥ 2.8125, and so on.

Example II

• So which pair of c & N?

– Choose the best pair by determining when a term in f becomes the largest and stays the largest. In our equation only 2n^2 and 3n are candidates. Comparing them, 2n^2 > 3n holds true for every integer n > 1, hence N = 2 can be chosen, which gives c ≥ 3.75.

• But what’s the practical significance of c and N?

– For any g an infinite number of pairs of c & N can be calculated.

– g is ‘almost always’ greater than or equal to f when multiplied by a constant. ‘Almost always’ means whenever n is at least N. The constant then depends on the value of N chosen.
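A candidate pair (c, N) for f(n) = 2n^2 + 3n + 1 and g(n) = n^2 can be spot-checked numerically. The sketch below only tests a finite range, so it illustrates the definition rather than proving it; `boundHolds` and the `limit` parameter are assumptions of this example.

```cpp
// f(n) = 2n^2 + 3n + 1 and g(n) = n^2, as in the example.
long long f(long long n) { return 2 * n * n + 3 * n + 1; }
long long g(long long n) { return n * n; }

// True if f(n) <= c * g(n) for every n in [N, limit].
bool boundHolds(double c, long long N, long long limit) {
    for (long long n = N; n <= limit; ++n)
        if (f(n) > c * g(n))
            return false;
    return true;
}
```

With c = 3.75 the bound holds from N = 2 onwards but fails at n = 1, where c = 6 is needed.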

Big-O

• Big-O is used to give an asymptotic upper bound for a function, i.e. an approximation of the upper bound of a function which is difficult to formulate.

• Just as there is an upper bound, there is a lower bound (Big Ω), we’ll come on to that shortly… • But first, some useful properties of Big-O.

Fact 1 - Transitivity

• If f(n) is O(g(n)) and g(n) is O(h(n)), then f(n) is O(h(n)) – or O(O(g(n))) is O(g(n)).

• Proof:
– c_1 and N_1 exist so that f(n) ≤ c_1 g(n) for all n ≥ N_1.
– c_2 and N_2 exist so that g(n) ≤ c_2 h(n) for all n ≥ N_2.
– c_1 g(n) ≤ c_1 c_2 h(n) for all n ≥ N, where N = the larger of N_1 and N_2.
– Hence if c = c_1 c_2, f(n) ≤ c h(n) for all n ≥ N.
– f(n) is O(h(n)).

Fact 2

• If f(n) is O(h(n)) and g(n) is O(h(n)), then f(n) + g(n) is O(h(n)).

• Proof: – If f(n) ≤ c_1 h(n) and g(n) ≤ c_2 h(n) for sufficiently large n, then setting c = c_1 + c_2 gives f(n) + g(n) ≤ c h(n).

Fact 3

• The function an^k is O(n^k). • Proof: – For the inequality an^k ≤ cn^k to hold, c ≥ a is necessary.

Fact 4

• The function n^k is O(n^{k+j}) for any positive j.

• Proof: – This is true if c = N = 1.

• From this, it is clear that every polynomial is big-O of n raised to the largest power; – f(n) = a_k n^k + a_{k-1} n^{k-1} + … + a_1 n + a_0 is O(n^k)

Big-O and Logarithms

• First let’s state that if the complexity of an algorithm is on the order of a logarithmic function, it is very good! (Check out slide 12).

• Second let’s state that despite that, there are an infinite number of better functions, although very few are useful, e.g. O(lg lg n) or O(1).

• Therefore, it is important to understand big-O when it comes to Logarithms.

Fact 5 - Logarithms

• The function log_a n is O(log_b n) for any positive a and b, where a ≠ 1 and b ≠ 1.

• This means that regardless of their bases all logarithmic functions are big-O of each other; i.e. all have the same rate of growth.

• Proof:
– Let log_a n = x and log_b n = y, i.e. a^x = n and b^y = n.
– Taking ln of both sides gives x ln a = ln n and y ln b = ln n.
– x ln a = y ln b
– ln a · log_a n = ln b · log_b n
– log_a n = (ln b / ln a) log_b n = c log_b n
– Hence log_a n and log_b n are multiples of each other.

Fact 5 (cont.)

• Because the base of a logarithm is irrelevant in terms of big-O we can use just one base; – log_a n is O(lg n) for any positive a ≠ 1, where lg n = log_2 n
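As a worked instance of the base-change identity from the proof, take a = 10 and b = 2. The choice of n = 1024 is just a convenient example value.

```latex
\[
\log_{10} n \;=\; \frac{\ln 2}{\ln 10}\,\log_2 n \;\approx\; 0.301\,\lg n,
\qquad \text{e.g. } n = 1024:\quad \lg n = 10,\quad \log_{10} n \approx 3.01 .
\]
```

The constant 0.301 plays the role of c in the definition of big-O, so the two logarithms differ only by a constant factor.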

Big Ω

• Big-O refers to the upper bound of functions. The opposite of this is a definition for the lower bound of functions, known as big Ω (big omega) – f(n) is Ω(g(n)) if (c>0 and N>0) exist such that f(n) ≥ cg(n) for all n ≥ N.

(In other words, f is big Ω of g if there is a positive number c such that f is at least equal to cg for almost all n; that is, for all n larger than some number N.)

– The relationship between f and g is that g(n) is a lower bound of f(n), or that in the long run f grows at least as fast as g.

Big Ω example

• Consider: • f(n) = 2n^2 + 3n + 1 = Ω(n^2) • When g(n) = n^2, candidates for c and N can be calculated using the following inequality;
– 2n^2 + 3n + 1 ≥ cn^2
– 2 + (3/n) + (1/n^2) ≥ c
• As we saw before, in this equation the left-hand side tends towards 2 as n grows, hence the proposal is true for all c ≤ 2.

Big Ω

• f(n) is Ω(g(n)) iff g(n) is O(f(n)) • There is a clear relationship between big-Ω and big-O, and the same problems and facts (in reverse) hold true in both cases; – There are still infinite numbers of big-Ω equations.

• Therefore we can explore the relationship between big-O and big Ω further by introducing big Θ (theta), which restricts the sets of possible upper and lower bounds.

Big Θ

• f(n) is Θ(g(n)) if c_1, c_2, N > 0 exist such that c_1 g(n) ≤ f(n) ≤ c_2 g(n) for all n ≥ N.

• From this, f(n) is Θ(g(n)) if both functions grow at the same rate in the long run.

O, Ω & Θ

• For the function; – f(n) = 2n^2 + 3n + 1 • Options for big-O include; – g(n) = n^2, g(n) = n^3, g(n) = n^4 etc.

• Options for big-Ω include; – g(n) = n^2, g(n) = n, g(n) = n^{1/2} • Options for big-Θ include; – g(n) = n^2, g(n) = 2n^2, g(n) = 3n^2 • Therefore, while there are still an infinite number of equations to choose from, it is obvious which equation should be chosen.

Possible problems with Big-O

• Given the rules of Big-O, an equation g(n) can be chosen such that f(n) ≤ cg(n), assuming the constant c is large enough. • As c grows, the number of exceptions (essentially n) is reduced.

• If c = 10^8, g(n) might not be very useful for approximating f(n), as our algorithm may never need to perform 10^8 operations.

• This may lead to algorithms being rejected unnecessarily. • If c is too large for practical significance, f(n) is said to be OO of g(n) (double O); however, ‘too large’ depends upon the application.

Why Complexity Analysis?

• Today’s computers can perform millions of operations per second at relatively low cost, so why complexity analysis?

– Take a PC that can perform 1 million operations per second and 1 million items to be processed. • A quadratic algorithm, O(n^2), would take 11.6 days. • A cubic algorithm, O(n^3), would take 31,709 years.

• An exponential algorithm, O(2^n), is not worth thinking about.

Why Complexity Analysis

• Even with a 1,000 times improvement in processing power (consider Moore’s Law):

– The cubic algorithm would still take over 31 years.

– The quadratic algorithm would still take over 16 minutes.

• To make scalable programs, algorithm complexity does need to be analysed.

Complexity Classes

1 operation per μsec (microsecond), n = 10 items to be processed.

• Constant = O(1) = 1 μsec • Logarithmic = O(lg n) = 3 μsec • Linear = O(n) = 10 μsec • O(n lg n) = 33.2 μsec • Quadratic = O(n^2) = 100 μsec • Cubic = O(n^3) = 1 msec • Exponential = O(2^n) = 1,024 μsec, about 1 msec

Complexity Classes

1 operation per μsec (microsecond), n = 10^2 items to be processed.

• Constant = O(1) = 1 μsec • Logarithmic = O(lg n) = 7 μsec • Linear = O(n) = 100 μsec • O(n lg n) = 664 μsec • Quadratic = O(n^2) = 10 msec • Cubic = O(n^3) = 1 sec • Exponential = O(2^n) = about 4*10^16 yrs

Complexity Classes

1 operation per μsec (microsecond), n = 10^3 items to be processed.

• Constant = O(1) = 1 μsec • Logarithmic = O(lg n) = 10 μsec • Linear = O(n) = 1 msec • O(n lg n) = 10 msec • Quadratic = O(n^2) = 1 sec • Cubic = O(n^3) = 16.7 min

• Exponential = O(2^n) = ……

Complexity Classes

1 operation per μsec (microsecond), n = 10^4 items to be processed.

• Constant = O(1) = 1 μsec • Logarithmic = O(lg n) = 13 μsec • Linear = O(n) = 10 msec • O(n lg n) = 133 msec • Quadratic = O(n^2) = 1.7 min • Cubic = O(n^3) = 11.6 days

Complexity Classes

1 operation per μsec (microsecond), n = 10^5 items to be processed.

• Constant = O(1) = 1 μsec • Logarithmic = O(lg n) = 17 μsec • Linear = O(n) = 0.1 sec • O(n lg n) = 1.7 sec • Quadratic = O(n^2) = 2.8 hours • Cubic = O(n^3) = 31.7 years

Complexity Classes

1 operation per μsec (microsecond), n = 10^6 items to be processed.

• Constant = O(1) = 1 μsec • Logarithmic = O(lg n) = 20 μsec • Linear = O(n) = 1 sec • O(n lg n) = 20 sec • Quadratic = O(n^2) = 11.6 days • Cubic = O(n^3) = 31,709 years
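The running times on these slides follow directly from the operation counts. Here is a minimal sketch, assuming exactly one operation per microsecond; the function names are invented for this example.

```cpp
#include <cmath>

// Seconds needed for an input of size n at 1 operation per microsecond.
double secondsLinear(double n)    { return n * 1e-6; }
double secondsNLogN(double n)     { return n * std::log2(n) * 1e-6; }
double secondsQuadratic(double n) { return n * n * 1e-6; }
double secondsCubic(double n)     { return n * n * n * 1e-6; }
```

For n = 10^6, `secondsQuadratic` gives 10^6 seconds, which is about 11.6 days once divided by 86,400 seconds per day.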

Asymptotic Complexity Example

• Consider this simple code;

for (i = sum = 0; i < n; i++)
    sum += a[i];

– First, 2 variables are initialised.

– The loop executes n times, with 2 assignments each time (one updates sum and one updates i).

– Thus there are 2+2n assignments for this code, and so an Asymptotic Complexity of O(n).
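The 2+2n figure can be checked by instrumenting the loop with a counter. This is a sketch; the counter and the function name `countAssignmentsSum` are additions for illustration and not part of the original code.

```cpp
#include <cstddef>
#include <vector>

// Sums the array and counts every assignment the loop makes.
long long countAssignmentsSum(const std::vector<int>& a) {
    long long assignments = 2;            // i = 0 and sum = 0
    int sum = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += a[i];
        assignments += 2;                 // one for sum += a[i], one for i++
    }
    (void)sum;                            // the sum itself is not needed here
    return assignments;
}
```

For an array of 10 elements the function returns 22, i.e. 2 + 2·10.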

Asymptotic Complexity Example 2

• Consider this code;

for (i = 0; i < n; i++) {
    for (j = 1, sum = a[0]; j <= i; j++)
        sum += a[j];
    cout << "sum for subarray 0 through " << i << " is " << …

Asymptotic Complexity Example 2 (cont.)

• Therefore there are 1 + 3n + n(n-1) assignments before the program completes, which is O(n^2).
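The count can be reproduced by instrumenting the nested loops: 1 assignment for i = 0, then 3 per outer iteration (j, sum and i++), and 2 per inner iteration (sum and j++). The sketch below leaves out the printing, and the function name is an assumption of this example.

```cpp
#include <cstddef>
#include <vector>

// Counts the assignments made by the nested prefix-sum loops.
long long countAssignmentsPrefixSums(const std::vector<int>& a) {
    const std::size_t n = a.size();
    long long assignments = 1;                    // i = 0
    for (std::size_t i = 0; i < n; ++i) {
        int sum = a[0];
        assignments += 2;                         // j = 1 and sum = a[0]
        for (std::size_t j = 1; j <= i; ++j) {
            sum += a[j];
            assignments += 2;                     // sum += a[j] and j++
        }
        (void)sum;                                // printing omitted in this sketch
        assignments += 1;                         // i++
    }
    return assignments;
}
```

For n = 10 this gives 121, matching 1 + 3·10 + 10·9.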

Asymptotic Complexity 3

• Consider this refinement;

for (i = 4; i < n; i++) {
    for (j = i - 3, sum = a[i-4]; j <= i; j++)
        sum += a[j];
    cout << "sum for subarray " << …
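In this refinement the inner loop always runs exactly four times (j goes from i-3 to i), so the total work is linear in n. The sketch below counts assignments in the same way as the earlier examples; storing each sum in a vector is an assumption that stands in for the output statement, and `windowSums` is a name invented here.

```cpp
#include <cstddef>
#include <vector>

// Returns the sum of a[i-4..i] for i = 4..n-1 and counts the assignments made.
std::vector<int> windowSums(const std::vector<int>& a, long long& assignments) {
    std::vector<int> sums;
    assignments = 1;                              // i = 4
    for (std::size_t i = 4; i < a.size(); ++i) {
        int sum = a[i - 4];
        assignments += 2;                         // j = i - 3 and sum = a[i-4]
        for (std::size_t j = i - 3; j <= i; ++j) {
            sum += a[j];
            assignments += 2;                     // sum += a[j] and j++
        }
        sums.push_back(sum);                      // stands in for the output line
        assignments += 1;                         // i++
    }
    return sums;
}
```

By this count each outer iteration costs 3 + 4·2 = 11 assignments, giving 1 + 11(n-4) in total for n ≥ 4, which is O(n).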

The Number Game

• I’ve picked a number between 1 and 10 – can you guess what it is?

– Take a guess, and I’ll tell you if it’s higher or lower than your guess.

The Number Game

• There are several approaches you could take; – Guess 1, if wrong guess 2, if wrong guess 3, etc.

– Alternatively, guess the midpoint 5. If lower guess halfway between 1 and 5, maybe 3 etc.

• Which is better?

– It depends on what the number was! But, in each option there is a best, worst and average case.

Average Case Complexity

• Best Case; – Number of steps is smallest • Worst Case; – Number of steps is maximum • Average Case; – Somewhere in between.

– Could calculate as the sum of the number of steps for each input divided by the number of inputs. But this assumes each input has equal probability.

– So we weight calculation with the probability of each input.

Method 1

• Choose 1, if wrong choose 2, if wrong choose 3… – Probability of success on the 1st try = 1/n – Probability of success on the 2nd try = 1/n – Probability of success on the nth try = 1/n • Average; (1+2+…+n) / n = (n+1)/2

Method 2

• Picking midpoints; – Method 2 is actually like searching a binary tree, so we will leave a full calculation until week 6, as right now the maths could get complicated.

– But for n=10, you should be able to calculate the average case – try it! (When n=10 I make it 1.9 times as efficient)
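For small n both averages can simply be counted by trying every possible number. The sketch below assumes each number from 1 to n is equally likely and that Method 2 rounds the midpoint down; all function names are made up for this example.

```cpp
// Guesses Method 1 (1, 2, 3, ...) needs to find target.
int linearGuesses(int target) { return target; }

// Guesses Method 2 (always guess the midpoint of what remains) needs.
int midpointGuesses(int target, int n) {
    int low = 1, high = n, guesses = 0;
    while (low <= high) {
        int mid = (low + high) / 2;       // midpoint, rounded down
        ++guesses;
        if (mid == target) break;
        if (target < mid) high = mid - 1;
        else low = mid + 1;
    }
    return guesses;
}

// Average number of guesses over all equally likely targets 1..n.
double averageLinear(int n) {
    double total = 0;
    for (int t = 1; t <= n; ++t) total += linearGuesses(t);
    return total / n;
}

double averageMidpoint(int n) {
    double total = 0;
    for (int t = 1; t <= n; ++t) total += midpointGuesses(t, n);
    return total / n;
}
```

For n = 10 the averages come out as 5.5 and 2.9 guesses, a ratio of about 1.9.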

Average Case Complexity

• Calculating Average Case Complexity can be difficult, even if the probabilities are equal, so calculating approximations in the form of big-O, big Ω and big-Θ can simplify the task.

Amortized Complexity

• Thus far we have considered simple algorithms independently from any others, however it’s more likely these algorithms are part of a larger problem. • To calculate the best, worst and average case for the whole sequence, we could simply add the best, worst and average cases for each algorithm in the sequence; C_worst(op_1, op_2, op_3, …) = C_worst(op_1) + C_worst(op_2) + C_worst(op_3) + …

Grades Case

• Suppose I create an array in which to store student grades. I then enter the midterm grades and sort the array best to worst. Next I enter the coursework grades, and then sort the array best to worst. Finally I enter the final exam grades and sort the array best to worst.

• This is a sequence of algorithms; – Input Values – Sort Array – Input Values – Sort Array – Input Values – Sort Array

Grades Case

• However, is it fair to calculate the worst case for this program by adding the worst cases for each step?

• Is it fair to use the worst case ‘Sort Array’ cost for sorting the array every time, even after it has only changed slightly?

• Is it likely that the array will need a complete rearrangement after the coursework grade has been added? i.e. is it likely that the student who receives the lowest mid term grade then has the highest score after midterm and coursework are included?

Grades Case

• In reality it is unlikely that the worst case scenario will ever be run for the 2 nd and 3 rd array sorts, so how do we approximate an accurate worst case when combining a sequence of operations?

– Steal from the rich, and give to the poor.

– Add a little to the quick operations and take a little from the expensive operations.

– Overcharge cheap operations, undercharge expensive ones.

Bangkok

• I want to drive to Bangkok – how long will it take?

– Average Case?

– Best Case?

– Worst Case?

• How do you come to your answer?

• <vector> is a library defining the vector data structure. This is how it works; – Add elements to the vector when there is space through push_back.

– When capacity is reached add to capacity through reserve.

• Suppose each time the capacity is full, we double the size of the vector – how can we estimate an amortized cost of filling the vector?

• Case of adding an element to a vector with space: – Copy new values into first available cell.

– O(1) • Case of adding an element to a full vector: – Copy existing values to new space – Add new value – O(size(vector)+1) – i.e. if the vector capacity and size is 4, the cost of adding an element would be 4+1.

Amortized cost = 2

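Size | Capacity | Amortized Cost | Cost | Units Left
1 | 1 | 2 | 0+1 | 1
2 | 2 | 2 | 1+1 | 1
3 | 4 | 2 | 2+1 | 0
4 | 4 | 2 | 1 | 1
5 | 8 | 2 | 4+1 | -2
6 | 8 | 2 | 1 | -1
7 | 8 | 2 | 1 | 0
8 | 8 | 2 | 1 | 1
9 | 16 | 2 | 8+1 | -6
10 | 16 | 2 | 1 | -5
… | … | … | … | …
17 | 32 | 2 | 16+1 | -14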

Amortized cost = 3

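Size | Capacity | Amortized Cost | Cost | Units Left
1 | 1 | 3 | 0+1 | 2
2 | 2 | 3 | 1+1 | 3
3 | 4 | 3 | 2+1 | 3
4 | 4 | 3 | 1 | 5
5 | 8 | 3 | 4+1 | 3
6 | 8 | 3 | 1 | 5
7 | 8 | 3 | 1 | 7
8 | 8 | 3 | 1 | 9
9 | 16 | 3 | 8+1 | 3
10 | 16 | 3 | 1 | 5
… | … | … | … | …
17 | 32 | 3 | 16+1 | 3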

Amortized Cost

• From the previous 2 tables it can be seen that an amortized cost of ‘2’ is not enough.

• With an Amortized cost of 3, there are sufficient units left over to cover expensive operations.

• Finding an acceptable amortized cost is however not always that easy.
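The "Units Left" bookkeeping can be simulated directly. The sketch below models the cost scheme from the slides (capacity doubles when full, and a full push costs one unit per copied element plus one); it is not a description of how any particular std::vector implementation grows, and `unitsLeft` is a name invented here.

```cpp
#include <cstddef>
#include <vector>

// Simulates n push_back operations, charging `charge` units per operation.
// Returns the running balance of units after each operation.
std::vector<long long> unitsLeft(std::size_t n, long long charge) {
    std::vector<long long> balance;
    std::size_t size = 0, capacity = 0;
    long long units = 0;
    for (std::size_t op = 0; op < n; ++op) {
        long long cost = 1;                           // writing the new element
        if (size == capacity) {                       // full: copy everything over
            cost += static_cast<long long>(size);
            capacity = (capacity == 0) ? 1 : 2 * capacity;
        }
        ++size;
        units += charge - cost;
        balance.push_back(units);
    }
    return balance;
}
```

With a charge of 2 the balance is -14 after the 17th insertion; with a charge of 3 it is 3 at that point and never drops below zero in this range.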

Difficult Problems

• It would be ideal if problems were of class constant, linear or logarithmic.

• However, many problems we will look at are polynomial class problems (quadratic / cubic or worse), known as class P. • Unfortunately, there are many important problems whose best algorithms are very complex, sometimes taking exponential time (and in fact sometimes worse!) • As well as EXPTIME problems there is another class of problem called NP-Complete, which is bad news: ‘evidence’ that some problems just can’t be solved easily.

NP-Complete

• Why worry about it?

– Knowing that some problems are NP Complete saves you blindly trying to find a solution to them.

NP-Complete

• Background – P refers to the class of problems which can be solved in polynomial time.

– NP stands for “Non-deterministic Polynomial Time” • Essentially here, we can test whether a proposed solution is correct fairly quickly, but finding a solution is difficult. There is no problem with an NP problem if we could only guess the right solution!

NP Examples

• Long Simple Path – Finding a path through a graph from A to B traveling over every vertex once and only once is very difficult, but if I tell you a solution path it is relatively simple for you to check it. The Traveling Salesman Problem is a long ongoing problem, with huge financial rewards for a successful solution!

• Cracking Cryptography – It’s difficult to break encryption, but if I give you a solution, it is easy to test it works.

• Infinite Loop Checking – Ever wondered why your compiler doesn’t tell you you’ve got an infinite loop? This problem is actually much harder than NP – a class of complexity known as ‘undecidable’
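To see why checking a proposed path is the easy part, here is a sketch of a verifier for the path example: it only confirms that a given path visits every vertex exactly once along existing edges. The adjacency-matrix representation and the name `isValidPath` are assumptions of this example.

```cpp
#include <cstddef>
#include <vector>

// Checks a proposed path in an undirected graph given as an adjacency matrix.
// Valid if it visits every vertex exactly once and each step follows an edge.
bool isValidPath(const std::vector<std::vector<bool>>& adjacent,
                 const std::vector<std::size_t>& path) {
    const std::size_t n = adjacent.size();
    if (path.size() != n) return false;
    std::vector<bool> seen(n, false);
    for (std::size_t k = 0; k < n; ++k) {
        const std::size_t v = path[k];
        if (v >= n || seen[v]) return false;          // unknown or repeated vertex
        seen[v] = true;
        if (k > 0 && !adjacent[path[k - 1]][v])
            return false;                             // no edge between neighbours
    }
    return true;
}
```

The check is a single pass over the path, so it takes polynomial time, whereas no polynomial-time method is known for finding such a path in the first place.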

P vs NP

• Arguably one of the most famous current theoretical science debates concerns whether P=NP, with many theoreticians divided.

– While all P problems are NP, is the reverse true?

– If it is always easy to check a solution, should it also be easy to find the solution? Can you prove either way?

• This leads us to a complexity framework containing problems which are known to be in NP, but which we can’t prove are not in P, and this is where NP-Complete fits in.

NP-Complete

• NP-Complete problems are the hardest problems within NP, which are not known to have solutions in polynomial time.

• We are still left with the problem of identifying NP-Complete problems.

– How can we prove that a problem is “not known to” have a solution in polynomial time?

• (Rather than just a problem we haven’t solved?)

Reduction

• We can often reduce problems; – Problem A can be solved by an algorithm involving a number of calls to Algorithm B.

– The number of calls could be 1, a constant or polynomial.

– If Algorithm B is P, then this demonstrates that Problem A is also P.

• The same theory applies to NP-Complete problems. – A problem is NP-Complete if it is NP, and all other NP problems can be polynomially reduced to it.

– The astute will realise that to prove a problem is NP-Complete it takes a problem which has already been proved to be NP Complete. A kind of Chicken and Egg scenario, where fortunately Cook’s satisfiability problem came first.

• We will encounter more NP-Complete problems when dealing with graphs later in the course.