Mastering Algorithms: Space and Time Complexity

Learning objectives:
• Understand that algorithms can be compared by expressing their complexity as a function relative to the size of the problem.
• Understand that some algorithms are more time-efficient than other algorithms.
• Understand that some algorithms are more space-efficient than other algorithms.
• Big-O notation
• Linear time, polynomial time, exponential time
• Python (try it yourself tasks)
• Order of complexity

The P vs NP Problem

Ever heard of the P vs NP problem? Before we do anything else, let me introduce you to this fabulous problem. It remains (at the time of writing) one of the major unsolved problems in computer science. You could be the one to solve it! But first... what is it? What does it have to do with computing and algorithms? All these questions will be answered!

Why does the question matter?

The P versus NP problem is a major unsolved problem in computer science. Informally, it asks whether every problem whose solution can be quickly verified by a computer can also be quickly solved by a computer. It was introduced in 1971 by Stephen Cook in his seminal paper "The complexity of theorem proving procedures" and is considered by many to be the most important open problem in the field. It is one of the seven Millennium Prize Problems selected by the Clay Mathematics Institute, carrying a US$1,000,000 prize for the first correct solution.

And it gets better... Aside from being an important problem in computational theory, a proof either way would have profound implications for mathematics, philosophy, cryptography, algorithm research, artificial intelligence, game theory, multimedia processing and many other fields. Really, if a solution were discovered, it could change the world as we know it!

Complexity theory is a subfield of computer science, and one of the main questions behind P vs NP is: "How powerful can computers really be?" What do you think?
Imagine an algorithm that could literally solve any problem. Hmmm...

Decision problems

Life is full of all kinds of problems: relationship problems, money problems... and I could go on. But in maths and computing there is a type of problem called a DECISION PROBLEM. Given an input x (a set of data), does x have a given property? We signal the answer with an output of "yes" or "no".

Here's an example, the EVEN-SUM problem. Input: 2, 3, 5, 9. Question: "Is the sum of 2, 3, 5 and 9 even?" Here the answer would be... nope! That's because 2 + 3 + 5 + 9 = 19, which is an odd number.

Given a problem like EVEN-SUM, the goal is to create a method by which to solve any instance of the problem. This method is called an algorithm, and an algorithm must work correctly on every possible input to be considered correct. A decision problem may have multiple algorithms that solve it. For example, two different algorithms can solve EVEN-SUM:
1. Add up all the numbers and output Yes if the sum is even, or No if the sum is not even.
2. Output Yes if an even number of the input numbers are odd, or No otherwise.

Which method is better? Which is more time-efficient? Both of them solve the problem, but one of them is cleverer, doing less work along the way. Instead of adding up all the numbers, the second algorithm makes use of the fact that the sum of two odd numbers is even. Genius! The second algorithm is more efficient than the first in the sense that as the size of the input increases, the number of steps required to solve the problem increases more slowly.
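The two EVEN-SUM algorithms can be sketched in Python. This is a minimal sketch (the slides give no code for this example, so the function names are my own):

```python
def even_sum_v1(numbers):
    # Algorithm 1: add up all the numbers, then test the total.
    return sum(numbers) % 2 == 0

def even_sum_v2(numbers):
    # Algorithm 2: never compute the sum. Odd numbers pair up to make
    # even sums, so the total is even exactly when the count of odd
    # inputs is even.
    odds = sum(1 for n in numbers if n % 2 == 1)
    return odds % 2 == 0

print(even_sum_v1([2, 3, 5, 9]))  # False: 2 + 3 + 5 + 9 = 19, which is odd
print(even_sum_v2([2, 3, 5, 9]))  # False: three odd inputs, and 3 is odd
```

Both functions answer the same yes/no question; the second only ever needs a parity check per item rather than full additions.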
Let's look at those key words again: the second algorithm is more efficient than the first in the sense that as the size of the input increases, the number of steps required to solve the problem increases more slowly.

Complexity classes

Computer scientists place decision problems in complexity classes according to how efficient their most efficient known algorithm is.

P = the complexity class of decision problems that we know how to solve with reasonable efficiency.
NP = the complexity class containing all problems that can be solved reasonably efficiently given an extra bit of information (this extra bit is called a certificate, or hint).

The travelling salesman problem (TSP)

As an example, take the classic Travelling Salesman Problem, or TSP. Let's look more closely at this famous problem, and to make it more interesting, let's call it the travelling pig problem. An instance of this problem would be something like: "Does there exist a route that passes through cities Q, R, S and T exactly once each that is no more than 2000 miles long?"

One obvious approach: try all possible orderings of the four cities and compute the length of each route. But what if you were given a *hint*?
Take the hint, find the solution!

We are still asking the question: "Does there exist a route that passes through cities Q, R, S and T exactly once each that is no more than 2000 miles long?" Suppose you were given a hint such as: try going via Q, R, S, T. When the hint is provided, a valid algorithm only needs to check the length of the suggested path and output Yes if that path is short enough. Without the hint, we would have to try all possible orderings of the four cities and compute the length of each route.

Definitions

P = the complexity class of decision problems that we know how to solve with reasonable efficiency.
NP = the complexity class of all problems that can be solved reasonably efficiently given an extra bit of information (this extra bit is called a certificate or hint); in other words, an algorithm that is reasonably efficient once it has been given a hint.

Note that TSP is in NP: it can be solved efficiently if given a certificate. Interestingly, every problem in P is also in NP, because an algorithm for a problem in P can simply ignore the certificate handed to it.

Consider further: for TSP there are no known reasonably efficient algorithms that do not need a hint (certificate). You could try the most obvious algorithm, "try all possible paths", but this becomes terribly, terribly slow when the number of cities is even 100!
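Here is a rough sketch of certificate checking versus brute force for the travelling pig problem. The distance table is invented for illustration (the slides give no mileages); only the shape of the two algorithms matters:

```python
import math
from itertools import permutations

# Hypothetical mileages between cities Q, R, S and T (made up for illustration).
DIST = {('Q', 'R'): 700, ('R', 'S'): 450, ('S', 'T'): 600,
        ('Q', 'S'): 900, ('Q', 'T'): 800, ('R', 'T'): 650}

def leg(a, b):
    # Distances are symmetric, so look the pair up in either order.
    return DIST[(a, b)] if (a, b) in DIST else DIST[(b, a)]

def route_length(route):
    return sum(leg(route[i], route[i + 1]) for i in range(len(route) - 1))

def verify(route, limit):
    # With a certificate (hint), a valid algorithm only has to measure
    # ONE route and say Yes if it is short enough: fast.
    return route_length(route) <= limit

def brute_force(cities, limit):
    # Without a hint we try every possible ordering: factorial time.
    return any(route_length(r) <= limit for r in permutations(cities))

print(verify(('Q', 'R', 'S', 'T'), 2000))  # the hinted route: 700 + 450 + 600 = 1750
print(brute_force('QRST', 2000))

# Why brute force fails at scale: 100 cities give 100! orderings,
# a number 158 digits long.
print(len(str(math.factorial(100))))
```

The verifier does a fixed amount of work per certificate, while the brute-force search blows up factorially in the number of cities.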
Believe it or not, for 100 cities the number of possible orderings is a number over 150 digits long, and trying all of those paths would take literally years on even the fastest computer! When the hint is provided, a valid algorithm only needs to check the length of the suggested path and output Yes if the path is short enough.

If no known algorithm is any better, can we really say that TSP is in P? Not really! For TSP to be in P, it would have to have some known algorithm that can be executed with reasonable efficiency, and no such algorithm has been discovered. Yet, logically, everything in P is also in NP (as shown on the previous slide, because anything in P can simply ignore the certificate). So if P and NP turned out to be the same class, TSP would be in P; but in reality, according to the evidence we have, it doesn't appear to be. So do we give up because we don't have the evidence (or rather the power to compute the evidence)? How can we prove that TSP is NOT in P?

The question of P vs NP considers:
1. Is the class P the SAME as NP?
2. Are we able to show that P is NOT the same as NP?

On the other hand, P = NP would be comparatively easy to show:
• TSP is a special kind of NP problem called an NP-complete problem.
• All NP-complete problems are equivalent to each other.
• So to show P = NP, all we need is ONE efficient algorithm for ANY NP-complete problem.
We could of course just assume that P is NOT the same as NP, but assumptions aren't what we're looking for here!

Formal definition of NP-completeness

A decision problem C is NP-complete if:
1. C is in NP, and
2. Every problem in NP is reducible to C in polynomial time.
C can be shown to be in NP by demonstrating that a candidate solution to C can be verified in polynomial time. Note that a problem satisfying condition 2 is said to be NP-hard, whether or not it satisfies condition 1.

Polynomial time?

Polynomial time is a synonym for "tractable", "feasible", "efficient", or "fast". If a problem cannot be solved in polynomial time, it is... er... not going to be easy to solve!
Some examples of polynomial-time algorithms:
• The quicksort sorting algorithm on n integers performs at most An² operations for some constant A, even in the worst case. Thus it runs in time O(n²) and is a polynomial-time algorithm.
• All the basic arithmetic operations (addition, subtraction, multiplication, division, and comparison) can be done in polynomial time.
• Maximum matchings in graphs can be found in polynomial time.

Problems that cannot be solved in polynomial time?

NP stands for Non-deterministic Polynomial time: the problem can be solved in polynomial time using a non-deterministic Turing machine. Although any given solution to an NP-complete problem can be verified quickly (in polynomial time), there is no known efficient way to locate a solution in the first place; indeed, the most notable characteristic of NP-complete problems is that no fast solution to them is known. That is, the time required to solve the problem using any currently known algorithm increases very quickly as the size of the problem grows. As a consequence, determining whether or not it is possible to solve these problems quickly, called the P versus NP problem, is one of the principal unsolved problems in computer science today!

Examples of other problems that are NP-complete when expressed as decision problems (if you're interested, click the links to read more):
• Boolean satisfiability problem (SAT)
• N-puzzle
• Knapsack problem
• Hamiltonian path problem
• Travelling salesman problem
• Subgraph isomorphism problem
• Subset sum problem
• Clique problem
• Vertex cover problem
• Independent set problem
• Dominating set problem
• Graph coloring problem

Big O notation (mathematics)

In mathematics, big O notation describes the limiting behaviour of a function when the argument tends towards a particular value or infinity, usually in terms of simpler functions. It is a member of a larger family of notations called Landau notation, Bachmann–Landau notation (after Edmund Landau and Paul Bachmann), or asymptotic notation. A famous example is the problem of estimating the remainder term in the prime number theorem.

Big O notation (computer science)

In computer science, big O notation is used to classify algorithms by how they respond (e.g., in their processing time or working space requirements) to changes in input size. This may sound very high-flown and complex, but the basic concepts associated with big O notation are not hard at all. To put it simply, big-O notation is how programmers talk about algorithms. (Think, in this case, of an algorithm as a function that occurs in your program.)

Big O notation continued

A function's big-O classification is basically determined by how it responds to different inputs. You might, for instance, ask the question: how much slower would the program be if we gave it a list of 1000 things to work on instead of a list of just 1 thing?

Consider a simple search function, item_in_list. If we call this function as item_in_list(2, [1, 2, 3]), the response would be rather quick! You are basically looping over each thing in the list, and if you find the first argument to your function, you return True.
If the end of the list is reached and 2 is not found, return False.

Orders of magnitude

The complexity of this function is O(n). This is read "order of n", as the O function is also known as the order function. This has to do with approximation, which deals with "orders of magnitude" (think of a jar whose number of sweets is probably within an order of magnitude of 100).

O(n): consider further

If we were to graph the time it takes to run this function with different sized inputs (e.g. an array of 1 item, 2 items, 3 items, etc.), we'd see that it corresponds approximately to the number of items in the array. This is called linear: the line is basically straight if you were to graph it.

Big O, however, is all about considering the WORST-CASE SCENARIO (the worst-case performance of doing something). Note: in the code above, if the item we were looking for was always the first item in the list, the function would be super efficient! In this example, the worst case is that the item we are looking for isn't in the list at all. (The maths term for this is "upper bound".)

[Graphs: runtime characteristics of an O(n) function versus an O(1) function.]

Recap on definitions

Big O notation is the language we use for articulating how long an algorithm takes to run. It's how we compare the efficiency of different approaches to a problem. With big O notation we express the runtime in terms of (brace yourself for a techy-sounding sentence) how quickly it grows relative to the input, as the input gets arbitrarily large.

Let's look at three things from the above sentence more closely:
1. How quickly the runtime grows
2. Relative to the input
3. As the input gets arbitrarily large

How quickly the runtime "grows"

Often, external factors affect the time it takes for a function to run. These could include things like the speed of the processor, other programs the computer is running, and so on.
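The item_in_list function described above (the block of code shown on the original slide is not reproduced in this transcript) can be reconstructed from the description like this:

```python
def item_in_list(target, items):
    # Loop over each thing in the list; if we find the first argument,
    # return True. If we get to the end without finding it, return False.
    for item in items:
        if item == target:
            return True
    return False

print(item_in_list(2, [1, 2, 3]))  # True: rather quick, found on the second step
print(item_in_list(7, [1, 2, 3]))  # False: the worst case, the whole list was scanned
```

In the worst case (the target is absent) every element is examined once, which is why the function is O(n).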
For the above reason, it's hard to make strong statements about the exact runtime of an algorithm. Instead, we use big O notation to express how quickly its runtime grows.

Relative to the input

Note that it is not an exact number we are looking for. What we need is something to phrase our runtime growth in terms of, and the size of the input is quite useful for this. We can say things like the runtime grows "on the order of the size of the input" or "on the order of the square of the size of the input".

As the input gets arbitrarily large

You've got to remember that your algorithm may seem expensive, or appear to have too many steps, when n is small, but this is eventually outweighed as n gets huge. When working with big-O analysis, we are most concerned with the stuff that grows fastest as the input grows, because everything else is quickly eclipsed as n gets very large. If you know what an asymptote is, you might see why "big O analysis" is sometimes called "asymptotic analysis".

Big-O notation seems like an abstract concept, but it helps to look at some coded examples. As we work through them, think about their relative time complexities, and assign each one of these five categories: Excellent, Good, Fair, Bad, Horrible!

Example #1 (try it yourself in Python): Excellent

This function runs in O(1) time (or "constant time") relative to its input. The input list could have 1 item or 1,000 items, but this function would still require just one "step". It will always, always only return the first item in the list.

Example #2 (try it yourself in Python)

This function runs in O(n) time (or "linear time"), where n is the number of items in the list. If the list has 10 items, it prints 10 items. If the list has 1 million items, well then, 1 million items would be printed!

Example #3 (try it yourself in Python): Horrible!

There are two nested loops here.
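The transcript doesn't reproduce the slides' code for Examples #1 and #2, so here is a plausible reconstruction (function names are my own):

```python
def first_item(items):
    # Example #1, O(1) "constant time": one step whether the list
    # holds 1 item or 1,000 items.
    return items[0]

def print_each(items):
    # Example #2, O(n) "linear time": one print per item, so a 10-item
    # list prints 10 things and a million-item list prints a million.
    for item in items:
        print(item)

print(first_item([5, 3, 8]))  # 5
print_each([5, 3, 8])
```

Notice that first_item never even looks at the rest of the list, which is exactly why its cost doesn't depend on the input size.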
If our list has n items, our outer loop runs n times and our inner loop runs n times for each iteration of the outer loop, giving us n² total prints. Thus this function runs in O(n²) time (or "quadratic time"). The GROWTH of the solution is evident: if the list has 10 items, we have to print 100 times. If it has 1,000 items, we have to print 1,000,000 times.

Note: n could be the ACTUAL INPUT, or n could be the SIZE OF THE INPUT. Sometimes n is an actual number that's an input into our function, and other times n is the number of items in an input array (or an input map, or an input object, etc.).

Further note that you can GET RID OF THE CONSTANTS. A function that takes 2n steps is O(2n), which we just call O(n). Likewise, O(1 + n/2 + 100) we just call O(n).

How come getting rid of the constants doesn't matter? Remember that for big O notation we are concerned with what happens when n gets arbitrarily large. In the example above we are doing things like adding 100 and dividing by two, and as n gets REALLY big, these things have a decreasingly significant effect.

We can also drop the less significant terms (try this yourself in Python). A runtime of O(n + n²) we just call O(n²). Even if it were O(n²/2 + 100n), it would still be O(n²).

Linear, quadratic and exponential time:

Linear growth
Linear growth occurs when there is a constant rate of change. The equation for a linear relationship is y = bx, and its graph is a straight line. If you work for an hourly wage, and don't spend your earnings, your savings will grow linearly.

Quadratic growth
When the rate of change increases with time, numbers can grow more quickly. Quadratic and cubic growth can be represented by y = x² and y = x³, respectively. The general form of this type of relationship can be written y = x^b, and is called polynomial growth.
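Sketches of the shapes discussed above: the nested-loop Example #3, a 2n-step function we still call O(n), and an n + n² function we call O(n²). (These are reconstructions in the spirit of the slides, since the transcript omits the code.)

```python
def print_all_pairs(items):
    # Example #3: two nested loops. The outer loop runs n times and the
    # inner loop runs n times per outer pass: n * n = n^2 prints, O(n^2).
    for a in items:
        for b in items:
            print(a, b)

def print_twice(items):
    # Two separate passes over the list: 2n steps in total.
    # Drop the constant: O(2n) is just O(n).
    for item in items:
        print(item)
    for item in items:
        print(item)

def print_then_pairs(items):
    # n + n^2 steps. The n^2 term dominates as n grows,
    # so O(n + n^2) is just O(n^2).
    for item in items:
        print(item)
    print_all_pairs(items)
```

Try calling each with lists of 10 and then 1,000 items and watch how differently the amount of output grows.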
The distance travelled by a falling object can be calculated with a quadratic equation.

Exponential growth
Exponential (or geometric) growth is faster still. Here, the rate of growth is proportional to the value of y at any time. Exponential relationships can be expressed as y = b^x. Bacterial populations with unlimited food, nuclear chain reactions, and computer processing power are all said to grow exponentially.

Logarithmic growth
Logarithmic growth is the inverse of exponential growth. Logarithmic phenomena grow very slowly, and have an equation of the form y = log_b(x). Sound volume and frequency are both perceived logarithmically, allowing humans to detect a huge range of sound levels.

[Quiz slides: for each graph shown, is it linear, quadratic or exponential time?]

Order of complexity

You could think of order of complexity as going from the simplest types to the more complex: O(1), O(log n), O(n), O(n log n), O(n²).

• O(log n): an algorithm is said to be logarithmic if its running time increases logarithmically in proportion to the input size.
• O(n): a linear algorithm's running time increases in direct proportion to the input size.
• O(n log n): a superlinear algorithm is midway between a linear algorithm and a polynomial algorithm.
• O(n^c): a polynomial algorithm grows quickly based on the size of the input.
• O(c^n): an exponential algorithm grows even faster than a polynomial algorithm.
• O(n!): a factorial algorithm grows the fastest and quickly becomes unusable, even for small values of n.
Know thy complexity: http://bigocheatsheet.com/

New to these concepts? Brush up on your maths!
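As one last try-it-yourself, the orders of complexity above can be compared directly by tabulating them for growing n (2^n is printed as a digit count, because for n = 1000 the number itself is enormous):

```python
# Tabulate how linear, quadratic and exponential costs scale with n.
for n in (1, 10, 100, 1000):
    print(f"n={n:>4}  O(n)={n:>4}  O(n^2)={n ** 2:>7}  O(2^n): {len(str(2 ** n))} digits")
```

By n = 1000 the quadratic cost is a million steps, while the exponential cost is a 302-digit number: exactly the gap between "polynomial" and "not going to be easy to solve".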