
Mastering Algorithms,
space and time complexity
The P vs NP Problem
Understand that algorithms can be compared by expressing their complexity as a function relative to the size of the problem.
Understand that some algorithms are more time-efficient than other algorithms.
Understand that some algorithms are more space-efficient than other algorithms.
Big-O notation: linear time, polynomial time, exponential time.
Python (try it yourself tasks)
Order of complexity
Ever heard of the P vs NP Problem?
Before we do anything else, let me introduce you to this fabulous problem. It remains (at the time of writing) one of the major unsolved problems in Computer Science.
You could be the one to solve it!
But first... what is it? What does it have to do with computing and algorithms?
All these questions will be answered!
Why does the question matter?
The P versus NP problem is a major unsolved problem in
computer science.
Informally, it asks whether every problem whose solution
can be quickly verified by a computer can also be quickly
solved by a computer.
It was introduced in 1971 by Stephen Cook in his seminal
paper "The complexity of theorem proving procedures" and
is considered by many to be the most important open
problem in the field.
It is one of the seven Millennium Prize Problems selected
by the Clay Mathematics Institute to carry a US$ 1,000,000
prize for the first correct solution.
And it gets better...
Aside from being an important problem in computational theory, a proof either way would have profound implications for mathematics, philosophy, cryptography, algorithm research, artificial intelligence, game theory, multimedia processing and many other fields.
Really, if a solution is discovered, it could change the world as we know it!
There is a subfield of computer science called complexity theory, and one of the main questions behind P vs NP is: "How powerful can computers really be?"
What do you think? Imagine an algorithm that could literally solve any problem. Hmmm...
Decision problems
Life is full of all kinds of problems. Relationship problems, money problems... and I could go on. But in Maths and Computing there is a type of problem called a DECISION PROBLEM.
A decision problem asks: does x (a given set of data) have a given property, and can we signal this fact by outputting a "yes" or "no"?
[Diagram: INPUT x → Algorithm → Yes / No]
Here's an example: the EVEN-SUM problem.
INPUT: enter 2, 3, 5, 9. The question: "is the sum of 2, 3, 5, and 9 even?"
Here the answer would be... nope! That's because 2 + 3 + 5 + 9 = 19, which is an odd number.
[Diagram: INPUT → Algorithm ("Is the sum of the inputs EVEN?") → Yes / No. Here: No.]
Given a problem like the EVEN-SUM …
The goal is to create a method by which to solve any instance of the
problem. This method is called an algorithm, and an algorithm must
work correctly on every possible input to be considered correct. A
decision problem may have multiple algorithms that solve it. For
example, two different algorithms can solve EVEN-SUM:
1. add up all the numbers and output Yes if the sum is even or No if
the sum is not even
2. output Yes if an even number of the input numbers are odd, or No
otherwise.
Which method is better? Which is more time-efficient?
Both of them solve the problem, but one of them is more time-efficient, and cleverer, doing less work along the way!
1. add up all the numbers and output Yes if the sum is even or No if the sum is not even
2. output Yes if an even number of the input numbers are odd, or No otherwise.
Instead of adding up all
the numbers, the second
algorithm makes use of
the fact that the sum of
two odd numbers is even!
Genius!
The second algorithm is
more efficient than the first in the sense
that as the size of the input increases,
the number of steps required to solve
the problem increases more slowly.
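The two EVEN-SUM approaches above can be sketched in Python. The function names are mine, not from the slides; both loop over the input once, but the second only tracks a single parity bit instead of a running total.

```python
def even_sum_v1(numbers):
    # Algorithm 1: add everything up, then check the total's parity.
    total = 0
    for n in numbers:
        total += n
    return total % 2 == 0

def even_sum_v2(numbers):
    # Algorithm 2: count the odd inputs; an even count of odd
    # numbers gives an even sum (odd + odd = even).
    odd_count = 0
    for n in numbers:
        if n % 2 == 1:
            odd_count += 1
    return odd_count % 2 == 0

print(even_sum_v1([2, 3, 5, 9]))  # False, because 19 is odd
print(even_sum_v2([2, 3, 5, 9]))  # False
```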
Let's look at those key words again:
The second algorithm is more efficient than the first in the sense that as the size of the input increases, the number of steps required to solve the problem increases more slowly.
Complexity Classes
Computer scientists place
decision problems in complexity
classes by how efficient their most
efficient known algorithm is.
The complexity class P, or P for
short, is the class of decision
problems that we know how to
solve with reasonable efficiency.
Complexity Classes
Similarly, NP is a complexity class
that contains all problems that can
be solved reasonably efficiently
given an extra bit of information
called a certificate or hint.
As an example, take the classic Travelling Salesman Problem,
or TSP. An instance of TSP is something like "does there exist a
route that passes through Chicago, Manhattan, Boston, and
Miami exactly once each that is no more than 2000 miles long?"
Definitions
P = the complexity class of decision problems that we know how to solve with reasonable efficiency.
NP = the complexity class that contains all problems that can be solved reasonably efficiently given an extra bit of information (this extra bit is called a certificate or hint).
The travelling salesman problem (TSP)
Let’s look more closely at this
famous problem. And to make it
more interesting, let’s call it the
travelling pig problem …
An instance of this problem would be something like: "Does there exist a route that passes through cities Q, R, S, and T exactly once each that is no more than 2000 miles long?"
The travelling pig problem...
We are asking the question: "Does there exist a route that passes through cities Q, R, S, and T exactly once each that is no more than 2000 miles long?"
1) Try all possible orderings of the four cities and compute the length of each route.
What if you were given a *hint*?!
Suppose you were given a hint such as: try going via Q → R → S → T.
WHEN the hint is provided, a valid algorithm only needs to check the length of the suggested (hint) path and output YES if the path is short enough.
Take the hint, find the solution!
Instead of trying all possible orderings of the four cities, the hint Q → R → S → T lets a valid algorithm check just that one path and output YES if it is short enough.
Definitions
P = the complexity class of decision problems that we know how to solve with reasonable efficiency.
NP = the complexity class that contains all problems that can be solved reasonably efficiently given an extra bit of information (this extra bit is called a certificate or hint).
An NP algorithm, in other words, is one which is reasonably efficient but has been given a hint.
Note that TSP is in NP (it can be solved efficiently if given a certificate).
Interestingly, every problem in P is also in NP (as algorithms for problems in P can simply ignore the certificate handed to them).
Consider further...
For TSP, there are no known (reasonably efficient) algorithms that do not need a HINT (or certificate).
You could try the most obvious algorithm, which is "try all possible paths", but this becomes terribly, terribly slow when C (the number of cities) is even 100! Believe it or not, the number of possible paths is a 157-digit number, and trying all of these paths would take literally years on even the fastest computer!
WHEN the hint is provided, a valid algorithm only needs to check the length of the suggested (hint) path and output YES if the path is short enough.
If no known algorithms are any better, can we really say that TSP is in P?
Not really! For TSP to be in P it would have to have some known algorithm that can be executed with reasonable efficiency, and no such algorithm has been discovered. Yet, logically, P (as shown on the previous slide) is contained in NP (because anything in P can simply ignore the certificate). If P turned out to equal NP, then TSP would be in P; but in reality (according to the evidence we have) it doesn't appear to be.
So do we give up because we don't have the evidence (or rather the power to compute the evidence)?
How can we prove that TSP is NOT in P?
How can we prove that TSP is NOT in P?
The question of P vs NP considers:
1. Is the class P the SAME AS NP?
2. Are we able to show that P is NOT the same as NP?
On the other hand, P = NP would be comparatively easy to 'show':
• TSP is a special kind of NP problem called an NP-Complete Problem.
• All NP-Complete Problems are equivalent to each other.
How do we show P = NP? All we need is ONE efficient algorithm for ANY NP-Complete Problem.
We could of course just assume that P is NOT the same as NP, but assumptions aren't what we're looking for here...!
Formal definition of NP-Completeness
• A decision problem C is NP-complete if:
1. C is in NP, and
2. Every problem in NP is reducible to C in polynomial time.
• C can be shown to be in NP by demonstrating that a candidate solution to C can be verified in polynomial time.
• Note that a problem satisfying condition 2 is said to be NP-hard whether or not it satisfies condition 1.
Polynomial time?
Polynomial time is a synonym for "tractable", "feasible", "efficient", or "fast".
If a problem cannot be solved in polynomial time, it is... er... not going to be easy to solve!
Some examples of polynomial time algorithms:
• The quicksort sorting algorithm on n integers performs at most An² operations for some constant A. Thus it runs in time O(n²) in the worst case and is a polynomial time algorithm (on average it does even better, O(n log n)).
• All the basic arithmetic operations (addition, subtraction, multiplication, division, and comparison) can be done in polynomial time.
• Maximum matchings in graphs can be found in polynomial time.
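As a quick illustration of the first bullet, here is a minimal quicksort sketch. It is not from the slides, and it uses a simple list-building variant rather than the in-place textbook version, but it shows the recursive partitioning idea.

```python
def quicksort(items):
    # Base case: a list of 0 or 1 items is already sorted.
    if len(items) <= 1:
        return items
    pivot = items[0]
    # Partition the rest of the list around the pivot.
    smaller = [x for x in items[1:] if x < pivot]
    larger = [x for x in items[1:] if x >= pivot]
    return quicksort(smaller) + [pivot] + quicksort(larger)

print(quicksort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```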
Problems that cannot be solved in polynomial time?
NP stands for Non-deterministic Polynomial time: this means that the problem can be solved in polynomial time using a non-deterministic Turing machine.
Although any given solution to an NP-complete problem can be verified quickly (in polynomial time), there is no known efficient way to locate a solution in the first place; indeed, the most notable characteristic of NP-complete problems is that no fast solution to them is known.
That is, the time required to solve the problem using any currently known algorithm increases very quickly as the size of the problem grows. As a consequence, determining whether or not it is possible to solve these problems quickly, called the P versus NP problem, is one of the principal unsolved problems in computer science today!
Examples of other problems that are NP-Complete when expressed as decision problems... if you're interested, look them up to read more!
• Boolean satisfiability problem (SAT)
• N-puzzle
• Knapsack problem
• Hamiltonian path problem
• Travelling salesman problem
• Subgraph isomorphism problem
• Subset sum problem
• Clique problem
• Vertex cover problem
• Independent set problem
• Dominating set problem
• Graph coloring problem
Big O Notation (Mathematics)
In mathematics, big O notation describes the limiting behaviour of a function when the argument tends towards a particular value or infinity, usually in terms of simpler functions.
It is a member of a larger family of notations called Landau notation, Bachmann–Landau notation (after Edmund Landau and Paul Bachmann), or asymptotic notation.
Big O Notation (Computer Science)
In computer science, big O notation
is used to classify algorithms by how
they respond (e.g., in their
processing time or working space
requirements) to changes in input
size.
A famous example is the problem of
estimating the remainder term in
the prime number theorem.
This may sound very high-flown and complex, but the basic concepts associated with Big O notation are not hard at all.
To put it simply, Big-O notation is how programmers talk about algorithms. (Think, in this case, of an algorithm as a function that occurs in your program.)
Big O Notation continued:
A function's Big-O notation is basically determined by how it responds to different inputs.
You might, for instance, ask the question: how much slower would the program be if we gave it a list of 1,000 things to work on instead of a list of just 1 thing?
Consider the following function, item_in_list.
If we call it as item_in_list(2, [1,2,3]), the response would be rather quick! You are basically looping over each thing in the list and, if you find the first argument to your function, you return True. If we get to the end and it has not been found, you return False.
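The slide's code block is not reproduced in the transcript, but a runnable version of the item_in_list function it describes would look like this:

```python
def item_in_list(to_check, the_list):
    # Loop over each thing in the list; return True on a match.
    for item in the_list:
        if item == to_check:
            return True
    # Reached the end without finding it.
    return False

print(item_in_list(2, [1, 2, 3]))  # True
```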
Orders of Magnitude
The complexity of this function is O(n). This is read "order of n", as the O function is also known as the Order function (to do with approximation, which deals with 'orders of magnitude').
[Image: a jar whose number of sweets is probably within an order of magnitude of 100]
O(n): Consider further
If we were to graph the time it takes to run this function with different sized inputs (e.g. an array of 1 item, 2 items, 3 items, etc.), we'd see that it approximately corresponds to the number of items in the array.
This is called a linear graph: the line is basically straight if you were to graph it.
O(n): Consider further
Big O, however, is all about considering the WORST CASE SCENARIO (the worst-case performance of doing something).
Note: in the code above, if the item we were looking for was always the first item in the list, the function would be super efficient!
In this example, the worst case is that the item we are looking for isn't in the list at all!
(The maths term for this is "upper bound".)
Plotting an O(n) graph
[Graph: runtime characteristics of an O(n) function]
[Graph: runtime characteristics of an O(1) function]
Recap on definitions
Big O notation is the language we use for articulating how long an algorithm takes to run. It's how we compare the efficiency of different approaches to a problem.
With big O notation we express the runtime in terms of (brace yourself for a techy-sounding sentence) how quickly it grows relative to the input, as the input gets arbitrarily large.
Let's look at three things (from the above sentence) more closely:
1. How quickly the runtime grows
2. Relative to the input
3. As the input gets arbitrarily large.
How quickly the runtime "grows"
It is often the case that external factors affect the time it takes for a function to run. This could include things like:
- the speed of the processor
- other programs the computer is running, etc.
For the above reason, it's hard to make strong statements about the exact runtime of an algorithm. Instead we use big O notation to express how quickly its runtime grows.
Relative to the input
Note that it is not an exact number that we are looking for. What we need is something to phrase our runtime growth in terms of.
It's quite useful to use the size of the input. We can say things like: the runtime grows "on the order of the size of the input", or "on the order of the square of the size of the input".
As the input gets arbitrarily large...
You've got to remember that your algorithm may seem expensive, or appear to have too many steps, when n is small, but what really matters is how it behaves as n gets huge.
When working with Big-O analysis, we are most concerned with the stuff that grows fastest as the input grows, because everything else is quickly eclipsed as n gets very large.
If you know what an asymptote is, you might see why "big O analysis" is sometimes called "asymptotic analysis".
Big-O notation seems like an abstract concept, but it helps to look at some coded examples.
As we work through them, think about their relative time complexities. Assign each of the examples below one of these 5 categories: Excellent, Good, Fair, Bad, Horrible!
Excellent
Example #1 (Try it yourself in Python)
Note: this function runs in O(1) time (or "constant time") relative to its input. The input list could be 1 item or 1,000 items, but this function would still just require one "step".
It will always, always only return the first item on the list.
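The transcript doesn't include the slide's code, but a minimal function matching the description (the name is mine) would be:

```python
def first_item(items):
    # One "step" regardless of whether the list has 1 item or 1,000:
    # we just return the first element.
    return items[0]

print(first_item([3, 1, 4]))  # 3
```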
Example #2 (Try it yourself in Python)
This function runs in O(n) time (or "linear time"), where n is the number of items in the list.
If the list has 10 items, it prints 10 items. If the list has 1 million items, well then 1 million items would be printed!
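Again the slide's code is missing from the transcript; a sketch of a linear-time function matching the description (name is mine) would be:

```python
def print_all(items):
    # One print per item: n items means n prints.
    for item in items:
        print(item)

print_all([1, 2, 3])
```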
Horrible!
Example #3 (Try it yourself in Python)
There are two nested loops here. If our list has n items, our outer loop runs n times and our inner loop runs n times for each iteration of the outer loop, giving us n² total prints. Thus this function runs in O(n²) time (or "quadratic time").
The GROWTH of the solution is evident: if the list has 10 items, we have to print 100 times. If it has 1,000 items, we have to print 1,000,000 times.
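A sketch of the nested-loop shape described above (the name is mine, not from the slides):

```python
def print_all_pairs(items):
    for a in items:        # outer loop: n iterations
        for b in items:    # inner loop: n iterations per outer iteration
            print(a, b)    # n * n prints in total

print_all_pairs([1, 2, 3])  # prints 9 pairs
```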
Note: n could be the ACTUAL INPUT, or n could be the 'SIZE OF THE INPUT'.
Sometimes n is an actual number that's an input into our function, and other times n is the number of items in an input array (or an input map, or an input object, etc.).
Further note that you can GET RID OF THE CONSTANTS.
This is O(2n), which we just call O(n).
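The slide's code isn't in the transcript, but a plausible shape for an O(2n) function is two separate passes over the same list, i.e. 2n steps, which we still just call O(n). The name is mine:

```python
def print_then_sum(items):
    for item in items:    # first pass: n steps
        print(item)
    total = 0
    for item in items:    # second pass: another n steps
        total += item
    return total          # 2n steps overall, i.e. O(n)

print(print_then_sum([1, 2, 3]))
```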
Further note that you can GET RID OF THE CONSTANTS.
This is O(1 + n/2 + 100), which we just call O(n).
How come getting rid of the constants doesn't matter?
This is O(1 + n/2 + 100), which we just call O(n).
Remember that for Big O notation we are concerned with what happens when n gets arbitrarily large.
In the example above we are doing things like adding 100 and dividing by two, and as n gets REALLY big, these things have a decreasingly significant effect.
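A plausible shape for the missing O(1 + n/2 + 100) example: one fixed step, a loop over every other item (about n/2 steps), then 100 fixed steps, which altogether is still just O(n). The name and details are mine:

```python
def half_and_hundred(items):
    count = 0                    # 1 step
    for item in items[::2]:      # about n/2 steps (every other item)
        count += item
    for _ in range(100):         # 100 steps, independent of n
        count += 0
    return count

print(half_and_hundred([1, 2, 3, 4]))  # 4 (sums items at even indices: 1 + 3)
```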
We can also drop the less significant terms. (Try this yourself in Python)
Here our runtime is O(n + n²), which we just call O(n²). Even if it was O(n²/2 + 100n), it would still be O(n²).
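A plausible shape for the missing O(n + n²) example: a single loop followed by a nested pair. The n² term dominates, so we just say O(n²). The name is mine:

```python
def singles_then_pairs(items):
    for item in items:     # n steps
        print(item)
    for a in items:        # n * n steps
        for b in items:
            print(a, b)    # the n^2 part quickly dominates the n part

singles_then_pairs([1, 2])
```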
Linear, Quadratic and Exponential time:
Linear growth
Linear growth occurs when there is a constant rate of change. The equation for a linear relationship is y = bx, and its graph is a straight line. If you work for an hourly wage, and don't spend your earnings, your savings will grow linearly.
Quadratic growth
When the rate of change increases with time, numbers can grow more quickly. Quadratic and cubic growth can be represented by y = x² and y = x³, respectively. The general form of this type of relationship can be written y = x^b, and is called polynomial growth. The distance travelled by a falling object can be calculated with a quadratic equation.
Linear, Quadratic and Exponential time:
Exponential
Exponential (or geometric) growth is faster still. Here, the rate of growth is proportional to the value of y at any time. Exponential relationships can be expressed as y = b^x. Bacterial populations with unlimited food, nuclear chain reactions, and computer processing power are all said to grow exponentially.
Linear, Quadratic and Exponential time and LOGARITHMIC Growth
Logarithmic
Logarithmic growth is the inverse of exponential growth. Logarithmic phenomena grow very slowly, and have an equation of the form y = log_b(x). Sound volume and frequency are both perceived logarithmically, allowing humans to detect a huge range of sound levels.
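The slides don't include an O(log n) code example, but binary search is the classic one: each comparison halves the remaining search range. A sketch (not from the slides):

```python
def binary_search(sorted_items, target):
    # Requires sorted_items to be in ascending order.
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid            # found: return its index
        if sorted_items[mid] < target:
            lo = mid + 1          # discard the lower half
        else:
            hi = mid - 1          # discard the upper half
    return -1                     # not present

print(binary_search([1, 3, 5, 7, 9, 11], 7))  # 3
```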
Linear, Quadratic or Exponential time?
Order of Complexity
You could think of order of complexity as referring to going from the simplest types to the more complex: O(1), O(log n), O(n), O(n log n), O(n²), ...
O(log n) — An algorithm is said to be logarithmic if its running time increases logarithmically in proportion to
the input size.
O(n) — A linear algorithm’s running time increases in direct proportion to the input size.
O(n log n) — A superlinear algorithm is midway between a linear algorithm and a polynomial algorithm.
O(n^c) — A polynomial algorithm grows quickly based on the size of the input.
O(c^n) — An exponential algorithm grows even faster than a polynomial algorithm.
O(n!) — A factorial algorithm grows the fastest and becomes quickly unusable for even small values of n.
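To see why the later orders in this list dominate, it helps to tabulate a few values (illustrative numbers only):

```python
import math

# Compare how the common orders grow as n increases.
print(f"{'n':>6} {'log n':>8} {'n log n':>10} {'n^2':>10} {'2^n':>12}")
for n in [2, 8, 16, 32]:
    print(f"{n:>6} {math.log2(n):>8.0f} {n * math.log2(n):>10.0f} "
          f"{n**2:>10} {2**n:>12}")
```

Even by n = 32, the exponential column has left the others far behind.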
Know thy complexity: http://bigocheatsheet.com/
New to these concepts? Brush up on your maths!