
Approximation Algorithms:
The Subset-sum Problem
Victoria Manfredi (252a-ad)
Smith College, CSC 252: Algorithms
December 12, 2000
Introduction
NP-completeness and approximation algorithms
Notation associated with approximation algorithms
The NP-complete subset-sum problem and the
optimization problem associated with it
Proof that the approximation algorithm for the
optimization problem is a fully polynomial-time
approximation scheme
Please note: the information and ideas in this presentation were
gathered from various sources. For complete references,
please see the last slide.
NP-Complete Problems
Those problems that are both
in NP (nondeterministic polynomial time)
 a proposed answer can be described in polynomial space
 a proposed answer can be verified as correct or not in polynomial time
and NP-hard
 solving the problem in polynomial time would imply that any other problem in NP can be solved in polynomial time
But no NP-complete problem can currently be solved in polynomial time
NP-Complete Problems
Because NP-complete problems arise in many everyday settings
(problems like the travelling salesman problem in flight
scheduling) that need to be solved, they cannot be ignored.
We therefore need an acceptable way to solve these problems: we
want a polynomial-time algorithm that can do the job the
exponential algorithm is doing. It is unlikely that we will find
a polynomial-time algorithm for an NP-complete problem. If you
did, you would have resolved the P = NP question and you would
of course be rich and famous.
The solution: approximation algorithms
Approximation Algorithms
Say you are splitting a piece of cake with someone. Dividing it
so that you and the person you were splitting it with got exactly
half the cake, right down to the last atom, would be very hard
and would take an awfully long time (this is our exponential-time
algorithm for the NP-complete problem), although it would do the
job exactly right. But do we ever do this? No, we estimate and
say "that looks like about half."
Approximation algorithms do the same sort of estimation in a more
rigorous way, and are proved to give a good estimate in a
reasonable (for example, polynomial) amount of time
Approximation Algorithms
When talking about approximation algorithms,
there is some terminology we need to know:
 Ratio bound
 Relative error
 Relative error bound
 Approximation scheme
 Polynomial-time approximation scheme
 Fully polynomial-time approximation scheme
Ratio Bound
Relates to how much bigger the correct answer is than the answer
from the approximation algorithm (if a maximization problem) OR
how much smaller the correct answer is (if a minimization problem)
 max(C/C*, C*/C) <= p(n), where C is the optimization answer given by the approximation algorithm and C* is the correct optimization answer given by the exponential-time algorithm
 note: 0 < C* <= C if minimizing and 0 < C <= C* if maximizing
 p(n) is never less than one, because one of C/C* or C*/C will always be greater than or equal to one
p(n) is the function bounding how big the ratio of C and C* can
be. It may depend on the input size n, hence p(n), but it may
not, and is then just a constant p
Relative Error
Is how far off the correct answer the answer from the
approximation algorithm is
 |C - C*| / C*
For example, if the optimization answer given by the
approximation algorithm, C, equals 10 and the optimization answer
given by the exponential-time algorithm, C*, is 8 (we're doing a
minimization), then
 (10 - 8)/8 = 2/8 = 0.25 relative error
Relative error is always either a positive number or zero
(because of the absolute value in the equation). This makes
sense: what would a negative relative error mean?
Relative Error Bound
Is a bound on how far off the correct answer the answer from the
approximation algorithm is allowed to be
This bound can either be a function of the input size n, written
ε(n), with the allowed relative error changing according to the
size of n, or it can be a constant, written just ε
 |C - C*| / C* <= ε(n)
Approximation Schemes
Approximation scheme
 an approximation algorithm that takes a relative error bound ε > 0 as input, along with the input data
Polynomial-time approximation scheme
 an approximation scheme that runs in polynomial time in the input size n
Fully polynomial-time approximation scheme
 an approximation scheme that runs in polynomial time in both the input size n and 1/ε
 Why 1/ε? To capture how decreasing the relative error ε increases the running time
The Subset-sum Decision Problem
The subset-sum problem is a decision problem that asks: given a
set of numbers and a number x, determine whether some subset of
the numbers can be added together to equal x.
The subset-sum problem is a special case of the knapsack problem
and is simpler, although both are NP-complete. In the knapsack
problem you're looking at both the size and the profit of the
objects, while in the subset-sum problem you're just looking at
the size of the objects
Naïve solution: generate all possible combinations of the numbers
in the set, sum each one, and see if any of the resulting sums
equals x
 This is O(2^n)
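As a rough illustration (not part of the original slides; the function name is my own), the naïve exponential search can be sketched in Python:

```python
from itertools import combinations

def subset_sum_naive(s, x):
    """Decide subset-sum by brute force: try every one of the 2^n
    subsets of s and check whether any sums to exactly x. O(2^n)."""
    for r in range(len(s) + 1):
        for combo in combinations(s, r):
            if sum(combo) == x:
                return True
    return False
```

For example, `subset_sum_naive([1, 4, 7], 12)` is true because 1 + 4 + 7 = 12, while no subset of {1, 4, 7} sums to 14.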
The Subset-sum Optimization Problem
The optimization problem associated with the subset-sum problem
asks: given a set of numbers and a number x, determine the subset
that sums to the largest number less than or equal to x.
Since the decision problem associated with it is NP-complete, the
optimization problem is at least as hard (NP-hard).
Uses of a subset-sum algorithm: for example, packing a truck as
fully as possible
The approximation algorithm handles both the subset-sum
optimization problem and the decision problem
Solving Subset-sum Optimization
Problem in Exponential Time
Start with an x, a set E = {0}, and the set to find the subset
of, S = {s1, s2, ..., sn}
 Define the set operation S + i to equal {s1+i, s2+i, ..., sn+i}
Then do,
 E1 = (E + s1) U E
 E2 = (E1 + s2) U E1
 ...
 En = (En-1 + sn) U En-1
where E and each Ei are kept as sorted lists
At each step, if any element of Ei is greater than x, remove it
from the set
At the end, the largest number in En is the answer
Notice that the set En can grow exponentially
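A minimal Python sketch of this list-building algorithm, assuming the goal is the largest sum not exceeding x (the function name is my own):

```python
def exact_subset_sum(s, x):
    """Exponential-time list algorithm: build E_i = (E_{i-1} + s_i) U E_{i-1},
    dropping any sum that exceeds x. Returns the largest achievable
    subset sum <= x. The list can double at every step: O(2^n)."""
    e = [0]
    for si in s:
        # merge the shifted copy with the old list, keeping sorted unique values
        e = sorted(set(e) | {v + si for v in e if v + si <= x})
    return e[-1]
```

On the example that follows, `exact_subset_sum([1, 4, 7], 14)` builds exactly the sets E1, E2, E3 shown and returns 12.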
Solving Subset-sum Optimization
Problem in Exponential Time
x = 14, E = {0}, and S = {1, 4, 7}
Then,
 E1 = {0+1} U {0} = {0,1}  Set size = 2
 E2 = {0+4, 1+4} U {0,1} = {0,1,4,5}  Set size = 4
 E3 = {0+7, 1+7, 4+7, 5+7} U {0,1,4,5} = {0,1,4,5,7,8,11,12}  Set size = 8
 2 + 2^2 + 2^3 + ... + 2^n = 2^(n+1) - 2 = O(2^n)
We did obtain the correct answer (12), but we had to use an
exponential amount of space in order to do so
Note: in this example the space use doubles at every step; in
other examples this is not necessarily the case
Solving Subset-sum Optimization Problem
in Polynomial Time
How do we avoid exponential space use? Trim the set Ei at each
step: get rid of some larger numbers in the set by having smaller
numbers represent them
Trimming:
 Our trimming parameter is δ, with 0 < δ < 1; a number y may be removed when the last number kept, z, satisfies (y - z)/y <= δ
 To see if a number should stay or go: walk through the sorted set; if the current number is within relative error δ of the last number kept, it can be removed (the last kept number represents it). The first element of the set always stays
Trimming the set {3, 5, 6, 8} with δ = 0.2, that is, 20% error,
we get:
 (5 - 3)/5 = 0.4 > 0.2, keep 5
 (6 - 5)/6 ≈ 0.17 <= 0.2, don't keep 6 (let 5 represent it)
 (8 - 5)/8 = 0.375 > 0.2, keep 8 (comparing against 5, the last number kept)
 Final set: {3, 5, 8}
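The trimming step can be sketched as a small helper (my own name; it assumes the input list is sorted with its smallest element first):

```python
def trim(values, delta):
    """Trim a sorted list of sums: the first element always stays;
    a later number y is dropped whenever the last kept number z is
    close enough to represent it, i.e. (y - z) / y <= delta."""
    trimmed = [values[0]]
    for y in values[1:]:
        if (y - trimmed[-1]) / y > delta:  # too far from z, so y must stay
            trimmed.append(y)
    return trimmed
```

Running `trim([3, 5, 6, 8], 0.2)` reproduces the worked example above: 6 is dropped (5 represents it) and the result is [3, 5, 8].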
Solving Subset-sum Optimization
Problem in Polynomial Time
How do we choose δ? Remember ε from the relative error bound? We
choose δ to be ε/n, where n is the number of elements in the set
and 0 < ε < 1
Looking at our example from before,
 x = 14, T0 = {0}, S = {1, 4, 7}, n = 3, ε = 0.3, so δ = ε/n = 0.1
Then (set sizes given as now vs. before trimming was added):
 M1 = {0+1} U {0} = {0,1}  (size 2 vs. 2)
 T1: {0,1}
 M2 = {0+4, 1+4} U {0,1} = {0,1,4,5}  (size 3 vs. 4)
 T2: {0,1,4} where 4 represents 5
 M3 = {0+7, 1+7, 4+7} U {0,1,4} = {0,1,4,7,8,11}  (size 5 vs. 8)
 T3: {0,1,4,7,11} where 7 represents 8
We now get 11 instead of 12 as the answer. But 11 is within
(1 - ε) times 12, since 11 >= 0.7 × 12 = 8.4, so it is acceptable.
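Putting merging and trimming together gives a sketch of the full approximation algorithm (the names are my own; the intermediate trimmed lists it produces may differ slightly from the hand trace above, since the trace trims a little more aggressively, but the final answer of 11 is the same):

```python
def approx_subset_sum(s, x, eps):
    """FPTAS sketch: the exponential list algorithm, but the list is
    trimmed after every merge with delta = eps / n. The returned sum z
    satisfies z >= (1 - eps) * optimum."""
    n = len(s)
    delta = eps / n
    t = [0]
    for si in s:
        merged = sorted(set(t) | {v + si for v in t if v + si <= x})
        # trim: keep y only when it differs from the last kept value by more than delta
        t = [merged[0]]
        for y in merged[1:]:
            if (y - t[-1]) / y > delta:
                t.append(y)
    return t[-1]
```

On the example above, `approx_subset_sum([1, 4, 7], 14, 0.3)` returns 11, which indeed satisfies 11 >= (1 - 0.3) × 12.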
Proof that Approximation Algorithm is a Fully
Polynomial-time Approximation Scheme
If we can prove that the approximation algorithm is a fully
polynomial-time approximation scheme, we will be showing that
 the algorithm runs in polynomial time in the input size n and in 1/ε
We want to show this because it would mean that the approximation
algorithm uses polynomial time and space instead of exponential,
and would therefore be a practical algorithm
Proof cont’d
Our trimmed set from the approximation algorithm is a subset of
the untrimmed set from the exponential-time algorithm; that is,
Ti is a subset of Ei
This means that the answer we find using the approximation
algorithm, some z, is the sum of a subset of the set we were given
If this is a good approximation algorithm, then if we were to
multiply the answer the exponential-time algorithm would give,
which we'll call m, by (1 - ε) (because of our maximum relative
error inequality C >= (1 - ε)C*), we should find that z is at
least as big as (1 - ε)m (we are looking for "at least as big"
because this is a maximization problem)
Proof cont’d
We must therefore prove that z >= (1 - ε)m (recall the relative
error bound from before)
 |C* - C| / C* <= ε(n)  (C* - C instead of C - C* because this is a maximization, so C <= C*)
 |C* - C| / C* = (C* - C)/C*
 = C*/C* - C/C* <= ε(n)
 1 - C/C* <= ε(n)
 1 - ε(n) <= C/C*
which gives C >= (1 - ε)C*, since in our example ε is not a
function of n (the size of the input). This is what we are
working with, and it corresponds to z >= (1 - ε)m
We want to show that z and m are very close together. ε is
between 0 and 1, so we want to show that z is, for example, at
least 0.89m
Proof cont’d
Since δ was chosen to be ε/n, the relative error between a number
in T and the number in M it represents is no more than ε/n.
Therefore the relative error between the correct answer and the
approximated answer will be no more than ε.
So, from (y - z)/y <= δ (see the slide on trimming),
 δ >= (y - z)/y = y/y - z/y = 1 - z/y
 δ + z/y >= 1
 z/y >= 1 - δ
 z >= (1 - δ)y
 i.e., y(1 - δ) <= z
Proof cont’d
And since we know y >= z, we get
 y(1 - δ) <= z <= y, i.e., y(1 - ε/n) <= z <= y
 it can be shown by induction on i that for every y removed at step i, some z remains with y(1 - ε/n)^i <= z <= y
Let y* be the best answer. Then we get
 y*(1 - ε/n)^n <= z <= y*
 and the approximation algorithm gives the largest z that fits this
Proof cont’d
By taking the derivative of the function (1 - ε/n)^n with respect
to n, we find that it is greater than zero. This means that as n
increases, (1 - ε/n)^n increases. Since at n = 1 the function
equals exactly 1 - ε, for n >= 1 we get
 1 - ε <= (1 - ε/n)^n
(the left side stays the same while the right side increases as n
increases)
From this it follows that (1 - ε)y* <= z, because from the
previous inequality y*(1 - ε/n)^n <= z <= y* we can now get
 (1 - ε)y* <= y*(1 - ε/n)^n <= z <= y*
and from this we get (1 - ε)y* <= z
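A quick numeric check of the inequality used in this step (illustration only, with ε = 0.3 as in the running example):

```python
# Sanity check: for 0 < eps < 1, (1 - eps/n)^n increases with n
# and never drops below 1 - eps.
eps = 0.3  # example value of the relative error bound
values = [(1 - eps / n) ** n for n in range(1, 6)]

assert all(v >= 1 - eps - 1e-12 for v in values)  # never below 1 - eps
assert values == sorted(values)                   # increasing in n
```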
Proof cont’d
So we have just proved that z >= (1 - ε)m, meaning that the
solution the approximation algorithm gives us is within a factor
of (1 - ε) of the solution from the exponential-time algorithm
We will now show that the approximation algorithm runs in
polynomial time with respect to n and 1/ε instead of exponential
time, while still yielding an answer reasonably close to the
correct one; together, these two facts make it a fully
polynomial-time approximation scheme
Proof cont’d
We know that a function f(n) is polylogarithmically bounded if
f(n) = (log n)^O(1); we can use the same style of argument to
show that a quantity is polynomially bounded, except that instead
of a polylogarithmic answer we'll get a polynomial one
The trimmed list T is what is growing, so we hope to prove that
its length is polynomially bounded
We know that successive elements mi and mi+1 of the trimmed list
differ by a factor of at least 1/(1 - δ), where δ = ε/n is the
trimming parameter
So the length of the list is at most log_{1/(1-ε/n)} t, where
1/(1 - ε/n) is the base of the logarithm and t is what we're
taking the log of
Proof cont'd
Changing the base of the log using log_b a = log_c a / log_c b,
we get
 log_{1/(1-ε/n)} t = ln t / ln(1/(1-ε/n)) = ln t / (-ln(1-ε/n))
Since we know ln(1+x) <= x, taking x = -ε/n gives
ln(1-ε/n) <= -ε/n, so -ln(1-ε/n) >= ε/n, and therefore
 log_{1/(1-ε/n)} t <= ln t / (ε/n) = (n ln t) / ε
This is polynomial with respect to n and 1/ε, so our
approximation algorithm is a fully polynomial-time approximation
scheme
Note: the proof presented here is from Cormen et al
Proof cont’d
Theorem: There is no fully polynomial-time approximation scheme
for a strongly NP-complete problem, unless NP = P (theorem from
Approximation Algorithms for NP-hard Problems)
The reason we could prove that the approximation algorithm for
the subset-sum problem is a fully polynomial-time approximation
scheme is that subset-sum is a weakly NP-complete problem
Speedup: Some Timings
Subset-sum problem: comparison of algorithms

 Algorithm          | Subset sum | Time
 GS                 | 25554      | 0.05
 DPS                | 25557      | 240.24
 APPROX_SUBSET_SUM  | 25436      | 12.31
 DIOPHANT           | 25557      | 0.82

GS = greedy, DPS = the exponential-time algorithm,
APPROX_SUBSET_SUM = the algorithm I presented, and DIOPHANT = an
algorithm by the author of the web page
Source for this table:
http://www.geocities.com/zabrodskyvlada/aat/a_suba.html#approx_subset_sum
Conclusion
Approximation algorithms are one way to come up with an answer in
a reasonable (polynomial) amount of time for an NP-complete
problem
References: Basically All the Information in this
Presentation Came from the Sources Below
Sources:
 Main source: Introduction to Algorithms, Cormen, T.H., Leiserson, C.E., and Rivest, R.L. (1999), Chapter 37
 Approximation Algorithms for NP-hard Problems, Hochbaum, D.S. (1997), Introduction, pp. 9-10 and pp. 359-365
 Ileana's lecture notes from class
 http://www.geocities.com/zabrodskyvlada/aat/a_suba.html#approx_subset_sum
 http://cse.hanyang.ac.kr/~jmchoi/class/19962/algorithm/classnote/node7.html
What I did:
 Web and library research
 Asked Ileana :-)