Approximation Algorithm
Instructor: YE, Deshi
[email protected]
1
Dealing with Hard Problems
What to do if:
Divide and conquer
Dynamic programming
Greedy
Linear Programming/Network Flows
…
does not give a polynomial time algorithm?
2
Dealing with Hard Problems
Solution I: Ignore the problem
Can't do it! There are thousands of problems for
which we do not know polynomial-time algorithms.
For example:
Traveling Salesman Problem (TSP)
Set Cover
3
Traveling Salesman Problem
Traveling Salesman Problem (TSP)
Input: undirected graph with lengths on edges
Output: shortest cycle that visits each vertex exactly once
Best known algorithm: O(n^2 · 2^n) time (dynamic programming over subsets).
4
The vertex-cover problem
A vertex cover of an undirected graph G = (V, E)
is a subset V ' ⊆ V such that if (u, v) ∈ E, then u
∈ V ' or v ∈ V ' (or both).
A vertex cover for G is a set of vertices that
covers all the edges in E.
As a decision problem, we define
VERTEX-COVER = {〈G, k〉 : graph G has a
vertex cover of size k}.
Best known algorithm: O(kn + 1.274^k) time.
5
Dealing with
Hard Problems
Exponential time algorithms for small inputs.
E.g., (100/99)^n time is not bad for n < 1000.
Polynomial time algorithms for some (e.g.,
average-case) inputs
Polynomial time algorithms for all inputs, but
which return approximate solutions
6
Approximation Algorithms
An algorithm A is ρ-approximate if, on any input of size n:
the cost CA of the solution produced by the algorithm and
the cost COPT of the optimal solution are such that
CA ≤ ρ · COPT.
We will see:
2-approximation algorithm for TSP in the plane
2-approximation algorithm for Vertex Cover
7
Comments on Approximation
“CA ≤ ρ COPT ” makes sense only for
minimization problems
For maximization problems, replace by
COPT ≤ ρ CA
Additive approximation “CA ≤ ρ + COPT “ also
makes sense, although difficult to achieve
8
The Vertex-cover problem
9
The vertex-cover problem
A vertex cover of an undirected graph G = (V, E)
is a subset V' ⊆ V such that if (u, v) ∈ E, then u ∈ V'
or v ∈ V' (or both).
A vertex cover for G is a set of vertices that covers
all the edges in E.
The goal is to find a vertex cover of minimum size
in a given undirected graph G.
10
Naive Algorithm
APPROX-VERTEX-COVER(G)
1 C ← Ø
2 E′ ← E[G]
3 while E′ ≠ Ø
4 do let (u, v) be an arbitrary edge of E′
5 C ← C ∪ {u, v}
6 remove from E′ every edge incident on either u or v
7 return C
11
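A minimal runnable Python sketch of APPROX-VERTEX-COVER (the function name, the edge-list representation, and the sample graph mirroring the illustration on the next slide are my additions, not from the slides):

def approx_vertex_cover(edges):
    # C <- empty set; repeatedly pick an arbitrary remaining edge (u, v),
    # add both endpoints to C, then discard every edge incident on u or v.
    cover = set()
    remaining = list(edges)
    while remaining:
        u, v = remaining[0]                      # an arbitrary edge of E'
        cover.update((u, v))                     # C <- C ∪ {u, v}
        remaining = [(a, b) for (a, b) in remaining
                     if a not in (u, v) and b not in (u, v)]
    return cover

# Edge order chosen so the "arbitrary" picks match the illustration below:
edges = [("b", "c"), ("e", "f"), ("d", "g"), ("a", "b"),
         ("c", "d"), ("c", "e"), ("d", "e")]
print(approx_vertex_cover(edges))   # {b, c, e, f, d, g}: size 6 ≤ 2 · 3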
Illustration of Naive Algorithm
[Figure: the input graph; edge bc is chosen, giving C = {b, c}; edge ef is chosen; the naive algorithm ends with C = {b, c, d, e, f, g}, while the optimal solution is {b, d, e}.]
12
2-Approximation
Theorem. APPROX-VERTEX-COVER is a 2-approximation
algorithm.
Pf. Let A denote the set of edges that were picked by APPROX-VERTEX-COVER.
To cover the edges in A, any vertex cover, in particular an
optimal cover C*, must include at least one endpoint of each
edge in A.
No two edges in A share an endpoint, since once an edge is
picked, all edges incident on its endpoints are removed from E′.
Thus no two edges in A are covered by the same vertex from C*,
and we have the lower bound
|C*| ≥ |A|.
On the other hand, the algorithm picks an edge only when
neither of its endpoints is already in C, so each picked edge
contributes two new vertices:
|C| = 2|A|.
Hence, |C| = 2|A| ≤ 2|C*|. ▪
13
Vertex cover: summary
No better constant-factor approximation is known!!
More precisely, minimum vertex cover is known to
be approximable within a factor of
2 − (log log |V|) / (2 log |V|)   (for a given |V| ≥ 2) (ADM85),
but cannot be approximated within 7/6 for any
sufficiently large vertex degree (Håstad, STOC 1997);
Dinur and Safra (STOC 2002) improved this lower
bound to 1.36067.
14
Vertex cover: summary
Eran Halperin, "Improved Approximation Algorithms
for the Vertex Cover Problem in Graphs and
Hypergraphs," SIAM Journal on Computing 31(5)
(2002): 1608–1623.
Tomokazu Imamura and Kazuo Iwama,
"Approximating Vertex Cover on Dense Graphs,"
Proceedings of the Sixteenth Annual ACM-SIAM
Symposium on Discrete Algorithms (SODA), 2005.
15
The Traveling Salesman Problem
Traveling Salesman Problem (TSP)
Input: undirected graph G = (V, E) with a cost c(u, v)
associated with each edge (u, v) ∈ E
Output: shortest cycle that visits each vertex exactly once
Triangle inequality: for all vertices u, v, w ∈ V,
c(u, w) ≤ c(u, v) + c(v, w).
[Figure: triangle on vertices u, v, w.]
16
2-approximation for TSP with triangle inequality
Compute MST T
An edge between any pair of points
Weight = distance between endpoints
Compute a tree-walk W of T
Each edge visited twice
Convert W into a cycle H using
shortcuts
17
Algorithm
APPROX-TSP-TOUR(G, c)
1 select a vertex r ∈ V[G] to be a "root" vertex
2 compute a minimum spanning tree T for G from root r using MST-PRIM(G, c, r)
3 let L be the list of vertices visited in a preorder tree walk of T
4 return the Hamiltonian cycle H that visits the vertices in the order L
18
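For concreteness, a runnable Python sketch of APPROX-TSP-TOUR on Euclidean points (Euclidean distance supplies the triangle inequality; Prim's algorithm and the preorder walk are inlined, and all names and sample points are my own choices):

import heapq
from collections import defaultdict

def approx_tsp_tour(points):
    # 2-approximate metric TSP: build an MST with Prim's algorithm from
    # root 0, then shortcut a preorder walk of the tree into a cycle.
    n = len(points)
    def dist(a, b):
        return ((points[a][0] - points[b][0]) ** 2 +
                (points[a][1] - points[b][1]) ** 2) ** 0.5
    children = defaultdict(list)        # the MST T as a rooted tree
    in_tree = [False] * n
    heap = [(0.0, 0, 0)]                # (edge cost, vertex, parent)
    while heap:
        _, u, parent = heapq.heappop(heap)
        if in_tree[u]:
            continue
        in_tree[u] = True
        if u != parent:
            children[parent].append(u)
        for v in range(n):
            if not in_tree[v]:
                heapq.heappush(heap, (dist(u, v), v, u))
    order, stack = [], [0]              # preorder tree walk of T
    while stack:
        u = stack.pop()
        order.append(u)
        stack.extend(reversed(children[u]))
    return order + [0]                  # close the Hamiltonian cycle H

print(approx_tsp_tour([(0, 0), (0, 2), (2, 0), (3, 3), (1, 1)]))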
Preorder Traversal
Preorder (root-left-right):
Visit the root first; then traverse the left subtree; then traverse the right subtree.
Example:
[Figure: a binary tree whose preorder visit order is A, B, C, D, E, F, G, H, I.]
19
Illustration
[Figure: the MST; a full walk W of the tree visits the vertices in the order a, b, c, b, h, b, a, d, e, f, e, g, e, d, a; the preorder walk gives the final solution H; the OPT solution is shown for comparison.]
20
2-approximation
Theorem. APPROX-TSP-TOUR is a polynomial-time 2-approximation algorithm for the traveling-salesman problem with the triangle inequality.
Pf. Let COPT be the optimal cycle.
Cost(T) ≤ Cost(COPT)
Removing an edge from COPT gives a spanning tree, and T is a spanning
tree of minimum cost
Cost(W) = 2 Cost(T)
Each edge of T is visited twice in the full walk W
Cost(H) ≤ Cost(W)
Triangle inequality: each shortcut can only decrease the cost
Hence Cost(H) ≤ 2 Cost(COPT). ▪
21
Load Balancing
Input. m identical machines; n jobs, job j has
processing time tj.
Job j must run contiguously on one machine.
A machine can process at most one job at a time.
Def. Let J(i) be the subset of jobs assigned to
machine i. The load of machine i is Li = Σj∈J(i) tj.
Def. The makespan is the maximum load on any
machine L = maxi Li.
Load balancing. Assign each job to a machine to
minimize makespan.
22
Load Balancing: List Scheduling
List-scheduling algorithm.
Consider n jobs in some fixed order.
Assign job j to machine whose load is smallest so far.
List-Scheduling(m, n, t1, t2, …, tn) {
   for i = 1 to m {
      Li ← 0                    ▹ load on machine i
      J(i) ← ∅                  ▹ jobs assigned to machine i
   }
   for j = 1 to n {
      i = argmin_k Lk           ▹ machine i has smallest load
      J(i) ← J(i) ∪ {j}         ▹ assign job j to machine i
      Li ← Li + tj              ▹ update load of machine i
   }
}
Implementation. O(n log n) using a priority queue.
23
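A runnable Python version of List-Scheduling, using the priority-queue implementation mentioned above (the function and variable names are mine):

import heapq

def list_schedule(m, times):
    # Graham's list scheduling: jobs in the given order; each job goes
    # to the machine with the smallest current load (min-heap).
    loads = [(0, i) for i in range(m)]        # (load L_i, machine i)
    heapq.heapify(loads)
    assignment = [[] for _ in range(m)]       # J(i): jobs on machine i
    for j, t in enumerate(times):
        load, i = heapq.heappop(loads)        # machine i has smallest load
        assignment[i].append(j)               # assign job j to machine i
        heapq.heappush(loads, (load + t, i))  # update load of machine i
    makespan = max(load for load, _ in loads)
    return makespan, assignment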
Load Balancing: List Scheduling Analysis
Theorem. [Graham, 1966] Greedy algorithm is a (2 − 1/m)-approximation.
First worst-case analysis of an approximation algorithm.
Need to compare resulting solution with optimal makespan L*.
Lemma 1. The optimal makespan L* ≥ maxj tj.
Pf. Some machine must process the most time-consuming job. ▪
Lemma 2. The optimal makespan L* ≥ (1/m) Σj tj.
Pf.
The total processing time is Σj tj.
One of the m machines must do at least a 1/m fraction of the total work. ▪
24
Load Balancing: List Scheduling Analysis
Theorem. Greedy algorithm is a (2 − 1/m)-approximation.
Pf. Consider load Li of bottleneck machine i.
Let j be the last job scheduled on machine i.
When job j was assigned to machine i, machine i had the smallest load.
Its load before the assignment was Li − tj, so Li − tj ≤ Lk for all 1 ≤ k ≤ m.
[Figure: machine i's schedule from time 0 to L = Li; the blue jobs scheduled before j end at Li − tj, followed by job j.]
25
Load Balancing: List Scheduling Analysis
Theorem. Greedy algorithm is a (2 − 1/m)-approximation.
Pf. Consider load Li of bottleneck machine i.
Let j be the last job scheduled on machine i.
When job j was assigned to machine i, machine i had the smallest load,
so Li − tj ≤ Lk for all 1 ≤ k ≤ m.
Sum these inequalities over all k and divide by m; since Σk Lk at that
moment counts only the jobs scheduled before j,
Li − tj ≤ (1/m) Σk Lk ≤ (1/m) (Σj′ tj′ − tj) ≤ L* − tj/m.   (Lemma 2)
Now tj ≤ L*.   (Lemma 1)
Hence Li = (Li − tj) + tj ≤ L* + (1 − 1/m) tj ≤ (2 − 1/m) L*. ▪
26
Load Balancing: List Scheduling Analysis
Q. Is our analysis tight?
A. Essentially yes. Indeed, the LS algorithm has the tight bound 2 − 1/m.
Ex: m machines, m(m − 1) jobs of length 1, followed by one job of length m.
[Figure: m = 10; each machine first receives m − 1 unit jobs, then the length-m job runs on machine 1 while machines 2–10 are idle; list scheduling makespan = 19.]
27
Load Balancing: List Scheduling Analysis
Q. Is our analysis tight?
A. Essentially yes. Indeed, the LS algorithm has the tight bound 2 − 1/m.
Ex: m machines, m(m − 1) jobs of length 1, followed by one job of length m.
[Figure: m = 10; optimal makespan = 10.]
28
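Checking the tight instance numerically with the list_schedule sketch from above:

m = 10
jobs = [1] * (m * (m - 1)) + [m]     # m(m-1) unit jobs, then one of length m
print(list_schedule(m, jobs)[0])     # 19, i.e. (2 - 1/m) * 10
print(sum(jobs) // m)                # 10, the optimal makespan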
Load Balancing on 2 Machines
Claim. Load balancing is hard even if there are only 2 machines.
Pf. NUMBER-PARTITIONING ≤P LOAD-BALANCE.
[Figure: jobs a–g with given lengths; a yes-instance of NUMBER-PARTITIONING corresponds to a schedule in which machine 1 and machine 2 both finish by time L.]
29
Load Balancing: LPT Rule
Longest processing time (LPT). Sort n jobs in descending order of
processing time, and then run list scheduling algorithm.
LPT-List-Scheduling(m, n, t1, t2, …, tn) {
   Sort jobs so that t1 ≥ t2 ≥ … ≥ tn
   for i = 1 to m {
      Li ← 0                    ▹ load on machine i
      J(i) ← ∅                  ▹ jobs assigned to machine i
   }
   for j = 1 to n {
      i = argmin_k Lk           ▹ machine i has smallest load
      J(i) ← J(i) ∪ {j}         ▹ assign job j to machine i
      Li ← Li + tj              ▹ update load of machine i
   }
}
30
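The LPT rule in the same Python setting (a two-line sketch reusing list_schedule from the earlier code):

def lpt_schedule(m, times):
    # LPT rule: sort jobs in descending order of processing time,
    # then run list scheduling on the sorted order.
    return list_schedule(m, sorted(times, reverse=True))

m = 10
print(lpt_schedule(m, [1] * (m * (m - 1)) + [m])[0])   # 10: LPT is optimal here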
Load Balancing: LPT Rule
Observation. If there are at most m jobs, then list scheduling is optimal.
Pf. Each job is put on its own machine. ▪
Lemma 3. If there are more than m jobs, L* ≥ 2 tm+1.
Pf.
Consider the first m + 1 jobs t1, …, tm+1.
Since the ti's are in descending order, each takes at least tm+1 time.
There are m + 1 jobs and m machines, so by the pigeonhole principle, at
least one machine gets two jobs. ▪
Theorem. LPT rule is a 3/2-approximation algorithm.
Pf. Same basic approach as for list scheduling. Let j be the last job
scheduled on the bottleneck machine i; as before, Li − tj ≤ L*.
By the observation, we can assume the number of jobs > m, so j ≥ m + 1
and tj ≤ tm+1 ≤ (1/2) L* by Lemma 3. Hence
Li = (Li − tj) + tj ≤ L* + (1/2) L* = (3/2) L*. ▪
31
Load Balancing: LPT Rule
Q. Is our 3/2 analysis tight?
A. No.
Theorem. [Graham, 1969] LPT rule is a (4/3 – 1/(3m))-approximation.
Pf. More sophisticated analysis of same algorithm.
Q. Is Graham's (4/3 − 1/(3m)) analysis tight?
A. Essentially yes.
Ex: m machines, n = 2m + 1 jobs: two jobs each of length m + 1, m + 2, …, 2m − 1, and three jobs of length m.
32
LPT
Proof. Jobs are indexed so that t1 ≥ t2 ≥ … ≥ tn.
If n ≤ m, LPT is already optimal (each machine processes one job).
If n > 2m, then tn ≤ L*/3; the analysis is similar to that of the LS algorithm.
Otherwise, suppose there are in total 2m − h jobs, with 0 ≤ h < m.
Check that LPT already gives an optimal solution.
[Figure: an LPT schedule of jobs 1, …, h, h+1, h+2, …, n−1, n over time.]
33
Approximation Scheme
Some NP-hard problems allow polynomial-time
approximation algorithms that can achieve
increasingly smaller approximation ratios by
using more and more computation time:
a tradeoff between computation time and the
quality of the approximation.
An approximation scheme for an optimization
problem takes a value ε > 0 as an additional input
and, for any fixed ε > 0, is a (1 + ε)-approximation
algorithm.
34
PTAS and FPTAS
We say that an approximation scheme is a
polynomial-time approximation scheme (PTAS) if,
for any fixed ε > 0, the scheme runs in time
polynomial in the size n of its input instance.
Example: O(n^(2/ε)).
An approximation scheme is a fully polynomial-time
approximation scheme (FPTAS) if it is an
approximation scheme and its running time is
polynomial both in 1/ε and in the size n of the input
instance.
Example: O((1/ε)^2 · n^3).
35
The Subset Sum
Input. A pair (S, t), where S is a set {x1, x2, ...,
xn} of positive integers and t is a positive
integer
Output. A subset S′ of S.
Goal. Maximize the sum of the elements of S′,
subject to its value not exceeding t.
36
An exponential-time exact algorithm
If L is a list of positive integers and x is
another positive integer,
then we let L + x denote the list of integers
derived from L by increasing each element of
L by x.
For example, if L = 〈1, 2, 3, 5, 9〉, then L +
2 = 〈3, 4, 5, 7, 11〉.
We also use this notation for sets, so that
S + x = {s + x : s ∈ S}.
37
Exact algorithm
MERGE-LISTS(L, L′): returns the sorted list that is
the merge of its two sorted input lists L and L′ with
duplicate values removed.
EXACT-SUBSET-SUM(S, t)
1 n ← |S|
2 L0 ← 〈0〉
3 for i ← 1 to n
4 do Li ← MERGE-LISTS(Li-1, Li-1 + xi)
5 remove from Li every element that is greater than t
6 return the largest element in Ln
38
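A compact runnable Python version of EXACT-SUBSET-SUM (sorted duplicate-free Python lists stand in for MERGE-LISTS; the function name and small sample instance are my choices):

def exact_subset_sum(S, t):
    # L holds every achievable subset sum of the elements seen so far,
    # sorted and duplicate-free, with sums exceeding t removed.
    L = [0]                                           # L0 = <0>
    for x in S:
        merged = sorted(set(L) | {y + x for y in L})  # merge L and L + x
        L = [y for y in merged if y <= t]             # drop elements > t
    return max(L)

print(exact_subset_sum([104, 102, 201, 101], 308))    # 307 = 104 + 102 + 101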
Example
For example, if S = {1, 4, 5}, then
P1 ={0, 1} ,
P2 ={0, 1, 4, 5} ,
P3 ={0, 1, 4, 5, 6, 9, 10} .
The lists satisfy the identity
Pi = Pi−1 ∪ (Pi−1 + xi).
Since the length of Li can be as much as 2^i, this is an
exponential-time algorithm.
39
The Subset-sum problem: FPTAS
Trimming or rounding: if two values in L are
close to each other, then for the purpose of
finding an approximate solution there is no
reason to maintain both of them explicitly.
Let δ be such that 0 < δ < 1.
L′ is the result of trimming L if, for every element
y that was removed from L, there is an element z
still in L′ that approximates y, that is,
y / (1 + δ) ≤ z ≤ y.
40
Example
For example, if δ = 0.1 and
L = 〈10, 11, 12, 15, 20, 21, 22, 23, 24, 29〉,
then we can trim L to obtain
L′ = 〈10, 12, 15, 20, 23, 29〉.
TRIM(L, δ)
1 m ← |L|
2 L′ ← 〈y1〉
3 last ← y1
4 for i ← 2 to m
5    do if yi > last · (1 + δ)    ▹ yi ≥ last because L is sorted
6       then append yi onto the end of L′
7            last ← yi
8 return L′
41
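The same trimming step as runnable Python (names are mine); on the example above it reproduces L′:

def trim(L, delta):
    # Keep an element of the sorted list L only if it exceeds the last
    # kept element by more than a factor (1 + delta); every removed y
    # then has a kept z with y / (1 + delta) <= z <= y.
    trimmed = [L[0]]
    last = L[0]
    for y in L[1:]:                    # y >= last because L is sorted
        if y > last * (1 + delta):
            trimmed.append(y)
            last = y
    return trimmed

print(trim([10, 11, 12, 15, 20, 21, 22, 23, 24, 29], 0.1))
# -> [10, 12, 15, 20, 23, 29], matching L' above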
(1 + ε)-Approximation algorithm
APPROX-SUBSET-SUM(S, t, ε)
1 n ← |S|
2 L0 ← 〈0〉
3 for i ← 1 to n
4    do Li ← MERGE-LISTS(Li−1, Li−1 + xi)
5       Li ← TRIM(Li, ε/2n)
6       remove from Li every element that is greater than t
7 let z* be the largest value in Ln
8 return z*
42
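Putting the pieces together, a runnable Python sketch of APPROX-SUBSET-SUM (reusing trim from above; the value of eps in the sample call is an arbitrary choice of mine):

def approx_subset_sum(S, t, eps):
    # EXACT-SUBSET-SUM plus a TRIM(L, eps/2n) step after each merge,
    # which keeps every intermediate list polynomially short.
    n = len(S)
    L = [0]
    for x in S:
        merged = sorted(set(L) | {y + x for y in L})          # MERGE-LISTS
        L = [y for y in trim(merged, eps / (2 * n)) if y <= t]
    return max(L)                                             # z*

# Returns a value within a factor (1 + eps) of the true optimum 307:
print(approx_subset_sum([104, 102, 201, 101], 308, 0.4))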
FPTAS
Theorem. APPROX-SUBSET-SUM is a fully
polynomial-time approximation scheme for the
subset-sum problem.
Pf. The operations of trimming Li in line 5 and
removing from Li every element that is greater
than t maintain the property that every element
of Li is also a member of Pi. Therefore, the
value z* returned in line 8 is indeed the sum of
some subset of S.
43
Pf. (cont.)
Let y* ∈ Pn denote an optimal solution to
the subset-sum problem. We know that z* ≤ y*.
We need to show that y*/z* ≤ 1 + ε.
By induction on i, it can be shown that for
every element y in Pi that is at most t, there is a
z ∈ Li such that
y / (1 + ε/2n)^i ≤ z ≤ y.
Thus, there is a z ∈ Ln such that
y* / (1 + ε/2n)^n ≤ z ≤ y*.
44
Pf. (cont.)
Since z* is the largest value in Ln and there is a
z ∈ Ln with y* / (1 + ε/2n)^n ≤ z ≤ y*, we get
y*/z* ≤ (1 + ε/2n)^n.
Hence,
(1 + ε/2n)^n ≤ e^(ε/2) ≤ 1 + ε/2 + O(ε^2) ≤ 1 + ε.
45
Pf. (cont.)
To show the running time is fully polynomial, we need to bound |Li|.
After trimming, successive elements z and z′ of
Li must have the relationship z′/z > 1 + ε/2n.
Each list, therefore, contains the value 0,
possibly the value 1, and up to ⌊log_(1+ε/2n) t⌋
additional values.
Now, using ln(1 + x) ≥ x/(1 + x),
log_(1+ε/2n) t = ln t / ln(1 + ε/2n) ≤ 2n(1 + ε/2n) ln t / ε ≤ 4n ln t / ε,
which is polynomial in n, 1/ε, and the size of the input. ▪
46
Vertex Cover: Greedy Algorithm 1
k! vertices of degree k:
not a constant-factor approximation algorithm!
Generalizing the example:
k!/k vertices of degree k
k!/(k−1) vertices of degree k−1
…
k! vertices of degree 1
OPT = k!: all top vertices.
SOL = k! (1/k + 1/(k−1) + 1/(k−2) + … + 1) ≈ k! log(k): all bottom vertices.
48
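A quick numeric check of this gap in Python (k = 8 is an arbitrary choice of mine): the greedy solution exceeds OPT by the harmonic number H_k, which grows like log k.

import math

k = 8
opt = math.factorial(k)                                      # all k! top vertices
sol = sum(math.factorial(k) // i for i in range(1, k + 1))   # all bottom vertices
print(sol / opt)     # H_k ≈ 2.718 for k = 8
print(math.log(k))   # ln 8 ≈ 2.079: the ratio grows like log k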