Final exam review

The final exam will have a format and requirements similar to the mid-term exam:
• Closed book, no computer, no smartphone
• A calculator is OK
Final exam questions are drawn from:
• Questions in Homework 2 and Programming Assignment 2
• Content listed in the following slides
String Similarity
How similar are two strings? Example: ocurrance vs. occurrence.

o c u r r a n c e -
o c c u r r e n c e
6 mismatches, 1 gap

o c - u r r a n c e
o c c u r r e n c e
1 mismatch, 1 gap

o c - u r r - a n c e
o c c u r r e - n c e
0 mismatches, 3 gaps
Edit Distance
Applications.
Basis for Unix diff.
Speech recognition.
Computational biology.

Edit distance. [Levenshtein 1966, Needleman-Wunsch 1970]
Gap penalty $\delta$; mismatch penalty $\alpha_{pq}$.
Cost = sum of gap and mismatch penalties.

C T G A C C T A C C T
C C T G A C T A C A T
Cost: $\alpha_{TC} + \alpha_{GT} + \alpha_{AG} + 2\alpha_{CA}$

- C T G A C C T A C C T
C C T G A C - T A C A T
Cost: $2\delta + \alpha_{CA}$
Sequence Alignment
Goal: Given two strings X = x1 x2 . . . xm and Y = y1 y2 . . . yn, find an
alignment of minimum cost.
Def. An alignment M is a set of ordered pairs xi-yj such that each item
occurs in at most one pair and there are no crossings.
Def. The pairs xi-yj and xi'-yj' cross if i < i', but j > j'.
$$\text{cost}(M) = \underbrace{\sum_{(x_i, y_j) \in M} \alpha_{x_i y_j}}_{\text{mismatch}} + \underbrace{\sum_{i \,:\, x_i \text{ unmatched}} \delta \; + \sum_{j \,:\, y_j \text{ unmatched}} \delta}_{\text{gap}}$$
Ex: CTACCG vs. TACATG.
Sol: M = x2-y1, x3-y2, x4-y3, x5-y4, x6-y6.

x1 x2 x3 x4 x5    x6
C  T  A  C  C  -  G
-  T  A  C  A  T  G
   y1 y2 y3 y4 y5 y6
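The slides state the alignment problem but not how to solve it. As a minimal sketch, the standard dynamic program over prefixes can compute the minimum cost. The function name `alignment_cost`, the helper `unit_mismatch`, and the penalty values (gap penalty 1, mismatch penalty 1) are assumptions chosen for illustration, not values given in the slides.

```python
def alignment_cost(x, y, delta, alpha):
    """Minimum alignment cost of strings x and y.

    delta: gap penalty; alpha(p, q): mismatch penalty (assumed 0 when p == q).
    """
    m, n = len(x), len(y)
    # opt[i][j] = minimum cost of aligning x[:i] with y[:j]
    opt = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        opt[i][0] = i * delta          # x[:i] aligned against nothing: i gaps
    for j in range(1, n + 1):
        opt[0][j] = j * delta          # y[:j] aligned against nothing: j gaps
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            opt[i][j] = min(
                alpha(x[i - 1], y[j - 1]) + opt[i - 1][j - 1],  # match x_i with y_j
                delta + opt[i - 1][j],                          # x_i unmatched
                delta + opt[i][j - 1],                          # y_j unmatched
            )
    return opt[m][n]


def unit_mismatch(p, q):
    # Hypothetical penalty: every mismatch costs 1.
    return 0 if p == q else 1


cost_dna = alignment_cost("CTACCG", "TACATG", delta=1, alpha=unit_mismatch)
cost_words = alignment_cost("ocurrance", "occurrence", delta=1, alpha=unit_mismatch)
```

Under these unit penalties, the alignment in the example above (two gaps and one mismatch) costs 3, which is what the sketch should report for `cost_dna`.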
Cuts
Def. An s-t cut is a partition (A, B) of V with $s \in A$ and $t \in B$.
Def. The capacity of a cut (A, B) is:
$$\text{cap}(A, B) = \sum_{e \text{ out of } A} c(e)$$
[Figure: example flow network with source s, sink t, edge capacities, and the cut set A marked]
Capacity = 10 + 5 + 15 = 30
Cuts
[Figure: example flow network with a second s-t cut (A, B) marked]
Capacity = 9 + 15 + 8 + 30 = 62
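The cut-capacity definition can be checked mechanically. The sketch below sums the capacities of edges that leave A and ignores edges that enter A. The network in `capacity` is hypothetical: only the three edges out of s reuse the capacities 10, 5, 15 from the first example; the figure's full edge list is not recoverable from the transcript.

```python
def cut_capacity(capacity, A):
    """Capacity of the cut (A, V - A).

    capacity: dict mapping directed edge (u, v) to c(e).
    Only edges with u in A and v outside A count.
    """
    return sum(c for (u, v), c in capacity.items() if u in A and v not in A)


# Hypothetical network, for illustration only.
capacity = {
    ("s", "a"): 10, ("s", "b"): 5, ("s", "c"): 15,
    ("a", "b"): 4,  ("a", "t"): 9,
    ("b", "t"): 8,  ("c", "t"): 10,
}

cap_s = cut_capacity(capacity, {"s"})        # 10 + 5 + 15
cap_sb = cut_capacity(capacity, {"s", "b"})  # 10 + 15 + 8; edge a->b enters A and is ignored
```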
Residual Graph
Original edge: $e = (u, v) \in E$.
Flow f(e), capacity c(e).
[Figure: edge from u to v with capacity 17 and flow 6]

Residual edge.
"Undo" flow sent.
$e = (u, v)$ and $e^R = (v, u)$.
Residual capacity:
$$c_f(e) = \begin{cases} c(e) - f(e) & \text{if } e \in E \\ f(e) & \text{if } e^R \in E \end{cases}$$
[Figure: residual edge from u to v with residual capacity 11, and residual edge from v to u with residual capacity 6]

Residual graph: $G_f = (V, E_f)$.
Residual edges with positive residual capacity.
$E_f = \{e : f(e) < c(e)\} \cup \{e^R : f(e) > 0\}$.
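A small sketch of the residual-graph definition follows. The function name `residual_graph` and the dictionary representation are assumptions for illustration. The example reproduces the edge from the slide with capacity 17 and flow 6.

```python
def residual_graph(capacity, flow):
    """Build residual capacities from capacities and a flow.

    capacity, flow: dicts mapping directed edge (u, v) to a number.
    Returns a dict containing only residual edges with positive capacity.
    """
    residual = {}
    for (u, v), c in capacity.items():
        f = flow.get((u, v), 0)
        if f < c:   # forward residual edge: room to push c(e) - f(e) more
            residual[(u, v)] = residual.get((u, v), 0) + (c - f)
        if f > 0:   # reverse residual edge: can undo up to f(e)
            residual[(v, u)] = residual.get((v, u), 0) + f
    return residual


example = residual_graph({("u", "v"): 17}, {("u", "v"): 6})
```

For the slide's edge, `example` should hold a forward residual capacity of 11 and a reverse residual capacity of 6.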
Ford-Fulkerson Algorithm
[Figure: flow network G with source s, sink t, and edge capacities]
Augmenting Path Algorithm
Augment(f, c, P) {
   b ← bottleneck(P)
   foreach e ∈ P {
      if (e ∈ E) f(e) ← f(e) + b        (forward edge)
      else       f(eR) ← f(eR) - b      (reverse edge)
   }
   return f
}

Ford-Fulkerson(G, s, t, c) {
   foreach e ∈ E f(e) ← 0
   Gf ← residual graph
   while (there exists augmenting path P) {
      f ← Augment(f, c, P)
      update Gf
   }
   return f
}
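The pseudocode above can be sketched in runnable form as follows. This version is an illustration under several assumptions: capacities are non-negative integers, s and t are distinct, augmenting paths are found by depth-first search (the slides do not prescribe a search order), and the function returns only the flow value rather than the flow f itself. The names `max_flow` and `find_path` and both test networks are hypothetical.

```python
def max_flow(capacity, s, t):
    """Ford-Fulkerson on a dict {(u, v): c} of integer capacities; returns the flow value."""
    residual = {}   # residual capacity of every forward and reverse edge
    adj = {}        # neighbors in the residual graph
    for (u, v), c in capacity.items():
        residual[(u, v)] = residual.get((u, v), 0) + c
        residual.setdefault((v, u), 0)   # reverse edge starts with no flow to undo
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def find_path(u, visited):
        # Depth-first search for a path u -> t using edges with positive residual capacity.
        if u == t:
            return []
        visited.add(u)
        for v in adj.get(u, ()):
            if v not in visited and residual[(u, v)] > 0:
                rest = find_path(v, visited)
                if rest is not None:
                    return [(u, v)] + rest
        return None

    value = 0
    while True:
        path = find_path(s, set())
        if path is None:           # no augmenting path left
            return value
        b = min(residual[e] for e in path)   # bottleneck(P)
        for (u, v) in path:                  # Augment: push b along P
            residual[(u, v)] -= b
            residual[(v, u)] += b
        value += b


# Hypothetical test networks.
net1 = {("s", "a"): 10, ("s", "b"): 5, ("a", "b"): 15, ("a", "t"): 5, ("b", "t"): 10}
net2 = {("s", "a"): 1, ("s", "b"): 1, ("a", "b"): 1, ("a", "t"): 1, ("b", "t"): 1}
flow1 = max_flow(net1, "s", "t")
flow2 = max_flow(net2, "s", "t")
```

Updating `residual` in place plays the role of "update Gf": decreasing the forward entry and increasing the reverse entry is equivalent to changing f(e) and rebuilding the residual graph.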
Certifiers and Certificates: 3-Satisfiability (3-SAT)
SAT. Given a CNF formula $\Phi$, is there a satisfying assignment?
Certificate. An assignment of truth values to the n boolean variables.
Certifier. Check that each clause in $\Phi$ has at least one true literal.
Ex. Instance s:
$$(\overline{x_1} \lor x_2 \lor x_3) \land (x_1 \lor \overline{x_2} \lor x_3) \land (x_1 \lor x_2 \lor x_4) \land (\overline{x_1} \lor \overline{x_3} \lor \overline{x_4})$$
Certificate t:
$$x_1 = 1,\; x_2 = 1,\; x_3 = 0,\; x_4 = 1$$
Conclusion. SAT is in NP.
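The certifier amounts to one pass over the clauses, which is why it runs in polynomial time. The sketch below assumes an integer encoding of literals (k for x_k, -k for its negation), which is a common convention and not something the slides specify. The clause list `phi` encodes a four-clause instance over x1..x4; its negation pattern is an assumption, since overbars are easily lost in transcription.

```python
def certifies(clauses, assignment):
    """Return True if every clause has at least one true literal.

    clauses: list of lists of nonzero ints (k means x_k, -k means NOT x_k).
    assignment: dict mapping variable index k to True/False.
    """
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


# Assumed instance and the certificate x1 = 1, x2 = 1, x3 = 0, x4 = 1.
phi = [[-1, 2, 3], [1, -2, 3], [1, 2, 4], [-1, -3, -4]]
t = {1: True, 2: True, 3: False, 4: True}
accepted = certifies(phi, t)

# Setting every variable to True falsifies the last clause.
rejected = certifies(phi, {1: True, 2: True, 3: True, 4: True})
```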
Subset Sum
SUBSET-SUM. Given natural numbers w1, …, wn and an integer W, is
there a subset that adds up to exactly W?
Ex: { 1, 4, 16, 64, 256, 1040, 1041, 1093, 1284, 1344 }, W = 3754.
Yes. 1 + 16 + 64 + 256 + 1040 + 1093 + 1284 = 3754.
Remark. With arithmetic problems, input integers are encoded in
binary. Polynomial reduction must be polynomial in binary encoding.
Claim. 3-SAT $\le_P$ SUBSET-SUM.
Pf. Given an instance $\Phi$ of 3-SAT, we construct an instance of SUBSET-SUM that has a solution iff $\Phi$ is satisfiable.
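The Remark about binary encoding is easier to appreciate next to the usual dynamic program for SUBSET-SUM. The sketch below (the function name `find_subset` is an assumption) runs in time proportional to n times W, which is polynomial in the value of W but exponential in the number of bits needed to write W. It is therefore not a polynomial-time algorithm in the sense the Remark requires.

```python
def find_subset(weights, W):
    """Return a list of distinct input items summing to exactly W, or None.

    Assumes positive integer weights. Time O(n * W): pseudo-polynomial.
    """
    # reach[s] = (previous sum, weight added) for some way to reach sum s
    reach = {0: None}
    for w in weights:
        for s in list(reach):            # snapshot: each weight is used at most once
            if s + w <= W and s + w not in reach:
                reach[s + w] = (s, w)
    if W not in reach:
        return None
    subset = []
    s = W
    while reach[s] is not None:          # walk back from W to 0
        s, w = reach[s]
        subset.append(w)
    return subset


items = [1, 4, 16, 64, 256, 1040, 1041, 1093, 1284, 1344]
chosen = find_subset(items, 3754)
```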
Subset Sum
Construction. Given 3-SAT instance $\Phi$ with n variables and k clauses,
form 2n + 2k decimal integers, each of n+k digits, as illustrated below.
Claim. $\Phi$ is satisfiable iff there exists a subset that sums to W.
Pf. No carries possible.
$$C_1 = \overline{x} \lor y \lor z, \qquad C_2 = x \lor \overline{y} \lor z, \qquad C_3 = \overline{x} \lor \overline{y} \lor \overline{z}$$

         x   y   z   C1  C2  C3   value
x        1   0   0   0   1   0    100,010
¬x       1   0   0   1   0   1    100,101
y        0   1   0   1   0   0    10,100
¬y       0   1   0   0   1   1    10,011
z        0   0   1   1   1   0    1,110
¬z       0   0   1   0   0   1    1,001
dummy    0   0   0   1   0   0    100
dummy    0   0   0   2   0   0    200
dummy    0   0   0   0   1   0    10
dummy    0   0   0   0   2   0    20
dummy    0   0   0   0   0   1    1
dummy    0   0   0   0   0   2    2
W        1   1   1   4   4   4    111,444

The dummy rows are there to get the clause columns to sum to 4.
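The construction can be sketched in a few lines and checked against the table on this slide. The function name `reduce_3sat_to_subset_sum` and the integer literal encoding (k for variable k, -k for its negation, with x = 1, y = 2, z = 3) are assumptions made for this illustration.

```python
def reduce_3sat_to_subset_sum(n, clauses):
    """Build the SUBSET-SUM instance for a 3-SAT formula.

    n: number of variables, numbered 1..n.
    clauses: list of tuples of nonzero ints (k = x_k, -k = NOT x_k).
    Returns (numbers, W), each number having n + k decimal digits.
    """
    k = len(clauses)
    numbers = []
    for var in range(1, n + 1):
        for lit in (var, -var):              # one number per literal
            digits = [0] * (n + k)
            digits[var - 1] = 1              # variable column
            for c, clause in enumerate(clauses):
                if lit in clause:
                    digits[n + c] = 1        # clause columns where the literal occurs
            numbers.append(int("".join(map(str, digits))))
    for c in range(k):
        for d in (1, 2):                     # two dummies per clause
            digits = [0] * (n + k)
            digits[n + c] = d
            numbers.append(int("".join(map(str, digits))))
    W = int("1" * n + "4" * k)
    return numbers, W


clauses = [(-1, 2, 3), (1, -2, 3), (-1, -2, -3)]   # C1, C2, C3 with x=1, y=2, z=3
numbers, W = reduce_3sat_to_subset_sum(3, clauses)

# The assignment x = 0, y = 0, z = 1 satisfies all three clauses with two true
# literals each, so the dummy worth 2 tops up every clause column to 4.
picked = [100101, 10011, 1110, 200, 20, 2]
```

Each variable column forces exactly one of the two literal rows to be chosen, and each clause column can reach 4 only if at least one chosen literal row contributes to it, since the dummies add at most 3.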
Weighted Vertex Cover
Definition. Given a graph G = (V, E), a vertex cover is a set $S \subseteq V$ such
that each edge in E has at least one end in S.
Weighted vertex cover. Given a graph G with vertex weights, find a
vertex cover of minimum weight. (NP-hard problem.)
Setting every node weight to 1 reduces the problem to the standard vertex
cover problem.
[Figure: two vertex covers of an example graph with vertex weights 2, 4, 2, 9]
weight = 2 + 2 + 4
weight = 11
Pricing Method
Pricing method. Set prices and find vertex cover simultaneously.
A node i is tight when the prices of its incident edges add up to its weight: $\sum_{e = (i, j)} p_e = w_i$.
Weighted-Vertex-Cover-Approx(G, w) {
   foreach e in E
      pe = 0
   while (∃ edge e = (i, j) such that neither i nor j is tight) {
      select such an edge e
      increase pe as much as possible until i or j is tight
   }
   S ← set of all tight nodes
   return S
}
Why is S a vertex cover? (Prove by contradiction.)
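A runnable sketch of the pricing method is given below. It relies on one observation not spelled out in the pseudocode: once an edge has been processed, at least one of its endpoints is tight and stays tight, so a single pass over the edges is enough. Integer weights are assumed so that the tightness test can use exact equality. The graph, its weights, and the names `pricing_vertex_cover`, `paid`, and `price` are hypothetical.

```python
def pricing_vertex_cover(weights, edges):
    """Pricing method for weighted vertex cover.

    weights: dict {node: integer weight}; edges: list of (i, j).
    Returns (cover, price), where price maps each edge to p_e.
    """
    paid = {v: 0 for v in weights}   # sum of prices on edges incident to each node
    price = {}
    for (i, j) in edges:
        if paid[i] == weights[i] or paid[j] == weights[j]:
            price[(i, j)] = 0        # an endpoint is already tight: leave p_e at 0
            continue
        # Raise p_e until the first endpoint becomes tight.
        p = min(weights[i] - paid[i], weights[j] - paid[j])
        price[(i, j)] = p
        paid[i] += p
        paid[j] += p
    cover = {v for v in weights if paid[v] == weights[v]}
    return cover, price


# Hypothetical four-node example.
weights = {"a": 4, "b": 3, "c": 5, "d": 3}
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("c", "d")]
cover, price = pricing_vertex_cover(weights, edges)
cover_weight = sum(weights[v] for v in cover)
total_price = sum(price.values())
```

On this graph the method returns {a, b, d} with weight 10, while {a, d} is a cover of weight 7, so the result is not optimal. It is still within a factor of 2 of the total price paid, which is the usual approximation argument for this method.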
Approximation method: Pricing Method
Pricing method. Each edge must be covered by some vertex.
Edge e = (i, j) pays price $p_e \ge 0$ to use vertex i and j.
Fairness. Edges incident to vertex i should pay $\le w_i$ in total:
$$\text{for each vertex } i: \quad \sum_{e = (i, j)} p_e \le w_i$$
[Figure: example graph with vertex weights 2, 4, 2, 9]

Lemma. For any vertex cover S and any fair prices $p_e$: $\sum_e p_e \le w(S)$.
Pf.
$$\sum_{e \in E} p_e \;\le\; \sum_{i \in S} \sum_{e = (i, j)} p_e \;\le\; \sum_{i \in S} w_i \;=\; w(S). \qquad ▪$$
First inequality: each edge e is covered by at least one node in S.
Second inequality: sum the fairness inequalities for each node in S.
Pricing Method
[Figure 11.8: example run of the pricing method; labels mark the vertex weights and the price of edge a-b]
The example shows that the pricing method does not always produce an optimal
weighted vertex cover.
Weighted Vertex Cover: IP Formulation
Weighted vertex cover. Given an undirected graph G = (V, E) with
vertex weights $w_i \ge 0$, find a minimum weight subset of nodes S such
that every edge is incident to at least one vertex in S.
Integer programming formulation.
Model inclusion of each vertex i using a 0/1 variable $x_i$.
$$x_i = \begin{cases} 0 & \text{if vertex } i \text{ is not in vertex cover} \\ 1 & \text{if vertex } i \text{ is in vertex cover} \end{cases}$$
Vertex covers in 1-1 correspondence with 0/1 assignments:
$S = \{i \in V : x_i = 1\}$
Objective function: minimize $\sum_i w_i x_i$.
Constraints: for every edge (i, j), must take either i or j: $x_i + x_j \ge 1$.
Weighted Vertex Cover: IP Formulation
Weighted vertex cover. Integer programming formulation.
$$\begin{aligned} (ILP) \quad \min \; & \sum_{i \in V} w_i x_i \\ \text{s.t.} \; & x_i + x_j \ge 1 && (i, j) \in E \\ & x_i \in \{0, 1\} && i \in V \end{aligned}$$
Task: Show the concrete ILP equation set for an example graph.
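For the task above, a short sketch can write out the concrete ILP for a small graph and, because the graph is tiny, solve it by trying all 0/1 assignments. The four-node graph and the names `ilp_text` and `solve_by_enumeration` are hypothetical; enumeration takes exponential time and stands in for an ILP solver only for illustration.

```python
from itertools import product


def ilp_text(weights, edges):
    """Return the ILP for weighted vertex cover as plain text."""
    nodes = sorted(weights)
    lines = ["min  " + " + ".join(f"{weights[v]} x_{v}" for v in nodes)]
    lines += [f"s.t. x_{i} + x_{j} >= 1" for (i, j) in edges]   # one constraint per edge
    lines += [f"     x_{v} in {{0, 1}}" for v in nodes]         # integrality per vertex
    return "\n".join(lines)


def solve_by_enumeration(weights, edges):
    """Try every 0/1 assignment; return (cost, x) for a cheapest feasible one."""
    nodes = sorted(weights)
    best = None
    for bits in product((0, 1), repeat=len(nodes)):
        x = dict(zip(nodes, bits))
        if all(x[i] + x[j] >= 1 for (i, j) in edges):           # feasibility check
            cost = sum(weights[v] * x[v] for v in nodes)
            if best is None or cost < best[0]:
                best = (cost, x)
    return best


# Hypothetical example graph.
weights = {"a": 4, "b": 3, "c": 5, "d": 3}
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("c", "d")]
formulation = ilp_text(weights, edges)
best_cost, best_x = solve_by_enumeration(weights, edges)
```

Printing `formulation` gives one objective line, one constraint line per edge, and one integrality line per vertex, which is the shape the task asks for.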
Weighted Vertex Cover
[Figure: example graph on vertices A through J with vertex weights]
total weight = 55