Phase Transitions in Proof Complexity and Satisfiability Search
Paul Beame, University of Washington
with Dimitris Achlioptas (Microsoft Research) and Michael Molloy (U. Toronto)
1
Satisfiability
$F = (x_1 \lor x_2 \lor x_4) \land (x_1 \lor x_3) \land (\lnot x_3 \lor x_2) \land (x_4 \lor \lnot x_3)$
satisfying assignment for $F$: $x_1, x_2, x_3, x_4$
Given $F$, does such an assignment exist?
2
Satisfiability Algorithms
• Incomplete Algorithms
will (likely) find a satisfying assignment but
will simply give up if one is not found
• Complete Algorithms
will either find a satisfying assignment or
determine that no such assignment exists
3
Satisfiability Algorithms
• Incomplete Algorithms
Local search: GSAT [Selman, Levesque, Mitchell 92], Walksat [Kautz, Selman 96]
Belief Propagation: SP [Braunstein, Mezard, Zecchina 02]
• Complete Algorithms
Backtracking search: DPLL [Davis, Putnam 60], [Davis, Logemann, Loveland 62]
DPLL + “clause learning”: GRASP, SATO, zchaff
4
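To make the notion of an incomplete algorithm concrete, here is a minimal sketch of a random-walk local search in the spirit of Walksat. It is a simplified illustration rather than the published algorithm: the function name `walksat`, the parameters `max_flips` and `p`, and the integer encoding of literals (3 for x3, -3 for its negation) are all assumptions made for this example.

```python
import random

def walksat(formula, n, max_flips=10_000, p=0.5, rng=None):
    """Search for a satisfying assignment; return it as a dict, or None on giving up.

    formula: list of non-empty clauses, each a tuple of nonzero ints.
    n: number of variables, named 1..n.
    """
    rng = rng or random.Random()
    assign = {v: rng.random() < 0.5 for v in range(1, n + 1)}

    def satisfied(clause):
        return any(assign[abs(lit)] == (lit > 0) for lit in clause)

    def unsat_count_if_flipped(var):
        assign[var] = not assign[var]
        count = sum(not satisfied(c) for c in formula)
        assign[var] = not assign[var]          # undo the trial flip
        return count

    for _ in range(max_flips):
        unsat = [c for c in formula if not satisfied(c)]
        if not unsat:
            return dict(assign)                # every clause is satisfied
        clause = rng.choice(unsat)             # focus on one violated clause
        if rng.random() < p:
            var = abs(rng.choice(clause))      # random-walk move
        else:
            # greedy move: flip the variable leaving the fewest violated clauses
            var = min((abs(lit) for lit in clause), key=unsat_count_if_flipped)
        assign[var] = not assign[var]
    return None                                # gives up; this proves nothing
```

A `None` result only means the flip budget ran out. Unlike a complete algorithm, the search cannot certify that no satisfying assignment exists.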
Simplification and Satisfaction
$F = (x_1 \lor x_2 \lor x_4) \land (x_1 \lor x_3) \land (\lnot x_3 \lor x_2) \land (x_4 \lor \lnot x_3)$
satisfying assignment for $F$: $x_1, x_2, x_3, x_4$
• Simplifying $F$ after setting literal $x_3$ to true
$F|_{x_3} = (x_1 \lor x_2 \lor x_4) \land (x_2) \land (x_4)$, where $(x_2)$ and $(x_4)$ are 1-clauses
• $F$ is satisfied if all clauses disappear under simplification given the assignment
5
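The simplification step can be sketched in a few lines of Python. The function name `simplify` and the integer encoding of literals (3 for x3, -3 for its negation) are assumptions for illustration. The signs in the example formula are also an assumption, chosen so that setting x3 to true leaves the 1-clauses (x2) and (x4) shown on the slide.

```python
def simplify(formula, literal):
    """Return F|literal: the formula after setting `literal` to true."""
    result = []
    for clause in formula:
        if literal in clause:
            continue                                   # satisfied clause disappears
        # the opposite literal is now false, so delete it from the clause
        result.append(tuple(l for l in clause if l != -literal))
    return result

# Assumed signs: (x1 v x2 v x4) (x1 v x3) (-x3 v x2) (x4 v -x3)
F = [(1, 2, 4), (1, 3), (-3, 2), (4, -3)]

print(simplify(F, 3))        # [(1, 2, 4), (2,), (4,)]

# Applying a whole assignment: F is satisfied when no clauses remain.
residual = F
for lit in (1, 2, 3, 4):
    residual = simplify(residual, lit)
print(residual)              # []
```

An empty tuple in the result would represent an empty clause, meaning the partial assignment has falsified some clause.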
Backtracking search/DPLL
DPLL(F)
    while F contains a 1-clause l’
        F ← F|l’
    if F has no clauses
        output ‘satisfiable’ and halt
    if F has an empty clause
        backtrack
    else
        select a literal l = some x or ¬x
        DPLL(F|l)
        if backtrack then DPLL(F|¬l)
The formula F at each call is the residual formula.
6
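The backtracking procedure above can be written as a short recursive Python function. This is a sketch under stated assumptions, not a production solver: literals are nonzero integers, `simplify` and `dpll` are names chosen here, and the default select rule (first literal of the first clause) is a placeholder for rules such as UC or GUC.

```python
def simplify(formula, literal):
    """Return F|literal: drop satisfied clauses, delete the falsified literal."""
    return [tuple(l for l in c if l != -literal) for c in formula if literal not in c]

def dpll(formula, select=lambda f: f[0][0]):
    """Return a list of true literals satisfying `formula`, or None if unsatisfiable."""
    forced = []
    while True:                                   # unit propagation on 1-clauses
        unit = next((c[0] for c in formula if len(c) == 1), None)
        if unit is None:
            break
        formula = simplify(formula, unit)
        forced.append(unit)
    if not formula:
        return forced                             # no clauses left: satisfiable
    if any(len(c) == 0 for c in formula):
        return None                               # empty clause: backtrack
    lit = select(formula)                         # branch on l, then on not-l
    for choice in (lit, -lit):
        rest = dpll(simplify(formula, choice), select)
        if rest is not None:
            return forced + [choice] + rest
    return None

print(dpll([(1, 2, 4), (1, 3), (-3, 2), (4, -3)]))     # a satisfying set of literals
print(dpll([(1, 2), (-1, 2), (1, -2), (-1, -2)]))      # None
```

The returned list may leave some variables unset. Any value for those variables keeps the formula satisfied, since every clause already contains a true literal.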
Some standard select choices for DPLL algorithms
• UC: Unit Clause/Ordered DLL
Choose variables in a fixed order
Always set True first
• UCwm: Unit Clause with majority
Choose variables in a fixed order
Apply a majority vote among 3-clauses for assigning each value
• GUC: Generalized Unit Clause
Choose a variable v in a shortest clause C
Set v to satisfy C
7
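The three select rules can be sketched as functions that take a residual formula (with no 1-clauses and no empty clause) and return the literal to set true first. The function names are hypothetical. Two details are assumptions where the slide is silent: "fixed order" is taken to mean lowest variable index, ties in the majority vote go to True, and GUC takes the first literal of the first shortest clause where analyses usually pick at random.

```python
def select_uc(formula):
    """UC: next variable in a fixed order (lowest index here), set True first."""
    return min(abs(l) for c in formula for l in c)

def select_ucwm(formula):
    """UCwm: same variable as UC, value chosen by majority vote among 3-clauses."""
    v = select_uc(formula)
    votes = sum(1 if l > 0 else -1
                for c in formula if len(c) == 3
                for l in c if abs(l) == v)
    return v if votes >= 0 else -v          # assumed tie-break: True

def select_guc(formula):
    """GUC: pick a literal from a shortest clause, so setting it satisfies that clause."""
    shortest = min(formula, key=len)        # first shortest clause
    return shortest[0]

F = [(-1, 2, 3), (-1, -2, 4), (1, 3, 4), (2, 3)]
print(select_uc(F))      # 1
print(select_ucwm(F))    # -1  (x1 occurs negated in two 3-clauses, positive in one)
print(select_guc(F))     # 2   (first literal of the shortest clause (2, 3))
```

Each function returns a signed literal, so it can be passed as the select rule of a DPLL routine that branches on the returned literal first and on its negation after a backtrack.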
Random k-CNF formulas
• Distribution $\mathcal{F}_{k,n}(r)$
Randomly choose $rn$ clauses over $n$ variables independently, each of size $k$
Each size $k$ clause is equally likely
• Threshold value $r_k^*$
• $r < r_k^*$, almost certainly satisfiable
• $r > r_k^*$, almost certainly unsatisfiable
• Hardest problems near threshold
8
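A minimal sampler for this distribution might look as follows. The function name `random_kcnf` is hypothetical, and two choices are assumptions: each clause uses k distinct variables with independent random signs, and the clause count rn is rounded down to an integer.

```python
import random

def random_kcnf(n, r, k, rng=None):
    """Sample int(r * n) independent uniformly random k-clauses over variables 1..n."""
    rng = rng or random.Random()
    formula = []
    for _ in range(int(r * n)):
        variables = rng.sample(range(1, n + 1), k)            # k distinct variables
        clause = tuple(v if rng.random() < 0.5 else -v        # random sign for each
                       for v in variables)
        formula.append(clause)
    return formula

F = random_kcnf(n=50, r=4.0, k=3, rng=random.Random(0))
print(len(F))    # 200
```

Clauses are drawn with replacement, so the same clause can appear twice, which matches choosing the clauses independently.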
DPLL on random 3-CNF*
[Figure: # of DPLL backtracks plotted against $r$, the ratio of clauses to variables, with $r = 4.267$ marked] [Mitchell, Selman, Levesque 92]
Proof complexity shows $2^{\Theta(n/r)}$ time is required for unsatisfiable formulas for $r > r_3^*$ [B, Karp, Saks, Pitassi 98], [Ben-Sasson 02]
What about satisfiable formulas below threshold?
* $n = 50$ variables
9
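Exponential lower bounds for 3-CNF formulas below ratio 4.267
Theorem Let $A \in \{\mathrm{UC}, \mathrm{UCwm}, \mathrm{GUC}\}$. Let
$r_3^{\mathrm{UC}} = 3.81$
$r_3^{\mathrm{UCwm}} = 3.83$
$r_3^{\mathrm{GUC}} = 4.01$
w.h.p. algorithm $A$ takes exponential time on a random $F$ drawn from $\mathcal{F}_{3,n}(r)$ for $r \ge r_3^A$
10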
Exponential lower bounds for satisfiable formulas below the k-CNF threshold
Theorem There exist $l_k \sim 2^k/k$ and $u_k \sim 2^k$ s.t. for every $k \ge 4$ and for $F$ drawn from $\mathcal{F}_{k,n}(r)$ with $l_k \le r \le u_k$, w.h.p.
• $F$ is satisfiable
• UC takes exponential time on $F$
Note These formulas have huge numbers of satisfying assignments (more than $2^{(1-\epsilon)n}$ out of a possible $2^n$) but are still hard
11
Ideas
Part I:
Use differential equations to analyze trajectory of algorithm as a function of the clause-variable ratio for $r$ larger than $l_k$
Use resolution proof complexity to show that some residual formula along this trajectory requires large DPLL running time
Part II:
Show that formulas up to ratio $u_k$ are satisfiable [Achlioptas, Peres 03]
$u_k = 2^k \ln 2 - (k+4)/2$
12
Algorithmic behavior using simple select choices
• On input $F$ drawn from $\mathcal{F}_{k,n}(r)$, before the first backtrack occurs, the residual formula $F'$ is distributed as $F_2 \land \dots \land F_k$ where
each $F_j$, drawn from $\mathcal{F}_{j,n'}(r_j)$ for $j = 2, \dots, k$, only has clauses of size $j$
the $F_j$ are mutually independent
• Values of $r_j$ almost surely follow algorithm-dependent trajectories given by differential equations
13
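As an illustration of such a trajectory, the sketch below integrates rate equations for UC on random 3-CNF with a simple Euler scheme. The equations are an assumption here, since the slides do not state them. They are the ones commonly used for UC: when a fraction t of the variables is set, each 3-clause contains the next variable at rate 3/(1-t), half of those clauses are satisfied and half shrink to 2-clauses, and each 2-clause is hit at rate 2/(1-t). The function names are hypothetical.

```python
def uc_trajectory(r, steps=100_000, t_max=0.99):
    """Euler-integrate the assumed UC rate equations, starting from 3-clause ratio r.

    Yields (t, r2, r3): the fraction of variables set so far, and the 2- and
    3-clause ratios measured against the variables that remain.
    """
    c2, c3 = 0.0, r                     # clauses per ORIGINAL variable
    dt = t_max / steps
    for i in range(steps):
        t = i * dt
        remaining = 1.0 - t
        yield t, c2 / remaining, c3 / remaining
        d3 = -3.0 * c3 / remaining                          # 3-clauses hit by the set variable
        d2 = 1.5 * c3 / remaining - 2.0 * c2 / remaining    # half shrink to 2-clauses
        c3 += d3 * dt
        c2 += d2 * dt

def first_time_r2_reaches(r, level=1.0):
    """Return (t, r3) when the 2-clause ratio first reaches `level`, else None."""
    for t, r2, r3 in uc_trajectory(r):
        if r2 >= level:
            return t, r3
    return None

print(first_time_r2_reaches(3.81))   # 2-clause ratio reaches 1 along this trajectory
print(first_time_r2_reaches(2.5))    # None: the 2-clause ratio stays below 1
```

Under these assumptions, a run started at ratio 3.81 first reaches 2-clause ratio 1 when the 3-clause ratio is about 2.28, which lines up with the values 3.81 and 2.28 that appear elsewhere in the slides.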
Proof Complexity
• Study of the number of symbols required for proofs of
unsatisfiability (or tautology) in propositional logic
• Does not address algorithmic issue
How would you find short proofs if they existed?
• Existence of short proofs for every unsatisfiable
formula is equivalent to NP = co-NP (and is implied
by P=NP)
Generally believed that such proofs don’t exist
• Active research area with rich theory and many open
questions
14
Resolution
• Start with clauses of CNF formula $F$
• Resolution rule
Given $(A \lor x)$, $(B \lor \lnot x)$ can derive $(A \lor B)$
• The empty clause is derivable $\iff$ $F$ is unsatisfiable
• Proof size = # of clauses used
15
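The resolution rule is easy to state in code. In this sketch, clauses are frozensets of nonzero integers and the function name `resolve` is an assumption for illustration. The example derives the empty clause from four 2-clauses, giving a refutation that uses seven clauses in total.

```python
def resolve(c1, c2, x):
    """From (A v x) and (B v -x), derive the resolvent (A v B)."""
    if x not in c1 or -x not in c2:
        raise ValueError("c1 must contain x and c2 must contain -x")
    return (c1 - {x}) | (c2 - {-x})

# An unsatisfiable formula: every sign pattern on x1, x2 is forbidden.
c_a, c_b = frozenset({1, 2}), frozenset({-1, 2})
c_c, c_d = frozenset({1, -2}), frozenset({-1, -2})

pos = resolve(c_a, c_b, 1)       # (x2)
neg = resolve(c_c, c_d, 1)       # (-x2)
empty = resolve(pos, neg, 2)     # the empty clause

print(pos, neg, empty)
```

The proof size here counts the four input clauses plus the three derived ones.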
Resolution and DPLL
• Running DPLL with any select rule on an unsatisfiable formula $F$ generates a Resolution refutation of $F$
# of clauses $\le$ running time
16
Backtracking search/DPLL
DPLL(F)
    while F contains a 1-clause l’
        F ← F|l’
    if F has no clauses
        output ‘satisfiable’ and halt
    if F has an empty clause
        backtrack
    else
        select a literal l = some x or ¬x
        DPLL(F|l)
        if backtrack then DPLL(F|¬l)
The formula F at each call is the residual formula.
17
Long-running DPLL Executions
[Figure: DPLL search tree with one node marked]
Residual formula at each node is a mix of 2- and 3-clauses
Residual formula at the marked node is unsatisfiable
Every resolution proof of its unsatisfiability, including the algorithm's, is exponentially long
18
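Satisfiability for mixed random formulas: proven properties
[Figure: phase diagram with the 3-clause ratio on the horizontal axis (ticks at 2/3, 2.28, 3.52, 4.501) and the 2-clause ratio on the vertical axis (tick at 1), showing proven SAT and UNSAT regions with open regions marked "?". Results shown: [Achlioptas et al 96], [Kaporis et al 03], [Dubois 01]]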
19
Resolution proof complexity of mixed random formulas
Theorem (Easy) A random CNF formula $F$ drawn from $\mathcal{F}_{2,n}(r_2)$ is
Satisfiable w.h.p. if $r_2 < 1$
Unsatisfiable w.h.p. if $r_2 > 1$ and has linear size resolution proofs [Chvatal-Reed 91], [Goerdt 91], [De La Vega 91]
Theorem (Hard) For any constant $r_3 > 0$, w.h.p. $G$ drawn from $\mathcal{F}_{3,n}(r_3)$ requires an exponential-size resolution proof of unsatisfiability [Chvatal, Szemeredi 88]
Theorem For any constants $r_2 < 1$ and $r_3 > 0$, w.h.p. for $F$ drawn from $\mathcal{F}_{2,n}(r_2)$ and $G$ drawn from $\mathcal{F}_{3,n}(r_3)$ the combined formula $F \land G$ requires an exponential-size resolution proof of unsatisfiability
Easy $\land$ Hard = Hard
20
Sharp Threshold in Resolution Proof Complexity
• Define distribution $H_n(r)$ on CNF formulas of the form $H = F \land G$ where
$G$ is drawn from $\mathcal{F}_{3,n}(r_3)$ for some $r_3 \ge 2.28$ and
$F$ is drawn from $\mathcal{F}_{2,n}(r)$.
• Then for $H$ drawn from $H_n(r)$, w.h.p.
$H$ is unsatisfiable
For $r > 1$, $H$ has $O(n)$ size resolution proofs
For $r < 1$, $H$ requires $2^{\Omega(n)}$ size resolution proofs
21
Trajectory on 3-CNF
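[Figure: UC algorithm trajectory plotted with the 3-clause ratio on the horizontal axis (ticks at 3.52, 3.81, 4.267, 4.51) and the 2-clause ratio on the vertical axis (tick at 1), with regions labeled "Provably SAT & Easy" and "Provably UNSAT & Hard"]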
22
UC trajectory for $k \ge 4$
• Start with $2.75 \cdot 2^k n/k$ $k$-clauses
• Wait until $3n/(k-1)$ variables remain
• With high probability:
The 2-clauses remained satisfiable throughout
The residual formula overall is unsatisfiable
Its resolution complexity is exponential
23
Directions
• What price completeness?
• Closing gap for unsatisfiability of mixed formulas would yield an algorithm-dependent phase transition
Below $r_A$ algorithm runs in linear time
Above $r_A$ algorithm requires exponential time
24