Phase Transitions in Proof Complexity and Satisfiability Search

Phase Transitions in
Proof Complexity and Satisfiability Search
Paul Beame
University of Washington
with Dimitris Achlioptas (Microsoft Research) and Michael Molloy (U. Toronto)
1
Satisfiability
F = (x1 ∨ x2 ∨ x4) ∧ (x1 ∨ x3) ∧ (¬x3 ∨ x2) ∧ (x4 ∨ ¬x3)
satisfying assignment for F: x1, x2, x3, x4
Given F, does such an assignment exist?
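As a minimal sketch of the decision problem, the following checks satisfiability by trying every assignment. The encoding is an assumption made for illustration: a literal is a nonzero integer (`i` for xi, `-i` for its negation) and a formula is a list of tuples. The example formula is hypothetical, not the one on the slide.

```python
from itertools import product

def satisfies(clauses, assignment):
    """True if every clause has at least one literal made true by assignment."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def brute_force_sat(clauses, n):
    """Try all 2^n assignments; return a satisfying one, or None if none exists."""
    for values in product([False, True], repeat=n):
        assignment = {i + 1: v for i, v in enumerate(values)}
        if satisfies(clauses, assignment):
            return assignment
    return None

# Hypothetical example: (x1 ∨ ¬x2) ∧ (x2 ∨ x3) ∧ (¬x1 ∨ ¬x3)
example = [(1, -2), (2, 3), (-1, -3)]
print(brute_force_sat(example, 3))
```

The exhaustive loop takes time exponential in n, which is why the algorithms on the following slides matter.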
2
Satisfiability Algorithms
• Incomplete Algorithms
  - will (likely) find a satisfying assignment but will simply give up if one is not found
• Complete Algorithms
  - will either find a satisfying assignment or determine that no such assignment exists
3
Satisfiability Algorithms
• Incomplete Algorithms
  - Local search: GSAT [Selman, Levesque, Mitchell 92], Walksat [Kautz, Selman 96]
  - Belief Propagation: SP [Braunstein, Mezard, Zecchina 02]
• Complete Algorithms
  - Backtracking search: DPLL [Davis, Putnam 60], [Davis, Logemann, Loveland 62]
  - DPLL + “clause learning”: GRASP, SATO, zchaff
4
Simplification and Satisfaction
F = (x1 ∨ x2 ∨ x4) ∧ (x1 ∨ x3) ∧ (¬x3 ∨ x2) ∧ (x4 ∨ ¬x3)
satisfying assignment for F: x1, x2, x3, x4
• Simplifying F after setting literal x3 to true
  F = (x1 ∨ x2 ∨ x4) ∧ (x1 ∨ x3) ∧ (¬x3 ∨ x2) ∧ (x4 ∨ ¬x3)
  F|x3 = (x1 ∨ x2 ∨ x4) ∧ (x2) ∧ (x4), where (x2) and (x4) are 1-clauses
• F is satisfied if all clauses disappear under simplification given the assignment
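The simplification step can be sketched as follows, assuming an illustrative encoding in which a literal is a nonzero integer (`i` for xi, `-i` for its negation) and a formula is a list of tuples. The function name `simplify` and the example formula are hypothetical.

```python
def simplify(clauses, lit):
    """Return F|lit: drop clauses containing lit, delete -lit from the rest."""
    result = []
    for clause in clauses:
        if lit in clause:
            continue  # clause is satisfied, so it disappears
        result.append(tuple(l for l in clause if l != -lit))
    return result

# Hypothetical formula: (x1 ∨ x2) ∧ (¬x1 ∨ x3) ∧ (¬x3 ∨ ¬x2)
F = [(1, 2), (-1, 3), (-3, -2)]
F1 = simplify(F, 1)     # x1 true leaves the 1-clause (x3) and (¬x3 ∨ ¬x2)
F2 = simplify(F1, 3)    # x3 true leaves the 1-clause (¬x2)
F3 = simplify(F2, -2)   # x2 false leaves no clauses: F is satisfied
```

An empty list means every clause has disappeared, while an empty tuple inside the list is an empty clause, meaning the partial assignment falsifies the formula.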
5
Backtracking search/DPLL
DPLL(F)
  while F contains a 1-clause l’
    F ← F|l’
  if F has no clauses
    output ‘satisfiable’ and halt
  if F has an empty clause
    backtrack
  else
    select a literal l = some x or ¬x
    DPLL(F|l)
    if backtrack then DPLL(F|¬l)
(The formula F at each call is the residual formula.)
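The procedure above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: literals are nonzero integers (`i` for xi, `-i` for its negation), a formula is a list of tuples, and the default `select` rule (first literal of the first clause) is a hypothetical placeholder.

```python
def simplify(clauses, lit):
    """Return F|lit: drop clauses containing lit, delete -lit from the rest."""
    return [tuple(l for l in c if l != -lit) for c in clauses if lit not in c]

def dpll(clauses, select=lambda cs: cs[0][0]):
    """Return a list of literals set true that satisfies clauses, or None."""
    trail = []
    # Unit propagation: while F contains a 1-clause, satisfy it.
    while True:
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            break
        clauses = simplify(clauses, unit)
        trail.append(unit)
    if not clauses:
        return trail                      # no clauses left: satisfiable
    if any(len(c) == 0 for c in clauses):
        return None                       # empty clause: backtrack
    lit = select(clauses)
    for choice in (lit, -lit):            # try l first, then its negation
        result = dpll(simplify(clauses, choice), select)
        if result is not None:
            return trail + [choice] + result
    return None

# Hypothetical inputs
sat_formula = [(1, 2), (-1, 3), (-3, -2)]
unsat_formula = [(1, 2), (1, -2), (-1, 2), (-1, -2)]
print(dpll(sat_formula), dpll(unsat_formula))
```

Returning `None` plays the role of "backtrack" in the pseudocode, and the `clauses` value at each recursive call is the residual formula.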
6
Some standard select choices for DPLL algorithms
• UC: Unit Clause/Ordered DLL
  - Choose variables in a fixed order
  - Always set True first
• UCwm: Unit Clause with majority
  - Choose variables in a fixed order
  - Apply a majority vote among 3-clauses for assigning each value
• GUC: Generalized Unit Clause
  - Choose a variable v in a shortest clause C
  - Set v to satisfy C
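The three select rules can be sketched as functions that take the residual formula and return the literal to set true first. Several details are assumptions made for illustration: literals are nonzero integers, the "fixed order" is increasing variable index among variables still present, UCwm breaks ties toward True, and GUC takes the first literal of the first shortest clause (an analysis would typically pick at random).

```python
from collections import Counter

def select_uc(clauses):
    """UC: smallest-index variable still present, set to True first."""
    return min(abs(l) for c in clauses for l in c)

def select_ucwm(clauses):
    """UCwm: same variable order; sign chosen by majority vote among 3-clauses."""
    var = min(abs(l) for c in clauses for l in c)
    votes = Counter(
        l for c in clauses if len(c) == 3 for l in c if abs(l) == var
    )
    return var if votes[var] >= votes[-var] else -var

def select_guc(clauses):
    """GUC: pick a literal from a shortest clause and satisfy that clause."""
    shortest = min(clauses, key=len)
    return shortest[0]

# Hypothetical residual formula with three 3-clauses and one 2-clause
residual = [(2, -3, 4), (-2, 3, 5), (-2, -4, 5), (3, -5)]
print(select_uc(residual), select_ucwm(residual), select_guc(residual))
```

Each function assumes a non-empty formula without empty clauses, which is the situation in which a DPLL procedure calls its select rule.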
7
Random k-CNF formulas
• Distribution F_{k,n}(r)
  - Randomly choose rn clauses over n variables independently, each of size k
  - Each size k clause is equally likely
• Threshold value r_k*
  - r < r_k*: almost certainly satisfiable
  - r > r_k*: almost certainly unsatisfiable
• Hardest problems near threshold
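A sampler for this distribution might look like the sketch below. The function name `random_kcnf`, the integer literal encoding, and rounding rn to the nearest integer are assumptions made for illustration.

```python
import random

def random_kcnf(n, k, r, seed=None):
    """Sample round(r*n) clauses independently; each clause picks k distinct
    variables uniformly and negates each one with probability 1/2."""
    rng = random.Random(seed)
    m = round(r * n)
    formula = []
    for _ in range(m):
        variables = rng.sample(range(1, n + 1), k)
        formula.append(tuple(v if rng.random() < 0.5 else -v for v in variables))
    return formula

# Example: a random 3-CNF formula on 50 variables at ratio 4.267
F = random_kcnf(50, 3, 4.267, seed=0)
print(len(F), F[:3])
```

Clauses are drawn with replacement, so a clause may repeat; for large n this happens rarely.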
8
DPLL on random 3-CNF*
[Figure: # of DPLL backtracks plotted against r, the ratio of clauses to variables, with r = 4.267 marked on the horizontal axis [Mitchell, Selman, Levesque 92]]
* n = 50 variables
Proof complexity shows 2^Θ(n/r) time is required for unsatisfiable formulas for r > r_3* [B, Karp, Saks, Pitassi 98], [Ben-Sasson 02]
What about satisfiable formulas below threshold?
9
Exponential lower bounds for 3-CNF formulas below ratio 4.267
Theorem Let A ∈ {UC, UCwm, GUC}. Let
  r_3^UC = 3.81
  r_3^UCwm = 3.83
  r_3^GUC = 4.01
W.h.p. algorithm A takes exponential time on a random F ∈ F_{3,n}(r) for r ≥ r_3^A
10
Exponential lower bounds for satisfiable formulas below the k-CNF threshold
Theorem There exist l_k ≈ 2^k/k and u_k ≈ 2^k s.t. for every k ≥ 4 and for F ∈ F_{k,n}(r) with l_k ≤ r ≤ u_k, w.h.p.
• F is satisfiable
• UC takes exponential time on F
Note These formulas have huge numbers of satisfying assignments (more than 2^((1-ε)n) out of a possible 2^n) but still are hard
11
Ideas
Part I:
• Use differential equations to analyze trajectory of algorithm as a function of the clause-variable ratio for r larger than l_k
• Use resolution proof complexity to show that some residual formula along this trajectory requires large DPLL running time
Part II:
• Show that formulas up to ratio u_k are satisfiable [Achlioptas, Peres 03]
  - u_k = 2^k ln 2 - (k+4)/2
12
Algorithmic behavior using simple select choices
• On input F ∈ F_{k,n}(r), before the first backtrack occurs, the residual formula F’ is distributed as F_2 ∧ … ∧ F_k where
  - F_j ∈ F_{j,n’}(r_j) for j = 2, …, k only has clauses of size j
  - The F_j are mutually independent
• Values of r_j almost surely follow algorithm-dependent trajectories given by differential equations
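As an illustration for UC on 3-CNF, the sketch below uses the standard heuristic analysis, which is an assumption here since the transcript does not show the talk's actual equations. Let x be the fraction of variables set and c_j(x) the number of j-clauses divided by n. Then dc_3/dx = -3 c_3/(1-x) and dc_2/dx = 3 c_3/(2(1-x)) - 2 c_2/(1-x), with c_3(0) = r and c_2(0) = 0, giving c_3 = r(1-x)^3 and c_2 = (3r/2) x (1-x)^2. Dividing by 1-x gives the ratios per remaining variable. The function names are hypothetical.

```python
import math

def uc_trajectory(r, x):
    """Ratios (r3, r2) of 3- and 2-clauses per remaining variable after a
    fraction x of the variables is set, starting from ratio r
    (closed-form solution of the differential equations above)."""
    r3 = r * (1 - x) ** 2
    r2 = 1.5 * r * x * (1 - x)
    return r3, r2

def first_crossing(r):
    """Smallest x at which the 2-clause ratio reaches 1, or None if it never does."""
    disc = 1 - 4 / (1.5 * r)
    if disc < 0:
        return None
    return (1 - math.sqrt(disc)) / 2

x = first_crossing(3.81)
r3, r2 = uc_trajectory(3.81, x)
print(x, r3, r2)
```

Under these assumptions the 2-clause ratio peaks at 3r/8, so it stays below 1 exactly when r < 8/3. Starting from r = 3.81, it reaches 1 when the 3-clause ratio is about 2.28, which lines up with the values 3.81 and 2.28 quoted in the talk.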
13
Proof Complexity
• Study of the number of symbols required for proofs of unsatisfiability (or tautology) in propositional logic
• Does not address algorithmic issue
  - How would you find short proofs if they existed?
• Existence of short proofs for every unsatisfiable formula is equivalent to NP = co-NP (and is implied by P = NP)
  - Generally believed that such proofs don’t exist
• Active research area with rich theory and many open questions
14
Resolution
• Start with clauses of CNF formula F
• Resolution rule
  - Given (A ∨ x), (B ∨ ¬x) can derive (A ∨ B)
• The empty clause is derivable if and only if F is unsatisfiable
• Proof size = # of clauses used
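The resolution rule can be sketched as a single function on clauses represented as frozensets of integer literals (`i` for xi, `-i` for its negation). The representation, the name `resolve`, and the three-clause example are assumptions made for illustration.

```python
def resolve(c1, c2, x):
    """From c1 = (A ∨ x) and c2 = (B ∨ ¬x), derive (A ∨ B)."""
    if x not in c1 or -x not in c2:
        raise ValueError("c1 must contain x and c2 must contain -x")
    return frozenset((c1 - {x}) | (c2 - {-x}))

# Hypothetical refutation of (x1) ∧ (¬x1 ∨ x2) ∧ (¬x2)
c_a = frozenset({1})
c_b = frozenset({-1, 2})
c_c = frozenset({-2})
step1 = resolve(c_a, c_b, 1)     # derives (x2)
step2 = resolve(step1, c_c, 2)   # derives the empty clause
print(step1, step2)
```

This refutation uses five clauses in total: the three input clauses and the two derived ones.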
15
Resolution and DPLL
• Running DPLL with any select rule on an unsatisfiable formula F generates a Resolution refutation of F
  - # of clauses ≤ running time
16
Backtracking search/DPLL
DPLL(F)
  while F contains a 1-clause l’
    F ← F|l’
  if F has no clauses
    output ‘satisfiable’ and halt
  if F has an empty clause
    backtrack
  else
    select a literal l = some x or ¬x
    DPLL(F|l)
    if backtrack then DPLL(F|¬l)
(The formula F at each call is the residual formula.)
17
Long-running DPLL Executions
[Figure: diagram of a DPLL execution with one node marked]
• Residual formula at each node is a mix of 2- and 3-clauses
• Residual formula at the marked node is unsatisfiable
• Every resolution proof of its unsatisfiability, including the algorithm’s, is exponentially long
18
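Satisfiability for mixed random formulas: proven properties
[Figure: phase diagram with 3-clause ratio on the horizontal axis (marked at 2/3, 2.28, 3.52, 4.501) and 2-clause ratio on the vertical axis (marked at 1), showing regions labeled SAT and UNSAT and open regions marked “?”; results cited: [Achlioptas et al 96], [Kaporis et al 03], [Dubois 01]]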
19
Resolution proof complexity of mixed random formulas
Theorem (Easy) A random CNF formula F ∈ F_{2,n}(r_2) is
  - Satisfiable w.h.p. if r_2 < 1
  - Unsatisfiable w.h.p. if r_2 > 1, and has linear size resolution proofs [Chvatal-Reed 91], [Goerdt 91], [De La Vega 91]
Theorem (Hard) For any constant r_3 > 0, w.h.p. G ∈ F_{3,n}(r_3) requires an exponential-size resolution proof of unsatisfiability [Chvatal, Szemeredi 88]
Theorem For any constants r_2 < 1 and r_3 > 0, w.h.p. for F ∈ F_{2,n}(r_2) and G ∈ F_{3,n}(r_3) the combined formula F ∧ G requires an exponential-size resolution proof of unsatisfiability
Easy ∧ Hard = Hard
20
Sharp Threshold in Resolution Proof Complexity
• Define distribution H_n(r) on CNF formulas of the form H = F ∧ G where
  - G ∈ F_{3,n}(r_3) for some r_3 ≥ 2.28 and
  - F ∈ F_{2,n}(r).
• Then for H ∈ H_n(r) w.h.p.
  - H is unsatisfiable
  - For r > 1, H has O(n) size resolution proofs
  - For r < 1, H requires 2^Ω(n) size resolution proofs
21
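Trajectory on 3-CNF
[Figure: UC algorithm trajectory plotted with 3-clause ratio on the horizontal axis (marked at 3.52, 3.81, 4.267, 4.51) and 2-clause ratio on the vertical axis (marked at 1), with regions labeled “Provably UNSAT & Hard” and “Provably SAT & Easy”]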
22
UC trajectory for k ≥ 4
• Start with 2.75 · 2^k n/k k-clauses
• Wait until 3n/(k-1) variables remain
• With high probability:
  - The 2-clauses remained satisfiable throughout
  - The residual formula overall is unsatisfiable
  - Its resolution complexity is exponential
23
Directions
• What price completeness?
• Closing gap for unsatisfiability of mixed formulas would yield an algorithm-dependent phase transition
  - Below r_A algorithm runs in linear time
  - Above r_A algorithm requires exponential time
24