Wei Wei Sampling, Counting, and Probabilistic Inference joint work with Bart Selman


Sampling, Counting, and Probabilistic
Inference
Wei Wei
joint work with Bart Selman
1
The problem: counting solutions
¬a ∨ b ∨ c
¬a ∨ ¬b
¬b ∨ ¬c
c ∨ d
2
Motivation

Consider the standard logical inference

Δ ⊨ α iff (Δ ∧ ¬α) is unsat
⇒ there doesn't exist a model of Δ in
which ¬α is true.
⇒ in all models of Δ, query α holds
⇒ α holds with absolute certainty
3
Degree of belief

Natural generalization: degree of belief
of α is defined as P(α | Δ) (Roth, 1996)
In the absence of statistical information,
the degree of belief can be calculated as
M(Δ ∧ α) / M(Δ), where M counts models
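As a minimal illustration of the ratio above, the sketch below computes a degree of belief by exhaustive model counting. The knowledge base and query are toy assumptions, not taken from the talk: the knowledge base is (a ∨ b) ∧ (¬a ∨ c) and the query is c.

```python
from fractions import Fraction
from itertools import product

def models(formula, n_vars):
    """Number of assignments (tuples of booleans) that make formula true."""
    return sum(1 for bits in product([False, True], repeat=n_vars) if formula(*bits))

# Toy knowledge base and query (illustrative assumptions).
def kb(a, b, c):
    return (a or b) and ((not a) or c)

def query(a, b, c):
    return c

def degree_of_belief(kb, query, n_vars):
    """M(KB and query) / M(KB), assuming KB is satisfiable."""
    both = models(lambda *x: kb(*x) and query(*x), n_vars)
    return Fraction(both, models(kb, n_vars))

print(degree_of_belief(kb, query, 3))
```

Here the knowledge base has 4 models and the query holds in 3 of them, so the degree of belief is 3/4. A query that is entailed, such as a ∨ b, gets degree of belief 1, which recovers standard inference as a special case.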
4
Bayesian Nets to Weighted Counting
(Sang, Beame, and Kautz, 2004)

Introduce new vars so all internal vars are
deterministic
Network: A → B, with Pr(A) = .1, Pr(B | A) = .2, Pr(B | ~A) = .6
Query: Pr(A ∧ B)
= Pr(A) * Pr(B|A)
= .1 * .2 = .02
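The query can be reproduced by weighted model counting on a small encoding of the two-node network. This is a sketch in the spirit of the reduction, not the exact construction of Sang, Beame, and Kautz: the chance variables p and q and the helper weighted_count are assumptions introduced for illustration. B becomes a deterministic function of A, p, and q, and each assignment's weight is the product of its variable weights.

```python
from itertools import product

# Weight of each variable being true; the false weight is 1 minus this.
# "A" carries the prior Pr(A) = .1. "p" and "q" are hypothetical chance
# variables standing for the CPT entries Pr(B|A) = .2 and Pr(B|~A) = .6.
WEIGHT_TRUE = {"A": 0.1, "p": 0.2, "q": 0.6}

def b_value(A, p, q):
    """B is deterministic given A and the chance variables."""
    return p if A else q

def weighted_count(event):
    """Sum the weights of all assignments to (A, p, q) where event holds."""
    total = 0.0
    for A, p, q in product([False, True], repeat=3):
        weight = 1.0
        for name, value in (("A", A), ("p", p), ("q", q)):
            w = WEIGHT_TRUE[name]
            weight *= w if value else 1.0 - w
        if event(A, b_value(A, p, q)):
            total += weight
    return total

print(round(weighted_count(lambda A, B: A and B), 4))  # Pr(A and B)
print(round(weighted_count(lambda A, B: B), 4))        # Pr(B)
```

The first value matches the hand calculation .1 * .2 = .02. The second is the marginal .1 * .2 + .9 * .6 = .56.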
5
Complexity

SAT is NP-complete. 2-SAT is solvable in
linear time.
Counting assignments (even for 2-CNF,
Horn logic, etc.) is #P-complete, and is
NP-hard to approximate to within a factor of
2^(n^(1-ε)) for any ε > 0
(Valiant 1979, Roth 1996).
Approximate counting and sampling are
equivalent if the problem is “downward
self-reducible”.
6
(Roth, 1996)
7
Existing method: DPLL
(Davis, Logemann and Loveland, 1962)

(x1   x2  x3)  (x1  x2   x3)  (x1  x2)
DPLL was first proposed as a basic depth-first tree
search.
[Figure: DPLL search tree branching on x1 and then x2, with T/F edges; one leaf is null (a conflict) and one leaf is a solution]
8
Existing Methods for Counting

CDP (Birnbaum and Lozinskii, 1999)

Relsat (Bayardo and Pehoushek, 2000)
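The idea behind DPLL-style counters can be sketched as follows: branch on a variable as in DPLL, and whenever every clause is already satisfied with k variables still unset, add 2^k models instead of enumerating them. This is a simplified illustration without unit propagation, component analysis, or learning, and is not the CDP or Relsat implementation. Literals are encoded as signed integers (an assumed convention), and the example formula is a toy chosen for illustration.

```python
def simplify(clauses, literal):
    """Assign literal true: drop satisfied clauses, shorten the others.
    Returns None if some clause becomes empty (a conflict)."""
    result = []
    for clause in clauses:
        if literal in clause:
            continue
        reduced = [l for l in clause if l != -literal]
        if not reduced:
            return None
        result.append(reduced)
    return result

def count_dpll(clauses, unset_vars):
    """Count models of clauses over the variables in unset_vars."""
    if clauses is None:          # conflict: no models on this branch
        return 0
    if not clauses:              # all clauses satisfied: remaining vars are free
        return 2 ** len(unset_vars)
    var = abs(clauses[0][0])     # branch on a variable that still occurs
    rest = [v for v in unset_vars if v != var]
    return (count_dpll(simplify(clauses, var), rest)
            + count_dpll(simplify(clauses, -var), rest))

# Toy formula (an illustrative assumption) over variables 1..4.
formula = [[-1, 2, 3], [-1, -2], [-2, -3], [3, 4]]
print(count_dpll(formula, [1, 2, 3, 4]))
```

For this toy formula the search returns 6. The count is exact, but the running time depends on how much of the tree must be explored.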
9
Existing Methods

Cachet (Sang, Beame, and Kautz, 2004)
1. Component caching
2. Clause learning
10
Conflict Graph
Known clauses:
(p ∨ q ∨ a)
(¬a ∨ ¬b ∨ t)
(¬t ∨ x1)
(¬t ∨ x2)
(¬t ∨ x3)
(¬x1 ∨ ¬x2 ∨ ¬x3 ∨ y)
(¬x2 ∨ ¬y)
Current decisions: p = false, q = false, b = true
[Figure: conflict graph over p, q, b, a, t, x1, x2, x3, and y, ending in a conflict on y]
Learned clause, 1-UIP scheme: (¬t)
Learned clause, decision scheme: (p ∨ q ∨ ¬b)
11
Existing Methods

Pro: get exact count

Cons:
1. Cannot predict execution time
2. Cannot halt execution to get an approximation
3. Cannot handle large formulas
12
Our proposal: counting by sampling

The algorithm works as follows (Jerrum,
Valiant, and Vazirani, 1986):
1. Draw K samples from the solution space
2. Pick a variable X in the current formula
3. Set variable X to its most sampled value t, and
the multiplier for X is K/#(X=t).
Note 1 ≤ multiplier ≤ 2
4. Repeat steps 1-3 until all variables are set
5. The number of solutions of the original formula is
the product of all multipliers.
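The steps above can be sketched in a few lines. To keep the example self-contained, the sampler here is a stand-in that enumerates all models and draws from them uniformly, which only works for tiny formulas; in the talk's setting the samples would come from a SAT sampler instead. Function names, the signed-integer clause encoding, and the toy formula are assumptions for illustration.

```python
import random
from itertools import product

def all_models(clauses, n_vars, fixed):
    """Enumerate models consistent with the partial assignment `fixed`.
    Stand-in for a real sampler; only feasible for tiny formulas."""
    found = []
    for bits in product([False, True], repeat=n_vars):
        if any(bits[v] != val for v, val in fixed.items()):
            continue
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            found.append(bits)
    return found

def approx_count(clauses, n_vars, k, rng):
    """Estimate the model count as a product of per-variable multipliers.
    Assumes the input formula is satisfiable."""
    fixed, estimate = {}, 1.0
    for var in range(n_vars):                        # step 2: pick a variable
        models = all_models(clauses, n_vars, fixed)
        samples = [rng.choice(models) for _ in range(k)]   # step 1: K samples
        true_hits = sum(1 for s in samples if s[var])
        value = true_hits * 2 >= k                   # step 3: most sampled value
        hits = true_hits if value else k - true_hits
        estimate *= k / hits                         # multiplier K / #(X=t)
        fixed[var] = value                           # step 4: fix and repeat
    return estimate                                  # step 5: product

formula = [[-1, 2, 3], [-1, -2], [-2, -3], [3, 4]]   # toy formula with 6 models
print(approx_count(formula, 4, k=2000, rng=random.Random(0)))
```

With uniform samples the estimate should land close to the true count of 6. Because the chosen value always appears in at least half of the samples, every multiplier stays between 1 and 2.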
13
[Figure: the space of assignments divided into X1=T and X1=F, with the models marked in each part]
14
Research issues

how well can we estimate each multiplier?
we'll see that sampling works quite well.

how do errors accumulate? (note formula
can have hundreds of variables; could
potentially be very bad)
surprisingly, we will see that errors often
cancel each other out.
15
Standard Methods for Sampling: MCMC
Based on setting up a Markov chain
with a predefined stationary distribution.
Draw samples from the stationary
distribution by running the Markov chain
for sufficiently long.
Problem: for interesting problems, the
Markov chain takes exponential time to
converge to its stationary distribution.

16
Simulated Annealing

Simulated Annealing uses the Boltzmann
distribution as the stationary distribution.
At low temperature, the distribution
concentrates around minimum energy states.
In terms of the satisfiability problem, each
satisfying assignment (with 0 cost) gets the
same probability.
Again, reaching such a stationary distribution
takes exponential time for interesting
problems (shown in a later slide).
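To make the Boltzmann distribution concrete, the sketch below computes it exactly for a tiny formula, taking the energy of an assignment to be its number of violated clauses (a common choice for SAT, assumed here). The formula, temperature, and function names are illustrative.

```python
import math
from itertools import product

def cost(bits, clauses):
    """Energy of an assignment: the number of clauses it violates."""
    return sum(1 for c in clauses
               if not any(bits[abs(l) - 1] == (l > 0) for l in c))

def boltzmann(clauses, n_vars, temperature):
    """Exact distribution P(s) proportional to exp(-cost(s) / temperature)."""
    states = list(product([False, True], repeat=n_vars))
    weights = [math.exp(-cost(s, clauses) / temperature) for s in states]
    z = sum(weights)
    return {s: w / z for s, w in zip(states, weights)}

formula = [[-1, 2, 3], [-1, -2], [-2, -3], [3, 4]]   # toy formula
dist = boltzmann(formula, 4, temperature=0.1)
solutions = [s for s in dist if cost(s, formula) == 0]
print(len(solutions), sum(dist[s] for s in solutions))
```

All zero-cost assignments receive the same probability, and at this temperature almost all of the mass sits on them. The exact computation says nothing about how long a chain needs to reach that distribution, which is the difficulty the slide points out.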
17

Question: Can state-of-the-art local
search procedures be used for SAT
sampling (as alternatives to standard
Markov chain Monte Carlo)?
Yes! Shown in this talk.
18
Our approach: biased random walk
Biased random walk = greedy bias +
pure random walk. Example: WalkSat
(Selman et al., 1994), effective on SAT.
Can we use it to sample from the solution
space?

– Does WalkSat reach all solutions?
– How uniform is the sampling?
19
WalkSat (50,000,000 runs in total)
[Figure: solutions plotted by Hamming distance, with one solution visited 500,000 times and another visited 60 times]
20
Probability Ranges in Different Domains

Instance            Runs       Hits (rarest)  Hits (common)  Common-to-rare ratio
Random              50 × 10^6  53             9 × 10^5       1.7 × 10^4
Logistics planning  1 × 10^6   84             4 × 10^3       50
Verif.              1 × 10^6   45             318            7
21
Improving the Uniformity of Sampling
WalkSat (nonergodic; quickly reaches sinks)
+ SA (ergodic; slow convergence)
= SampleSat (ergodic; does not satisfy DBC)
SampleSat:
– With probability p, the algorithm makes a
biased random walk move
– With probability 1-p, the algorithm makes a
SA (simulated annealing) move
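A minimal sketch of the mixing rule above. The parameter names (p, temperature, noise), the energy function (number of violated clauses), and the choice to stop at the first solution are assumptions made for illustration; the actual SampleSat implementation may differ.

```python
import math
import random

def unsat_clauses(assign, clauses):
    """Clauses with no true literal under assign (dict: var -> bool)."""
    return [c for c in clauses if not any(assign[abs(l)] == (l > 0) for l in c)]

def flip_cost(assign, clauses, var):
    """Number of violated clauses if var were flipped."""
    assign[var] = not assign[var]
    n = len(unsat_clauses(assign, clauses))
    assign[var] = not assign[var]
    return n

def walksat_move(assign, clauses, rng, noise=0.5):
    """Pick a violated clause; flip a random or the greediest variable in it."""
    broken = unsat_clauses(assign, clauses)
    if not broken:
        return
    clause = rng.choice(broken)
    if rng.random() < noise:
        var = abs(rng.choice(clause))
    else:
        var = min((abs(l) for l in clause),
                  key=lambda v: flip_cost(assign, clauses, v))
    assign[var] = not assign[var]

def sa_move(assign, clauses, rng, temperature):
    """Metropolis step: propose a random flip, accept by change in cost."""
    var = rng.choice(sorted(assign))
    delta = flip_cost(assign, clauses, var) - len(unsat_clauses(assign, clauses))
    if delta <= 0 or rng.random() < math.exp(-delta / temperature):
        assign[var] = not assign[var]

def sample_sat(clauses, n_vars, p, temperature, max_steps, rng):
    """Mix the two moves; return the first satisfying assignment found."""
    assign = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}
    for _ in range(max_steps):
        if not unsat_clauses(assign, clauses):
            return dict(assign)
        if rng.random() < p:
            walksat_move(assign, clauses, rng)
        else:
            sa_move(assign, clauses, rng, temperature)
    return dict(assign) if not unsat_clauses(assign, clauses) else None

formula = [[-1, 2, 3], [-1, -2], [-2, -3], [3, 4]]   # toy formula
print(sample_sat(formula, 4, p=0.5, temperature=0.1,
                 max_steps=10000, rng=random.Random(0)))
```

Calling sample_sat repeatedly with different seeds yields a stream of solutions whose frequencies can then be compared, in the same way the hit counts in the talk are compared.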
22
Comparison Between WalkSat and SampleSat
[Figure: side-by-side plots for WalkSat and SampleSat, with labeled values 10 and 10^4]
23
WalkSat (50,000,000 runs in total)
Hamming distance
24
SampleSat
Solutions  r   Total hits  Average hits
174        11  5.3m        30.1k
1186       14  17.3m       14.6k
39         7   5.1m        131k
704        14  11.1m       15.8k
24         5   0.6m        25k
192        11  5.7m        29.7k
212        11  2.9m        13.4k
[Figure axis: Hamming distance]
25
Instance            Runs       Hits (rarest)  Hits (common)  Common-to-rare ratio (WalkSat)  Ratio (SampleSat)
Random              50 × 10^6  53             9 × 10^5       1.7 × 10^4                      10
Logistics planning  1 × 10^6   84             4 × 10^3       50                              17
Verif.              1 × 10^6   45             318            7                               4
26
Analysis
c1  c2  c3  …  cn  a  b
F   F   F   …  F   F  F
F   F   F   …  F   F  T
27
Property of F*
Proposition 1: SA with fixed temperature
takes exponential time to find a solution
of F*.
This shows that even for some simple
formulas in 2-CNF, SA cannot reach a
solution in polynomial time.

28
Analysis, cont.
c1  c2  c3  …  cn  a
T   T   T   …  T   T
F   F   F   …  F   T
F   F   F   …  F   F
Proposition 2: pure RW reaches this
solution with exponentially small probability.
29
SampleSat

In the SampleSat algorithm, we can divide the
search into 2 stages. Before SampleSat
reaches its first solution, it behaves like
WalkSat.

instance      WalkSat     SampleSat    SA
random        382         677          24667
logistics     5.7 × 10^4  15.5 × 10^5  > 10^9
verification  36          65           10821
30
SampleSat, cont.

After reaching a solution, the random walk
component is turned off because all clauses
are satisfied. SampleSat behaves like SA.
Proposition 3: SA at zero temperature
samples all solutions within a cluster
uniformly.
This 2-stage model explains why SampleSat
samples more uniformly than random walk
algorithms alone.
31
Back to Counting: ApproxCount

The algorithm works as follows (Jerrum,
Valiant, and Vazirani, 1986):
1. Draw K samples from the solution space
2. Pick a variable X in the current formula
3. Set variable X to its most sampled value t, and
the multiplier for X is K/#(X=t).
Note 1 ≤ multiplier ≤ 2
4. Repeat steps 1-3 until all variables are set
5. The number of solutions of the original formula is
the product of all multipliers.
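Why the product of multipliers estimates the count can be written as a telescoping product. The notation is introduced here for illustration: M(F) is the number of models of F, t_i is the value chosen for X_i, and F_i is the formula after the first i variables have been set.

```latex
M(F) = \frac{M(F_0)}{M(F_1)} \cdot \frac{M(F_1)}{M(F_2)} \cdots
       \frac{M(F_{n-1})}{M(F_n)} \cdot M(F_n),
\qquad F_0 = F,\quad F_i = F_{i-1} \wedge (X_i = t_i),\quad M(F_n) = 1.

\frac{M(F_{i-1})}{M(F_i)} \approx \frac{K}{\#(X_i = t_i)}
\quad \text{when the } K \text{ samples are drawn uniformly from the models of } F_{i-1}.
```

M(F_n) = 1 because each chosen value was observed in a sampled model, so the fully assigned formula is still satisfiable. Since t_i is the majority value, at least half of the samples agree with it, which gives the upper bound of 2 on each multiplier.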
32
Random 3-SAT, 75 Variables
(Sang, Beame, and Kautz, 2004)
[Figure: curves for Cachet, Relsat, and CDP, with the sat/unsat threshold marked]
33
34
Within the Capacity of Exact
Counters

We compare the results of ApproxCount with those of the exact
counters.

instance          #variables  Exact count  ApproxCount  Average error per step
prob004-log-a     1790        2.6 × 10^16  1.4 × 10^16  0.03%
wff.3.200.810     200         3.6 × 10^12  3.0 × 10^12  0.09%
dp02s02.shuffled  319         1.5 × 10^25  1.2 × 10^25  0.07%
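The per-step error column is consistent with reading it as a per-variable relative error: take the ratio of the two counts and spread it evenly over the n variables. This definition is an inference from the numbers in the table, not something the slide states, and the function name is hypothetical.

```python
def error_per_step(exact, approx, n_vars):
    """Per-variable relative error: 1 - (smaller/larger) ** (1/n)."""
    ratio = min(exact, approx) / max(exact, approx)
    return 1.0 - ratio ** (1.0 / n_vars)

rows = [
    ("prob004-log-a", 1790, 2.6e16, 1.4e16),
    ("wff.3.200.810", 200, 3.6e12, 3.0e12),
    ("dp02s02.shuffled", 319, 1.5e25, 1.2e25),
]
for name, n, exact, approx in rows:
    print(f"{name}: {100 * error_per_step(exact, approx, n):.2f}%")
```

Under this reading the three rows come out at 0.03%, 0.09%, and 0.07%, matching the table. It also shows why a final estimate that is off by a factor of nearly 2 over 1790 variables still corresponds to a very small error at each step.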
35
And beyond …

We developed a family of formulas
whose solutions are hard to count
– The formulas are based on SAT encodings
of the following combinatorial problem
– Given n different items, choose from
them a list (order matters) of m items
(m <= n). Let P(n,m) represent the number
of different lists that can be constructed.
P(n,m) = n!/(n-m)!
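A quick check of the closed form for P(n,m) against explicit enumeration on a small case. The function name is hypothetical, and the SAT encoding itself is not shown in the slides, so this only illustrates the count the encoded formulas are built to have.

```python
import math
from itertools import permutations

def ordered_lists(n, m):
    """P(n, m) = n! / (n - m)!, the number of ordered lists of m out of n items."""
    return math.factorial(n) // math.factorial(n - m)

# Cross-check the closed form against explicit enumeration for a small case.
n, m = 5, 3
enumerated = sum(1 for _ in permutations(range(n), m))
print(ordered_lists(n, m), enumerated)
```

Both values are 60 for n = 5 and m = 3. Because the count is known in closed form for any n and m, such formulas give a ground truth even at sizes where exact counters cannot finish.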
36
37
38
39
Conclusion and Future Work
Shows good opportunity to extend
SAT solvers to develop algorithms
for sampling and counting tasks.
 Next step: Use our methods in
probabilistic reasoning and
Bayesian inference domains.

40

The end.
41
42
43