Satisfiability modulo the Theory of Bit Vectors
Alessandro Cimatti
IRST, Trento, Italy
Joint work with R. Bruttomesso, A. Franzen, A. Griggio, R. Sebastiani
We gratefully acknowledge support from the Academic Research Program of Intel
1
Index of the talk
• Satisfiability Modulo Theory
• The theory of Bit Vectors
• Satisfiability Modulo BV
– Bit blasting
– Eager encoding into Linear Integer Arithmetic
– A lazy approach
• Conclusions
• ( A preview of QF_UFBV32 at SMT-COMP )
2
SMT in a nutshell
• Satisfiability Modulo Theory
– or: beyond boolean SAT
• Decide the satisfiability of a first order
formula with respect to a background theory
• Examples of relevant theories
– uninterpreted functions: x=y & f(x) != f(y)
– difference logic: x – y < 7
– linear arithmetic: 3x + 2y < 12
– arrays: read(write(M, a0, v0), a1)
– their combinations
– bit vectors
3
Why SMT
• From SAT-based to SMT-based verification
• Representation of interesting problems
– timed automata
– hybrid automata
– pipelines
– software
• Efficient solving
– leverage availability of structural information
– hopefully retaining efficiency of boolean SAT
4
Satisfiability Modulo Theory
• Satisfiability:
– is there a truth-assignment to boolean variables
– and a valuation to individual variables
– such that formula evaluates to true?
• Standard semantics for FOL
• Assignment to individual variables
– Induces truth values to atoms
• Truth assignment to boolean atoms
• Induced value to whole formula
5
[Figure: the propositional structure of a formula, built over boolean atoms (P) and theory atoms (TA); the theory atoms range over individual variables x, y, z, w]
6
Eager Approach to SMT
• Main idea: compilation to SAT
– STEP1: Theory part compiled to equisatisfiable
pure SAT problem
– STEP2: run propositional SAT solver
8
[Figure: propositional structure over boolean atoms (P) and theory atoms (TA); the theory atoms range over individual variables x, y, z, w]
9
[Figure: the lifted theory joined with the propositional structure; atoms TA TA TA TA P P P]
10
The Lazy approach
• Ingredients
– a boolean SAT solver
– a theory solver
• The boolean solver is modified to enumerate
boolean (partial) models
• The theory solver is used to check for theory
consistency
11
[Figure: propositional structure over boolean atoms (P) and theory atoms (TA); the theory atoms range over individual variables x, y, z, w]
12
MathSAT: intuitions
• Two ingredients: boolean search and theory reasoning
– find boolean model
• theory atoms treated as boolean atoms
• truth values to boolean and theory atoms
• model propositionally satisfies the formula
– check consistency wrt theory
• set of constraints induced by truth values to theory atoms
• existence of values to theory variables
• Example: (P v (x = 3)) & (Q v (x – y < 1)) & (y < 2) & (P xor Q)
• Boolean model
– !P, (x = 3), Q, (x – y < 1), (y < 2)
– Check (x = 3), (x – y < 1), (y < 2)
– Theory contradiction!
• Another boolean model
– P , !(x = 3) , !Q, (x – y < 1), (y < 2)
– Check !(x = 3), (x – y < 1), (y < 2)
– Consistent: e.g. x := 0, y := 0
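The two steps above (find a boolean model, then check it against the theory) can be sketched in a few lines of Python. This is only an illustration of the loop on the slide's example, not MathSAT's implementation: the boolean enumeration is brute force, and the "theory solver" is a toy that searches a small integer box (the names `ATOMS`, `theory_check`, `lazy_smt` and the `bound` parameter are assumptions made for this sketch).

```python
from itertools import product

# Theory atoms of the slide's example, as predicates over integers x, y.
ATOMS = {
    "x=3":   lambda x, y: x == 3,
    "x-y<1": lambda x, y: x - y < 1,
    "y<2":   lambda x, y: y < 2,
}
NAMES = ["P", "Q"] + list(ATOMS)

def formula(v):
    # (P v (x = 3)) & (Q v (x - y < 1)) & (y < 2) & (P xor Q)
    return ((v["P"] or v["x=3"]) and (v["Q"] or v["x-y<1"])
            and v["y<2"] and (v["P"] != v["Q"]))

def theory_check(v, bound=8):
    # Toy theory solver (assumption): search a small integer box for values
    # of x, y that give every theory atom the truth value chosen in v.
    for x, y in product(range(-bound, bound + 1), repeat=2):
        if all(ATOMS[a](x, y) == v[a] for a in ATOMS):
            return (x, y)
    return None

def lazy_smt():
    # Enumerate boolean models; return the first theory-consistent one.
    for bits in product([False, True], repeat=len(NAMES)):
        v = dict(zip(NAMES, bits))
        if formula(v):
            witness = theory_check(v)
            if witness is not None:
                return v, witness
    return None
```

Calling `theory_check` on the first boolean model of the slide (x = 3, x – y < 1 and y < 2 all true) finds no witness, while the second model is accepted.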
13
Boolean SAT: search space
[Figure: DPLL search tree branching on P, Q, R, S, T; one branch ends in SAT!]
• The DPLL procedure
• Incremental construction of satisfying assignment
• Backtrack/backjump on conflict
• Learn reason for conflict
• Splitting heuristics
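A minimal Python sketch of the basic procedure, for illustration only. It builds the assignment incrementally, propagates unit clauses and backtracks chronologically on conflict; backjumping, learning and any real splitting heuristic are left out. The DIMACS-style integer literals and the function name `dpll` are choices made for this sketch.

```python
def dpll(clauses, assignment=None):
    """Clauses are lists of nonzero ints: 3 means x3, -3 means not x3."""
    assignment = dict(assignment or {})
    # Simplify the clauses under the current partial assignment.
    simplified = []
    for clause in clauses:
        if any(assignment.get(abs(l)) == (l > 0) for l in clause):
            continue  # clause already satisfied
        rest = [l for l in clause if abs(l) not in assignment]
        if not rest:
            return None  # conflict: every literal of the clause is false
        simplified.append(rest)
    if not simplified:
        return assignment  # all clauses satisfied
    # Unit propagation first, otherwise split on some unassigned literal.
    unit = next((c[0] for c in simplified if len(c) == 1), None)
    if unit is not None:
        assignment[abs(unit)] = unit > 0
        return dpll(simplified, assignment)
    lit = simplified[0][0]
    for value in (lit > 0, not (lit > 0)):
        result = dpll(simplified, {**assignment, abs(lit): value})
        if result is not None:
            return result
    return None  # both branches failed: backtrack
```

The returned dictionary may be a partial assignment; variables it does not mention can take either value.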
14
MathSAT: approach
• DPLL-based enumeration of boolean models
– Retain all propositional optimizations
• Conflict-directed backjumping, learning
– No overhead if no theory reasoning
• Tight integration between
– boolean reasoning and
– theory reasoning
15
MathSAT: search space
[Figure: search tree over P, Q, R, S, T. Branches close either at the boolean level (Bool) or at the theory level (Math); one branch passes both checks: SAT!]
Many boolean models are not theory consistent!
16
Early pruning
Check theory consistency of partial assignments
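[Figure: search tree over P, Q, S, T, R with an early-pruning theory check (EP:Math) after each decision; a subtree is pruned away in the EP step, and the surviving branch passes the boolean and theory checks: SAT!]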
17
THEORY OF
FIXED-WIDTH BIT VECTORS
18
Bit Vectors: Example
input a, b, c, d : reg[N];
Left data path: ((a + 2b) + 4c) + 8d
LTmp0 = a;
LTmp1 = 2 * b;
LTmp2 = LTmp0 + LTmp1;
LTmp3 = 4 * c;
LTmp4 = LTmp2 + LTmp3;
LTmp5 = 8 * d;
LOut = LTmp4 + LTmp5;
Right data path: a + ((b + ((c + (d<<1)) <<1)) <<1)
RTmp0 = d;
RTmp1 = RTmp0 << 1;
RTmp2 = c + RTmp1;
RTmp3 = RTmp2 << 1;
RTmp4 = b + RTmp3;
RTmp5 = RTmp4 << 1;
ROut = a + RTmp5;
Are they equivalent? I.e. LOut = ROut ?
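For a small width the question can be settled by brute force. The Python sketch below assumes that every operation wraps modulo 2^N, which is the usual reading of fixed-width registers; the function names `left`, `right` and `equivalent` are made up for this illustration. It enumerates all inputs, so it is only usable for very small N.

```python
from itertools import product

def left(a, b, c, d, n):
    # ((a + 2b) + 4c) + 8d, truncated to n bits.
    mask = (1 << n) - 1
    return (((a + 2 * b) + 4 * c) + 8 * d) & mask

def right(a, b, c, d, n):
    # a + ((b + ((c + (d<<1)) <<1)) <<1), truncating after every step.
    mask = (1 << n) - 1
    t = (d << 1) & mask
    t = ((c + t) << 1) & mask
    t = ((b + t) << 1) & mask
    return (a + t) & mask

def equivalent(n):
    # Compare the two data paths on all 2^(4n) input combinations.
    return all(left(*v, n) == right(*v, n)
               for v in product(range(1 << n), repeat=4))
```

The cost grows as 2^(4N), which is one way to see why the width N matters so much in what follows.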
19
Fixed Width Bit Vectors
• Constants
– 0b00001111, 0xFFFF, …
• Variables
– valued over BitVectors of corresponding width
– implicit restriction to finite domain
• Function symbols
– selection: x[15:0]
– concatenation: y :: z
– bitwise operators: x && y, z || w, …
– arithmetic operators: x + y, z * w, …
– shifting: x << 2, y >> 3
• Predicate symbols
– comparators: =, ≠ , > , < , ≥ , ≤
20
Fragments of BV theory
• Core
– selection
– concatenation
• Bitwise operators
– x && y, x || y, x ^ y
• Arithmetic operators
– x + y, x – y, c * x
• Core + Bitwise + Arithmetic
• Complexity of equality between BV terms
– Core is in P
– Core + B + A in NP
• Variable width bit vectors: not covered here
– core is in NP
– small additions yield undecidability
21
Decision procedures for BV
• Many approaches
– Cyrluk, Moeller, Ruess
– Moeller, Ruess
– Bjørner, Pichora
– Barrett, Dill, Levitt
• Focus on deciding conjunctions of literals
• Emphasis on proof obligations in ITP
– some emphasis on variable width, generic wrt N
• Shostak-style integration
– canonization
– solving
22
SATISFIABILITY MODULO
THEORY OF BIT VECTORS
23
Satisfiability modulo Bit Vectors
• Applications of interest
– RTL hardware descriptions are essentially bit vectors
– assembly-level programs
– software with finite precision arithmetic
• Key feature
– combination of control flow and data flow
• In principle, boolean logic can be encoded into BV
– control (boolean logic) encoded into width 1 BVs.
– Likely inefficient in comparison to SAT
• More natural to keep them separate at modeling
– structural info can be exploited for verification
24
Approaches to SMT(BV)
• Bit blasting
• Eager Encoding into LA
• Lazy approach
25
SMT(BV) via Bit Blasting
26
SMT(BV) via Bit Blasting
• Boolean variables: untouched
• Bit vector variables as collections of (unrelated)
boolean variables
– [x0, x1, …, x63]
• Selection/concatenations are trivial
– static detection
• Equalities / Assignments: x = y
– (x0 <-> y0) & (x1 <-> y1) & … & (x63 <-> y63)
• Bitwise operators: x && y
– [x0 & y0, x1 & y1, …, x63 & y63]
• Arithmetic operators: x + y
– BVADD([x0, …, x63], [y0, …, y63])
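The encoding above can be illustrated on concrete values in Python. The slide does not say how BVADD is built; the sketch assumes a ripple-carry adder, and it evaluates the gates on Python booleans instead of emitting clauses for a SAT solver. All function names are hypothetical.

```python
def to_bits(value, n):
    # Least significant bit first: [x0, x1, ..., x(n-1)].
    return [bool((value >> i) & 1) for i in range(n)]

def from_bits(bits):
    return sum(1 << i for i, b in enumerate(bits) if b)

def bv_eq(xs, ys):
    # (x0 <-> y0) & (x1 <-> y1) & ...
    return all(x == y for x, y in zip(xs, ys))

def bv_and(xs, ys):
    # [x0 & y0, x1 & y1, ...]
    return [x and y for x, y in zip(xs, ys)]

def bv_add(xs, ys):
    # Ripple-carry adder (assumption): one full adder per bit position.
    out, carry = [], False
    for x, y in zip(xs, ys):
        out.append(x ^ y ^ carry)
        carry = (x and y) or (carry and (x ^ y))
    return out  # the final carry is dropped: addition modulo 2^n
```

Each carry is the kind of additional variable that a real encoding into SAT would have to introduce.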
27
Comparison of Data Paths
input a, b, c, d : reg[N];
Left data path: ((a + 2b) + 4c) + 8d
LTmp0 = a;
LTmp1 = 2 * b;
LTmp2 = LTmp0 + LTmp1;
LTmp3 = 4 * c;
LTmp4 = LTmp2 + LTmp3;
LTmp5 = 8 * d;
LOut = LTmp4 + LTmp5;
Right data path: a + ((b + ((c + (d<<1)) <<1)) <<1)
RTmp0 = d;
RTmp1 = RTmp0 << 1;
RTmp2 = c + RTmp1;
RTmp3 = RTmp2 << 1;
RTmp4 = b + RTmp3;
RTmp5 = RTmp4 << 1;
ROut = a + RTmp5;
Are they equivalent? I.e. LOut = ROut ?
28
Bit Blasting Words
• a,b,c,d,…
– blasted to [a1,…aN], [b1,…bN], [c1,…cN], [d1,…dN], …
• LOut != ROut
– (LOut.1 != ROut.1) or … or (LOut.N != ROut.N)
• LTmp1 = 2 * b
– formula in 2N vars, conjunction of N iffs
• LTmp2 = LTmp0 + LTmp1
– formula relating 3N vars
– possibly additional vars required (e.g. carries)
• N = 16 bits?
– 13 secs
• N = 32 bits?
– 170 secs
• “But obviously N = 64 bits!”
– stopped after 2h CPU time
• Scalability with respect to N???
29
Bit-Blasting: Pros and Cons
• Bottlenecks
– dependency on word width
– “wrong” level of abstraction
• boolean synthesis of arithmetic circuits
• assignments are pervasive
• conflicts are very fine grained
– e.g. discover x < y
• Advantages
– let the SAT solver do all the work
• and nowadays SAT solvers are tough nuts to crack
– amalgamation of the decision process
• no distinction between control and data
• conflicts can be as fine grained as possible
– built-in capability to generate “new atoms”
30
Enhancements to BitBlasting
• Tuning SAT solver on structural information
– e.g. splitting heuristic for adders
• Preprocessing + SAT [GBD05]
– rewrite and normalize bit vector terms
– bit blasting to SAT
31
SMT(BV) via reduction to SMT(LA)
32
From BV to LIA
• RTL-Datapath Verification using Integer Linear Programming [BD01]
• BV constants as integers
– 0b32_1111 as 15
• BV variables as integer valued variables, with range constraints
– reg x [31:0] as x in range [0, 2^32)
• Assignments treated as equality, e.g. x = y
• Arithmetic, e.g. z = x + y
– Linear arithmetic? not quite! BV Arithmetic is modulo 2^N
– z = x + y - 2^N s, with z in [0, 2^N)
• Concatenation: x :: y as 2^n x + y
• Selection: relational encoding (based on integrity)
– x[23:16] as xm, where
– x = 2^24 xh + 2^16 xm + xl,
xl in [0, 2^16), xm in [0, 2^8), xh in [0, 2^8)
• Bitwise operators
– based on selection of individual bits
• SOLVER
– the omega test
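The arithmetic behind these encodings can be checked on concrete values. The Python sketch below is not the encoder of [BD01]; it only confirms that the integer constraints describe the bit-vector operations. It assumes that s ranges over {0, 1} (enough when x and y are both in [0, 2^N)) and that n in the concatenation rule is the width of y. Function names are made up for the illustration.

```python
def add_encoding_holds(x, y, n):
    # z = x + y - 2^N * s with z in [0, 2^N); s in {0, 1} is an assumption
    # that follows from x, y both lying in [0, 2^N).
    z = (x + y) % (1 << n)
    return any(z == x + y - (1 << n) * s for s in (0, 1))

def select_23_16(x):
    # Split a 32-bit x as x = 2^24*xh + 2^16*xm + xl and return xm = x[23:16].
    xh, rest = divmod(x, 1 << 24)
    xm, xl = divmod(rest, 1 << 16)
    assert 0 <= xl < 1 << 16 and 0 <= xm < 1 << 8 and 0 <= xh < 1 << 8
    return xm

def concat(x, y, n):
    # x :: y as 2^n * x + y, where n is taken to be the width of y.
    return (x << n) + y
```

For instance, `select_23_16(0x12345678)` yields `0x34`, the second byte from the top.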
33
From SMT(BV) into SMT(LIA)
• Generalizes [BD01] to deal with boolean
structure
• Eager encoding into SMT(LIA)
• Unfortunately, not very efficient
• More precisely, a failure
34
Retrospective Analysis
• Crazy approach?
– Arithmetic
• Linear arithmetic? not quite! BV Arithmetic is modulo 2^N
– Selection and Concatenation
• an easy problem becomes expensive!
– Bitwise operators
• HARD!!!
• Available solvers not adequate
– integers with infinite precision
– reasoning with integers may be hard (e.g. BnB within real relaxation)
• Functional dependencies are lost!
• A clear culprit: static encoding
– depending on control flow, same signal is split in different parts
– z = if P then x[7:0] :: y[3:0] else x[5:2] :: y[10:3]
• x, y and also z are split more than needed
• the notion of “maximal chunk” depends on P !!!
35
SMT(BV) via online BV reasoning
36
A lazy approach
• Based on standard MathSAT schema
– DPLL-based model enumeration
– Dedicated Solver for Bit vectors
• The encoding leverages information resulting
from decisions
– Given values to control variables, the data path is
easier to deal with (e.g. maximal chunks are bigger)
• Layering in the theory solver
– equality reasoning
– limited simplification rules
– full blown bit vector solver only at the end
37
The architecture
Boolean enumeration
EUF reasoning
BV rewriter
BV solver
LIA encoding
38
Rewriting rules
• evaluation of constant terms
– 0b8_01010101[4:2] becomes 0b3_101
• rules for equality
– x = y and Phi(x) becomes Phi(y)
– based on congruence closure
• splitting concatenations
– (x :: y) = z becomes x = z[h_n] & y = z[l_n]
39
Rewriting rules
• pushing selections
– (x && y)[7:0] becomes (x[7:0] && y[7:0])
– (x :: y)[23:8] becomes (x[7:0] :: y[15:8])
• “pigeon-hole” rules
– from (x != 0 & x != 1 & x != 2 & x < 3) derive false
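The rules on these two slides can be sanity-checked on concrete values. In the Python sketch below bit vectors are non-negative integers with bit 0 least significant. The concatenation rule is checked for y of width 16, which is an assumption, since the slide does not state the widths. The helper names `sel`, `concat`, `rules_hold` and `pigeon_hole_unsat` are hypothetical.

```python
def sel(value, hi, lo):
    # value[hi:lo] on a non-negative integer.
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)

def concat(x, y, width_y):
    # x :: y, with y occupying the low width_y bits.
    return (x << width_y) | y

def rules_hold(x, y):
    # y is treated as a 16-bit vector (assumption).
    y &= 0xFFFF
    # (x && y)[7:0] becomes (x[7:0] && y[7:0])
    push_and = sel(x & y, 7, 0) == (sel(x, 7, 0) & sel(y, 7, 0))
    # (x :: y)[23:8] becomes (x[7:0] :: y[15:8])
    push_concat = (sel(concat(x, y, 16), 23, 8)
                   == concat(sel(x, 7, 0), sel(y, 15, 8), 8))
    return push_and and push_concat

def pigeon_hole_unsat(width):
    # No unsigned x satisfies x != 0 & x != 1 & x != 2 & x < 3.
    return not any(x not in (0, 1, 2) and x < 3 for x in range(1 << width))
```

The constant-evaluation rule is covered as well: `sel(0b01010101, 4, 2)` gives `0b101`.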
40
BV rewriter
• Rules are applied until
– fix point reached
– contradiction found
• Implementation based on EUF reasoner
– rules as merges between eq classes
• Open issues
– incrementality/backtrackability
– selective rule activation
– conflict set reconstruction
• When it fails …
41
LIA encoding (the last hope)
• LIA encoding
– identification of maximal slices
– “purification”: separating out arithmetic and BW
by introduction of additional variables
• NB: on resulting problems
– LIA encoding always superior to bit blasting!!!
– cf. [BD01]
42
Status of Implementation
• Implementation still in prototypical state
• “Does a lot of stupid things”
– conflict minimization by deletion filtering
– checking that conflicts are in fact minimal
– unnecessary calls to LA for SAT clusters
– calling LA solver implemented as dump on file, and
run external MathSAT
– huge conflict sets
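Deletion filtering, mentioned in the first item, is a generic way to shrink an inconsistent set of constraints: drop one constraint at a time and keep it out if the rest is still inconsistent. The Python sketch below shows the general technique, not the prototype's code. The consistency check is a toy search over a small integer box, and all names are assumptions. It needs one consistency check per constraint, which may be why the slide counts it among the costly steps.

```python
from itertools import product

def is_consistent(constraints, bound=8):
    # Toy check (assumption): look for integer x, y in a small box that
    # satisfy every constraint.
    return any(all(c(x, y) for c in constraints)
               for x, y in product(range(-bound, bound + 1), repeat=2))

def deletion_filter(constraints, consistent=is_consistent):
    """Shrink an inconsistent list of constraints to a minimal conflict."""
    core = list(constraints)
    for c in list(constraints):
        trial = [k for k in core if k is not c]
        if not consistent(trial):
            core = trial  # c is not needed for the conflict
    return core

# The inconsistent set from the MathSAT example, plus an irrelevant constraint.
c1 = lambda x, y: x == 3
c2 = lambda x, y: x - y < 1
c3 = lambda x, y: y < 2
c4 = lambda x, y: x > 0
core = deletion_filter([c1, c2, c3, c4])
```

Here `core` keeps `c1`, `c2` and `c3` and drops `c4`, since the first three already contradict each other.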
43
A very very preliminary evaluation
44
Competitors
• Run against MiniSAT 1.14
– ~ winner of SAT competition in 2005
• KEY REMARK:
– boolean methods are very mature
• A good reason for giving up?
45
Test benches
• 74 benchmarks from industrial partner
– would have been ideal for SMT-COMP
• QF_UFBV32
• Unfortunately
– can not be disclosed
– “will have to be destroyed after the collaboration”
– hopefully our lives will be spared 
46
47
48
Conclusions
• A “market need” for SMT(BV) solvers
• Bit Blasting: tough competitors
• After a failure, …
• Preliminary results are encouraging
• Future challenges
– optimize BV solver
– better conflict sets
– tackle some RTL verification cases
– extension to memories
49
A small digression on
QF_UFBV32 at SMT-COMP
50
QF_UFBV[32] at SMT-COMP
• the MathSAT you will see there IS NOT the one I
described
• We currently have no results for QF_UFBV
• Easy benchmarks:
– QF_UFBV[32] not particularly “SMT”
– the boolean component is nearly missing
– the BV part is “easily” solvable by bit blasting
• We entered SMT-COMP QF_UFBV32
– MathSAT based on BIT BLASTING to SAT
– NuSMV based on bit blasting to BDDs
51
QF_UFBV: Bit Blasting to SAT
• Preprocessing based on
– Ackermann’s elimination of function symbols
– rewriting simplification
– bit blasting
• Core: call SAT solver underlying MathSAT
– every SAT problem in < 0.3 secs
– most UNSAT within seconds
– a handful of hard ones between 300 and 500 secs
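The first preprocessing step can be sketched as follows: each application f(a) is replaced by a fresh variable, and for every pair of applications of the same function a functional consistency constraint is added. The Python below handles only unary functions applied to variables and produces constraints as strings; this representation and the names `ackermann` and `f_x` are assumptions for the illustration, not the preprocessor used in MathSAT.

```python
from itertools import combinations

def ackermann(applications):
    """applications: list of (function_name, argument_name) pairs, e.g. ("f", "x").
    Returns the fresh variable for each application and the functional
    consistency constraints between applications of the same function."""
    fresh = {app: f"{app[0]}_{app[1]}" for app in applications}
    constraints = []
    for (f, a), (g, b) in combinations(applications, 2):
        if f == g:
            constraints.append(
                f"({a} = {b}) -> ({fresh[(f, a)]} = {fresh[(g, b)]})")
    return fresh, constraints
```

On the earlier example x=y & f(x) != f(y), the applications f(x) and f(y) become f_x and f_y, and the added constraint (x = y) -> (f_x = f_y) makes the formula unsatisfiable without any function symbol left.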
52
BDDs (???) on SMT-COMP tests
• Even NuSMV entered SMT-COMP
• Ackermann’s elimination of functional symbols
• Rewriting preprocessor
• Core solver
– based on BDDs
– conjunctively partitioned problem
– structural BDD-based ordering (bit interleaving)
– (almost) no dynamic reordering
– affinity-based clustering, threshold 100 nodes
– early quantification
• Seems to work well both on SAT and UNSAT
instances
53
RESULTS
• first STP
• then YICES
• then NuSMV
• then CVC3 (but no results on two samples)
• then MathSAT BITBLASTING
– 3rd on SAT
– last on UNSAT
54
SAT instances
[Plot: one curve each for YICES, NUSMV, CVC3, MATHSAT; log-scale vertical axis from 0.01 to 1000, instances 1 to 51 on the horizontal axis]
55
UNSAT instances
[Plot: one curve each for YICES, NUSMV, CVC3, MATHSAT; log-scale vertical axis from 0.01 to 1000.00, instances 1 to 51 on the horizontal axis]
56