Decision Procedures Customized for Formal Verification Randal E. Bryant Carnegie Mellon University http://www.cs.cmu.edu/~bryant Contributions by former graduate students: Sanjit Seshia, Shuvendu Lahiri.
Decision Procedures Customized for Formal Verification
Randal E. Bryant
Carnegie Mellon University
http://www.cs.cmu.edu/~bryant
Contributions by former graduate students: Sanjit Seshia, Shuvendu Lahiri
Outline
Context
Infinite state models of hardware systems
Verification techniques
Needs
Requirements for decision procedures
Dealing with quantifiers
Our Solution
SAT-based procedure
“Eager” Boolean encoding
– 2 – CADE ‘05
Verification Example
Task
Verify that microprocessor correctly implements instruction set definition
Even though heavily pipelined
[Figure: Alpha 21264 Microprocessor. Source: Microprocessor Report, Oct. 28, 1996]
Existing Hardware Verification Methods
Simulators, equivalence checkers, model checkers, …
All Operate at Bit Level
View each register or memory bit as state variable
Behavior of each state variable defined by Boolean function
Strengths
Finite-state systems conceptually simple
BDDs & SAT procedures allow high degrees of automation
Limitations
State space can be very large
Only verify fixed instantiation of system
Specific memory sizes, number of processes, buffer lengths, …
Verification Challenges
Sources of Complexity
Lots of internal state
Complex control logic
Opportunities
Most of the logic serves to store, select, and communicate data
[Figure: Alpha 21264 Microprocessor. Source: Microprocessor Report, Oct. 28, 1996]
Applying Data Abstraction to Hardware Verification
Idea
Abstract details of data encodings and operations
Keep control logic precise
Applications
Verify overall correctness of system
Assuming individual functional units correct
Advantages of Abstraction
Abstract infinite-state system easier to verify than detailed finite-state one
Parametric representation allows verification of many different system variants
Arbitrary number of processes, buffer lengths, etc.
Word Abstraction
[Figure: control logic and data path, each with combinational logic blocks (Com. Log. 1, Com. Log. 2)]
Data: Abstract details of form & functions
Control: Keep at bit level
Timing: Keep at cycle level
Data Abstraction #1: Bits → Terms
[Figure: bits $x_0, x_1, x_2, \ldots, x_{n-1}$ replaced by a single term $x$]
View Data as Symbolic Words
Arbitrary integers
No assumptions about size or encoding
Classic model for reasoning about software
Can store in memories & registers
Abstracting Data Bits
[Figure: control logic with combinational blocks 1 and 2 marked "?"]
What do we do about logic functions?
Abstraction #2: Uninterpreted Functions
[Figure: ALU replaced by function $f$]
For any Block that Transforms or Evaluates Data:
Replace with generic, unspecified function
Only assumed property is functional consistency:
$a = x \land b = y \Rightarrow f(a, b) = f(x, y)$
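Functional consistency can be illustrated with a small Python sketch. The class name `UninterpretedFunction` and the fresh-value naming scheme are assumptions made for illustration, not part of UCLID. The object knows nothing about what the block computes, only that equal arguments give equal results.

```python
import itertools


class UninterpretedFunction:
    """Hypothetical stand-in for a block such as an ALU.

    Returns a fresh symbolic value for each new argument tuple and the
    same value whenever the arguments repeat (functional consistency).
    """

    def __init__(self, name):
        self.name = name
        self._table = {}                 # argument tuple -> symbolic result
        self._fresh = itertools.count()  # source of fresh indices

    def __call__(self, *args):
        if args not in self._table:
            self._table[args] = f"{self.name}#{next(self._fresh)}"
        return self._table[args]


alu = UninterpretedFunction("alu")
a, b = 3, 4
x, y = 3, 4
same = alu(a, b) == alu(x, y)    # a = x and b = y, so the results must agree
other = alu(a, b) == alu(b, a)   # different arguments: nothing is assumed
```

Here `other` is false only because this sketch picks the most general interpretation; a real interpretation of the function could still map different arguments to the same value.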
Abstracting Functions
[Figure: control logic and data path with combinational logic blocks]
For Any Block that Transforms Data:
Replace by uninterpreted function
Ignore detailed functionality
Conservative approximation of actual system
Abstraction #3: Modeling Memories as Mutable Functions
Memory M Modeled as Function
$M(a)$: Value at location $a$
Initially
Arbitrary state
Modeled by uninterpreted function $m_0$
[Figure: memory $M$ read at address $a$; initial contents given by $m_0$]
Effect of Memory Write Operation
Writing Transforms Memory
$M' = \mathit{Write}(M, wa, wd)$
[Figure: multiplexor selecting $wd$ when $a = wa$, and $M(a)$ otherwise]
Express with Lambda Notation
$M' = \lambda a .\, \mathit{ITE}(a = wa, wd, M(a))$
Reading from updated memory:
Address $wa$ will get $wd$
Otherwise get what’s already in $M$
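The lambda formulation of a write maps directly onto closures. The following is a minimal sketch, assuming memories are plain Python callables and that the initial state `m0` returns an opaque symbolic tuple per address; both names are illustrative.

```python
def write(mem, wa, wd):
    """Return the memory after writing data wd at address wa."""
    return lambda a: wd if a == wa else mem(a)


def m0(a):
    """Arbitrary initial contents: one symbolic value per address."""
    return ("m0", a)


m1 = write(m0, 4, "d1")
m2 = write(m1, 8, "d2")
m3 = write(m2, 4, "d3")   # a later write to the same address wins
```

Reading `m3(4)` yields `"d3"`, while reading any address that was never written falls through the chain of writes to `m0`.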
Systems with Buffers
[Figure: unbounded buffer and circular queue, each with an "In Use" region]
Modeling Method
Mutable function to describe buffer contents
Integers to represent head & tail pointers
Parameterize buffer capacity with symbolic value Max
Some History of Term-Level Modeling
Historically
Standard model used for program verification
Unbounded integer data types
Widely used with theorem-proving approaches to hardware verification
E.g., Hunt ’85
Automated Approaches to Hardware Verification
Burch & Dill, ’95
Tool for verifying pipelined microprocessors
Implemented by form of symbolic simulation
Continued application to pipelined processor verification
UCLID
Seshia, Lahiri, Bryant, CAV ‘02
Term-Level Verification System
Language for describing systems
Inspired by CMU SMV
Symbolic simulator
Generates integer expressions describing system state after sequence of steps
Decision procedure
Determines validity of formulas
Support for multiple verification techniques
Available by Download
http://www.cs.cmu.edu/~uclid
Required Logic
Scalar Data Types
Formulas ($F$): Boolean Expressions
Control signals
Terms ($T$): Integer Expressions
Data values
Functional Data Types
Functions ($\mathit{Fun}$): Integer $\to$ Integer
Immutable: Functional units
Mutable: Memories
Predicates ($P$): Integer $\to$ Boolean
Immutable: Data-dependent control
Mutable: Bit-level memories
CLU Logic
Counter Arithmetic, Lambda Expressions and Uninterpreted Functions
Terms ($T$): Integer Expressions
$\mathit{ITE}(F, T_1, T_2)$: If-then-else
$\mathit{Fun}(T_1, \ldots, T_k)$: Function application
$\mathit{succ}(T)$: Increment
$\mathit{pred}(T)$: Decrement
Formulas ($F$): Boolean Expressions
$\lnot F$, $F_1 \land F_2$, $F_1 \lor F_2$: Boolean connectives
$T_1 = T_2$: Equation
$T_1 < T_2$: Inequality
$P(T_1, \ldots, T_k)$: Predicate application
To support pointer operations
CLU Logic (Cont.)
Functions ($\mathit{Fun}$): Integer $\to$ Integer
$f$: Uninterpreted function symbol
$\lambda x_1, \ldots, x_k .\, T$: Function definition
Predicates ($P$): Integer $\to$ Boolean
$p$: Uninterpreted predicate symbol
$\lambda x_1, \ldots, x_k .\, F$: Predicate definition
Outline
Context
Infinite state models of hardware systems
Verification techniques
Needs
Requirements for decision procedures
Dealing with quantifiers
Our Solution
SAT-based procedure
“Eager” Boolean encoding
Verifying Safety Properties
[Figure: state machine with present state, next state, reset, and arbitrary inputs; state space showing reset states, reachable states, and bad states]
State Machine Model
State encoded as Booleans, integers, and functions
Next state function expresses how updated on each step
Prove: System will never reach bad state
Bounded Model Checking
[Figure: Reset States, $R_1$, $R_2$, …, $R_n$ within the reachable states; Bad States]
Repeatedly Perform Image Computations
Set of all states reachable by one more state transition
Underapproximation of Reachable State Set
But, typically catch most bugs with 8 to 10 steps
Implementing BMC
[Figure: system unrolled from Reset with inputs $X_1, X_2, \ldots, X_n$; is Bad satisfiable?]
Construct verification condition formula for step n by symbolically simulating system for n cycles
Check with decision procedure
Do as many cycles as tractable
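The loop structure of bounded model checking can be sketched with explicit state sets. This is only an illustration: explicit enumeration stands in for symbolic simulation plus a decision procedure, and the counter system, its inputs, and the bad-state condition are all hypothetical.

```python
def bounded_check(reset, step, inputs, bad, n):
    """Return the first step k <= n at which a bad state is reachable, else None."""
    frontier = set(reset)                 # states reachable in exactly k steps
    for k in range(n + 1):
        if any(bad(s) for s in frontier):
            return k
        frontier = {step(s, i) for s in frontier for i in inputs}
    return None


# Hypothetical system: a buffer occupancy counter that may grow by one per step.
reset_states = {0}
inputs = (0, 1)                           # 0 = hold, 1 = enqueue
step = lambda s, i: s + i
bad = lambda s: s == 5                    # assumed bad state: occupancy 5

shallow = bounded_check(reset_states, step, inputs, bad, 3)
deeper = bounded_check(reset_states, step, inputs, bad, 8)
```

With a bound of 3 the check finds nothing (`shallow` is `None`), while a bound of 8 reports the violation at step 5. This shows why the method underapproximates the reachable set: a clean result only covers the steps explored.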
True Model Checking
[Figure: Reset States, $R_1$, $R_2$, …, $R_n$; Bad States]
Reach Fixed-Point
$R_n = R_{n+1} = \text{Reachable}$
Impractical for Term-Level Models
Many systems never reach fixed point
Can keep adding elements to buffer
Convergence test undecidable (Bryant, Lahiri, Seshia, CHARME ’03)
Inductive Invariant Checking
[Figure: invariant $I$ containing the reset states and reachable states; Bad States]
Key Properties of System that Make it Operate Correctly
Formulate as formula $I$
Prove Inductive
Holds initially: $I(s_0)$
Preserved by all state changes: $I(s) \Rightarrow I(\delta(i, s))$
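The two proof obligations can be checked by brute force on a small finite system. This is a sketch under stated assumptions: the 16-state counter, the next-state function `step(i, s)`, and both candidate invariants are hypothetical, and exhaustive enumeration replaces the decision procedure.

```python
def is_inductive(inv, states, reset, inputs, step):
    """Check I(s0) for reset states and I(s) => I(step(i, s)) for all s, i."""
    holds_initially = all(inv(s0) for s0 in reset)
    preserved = all(inv(step(i, s))
                    for s in states if inv(s)
                    for i in inputs)
    return holds_initially and preserved


STATES = range(16)                      # assumed finite state space
RESET = {0}
INPUTS = (0, 1)
step = lambda i, s: (s + 2 * i) % 16    # hypothetical next-state function

never_three = lambda s: s != 3          # true of every reachable state
is_even = lambda s: s % 2 == 0          # stronger property

weak = is_inductive(never_three, STATES, RESET, INPUTS, step)
strong = is_inductive(is_even, STATES, RESET, INPUTS, step)
```

The property `never_three` holds in every reachable state yet is not inductive, because the unreachable state 1 satisfies it and steps to 3. The stronger `is_even` is inductive and implies it, which is the usual reason invariants must be strengthened by hand.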
Inductive Invariants
Formulas $I_1, \ldots, I_n$
$I_j(s_0)$ holds for any initial state $s_0$, for $1 \le j \le n$
$I_1(s) \land I_2(s) \land \cdots \land I_n(s) \Rightarrow I_j(s')$ for any current state $s$ and successor state $s'$, for $1 \le j \le n$
Overall Correctness
Follows by induction on time
Restricted form of invariants
$\forall x_1 \forall x_2 \ldots \forall x_k\, \Phi(x_1 \ldots x_k)$
$\Phi(x_1 \ldots x_k)$ is a CLU formula without quantifiers
$x_1 \ldots x_k$ are integer variables free in $\Phi(x_1 \ldots x_k)$
Express properties that hold for all buffer indices, register IDs, etc.
Proving Invariants
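Proving invariants inductive requires quantifiers
$\models [\forall x_1 \forall x_2 \ldots \forall x_k\, \Phi(x_1 \ldots x_k)] \Rightarrow [\forall y_1 \forall y_2 \ldots \forall y_m\, \Psi(y_1 \ldots y_m)]$
Prove unsatisfiability of formula
$\forall x_1 \forall x_2 \ldots \forall x_k\, \Phi(x_1 \ldots x_k) \land \lnot\Psi(y_1 \ldots y_m)$
Undecidable Problem
In logic with uninterpreted functions and equality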
Invariant Checking: Out-of-Order Processor Designs
Design               Total Invariants  UCLID time  Person time
base                 13                54 s        2 days
exc                  34                236 s       7 days
exc / br             39                403 s       9 days
exc / br / mem-simp  67                1594 s      24 days
exc / br / mem       71                2200 s      34 days
Generating invariants requires considerable human effort
Impractical for realistic designs
Constructing Invariants from Predicates
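Predicates
$\text{rob.head} \le \text{reg.tag}(r)$
$\text{reg.valid}(r)$
$\text{reg.tag}(r) = t$
$\text{rob.dest}(t) = r$
Invariant
$\forall r, t.\ \lnot\text{reg.valid}(r) \land \text{reg.tag}(r) = t \Rightarrow (\text{rob.head} \le \text{reg.tag}(r) < \text{rob.tail} \land \text{rob.dest}(t) = r)$
Result: Correctness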
Automatic Predicate Abstraction
Graf & Saïdi, CAV ’97
Idea
Given set of predicates $P_1(s), \ldots, P_k(s)$
Boolean formulas describing properties of system state
View as abstraction mapping: States $\to \{0,1\}^k$
Defines abstract FSM over state set $\{0,1\}^k$
Form of abstract interpretation
Do reachability analysis similar to symbolic model checking
Early Implementations Inefficient
Guess at possible next abstract states
Test with call to decision procedure
P.A. as Invariant Generator
[Figure: reset states abstracted (A) into the abstract system, where $R_1, R_2, \ldots, R_n$ are computed; the result is concretized (C) to invariant $I$ of the concrete system]
Reach Fixed-Point on Abstract System
Termination guaranteed, since finite state
Equivalent to Computing Invariant for Concrete System
Strongest possible invariant that can be expressed by formula over these predicates
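The abstract fixed-point computation and its concretization can be sketched on a small explicit system. Everything concrete here is an assumption for illustration: the 16-state counter, the two predicates, and the use of exhaustive enumeration where a real tool would call a decision procedure.

```python
def alpha(s, preds):
    """Abstraction mapping: concrete state -> tuple of predicate values."""
    return tuple(p(s) for p in preds)


def abstract_reach(states, reset, inputs, step, preds):
    """Reachable abstract states of the abstract FSM induced by preds."""
    trans = {}
    for s in states:                       # existential abstraction of transitions
        for i in inputs:
            trans.setdefault(alpha(s, preds), set()).add(alpha(step(i, s), preds))
    reached = {alpha(s, preds) for s in reset}
    frontier = set(reached)
    while frontier:                        # terminates: at most 2**k abstract states
        frontier = {b2 for b in frontier for b2 in trans.get(b, ())} - reached
        reached |= frontier
    return reached


STATES = range(16)
step = lambda i, s: (s + 2 * i) % 16       # hypothetical next-state function
preds = [lambda s: s % 2 == 0, lambda s: s < 8]

reached = abstract_reach(STATES, {0}, (0, 1), step, preds)
invariant = lambda s: alpha(s, preds) in reached    # concretization
```

The reachable abstract states are those where the first predicate is true, so the concretized invariant says the counter is always even. That is the strongest statement these two predicates can express about this system.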
Symbolic Formulation of Predicate Abstraction
Lahiri, Bryant, Cook, CAV ‘03
Basic Operation
Compute set of legal abstract next states $\rho'(B')$ given current abstract states $\rho(B)$
$B$, $B'$: Abstract current and next-state state variables
$\rho$, $\rho'$: Boolean formulas
Create formula of form $\theta(S, B')$
Possible combinations of current concrete state $S$ and next abstract state $B'$
Formulate as Quantifier Elimination Problem
Generate formula of form $\rho'(B') \Leftrightarrow \exists S\, \theta(S, B')$
$S$: Integer variables
For interpretation of $B'$, formula $\rho'$ true iff $\theta(S, B')$ satisfiable
Outline
Context
Infinite state models of hardware systems
Verification techniques
Needs
Requirements for decision procedures
Dealing with quantifiers
Our Solution
SAT-based procedure
“Eager” Boolean encoding
Decision Procedure Needs
Bounded Model Checking
Satisfiability of quantifier-free CLU formula
Handled by decision procedure
Invariant Checking
Satisfiability of quantified CLU formula
Undecidable
Predicate Abstraction
Eliminate quantifiers from CLU formula
Role of Decision Procedure
Apply in sound, but incomplete way
UCLID Decision Procedure Operation
Series of transformations leading to propositional formula
Except for lambda expansion, each has polynomial complexity
[Figure: CLU Formula → Lambda Expansion → $\lambda$-free Formula → Function & Predicate Elimination → Term Formula → Finite Instantiation → Boolean Formula → Boolean Satisfiability]
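One known way to eliminate function applications is to introduce fresh variables combined with nested ITE terms that enforce functional consistency. The sketch below illustrates that idea on term strings. The function name `eliminate` and the `vf1`, `vf2` naming scheme are assumptions, and the slides do not spell out the exact transformation UCLID applies.

```python
def eliminate(name, arg_list):
    """Replace successive applications name(arg) by ITE terms over fresh variables."""
    seen = []   # (argument, fresh variable) for earlier applications
    out = []
    for k, arg in enumerate(arg_list, start=1):
        fresh = f"v{name}{k}"
        term = fresh
        # Wrap from the most recent application outward, so the outermost
        # test compares against the earliest application.
        for prev_arg, prev_var in reversed(seen):
            term = f"ITE({arg} = {prev_arg}, {prev_var}, {term})"
        out.append(term)
        seen.append((arg, fresh))
    return out


terms = eliminate("f", ["a", "b", "c"])
```

For the three applications `f(a)`, `f(b)`, `f(c)`, the third replacement term reuses `vf1` when `c = a`, reuses `vf2` when `c = b`, and is otherwise the unconstrained variable `vf3`.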
SAT-based Decision Procedures
EAGER ENCODING
[Figure: Input Formula → Satisfiability-preserving Boolean Encoder → Boolean Formula → SAT Solver → satisfiable / unsatisfiable]
LAZY ENCODING
[Figure: Input Formula → Approximate Boolean Encoder → Boolean Formula → SAT Solver → unsatisfiable, or satisfying assignment → First-order Conjunctions SAT Checker → satisfiable, or unsatisfiable with an additional clause fed back to the Boolean formula]
Eager Encoding Characteristics
[Figure: Input Formula → Satisfiability-preserving Boolean Encoder → Boolean Formula → SAT Solver → satisfiable / unsatisfiable]
- Must encode all information about domain properties into Boolean formula
- Some properties can give exponential blowup
+ Lets SAT solver do all of the work
Good Approach for Some Domains
Modern SAT solvers have remarkable capacity
Good at extracting relevant portions out of very large formulas
Learns about formula properties as search proceeds
Encoding Methods
[Figure: Difference Logic Formula → Small Domain Encoding (SD) or Per-Constraint Encoding (PC) → Boolean Formula → SAT Solver → satisfiable/unsatisfiable]
Small Domain Encoding (SD)
[Bryant, Lahiri, Seshia, CAV’02]
[Example: difference logic formula relating $x$ to $y$, $y$ to $z$, and $z$ to $x+1$, encoded over bit-vectors $0x_1x_0$, $0y_1y_0$, $0z_1z_0$, and $0x_1x_0 + 1$]
Observation: To check satisfiability, need to consider all possible relative orderings of finitely-many expressions
[Figure: possible orderings of $x$, $y$, $z$, and $x+1$ along an axis of increasing values]
Can use Boolean encoding of finite range of values
4 values in this case, so 2-bit encoding
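The small-domain idea can be sketched by giving every integer variable a fixed number of bits and searching the resulting Boolean space. This is an illustration under assumptions: constraints are taken to have the form u ≥ v + c, the two example constraint sets are hypothetical, and exhaustive search over the bits stands in for a SAT solver.

```python
from itertools import product


def small_domain_sat(variables, constraints, bits):
    """Search all bit-level assignments; constraints are (u, v, c) for u >= v + c."""
    for assignment in product((0, 1), repeat=bits * len(variables)):
        value = {}
        for k, name in enumerate(variables):
            chunk = assignment[k * bits:(k + 1) * bits]
            value[name] = sum(bit << j for j, bit in enumerate(chunk))
        if all(value[u] >= value[v] + c for u, v, c in constraints):
            return value
    return None


names = ["x", "y", "z"]
# Assumed example: x >= y, y >= z, z >= x + 1 (a cycle that cannot be satisfied)
cyclic = [("x", "y", 0), ("y", "z", 0), ("z", "x", 1)]
# Assumed variant with the last constraint reversed: x >= z + 1
acyclic = [("x", "y", 0), ("y", "z", 0), ("x", "z", 1)]

no_model = small_domain_sat(names, cyclic, bits=2)
model = small_domain_sat(names, acyclic, bits=2)
```

With three variables and a single offset of 1, four values per variable are enough to represent every relevant ordering, so two bits per variable suffice for these examples.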
Per-Constraint Encoding (PC)
[Strichman, Seshia, Bryant, CAV’02]
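[Example: the same difference logic formula, with each constraint represented by a Boolean variable $e_1$, $e_2$, $e_3$]
Overall Boolean Encoding
[Figure: Boolean formula over $e_1$, $e_2$, $e_3$ combined with transitivity constraints that involve $e_4$]
New Difference Predicate
$e_4$: relates $x$ and $z$
Transitivity Constraints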
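The role of transitivity constraints can be sketched as follows. The constraint directions, the helper names `chain` and `brute_force_sat`, and the clause numbering are all assumptions made for illustration; a tiny exhaustive search replaces the SAT solver.

```python
from itertools import product


def chain(p, q):
    """From u - v >= c1 and v - w >= c2 derive u - w >= c1 + c2."""
    (u, v, c1), (v2, w, c2) = p, q
    assert v == v2, "predicates must share the middle variable"
    return (u, w, c1 + c2)


def brute_force_sat(clauses, n_vars):
    """Tiny stand-in for a SAT solver; literals are signed 1-based integers."""
    for bits in product((False, True), repeat=n_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in c) for c in clauses):
            return bits
    return None


# Assumed constraints: e1: x - y >= 0, e2: y - z >= 0, e3: z - x >= 1
e1, e2, e3 = ("x", "y", 0), ("y", "z", 0), ("z", "x", 1)
e4 = chain(e1, e2)                 # new difference predicate: x - z >= 0
cycle = chain(e3, e4)              # z - z >= 1, which is impossible

formula = [[1], [2], [3]]          # e1 and e2 and e3
transitivity = [[-1, -2, 4],       # e1 and e2 imply e4
                [-3, -4]]          # e3 and e4 cannot both hold
without = brute_force_sat(formula, 4)
with_constraints = brute_force_sat(formula + transitivity, 4)
```

Without the transitivity clauses the Boolean abstraction is satisfiable even though the integer constraints are not; adding them makes the Boolean formula unsatisfiable as well.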
Size of Boolean Encoding: SD better than PC
Let $N$ be size of original difference logic formula
Size of a directed acyclic graph representation
SD encoding size is worst-case $O(N^2)$
PC encoding size is worst-case $O(2^N)$
Can generate $O(2^N)$ transitivity constraints
Example: $N = 6813$
Method  Boolean Encoding Size
PC      > 1000000
SD      54465
Impact on SAT problem: SD vs PC
Experimentally compared zChaff performance on SD and PC encodings of several unsatisfiable formulas
Sample result:
Method  # Boolean variables  # CNF Clauses  # Conflict Clauses  zChaff Time (sec)
PC      57211                169387         150                 0.56
SD      23112                67699          15811               21.63
PC better than SD for zChaff
How to Choose Encoding
Hybrid Strategy
Partition variables into classes
Which ones are compared to each other
For each class, choose encoding method
PC except SD when PC blows up
How to Determine Whether PC Will Work
Try to predict based on formula characteristics
Number of constraints, density, …
Selection procedure trained by machine learning
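The partitioning step of the hybrid strategy amounts to finding connected components of the "compared to each other" relation. The following is a minimal union-find sketch, assuming constraints are given as hypothetical `(u, v, c)` triples; the per-class choice between PC and SD is not modeled here.

```python
def variable_classes(constraints):
    """Group variables that are (transitively) compared to each other."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    for u, v, _c in constraints:
        parent[find(u)] = find(v)           # merge the two classes

    classes = {}
    for v in list(parent):
        classes.setdefault(find(v), set()).add(v)
    return sorted(sorted(c) for c in classes.values())


classes = variable_classes([("x", "y", 0), ("y", "z", 1), ("a", "b", 2)])
```

Here `x`, `y`, `z` form one class and `a`, `b` another, so an encoding method could be chosen independently for each.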
Some Lessons We’ve Learned About Decision Procedures
Preserve Boolean Structure
Other approaches require collapsing to conjunctions of predicates (or extracting them dynamically)
Exploit Problem Characteristics
Sparseness
Polarity structure
Let SAT Solver Do the Work
Eager encoding: provide sufficient set of constraints to prove / disprove formula
SAT solvers are good at digesting large volumes of information
Invariant Checking Revisited
Prove Unsatisfiability of Formula
$\forall x_1 \forall x_2 \ldots \forall x_k\, \Phi(x_1 \ldots x_k) \land \lnot\Psi(y_1 \ldots y_m)$
General Form: $\forall X\, \Phi(X) \land \lnot\Psi(Y)$
Quantifier Instantiation
Generate expressions $E_1(Y), \ldots, E_n(Y)$
Using terms that appear in the formula
Expand as $\Phi(E_1(Y)) \land \cdots \land \Phi(E_n(Y)) \land \lnot\Psi(Y)$
If unsatisfiable, then so is quantified formula
Sound, but incomplete
Trade-off
Be clever about instantiation, or
Instantiate many terms and rely on decision procedure capacity
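Instantiation can be sketched on a toy formula. All specifics are assumptions for illustration: the universally quantified part is taken to be f(x) = x, the quantifier-free part is f(f(y)) ≠ y, and a brute-force search over functions on a three-element domain stands in for the decision procedure.

```python
from itertools import product

DOMAIN = range(3)   # assumed small domain for the brute-force search


def phi(f, x):
    """Body of the universally quantified part: f(x) = x."""
    return f[x] == x


def not_psi(f, y):
    """Quantifier-free part: f(f(y)) != y."""
    return f[f[y]] != y


def instantiated_sat(instances):
    """Search for f, y satisfying phi at every chosen instance plus not_psi."""
    for f in product(DOMAIN, repeat=len(DOMAIN)):   # f as a lookup tuple
        for y in DOMAIN:
            if all(phi(f, e(f, y)) for e in instances) and not_psi(f, y):
                return f, y
    return None


inst_y = lambda f, y: y          # instantiate x with the term y
inst_fy = lambda f, y: f[y]      # instantiate x with the term f(y)

both = instantiated_sat([inst_y, inst_fy])
only_fy = instantiated_sat([inst_fy])
```

Instantiating with both terms leaves no satisfying assignment, which is enough to conclude the quantified formula is unsatisfiable. Using only the second term leaves a spurious model, which illustrates why the method is sound but incomplete.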
Predicate Abstraction Revisited
Formulate as Quantifier Elimination Problem
Generate formula of form $\rho'(B') \Leftrightarrow \exists S\, \theta(S, B')$
$S$: Integer variables
Use Eager SAT Encoding of $\theta$
Get formula $\exists A\, P(A, B')$
$A$: Boolean variables
Satisfying solutions for $P$ w.r.t. $B'$ same as those for $\theta$
Core problem of symbolic model checking
Quantifier Elimination for P.A.
Formula $\exists A\, P(A, B')$
$A$: Boolean variables
Typically: 200+ variables for $A$, ~20 for $B$
BDD-Based
Use partitioning techniques developed for symbolic model checking
Typically too many total Boolean variables
SAT Enumeration
Find satisfying solution $\alpha(A)$, $\beta(B')$ to $P$
Enumerate solution $\beta(B')$
Reformulate $P$ as $P \land \lnot\beta(B')$
Performance: about 1000 solutions / second
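The enumeration loop can be sketched with a brute-force search in place of a SAT solver. The predicate `p`, the variable counts, and the function names are hypothetical; the point is only the find, record, and block cycle.

```python
from itertools import product


def solve(p, n_a, n_b, blocked):
    """Brute-force stand-in for a SAT solver on P with blocked B' solutions."""
    for bits in product((False, True), repeat=n_a + n_b):
        a, b = bits[:n_a], bits[n_a:]
        if b not in blocked and p(a, b):
            return a, b
    return None


def eliminate_quantifier(p, n_a, n_b):
    """Enumerate every B' assignment for which some A satisfies P."""
    blocked = set()
    while True:
        solution = solve(p, n_a, n_b, blocked)
        if solution is None:
            return blocked
        blocked.add(solution[1])      # reformulate P to exclude this B' solution


# Hypothetical P over two A variables and two B' variables.
def p(a, b):
    return (a[0] or a[1]) and b[0] == a[0] and b[1] == (a[0] and a[1])


solutions = eliminate_quantifier(p, 2, 2)
```

Each pass through the loop yields one new assignment to the B' variables, so the number of solver calls is one more than the number of solutions. This is why the solution rate bounds how many predicates are practical.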
Why Verification Tasks Feasible
CLU Logic Fairly Simple
Equality, uninterpreted functions, difference constraints
Small model property
“Deep” Reasoning Not Required
Formulas large and messy, but straightforward
Verifying systems that are designed to have constrained behaviors
Only checking effect of a few cycles of system operation
Decision Procedures Revisited
SAT-Based Approaches Effective
Good performance as decision procedures
Key to implementing predicate abstraction
Quantifier elimination
Eager Encoding Gives Good Performance
Avoids many iterations of theory-specific checkers
Extends to linear integer arithmetic
Seshia & Bryant, LICS ‘04
Quantifier-free Presburger
Small domain encoding exploiting sparseness
Areas of Research
Bit-Vector Decision Procedures
True model for hardware & low-level software
Bit-field extraction
Bit-wise Boolean operations
Overflow effects
Automatically apply abstractions
Abstract to symbolic terms whenever possible
Boolean Quantifier Elimination
SAT enumeration still not good enough
Limits predicate abstraction to ~25 predicates
Core problem for symbolic model checking
More Research
Proof Generation
Hard to see how to generate unsatisfiability proof for CLU formula
Debugging Support
Bounded model checking: provide counterexample trace
Invariant checking: hard to determine why invariant fails
And may be due to weakness in quantifier instantiation
Predicate abstraction: Gets nowhere without right set of predicates
Proving Liveness
Current abstractions do not preserve liveness properties
Can help in proving progress invariant