A Tutorial on Automated Verification
Tevfik Bultan
Who are these people and what do they have in
common?
2007 Clarke, Edmund M
2007 Emerson, E Allen
2007 Sifakis, Joseph
1996 Pnueli, Amir
1991 Milner, Robin
1980 Hoare, C. Antony R.
1978 Floyd, Robert W
1972 Dijkstra, E. W.
Outline
• Software’s Chronic Crisis
• Temporal Logics and Model Checking Problem
• Symbolic Model Checking
• Automata Theoretic Model Checking
• Software Verification Using Explicit State Model Checking with Java Path Finder
• Bounded Model Checking
• Symbolic Software Model Checking with Predicate Abstraction and Counter-Example Guided Abstraction Refinement
Software’s Chronic Crisis
Large software systems often:
• Do not provide the desired functionality
• Take too long to build
• Cost too much to build
• Require too many resources (time, space) to run
• Cannot evolve to meet changing needs
– For every 6 large software projects that become operational, 2 are canceled
– On average, software development projects overshoot their schedule by half
– Three quarters of large systems do not provide the required functionality
Software Failures
• There is a long list of failed software projects and software
failures
• You can find a list of famous software bugs at:
http://www5.in.tum.de/~huckle/bugse.html
• I will talk about two famous and interesting software bugs
Ariane 5 Failure
• A software bug caused European
Space Agency’s Ariane 5 rocket to
crash 40 seconds into its first flight
in 1996 (cost: half billion
dollars)
• The bug originated in a software component that was reused from Ariane 4
• A software exception occurred during execution of a data conversion
from 64-bit floating point to 16-bit signed integer value
– The value was larger than 32,767, the largest integer storable in a
16 bit signed integer, and thus the conversion failed and an
exception was raised by the program
• When the primary computer system failed due to this problem, the
secondary system started running.
– The secondary system was running the same software, so it failed
too!
Ariane 5 Failure
• The programmers for Ariane 4 had decided that this
particular velocity figure would never be large enough to
raise this exception.
– Ariane 5 was a faster rocket than Ariane 4!
• The calculation containing the bug actually served no
purpose once the rocket was in the air.
– Engineers chose long ago, in an earlier version of the
Ariane rocket, to leave this function running for the first
40 seconds of flight to make it easy to restart the system
in the event of a brief hold in the countdown.
• You can read the report of Ariane 5 failure at:
http://www.ima.umn.edu/~arnold/disasters/ariane5rep.html
Mars Pathfinder
• A few days into its mission, NASA’s Mars
Pathfinder computer system started rebooting
itself
– Cause: Priority inversion during preemptive
priority scheduling of threads
• Priority inversion occurs when
– a thread that has higher priority is waiting for a resource held by
thread with a lower priority
• Pathfinder contained a data bus shared among multiple threads and
protected by a mutex lock
• Two threads that accessed the data bus were: a high-priority bus
management thread and a low-priority meteorological data gathering
thread
• Yet another thread with medium-priority was a long running
communications thread (which did not access the data bus)
Mars Pathfinder
• The scenario that caused the reboot was:
– The meteorological data gathering thread accesses the bus and obtains the
mutex lock
– While the meteorological data gathering thread is accessing the bus, an
interrupt causes the high-priority bus management thread to be scheduled
– Bus management thread tries to access the bus and blocks on the mutex
lock
– Scheduler starts running the meteorological thread again
– Before the meteorological thread finishes its task yet another interrupt
occurs and the medium-priority (and long running) communications thread
gets scheduled
– At this point the high-priority bus management thread is waiting for the low-priority meteorological data gathering thread, and the low-priority
meteorological data gathering thread is waiting for the medium-priority
communications thread
– Since communications thread had long-running tasks, after a while a
watchdog timer would go off and notice that the high-priority bus
management thread has not been executed for some time and conclude
that something was wrong and reboot the system
Software’s Chronic Crisis
• Software product size is increasing exponentially
– faster, smaller, cheaper hardware
• Software is everywhere: from TV sets to cell-phones
• Software is in safety-critical systems
– cars, airplanes, nuclear-power plants
• We are seeing more of
– distributed systems
– embedded systems
– real-time systems
• These kinds of systems are harder to build
• Software requirements change
– software evolves rather than being built
Summary
• Software’s chronic crisis: Development of large software
systems is a challenging task
– Large software systems often: Do not provide the
desired functionality; Take too long to build; Cost too
much to build; Require too many resources (time, space)
to run; Cannot evolve to meet changing needs
What is this?
First Computer Bug
• In 1947, Grace Murray Hopper was working on the Harvard
University Mark II Aiken Relay Calculator (a primitive
computer).
• On the 9th of September, 1947, when the machine was
experiencing problems, an investigation showed that there
was a moth trapped between the points of Relay #70, in
Panel F.
• The operators removed the moth and affixed it to the log.
The entry reads: "First actual case of bug being found."
• The word went out that they had "debugged" the machine
and the term "debugging a computer program" was born.
Can Model Checking Help
• The question is: Can the automated verification techniques
we have been discussing be used in finding bugs in
software systems?
• Today I will discuss some automated verification
techniques that have been successful in identifying bugs
State of the art in automated verification:
Model Checking
• What is model checking?
– Automated verification technique
– Focuses on bug finding rather than proving correctness
– The basic idea is to exhaustively search for bugs in
software
– Has many flavors
• Explicit-state model checking
• Symbolic model checking
• Bounded model checking
Model Checking Evolution
• Earlier model checkers had their own input specification
languages
– For example Spin, SMV
• This requires translation of the system to be verified to the
input language of the model checker
– Most of the time these translations are not automated
and use ad-hoc simplifications and abstractions
• More recently several researchers developed tools for
model checking programs
– These model checkers work directly on programs, i.e.,
their input language is a programming language
– These model checkers use well-defined techniques for
restricting the state space or use automated abstraction
techniques
Explicit-State Model Checking Programs
• Verisoft from Bell Labs
– C programs, handles concurrency, bounded search,
bounded recursion.
– Uses stateless search and partial order reduction.
• Java Path Finder (JPF) at NASA Ames
– Explicit state model checking for Java programs,
bounded search, bounded recursion, handles
concurrency.
– Uses techniques similar to the techniques used in Spin.
• CMC from Stanford for checking systems code written in C
Symbolic Model Checking of Programs
• CBMC
– This is the bounded model checker we discussed earlier,
bounds the loop iterations and recursion depth.
– Uses a SAT solver.
• SLAM project at Microsoft Research
– Symbolic model checking for C programs. Can handle
unbounded recursion but does not handle concurrency.
– Uses predicate abstraction and BDDs.
Beyond Model Checking
• Promising results obtained in the model checking area
created a new interest in automated verification
• Nowadays, there is a wide spectrum of
verification/analysis/testing techniques with varying levels
of power and scalability
– Bounded verification using SAT solvers
– Symbolic execution using combinations of decision
procedures
– Dynamic symbolic execution (aka concolic execution)
– Various types of symbolic analysis: shape analysis,
string analysis, size analysis, etc.
What to Verify?
• Before we start talking about automated verification
techniques, we need to identify what we want to verify
• It turns out that this is not a very simple question
• First we will discuss issues related to this question
Temporal Logics and Model Checking Problem
A Mutual Exclusion Protocol
Two concurrently executing processes are trying to enter a
critical section without violating mutual exclusion
Process 1:
while (true) {
out: a := true; turn := true;
wait: await (!b or !turn);
cs:
a := false;
}
||
Process 2:
while (true) {
out: b := true; turn := false;
wait: await (!a or turn);
cs:
b := false;
}
Reactive Systems: A Very Simple Model
• We will use a very simple model for reactive systems
• A reactive system generates a set of execution paths
• An execution path is a concatenation of the states
(configurations) of the system, starting from some initial
state
• There is a transition relation which specifies the next-state
relation, i.e., given a state what are the states that can
follow that state
State Space
• The state space of a program can be captured by the
valuations of the variables and the program counters
• For our example, we have
– two program counters: pc1, pc2
domains of the program counters: {out, wait, cs}
– three boolean variables: turn, a, b
boolean domain: {True, False}
• Each state of the program is a valuation of all the variables
State Space
• Each state can be written as a tuple
(pc1,pc2,turn,a,b)
• Initial states: {(o,o,F,F,F), (o,o,F,F,T),
(o,o,F,T,F), (o,o,F,T,T), (o,o,T,F,F),
(o,o,T,F,T), (o,o,T,T,F), (o,o,T,T,T)}
– initially: pc1=o and pc2=o
• How many states total?
3 * 3 * 2 * 2 * 2 = 72
exponential in the number of variables and the number of
concurrent components
Transition Relation
• Transition Relation specifies the next-state relation, i.e.,
given a state what are the states that can come
immediately after that state
• For example, given the initial state (o,o,F,F,F)
Process 1 can execute:
out: a := true; turn := true;
or Process 2 can execute:
out: b := true; turn := false;
• If process 1 executes, the next state is (w,o,T,T,F)
• If process 2 executes, the next state is (o,w,F,F,T)
• So the state pairs ((o,o,F,F,F),(w,o,T,T,F)) and
((o,o,F,F,F),(o,w,F,F,T)) are included in the
transition relation
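The state space and transition relation described above can be enumerated directly. The following is a minimal sketch, assuming that each labeled line of the protocol (out, wait, cs) executes as one atomic step; the names `successors`, `ALL_STATES`, `INITIAL` and `R` are illustrative and do not come from the slides.

```python
from itertools import product

PCS = ("o", "w", "c")        # out, wait, cs
BOOLS = (False, True)

def successors(state):
    """States reachable in one step from (pc1, pc2, turn, a, b)."""
    pc1, pc2, turn, a, b = state
    result = []
    # Process 1 takes a step (assumed atomic per labeled line).
    if pc1 == "o":
        result.append(("w", pc2, True, True, b))       # a := true; turn := true
    elif pc1 == "w" and (not b or not turn):
        result.append(("c", pc2, turn, a, b))          # await (!b or !turn)
    elif pc1 == "c":
        result.append(("o", pc2, turn, False, b))      # a := false
    # Process 2 takes a step.
    if pc2 == "o":
        result.append((pc1, "w", False, a, True))      # b := true; turn := false
    elif pc2 == "w" and (not a or turn):
        result.append((pc1, "c", turn, a, b))          # await (!a or turn)
    elif pc2 == "c":
        result.append((pc1, "o", turn, a, False))      # b := false
    return result

ALL_STATES = list(product(PCS, PCS, BOOLS, BOOLS, BOOLS))
INITIAL = [s for s in ALL_STATES if s[0] == "o" and s[1] == "o"]
# The transition relation as a set of (state, next_state) pairs.
R = {(s, t) for s in ALL_STATES for t in successors(s)}

print(len(ALL_STATES), len(INITIAL))
print(successors(("o", "o", False, False, False)))
```

The two successors printed for (o,o,F,F,F) are the pairs listed on the slide. Storing R as an explicit set of pairs is only practical for tiny systems like this one, which is the motivation for the symbolic techniques later in the tutorial.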
Transition Relation
The transition relation is like a graph: edges represent the
next-state relation
[Figure: fragment of the transition graph over the states (o,o,F,F,F), (o,w,F,F,T), (o,c,F,F,T), (w,o,T,T,F), (w,w,T,T,T)]
Transition System
• A transition system $T = (S, I, R)$ consists of
– a set of states $S$
– a set of initial states $I \subseteq S$
– and a transition relation $R \subseteq S \times S$
• A common assumption in model checking
– $R$ is total, i.e., for all $s \in S$, there exists $s'$ such
that $(s, s') \in R$
Execution Paths
• An execution path is an infinite sequence of states
$x = s_0, s_1, s_2, \ldots$
such that
$s_0 \in I$ and for all $i \geq 0$, $(s_i, s_{i+1}) \in R$
Notation: For any path $x$
$x_i$ denotes the $i$'th state on the path (i.e., $s_i$)
$x^i$ denotes the $i$'th suffix of the path (i.e., $s_i, s_{i+1}, s_{i+2}, \ldots$)
Execution Paths
A possible execution path:
$((o,o,F,F,F), (o,w,F,F,T), (o,c,F,F,T))^\omega$
($\omega$ means repeat the above three states infinitely many times)
[Figure: fragment of the transition graph over the states (o,o,F,F,F), (o,w,F,F,T), (o,c,F,F,T), (w,o,T,T,F), (w,w,T,T,T)]
Temporal Logics
• Pnueli proposed using temporal logics for reasoning about
the properties of reactive systems
• Temporal logics are a type of modal logics
– Modal logics were developed to express modalities such
as “necessity” or “possibility”
– Temporal logics focus on the modality of temporal
progression
• Temporal logics can be used to express, for example, that:
– an assertion is an invariant (i.e., it is true all the time)
– an assertion eventually becomes true (i.e., it will become
true sometime in the future)
Temporal Logics
• We will assume that there is a set of basic (atomic)
properties called AP
– These are used to write the basic (non-temporal)
assertions about the program
– Examples: a=true, pc0=c, x=y+1
• We will use the usual boolean connectives: $\neg$, $\wedge$, $\vee$
• We will also use four temporal operators:
Invariant p : $G\,p$ (aka $\Box p$) (Globally)
Eventually p : $F\,p$ (aka $\Diamond p$) (Future)
Next p : $X\,p$ (aka $\bigcirc p$) (neXt)
p Until q : $p\,U\,q$
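LTL Properties
[Figure: four timelines illustrating the temporal operators]
X p : p holds in the next state
G p : p holds in every state
F p : p holds in some state (now or later)
p U q : p holds in every state until a state where q holds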
Example Properties
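mutual exclusion: $G(\neg(pc_1 = c \wedge pc_2 = c))$
starvation freedom:
$G(pc_1 = w \rightarrow F(pc_1 = c)) \wedge G(pc_2 = w \rightarrow F(pc_2 = c))$
Given the execution path:
$x = ((o,o,F,F,F), (o,w,F,F,T), (o,c,F,F,T))^\omega$
$x \models pc_1 = o$
$x \models X(pc_2 = w)$
$x \models F(pc_2 = c)$
$x \models (\neg turn)\ U\ (pc_2 = c \wedge b)$
$x \models G(\neg(pc_1 = c \wedge pc_2 = c))$
$x \models G(pc_1 = w \rightarrow F(pc_1 = c)) \wedge G(pc_2 = w \rightarrow F(pc_2 = c))$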
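The claims about the example path can be checked mechanically. The sketch below evaluates LTL operators on a lasso-shaped path (a finite prefix followed by a loop repeated forever), which is enough for the three-state cycle used here. Each operator returns a list holding the truth value of the formula at every position of the path; the helper names (`atom`, `X`, `U`, `F`, `G`, and so on) are illustrative, and `U` is computed as a least fixpoint, which is one possible approach rather than the method of any particular tool.

```python
# Lasso-shaped path: PREFIX followed by LOOP repeated forever.
# States are (pc1, pc2, turn, a, b), as in the example path.
PREFIX = []
LOOP = [("o", "o", False, False, False),
        ("o", "w", False, False, True),
        ("o", "c", False, False, True)]
PATH = PREFIX + LOOP
N = len(PATH)
# NEXT[i] is the position that follows position i on the infinite path.
NEXT = [i + 1 if i + 1 < N else len(PREFIX) for i in range(N)]

def atom(pred):
    return [pred(s) for s in PATH]

def NOT(u):
    return [not v for v in u]

def AND(u, w):
    return [p and q for p, q in zip(u, w)]

def IMPLIES(u, w):
    return [(not p) or q for p, q in zip(u, w)]

def X(u):
    return [u[NEXT[i]] for i in range(N)]

def U(u, w):
    """u U w as the least fixpoint of v = w or (u and X v)."""
    v = [False] * N
    while True:
        new = [w[i] or (u[i] and v[NEXT[i]]) for i in range(N)]
        if new == v:
            return v
        v = new

def F(u):
    return U([True] * N, u)

def G(u):
    return NOT(F(NOT(u)))

def pc1(value):
    return atom(lambda s: s[0] == value)

def pc2(value):
    return atom(lambda s: s[1] == value)

turn = atom(lambda s: s[2])
b = atom(lambda s: s[4])

mutex = G(NOT(AND(pc1("c"), pc2("c"))))
starvation_free = AND(G(IMPLIES(pc1("w"), F(pc1("c")))),
                      G(IMPLIES(pc2("w"), F(pc2("c")))))
until_example = U(NOT(turn), AND(pc2("c"), b))

# Index 0 is the truth value on the whole path x.
print(mutex[0], starvation_free[0], until_example[0])
```

This only checks one path. Deciding whether T satisfies a property requires considering every execution path of T, which is the model checking problem.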
LTL Model Checking
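• Given a transition system $T$ and an LTL property $p$
$T \models p$
iff
for all execution paths $x$ in $T$, $x \models p$
For example:
$T \stackrel{?}{\models} G(\neg(pc_1 = c \wedge pc_2 = c))$
$T \stackrel{?}{\models} G(pc_1 = w \rightarrow F(pc_1 = c)) \wedge G(pc_2 = w \rightarrow F(pc_2 = c))$
Model checking problem: Given a transition system $T$ and
an LTL property $p$, determine if $T$ is a model for $p$ (i.e., if
$T \models p$)
Complexity: $(|S| + |R|) \times 2^{O(|f|)}$, where $|f|$ is the size of the LTL formula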
Linear Time vs. Branching Time
• In linear time logics we look at the execution paths
individually
• In branching time logics we view the computation as a tree
– computation tree: unroll the transition relation
[Figure: three panels titled Transition System, Execution Paths, and Computation Tree, showing a transition system with states s1, s2, s3, s4, some of its execution paths, and the tree obtained by unrolling its transition relation]
Computation Tree Logic (CTL)
• In CTL we quantify over the paths in the computation tree
• We use the same four temporal operators: X, G, F, U
• However we attach path quantifiers to these temporal
operators:
– A : for all paths
– E : there exists a path
• We end up with eight temporal operators:
– AX, EX, AG, EG, AF, EF, AU, EU
CTL Properties
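[Figure: two panels titled Transition System and Computation Tree, showing a transition system with states s1, s2, s3, s4, where p holds in s3 and s4, and its computation tree rooted at s3]
$s_3 \models p$
$s_4 \models p$
$s_1 \models \neg p$
$s_2 \models \neg p$
$s_3 \models EX\,p$
$s_3 \models EX\,\neg p$
$s_3 \models \neg AX\,p$
$s_3 \models \neg AX\,\neg p$
$s_3 \models EG\,p$
$s_3 \models \neg EG\,\neg p$
$s_3 \models AF\,p$
$s_3 \models EF\,\neg p$
$s_3 \models \neg AF\,\neg p$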
CTL Model Checking
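• Given a transition system $T = (S, I, R)$ and a CTL property $p$
$T \models p$
iff
for all initial states $s \in I$, $s \models p$
Model checking problem: Given a transition system $T$ and a
CTL property $p$, determine if $T$ is a model for $p$ (i.e., if $T \models p$)
Complexity: $O(|f| \times (|S| + |R|))$, where $|f|$ is the size of the CTL formula
For example:
$T \stackrel{?}{\models} AG(\neg(pc_1 = c \wedge pc_2 = c))$
$T \stackrel{?}{\models} AG(pc_1 = w \rightarrow AF(pc_1 = c)) \wedge AG(pc_2 = w \rightarrow AF(pc_2 = c))$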
• Question: Are CTL and LTL equivalent?
CTL vs. LTL
• CTL and LTL are not equivalent
– There are properties that can be expressed in LTL but
cannot be expressed in CTL
• For example: FG p
– There are properties that can be expressed in CTL but
cannot be expressed in LTL
• For example: AG(EF p)
• Hence, expressive power of CTL and LTL are not
comparable
Symbolic Model Checking
Temporal Properties $\equiv$ Fixpoints
[Emerson and Clarke 80]
Here are some interesting CTL equivalences:
$AG\,p = p \wedge AX\,AG\,p$
$EG\,p = p \wedge EX\,EG\,p$
$AF\,p = p \vee AX\,AF\,p$
$EF\,p = p \vee EX\,EF\,p$
$p\ AU\ q = q \vee (p \wedge AX\,(p\ AU\ q))$
$p\ EU\ q = q \vee (p \wedge EX\,(p\ EU\ q))$
Note that we wrote the CTL temporal operators in terms of
themselves and the EX and AX operators
Fixpoint Characterizations
Fixpoint Characterization | Equivalence
$AG\,p = \nu y.\ p \wedge AX\,y$ | $AG\,p = p \wedge AX\,AG\,p$
$EG\,p = \nu y.\ p \wedge EX\,y$ | $EG\,p = p \wedge EX\,EG\,p$
$AF\,p = \mu y.\ p \vee AX\,y$ | $AF\,p = p \vee AX\,AF\,p$
$EF\,p = \mu y.\ p \vee EX\,y$ | $EF\,p = p \vee EX\,EF\,p$
$p\ AU\ q = \mu y.\ q \vee (p \wedge AX\,y)$ | $p\ AU\ q = q \vee (p \wedge AX\,(p\ AU\ q))$
$p\ EU\ q = \mu y.\ q \vee (p \wedge EX\,y)$ | $p\ EU\ q = q \vee (p \wedge EX\,(p\ EU\ q))$
Least Fixpoint
Given a monotonic function $F$, the least fixpoint $\mu y.\ F\,y$ is the
limit of the following sequence (assuming $F$ is $\cup$-continuous):
$\emptyset,\ F\,\emptyset,\ F^2\,\emptyset,\ F^3\,\emptyset,\ \ldots$
If $S$ is finite, then we can compute the least fixpoint using the
above sequence
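EF Fixpoint Computation
$EF\,p = \mu y.\ p \vee EX\,y$ is the limit of the sequence:
$\emptyset,\ p \vee EX\,\emptyset,\ p \vee EX(p \vee EX\,\emptyset),\ p \vee EX(p \vee EX(p \vee EX\,\emptyset)),\ \ldots$
which is equivalent to
$\emptyset,\ p,\ p \vee EX\,p,\ p \vee EX(p \vee EX(p)),\ \ldots$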
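EF Fixpoint Computation
[Figure: transition system with states s1, s2, s3, s4, where p holds in s1 and s4]
Start
$\emptyset$
1st iteration
$p \vee EX\,\emptyset = \{s_1,s_4\} \cup EX(\emptyset) = \{s_1,s_4\} \cup \emptyset = \{s_1,s_4\}$
2nd iteration
$p \vee EX(p \vee EX\,\emptyset) = \{s_1,s_4\} \cup EX(\{s_1,s_4\}) = \{s_1,s_4\} \cup \{s_3\} = \{s_1,s_3,s_4\}$
3rd iteration
$p \vee EX(p \vee EX(p \vee EX\,\emptyset)) = \{s_1,s_4\} \cup EX(\{s_1,s_3,s_4\}) = \{s_1,s_4\} \cup \{s_2,s_3,s_4\} = \{s_1,s_2,s_3,s_4\}$
4th iteration
$p \vee EX(p \vee EX(p \vee EX(p \vee EX\,\emptyset))) = \{s_1,s_4\} \cup EX(\{s_1,s_2,s_3,s_4\}) = \{s_1,s_4\} \cup \{s_1,s_2,s_3,s_4\} = \{s_1,s_2,s_3,s_4\}$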
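The iteration above can be reproduced with explicit sets. This is a minimal sketch: the edge set `R` is an assumption, since the text gives only the EX values and not the edges, and it was chosen so that `ex` agrees with every EX value in the iterations. The names `ex` and `ef` are illustrative.

```python
STATES = frozenset({"s1", "s2", "s3", "s4"})

# Assumed edges (not listed in the text): chosen to agree with the EX values
# EX({s1,s4}) = {s3}, EX({s1,s3,s4}) = {s2,s3,s4}, EX(S) = S.
R = {("s1", "s2"), ("s2", "s3"), ("s3", "s1"), ("s3", "s4"), ("s4", "s3")}

def ex(target):
    """EX(target): states with at least one successor in target."""
    return frozenset(s for (s, t) in R if t in target)

def ef(p):
    """Least fixpoint of y = p | EX(y); also returns the iterates."""
    iterates = [frozenset()]
    while True:
        nxt = frozenset(p) | ex(iterates[-1])
        if nxt == iterates[-1]:
            return nxt, iterates
        iterates.append(nxt)

result, iterates = ef({"s1", "s4"})
for i, y in enumerate(iterates):
    print(i, sorted(y))
```

The loop stops as soon as an iterate repeats, which must happen on a finite state space because the iterates only grow.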
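EF Fixpoint Computation
$EF(p) \equiv$ states that can reach $p$
$EF(p) = p \cup EX(p) \cup EX(EX(p)) \cup \ldots$
[Figure: nested sets growing outward from p, through EX(p) and EX(EX(p)), until they cover EF(p)]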
Greatest Fixpoint
Given a monotonic function $F$, the greatest fixpoint $\nu y.\ F\,y$ is
the limit of the following sequence (assuming $F$ is $\cap$-continuous):
$S,\ F\,S,\ F^2\,S,\ F^3\,S,\ \ldots$
If $S$ is finite, then we can compute the greatest fixpoint using
the above sequence
EG Fixpoint Computation
Similarly, $EG\,p = \nu y.\ p \wedge EX\,y$ is the limit of the sequence:
$S,\ p \wedge EX\,S,\ p \wedge EX(p \wedge EX\,S),\ p \wedge EX(p \wedge EX(p \wedge EX\,S)),\ \ldots$
which is equivalent to
$S,\ p,\ p \wedge EX\,p,\ p \wedge EX(p \wedge EX(p)),\ \ldots$
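EG Fixpoint Computation
[Figure: transition system with states s1, s2, s3, s4, where p holds in s1, s3 and s4]
Start
$S = \{s_1,s_2,s_3,s_4\}$
1st iteration
$p \wedge EX\,S = \{s_1,s_3,s_4\} \cap EX(\{s_1,s_2,s_3,s_4\}) = \{s_1,s_3,s_4\} \cap \{s_1,s_2,s_3,s_4\} = \{s_1,s_3,s_4\}$
2nd iteration
$p \wedge EX(p \wedge EX\,S) = \{s_1,s_3,s_4\} \cap EX(\{s_1,s_3,s_4\}) = \{s_1,s_3,s_4\} \cap \{s_2,s_3,s_4\} = \{s_3,s_4\}$
3rd iteration
$p \wedge EX(p \wedge EX(p \wedge EX\,S)) = \{s_1,s_3,s_4\} \cap EX(\{s_3,s_4\}) = \{s_1,s_3,s_4\} \cap \{s_2,s_3,s_4\} = \{s_3,s_4\}$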
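The greatest fixpoint can be computed the same way, starting from the full state set and shrinking. As before, this is a sketch under an assumption: the edge set `R` is not given in the text and was chosen to be consistent with the EX values used in the iterations. The names `ex` and `eg` are illustrative.

```python
STATES = frozenset({"s1", "s2", "s3", "s4"})

# Assumed edges (not listed in the text), consistent with
# EX(S) = S, EX({s1,s3,s4}) = {s2,s3,s4} and EX({s3,s4}) = {s2,s3,s4}.
R = {("s1", "s2"), ("s2", "s3"), ("s3", "s1"), ("s3", "s4"), ("s4", "s3")}

def ex(target):
    """EX(target): states with at least one successor in target."""
    return frozenset(s for (s, t) in R if t in target)

def eg(p):
    """Greatest fixpoint of y = p & EX(y); also returns the iterates."""
    iterates = [STATES]
    while True:
        nxt = frozenset(p) & ex(iterates[-1])
        if nxt == iterates[-1]:
            return nxt, iterates
        iterates.append(nxt)

result, iterates = eg({"s1", "s3", "s4"})
for i, y in enumerate(iterates):
    print(i, sorted(y))
```

The result {s3, s4} is the set of states from which some path stays inside p forever, here by alternating between s3 and s4.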
EG Fixpoint Computation
$EG(p) \equiv$ states that can avoid reaching $\neg p$
[Figure: nested sets shrinking inward from p, through the successive fixpoint iterates, until they reach EG(p)]
Symbolic Model Checking
[McMillan et al. LICS 90]
• Basic idea: Represent sets of states and the transition
relation as Boolean logic formulas
• Fixpoint computation becomes formula manipulation
– pre-condition (EX) computation: Existential variable
elimination
– conjunction (intersection), disjunction (union) and
negation (set difference), and equivalence check
• Use an efficient data structure for boolean logic formulas
– Binary Decision Diagrams (BDDs)
Symbolic Pre-condition Computation
• Remember the function
$EX : 2^S \rightarrow 2^S$
which is defined as:
$EX(p) = \{\, s \mid (s,s') \in R \text{ and } s' \in p \,\}$
• We can symbolically compute pre as follows
$EX(p) \equiv \exists V'\ R \wedge p[V'/V]$
– $V$ : current-state boolean variables
– $V'$ : next-state boolean variables
– $p[V'/V]$ : rename variables in $p$ by replacing current-state variables with the corresponding next-state
variables
– $\exists V'\ f$ : existentially quantify out all the variables in $V'$
from $f$
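An Extremely Simple Example
Variables: x, y: boolean
Set of states:
$S = \{(F,F), (F,T), (T,F), (T,T)\}$
$S \equiv True$
[Figure: the four states (F,F), (T,F), (F,T), (T,T) and the transitions between them]
Initial condition:
$I \equiv \neg x \wedge \neg y$
Transition relation (negates one variable at a time):
$R \equiv x' = \neg x \wedge y' = y \ \vee\ x' = x \wedge y' = \neg y$
($=$ means $\leftrightarrow$)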
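An Extremely Simple Example
Given $p \equiv x \wedge y$, compute $EX(p)$
[Figure: the four states (F,F), (T,F), (F,T), (T,T) and the transitions between them]
$EX(p) \equiv \exists V'\ R \wedge p[V'/V]$
$\equiv \exists V'\ R \wedge x' \wedge y'$
$\equiv \exists V'\ (x' = \neg x \wedge y' = y \ \vee\ x' = x \wedge y' = \neg y) \wedge x' \wedge y'$
$\equiv \exists V'\ (x' = \neg x \wedge y' = y) \wedge x' \wedge y' \ \vee\ (x' = x \wedge y' = \neg y) \wedge x' \wedge y'$
$\equiv \exists V'\ \neg x \wedge y \wedge x' \wedge y' \ \vee\ x \wedge \neg y \wedge x' \wedge y'$
$\equiv \neg x \wedge y \ \vee\ x \wedge \neg y$
$EX(x \wedge y) \equiv \neg x \wedge y \ \vee\ x \wedge \neg y$
In other words $EX(\{(T,T)\}) \equiv \{(F,T), (T,F)\}$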
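The derivation above can be mirrored in code by treating formulas as Boolean functions and eliminating the next-state variables by enumeration. This is only a sketch of the idea: a real symbolic model checker would perform the quantification on BDDs rather than by trying every valuation, and the names `R`, `ex` and `truth_set` are illustrative.

```python
from itertools import product

BOOLS = (False, True)

def R(x, y, xn, yn):
    """Transition relation: exactly one of x, y is negated in the next state."""
    return (xn == (not x) and yn == y) or (xn == x and yn == (not y))

def ex(p):
    """EX(p) = exists x', y' . R(x, y, x', y') and p(x', y')."""
    def pre(x, y):
        # Existential quantification over V' = {x', y'} by enumeration.
        return any(R(x, y, xn, yn) and p(xn, yn)
                   for xn, yn in product(BOOLS, repeat=2))
    return pre

def truth_set(f):
    """All valuations (x, y) that satisfy the formula f."""
    return {(x, y) for x, y in product(BOOLS, repeat=2) if f(x, y)}

p = lambda x, y: x and y
print(sorted(truth_set(ex(p))))
```

The printed set contains (F,T) and (T,F), matching the result of the symbolic derivation.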
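An Extremely Simple Example
Let’s compute $EF(x \wedge y)$
[Figure: the four states (F,F), (T,F), (F,T), (T,T), annotated with fixpoint iteration numbers]
The fixpoint sequence is
$False,\ x \wedge y,\ x \wedge y \vee EX(x \wedge y),\ x \wedge y \vee EX(x \wedge y \vee EX(x \wedge y)),\ \ldots$
If we do the EX computations, we get (iterations 0 to 3):
$False,\ x \wedge y,\ x \wedge y \ \vee\ \neg x \wedge y \ \vee\ x \wedge \neg y,\ True$
$EF(x \wedge y) \equiv True$
In other words $EF(\{(T,T)\}) \equiv \{(F,F), (F,T), (T,F), (T,T)\}$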
An Extremely Simple Example
• Based on our results, for our extremely simple transition
system $T = (S, I, R)$ we have
$I \subseteq EF(x \wedge y)$ hence:
$T \models EF(x \wedge y)$
(i.e., there exists a path from each initial state where
eventually x and y both become true at the same time)
$I \not\subseteq EX(x \wedge y)$ hence:
$T \not\models EX(x \wedge y)$
(i.e., it is not the case that every initial state has a path on which
x and y both become true in the next state)
An Extremely Simple Example
• Let’s try one more property: $AF(x \wedge y)$
• To check this property we first convert it to a formula which
uses only the temporal operators in our basis:
$AF(x \wedge y) \equiv \neg EG(\neg(x \wedge y))$
If we can find an initial state which satisfies $EG(\neg(x \wedge y))$, then
we know that the transition system $T$ does not satisfy the
property $AF(x \wedge y)$
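An Extremely Simple Example
Let’s compute $EG(\neg(x \wedge y))$
[Figure: the four states (F,F), (T,F), (F,T), (T,T), annotated with fixpoint iteration numbers]
The fixpoint sequence is
$True,\ \neg x \vee \neg y,\ (\neg x \vee \neg y) \wedge EX(\neg x \vee \neg y),\ \ldots$
If we do the EX computations, we get (iterations 0 to 2):
$True,\ \neg x \vee \neg y,\ \neg x \vee \neg y$
$EG(\neg(x \wedge y)) \equiv \neg x \vee \neg y$
Since $I \cap EG(\neg(x \wedge y)) \neq \emptyset$ we conclude that $T \not\models AF(x \wedge y)$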
Symbolic CTL Model Checking Algorithm
• Translate the formula to a formula which uses the basis
– EX p, EG p, p EU q
• Atomic formulas can be interpreted directly on the state
representation
• For EX p compute the precondition using existential
variable elimination as we discussed
• For EG and EU compute the fixpoints iteratively
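The steps above can be sketched as a small recursive checker over the basis EX, EG, EU. To stay short, the sketch represents truth sets as explicit Python sets instead of BDDs, so it illustrates the structure of the algorithm and not its symbolic implementation. It uses the extremely simple two-variable system, and it assumes the single initial state (F,F). The formula encoding (nested tuples) and all function names are illustrative.

```python
from itertools import product

STATES = frozenset(product((False, True), repeat=2))     # valuations of (x, y)
# Exactly one of x, y changes in each step.
R = {(s, t) for s in STATES for t in STATES
     if (s[0] != t[0]) != (s[1] != t[1])}
INITIAL = frozenset({(False, False)})                     # assumed initial condition

def pre(target):
    """EX on sets: states with a successor in target."""
    return frozenset(s for (s, t) in R if t in target)

def sat(f):
    """Truth set of a formula given as a nested tuple."""
    op = f[0]
    if op == "atom":                      # interpreted directly on states
        return frozenset(s for s in STATES if f[1](s))
    if op == "not":
        return STATES - sat(f[1])
    if op == "and":
        return sat(f[1]) & sat(f[2])
    if op == "EX":
        return pre(sat(f[1]))
    if op == "EG":                        # greatest fixpoint of y = p & EX y
        p, y = sat(f[1]), STATES
        while True:
            new = p & pre(y)
            if new == y:
                return y
            y = new
    if op == "EU":                        # least fixpoint of y = q | (p & EX y)
        p, q, y = sat(f[1]), sat(f[2]), frozenset()
        while True:
            new = q | (p & pre(y))
            if new == y:
                return y
            y = new
    raise ValueError("unknown operator: " + op)

TRUE = ("atom", lambda s: True)

def EF(f):
    return ("EU", TRUE, f)

def AF(f):
    return ("not", ("EG", ("not", f)))

def holds(f):
    """T |= f iff every initial state is in the truth set of f."""
    return INITIAL <= sat(f)

xy = ("atom", lambda s: s[0] and s[1])
print(holds(EF(xy)), holds(("EX", xy)), holds(AF(xy)))
```

Operators outside the basis are rewritten before evaluation, as in `EF` and `AF` here; the remaining CTL operators can be handled with similar rewrites.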
SMV [McMillan 93]
• BDD-based symbolic model checker
• Finite state
• Temporal logic: CTL
• Focus: hardware verification
– Later applied to software specifications, protocols, etc.
• SMV has its own input specification language
– concurrency: synchronous, asynchronous
– shared variables
– boolean and enumerated variables
– bounded integer variables (binary encoding)
• SMV is not efficient for integers, but that can be fixed
– fixed size arrays
SMV Language
• An SMV specification consists of a set of modules (one of
them must be called main)
• Modules can have access to shared variables
• Modules can be composed asynchronously using the
process keyword
• Module behaviors can be specified using the ASSIGN
statement which assigns values to next values of variables
in parallel
• Module behaviors can also be specified using the TRANS
statements which allow specification of the transition
relation as a logic formula where next state values are
identified using the next keyword
Example Mutual Exclusion Protocol
Two concurrently executing processes are trying to enter a
critical section without violating mutual exclusion
Process 1:
while (true) {
out: a := true; turn := true;
wait: await (b = false or turn = false);
cs:
a := false;
}
||
Process 2:
while (true) {
out: b := true; turn := false;
wait: await (a = false or turn);
cs:
b := false;
}
Example Mutual Exclusion Protocol in SMV
MODULE process1(a,b,turn)
VAR
  pc: {out, wait, cs};
ASSIGN
  init(pc) := out;
  next(pc) :=
    case
      pc=out : wait;
      pc=wait & (!b | !turn) : cs;
      pc=cs : out;
      1 : pc;
    esac;
  next(turn) :=
    case
      pc=out : 1;
      1 : turn;
    esac;
  next(a) :=
    case
      pc=out : 1;
      pc=cs : 0;
      1 : a;
    esac;
  next(b) := b;
FAIRNESS
  running

MODULE process2(a,b,turn)
VAR
  pc: {out, wait, cs};
ASSIGN
  init(pc) := out;
  next(pc) :=
    case
      pc=out : wait;
      pc=wait & (!a | turn) : cs;
      pc=cs : out;
      1 : pc;
    esac;
  next(turn) :=
    case
      pc=out : 0;
      1 : turn;
    esac;
  next(b) :=
    case
      pc=out : 1;
      pc=cs : 0;
      1 : b;
    esac;
  next(a) := a;
FAIRNESS
  running
Example Mutual Exclusion Protocol in SMV
MODULE main
VAR
  a : boolean;
  b : boolean;
  turn : boolean;
  p1 : process process1(a,b,turn);
  p2 : process process2(a,b,turn);
SPEC
  AG(!(p1.pc=cs & p2.pc=cs))
  -- AG(p1.pc=wait -> AF(p1.pc=cs)) & AG(p2.pc=wait -> AF(p2.pc=cs))
Here is the output when I run SMV on this example to
check the mutual exclusion property
% smv mutex.smv
-- specification AG (!(p1.pc = cs & p2.pc = cs)) is true
resources used:
user time: 0.01 s, system time: 0 s
BDD nodes allocated: 692
Bytes allocated: 1245184
BDD nodes representing transition relation: 143 + 6
Example Mutual Exclusion Protocol in SMV
The output for the starvation freedom property:
% smv mutex.smv
-- specification AG (p1.pc = wait -> AF p1.pc = cs) & AG ... is true
resources used:
user time: 0 s, system time: 0 s
BDD nodes allocated: 1251
Bytes allocated: 1245184
BDD nodes representing transition relation: 143 + 6
Example Mutual Exclusion Protocol in SMV
Let’s insert an error
change
pc=wait & (!b | !turn) : cs;
to
pc=wait & (!b | turn) : cs;
% smv mutex.smv
-- specification AG (!(p1.pc = cs & p2.pc = cs)) is false
-- as demonstrated by the following execution sequence
state 1.1:
a = 0
b = 0
turn = 0
p1.pc = out
p2.pc = out
[stuttering]
state 1.2:
[executing process p2]
state 1.3:
b = 1
p2.pc = wait
[executing process p2]
state 1.4:
p2.pc = cs
[executing process p1]
state 1.5:
a = 1
turn = 1
p1.pc = wait
[executing process p1]
state 1.6:
p1.pc = cs
[stuttering]
resources used:
user time: 0.01 s, system time: 0 s
BDD nodes allocated: 1878
Bytes allocated: 1245184
BDD nodes representing transition relation: 143 + 6
Symbolic Model Checking with BDDs
• BDDs are used as a data structure for encoding truth sets
of Boolean logic formulas in symbolic model checking
• One can use BDD-based symbolic model checking for any
finite state system using a Boolean encoding of the state
space and the transition relation
• Why are we using symbolic model checking?
– We hope that the symbolic representations will be more
compact than the explicit state representation on the
average
– In the worst case we may not gain anything
Problems with BDDs
• The BDD for the transition relation could be huge
– Remember that the BDD could be exponential in the
number of disjuncts and conjuncts
– Since we are using a Boolean encoding there could be a
large number of conjuncts and disjuncts
• The EX computation could result in exponential blow-up
– Exponential in the number of existentially quantified
variables
Heuristics
• Instead of computing a monolithic BDD for the whole
transition system partition the transition relation in order to
keep the BDD size small
• Use good variable ordering in order to keep the BDD sizes
small
– Use heuristics to find good variable orderings,
– Use dynamic variable ordering heuristics that change the
variable ordering dynamically if the BDD size grows too
much
• Use other data structures (such as multi-terminal decision
diagrams)
Counter-Example Generation
• Remember: Given a transition system $T = (S, I, R)$ and a
CTL property $p$, $T \models p$ iff for all initial states $s \in I$, $s \models p$
• Verification vs. Falsification
– Verification:
• Show: initial states $\subseteq$ truth set of $p$
– Falsification:
• Find: a state $\in$ initial states $\cap$ truth set of $\neg p$
• Generate a counter-example starting from that state
• The ability to find counter-examples is one of the biggest
strengths of the model checkers
An Example
• We want to check the property $AG(p)$
• We compute the fixpoint for $EF(\neg p)$
• We check if the intersection of the set of initial states $I$ and
the truth set of $EF(\neg p)$ is empty
– If it is not empty we generate a counter-example path
starting from the intersection
• In order to generate the counter-example path, save
the fixpoint iterations.
• After the fixpoint computation converges, do a second pass
to generate the counter-example path.
$EF(\neg p) \equiv$ states that can reach $\neg p$
$EF(\neg p) = \neg p \cup EX(\neg p) \cup EX(EX(\neg p)) \cup \ldots$
[Figure: the nested fixpoint iterates of $EF(\neg p)$ overlapping the set of initial states $I$; the counter-example path is generated starting from a state in the overlap]
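The two-pass procedure can be sketched on the faulty version of the mutual exclusion protocol, where Process 1 waits on (!b | turn) instead of (!b | !turn). The first pass computes and saves the backward fixpoint iterates for the set of states that violate mutual exclusion; the second pass walks forward from an initial state through the saved iterates. The sketch uses explicit sets rather than BDDs and assumes each labeled line of the protocol is one atomic step; all names are illustrative.

```python
from itertools import product

def successors(state):
    """One-step successors of (pc1, pc2, turn, a, b) in the faulty protocol."""
    pc1, pc2, turn, a, b = state
    out = []
    if pc1 == "o":
        out.append(("w", pc2, True, True, b))
    elif pc1 == "w" and (not b or turn):       # injected error: turn, not !turn
        out.append(("c", pc2, turn, a, b))
    elif pc1 == "c":
        out.append(("o", pc2, turn, False, b))
    if pc2 == "o":
        out.append((pc1, "w", False, a, True))
    elif pc2 == "w" and (not a or turn):
        out.append((pc1, "c", turn, a, b))
    elif pc2 == "c":
        out.append((pc1, "o", turn, a, False))
    return out

BOOLS = (False, True)
STATES = list(product("owc", "owc", BOOLS, BOOLS, BOOLS))
INITIAL = {s for s in STATES if s[0] == "o" and s[1] == "o"}
# Truth set of "not p" for p = mutual exclusion.
BAD = frozenset(s for s in STATES if s[0] == "c" and s[1] == "c")

def ex(target):
    return frozenset(s for s in STATES
                     if any(t in target for t in successors(s)))

def backward_layers(bad):
    """First pass: save the iterates bad, bad | EX(bad), ... of EF(bad)."""
    layers = [frozenset(bad)]
    while True:
        nxt = bad | ex(layers[-1])
        if nxt == layers[-1]:
            return layers
        layers.append(nxt)

def counter_example(initial, bad):
    """Second pass: walk from an initial state down through the layers."""
    layers = backward_layers(bad)
    for k, layer in enumerate(layers):
        hits = sorted(s for s in initial if s in layer)
        if hits:
            path = [hits[0]]
            break
    else:
        return None                      # no initial state can reach bad
    while path[-1] not in bad:
        k -= 1                           # each step moves one layer closer to bad
        path.append(min(t for t in successors(path[-1]) if t in layers[k]))
    return path

for state in counter_example(INITIAL, BAD):
    print(state)
```

Because the initial state is taken from the first layer that intersects the initial states, the path produced this way is a shortest one. It may differ from the trace a particular tool reports, since several counter-examples of the same length can exist.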
Automata Theoretic Model Checking
LTL Properties $\equiv$ Büchi automata
[Vardi and Wolper LICS 86]
• Büchi automata: Finite state automata that accept infinite
strings
– The better known variant of finite state automata accept
finite strings (used in lexical analysis for example)
• A Büchi automaton accepts a string when the
corresponding run visits an accepting state infinitely often
– Note that an infinite run never ends, so we cannot say
that an accepting run ends at an accepting state
• LTL properties can be translated to Büchi automata
– The automaton accepts a path if and only if the path
satisfies the corresponding LTL property
LTL Properties $\equiv$ Büchi automata
[Figure: Büchi automata for the properties G p, F p, and G (F p), with edges labeled p, ¬p and true]
The size of the property automaton can be exponential in the
size of the LTL formula (recall the complexity of LTL model
checking)
Büchi Automata: Language Emptiness Check
• Given a Büchi automaton, one interesting question is:
– Is the language accepted by the automaton empty?
• i.e., does it accept any string?
• A Büchi automaton accepts a string when the
corresponding run visits an accepting state infinitely often
• To check emptiness:
– Look for a cycle which contains an accepting state and is
reachable from the initial state
• Find a strongly connected component that contains
an accepting state, and is reachable from the initial
state
– If no such cycle can be found the language accepted by
the automaton is empty
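The emptiness check described above can be sketched with two reachability searches. Edge labels are ignored because only the graph structure matters for emptiness. The automata in the example are hypothetical and the function names are illustrative; the repeated searches keep the code short and are not meant to be efficient.

```python
def reachable(graph, sources):
    """All states reachable from the given sources (sources included)."""
    seen, stack = set(), list(sources)
    while stack:
        s = stack.pop()
        if s not in seen:
            seen.add(s)
            stack.extend(graph.get(s, ()))
    return seen

def is_empty(graph, initial, accepting):
    """True iff no accepting state is both reachable and on a cycle."""
    for s in reachable(graph, [initial]) & set(accepting):
        # s lies on a cycle iff s is reachable from its own successors.
        if s in reachable(graph, graph.get(s, ())):
            return False
    return True

# Hypothetical automaton in the style of G (F p): q1 is accepting
# and can be revisited forever.
recurring = {"q0": ["q0", "q1"], "q1": ["q1", "q0"]}
# Hypothetical automaton whose only accepting state has no outgoing edge.
dead_end = {"q0": ["q0", "q1"], "q1": []}

print(is_empty(recurring, "q0", {"q1"}), is_empty(dead_end, "q0", {"q1"}))
```

A production implementation would typically find the strongly connected components once instead of running one search per accepting state.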
LTL Model Checking
• Generate the property automaton from the negated LTL
property
• Generate the product of the property automaton and the
transition system
• Show that there is no accepting cycle in the product
automaton (check language emptiness)
– i.e., show that the intersection of the paths generated by
the transition system and the paths accepted by the
(negated) property automaton is empty
• If there is a cycle, it corresponds to a counterexample
behavior that demonstrates the bug
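LTL Model Checking Example
Property to be verified: $G\,q$
Negation of the property: $\neg G\,q \equiv F\,\neg q$
[Figure: example transition system with three states, where state 1 is labeled p,q, state 2 is labeled q, and state 3 is labeled p; each state is labeled with the propositions that hold in that state]
[Figure: property automaton for the negated property, with states 1 and 2 and edges labeled true and ¬q, or equivalently with the sets of propositions ∅, {p}, {q}, {p,q}]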
Transition System to Büchi Automaton Translation
[Figure: the example transition system (states 1, 2, 3 labeled p,q; q; p) and the corresponding Büchi automaton with states i, 1, 2, 3 and edges labeled {p,q}, {q}, {p}]
Each state of the transition system is labeled with the propositions that hold in that state
In the Büchi automaton for the transition system, every state is accepting
Product automaton
[Figure: the Büchi automaton for the transition system (states 1 to 4), the property automaton (states 1 and 2), and their product with states (1,1), (2,1), (3,1), (3,2), (4,2)]
Accepting cycle:
$(1,1), (2,1), (3,1), ((4,2), (3,2))^\omega$
Corresponds to a counter-example
path for the property $G\,q$
SPIN [Holzmann 91, TSE 97]
• Explicit state model checker
• Finite state
• Temporal logic: LTL
• Input language: PROMELA
– Asynchronous processes
– Shared variables
– Message passing through (bounded) communication
channels
– Variables: boolean, char, integer (bounded), arrays
(fixed size)
– Structured data types
SPIN
Verification in SPIN
• Uses the LTL model checking approach
• Constructs the product automaton on-the-fly
– It is possible to find an accepting cycle (i.e., a counter-example) without constructing the whole state space
• Uses a nested depth-first search algorithm to look for an
accepting cycle
• Uses various heuristics to improve the efficiency of the
nested depth first search:
– partial order reduction
– state compression
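The nested depth-first search mentioned above can be sketched as follows. This is the textbook formulation, in which a second search is started from each accepting state when the first search backtracks from it. It is not Spin's actual implementation, which is generated C code with many optimizations. The example graph is an assumption read off the accepting cycle shown for the product automaton, and the function names are illustrative.

```python
def has_accepting_cycle(initial, succ, accepting):
    """Nested DFS: True iff an accepting cycle is reachable from initial."""
    visited_outer, visited_inner = set(), set()

    def inner(s, seed):
        # Second search: look for a path that leads back to the seed.
        visited_inner.add(s)
        for t in succ(s):
            if t == seed:
                return True
            if t not in visited_inner and inner(t, seed):
                return True
        return False

    def outer(s):
        # First search: explore successors on the fly.
        visited_outer.add(s)
        for t in succ(s):
            if t not in visited_outer and outer(t):
                return True
        # Launch the second search in postorder, only from accepting states.
        return s in accepting and inner(s, s)

    return outer(initial)

# Assumed product graph, following the accepting cycle
# (1,1), (2,1), (3,1), ((4,2), (3,2)) repeated forever.
product_graph = {
    (1, 1): [(2, 1)],
    (2, 1): [(3, 1)],
    (3, 1): [(4, 2)],
    (4, 2): [(3, 2)],
    (3, 2): [(4, 2)],
}
accepting_states = {(4, 2), (3, 2)}

print(has_accepting_cycle((1, 1), lambda s: product_graph.get(s, []),
                          accepting_states))
```

Successors are obtained through the `succ` callback, so states can be generated only when the search reaches them. This is what allows a counter-example to be found without building the whole product first. The recursion is fine for small examples; a real tool would use an explicit stack.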
Example Mutual Exclusion Protocol
Two concurrently executing processes are trying to enter a
critical section without violating mutual exclusion
Process 1:
while (true) {
out: a := true; turn := true;
wait: await (b = false or turn = false);
cs:
a := false;
}
||
Process 2:
while (true) {
out: b := true; turn := false;
wait: await (a = false or turn);
cs:
b := false;
}
Example Mutual Exclusion Protocol in Promela
#define cs1 process1@cs
#define cs2 process2@cs
#define wait1 process1@wait
#define wait2 process2@wait
#define true 1
#define false 0
bool a;
bool b;
bool turn;
proctype process1()
{
out:  a = true; turn = true;
wait: (b == false || turn == false);
cs:   a = false; goto out;
}
proctype process2()
{
out:  b = true; turn = false;
wait: (a == false || turn == true);
cs:   b = false; goto out;
}
init {
  run process1(); run process2()
}
Property automaton generation
• Input formula: "[]" means G, "<>" means F
• The "spin -f" option generates a Büchi automaton for the input LTL formula
% spin -f "! [] (! (cs1 && cs2))"
never {
/* ! [] (! (cs1 && cs2)) */
T0_init:
  if
  :: ((cs1) && (cs2)) -> goto accept_all
  :: (1) -> goto T0_init
  fi;
accept_all:
  skip
}
% spin -f "!([](wait1 -> <>(cs1)))"
never {
/* !([](wait1 -> <>(cs1))) */
T0_init:
  if
  :: ( !((cs1)) && (wait1) ) -> goto accept_S4
  :: (1) -> goto T0_init
  fi;
accept_S4:
  if
  :: (! ((cs1))) -> goto accept_S4
  fi;
}
Concatenate the generated never claims to the end of the specification file
SPIN
• “spin -a mutex.spin” generates a C program “pan.c” from
the specification file
– This C program implements the on-the-fly nested
depth-first search algorithm
– You compile “pan.c” and run it to perform the model checking
• Spin generates a counter-example trace if it finds out that a
property is violated
%mutex -a
warning: for p.o. reduction to be valid the never claim must be stutter-invariant
(never claims generated from LTL formulae are stutter-invariant)
(Spin Version 4.2.6 -- 27 October 2005)
+ Partial Order Reduction
Full statespace search for:
never claim             +
assertion violations    + (if within scope of claim)
acceptance cycles       + (fairness disabled)
invalid end states      - (disabled by never claim)
State-vector 28 byte, depth reached 33, errors: 0
22 states, stored
15 states, matched
37 transitions (= stored+matched)
0 atomic steps
hash conflicts: 0 (resolved)
2.622 memory usage (Mbyte)
unreached in proctype process1
line 18, state 6, "-end-"
(1 of 6 states)
unreached in proctype process2
line 27, state 6, "-end-"
(1 of 6 states)
unreached in proctype :init:
(0 of 3 states)
Problems/Heuristics for explicit state model checking
• State space explosion: Number of states can be
exponential in the number of variables and concurrent
components
• Heuristics used by Spin:
– On the fly checking: use a depth first search that
computes the product of the property automaton and the
transition relation while looking for a violation
– Bit-state hashing
• do not store the full state information
• might skip some unvisited states so it is not sound
– Partial order reduction
• only explore a representative subset of interleavings
among the concurrent processes
• this can be done in a sound manner
Software Verification Using Explicit State Model
Checking with Java Path Finder
Java Path Finder
• Program checker for Java
• Properties to be verified
– Properties can be specified as assertions
• static checking of assertions
– It can also verify LTL properties
• Implements both depth-first and breadth-first search and
looks for assertion violations statically
• Uses static analysis techniques to improve the efficiency of
the search
• Requires a complete Java program
• It can only handle pure Java; it cannot handle native code
Java Path Finder, First Version
• First version
– A translator from Java to PROMELA
– Use SPIN for model checking
• Since SPIN cannot handle unbounded data
– Restrict the program to finite domains
• A fixed number of objects from each class
• Fixed bounds for array sizes
• Does not scale well if these fixed bounds are increased
• Java source code is required for translation
Java Path Finder, Current Version
• Current version of the JPF has its own virtual machine:
JPF-JVM
– Executes Java bytecode
• can handle pure Java but can not handle native code
– Has its own garbage collection
– Stores the visited states and stores current path
– Offers some methods to the user to optimize verification
• Traversal algorithm
– Traverses the state-graph of the program
– Tells JPF-JVM to move forward, backward in the
state space, and evaluate the assertion
• The rest of the slides are on the current version of JPF:
W. Visser, K. Havelund, G. Brat, S. Park and F. Lerda. "Model
Checking Programs." Automated Software Engineering
Journal Volume 10, Number 2, April 2003.
Storing the States
• JPF implements a depth-first search on the state space of
the given Java program
– To do depth first search we need to store the visited
states
• There are also verification tools which use stateless
search such as Verisoft
• The state of the program consists of
– information for each thread in the Java program
• a stack of frames, one for each method called
– the static variables in classes
• locks and fields for the classes
– the dynamic variables (fields) in objects
• locks and fields for the objects
Storing States Efficiently
• Since different states can have common parts, each state is
divided into a set of components which are stored separately
– locks, frames, fields
• Keep a pool for each component
– A table of field values, lock values, frame values
• Instead of storing the value of a component in a state, store
in the state the index at which the component is stored in
the table
– The whole state becomes an integer vector
• JPF collapses states to integer vectors using this idea
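The collapsing idea can be sketched as follows. This is a minimal illustration and not JPF's actual data structures: the class names `ComponentPool` and `StateCollapser`, and the assumption that every state lists its components in the same fixed order, are hypothetical.

```python
class ComponentPool:
    """Assigns a small integer index to each distinct component value."""

    def __init__(self):
        self._index = {}
        self._values = []

    def intern(self, value):
        # Store a value once; later occurrences reuse the same index.
        if value not in self._index:
            self._index[value] = len(self._values)
            self._values.append(value)
        return self._index[value]

    def __len__(self):
        return len(self._values)


class StateCollapser:
    """Collapses states into integer vectors and remembers visited ones."""

    def __init__(self):
        self.pools = {kind: ComponentPool() for kind in ("locks", "frames", "fields")}
        self.visited = set()

    def collapse(self, components):
        # components: sequence of (kind, hashable value) pairs in a fixed order.
        return tuple(self.pools[kind].intern(value) for kind, value in components)

    def add(self, components):
        """Return True if the state was not seen before."""
        vector = self.collapse(components)
        if vector in self.visited:
            return False
        self.visited.add(vector)
        return True


collapser = StateCollapser()
s1 = [("frames", ("run", 0)), ("fields", (1,)), ("locks", 0)]
s2 = [("frames", ("run", 0)), ("fields", (3,)), ("locks", 0)]
v1 = collapser.collapse(s1)
v2 = collapser.collapse(s2)
```

The two states share their frame and lock components, so only the field component is stored twice; each state itself is just a short vector of indices.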
State Space Explosion
• State space explosion is one of the major challenges in
model checking
• The idea is to reduce the number of states that have to be
visited during state space exploration
• Here are some approaches used to attack state space
explosion
– Symmetry reduction
• search equivalent states only once
– Partial order reduction
• do not search thread interleavings that generate
equivalent behavior
– Abstraction
• Abstract parts of the state to reduce the size of the
state space
Symmetry Reduction
• Some states of the program may be equivalent
– Equivalent states should be searched only once
• Some states may differ only in their memory layout, the
order objects are created, etc.
– these may not have any effect on the behavior of the
program
• JPF makes sure that the order in which the classes are
loaded does not affect the state
– There is a canonical ordering of the classes in the
memory
Symmetry Reduction
• A similar problem occurs for location of dynamically
allocated objects in the heap
– If we store the memory location as the state, then we
can miss equivalent states which have different memory
layouts
– JPF tries to remedy this problem by storing some
information about the new statements that create an
object and the number of times they are executed
Partial Order Reduction
• Statements of concurrently executing threads can generate
many different interleavings
– all these different interleavings are allowable behavior of
the program
• A model checker has to check that the behavior of the
program is correct for all possible interleavings
– However different interleavings may generate equivalent
behaviors
• In such cases it is sufficient to check just one interleaving
without exhausting all the possibilities
– This is called partial order reduction
state space search generates 258 states
with symmetry reduction: 105 states
with partial order reduction: 68 states
with symmetry reduction + partial order reduction : 38 states
class S1 { int x;}
class FirstTask
extends Thread {
public void run() {
S1 s1; int x = 1;
s1 = new S1();
x = 3;
}}
class S2 { int y;}
class SecondTask
extends Thread {
public void run() {
S2 s2; int x = 1;
s2 = new S2();
x = 3;
}}
class Main {
public static void main(String[] args) {
FirstTask task1 = new FirstTask();
SecondTask task2 = new SecondTask();
task1.start(); task2.start();
}}
Static Analysis
• JPF uses the following static analysis techniques for reducing
the state space:
– slicing
– partial evaluation
• Given a slicing criterion slicing reduces the size of a
program by removing the parts of the program that have no
effect on the slicing criterion
– A slicing criterion could be a program point
– Program slices are computed using dependency
analysis
• Partial evaluation propagates constant values and
simplifies expressions
Abstraction vs. Restriction
• JPF also uses abstraction techniques such as predicate
abstraction to reduce the state space
• Still, in order to check a program with JPF, typically, you
need to restrict the domains of the variables, the sizes of
the arrays, etc.
• Abstraction over approximates the program behavior
– causes spurious counter-examples
• Restriction under approximates the program behavior
– may result in missed errors
• If both under and over approximation techniques are used
then the resulting verification technique is neither sound nor
complete
– However, it is still useful as a debugging tool and it is
helpful in finding bugs
JPF Java Modeling Primitives
• Atomicity (used to reduce the state space)
– beginAtomic(), endAtomic()
• Nondeterminism (used to model non-determinism caused
by abstraction)
– int random(int);
boolean randomBool();
Object randomObject(String cname);
• Properties (used to specify properties to be verified)
– assertTrue(boolean cond)
Annotated Java Code for a Reader-Writer Lock
import gov.nasa.arc.ase.jpf.jvm.Verify;
class ReaderWriter {
private int nr;
private boolean busy;
private Object Condr_enter;
private Object Condw_enter;
public ReaderWriter() {
Verify.beginAtomic();
nr = 0; busy=false ;
Condr_enter =new Object();
Condw_enter =new Object();
Verify.endAtomic();
}
public boolean read_exit(){
boolean result=false;
synchronized(this){
nr = (nr - 1);
result=true;
}
Verify.assertTrue(!busy || nr==0 );
return result;
}
private boolean Guarded_r_enter(){
boolean result=false;
synchronized(this){
if(!busy){nr = (nr + 1);result=true;}}
return result;
}
public void read_enter(){
synchronized(Condr_enter){
while (! Guarded_r_enter()){
try{Condr_enter.wait();}
catch(InterruptedException e){}
}}
Verify.assertTrue(!busy || nr==0 );
}
private boolean Guarded_w_enter(){…}
public void write_enter(){…}
public boolean write_exit(){…}
};
JPF Output
>java gov.nasa.arc.ase.jpf.jvm.Main rwmain
JPF 2.1 - (C) 1999,2002 RIACS/NASA Ames Research Center
JVM 2.1 - (C) 1999,2002 RIACS/NASA Ames Research Center
Loading class gov.nasa.arc.ase.jpf.jvm.reflection.JavaLangObjectReflection
Loading class gov.nasa.arc.ase.jpf.jvm.reflection.JavaLangThreadReflection
==============================
No Errors Found
==============================
-----------------------------------
States visited       : 36,999
Transitions executed : 68,759
Instructions executed: 213,462
Maximum stack depth  : 9,010
Intermediate steps   : 2,774
Memory used          : 22.1MB
Memory used after gc : 14.49MB
Storage memory       : 7.33MB
Collected objects    : 51
Mark and sweep runs  : 55,302
Execution time       : 20.401s
Speed                : 3,370tr/s
-----------------------------------
Example Error Trace
1 error found: Deadlock
========================
*** Path to error: ***
========================
Steps to error: 2521
Step #0 Thread #0
Step #1 Thread #0
rwmain.java:4
ReaderWriter monitor=new ReaderWriter();
Step #2 Thread #0
ReaderWriter.java:10
public ReaderWriter( ) {
…
Step #2519 Thread #2
ReaderWriter.java:71
while (! Guarded_w_enter()){
Step #2520 Thread #2
ReaderWriter.java:73
Condw_enter.wait();
Bounded Model Checking
Bounded Model Checking
• Represent sets of states and the transition relation as
Boolean logic formulas
• Instead of computing the fixpoints, unroll the transition
relation up to certain fixed bound and search for violations
of the property within that bound
• Transform this search to a Boolean satisfiability problem
and solve it using a SAT solver
What Can We Guarantee?
• Note that in bounded model checking we are checking only
for bounded paths (paths which have at most k+1 distinct
states)
– So if the property is violated by only paths with more
than k+1 distinct states, we would not find a counterexample using bounded model checking
– Hence if we do not find a counter-example using
bounded model checking we are not sure that the
property holds
• However, if we find a counter-example, then we are sure
that the property is violated since the generated counterexample is never spurious (i.e., it is always a concrete
counter-example)
Bounded Model Checking: Proving Correctness
• One can also show that given an LTL property f, if E f holds
for a finite state transition system, then E f also holds for
that transition system using bounded semantics for some
bound k
• So if we keep increasing the bound, then we are
guaranteed to find a path that satisfies the formula
– And, if we do not find a path that satisfies the formula,
then we decide that the formula is not satisfied by the
transition system
– Is there a problem here?
Proving Correctness
• We can modify the bounded model checking algorithm as
follows:
– Start from an initial bound.
– If no counter-examples are found using the current
bound, increment the bound and try again.
• The problem is: We do not know when to stop
Proving Correctness
• If we can find a way to figure out when we should stop then
we would be able to provide guarantee of correctness.
• There is a way to define a diameter of a transition system
so that a property holds for the transition system if and only
if it is not violated on a path bounded by the diameter.
• So if we do bounded model checking using the diameter of
the system as our bound, then we can guarantee
correctness if no counter-example is found.
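The "increase the bound until the diameter is reached" loop can be sketched as follows. This is an explicit-state stand-in under stated assumptions: it handles only an invariant (reachability) property, it enumerates states instead of calling a SAT solver, and it uses the reachability diameter (the number of steps needed to reach every reachable state). All function names are hypothetical, and computing the diameter by full exploration is only practical for toy systems like this one.

```python
def violated_within(initial, successors, is_bad, k):
    """True if some path with at most k transitions reaches a bad state."""
    layer = set(initial)  # states reachable in exactly i steps
    for _ in range(k + 1):
        if any(is_bad(s) for s in layer):
            return True
        layer = {t for s in layer for t in successors(s)}
    return False


def reachability_diameter(initial, successors):
    """Smallest k such that every reachable state is reachable in <= k steps."""
    seen = set(initial)
    frontier = set(initial)
    k = 0
    while True:
        frontier = {t for s in frontier for t in successors(s)} - seen
        if not frontier:
            return k
        seen |= frontier
        k += 1


def bounded_check(initial, successors, is_bad, diameter):
    """Increment the bound; stop at a violation or at the diameter."""
    for k in range(diameter + 1):
        if violated_within(initial, successors, is_bad, k):
            return ("violated", k)
    return ("holds", diameter)


# Toy system: a counter that counts modulo 8, starting at 0.
succ = lambda s: [(s + 1) % 8]
d = reachability_diameter([0], succ)
result_bad = bounded_check([0], succ, lambda s: s == 5, d)
result_ok = bounded_check([0], succ, lambda s: s == 9, d)
```

Without the diameter as a stopping point, the loop in `bounded_check` could only ever report violations, which is exactly the problem raised on the previous slides.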
Bounded Model Checking
• What are the differences between bounded model checking
and BDD-based symbolic model checking?
– In bounded model checking we are using a SAT solver
instead of a BDD library
– In symbolic model checking we do not unroll the
transition relation as in bounded model checking
– In bounded model checking we do not execute the
iterative fixpoint computations as in symbolic model
checking
– In symbolic model checking for finite state systems both
verification and falsification results are guaranteed
• In bounded model checking we can only guarantee
the falsification results, in order to guarantee the
verification results we need to know the diameter of
the system
Bounded Model Checking
• A bounded model checker needs an efficient SAT solver
– zChaff SAT solver is one of the most commonly used
ones
• Most SAT solvers require their input to be in Conjunctive
Normal Form (CNF)
– So the final formula has to be converted to CNF
• Similar to BDD-based symbolic model checking, bounded
model checking was also first used for hardware verification
• More recently, it has been applied to verification of software
Bounded Model Checking for Software
CBMC is a bounded model checker for ANSI-C programs
• Handles function calls using inlining
• Unwinds the loops a fixed number of times
• Allows user input to be modeled using non-determinism
– So that a program can be checked for a set of inputs
rather than a single input
• Allows specification of assertions which are checked using
the bounded model checking
Loops
• Unwind the loop n times by duplicating the loop body n
times
– Each copy is guarded using an if statement that checks
the loop condition
• At the end of the n repetitions an unwinding assertion is
added which is the negation of the loop condition
– Hence if the loop iterates more than n times in some
execution, the unwinding assertion will be violated and
we know that we need to increase the bound in order to
guarantee correctness
• A similar strategy is used for recursive function calls
– The recursion is unwound up to a certain bound and
then an assertion is generated stating that the recursion
does not go any deeper
A Simple Loop Example
Original code
x=0;
while (x < 2) {
y=y+x;
x++;
}
Unwinding the loop 3 times
x=0;
if (x < 2) {
y=y+x;
x++;
}
if (x < 2) {
y=y+x;
x++;
}
if (x < 2) {
y=y+x;
x++;
}
Unwinding assertion:
assert (! (x < 2))
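The effect of the unwinding bound on this example can be checked with a small sketch. This is an illustration only, not CBMC: the functions `original` and `unwound` are hypothetical, and the `for` loop in `unwound` stands in for n textual copies of the guarded loop body.

```python
def original(y):
    """The original loop."""
    x = 0
    while x < 2:
        y = y + x
        x += 1
    return x, y


def unwound(y, n):
    """The loop unwound n times, plus the value of the unwinding assertion."""
    x = 0
    for _ in range(n):  # each pass corresponds to one copy of: if (x < 2) {...}
        if x < 2:
            y = y + x
            x += 1
    unwinding_assertion = not (x < 2)
    return x, y, unwinding_assertion


enough = unwound(10, 3)      # bound 3 covers both loop iterations
too_small = unwound(10, 1)   # bound 1 cuts the loop short
```

With bound 3 the unwinding assertion holds and the result agrees with the original loop; with bound 1 the assertion fails, signalling that the bound must be increased.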
From Code to SAT
• After eliminating loops and recursion, CBMC converts the
input program to the static single assignment (SSA) form
– In SSA each variable appears at the left hand side of an
assignment only once
– This is a standard program transformation that is
performed by creating new variables
• In the resulting program each variable is assigned a value
only once and all the branches are forward branches (there
is no backward edge in the control flow graph)
• CBMC generates a Boolean logic formula from the program
using bit vectors to represent variables
Another Simple Example
Original code
x=x+y;
if (x!=1)
x=2;
else
x++;
assert(x<=3);
Convert to static single assignment
x1=x0+y0;
if (x1!=1)
x2=2;
else
x3=x1+1;
x4=(x1!=1)?x2:x3;
assert(x4<=3);
Generate constraints
C ≡ x1=x0+y0 ∧ x2=2 ∧ x3=x1+1 ∧ ((x1!=1 ∧ x4=x2) ∨ (x1=1 ∧ x4=x3))
P ≡ x4 <= 3
Check if C ∧ ¬P is satisfiable; if it is, then the assertion is
violated
C ∧ ¬P is converted to boolean logic using a bit vector
representation for the integer variables y0,x0,x1,x2,x3,x4
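The satisfiability check for this example can be sketched by brute force. This is not how CBMC works internally: instead of building a CNF formula and calling a SAT solver, the sketch enumerates all inputs, which is enough here because the constraints determine x1..x4 from x0 and y0. The 4-bit unsigned width and the names `WIDTH`, `MASK`, and `counterexamples` are assumptions made for illustration.

```python
from itertools import product

WIDTH = 4                 # assumed bit width, kept small so enumeration is cheap
MASK = (1 << WIDTH) - 1   # unsigned wrap-around arithmetic


def counterexamples(limit=3):
    """Yield every (x0, y0) that satisfies C and violates P: x4 <= limit."""
    for x0, y0 in product(range(1 << WIDTH), repeat=2):
        x1 = (x0 + y0) & MASK
        x2 = 2
        x3 = (x1 + 1) & MASK
        x4 = x2 if x1 != 1 else x3
        if not (x4 <= limit):
            yield x0, y0


violations = list(counterexamples())           # the assertion from the slide
violations_strict = list(counterexamples(1))   # a stricter, hypothetical assertion
```

No input violates `x4 <= 3`, because x4 is 2 on both branches, so the conjunction of C with the negated property is unsatisfiable and the assertion holds. The stricter bound is violated by every input.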
Bounded Verification Approaches
• What we have discussed above is bounded verification by
bounding the number of steps of the execution.
• For this approach to work the variable domains also need
to be bounded, otherwise we cannot convert the problems
to boolean SAT
• Bounding the execution steps and bounding the data
domain are two orthogonal approaches.
– When people say bounded verification it may refer to
either of these
– When people say bounded model checking it typically
refers to bounding the execution steps
Symbolic Software Model Checking with
Predicate Abstraction and Counter-Example
Guided Abstraction Refinement
Model Checking Programs Using Abstraction
• Program model checking tools generally rely on automated
abstraction techniques to reduce the state space of the
system such as:
– Abstract interpretation
– Predicate abstraction
• If the abstraction is conservative then, if there is no error in
the abstracted program we can conclude that there is no
error in the original program
• In general the problem is to construct a finite state model
from the program such that the errors or absence of errors
can be demonstrated on the finite state model
– Model extraction problem
Model Checking Programs via Abstraction
• Bandera
– A tool for extracting finite state models from programs
– Uses various abstract domains to map the state space of
the program to a finite set of states via abstraction
• SLAM project at Microsoft Research
– Symbolic model checking for C programs
– Can handle unbounded recursion but does not handle
concurrency
– Uses predicate abstraction, counter-example guided
abstraction refinement and BDDs
Abstraction (A simplified view)
• Abstraction is an effective tool in verification
• Given a transition system, we want to generate an abstract
transition system which is easier to analyze
• However, we want to make sure that
– If a property holds in the abstract transition system, it
also holds in the original (concrete) transition system
Abstraction (A simplified view)
• How do we generate an abstract transition system?
• Merge states in the concrete transition system (based on
some criteria)
– This reduces the number of states, so it should be easier
to do verification
• Do not eliminate transitions
– This will make sure that the paths in the abstract
transition system subsume the paths in the concrete
transition system
Abstraction (A simplified view)
• For every path in the concrete transition system, there is an
equivalent path in the abstract transition system
– If no path in the abstract transition system violates a
property, then no path in the concrete system can violate
the property
• Using this reasoning we can verify ACTL, LTL and ACTL*
properties in the abstract transition system
– If the property holds on the abstract transition system,
we are sure that the property holds in the concrete
transition system
– If the property does not hold in the abstract transition
system, then we are not sure if the property holds or not
in the concrete transition system
Abstraction (A simplified view)
• If the property does not hold in the abstract transition
system, what can we do?
• We can refine the abstract transition system (split some
states that we merged)
• We have to make sure that the refined transition system is
still an abstraction of the concrete transition system
• Then, we can recheck the property again on the refined
transition system
– If the property does not hold again, we can refine again
Predicate Abstraction
• An automated abstraction technique which can be used to
reduce the state space of a program
• The basic idea in predicate abstraction is to remove some
variables from the program by just keeping information
about a set of predicates about them
• For example, a predicate such as x = y may be the only
information necessary about variables x and y to determine
the behavior of the program
– In that case we can just store a boolean variable which
corresponds to the predicate x = y and remove variables
x and y from the program
– Predicate abstraction is a technique for doing such
abstractions automatically
Predicate Abstraction
• Given a program and a set of predicates, predicate
abstraction abstracts the program so that only the
information about the given predicates is preserved
• The abstracted program adds nondeterminism since in
some cases it may not be possible to figure out what the
next value of a predicate will be based on the predicates in
the given set
• One needs an automated theorem prover to compute the
abstraction
Predicate Abstraction, A Very Simple Example
• Assume that we have two integer variables x,y
• We want to abstract the program using a single predicate
“x=y”
• We will divide the states of the program into two sets:
1. The states where “x=y” is true
2. The states where “x=y” is false, i.e., “x≠y”
• We will then merge all the states in the same set
– This is an abstraction
– Basically, we forget everything except the value of the
predicate “x=y”
Predicate Abstraction, A Very Simple Example
• We will represent the predicate “x=y” as the boolean
variable B in the abstract program
– “B=true” will mean “x=y” and
– “B=false” will mean “x≠y”
• Assume that we want to abstract the following program
which contains only one statement:
y := y+1
Predicate Abstraction, Step 1
• Calculate preconditions based on the predicate
{x = y + 1} y := y + 1 {x = y}
– precondition for B being true after executing the statement y:=y+1
– Using our temporal logic notation we can say something like:
{x=y+1} ⇒ AX{x=y}
{x ≠ y + 1} y := y + 1 {x ≠ y}
– precondition for B being false after executing the statement y:=y+1
– Again, using our temporal logic notation:
{x≠y+1} ⇒ AX{x≠y}
Predicate Abstraction, Step 2
• Use decision procedures to determine if the predicates
used for abstraction imply any of the preconditions
x = y ⇒ x = y + 1 ? No
x ≠ y ⇒ x = y + 1 ? No
x = y ⇒ x ≠ y + 1 ? Yes
x ≠ y ⇒ x ≠ y + 1 ? No
Predicate Abstraction, Step 3
• Generate abstract code
y := y + 1
1) Compute preconditions
{x = y + 1} y := y + 1 {x = y}
{x ≠ y + 1} y := y + 1 {x ≠ y}
2) Check implications
x = y ⇒ x = y + 1 ? No
x ≠ y ⇒ x = y + 1 ? No
x = y ⇒ x ≠ y + 1 ? Yes
x ≠ y ⇒ x ≠ y + 1 ? No
3) Generate abstract code
Predicate abstraction wrt the predicate “x=y”
IF B THEN B := false
ELSE B := true | false
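The three steps for the statement y := y + 1 can be sketched as follows. A real tool would use a theorem prover or decision procedure; here, as a loudly simplified assumption, each implication is only tested over a small finite range of integers, so a "valid" answer is evidence rather than proof. The names `RANGE`, `valid`, and `abstract_next` are hypothetical.

```python
from itertools import product

RANGE = range(-8, 9)  # assumed finite test range standing in for a theorem prover


def valid(premise, conclusion):
    """Check premise => conclusion for every (x, y) in the test range."""
    return all(conclusion(x, y)
               for x, y in product(RANGE, repeat=2)
               if premise(x, y))


eq = lambda x, y: x == y               # B is true
ne = lambda x, y: x != y               # B is false
pre_true = lambda x, y: x == y + 1     # precondition for B true after y := y + 1
pre_false = lambda x, y: x != y + 1    # precondition for B false after y := y + 1


def abstract_next(b_holds):
    """Possible values of B after y := y + 1, given the current value of B."""
    current = eq if b_holds else ne
    if valid(current, pre_true):
        return {True}
    if valid(current, pre_false):
        return {False}
    return {True, False}  # neither precondition is implied: nondeterministic


from_true = abstract_next(True)
from_false = abstract_next(False)
```

The result matches the abstract code: when B holds it becomes false, and when B does not hold the next value is chosen nondeterministically.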
Model Checking Push-down Automata
A class of infinite state systems for which model checking is
decidable
• Push-down automata: Finite state control + one stack
• LTL model checking for push-down automata is decidable
• This may sound like a theoretical result but it is the basis of
the approach used in SLAM toolkit for model checking C
programs
– A program with finite data domains which uses recursion
can be modeled as a pushdown automaton
– A Boolean program generated by predicate abstraction
can be represented as a pushdown automaton
Predicate Abstraction + Model Checking Push Down
Automata
• Predicate abstraction combined with results on model
checking pushdown automata led to some promising tools
– SLAM project at Microsoft Research for verification of C
programs
– This tool is being used to verify device drivers at
Microsoft
• The main idea:
– Use predicate abstraction to obtain finite state
abstractions of a program
– A program with finite data domains and recursion can be
modeled as a pushdown automaton
– Use results on model checking push-down automata to
verify the abstracted (recursive) program
SLAM Toolkit
• The SLAM toolkit was developed to find errors in Windows
device drivers
– Examples in my slides are from the following paper:
• “The SLAM Toolkit”, Thomas Ball and Sriram K.
Rajamani, CAV 2001
• Windows device drivers are required to interact with the
Windows kernel according to certain interface rules
• SLAM toolkit has an interface specification language called
SLIC (Specification Language for Interface Checking) which
is used for writing these interface rules
• The SLAM toolkit instruments the driver code with
assertions based on these interface rules
A SLIC Specification for a Lock
SLIC specification:
state {
enum { Unlocked=0, Locked=1 }
state = Unlocked;
}
KeAcquireSpinLock.return {
if (state == Locked)
abort;
else
state = Locked;
}
KeReleaseSpinLock.return {
if (state == Unlocked)
abort;
else
state = Unlocked;
}
• This specification states that KeAcquireSpinLock has to be called
before KeReleaseSpinLock is called,
• and KeAcquireSpinLock cannot be called back to back before a
KeReleaseSpinLock is called, and vice versa
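The rule expressed by this SLIC specification can be mimicked with a small runtime monitor. This is only an illustration of the two-state automaton, not SLIC or SLAM code; the class `LockMonitor`, its method names, and the `SlicAbort` exception are hypothetical.

```python
class SlicAbort(Exception):
    """Raised where the SLIC specification would execute `abort`."""


class LockMonitor:
    def __init__(self):
        self.locked = False  # state = Unlocked

    def acquire_return(self):
        # Corresponds to the KeAcquireSpinLock.return rule.
        if self.locked:
            raise SlicAbort("lock acquired twice")
        self.locked = True

    def release_return(self):
        # Corresponds to the KeReleaseSpinLock.return rule.
        if not self.locked:
            raise SlicAbort("lock released while unlocked")
        self.locked = False


def run(events):
    """Replay a sequence of 'acquire'/'release' events; True if no rule is violated."""
    monitor = LockMonitor()
    try:
        for event in events:
            if event == "acquire":
                monitor.acquire_return()
            else:
                monitor.release_return()
    except SlicAbort:
        return False
    return True


ok = run(["acquire", "release", "acquire", "release"])
double_acquire = run(["acquire", "acquire"])
release_first = run(["release"])
```

SLAM checks this rule statically over all paths of the driver, whereas the monitor above only checks one concrete event sequence at a time.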
A SLIC Specification for a Lock
SLIC specification:
state {
enum { Unlocked=0, Locked=1 }
state = Unlocked;
}
KeAcquireSpinLock.return {
if (state == Locked)
abort;
else
state = Locked;
}
KeReleaseSpinLock.return {
if (state == Unlocked)
abort;
else
state = Unlocked;
}
Generated C Code:
enum { Unlocked=0, Locked=1 }
state = Unlocked;
void slic_abort() {
SLIC_ERROR: ;
}
void KeAcquireSpinLock_return() {
if (state == Locked)
slic_abort();
else
state = Locked;
}
void KeReleaseSpinLock_return() {
if (state == Unlocked)
slic_abort();
else
state = Unlocked;
}
An Example
Original code:
void example() {
do {
KeAcquireSpinLock();
nPacketsOld = nPackets;
req = devExt->WLHV;
if(req && req->status){
devExt->WLHV = req->Next;
KeReleaseSpinLock();
irp = req->irp;
if(req->status > 0){
irp->IoS.Status = SUCCESS;
irp->IoS.Info = req->Status;
} else {
irp->IoS.Status = FAIL;
irp->IoS.Info = req->Status;
}
SmartDevFreeBlock(req);
IoCompleteRequest(irp);
nPackets++;
}
} while(nPackets!=nPacketsOld);
KeReleaseSpinLock();
}
Instrumented code:
void example() {
do {
KeAcquireSpinLock();
A: KeAcquireSpinLock_return();
nPacketsOld = nPackets;
req = devExt->WLHV;
if(req && req->status){
devExt->WLHV = req->Next;
KeReleaseSpinLock();
B:
KeReleaseSpinLock_return();
irp = req->irp;
if(req->status > 0){
irp->IoS.Status = SUCCESS;
irp->IoS.Info = req->Status;
} else {
irp->IoS.Status = FAIL;
irp->IoS.Info = req->Status;
}
SmartDevFreeBlock(req);
IoCompleteRequest(irp);
nPackets++;
}
} while(nPackets!=nPacketsOld);
KeReleaseSpinLock();
C: KeReleaseSpinLock_return();
}
Boolean Programs
• After instrumenting the code, the SLAM toolkit converts the
instrumented C program to a Boolean program using
predicate abstraction
• The Boolean program consists of only Boolean variables
– The Boolean variables in the Boolean program are the
predicates that are used during predicate abstraction
• The Boolean program can have unbounded recursion
Boolean Programs
C Code:
enum { Unlocked=0, Locked=1 }
state = Unlocked;
void slic_abort() {
SLIC_ERROR: ;
}
void KeAcquireSpinLock_return() {
if (state == Locked)
slic_abort();
else
state = Locked;
}
void KeReleaseSpinLock_return() {
if (state == Unlocked)
slic_abort();
else
state = Unlocked;
}
Boolean Program:
decl {state==Locked},
{state==Unlocked} := F,T;
void slic_abort() begin
SLIC_ERROR: skip;
end
void KeAcquireSpinLock_return()
begin
if ({state==Locked})
slic_abort();
else
{state==Locked},
{state==Unlocked} := T,F;
end
void KeReleaseSpinLock_return()
begin
if ({state == Unlocked})
slic_abort();
else
{state==Locked},
{state==Unlocked} := F,T;
end
C Code:
void example() {
do {
KeAcquireSpinLock();
A: KeAcquireSpinLock_return();
nPacketsOld = nPackets;
req = devExt->WLHV;
if(req && req->status){
devExt->WLHV = req->Next;
KeReleaseSpinLock();
B:
KeReleaseSpinLock_return();
irp = req->irp;
if(req->status > 0){
irp->IoS.Status = SUCCESS;
irp->IoS.Info = req->Status;
} else {
irp->IoS.Status = FAIL;
irp->IoS.Info = req->Status;
}
SmartDevFreeBlock(req);
IoCompleteRequest(irp);
nPackets++;
}
} while(nPackets!=nPacketsOld);
KeReleaseSpinLock();
C: KeReleaseSpinLock_return();
}
Boolean Program:
void example()
begin
do
skip;
A: KeAcquireSpinLock_return();
skip;
if (*) then
skip;
B:
KeReleaseSpinLock_return();
skip;
if (*) then
skip;
else
skip;
fi
skip;
fi
while (*);
skip;
C: KeReleaseSpinLock_return();
end
Abstraction Preserves Correctness
• The Boolean program that is generated with predicate
abstraction is non-deterministic.
– Non-determinism is used to handle the cases where the
predicates used during predicate abstraction are not
sufficient to determine which branch will be
taken
• If we find no error in the generated abstract Boolean
program then we are sure that there are no errors in the
original program
– The abstract Boolean program allows more behaviors
than the original program due to non-determinism.
– Hence, if the abstract Boolean program is correct then
the original program is also correct.
Counter-Example Guided Abstraction Refinement
• However, if we find an error in the abstract Boolean
program this does not mean that the original program is
incorrect.
– The erroneous behavior in the abstract Boolean program
could be an infeasible execution path that is caused by
the non-determinism introduced during abstraction.
• Counter-example guided abstraction refinement is a
technique used to iteratively refine the abstract program in
order to remove the spurious counter-example traces
Counter-Example Guided Abstraction Refinement
The basic idea in counter-example guided abstraction
refinement is the following:
• First look for an error in the abstract program (if there are
no errors, we can terminate since we know that the original
program is correct)
• If there is an error in the abstract program, generate a
counter-example path on the abstract program
• Check if the generated counter-example path is feasible
using a theorem prover.
• If the generated path is infeasible add the predicate from
the branch condition where an infeasible choice is made to
the predicate set and generate a new abstract program.
Counter-Example Guided Abstraction Refinement
Boolean Program:
void example()
begin
do
skip;
A: KeAcquireSpinLock_return();
skip;
if (*) then
skip;
B:
KeReleaseSpinLock_return();
skip;
if (*) then
skip;
else
skip;
fi
skip;
fi
while (*);
skip;
C: KeReleaseSpinLock_return();
end
Refined Boolean Program:
(using the predicate (nPackets = nPacketsOld))
the boolean variable b represents the predicate (nPackets = nPacketsOld)
void example()
begin
do
skip;
A: KeAcquireSpinLock_return();
b := T;
if (*) then
skip;
B:
KeReleaseSpinLock_return();
skip;
if (*) then
skip;
else
skip;
fi
b := b ? F : *;
fi
while (!b);
skip;
C: KeReleaseSpinLock_return();
end
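The difference between the two Boolean programs above can be checked by exploring all nondeterministic choices. This is a hand-written sketch of the two programs, not output of the SLAM toolkit; the functions `loop_iteration` and `error_reachable` are hypothetical, and the lock state is tracked as a single boolean.

```python
def loop_iteration(locked, refined):
    """Yield (outcome, locked) for every run of one do-while iteration.

    outcome is 'error', 'loop' (iterate again) or 'exit' (leave the loop).
    """
    if locked:                       # A: acquire on a held lock aborts
        yield ("error", locked)
        return
    locked = True
    b = True                         # refined program: b := T
    for take_branch in (False, True):            # if (*)
        now_locked, b_after = locked, b
        if take_branch:
            now_locked = False       # B: release, the lock is held here
            b_after = False          # b := b ? F : *  with b = T gives F
        if refined:
            outcomes = ["exit"] if b_after else ["loop"]   # while (!b)
        else:
            outcomes = ["loop", "exit"]                    # while (*)
        for outcome in outcomes:
            yield (outcome, now_locked)


def error_reachable(refined):
    """Explore the lock state at the loop head until no new state appears."""
    seen, work = set(), [False]      # initially Unlocked
    while work:
        locked = work.pop()
        if locked in seen:
            continue
        seen.add(locked)
        for outcome, after in loop_iteration(locked, refined):
            if outcome == "error":
                return True
            if outcome == "exit" and not after:
                return True          # C: release on an unlocked lock aborts
            if outcome == "loop":
                work.append(after)
    return False


abstract_has_error = error_reachable(refined=False)
refined_has_error = error_reachable(refined=True)
```

In the unrefined program the loop may exit right after the lock was released, so the final release reaches the error label; adding b ties the loop exit to the branch that was taken and removes this spurious path.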
Counter-Example Guided Abstraction Refinement
• Using counter-example guided abstraction refinement we
are iteratively creating more and more refined abstractions
• This iterative abstraction refinement loop is not guaranteed
to converge for infinite domains
– This is not surprising since automated verification for
infinite domains is undecidable in general
• The challenge in this approach is automatically choosing
the right set of predicates for abstraction refinement
– This is similar to finding a loop invariant that is strong
enough to prove the property of interest