Assume-Guarantee Testing
Colin Blundell (UPenn)
Dimitra Giannakopoulou (RIACS)
Corina Păsăreanu (QSS)
Robust Software Engineering
NASA Ames Research Center
Moffett Field, CA, USA
Component-Based Systems
[Figure: Component A, Component B]
Systems built in a modular fashion (from components)
Also verify systems in a modular fashion!
infer system properties from local properties of components
construct appropriate environments to check components in isolation
Efficient verification of the assembled system based on knowledge about its structure
Compositional Verification
Does a system made up of M1 and M2 satisfy property P?
[Figure: M1 and M2; M1 is checked against P in the context of assumption A]
Check P on entire system: too many states!
Use the natural decomposition of the system into its components to break up the verification task
Check components in isolation: does M1 satisfy P?
Typically a component is designed to satisfy its requirements in specific contexts
Assume-guarantee reasoning: introduces an assumption A representing M1’s “context”
Reasons about triples ⟨A⟩ M1 ⟨P⟩: whenever M1 is part of a system that satisfies A, the system also guarantees P
Assume-Guarantee Rules
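[Figure: M1 with assumption A1 and M2 with assumption A2; does the system satisfy P?]
Simplest assume-guarantee rule
1. ⟨A⟩ M1 ⟨P⟩
2. ⟨true⟩ M2 ⟨A⟩ (“discharge” the assumption)
--------------------------------
3. ⟨true⟩ M1 || M2 ⟨P⟩
Symmetric rules - example
1. ⟨A1⟩ M1 ⟨P⟩
2. ⟨A2⟩ M2 ⟨P⟩
3. C(A1, A2, P)
⟨true⟩ A1 || A2 ⟨P⟩
--------------------------------
4. ⟨true⟩ M1 || M2 ⟨P⟩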
Coming up with assumptions is a non-trivial process
Approaches
Infer assumptions automatically
Two solutions developed
1. Algorithmic generation of assumption (controller); knowledge of environment is not required – ASE’02, JASE’05
2. Incremental assumption computation based on counterexamples, learning and knowledge of environment – TACAS’03, SAVCBS’03 (symmetric rules)
Challenge: what about actual implementations?
Context
Components modeled as labeled transition systems (LTSs)
assembled with parallel composition operator “||” that synchronizes shared actions and interleaves remaining actions
a trace t is a sequence of actions that can be performed from the initial state; t↾Σ is t where all actions not in Σ are removed
A property P is an LTS
describes legal behaviors in terms of its alphabet αP
An assumption A is an LTS
restricts the environment to the set of its legal behaviors
[Figure: example LTSs M1 and M2, property P over {in, out}, and assumption A over {out, send}; transitions are labeled in, send, and out]
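The definitions on this slide (parallel composition that synchronizes shared actions, and projection of a trace onto an alphabet) can be sketched as follows. The tuple encoding of an LTS, the helper names, and the two small models are assumptions made for illustration; the slide's own figure is not recoverable from the transcript.

```python
# An LTS is encoded here as (alphabet, transitions, initial_state), where
# transitions maps (state, action) -> next state. This encoding is hypothetical.

def project(trace, alphabet):
    """Projection of a trace onto an alphabet: drop every action not in it."""
    return tuple(a for a in trace if a in alphabet)

def step(lts, state, action):
    """Next state of one LTS. Actions outside its alphabet leave the state
    unchanged; None means the LTS blocks the action."""
    alphabet, transitions, _ = lts
    if action not in alphabet:
        return state
    return transitions.get((state, action))

def composed_traces(m1, m2, max_len=10):
    """All traces of m1 || m2 up to max_len. Shared actions need both
    components to move; the remaining actions interleave."""
    actions = sorted(m1[0] | m2[0])
    result = set()
    stack = [((), m1[2], m2[2])]
    while stack:
        trace, s1, s2 = stack.pop()
        result.add(trace)
        if len(trace) == max_len:
            continue
        for a in actions:
            n1, n2 = step(m1, s1, a), step(m2, s2, a)
            if n1 is not None and n2 is not None:
                stack.append((trace + (a,), n1, n2))
    return result

# Assumed one-shot models: M1 does in then send, M2 does send then out.
M1 = ({"in", "send"}, {(0, "in"): 1, (1, "send"): 2}, 0)
M2 = ({"send", "out"}, {(0, "send"): 1, (1, "out"): 2}, 0)

TRACES = composed_traces(M1, M2)
LONGEST = max(TRACES, key=len)
print(LONGEST)                          # ('in', 'send', 'out')
print(project(LONGEST, {"in", "out"}))  # ('in', 'out')
```

Because `send` is shared, M2 cannot start until M1 has performed `in`, which is why the composition has a single maximal trace in this example.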
Design/Code Level Analysis
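[Figure: at the design level, models M1 and M2 with assumption A and property P; at the code level, implementations C1 and C2 with the same assumption A and property P]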
Does M1 || M2 satisfy P? Model check.
Does C1 || C2 satisfy P?
Model check (ICSE’2004) – good results but may not scale
TEST!
Assume Guarantee Testing
[Figure: design models M1 and M2, implementations C1 and C2, property P over {in, out}, and assumption A over {out, send}; transitions are labeled in, send, out, i1, and i2]
Assume-Guarantee traces:
C1: in, send, i1
C2: out, send, i2
Monolithic - C1 || C2 traces:
in, out, send, i1, i2
in, out, send, i2, i1
out, in, send, i1, i2
out, in, send, i2, i1
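The four monolithic traces are exactly the interleavings of the two component traces that synchronize on the shared action send. A minimal sketch that reproduces them is given below; the function name `sync_interleavings` is hypothetical, and the assumption that send is the only shared action is read off the traces listed above.

```python
def sync_interleavings(t1, t2, shared):
    """All interleavings of traces t1 and t2 in which actions in `shared`
    are performed jointly and every other action is performed by one side."""
    if not t1 and not t2:
        return [()]
    result = []
    if t1 and t1[0] not in shared:            # local step of the first component
        result += [(t1[0],) + rest
                   for rest in sync_interleavings(t1[1:], t2, shared)]
    if t2 and t2[0] not in shared:            # local step of the second component
        result += [(t2[0],) + rest
                   for rest in sync_interleavings(t1, t2[1:], shared)]
    if t1 and t2 and t1[0] == t2[0] and t1[0] in shared:   # joint step
        result += [(t1[0],) + rest
                   for rest in sync_interleavings(t1[1:], t2[1:], shared)]
    return result

C1_TRACE = ("in", "send", "i1")
C2_TRACE = ("out", "send", "i2")
SYSTEM_TRACES = sync_interleavings(C1_TRACE, C2_TRACE, {"send"})
for trace in SYSTEM_TRACES:
    print(", ".join(trace))
```

Two component traces stand in for four system traces here. The gap grows quickly as components perform more local actions, since every ordering of independent local steps is a separate system trace.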
Discovering Bugs with Fewer Tests
[Figure: assumption A over {out, send} and property P over {in, out}]
Monolithic:
in, out, send, i1, i2
in, out, send, i2, i1
out, in, send, i1, i2
out, in, send, i2, i1
AG:
t1: in, send, i1
t2: out, send, i2
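One possible reading of this comparison is sketched below. The transcript does not preserve the transition structure of P and A, so the sketch assumes that P requires in to precede out and that A requires send to precede out; both encodings, and the `violates` helper, are hypothetical.

```python
# Properties as (alphabet, transitions) pairs with initial state 0.
# Both definitions are assumptions made for this sketch.
P = ({"in", "out"}, {(0, "in"): 1, (1, "out"): 0})      # assumed: in precedes out
A = ({"out", "send"}, {(0, "send"): 1, (1, "out"): 0})  # assumed: send precedes out

def violates(trace, prop):
    """True if the trace, restricted to prop's alphabet, takes a step
    that prop does not allow."""
    alphabet, transitions = prop
    state = 0
    for action in trace:
        if action not in alphabet:
            continue                      # action is invisible to the property
        if (state, action) not in transitions:
            return True
        state = transitions[(state, action)]
    return False

MONOLITHIC = [
    ("in", "out", "send", "i1", "i2"),
    ("in", "out", "send", "i2", "i1"),
    ("out", "in", "send", "i1", "i2"),
    ("out", "in", "send", "i2", "i1"),
]
T1 = ("in", "send", "i1")    # trace of C1
T2 = ("out", "send", "i2")   # trace of C2

# Monolithic testing checks system traces against P only.
FAILING_MONOLITHIC = [t for t in MONOLITHIC if violates(t, P)]
print(len(FAILING_MONOLITHIC), "of", len(MONOLITHIC), "system traces violate P")

# AG testing checks C2's trace against the assumption A.
print("t2 violates A:", violates(T2, A))
```

Under these assumptions only the system traces in which out happens to be scheduled before in expose the problem, whereas the single component trace t2 always violates A.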
AG Testing in Practice
Create testing environments for C1 and C2
for C1, universal environment restricted by assumption A
for C2, universal environment
Components can be checked separately as soon as they are code complete
More refined testing environments; fewer false positives
More control over component interface; easier to reproduce behavior and exercise more traces
Potential for checking more behaviors of the system with the same amount of coverage, as in our example
incompatible traces may not execute shared events in the same order
enough traces must be generated to ensure appropriate coverage
Open problem: what is an appropriate measure of component coverage?
potentially use component models as a coverage metric
Predictive Analysis
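[Figure: a trace t of C1 || C2 over in, out, send, i1, i2 is projected onto each component, giving t1 = t↾C1 = in, send, i1 and t2 = t↾C2 = out, send, i2; AG reasoning on t1 and t2 reports an error]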
Predictive Analysis
No problem with incompatible traces
AG approach is more efficient than existing predictive approaches that compose traces
If models are unavailable, can apply our assumption generation techniques to the projected traces (needs further experimentation)
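A rough sketch of the predictive idea: take one system run that satisfies P, project it onto each component, and apply the two premises of the simplest assume-guarantee rule to the projections. Everything concrete here is an assumption for illustration: the definitions of P and A, the component alphabets, the chosen run T, and all helper names.

```python
# Properties as (alphabet, transitions) pairs with initial state 0 (assumed).
P = ({"in", "out"}, {(0, "in"): 1, (1, "out"): 0})      # assumed: in precedes out
A = ({"out", "send"}, {(0, "send"): 1, (1, "out"): 0})  # assumed: send precedes out
ALPHA_C1 = {"in", "send", "i1"}                          # assumed alphabets
ALPHA_C2 = {"out", "send", "i2"}

def project(trace, alphabet):
    return [a for a in trace if a in alphabet]

def advance(lts, state, action):
    """Return (allowed, next_state); actions outside the alphabet are ignored."""
    alphabet, transitions = lts
    if action not in alphabet:
        return True, state
    if (state, action) in transitions:
        return True, transitions[(state, action)]
    return False, state

def trace_satisfies(trace, prop):
    """<true> trace <prop>: the trace alone never breaks prop."""
    state = 0
    for action in trace:
        ok, state = advance(prop, state, action)
        if not ok:
            return False
    return True

def trace_satisfies_under(trace, comp_alphabet, assumption, prop):
    """<assumption> trace <prop>: explore trace || assumption and check prop."""
    env_actions = assumption[0] - comp_alphabet   # steps only the environment takes
    seen, stack = set(), [(0, 0, 0)]              # (trace position, A state, P state)
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        i, a, p = node
        moves = [(e, i) for e in env_actions]
        if i < len(trace):
            moves.append((trace[i], i + 1))
        for action, j in moves:
            ok_a, a2 = advance(assumption, a, action)
            if not ok_a:
                continue                          # the assumption forbids this step
            ok_p, p2 = advance(prop, p, action)
            if not ok_p:
                return False
            stack.append((j, a2, p2))
    return True

T = ["in", "out", "send", "i1", "i2"]   # an observed run that satisfies P
T1, T2 = project(T, ALPHA_C1), project(T, ALPHA_C2)

RUN_OK = trace_satisfies(T, P)
PREMISE_1 = trace_satisfies_under(T1, ALPHA_C1, A, P)
PREMISE_2 = trace_satisfies(T2, A)
print("run satisfies P:", RUN_OK)
print("predicted violation:", not (PREMISE_1 and PREMISE_2))
```

The observed run is correct, yet the projection onto C2 breaks the assumption, so a violation is predicted for some other interleaving of the same component behaviors. The projections are never recomposed into system traces, which is the efficiency point made on the slide.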
Case Study: K9 Executive
[Figure: K9 Executive architecture, with labels Exec Timer, Plan Watcher, Alternate plan library, Event Queue, Internal, Executive, Action Execution, DBMonitor, Database, ExecCondChecker]
Executive component executes flexible plans
Branching on state / temporal conditions and support for floating contingencies
35K lines of C++ code
multiple threads that communicate through an event queue
Models created in collaboration with developers
assumptions generated at model level; several synchronization problems detected
compositional MC achieved a 10x improvement in space over monolithic MC
AG testing detected inconsistency between model and code
Testing Framework
[Figure: testing framework. A plan generator (JPF) produces input plans for the K9 Executive (same architecture as in the case study figure); the code is instrumented using the design-level assumptions and properties and emits an event stream to an observer (Eagle), which produces reports]
Related Work
Component-based design and verification
Mocha model checker (Alur et al.), interface automata (de Alfaro & Henzinger), thread-modular verification (Flanagan et al.)
Assume-guarantee model checking for source code
with Verisoft using manually provided assumptions (Dingel)
with JPF (Giannakopoulou, Pasareanu, Cobleigh) using automatically generated design-level assumptions
Specification-based testing
Jagadeesan et al., Raymond et al. use specifications (assumptions) to generate test inputs and (guarantees) to generate oracles
Grieskamp et al. use AsmL to generate FSMs that are traversed in different ways to generate test inputs
Predictive analysis, Sen et al.
Use of AG reasoning to combine runtime monitors, Levy et al.
Conclusions & Future Work
AG testing improves on traditional testing
detects violations of system requirements when testing individual components
predicts violations of system requirements based on correct system runs
Measure the benefits
come up with appropriate coverage criteria for component-based testing; when have individual components been tested enough to guarantee correctness of assembly?
potentially use models for definition of coverage