Automating Construction of Provably Correct Software


Your Wish is my Command
Automating Construction of Provably Correct Software
Viktor Kuncak
EPFL School of Computer and Communication Sciences
Laboratory for Automated Reasoning and Analysis
http://lara.epfl.ch
This Talk
wish (requirement)
  ↓ formalization
specification (constraint): C
  ↓ How to automatically transform specifications into implementations?
implementation (program): p
  ↓ conventional compilation
Command: 11011001 01011101
Example Wish: Sorting
input: 8900, 6000, 24140, 2900 (not sorted: 8900 > 6000)
output: 2900, 6000, 8900, 24140 (sorted: 2900 < 6000, 6000 < 8900, 8900 < 24140)
wish: Given a list of numbers, make this list sorted
Sorting Specification as a Program
def sort_spec(input : List, output : List) : Boolean =
  content(output) == content(input) && isSorted(output)
Specification (for us) is a program that checks, for a
given input, whether the given output is acceptable
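A minimal runnable sketch of this idea in plain Scala. The object name SortSpec, the use of scala.List[Int], and the set-based content with a strict isSorted (taken from the definitions shown later in the talk) are assumptions for illustration, not Leon's own library.

```scala
object SortSpec {
  // Abstraction: the set of elements stored in the list.
  def content(lst: List[Int]): Set[Int] = lst.toSet

  // Strictly increasing order, mirroring the x < y check used in the talk.
  def isSorted(lst: List[Int]): Boolean = lst match {
    case x :: (rest @ (y :: _)) => x < y && isSorted(rest)
    case _                      => true
  }

  // The specification is itself a program: it only checks a candidate output,
  // it does not say how to compute one.
  def sortSpec(input: List[Int], output: List[Int]): Boolean =
    content(output) == content(input) && isSorted(output)
}
```

Calling SortSpec.sortSpec on the input from the slide and a candidate output answers true or false, which is all a specification has to do.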
Specification vs Implementation
specification: constraint on the output (more behaviors)
def C(i : List, o : List) : Boolean =
  content(o)==content(i) && isSorted(o)
result: true / false
implementation: function that computes the output (fewer behaviors)
def p(i : List) : List =
  sort i using a sorting algorithm and return the result
input: 8900, 6000, 24140, 2900
output: 2900, 6000, 8900, 24140
p ⊆ C
Synthesizing Sort in Leon System
http://leon.epfl.ch
OOPSLA 2013:
Synthesis Modulo Recursive Functions
Etienne Kneuss Ivan Kuraj
Philippe Suter
Example Results
Techniques used:
– Leon’s verification capabilities
– synthesis for theory of trees
– recursion schemas
– case splitting
– symbolic exploration of the space of programs
– synthesis based on type inhabitation
– fast falsification using previous counterexamples
– learning conditional expressions
– cost-based search over possible synthesis steps
Approaches and Their Guarantees
both specification C and program p are given:
a) Check assertion while program p runs: C(i,p(i)) (run-time)
b) Verify whether program always meets the spec: ∀i. C(i,p(i)) (compile-time)
only specification C is given:
c) Constraint programming: once i is known, find o to satisfy a given constraint: find o such that C(i,o) (run-time)
d) Synthesis: solve C symbolically to obtain program p that is correct by construction, for all inputs: find p such that ∀i. C(i,p(i)), i.e. p ⊆ C (compile-time)
Runtime Assertion Checking
a) Check assertion while program p runs: C(i,p(i))
def p(i : List) : List = {
  sort i using a sorting algorithm and return the result
} ensuring (o ⇒ content(i)==content(o) && isSorted(o))
def content(lst : List) = lst match {
  case Nil() ⇒ Set.empty
  case Cons(x, xs) ⇒ Set(x) ++ content(xs)
}
def isSorted(lst : List) = lst match {
  case Nil() ⇒ true
  case Cons(_, Nil()) ⇒ true
  case Cons(x, Cons(y, ys)) ⇒ x < y && isSorted(Cons(y, ys))
}
Already works in Scala!
Key design decision: constraints are programs
Ongoing: high-level optimization of run-time checks
Can we give stronger guarantees? Prove that the postcondition is always true.
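A small sketch of run-time checking with Scala's standard `ensuring`, assuming insertion sort as the implementation. The object name RuntimeChecked and the helper insert are hypothetical; duplicates are dropped because the specification compares element sets and asks for strictly increasing output.

```scala
object RuntimeChecked {
  def content(lst: List[Int]): Set[Int] = lst.toSet

  def isSorted(lst: List[Int]): Boolean = lst match {
    case x :: (rest @ (y :: _)) => x < y && isSorted(rest)
    case _                      => true
  }

  // Insert x into an already sorted list. A value that is already present
  // is not inserted again, so the result stays strictly increasing.
  def insert(x: Int, sorted: List[Int]): List[Int] = sorted match {
    case Nil => List(x)
    case y :: ys =>
      if (x < y) x :: sorted
      else if (x == y) sorted
      else y :: insert(x, ys)
  }

  // Insertion sort whose postcondition is evaluated on every call;
  // a violated postcondition raises an AssertionError.
  def p(i: List[Int]): List[Int] = {
    i.foldLeft(List.empty[Int])((acc, x) => insert(x, acc))
  } ensuring (o => content(i) == content(o) && isSorted(o))
}
```

The check costs a traversal per call, which is the motivation for proving the postcondition once, statically.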
Static Verification in Leon
b) Verify that program always meets spec: ∀i. C(i,p(i))
def p(i : List) : List = {
  sort i using a sorting algorithm and return the result
} ensuring (o ⇒ content(i)==content(o) && isSorted(o))
def content(lst : List) = lst match {
  case Nil() ⇒ Set.empty
  case Cons(x, xs) ⇒ Set(x) ++ content(xs)
}
def isSorted(lst : List) = lst match {
  case Nil() ⇒ true
  case Cons(_, Nil()) ⇒ true
  case Cons(x, Cons(y, ys)) ⇒ x < y && isSorted(Cons(y, ys))
}
Type in a Scala program and spec, see it verified
Possible outcomes:
– proof of ∀i. C(i,p(i))
– input i such that not C(i,p(i))
– timeout
Insertion Sort Verified as You Type It

Web interface: http://lara.epfl.ch/leon
Reported Counterexample in Case of a Bug
Using Assertions to Compute
• what to do when an assertion fails?
– presumably some values are wrong
– what to change? (e.g. in repair)
• alternative: leave some variables unknown
(logical variables); find their values to satisfy
the assertions: constraint programming
• like CLP, but
– richer constraints
– new compilation techniques (synthesis)
– embedded in Scala, no Prolog with "cut"
Programming with Specifications
c) Constraint programming: find a value that
satisfies a given constraint: find o such that C(i,o)
Method: use verification technology, try to prove
that no such o exists, report counter-examples!
Philippe Suter Ali Sinan Köksal
Etienne Kneuss
Sorting a List Using Specifications
def content(lst : List) = lst match {
  case Nil() ⇒ Set.empty
  case Cons(x, xs) ⇒ Set(x) ++ content(xs)
}
def isSorted(lst : List) = lst match {
  case Nil() ⇒ true
  case Cons(_, Nil()) ⇒ true
  case Cons(x, Cons(y, ys)) ⇒ x < y && isSorted(Cons(y,ys))
}
((lst : List) ⇒ isSorted(lst) && content(lst) == Set(0, 1, -3)).solve
> Cons(-3, Cons(0, Cons(1, Nil())))
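To make the meaning of solving a constraint concrete, here is a brute-force stand-in in plain Scala. It is not how Leon's `.solve` works (the talk describes a search driven by verification technology); the object SolveSketch and its solve method are hypothetical and simply enumerate candidate lists.

```scala
object SolveSketch {
  def content(lst: List[Int]): Set[Int] = lst.toSet

  def isSorted(lst: List[Int]): Boolean = lst match {
    case x :: (rest @ (y :: _)) => x < y && isSorted(rest)
    case _                      => true
  }

  // Hypothetical stand-in for `.solve`: try every ordering of the given
  // values and return the first list that satisfies the constraint.
  def solve(values: Set[Int], constraint: List[Int] => Boolean): Option[List[Int]] =
    values.toList.permutations.find(constraint)
}
```

With the constraint from the slide, SolveSketch.solve(Set(0, 1, -3), l => isSorted(l) && content(l) == Set(0, 1, -3)) returns the sorted list, found purely from the description of what a good answer looks like.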
Comparison: Date Conversion in C
Knowing number of days since 1980, find current year and day
BOOL ConvertDays(UINT32 days) {
year = 1980;
while (days > 365) {
if (IsLeapYear(year)) {
if (days > 366) {
days -= 366;
year += 1;
}
} else {
days -= 365;
year += 1;
} ...
}
Enter December 31, 2008
All music players (of a major brand)
froze in the boot sequence.
Date Conversion using Specifications
Knowing number of days since 1980, find current year and day
val origin = 1980 // beginning of the universe
def leapsTill(y : Int) = (y-1)/4 - (y-1)/100 + (y-1)/400
val (year, day)=choose( (year:Int, day:Int) => {
days == (year-origin)*365 + leapsTill(year)-leapsTill(origin) + day &&
0 < day && day <= 366
}) // Choose year and day such that the property holds.
• We did not write how to compute year and day
• Instead, we gave a constraint they should satisfy
• We defined them implicitly, through this constraint
• More freedom (can still do it the old way, if needed)
• Correctness, termination simpler than with loop
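The constraint can be exercised directly in plain Scala. The sketch below replaces `choose` with a brute-force enumeration (an assumption for illustration; the names DateSpec, spec and solutions are hypothetical) and lists every pair that satisfies the property.

```scala
object DateSpec {
  val origin = 1980

  def leapsTill(y: Int): Int = (y - 1) / 4 - (y - 1) / 100 + (y - 1) / 400

  // The property from the slide, as an executable check.
  def spec(days: Int, year: Int, day: Int): Boolean =
    days == (year - origin) * 365 + leapsTill(year) - leapsTill(origin) + day &&
      0 < day && day <= 366

  // Brute-force stand-in for choose: every (year, day) that satisfies spec.
  def solutions(days: Int): List[(Int, Int)] =
    (for {
      year <- origin to origin + days / 365 + 1
      day  <- 1 to 366
      if spec(days, year, day)
    } yield (year, day)).toList
}
```

Under this property, days = 10593 has the single answer (2008, 366), the last day of a leap year, and the enumeration terminates by construction. It also shows that the property as written is loose for some inputs: days = 732 is satisfied by both (1981, 366) and (1982, 1), because day <= 366 is allowed in every year.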
Implementation:
next 30 pages
invariants specification
Formalizing Tree Invariants (in Scala)
sealed abstract class Tree
case class Empty() extends Tree
case class Node(color: Color, left: Tree, value: Int, right: Tree) extends Tree
def blackBalanced(t : Tree) : Boolean = t match {
case Node(_,l,_,r) => blackBalanced(l) && blackBalanced(r) &&
blackHeight(l) ==blackHeight(r)
case Empty() => true
}
def blackHeight(t : Tree) : Int = t match {
case Empty() => 1
case Node(Black(), l, _, _) => blackHeight(l) + 1
case Node(Red(), l, _, _) => blackHeight(l)
}
def rb(t: Tree) : Boolean = t match {
case Empty() => true
case Node(Black(), l, _, r) => rb(l) && rb(r)
case Node(Red(), l, _, r) => isBlack(l) && isBlack(r) && rb(l) && rb(r)
}
... def isSorted(t:Tree) = ...
Define Abstraction as ‘tree fold’
def content(t: Tree) : Set[Int] = t match {
case Empty() => Set.empty
case Node(_, l, v, r) => content(l) ++ Set(v) ++ content(r)
}
Example: a tree holding the values 7, 4, 2, 9, 5 has content { 2, 4, 5, 7, 9 }
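A self-contained sketch of the abstraction function together with two of the invariants from the previous slide. The Color objects, the use of case objects for Empty, Red and Black, and the shape and coloring of the example tree are assumptions; only the five values come from the slide.

```scala
object RedBlackSketch {
  sealed abstract class Color
  case object Red extends Color
  case object Black extends Color

  sealed abstract class Tree
  case object Empty extends Tree
  case class Node(color: Color, left: Tree, value: Int, right: Tree) extends Tree

  // Abstraction as a tree fold: forget shape and colors, keep the elements.
  def content(t: Tree): Set[Int] = t match {
    case Empty            => Set.empty
    case Node(_, l, v, r) => content(l) ++ Set(v) ++ content(r)
  }

  // Number of black nodes on the leftmost path, counting the empty leaf.
  def blackHeight(t: Tree): Int = t match {
    case Empty                => 1
    case Node(Black, l, _, _) => blackHeight(l) + 1
    case Node(Red, l, _, _)   => blackHeight(l)
  }

  def blackBalanced(t: Tree): Boolean = t match {
    case Empty => true
    case Node(_, l, _, r) =>
      blackBalanced(l) && blackBalanced(r) && blackHeight(l) == blackHeight(r)
  }

  // One possible tree over the values 7, 4, 2, 9, 5 (shape assumed).
  val example: Tree =
    Node(Black,
      Node(Red, Node(Black, Empty, 2, Empty), 4, Node(Black, Empty, 5, Empty)),
      7,
      Node(Black, Empty, 9, Empty))
}
```

Many differently shaped trees map to the same set, which is why a specification stated over content leaves the implementation free to rebalance.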
We can now define insertion
def insert(x : Int, t : Tree) = choose((t1 : Tree) =>
  isRBT(t1) && content(t1) == content(t) ++ Set(x))
Objection: it took a lot of effort to write isRBT
Answer:
• no more effort than implementation - wrote some functions
• these invariants are what drive data structure design
• this is how things are explained in a textbook
• it promotes reuse!
Evolving the Program
Suppose we have a red-black tree implementation
We only implemented ‘insert’ and ‘lookup’
Now we also need to implement ‘remove’
void RBDelete(rb_red_blk_tree* tree, rb_red_blk_node* z){
rb_red_blk_node* y;
rb_red_blk_node* x;
rb_red_blk_node* nil=tree->nil;
rb_red_blk_node* root=tree->root;
y= ((z->left == nil) || (z->right == nil)) ? z : TreeSuccessor(tree,z);
x= (y->left == nil) ? y->right : y->left;
if (root == (x->parent = y->parent)) { /* assignment of y->p to x->p is intentional */
root->left=x;
} else {
if (y == y->parent->left) {
y->parent->left=x;
} else {
y->parent->right=x;
}
}
if (y != z) { /* y should not be nil in this case */
#ifdef DEBUG_ASSERT
Assert( (y!=tree->nil),"y is nil in RBDelete\n");
#endif
/* y is the node to splice out and x is its child */
if (!(y->red)) RBDeleteFixUp(tree,x);
tree->DestroyKey(z->key);
tree->DestroyInfo(z->info);
y->left=z->left;
y->right=z->right;
y->parent=z->parent;
y->red=z->red;
z->left->parent=z->right->parent=y;
if (z == z->parent->left) {
z->parent->left=y;
} else {
z->parent->right=y;
}
free(z);
} else {
tree->DestroyKey(y->key);
tree->DestroyInfo(y->info);
if (!(y->red)) RBDeleteFixUp(tree,x);
free(y);
}
#ifdef DEBUG_ASSERT
Assert(!tree->nil->red,"nil not black in RBDelete");
#endif
void RBDeleteFixUp(rb_red_blk_tree* tree, rb_red_blk_node* x) {
rb_red_blk_node* root=tree->root->left;
rb_red_blk_node* w;
while( (!x->red) && (root != x)) {
if (x == x->parent->left) {
w=x->parent->right;
if (w->red) {
w->red=0;
x->parent->red=1;
LeftRotate(tree,x->parent);
w=x->parent->right;
}
if ( (!w->right->red) && (!w->left->red) ) {
w->red=1;
x=x->parent;
} else {
if (!w->right->red) {
w->left->red=0;
w->red=1;
RightRotate(tree,w);
w=x->parent->right;
}
w->red=x->parent->red;
x->parent->red=0;
w->right->red=0;
LeftRotate(tree,x->parent);
x=root; /* this is to exit while loop */
}
} else { /* the code below is has left and right switched from above */
w=x->parent->left;
if (w->red) {
w->red=0;
x->parent->red=1;
RightRotate(tree,x->parent);
w=x->parent->left;
}
if ( (!w->right->red) && (!w->left->red) ) {
w->red=1;
x=x->parent;
} else {
if (!w->left->red) {
w->right->red=0;
w->red=1;
LeftRotate(tree,w);
w=x->parent->left;
}
w->red=x->parent->red;
x->parent->red=0;
w->left->red=0;
RightRotate(tree,x->parent);
x=root; /* this is to exit while loop */
}
}
}
x->red=0;
#ifdef DEBUG_ASSERT
140 lines of tricky C, even
reusing existing functions
remove using specifications: 2 lines
def remove(x : Int, t : Tree) = choose((t1 : Tree) =>
  isRBT(t1) && content(t1) == content(t) -- Set(x))
The biggest expected payoff:
properties are more reusable
Further Features Supported
• computing a minimal / maximal solution of
constraints using binary search
• on-the-fly construction of constraints
– first-class constraints, like first-class functions
– but they can also be syntactically manipulated
• enumeration of all values that satisfy constraint
– application in automated test input generation
– can be used like Korat tool for test generation
Implicit Programming (ERC project)
specification (constraint), implicit: x² + y² = 1
  ↓ synthesis
implementation (function), explicit: y = sqrt(1 - x²)
Propositional case: i is an assignment for some variables of a propositional formula, and o is its completion that makes the formula true; the implementation computes the missing part of a satisfying assignment (SAT).
Synthesis for Linear Arithmetic
def secondsToTime(totalSeconds: Int) : (Int, Int, Int) =
  choose((h: Int, m: Int, s: Int) ⇒ (
    h * 3600 + m * 60 + s == totalSeconds
    && h ≥ 0              // could infer from types
    && m ≥ 0 && m < 60
    && s ≥ 0 && s < 60 ))
(the specification is close to a wish)
def secondsToTime(totalSeconds: Int) : (Int, Int, Int) =
  val t1 = totalSeconds div 3600
  val t2 = totalSeconds - 3600 * t1
  val t3 = t2 div 60
  val t4 = totalSeconds - 3600 * t1 - 60 * t3
  (t1, t3, t4)
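The synthesized code can be checked against its specification in plain Scala. This sketch assumes non-negative totalSeconds (Scala's / truncates toward zero, which coincides with div only in that case); the object name SecondsToTime and the spec function are introduced for illustration.

```scala
object SecondsToTime {
  // The synthesis predicate, as an executable check.
  def spec(totalSeconds: Int, h: Int, m: Int, s: Int): Boolean =
    h * 3600 + m * 60 + s == totalSeconds &&
      h >= 0 && m >= 0 && m < 60 && s >= 0 && s < 60

  // The straight-line code from the slide, with div written as /.
  def secondsToTime(totalSeconds: Int): (Int, Int, Int) = {
    val t1 = totalSeconds / 3600
    val t2 = totalSeconds - 3600 * t1
    val t3 = t2 / 60
    val t4 = totalSeconds - 3600 * t1 - 60 * t3
    (t1, t3, t4)
  }
}
```

For example, SecondsToTime.secondsToTime(3725) yields (1, 2, 5), and the result satisfies spec without any search at run time.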
Compile-time warnings
def secondsToTime(totalSeconds: Int) : (Int, Int, Int) =
choose((h: Int, m: Int, s: Int) ⇒ (
h * 3600 + m * 60 + s == totalSeconds
&& h ≥ 0
&& m ≥ 0 && m < 60
&& s ≥ 0 && s ≤ 60
))
Warning: Synthesis predicate has multiple
solutions for variable assignment:
totalSeconds = 60
Solution 1: h = 0, m = 0, s = 60
Solution 2: h = 0, m = 1, s = 0
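The ambiguity in the warning can be reproduced by enumeration. The sketch assumes the loosened bound is s ≤ 60 (the variant that admits both reported solutions); the names AmbiguityCheck, relaxedSpec and solutions are hypothetical, and a real synthesizer would detect this symbolically at compile time rather than by trying values.

```scala
object AmbiguityCheck {
  // Predicate with the upper bound on s loosened from s < 60 to s <= 60.
  def relaxedSpec(total: Int, h: Int, m: Int, s: Int): Boolean =
    h * 3600 + m * 60 + s == total &&
      h >= 0 && m >= 0 && m < 60 && s >= 0 && s <= 60

  // All (h, m, s) accepted by the relaxed predicate for a given input.
  def solutions(total: Int): List[(Int, Int, Int)] =
    (for {
      h <- 0 to total / 3600
      m <- 0 to 60
      s <- 0 to 60
      if relaxedSpec(total, h, m, s)
    } yield (h, m, s)).toList
}
```

For totalSeconds = 60 the enumeration finds exactly the two solutions named in the warning, while an input such as 59 still has a single answer.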
Synthesis for sets (BAPA)
def splitBalanced[T](s: Set[T]) : (Set[T], Set[T]) =
  choose((a: Set[T], b: Set[T]) ⇒ (
    a.size - b.size ≤ 1 &&                      // balanced
    b.size - a.size ≤ 1 &&
    a union b == s && a intersect b == empty    // partition
  ))
we can conjoin specs
def splitBalanced[T](s: Set[T]) : (Set[T], Set[T]) =
  val k = ((s.size + 1)/2).floor
  val t1 = k
  val t2 = s.size - k
  val s1 = take(t1, s)
  val s2 = take(t2, s minus s1)
  (s1, s2)
(Diagram: the set s split into two parts a and b.)
Mikael Mayer, Ruzica Piskac, Philippe Suter
Synthesis for Theories
assert(i % 2 == 1)
3 i + 2 o = 13
o = (13 – 3 i)/2
• Wanted: "Gaussian elimination" for programs
– for linear integer equations: extended Euclid’s algorithm
– need to handle disjunctions, negations, more data types
• For every formula in e.g. Presburger arithmetic
– synthesis algorithm terminates
– produces the most general precondition
(assertion characterizing when the result exists)
– generated code always terminates and gives correct result
• If there are multiple or no solutions for some input
parameters, the algorithm identifies those inputs
• Works not only for arithmetic but also for e.g.
sets with sizes and for trees
• Goal: lift everything done for SMT solvers to synthesizers
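The small example at the top of this slide, 3 i + 2 o = 13, can be written out as precondition plus witness term. This is a hand-written sketch of what such a synthesis result looks like, not output of the tool; the object LinearEquation and its method names are assumptions, and Math.floorMod is used so that the parity test also works for negative i.

```scala
object LinearEquation {
  // Most general precondition: an integer o exists exactly when i is odd.
  def pre(i: Int): Boolean = Math.floorMod(i, 2) == 1

  // Witness term for o; the division is exact whenever pre(i) holds.
  def solve(i: Int): Int = {
    require(pre(i), "3*i + 2*o = 13 has no integer solution for even i")
    (13 - 3 * i) / 2
  }
}
```

The generated code is loop-free, so it terminates, and the precondition identifies the inputs for which no result exists.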
Decision & Synthesis Procedures
For a well-defined class of formulas:
Decision procedure (model-generating): a theorem prover that always succeeds
• Input: a formula
• Output: a model of the formula
Example: 5a + 7x = 31 gives a ↦ 2, x ↦ 3
Synthesis procedure: a synthesizer that always succeeds
• Input: a formula, with input and output variables
• Output: a program to compute output values from input values
Example: inputs: { a }, outputs: { x }; 5a + 7x = 31 gives x ↦ (31 - 5a) / 7
Framework: Transforming Relations
⟦ a̅ ⟨ C1 ⟩ x̅ ⟧ ⊦ ⟦ a̅ ⟨ C2 ⟩ x̅ ⟧
a̅: input variables, C: synthesis predicate, x̅: output variables
Refinement: ∀ a̅, x̅. C2 ⇒ C1
Domain preservation: ∀ a̅. (∃ x̅ : C1) ⇒ (∃ x̅ : C2)
Programs as Relations
⟦ a̅ ⟨ C ⟩ x̅ ⟧ ⊦ ⟨ P | T̅ ⟩
a̅: input variables, x̅: output variables, P: precondition, T̅: program terms
⟨ P | T̅ ⟩ represents the relation: P ∧ (x̅ = T̅)
Refinement: ∀ a̅. P ⇒ C[x̅ ↦ T̅]
Domain preservation: ∀ a̅. (∃ x̅ : C) ⇒ P
Compare to Quantifier Elimination
• A problem of the form ⟦ a̅ ⟨ C ⟩ x̅ ⟧ corresponds to constructively solving the quantifier elimination problem ∃ x̅ : C( a̅, x̅ )
• In the solution ⟨ P | T̅ ⟩, P corresponds to the result of Q.E. and T̅ are witness terms.
Transforming Relations
Equivalence
  premises: ⟦ a̅ ⟨ C1 ⟩ x̅ ⟧ ⊦ ⟨ P | T̅ ⟩ and C1 ⇔ C2
  conclusion: ⟦ a̅ ⟨ C2 ⟩ x̅ ⟧ ⊦ ⟨ P | T̅ ⟩
Case-Split
  premises: ⟦ a̅ ⟨ C1 ⟩ x̅ ⟧ ⊦ ⟨ P1 | T̅1 ⟩ and ⟦ a̅ ⟨ C2 ⟩ x̅ ⟧ ⊦ ⟨ P2 | T̅2 ⟩
  conclusion: ⟦ a̅ ⟨ C1 ∨ C2 ⟩ x̅ ⟧ ⊦ ⟨ P1 ∨ P2 | if(P1) T̅1 else T̅2 ⟩
One-Point
  premises: ⟦ a̅ ⟨ C[x0↦t] ⟩ x̅ ⟧ ⊦ ⟨ P | T̅ ⟩ and x0 ∉ vars(t)
  conclusion: ⟦ a̅ ⟨ x0 = t ∧ C ⟩ x0;x̅ ⟧ ⊦ ⟨ P | let x̅ := T̅ in t;x̅ ⟩
Synthesis for Linear Integer Arithmetic
⟦ a ⟨ 5x + 7y = a ∧ 0 ≤ x ∧ x ≤ y ⟩ x,y ⟧ ⊦ ⟨ ⌈5a/12⌉ ≤ ⌊3a/7⌋ | let t = ⌈5a/12⌉ in (-7t+3a, 5t-2a) ⟩
5x + 7y = a has a one-dimensional solution space: x = -7t + 3a, y = 5t - 2a is a solution for any t.
Substituting into 0 ≤ x ∧ x ≤ y gives 7t ≤ 3a ∧ 5a ≤ 12t:
⟦ a ⟨ 7t ≤ 3a ∧ 5a ≤ 12t ⟩ t ⟧ ⊦ ⟨ ⌈5a/12⌉ ≤ ⌊3a/7⌋ | ⌈5a/12⌉ ⟩
t is bounded on both sides, and admits a solution whenever ⌈5a/12⌉ ≤ ⌊3a/7⌋.
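The result of this derivation can be transcribed into plain Scala and tested. The sketch below is a manual transcription of the precondition and witness terms, not generated code; the object LinSynth and the helper ceilDiv are assumptions, with floor and ceiling implemented through Math.floorDiv.

```scala
object LinSynth {
  // Ceiling of n / d for positive d, built from floorDiv.
  private def ceilDiv(n: Int, d: Int): Int = -Math.floorDiv(-n, d)

  // Precondition: ceil(5a/12) <= floor(3a/7).
  def pre(a: Int): Boolean = ceilDiv(5 * a, 12) <= Math.floorDiv(3 * a, 7)

  // Witness terms for (x, y) with 5x + 7y = a and 0 <= x <= y, when they exist.
  def solve(a: Int): Option[(Int, Int)] =
    if (!pre(a)) None
    else {
      val t = ceilDiv(5 * a, 12)
      Some((-7 * t + 3 * a, 5 * t - 2 * a))
    }
}
```

For a = 12 this gives (1, 1), for a = 7 it gives (0, 1), and for a = 5 the precondition fails, matching the fact that 5x + 7y = 5 has no solution with 0 ≤ x ≤ y.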
And/Or Search for Rule Applications
(Diagram: and/or search tree whose nodes are synthesis problems and rule applications.)
Driven by cost
- For rule applications: size of term contributed to program.
- For (sub)problems: estimate based on variables and boolean structure.
Synthesis in http://lara.epfl.ch/leon
Techniques used:
– Leon’s verification capabilities
– synthesis for theory of trees
– recursion schemas
– case splitting
– symbolic exploration of the space of programs
– synthesis based on type inhabitation
– fast falsification using previous counterexamples
– learning conditional expressions
– cost-based search over possible synthesis steps
Generating Expression Terms
• What do we do with problems that:
– do not fall in a well-defined, synthesizable subset,
– do not get simplified by decomposition?
• Use counter-example guided inductive
synthesis to search over small expressions
• Two algorithms
– use SMT solvers to enumerate terms and evaluate
them to find new blocking clauses
– type-based enumeration (Gvero,Piskac,Kuraj)
combined with discovery of preconditions
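The counter-example guided loop can be illustrated on a toy problem. Everything below is a simplification under stated assumptions: the candidate space is a fixed hand-written list rather than an enumeration by an SMT solver or by types, and the verifier is a bounded exhaustive check over -50 to 50 instead of a prover. The names CegisSketch, candidates, verify and synthesize are hypothetical.

```scala
object CegisSketch {
  // Specification C(x, o): o is the absolute value of x.
  def spec(x: Int, o: Int): Boolean = o >= 0 && (o == x || o == -x)

  // A tiny space of candidate expressions, ordered roughly by size.
  val candidates: List[(String, Int => Int)] = List(
    "x"                    -> ((x: Int) => x),
    "-x"                   -> ((x: Int) => -x),
    "0"                    -> ((_: Int) => 0),
    "x + 1"                -> ((x: Int) => x + 1),
    "if (x < 0) -x else x" -> ((x: Int) => if (x < 0) -x else x)
  )

  // Stand-in for the verifier: returns a counterexample input, if any.
  def verify(f: Int => Int): Option[Int] =
    (-50 to 50).find(x => !spec(x, f(x)))

  // Pick the first candidate consistent with all counterexamples seen so far,
  // verify it, and either return it or record the new counterexample.
  def synthesize(): Option[String] = {
    def loop(examples: List[Int]): Option[String] =
      candidates.find { case (_, f) => examples.forall(x => spec(x, f(x))) } match {
        case None => None
        case Some((name, f)) =>
          verify(f) match {
            case None      => Some(name)
            case Some(cex) => loop(cex :: examples)
          }
      }
    loop(Nil)
  }
}
```

Each counterexample rules out the candidate that produced it, so the loop terminates; cheap evaluation on stored counterexamples filters most candidates before the expensive verification step is attempted.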
Synthesis and Constraint Solving
If we did not find an expression that solves it in
all cases, we emit a runtime call to solver
Result: solver invoked only in some cases
– for some components of result
– for some conditions on inputs
after timeout, close the remaining branches by inserting a runtime solver call
(Diagram: the and/or search tree with its unexplored branches closed by solver calls.)
Example Data Structure with Cache
case class CTree(cache : Int, data : Tree)
def inv(ct : CTree) : Boolean = isRBT(ct.data) &&
  (ct.data == Empty() || content(ct.data) contains ct.cache)
def member(v : Int, ct : CTree) : Boolean = {
  require(inv(ct))
  choose( (x:Boolean) => x == (content(ct.data) contains v)) }
ADT and equality split, one point rule, simplifications
def member(v : Int, ct : CTree) : Boolean = { require(inv(ct))
  ct.data match {
    case n:Node => if (ct.cache == v) true
      else choose( (x:Boolean) => x == (content(ct.data) contains v))
    case Empty() => false } }
Synthesis did not solve the problem fully, but optimized the spec for 2 common cases
From In-Memory to External Sorting
treeFold[2k]([],
unfoldR(
funcPow[k](mrg))
in-memory sort
external 2k-way
merge sort with blocking
C implementation
• transformation rules for monad algebra of nested sequences
• exploration of equivalent algorithms through
performance estimation w/ non-linear constraint solving
Synthesis of Out-of-Core Algorithms (SIGMOD 2013)
Ioannis Klonatos
Christoph Koch
Andres Nötzli
Andrej Spielmann
Real-World Reasoning
Gap between floating points and reality
– input measurement error
– floating-point round-off error
– numerical method error
x<y need not mean x*<y*
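A two-line illustration of that last point, using IEEE double precision as implemented by Scala's Double. The object name RoundOff is an assumption; the values are chosen so that the real-number ordering is lost to round-off.

```scala
object RoundOff {
  // In real arithmetic 1e16 < 1e16 + 1, but doubles near 1e16 are spaced
  // 2 apart, so the sum is rounded back to 1e16.
  val x: Double = 1e16
  val y: Double = 1e16 + 1.0

  // The comparison on floating-point values disagrees with the one on reals.
  val strictlyLess: Boolean = x < y
}
```

Here RoundOff.strictlyLess is false even though the ideal values satisfy x < y, which is the kind of discrepancy an error-bound analysis has to account for.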
Automated verification tools to
• compute upper error bound
• generate code to match math
Applied to code fragments for
• embedded systems (car,train)
• physics simulations
OOPSLA'11,RV'12, EMSOFT'13,
POPL'14
Eva Darulova
wish (requirement)
  ↓ formalization: Can we help with designing specifications themselves, to make programming accessible to non-experts?
specification (constraint): C
  ↓
implementation (program): p
  ↓ conventional compilation
Command: 11011001 01011101
Programming by Demonstration
Describe functionality by demonstrating and modifying
behaviors while the program runs
– demonstrate desired actions by moving back in time and
referring to past events
– system generalizes demonstrations into rules
http://www.youtube.com/watch?v=bErU--8GRsQ
Try "Pong Designer" in Android Play Store
Mikael Mayer and Lomig Mégard
SPLASH Onward'13