Notes on Cyclone
Extended Static Checking
Greg Morrisett
Harvard University
Static Extended Checking: SEX-C
Similar approach to ESC-M3/Java:
• Calculate a 1st-order predicate describing
the machine state at each program point.
• Generate verification conditions (VCs)
corresponding to run-time checks.
• Feed VCs to a theorem prover.
• Only insert a check (and issue a warning) if the prover can't show the VC is true (this decision is sketched below).
• Key goal: scale well (like type-checking) so it can be used on every edit-compile-debug cycle.
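As a rough illustration of that last decision, here is a minimal OCaml sketch; the vc and stmt types and the prove interface are hypothetical stand-ins, not the compiler's actual representations.

(* Hypothetical sketch: keep or drop a run-time check based on its VC. *)
type vc = string                          (* a verification condition *)
type stmt = { text : string; check : vc option }

let compile ~(prove : vc -> bool) (s : stmt) : string list =
  match s.check with
  | None -> [ s.text ]                    (* no run-time check needed *)
  | Some v when prove v -> [ s.text ]     (* VC proved: drop the check *)
  | Some v ->
      Printf.eprintf "warning: cannot prove %s\n" v;
      [ "check(" ^ v ^ ");"; s.text ]     (* keep the check and warn *)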
Example: strcpy
strcpy(char ?d, char ?s)
{
  while (*s != 0) {
    *d = *s;
    s++;
    d++;
  }
  *d = 0;
}
Run-time checks are inserted to ensure that s and d are not NULL and stay in bounds.
Six words are passed in instead of two (each ? fat pointer carries bounds information along with the pointer).
Better
strcpy(char ?d, char ?s)
{
  unsigned i, n = numelts(s);
  assert(n < numelts(d));
  for (i=0; i < n && s[i] != 0; i++)
    d[i] = s[i];
  d[i] = 0;
}
This ought to have no run-time checks
beyond the assert.
Even Better:
strncpy(char *d, char *s, uint n)
  @assert(n < numelts(d) &&
          n <= numelts(s))
{
  unsigned i;
  for (i=0; i < n && s[i] != 0; i++)
    d[i] = s[i];
  d[i] = 0;
}
No fat pointers or dynamic checks.
But the caller must statically satisfy the pre-condition.
In Practice:
strncpy(char *d, char *s, uint n)
  @checks(n < numelts(d) &&
          n <= numelts(s))
{
  unsigned i;
  for (i=0; i < n && s[i] != 0; i++)
    d[i] = s[i];
  d[i] = 0;
}
If the caller can establish the pre-condition, there is no check.
Otherwise, an implicit check is inserted.
Clearly, checks are a limited class of assertions.
Results so far…
For the 165 files (~78 Kloc) that make up the
standard libraries and compiler:
• CLibs: stdio, string, …
• CycLib: list, array, splay, dict, set, bignum, …
• Compiler: lex, parse, typing, analyze, xlate to C,…
Eliminated 96% of the (static) checks:
• null:   33,121 out of 34,437 (96%)
• bounds: 13,402 out of 14,022 (95%)
• 225s for bootstrap, compared to 221s with all checks turned off (2% slower), on this laptop.
From an optimization standpoint, this seems pretty good.
Scaling
[Chart: LOC vs. Compile Time — compile time in seconds (up to ~12) plotted against lines of Cyclone code (up to ~10,000).]
Not all Rosy:
We don't do as well on array-intensive code.
For instance, on the AES reference implementation:
• only 75% of the checks are eliminated (377 out of 504)
• 2% slower than with all checks turned off
• 24% slower than the original C code
(most of the overhead is fat pointers)
The primary culprit:
• we are very conservative about arithmetic
• e.g., x[2*i+1] will throw us off every time.
Challenges
Assumed I could use off-the-shelf technology.
But ran into a few problems:
• scalable VC generation
  • a previously solved problem (see the ESC work)
  • but entertaining to rediscover the solutions.
• usable theorem provers
  • for now, we rolled our own
  • (not the real focus.)
Verification-Condition Generation
We started with textbook strongest postconditions:

SP[x := e] A = A[a/x] ∧ x = e[a/x]   (a fresh)

SP[S1; S2] A = SP[S2] (SP[S1] A)

SP[if (e) S1 else S2] A =
  SP[S1](A ∧ e≠0) ∨ SP[S2](A ∧ e=0)
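To make the equations concrete, here is a minimal OCaml rendering over a tiny imperative language; the datatypes are illustrative, not the compiler's.

(* Textbook SP over a toy language of assignments, sequences, and ifs. *)
type exp = Var of string | Int of int

type assn =
  | True
  | Eq of exp * exp
  | NonZero of exp
  | Zero of exp
  | And of assn * assn
  | Or of assn * assn

type stmt =
  | Assign of string * exp
  | Seq of stmt * stmt
  | If of exp * stmt * stmt

(* e[a/x]: rename program variable x to a logical variable a *)
let sub_exp x a = function Var y when y = x -> Var a | e -> e

let rec sub x a = function
  | True -> True
  | Eq (e1, e2) -> Eq (sub_exp x a e1, sub_exp x a e2)
  | NonZero e -> NonZero (sub_exp x a e)
  | Zero e -> Zero (sub_exp x a e)
  | And (p, q) -> And (sub x a p, sub x a q)
  | Or (p, q) -> Or (sub x a p, sub x a q)

let fresh =
  let n = ref 0 in
  fun () -> incr n; Printf.sprintf "a%d" !n

let rec sp (s : stmt) (a : assn) : assn =
  match s with
  | Assign (x, e) ->
      (* SP[x := e] A = A[a/x] ∧ x = e[a/x], a fresh *)
      let v = fresh () in
      And (sub x v a, Eq (Var x, sub_exp x v e))
  | Seq (s1, s2) -> sp s2 (sp s1 a)
  | If (e, s1, s2) ->
      (* note the disjunction: this is the tree explosion discussed later *)
      Or (sp s1 (And (a, NonZero e)), sp s2 (And (a, Zero e)))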
Why SP instead of WP?
SP[if (c) skip else fail] A = A ∧ c

When A ⇒ c, we can eliminate the check.
Either way, the post-condition is still A ∧ c.

WP[if (c) skip else fail] A = (c ⇒ A) ∧ c

For WP, this will be propagated backwards, making it difficult to determine which part of the pre-condition corresponds to a particular check.
1st Problem with Textbook SP
SP[x := e] A = A[a/x] ∧ x = e[a/x]

What if e has effects?
In particular, what if e is itself an assignment?

Solution: use a monadic interpretation:

SP : Exp → Assn → Term × Assn
For Example:
SP[x] A = (x, A)

SP[e1 + e2] A = let (t1,A1) = SP[e1] A
                    (t2,A2) = SP[e2] A1
                in (t1 + t2, A2)

SP[x := e] A =
  let (t,A1) = SP[e] A
  in (t[a/x], A1[a/x] ∧ x == t[a/x])
Or as in Haskell
SP[x] = return x

SP[e1 + e2] = do { t1 ← SP[e1] ;
                   t2 ← SP[e2] ;
                   return t1 + t2 }

SP[x := e] = do { t ← SP[e] ;
                  replace [a/x] ;
                  and x == t[a/x] ;
                  return t[a/x] }
One Issue
Of course, this over-sequentializes the code.

C has very liberal order-of-evaluation rules, which are hopelessly unusable for any sound analysis.

So we force evaluation to be left-to-right and match our sequentialization.
Next Problem: Diamonds
SP[if (e1) S11 else S12 ;
   if (e2) S21 else S22 ;
   ...
   if (en) Sn1 else Sn2] A

The textbook approach explodes paths into a tree:

SP[if (e) S1 else S2] A =
  SP[S1](A ∧ e≠0) ∨ SP[S2](A ∧ e=0)

This simply doesn't scale.
• e.g., one procedure had an assertion with ~1.5B nodes.
• WP has the same problem. (see Flanagan & Leino)
Hmmm…a lot like naïve CPS
SP[if (e1) S11 else S12 ;
   if (e2) S21 else S22] A =

  SP[S21] ((SP[S11](A ∧ e1≠0) ∨
            SP[S12](A ∧ e1=0)) ∧ e2≠0)
  ∨
  SP[S22] ((SP[S11](A ∧ e1≠0) ∨
            SP[S12](A ∧ e1=0)) ∧ e2=0)

The result of the 1st conditional is duplicated, and it in turn duplicates the original assertion.
Aha! We need a "let":
SP[if (e) S1 else S2] A =
  let X = A in
  (e≠0 ∧ SP[S1] X) ∨ (e=0 ∧ SP[S2] X)

Alternatively, make sure we physically share A.

Oops:
SP[x := e] X = X[a/x] ∧ x = e[a/x]

This would require adding explicit substitutions to the assertion language to avoid breaking the sharing.
Handling Updates (Necula)
Factor out a local environment:
A = {x=e1 ∧ y=e2 ∧ …} ∧ B

where neither B nor the ei contains program variables (i.e., x, y, …).

Only the environment needs to change on update:

SP[x := 3] ({x=e1 ∧ y=e2 ∧ …} ∧ B) =
  {x=3 ∧ y=e2 ∧ …} ∧ B

So most of the assertion (B) remains unchanged and can be shared.
So Now:
SP : Exp → (Env × Assn) → (Term × Env × Assn)

SP[x] (E,A) = (E(x), (E,A))

SP[e1 + e2] (E,A) =
  let (t1,E1,A1) = SP[e1] (E,A)
      (t2,E2,A2) = SP[e2] (E1,A1)
  in (t1 + t2, E2, A2)

SP[x := e] (E,A) =
  let (t,E1,A1) = SP[e] (E,A)
  in (t, E1[x:=t], A1)
Or as in Haskell:
SP[x] = lookup x

SP[e1 + e2] = do { t1 ← SP[e1] ;
                   t2 ← SP[e2] ;
                   return t1 + t2 }

SP[x := e] = do { t ← SP[e] ;
                  set x t ;
                  return t }
Note:
Monadic encapsulation is crucial from a software-engineering point of view:
• we actually have multiple outgoing flow edges due to exceptions, return, etc.
  • (see Tan & Appel, VMCAI'06)
• so the monad actually accumulates (Term × Env × Assn) values for each edge.
• but it still looks as pretty as the previous slide.
  • (modulo the fact that it's written in Cyclone.)
Diamond Problem Revisited:
SP[if (e) S1 else S2] ({x=e1 ∧ y=e2 ∧ …} ∧ B) =

  (SP[S1] ({x=e1 ∧ y=e2 ∧ …} ∧ B ∧ e≠0)) ∨
  (SP[S2] ({x=e1 ∧ y=e2 ∧ …} ∧ B ∧ e=0)) =

  ({x=t1 ∧ y=t2 ∧ …} ∧ B1) ∨
  ({x=u1 ∧ y=u2 ∧ …} ∧ B2) =

  {x=ax ∧ y=ay ∧ …} ∧
  ((ax=t1 ∧ ay=t2 ∧ … ∧ B1) ∨
   (ax=u1 ∧ ay=u2 ∧ … ∧ B2))
How does the environment help?
SP[if (a) x:=3 else x:=y ;
   if (b) x:=5 else skip] ({x=e1 ∧ y=e2} ∧ B) =

  {x=v ∧ y=e2} ∧
  ((b≠0 ∧ v=5) ∨ (b=0 ∧ v=t)) ∧
  ((a≠0 ∧ t=3) ∨ (a=0 ∧ t=e2)) ∧
  B

Only the fresh variables t (joining the first conditional) and v (joining the second) are introduced; the rest of the assertion is shared.
Tah-Dah!
I've rediscovered SSA.
• the monadic translation sequentializes and names intermediate results.
• we only need to add fresh variables when two paths compute different values for a variable.
• so the added equations for conditionals correspond to φ-nodes.

Like SSA, worst-case O(n²), but in practice O(n).

Best part: all of the VCs for a given procedure share the same assertion DAG.
Space Scaling
[Chart: Function vs. VC Sizes — VC nodes (up to ~18,000) plotted against AST nodes (up to ~25,000).]
So far so good:
Of course, I've glossed over the hard bits:
• loops
• memory
• procedures
Let's talk about loops first…
Widening:
Given A ∨ B, calculate some C such that A ⇒ C and B ⇒ C and |C| < |A|, |B|.

Then we can compute a fixed point for loop invariants iteratively (sketched below):
• start with pre-condition P
• process loop-test & body to get P'
• see if P' ⇒ P. If so, we're done.
• if not, widen P ∨ P' and iterate.
• (glossing over variable scope issues.)
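A minimal OCaml sketch of that iteration, assuming sp_body (the SP of the loop test and body), implies (a prover query), and widen are supplied; the names are illustrative, not the actual SEX-C interfaces.

(* Iterate SP through the loop, widening until P' ⇒ P holds. *)
type assn = True | Prim of string | And of assn * assn | Or of assn * assn

let rec invariant ~(sp_body : assn -> assn)
                  ~(implies : assn -> assn -> bool)
                  ~(widen : assn -> assn)
                  (p : assn) : assn =
  let p' = sp_body p in
  if implies p' p then p                               (* fixed point reached *)
  else invariant ~sp_body ~implies ~widen (widen (Or (p, p')))

Termination relies on the widening strictly shrinking the assertion, with "true" as the worst case.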
Our Widening:
Conceptually, to widen A ∨ B:
• Calculate the DNF.
• Factor out syntactically common primitive relations.
• In practice, we do a bit of closure first:
  • e.g., normalize terms & relations.
  • e.g., x==e expands to x ≤ e ∧ x ≥ e.
This captures any primitive relation that was found on every path.
Widening Algorithm (Take 1):
assn = Prim of reln*term*term
     | True | False
     | And of assn*assn
     | Or of assn*assn

widen (Prim(…)) = expand(Prim(…))
widen (True) = {}
widen (And(a1,a2)) = widen(a1) ∪ widen(a2)
widen (Or(a1,a2)) = widen(a1) ∩ widen(a2)
...
Widening for DAG:
Can't afford to traverse a tree, so memoize:

widen A = case lookup A of
            SOME s => s
          | NONE => let s = widen' A in
                      insert(A,s); s
                    end

widen' (x as Prim(…)) = {x}
widen' (True) = {}
widen' (And(a1,a2)) = widen(a1) ∪ widen(a2)
widen' (Or(a1,a2)) = widen(a1) ∩ widen(a2)
Hash Consing (ala Shao's Flint)
To make lookups fast, we hash-cons all terms and assertions (a sketch follows):
• i.e., value numbering
• constant-time syntactic [in]equality test.

Other information cached in the hash table:
• the widened version of the assertion
• the negation of the assertion
• free variables
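A minimal hash-consing sketch in OCaml, assuming this simplified term type; the real implementation hash-conses assertions too and caches the extra per-node information listed above.

(* Each distinct structure is built once and given a unique id. *)
type term = { node : node; id : int }           (* id gives O(1) equality *)
and node = Var of string | Add of term * term

let table : (node, term) Hashtbl.t = Hashtbl.create 1024
let next_id = ref 0

let hashcons (n : node) : term =
  match Hashtbl.find_opt table n with
  | Some t -> t                                 (* already built: reuse *)
  | None ->
      let t = { node = n; id = (incr next_id; !next_id) } in
      Hashtbl.add table n t; t

let var x = hashcons (Var x)
let add t1 t2 = hashcons (Add (t1, t2))

(* constant-time syntactic equality: compare ids, not structure *)
let equal t1 t2 = t1.id = t2.id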
Note on Explicit Substitution
Originally, we used explicit substitutions:

widen S (Subst(S',a)) = widen (S ∘ S') a
widen S (x as Prim(…)) = {S(x)}
widen S (And(a1,a2)) = widen S a1 ∪ widen S a2
...

We had to memoize w.r.t. both S and A:
• we rarely encountered the same S and A
• so memoizing didn't help
• ergo, back to tree traversal.

Of course, you get more precision if you do the substitution (but it costs too much).
Back to Loops:
The invariants we generate aren't great:
• the worst case is that we get "true"
• we do catch loop-invariant variables
• if x starts off at i, is incremented, and is guarded by x < e < MAXINT, then we can get x >= i.

But:
• it covers simple for-loops well
• it's fast: only a couple of iterations
• the user can override it with an explicit invariant
(note: only 2 loops in the string library are annotated this way, but we plan to do more.)
Memory
As in ESC, use a functional array:
terms: t ::= … | upd(tm, ta, tv) | sel(tm, ta)

with the environment tracking mem:

SP[*e] = do { a ← SP[e] ;
              m ← lookup mem ;
              return sel(m,a) }

McCarthy axioms (their use is sketched below):
• sel(upd(m,a,v),a) == v
• sel(upd(m,a,v),b) == sel(m,b) when a ≠ b
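A minimal OCaml sketch of reduction under the McCarthy axioms; the distinct oracle stands in for whatever disjointness reasoning the prover can do, and the term type is illustrative.

(* Reduce sel-over-upd terms; leave them symbolic when aliasing is unknown. *)
type term =
  | Var of string
  | Sel of term * term                     (* sel(m, a)    *)
  | Upd of term * term * term              (* upd(m, a, v) *)

let rec reduce ~(distinct : term -> term -> bool) (t : term) : term =
  match t with
  | Sel (Upd (_, a, v), b) when a = b ->
      reduce ~distinct v                   (* sel(upd(m,a,v),a) = v *)
  | Sel (Upd (m, a, _), b) when distinct a b ->
      reduce ~distinct (Sel (m, b))        (* skip an update to a distinct address *)
  | _ -> t                                 (* cannot decide aliasing: give up *)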
The realities of C bite again…
Consider:
pt x = new Point{1,2};
int *p = &x->y;
*p = 42;
*x;

sel(upd(upd(m, x, {1,2}),
        x + offsetof(pt,y), 42), x) = {1,2} ??
Explode Aggregates?
update(m, x, {1,2}) =
  upd(upd(m, x + offsetof(pt,x), 1),
      x + offsetof(pt,y), 2)

This turns out to be too expensive in practice, because you must model memory down to the byte level.
Refined Treatment of Memory
Memory maps roots to aggregate values:

Aggregates: {t1,…,tn} | set(a,t,v) | get(a,t)
Roots:      malloc(n,t)

where n is a program point and t is a term used to distinguish different dynamic values allocated at the same point.

Pointer expressions are mapped to paths:

Paths: path ::= root | path ⊕ t
Selects and Updates:
sel and upd operate on roots only:

sel(upd(m,r,v), r) = v
sel(upd(m,r,v), r') = sel(m,r') when r ≠ r'

Compound select and update for paths (sketched below):

select(m, r) = sel(m, r)
select(m, a ⊕ t) = get(select(m,a), t)

update(m, r, v) = upd(m, r, v)
update(m, a ⊕ t, v) = update(m, a, set(select(m,a), t, v))
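A minimal OCaml sketch of the compound select/update, using a concrete memory of roots mapped to aggregates and integer field offsets; illustrative only, since the real terms stay symbolic.

(* path ::= root | path ⊕ offset, over tuple-shaped aggregates *)
type path = Root of string | Off of path * int
type agg = Leaf of int | Tuple of agg list

module M = Map.Make (String)
type mem = agg M.t                       (* roots to aggregate values *)

let get a i =
  match a with Tuple fs -> List.nth fs i | Leaf _ -> failwith "not an aggregate"

let set a i v =
  match a with
  | Tuple fs -> Tuple (List.mapi (fun j f -> if j = i then v else f) fs)
  | Leaf _ -> failwith "not an aggregate"

(* select(m,r) = sel(m,r);  select(m, a ⊕ t) = get(select(m,a), t) *)
let rec select (m : mem) = function
  | Root r -> M.find r m
  | Off (p, i) -> get (select m p) i

(* update(m,r,v) = upd(m,r,v);
   update(m, a ⊕ t, v) = update(m, a, set(select(m,a), t, v)) *)
let rec update (m : mem) (p : path) (v : agg) : mem =
  match p with
  | Root r -> M.add r v m
  | Off (p', i) -> update m p' (set (select m p') i v)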
For Example:
*x = {1,2};
int *p = &x->y;
*p = 42;

update(upd(m, x, {1,2}), x ⊕ off(pt,y), 42) =
upd(upd(m, x, {1,2}), x, set({1,2}, off(pt,y), 42)) =
upd(upd(m, x, {1,2}), x, {1,42}) =
upd(m, x, {1,42})
Reasoning about memory:
To reduce:
  select(update(m,p1,v), p2) to select(m,p2)
we need to know that p1 and p2 are disjoint paths.

In particular, if one is a prefix of the other, we cannot reduce (without simplifying paths).

Often, we can show their roots are distinct.
Many times, we can show they are updates to distinct offsets of the same path prefix.
Otherwise, we give up.
Procedures:
Originally, intra-procedural only:
• Programmers could specify pre/post-conditions.

Recently, extended to inter-procedural:
• Calculate SPs and propagate them to callers.
• If too large, we widen.
• Go back and strengthen the pre-conditions of (non-escaping) callees by taking the "disjunction" of all call sites' assertions.
Summary of VC-Generation
• Started with textbook strongest postconditions.
• Effects: Rewrote as monadic translation.
• Diamond: Factored variables into an
environment to preserve sharing (SSA).
• Loops: Simple but effective widening for
calculating invariants.
• Memory: array-based approach, with care taken to avoid blowing up aggregates.
• Extended to inter-procedural summaries.
Proving:
• The original plan was to use off-the-shelf technology:
  • e.g., Simplify, SAT solvers, etc.
• But we found the existing tools:
  • either didn't have the decision procedures I needed,
  • or were way too slow to use on every compile.
• So, like an idiot, I decided to roll my own…
2 Prover(s):
Simple Prover (sketched below):
Given a VC A ⇒ C:
• Widen A to a set of primitive relations.
• Calculate the DNF of C and check that each disjunct is a subset of A.
• (C is quite small, so there's no blowup here.)

This catches a lot — all but about 2% of the checks we eliminate! For example:
• void f(int @x) { …*x… }
• if (x != NULL) …*x…
• for (i=0; i < numelts(A); i++) …A[i]…
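A minimal OCaml sketch of the simple prover, assuming widen is the widening above (returning a set of primitive relations) and primitives are compared syntactically; the types are illustrative.

(* A ⇒ C holds if every DNF disjunct of C is covered by A's widened facts. *)
type assn = True | Prim of string | And of assn * assn | Or of assn * assn

module S = Set.Make (String)

(* DNF as a list of disjuncts, each a set of primitive relations *)
let rec dnf = function
  | True -> [ S.empty ]
  | Prim p -> [ S.singleton p ]
  | Or (a, b) -> dnf a @ dnf b
  | And (a, b) ->
      List.concat_map (fun ca -> List.map (S.union ca) (dnf b)) (dnf a)

let prove ~(widen : assn -> S.t) (a : assn) (c : assn) : bool =
  let facts = widen a in
  List.for_all (fun disj -> S.subset disj facts) (dnf c)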
2nd Prover:
Given A ⇒ C, try to show A ∧ ¬C inconsistent.

Conceptually (sketched below):
• explore the DNF tree (i.e., program paths)
  • the real exponential blowup is here.
  • so we have a programmer-controlled throttle on the number of paths we'll explore (default 33).
• accumulate a set of primitive facts.
• at the leaves, run simple decision procedures to look for inconsistencies and prune the path.
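A minimal OCaml sketch of that search; inconsistent stands in for the leaf-level decision procedures, and the throttle gives up (keeping the check) rather than exploring too many paths.

(* Refute A ∧ ¬C by showing every DNF path through it is inconsistent. *)
type assn = Prim of string | And of assn * assn | Or of assn * assn

exception Too_many_paths

let refute ?(max_paths = 33) ~(inconsistent : string list -> bool) (a : assn) : bool =
  let paths = ref 0 in
  let rec go facts = function
    | [] ->
        incr paths;
        if !paths > max_paths then raise Too_many_paths;
        inconsistent facts                       (* leaf: one complete path *)
    | Prim p :: rest -> go (p :: facts) rest     (* accumulate a fact *)
    | And (x, y) :: rest -> go facts (x :: y :: rest)
    | Or (x, y) :: rest ->                       (* both branches must be refuted *)
        go facts (x :: rest) && go facts (y :: rest)
  in
  try go [] [ a ] with Too_many_paths -> false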
Problem: Arithmetic
To eliminate an array-bounds check on an expression x[i], we can try to prove a predicate similar to this:

A ⇒ 0 ≤ i < numelts(x)

where A describes the state of the machine at that program point.
Do we need checks here?
char *malloc(unsigned n)
  @ensures(n == numelts(result));

void foo(unsigned x) {
  char *p = malloc(x+1);
  for (int i = 0; i <= x; i++)
    p[i] = 'a';
}

0 ≤ i < numelts(p)?
You bet!
Consider foo(-1):

void foo(unsigned x) {
  char *p = malloc(x+1);
  for (int i = 0; i <= x; i++)
    p[i] = 'a';
}

We get i ≤ x from the loop guard, but this is an unsigned comparison. That is, we are comparing i against 0xffffffff, which always succeeds (and x+1 wraps around to 0, so the buffer has zero elements).
Integer Overflow
This example is based on a vulnerability in the GNU mail utilities (i.e., IMAP servers):
http://archives.neohapsis.com/archives/fulldisclosure/2005-05/0580.html

There are other situations where wrap-around gets you into trouble, so we wanted to take machine arithmetic seriously.

Unfortunately, I haven't yet found a prover that I can effectively use. (If you know of any, please tell me!)
Our (Dumb) Arithmetic Solver
Determines [un]satisfiability of a conjunction of difference constraints (similar to the approach used by Touchstone & ABCD):

Constraints: x − y ≤s c and x − y ≤u c (signed and unsigned)
• care is needed when generating constraints
• e.g., x + c <= y + k cannot (in general) be simplified to x − y ≤ (k − c).

The algorithm tries to find cycles in the graphs (sketched below):
x − x1 ≤u c1, x1 − x2 ≤u c2, … , xn − x ≤u cn
where c1 + c2 + … + cn < 0. That is, x − x < 0.
• again, care is needed to avoid internal overflow.
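A minimal OCaml sketch of the cycle search via Bellman-Ford negative-cycle detection; it is illustrative only, ignoring the signed/unsigned split and the internal-overflow care the real solver needs.

(* Each constraint x − y ≤ c is an edge y → x of weight c; the conjunction
   is unsatisfiable iff the graph has a negative-weight cycle. *)
type constr = { x : string; y : string; c : int }

let unsatisfiable (cs : constr list) : bool =
  let vars =
    List.sort_uniq compare (List.concat_map (fun k -> [ k.x; k.y ]) cs) in
  let dist = Hashtbl.create 16 in
  List.iter (fun v -> Hashtbl.replace dist v 0) vars;
  let relax () =
    List.fold_left
      (fun changed { x; y; c } ->
        let dx = Hashtbl.find dist x and dy = Hashtbl.find dist y in
        if dy + c < dx then (Hashtbl.replace dist x (dy + c); true)
        else changed)
      false cs
  in
  (* after |V| − 1 rounds, distances are stable unless a negative cycle exists *)
  for _ = 1 to List.length vars - 1 do ignore (relax ()) done;
  relax ()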
Future?
We need provers as libraries/services:
• can we agree upon a logic?
  • typed? untyped?
  • theories must include useful domains (e.g., Z mod).
• can we agree upon an API?
  • sharing must be preserved
  • need incremental support, control over search
  • need counter-example support
  • need witnesses?
• we can now generate some useful benchmarks.
• multiple metrics: precision vs. time*space
Currently:
Memory?
• The functional-array encoding of memory doesn't work well.
• Can we adapt separation logic? Will it actually help?
• Can we integrate refinements into the types?
  • Work with A. Nanevski & L. Birkedal is a start.

Loops?
• Can we divorce VC-generation from theorem proving altogether? (e.g., by compiling to a language with inductive predicates?)
False Positives:
We still have 2,000 checks left.
I suspect that most are not needed.
How do we draw the eye to the ones that are?
• strengthen pre-conditions artificially (e.g., assume no aliasing, no overflow, etc.)
• if we still can't prove the check, then it should be moved up to a "higher-rank" warning.
Lots of Borrowed Ideas
• ESC-M3 & ESC/Java
• Touchstone, Special-J, CCured
• SPLint (LCLint)
• FLINT
• ABCD
More info...
http://cyclone.thelanguage.org