Transcript [pptx]

Ross Tate, Juan Chen, Chris Hawblitzel
Typed Assembly Languages
 Compilers are great
 but they make mistakes
 and can introduce vulnerabilities
 Typed assembly language
 includes a proof of (memory) safety
 verified by a trusted proof checker
 no need to trust the compiler
 Certifying compilers
 generate typed assembly language
 traditionally use “type-preservation”
[Diagram: C# → Certifying Compiler → TAL → Trusted Proof Checker]
[Diagram: Type-Preserving Compiler. Source Program → IR1 → Optimizations/Conversions → IR2 → Optimizations/Conversions → x86 → Proof Checker. Every intermediate representation, and the final x86, carries class/function signatures (sigs) together with type/proof annotations (types/proofs, annots).]
• Burden to preserve types at each stage
• Hard to adopt in existing compilers
• Types/proofs increase size of executable
[Diagram: Traditional Compiler with Type Inference. Source Program → IR1 → Optimizations/Conversions → IR2 → Optimizations/Conversions → x86, with only sigs carried through each stage. Type Inference then infers the proof annotations (annots), and x86 + sigs + annots go to the Proof Checker.]
• Requires little change
• Smaller annotation size
Signature information is already preserved in traditional compilers
Easy to change compiler to write sig info to file
Can inference be effective enough?
Effectiveness of Type Inference
[Bar chart: % of Inferable Methods (y-axis 0% to 100%). ahcbench 100.0%, mandelform 100.0%, sat_solver 100.0%, zinger 94.6%, lcscbench 97.6%, bartok 97.4%, asmlc 96.1%, plus a Geomean bar.]
Capable of type checking all C# features except:
• Exceptions and Delegates
• These are matters of implementation, not theoretical limitations
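The 97.9% of methods quoted later in the talk appears consistent with the geometric mean of the seven per-benchmark percentages on this chart. A minimal Python check, assuming the seven bar labels are the values as they read on the slide (which benchmark each belongs to does not affect the mean):

```python
import math

# Per-benchmark "% of inferable methods" as they appear on the slide.
inferable = [100.0, 100.0, 100.0, 94.6, 97.6, 97.4, 96.1]


def geomean(values):
    """Geometric mean: exp of the average of the logs."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


print(f"{geomean(inferable):.1f}%")   # prints 97.9%
```

The arithmetic mean of the same values rounds to 98.0%, so the geometric mean is the reading that matches the quoted figure.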
Broken C# Pseudo-Assembly
bool bad(a, b : List) {   // a could actually be an ArrayList; b could actually be a LinkedList
  vt = a.vtable;          // Grabs a’s vtable
  mp = vt.isEmpty;        // Grabs a’s implementation of isEmpty
  c = mp(b);              // Calls a’s isEmpty with b as “this”
  return c;
}
a’s implementation of isEmpty may fail to work on b
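The failure mode can be reproduced in a high-level analogue. The sketch below is illustrative only: the ArrayList and LinkedList classes, their fields, and the is_empty name are assumptions, and Python raises an exception at the point where real assembly would silently read the wrong memory.

```python
class ArrayList:
    def __init__(self, items=()):
        self.items = list(items)       # array-backed layout

    def is_empty(self):
        return len(self.items) == 0


class LinkedList:
    def __init__(self, head=None):
        self.head = head               # node-based layout, no `items` field

    def is_empty(self):
        return self.head is None


def bad(a, b):
    mp = type(a).is_empty              # a's implementation of is_empty
    return mp(b)                       # ...called with b as "this"


print(bad(ArrayList(), ArrayList([1])))    # same layout: the call works
try:
    bad(ArrayList(), LinkedList())         # different layout: the call breaks
except AttributeError as err:
    print("layout mismatch:", err)
```

Nothing in the declared type List distinguishes the two calls, which is why the checker needs a more precise type for the “this” argument.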
Traditional TAL [PLDI ’08]
More specific function signature: a and b are each instances of some (possibly different) subclass of List
bool bad(a, b : ∃γ≪List. Ins(γ)) {
  open a as Ins(α);                 // Pseudo-instruction for the type checker. α must be fresh; a is given type exactly Ins(α) where α ≪ List
  vt = a.vtable;                    // vt is given type VTable(α), via signature & memory layout information
  mp = vt.isEmpty;                  // mp is given type (∃γ≪α. Ins(γ))→bool; the “this” pointer must belong to α
  open b as Ins(β);                 // β must be fresh; b is given type exactly Ins(β) where β ≪ List
  c = mp(pack b as ∃γ≪α. Ins(γ));   // Checks that there is some γ extending α such that b has type Ins(γ)
  return c;
}
Check fails since b has type Ins(β) and β does not extend α
Inferable TAL
bool bad(a, b : ∃γ≪List. Ins(γ)) {
  vt = a.vtable;
  mp = vt.isEmpty;
  c = mp(b);
  return c;
}
Compared with Traditional TAL [PLDI ’08]:
• No open annotations (open a as Ins(α); and open b as Ins(β); are gone)
• No pack annotations (c = mp(pack b as ∃γ≪α. Ins(γ)); becomes c = mp(b);)
• No loop invariants!
Use type inference instead
Inference Strategy
 Always open existential types as soon as possible
 Use subtyping in place of pack:
      θ: ∆’ → ∆      τ ≤ τ’[θ]
      --------------------------
           ∃∆.τ ≤ ∃∆’.τ’
  Given a valid substitution of variables (θ: ∆’ → ∆), such that the bodies are subtypes after substitution (τ ≤ τ’[θ]), then the existential types are subtypes (∃∆.τ ≤ ∃∆’.τ’)
  Subsumes using open and pack
 Use abstract interpretation over existential types
 Requires subtyping and join algorithms
  Subtyping alone of bounded existential types is undecidable!
Designed a category-theoretic framework for existential types
• Constructive: includes abstract algorithms for inference
• Instructive: specifies type design guidelines
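The subtyping rule and the need for a join can be illustrated on a deliberately tiny fragment. The Python sketch below is an assumption-laden toy, not the framework from the talk: it handles only bodies of the form Ins(x), assumes single inheritance, uses a hypothetical class table, and represents ∃∆.Ins(x) as a pair (delta, x). Because Ins is exact, the substitution θ is forced, which is what keeps this fragment trivially decidable.

```python
# Hypothetical class table: child -> parent (single inheritance assumed).
PARENT = {"ArrayList": "List", "LinkedList": "List", "List": None}


def ancestors(x, bounds):
    """x followed by everything it extends, via variable bounds then parents."""
    chain = []
    while x is not None:
        chain.append(x)
        x = bounds.get(x, PARENT.get(x))
    return chain


def extends(x, y, bounds):
    """x ≪ y (reflexive and transitive)."""
    return y in ancestors(x, bounds)


def exists_subtype(lhs, rhs, ambient):
    """Check ∃∆.Ins(x) ≤ ∃∆'.Ins(y), each side given as (delta, name)."""
    delta, x = lhs
    delta2, y = rhs
    bounds = {**ambient, **delta}        # open the left-hand side
    if y not in delta2:                  # right body has no bound variable
        return x == y                    # Ins is exact, so names must match
    theta = {y: x}                       # forced substitution θ(y) = x
    bound = delta2[y]
    return extends(x, theta.get(bound, bound), bounds)


def join_ins(x, y, bounds):
    """A common supertype of Ins(x) and Ins(y) at a control-flow merge."""
    if x == y:
        return ({}, x)
    up = ancestors(x, bounds)
    common = next(c for c in ancestors(y, bounds) if c in up)
    return ({"γ": common}, "γ")


ambient = {"α": "List", "β": "List"}     # a : Ins(α), b : Ins(β) after opening
this_type = ({"γ": "α"}, "γ")            # ∃γ≪α. Ins(γ)
print(exists_subtype(({}, "α"), this_type, ambient))   # a as "this": True
print(exists_subtype(({}, "β"), this_type, ambient))   # b as "this": False
print(join_ins("α", "β", ambient))       # ∃γ≪List. Ins(γ)
```

The join is what an abstract interpreter would apply where two branches disagree on the exact class of a value; both inputs remain subtypes of the result under exists_subtype.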
Type Checking with iTalX
bool bad(a, b : ∃γ≪List. Ins(γ)) {
  vt = a.vtable;
  mp = vt.isEmpty;
  c = mp(b);
  return c;
}
Inference Strategy: Immediately open a and b
Inferred types:
  ∃α, β : α≪List, β≪List.
    a : Ins(α)
    b : Ins(β)
    vt : VTable(α)
    mp : (∃γ≪α. Ins(γ)) → bool
Signature Information: α ≪ List ⇒ Ins(α) has fields:
    vtable : VTable(α)
    ⋮
Signature Information: α ≪ List ⇒ VTable(α) has fields:
    isEmpty : (⋯) → bool
    ⋮
Type Check: Ins(β) ≤ ∃γ≪α. Ins(γ)
Check Fails: β does not extend α
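The walk-through above can be mimicked by a small abstract interpreter. This is a minimal sketch under stated assumptions, not iTalX itself: the instruction tuples, the typecheck and bad_body names, and the type tags are all hypothetical, the signature information is hard-coded for the two field loads used here, and the final return is omitted.

```python
from itertools import count


def extends(x, y, bounds):
    """x ≪ y: walk x's chain of upper bounds looking for y."""
    while x is not None:
        if x == y:
            return True
        x = bounds.get(x)
    return False


def typecheck(params, body):
    """Abstractly interpret `body`; return True if every call type checks.

    params: names, each typed ∃γ≪List. Ins(γ).
    body:   (op, dst, src, arg) tuples; a toy encoding, not iTalX's format.
    """
    fresh = (f"t{i}" for i in count())
    bounds, env = {}, {}
    for p in params:                      # immediately open each parameter
        v = next(fresh)
        bounds[v] = "List"
        env[p] = ("Ins", v)
    for op, dst, src, arg in body:
        kind, v = env[src]
        if op == "load" and arg == "vtable" and kind == "Ins":
            env[dst] = ("VTable", v)      # signature info: vtable : VTable(v)
        elif op == "load" and arg == "isEmpty" and kind == "VTable":
            env[dst] = ("Method", v)      # (∃γ≪v. Ins(γ)) → bool
        elif op == "call" and kind == "Method":
            arg_kind, actual = env[arg]
            # Ins(actual) ≤ ∃γ≪v. Ins(γ) holds exactly when actual ≪ v.
            if arg_kind != "Ins" or not extends(actual, v, bounds):
                return False
            env[dst] = ("bool", None)
        else:
            return False                  # instruction the sketch cannot type
    return True


def bad_body(this):
    return [("load", "vt", "a", "vtable"),
            ("load", "mp", "vt", "isEmpty"),
            ("call", "c", "mp", this)]


print(typecheck(["a", "b"], bad_body("b")))   # the slide's bad(): False
print(typecheck(["a", "b"], bad_body("a")))   # passing a as "this": True
```

No open or pack annotation appears in the input; the fresh variables and the subtyping check at the call are both supplied by the checker.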
Expressiveness of iTalX
 iTalX is capable of handling the following features:
 Classes, interfaces, generics, and multiple inheritance
 Dynamic dispatch and dynamic casts
 Covariant arrays as classes, and array-bounds checks
 By-reference parameters (ref), structs, and value types
 Jump tables and complex stack manipulation
 iTalX is also robust with respect to many optimizations
 iTalX should be able to handle the remaining features:
 Delegates and exceptions
 In experiments, iTalX currently verifies 97.9% of methods
Efficiency of iTalX
[Bar chart: Inference Time/Compilation Time (y-axis 0% to 100%) for ahcbench, mandelform, sat_solver, zinger, lcscbench, bartok, asmlc, and Geomean. Bar labels as extracted: 34.7%, 1.9%, 1.3%, 9.3%, 13.7%, 13.9%, 16.0%.]
Inferring Assembly-Level Types is Affordable
Type Annotation Size
[Bar chart: iTalX/Traditional TAL [PLDI ’08] annotation size (y-axis 0% to 100%). Bar labels as extracted: 79%, 59%, 56%, 40%, 23%, 23%, 28%.]
Type annotation size is significantly reduced
Implementation Burden of TAL
Changes to an Existing Compiler (Bartok)
 Type Preservation [PLDI ’08]: 19,000 lines of code, cut across code base
 Assembly-Level Type Inference: 5,000 lines of code, modular addition to code base
Type Checker + Type Inference
 Type Preservation [PLDI ’08]: 13,800 lines of code
 Assembly-Level Type Inference: 15,100 lines of code, could be separated to reduce trusted computing base
Conclusion
 Type inference at the assembly level is
 expressive enough to verify C# with optimizations
 flexible enough to accommodate new language features
 efficient enough to use regularly during compilation
 compact enough to include in executable binaries
 modular enough to retrofit existing compilers with