Transcript [pptx]
Ross Tate, Juan Chen, Chris Hawblitzel
Typed Assembly Languages
Compilers are great
but they make mistakes
and can introduce vulnerabilities
Typed assembly language
includes a proof of (memory) safety
verified by a trusted proof checker
no need to trust the compiler
Certifying compilers
generate typed assembly language
traditionally use “type-preservation”
[Diagram: type-preserving certifying compiler pipeline. A C# source program is translated to IR1, then IR2, then x86 by successive Optimizations/Conversions stages. Each representation carries class/function signatures (sigs), type/proof annotations (annots), and types/proofs. The resulting TAL is verified by the trusted proof checker.]
• Burden to preserve types at each stage
• Hard to adopt in existing compilers
• Types/proofs increase size of executable
[Diagram: traditional compiler pipeline with type inference. The source program is translated to IR1, then IR2, then x86 by Optimizations/Conversions stages, carrying only sigs. Type inference then infers the proof annotations (annots), and the proof checker verifies the x86 together with its sigs and annots.]
• Requires little change
• Smaller annotation size
Can inference be effective enough?
Signature information is already preserved in traditional compilers
Easy to change the compiler to write sig info to a file
Effectiveness of Type Inference
[Bar chart: % of inferable methods per benchmark, on a 0% to 100% axis. ahcbench, mandelform, and sat_solver: 100.0% each. Remaining labeled bar values: 94.6%, 97.6%, 97.4%, 96.1%. Remaining bar labels: zinger, lcscbench, bartok, asmlc, Geomean.]
Capable of type checking all C# features except:
• Exceptions and delegates
• These gaps are matters of implementation, not theoretical limitations
Broken C# Pseudo-Assembly
bool bad(a, b : List) {  // a could actually be an ArrayList; b could actually be a LinkedList
  vt = a.vtable;         // grabs a’s vtable
  mp = vt.isEmpty;       // grabs a’s implementation of isEmpty
  c = mp(b);             // calls a’s isEmpty with b as “this”
  return c;
}
a’s implementation of isEmpty may fail to work on b
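To make the failure concrete, here is a minimal Python analog of the pseudo-assembly above. The classes `ArrayList` and `LinkedList`, their fields, and the `is_empty` method are hypothetical stand-ins, not code from the talk. In real assembly the mismatch would silently read the wrong memory; Python surfaces it as an exception instead.

```python
class ArrayList:
    """Hypothetical list backed by a Python list stored in `items`."""
    def __init__(self, items):
        self.items = list(items)

    def is_empty(self):
        return len(self.items) == 0


class LinkedList:
    """Hypothetical list backed by a `head` node; it has no `items` field."""
    def __init__(self, head=None):
        self.head = head

    def is_empty(self):
        return self.head is None


def bad(a, b):
    mp = type(a).is_empty  # analog of vt = a.vtable; mp = vt.isEmpty
    return mp(b)           # calls a's is_empty with b as "this"


def demo():
    """Run bad() on mismatched classes and report what happens."""
    try:
        bad(ArrayList([1]), LinkedList())
        return "no error"
    except AttributeError:
        # ArrayList.is_empty looked for b.items, which LinkedList lacks.
        return "AttributeError"
```

When `a` and `b` happen to share a class, `bad` works by accident, which is why the bug can go unnoticed without a type check.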
Broken C# Pseudo-Assembly
More specific function signature:
a and b are each instances of some (possibly different) subclass of List
Traditional TAL [PLDI ‘08]
bool bad(a, b : ∃γ≪List. Ins(γ)) {
  open a as Ins(α);   // pseudo-instruction for the type checker; α must be fresh
                      // a is given type exactly Ins(α) where α ≪ List
  vt = a.vtable;      // vt is given type VTable(α), via signature & memory layout information
  mp = vt.isEmpty;    // mp is given type (∃γ≪α. Ins(γ)) → bool: the “this” pointer must belong to α
  open b as Ins(β);   // β must be fresh; b is given type exactly Ins(β) where β ≪ List
  c = mp(pack b as ∃γ≪α. Ins(γ));
                      // checks that there is some γ extending α such that b has type Ins(γ)
                      // check fails since b has type Ins(β) and β does not extend α
  return c;
}
Broken C# Pseudo-Assembly
Traditional TAL [PLDI ‘08]:
bool bad(a, b : ∃γ≪List. Ins(γ)) {
  open a as Ins(α);
  vt = a.vtable;
  mp = vt.isEmpty;
  open b as Ins(β);
  c = mp(pack b as ∃γ≪α. Ins(γ));
  return c;
}
Inferable TAL:
bool bad(a, b : ∃γ≪List. Ins(γ)) {
  vt = a.vtable;
  mp = vt.isEmpty;
  c = mp(b);
  return c;
}
• No open annotations
• No pack annotations
• No loop invariants!
Use type inference instead
Inference Strategy
Always open existential types as soon as possible
Use subtyping in place of pack (subsumes using open and pack):
  θ : ∆’ → ∆         (given a valid substitution of variables)
  τ ≤ τ’[θ]           (such that the bodies are subtypes after substitution)
  ------------------
  ∃∆.τ ≤ ∃∆’.τ’      (then the existential types are subtypes)
Use abstract interpretation over existential types
• Requires subtyping and join algorithms
• Subtyping alone of bounded existential types is undecidable!
Designed a category-theoretic framework for existential types
• Constructive: includes abstract algorithms for inference
• Instructive: specifies type design guidelines
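The subtyping rule above can be illustrated with a brute-force search for the substitution θ. This is only a sketch under strong simplifying assumptions, not the algorithm from the talk: every type has the shape Ins(x), Ins is treated as invariant, variable names in ∆ and ∆’ are distinct, and the class hierarchy `PARENT` is hypothetical. The names `alpha`, `beta`, and `gamma` stand for α, β, and γ.

```python
from itertools import product

# Hypothetical class hierarchy: child -> parent (None marks the root).
PARENT = {"List": None, "ArrayList": "List", "LinkedList": "List"}


def extends(x, y, bounds):
    """True if x extends y, walking variable bounds and then class parents."""
    while x is not None:
        if x == y:
            return True
        x = bounds[x] if x in bounds else PARENT.get(x)
    return False


def exists_subtype(delta, tau, delta_p, tau_p):
    """Check (exists delta. Ins(tau)) <= (exists delta_p. Ins(tau_p)).

    delta and delta_p map each type variable to its upper bound.
    The search tries every theta sending the variables of delta_p to
    variables of delta or to known classes.
    """
    targets = list(delta) + list(PARENT)
    vars_p = list(delta_p)
    for images in product(targets, repeat=len(vars_p)):
        theta = dict(zip(vars_p, images))
        # theta is valid if every bound in delta_p holds after substitution.
        valid = all(
            extends(theta[v], theta.get(bound, bound), delta)
            for v, bound in delta_p.items()
        )
        # Ins is invariant here, so the bodies must match exactly.
        if valid and tau == theta.get(tau_p, tau_p):
            return True
    return False


# Opened context from the running example: alpha and beta both extend List.
ctx = {"alpha": "List", "beta": "List"}
rejected = exists_subtype(ctx, "beta", {"gamma": "alpha"}, "gamma")
accepted = exists_subtype(ctx, "alpha", {"gamma": "alpha"}, "gamma")
```

Here `rejected` is the check Ins(β) ≤ ∃γ≪α. Ins(γ), which finds no valid θ, while `accepted` succeeds by mapping γ to α. Exhaustive search only works because this toy setting is finite; the general problem is undecidable, as the slide notes.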
Type Checking with iTalX
Inference strategy: immediately open a and b
Signature information:
  α ≪ List ⇒ Ins(α) has fields:
    vtable : VTable(α)
    ⋮
  α ≪ List ⇒ VTable(α) has fields:
    ⋮
    isEmpty : (⋯) → bool
    ⋮
Inferred types:
bool bad(a, b : ∃γ≪List. Ins(γ)) {
                      // ∃α, β : α≪List, β≪List.  a : Ins(α), b : Ins(β)
  vt = a.vtable;      // vt : VTable(α)
  mp = vt.isEmpty;    // mp : (∃γ≪α. Ins(γ)) → bool
  c = mp(b);          // type check: Ins(β) ≤ ∃γ≪α. Ins(γ)
                      // check fails: β does not extend α
  return c;
}
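The walk-through above can be sketched as a small forward pass over the instructions. This is an illustrative toy, not iTalX itself: the instruction encoding (`load`, `call`), the tuple representation of types, and the fresh variable names `t0`, `t1` are all assumptions made for the example, and the signature information is hard-coded for the two fields used on the slide.

```python
def field_type(obj_type, field):
    """Hypothetical signature lookup for the two fields used in the example."""
    kind, var = obj_type[0], obj_type[1]
    if kind == "Ins" and field == "vtable":
        return ("VTable", var)
    if kind == "VTable" and field == "isEmpty":
        # A method whose "this" argument must be an instance of some
        # class extending var, and which returns bool.
        return ("Fn", var, "bool")
    raise TypeError(f"no field {field!r} on {obj_type!r}")


def extends(x, y, bounds):
    """True if type variable x extends y by following declared bounds."""
    while x is not None:
        if x == y:
            return True
        x = bounds.get(x)
    return False


def check(params, body):
    """Infer a type for every register; return False on a failed check."""
    bounds, env = {}, {}
    # Inference strategy: open each existential parameter immediately,
    # giving it a fresh type variable with the declared bound.
    for i, (name, bound) in enumerate(params):
        var = f"t{i}"
        bounds[var] = bound
        env[name] = ("Ins", var)
    for op, dst, src, extra in body:
        if op == "load":    # dst = src.extra
            env[dst] = field_type(env[src], extra)
        elif op == "call":  # dst = src(extra)
            fn, arg = env[src], env[extra]
            if fn[0] != "Fn" or arg[0] != "Ins":
                return False
            # Subtyping in place of pack: the argument's type variable
            # must extend the variable the method was loaded for.
            if not extends(arg[1], fn[1], bounds):
                return False
            env[dst] = fn[2]
    return True


params = [("a", "List"), ("b", "List")]
bad_body = [
    ("load", "vt", "a", "vtable"),
    ("load", "mp", "vt", "isEmpty"),
    ("call", "c", "mp", "b"),
]
good_body = bad_body[:2] + [("call", "c", "mp", "a")]
```

With these inputs, `check(params, bad_body)` rejects the method because `b`'s variable does not extend `a`'s, while `check(params, good_body)` accepts the variant that passes `a` as "this".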
Expressiveness of iTalX
iTalX is capable of handling the following features:
Classes, interfaces, generics, and multiple inheritance
Dynamic dispatch and dynamic casts
Covariant arrays as classes, and array-bounds checks
By-reference parameters (ref), structs, and value types
Jump tables and complex stack manipulation
iTalX is also robust with respect to many optimizations
iTalX should be able to handle the remaining features:
Delegates and exceptions
In experiments, iTalX currently verifies 97.9% of methods
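As a sanity check on the 97.9% figure, the sketch below computes the geometric mean of the per-benchmark values from the Effectiveness of Type Inference slide. It assumes that the seven labeled bar values are the seven per-benchmark results and that the headline figure is their geometric mean; the transcript does not state either point explicitly.

```python
import math

# Assumption: these seven values are the per-benchmark bars
# (ahcbench, mandelform, sat_solver, zinger, lcscbench, bartok, asmlc).
percentages = [100.0, 100.0, 100.0, 94.6, 97.6, 97.4, 96.1]


def geometric_mean(values):
    """Geometric mean computed through logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


geomean = geometric_mean(percentages)
print(f"{geomean:.1f}%")  # prints 97.9%
```

Under those assumptions the result agrees with the figure quoted on the slide.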
Efficiency of iTalX
[Bar chart: inference time as a percentage of compilation time, per benchmark, on a 0% to 100% axis. Labeled bar values: 34.7%, 1.9%, 1.3%, 9.3%, 13.7%, 13.9%, 16.0%. Bar labels: ahcbench, mandelform, sat_solver, zinger, lcscbench, bartok, asmlc, Geomean.]
Inferring assembly-level types is affordable
Type Annotation Size
[Bar chart: iTalX annotation size as a percentage of traditional TAL [PLDI ’08] annotation size, on a 0% to 100% axis. Labeled bar values: 79%, 59%, 56%, 40%, 23%, 23%, 28%.]
Type annotation size is significantly reduced
Implementation Burden of TAL
Changes to an Existing Compiler (Bartok)
• Type Preservation [PLDI ‘08]: 19,000 lines of code, cut across code base
• Assembly-Level Type Inference: 5,000 lines of code, modular addition to code base
Type Checker + Type Inference
• Type Preservation [PLDI ‘08]: 13,800 lines of code
• Assembly-Level Type Inference: 15,100 lines of code, could be separated to reduce trusted computing base
Conclusion
Type inference at the assembly level is
expressive enough to verify C# with optimizations
flexible enough to accommodate new language features
efficient enough to use regularly during compilation
compact enough to include in executable binaries
modular enough to retrofit existing compilers with