Decoupled Lifeguards: Enabling Path Optimizations
for Dynamic Correctness Checking Tools
Olatunji Ruwase*, Shimin Chen+, Phillip B. Gibbons+, Todd C. Mowry*
*School of Computer Science, Carnegie Mellon University
+Intel Labs Pittsburgh
Carnegie Mellon
Ruwase, Chen, Gibbons and Mowry
Bug detection using Lifeguards
program
Lifeguard
Detect errors by monitoring execution of unmodified binary
Exploit instruction-grained runtime information
Block exploits before software patch
[Savage et al. '97, Newsome & Song '05, Nethercote et al. '07]
Significant program slowdown
10-100X using Dynamic Binary Instrumentation (DBI)
Valgrind, PIN, DynamoRIO
Why instruction-grained Lifeguards are slow
program              TaintCheck lifeguard
mov %eax, A          taint(eax) = taint(A)
add %eax, B          taint(eax) |= taint(B)
mov C, %eax          taint(C) = taint(eax)
cmp %ecx, %eax
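The per-instruction taint rules on this slide can be modeled in a few lines. The following is a minimal sketch, not the lifeguard's actual implementation: the dictionary `taint`, the helper `t`, and the function `handle` are hypothetical names, instructions are written destination-first as on the slide, and `cmp` is given no handler because the slide shows none for it.

```python
# Hypothetical model of instruction-grained taint propagation.
taint = {}  # location name -> bool; a missing entry means "untainted"

def t(loc):
    """Read the taint bit of a register or memory location."""
    return taint.get(loc, False)

def handle(op, dest, src):
    """Update taint metadata for one destination-first instruction."""
    if op == "mov":
        taint[dest] = t(src)             # taint(dest) = taint(src)
    elif op == "add":
        taint[dest] = t(dest) or t(src)  # taint(dest) |= taint(src)
    # cmp: no metadata update, matching the empty handler on the slide

taint["B"] = True  # assume B holds untrusted input
path = [("mov", "eax", "A"), ("add", "eax", "B"),
        ("mov", "C", "eax"), ("cmp", "ecx", "eax")]
for instr in path:
    handle(*instr)

print(taint["C"])  # True: the taint of B reaches C through eax
```

Every program instruction triggers one call to `handle`, which is the per-instruction cost the talk sets out to reduce.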
Handler for memory-to-register copy instruction:
  mov %ecx, %eax
  shr %ecx, $16
  mov %ecx, level1_index(,%ecx,4)
  and %eax, 0xffff
  shr %eax, $2
  mov %eax, (%eax,%ecx,1)
  mov reg_taint(%edx), %al
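The handler's address arithmetic reads like a two-level metadata table lookup: the upper 16 address bits select a chunk through `level1_index`, and the lower 16 bits, shifted right by 2, select a byte inside that chunk. The sketch below models that lookup under stated assumptions: one metadata byte per 4-byte word is inferred from the shift by 2, chunks are allocated lazily, and the function names `metadata_slot`, `load_taint`, and `store_taint` are hypothetical.

```python
CHUNK_ENTRIES = (0xFFFF >> 2) + 1  # 16384 metadata bytes per 64 KB region

level1_index = {}  # upper 16 address bits -> bytearray chunk (assumed lazy)

def metadata_slot(addr):
    """Return (chunk, offset) for the metadata byte of a 32-bit address."""
    hi = (addr >> 16) & 0xFFFF   # shr %ecx, $16
    lo = (addr & 0xFFFF) >> 2    # and %eax, 0xffff ; shr %eax, $2
    chunk = level1_index.setdefault(hi, bytearray(CHUNK_ENTRIES))
    return chunk, lo

def load_taint(addr):
    """Read the metadata byte covering addr."""
    chunk, off = metadata_slot(addr)
    return chunk[off]

def store_taint(addr, value):
    """Write the metadata byte covering addr."""
    chunk, off = metadata_slot(addr)
    chunk[off] = value

store_taint(0x12345678, 1)
print(load_taint(0x1234567B), load_taint(0x1234567C))  # same word, next word
```

Addresses `0x12345678` through `0x1234567B` share one metadata byte in this model, while `0x1234567C` falls in the next word.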
Why instruction-grained Lifeguards are slow
[Figure: the program's instructions (mov, add, mov, cmp) interleaved with the TaintCheck lifeguard's handlers, with execution-context switches between program and lifeguard code]
Optimizing Lifeguard code on program paths is hard
Instrumented program path:
  mov %ecx, %eax
  shr %ecx, $16
  mov %ecx, level1_index(,%ecx,4)
  and %eax, 0xffff
  shr %eax, $2
  mov %eax, (%eax,%ecx,1)
  mov reg_taint(%edx), %al
  [Switch execution context]
  mov %eax, A
  [Switch execution context]
  mov %ecx, %eax
  shr %ecx, $16
  mov %ecx, level1_index(,%ecx,4)
  and %eax, 0xffff
  shr %eax, $2
  mov %eax, (%eax,%ecx,1)
  or reg_taint(%edx), %al
  [Switch execution context]
  add %eax, B
  mov %ecx, %eax
  shr %ecx, $16
  mov %ecx, level1_index(,%ecx,4)
  and %eax, 0xffff
  shr %eax, $2
  mov %eax, (%eax,%ecx,1)
  mov %al, reg_taint(%edx)
  [Switch execution context]
  mov C, %eax
  cmp %ecx, %eax
Key obstacle is tight coupling of program & Lifeguard code
Decoupling Lifeguard execution
[Figure labels: Instrumented program path; Unoptimized path handler]
Lifeguard-specific optimizations on program path
x86 instruction count of TaintCheck handler for mcf path:
  Original              86
  Standard path opts    81 (95%)
  Lifeguard path opts   47 (55%)
[Figure: instruction handlers along the program path are composed into an unoptimized path handler, which is then optimized into an optimized path handler]
Outline
Dynamic path optimization of Decoupled Lifeguards
Decoupling Lifeguards: Challenges and Solutions
Using lifeguard domain knowledge for path optimizations
Evaluation
Conclusions
Decoupling Lifeguards: Challenges and Solutions
Issue 1: When to run Lifeguard code
At the end of the path, where data is available
[Figure labels: Program path; Optimized path handler]
Decoupling Lifeguards: Challenges and Solutions
Issue 2: How to pass data to Lifeguard
Marshal data from the program path into a buffer that the optimized path handler reads
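The two issues above (run the lifeguard code at the end of the path, and pass it data through a buffer) can be illustrated with a small model. This is a sketch under assumptions, not the authors' mechanism: the functions `program_path` and `path_handler`, the use of a Python list as the buffer, and the two logged addresses are all hypothetical.

```python
def program_path(regs, log):
    """Stand-in for the uninstrumented path: it only marshals the
    effective addresses that the path handler will need later."""
    log.append(regs["ebp"] - 0x24)  # address of the first memory operand
    log.append(regs["ebp"] + 0x1C)  # address of a later memory operand

def path_handler(log, mem_taint, reg_taint):
    """Runs once at the end of the path, consuming the buffered addresses."""
    a, d = log
    reg_taint["esi"] = reg_taint.get("esi", False) or mem_taint.get(a, False)
    reg_taint["ebx"] = mem_taint.get(d, False)
    log.clear()  # the buffer is reused for the next path

regs = {"ebp": 0x1000}
mem_taint = {0x1000 - 0x24: True}  # assume this stack slot is tainted
reg_taint = {}
log = []

program_path(regs, log)
path_handler(log, mem_taint, reg_taint)
print(reg_taint)
```

Because the handler runs as one unit after the path, its body can be optimized as a whole instead of one instruction handler at a time.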
Decoupling Lifeguards: Challenges and Solutions
Challenge 1: How to handle side exits
[Figure: a program path through blocks 1-4, with the optimized path handler for the full path and a separate path handler for side exits]
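One way to picture handlers for side exits is to keep one handler version per possible exit point and pick the version that matches where execution actually left the path. The sketch below assumes exactly that; the per-block updates in `BLOCK_UPDATES`, the prefix-based construction in `make_handler`, and the dispatch function `on_path_exit` are hypothetical and are not taken from the talk.

```python
# Hypothetical metadata updates for a four-block path: (dest, sources).
BLOCK_UPDATES = {
    1: [("eax", ["A"])],
    2: [("eax", ["eax", "B"])],
    3: [("C", ["eax"])],
    4: [("edx", ["C"])],
}

def make_handler(last_block):
    """Build the handler version for a path that exits after last_block."""
    updates = [u for b in range(1, last_block + 1) for u in BLOCK_UPDATES[b]]
    def handler(taint):
        for dest, srcs in updates:
            taint[dest] = any(taint.get(s, False) for s in srcs)
        return taint
    return handler

# One handler version per exit point; version 4 covers the full path.
HANDLERS = {b: make_handler(b) for b in BLOCK_UPDATES}

def on_path_exit(exit_block, taint):
    """Run the handler version matching the block where the path ended."""
    return HANDLERS[exit_block](taint)

print(on_path_exit(2, {"B": True}))  # side exit after block 2
print(on_path_exit(4, {"B": True}))  # full path
```

In this model a side exit after block 2 never applies the updates of blocks 3 and 4, so metadata reflects only the instructions that actually ran.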
Decoupling Lifeguards: Challenges and Solutions
Challenge 2: How to contain errors in the path
See paper for details of solution based on:
1. Page protection to prevent data corruption
2. Completing checks at function & system calls and indirect jumps
[Figure labels: Program path; Optimized path handler]
Lifeguard optimization opportunities
TaintCheck handler for mcf path:
  taint(esi) = taint(esi) | taint(A)
  taint(edx) = taint(edi)
  taint(edi) = taint(esi)
  taint(edi) = taint(edi) | taint(B)
  taint(ecx) = taint(C)
  taint(edi) = taint(edx) | taint(ecx)
  taint(ebx) = taint(D)
  …
6 instructions to access metadata of program memory address:
  mov %ecx, %eax
  shr %ecx, $16
  mov %ecx, level1_index(,%ecx,4)
  and %eax, 0xffff
  shr %eax, $2
  mov %eax, (%eax,%ecx,1)
  mov reg_taint(%edx), %al
1. Alias analysis to reduce metadata accesses
2. Dead metadata update detection to eliminate instruction handlers
Alias analysis for metadata accesses
program (mcf path)           TaintCheck handler for mcf path
add %esi, -0x24[%ebp]        taint(esi) = taint(esi) | taint(A)
mov %edx, %edi               taint(edx) = taint(edi)
mov %edi, %esi               taint(edi) = taint(esi)
sub %edi, -0x24[%ebp]        taint(edi) = taint(edi) | taint(B)
…
mov %ecx, -0x24[%ebp]        taint(ecx) = taint(C)
lea %edi, [%edx,%ecx,1]      taint(edi) = taint(edx) | taint(ecx)
mov %ebx, 0x1c[%ebp]         taint(ebx) = taint(D)
…                            …
Alias analysis for metadata accesses
program (mcf path)           TaintCheck handler for mcf path, aliases resolved
add %esi, -0x24[%ebp]        taint(esi) = taint(esi) | taint(A)
mov %edx, %edi               taint(edx) = taint(edi)
mov %edi, %esi               taint(edi) = taint(esi)
sub %edi, -0x24[%ebp]        taint(edi) = taint(edi) | taint(A)
…
mov %ecx, -0x24[%ebp]        taint(ecx) = taint(A)
lea %edi, [%edx,%ecx,1]      taint(edi) = taint(edx) | taint(ecx)
mov %ebx, 0x1c[%ebp]         taint(ebx) = taint(A+64)
…                            …
Enables metadata access CSE optimization described in paper
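The renaming on this slide follows from displacement arithmetic: all four memory operands use `%ebp` as their base, so operands with equal displacements name the same address, and `0x1c - (-0x24) = 64` gives `A+64`. The sketch below reproduces that naming under the assumption that the base register is not written anywhere on the path; `parse_operand` and `name_addresses` are hypothetical helpers that handle only the `disp[%reg]` operand form.

```python
import re

def parse_operand(op):
    """Parse 'disp[%reg]' into (reg, disp); other operand forms are rejected."""
    m = re.fullmatch(r"(-?0x[0-9a-fA-F]+)\[%(\w+)\]", op)
    if m is None:
        raise ValueError(f"unsupported operand: {op}")
    return m.group(2), int(m.group(1), 16)

def name_addresses(operands):
    """Name each operand relative to the first operand with the same base.
    Assumes no base register is modified on the path."""
    first = {}               # base register -> (symbol, displacement)
    symbols = iter("ABCDEFGH")
    names = []
    for op in operands:
        reg, disp = parse_operand(op)
        if reg not in first:
            first[reg] = (next(symbols), disp)
        sym, base_disp = first[reg]
        delta = disp - base_disp
        names.append(sym if delta == 0 else f"{sym}{delta:+d}")
    return names

ops = ["-0x24[%ebp]", "-0x24[%ebp]", "-0x24[%ebp]", "0x1c[%ebp]"]
print(name_addresses(ops))  # ['A', 'A', 'A', 'A+64']
```

Once operands share a symbolic name, repeated metadata address computations for that name become candidates for common subexpression elimination.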
Eliminating dead instruction handlers
TaintCheck handler for mcf path:
  taint(esi) = taint(esi) | taint(A)
  taint(edx) = taint(edi)
  taint(edi) = taint(esi)                   <- dead taint(edi) update
  taint(edi) = taint(edi) | taint(A)        <- dead taint(edi) update
  taint(ecx) = taint(A)
  taint(edi) = taint(edx) | taint(ecx)
  taint(ebx) = taint(A+64)
  …
See paper for details of other optimizations, e.g. eliminating loop redundancies
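Dead metadata updates like the two `taint(edi)` assignments can be found with a single backward pass over the path handler: an update is dead if its destination is overwritten later without being read in between. The following sketch applies that idea to the seven updates listed on the slide. It is an illustration rather than the paper's algorithm: `eliminate_dead_updates` is a hypothetical name, location names are compared as exact strings, and all metadata is assumed live at the end of the path.

```python
def eliminate_dead_updates(updates):
    """Drop updates whose destination is rewritten before any later read.
    `updates` is a list of (dest, sources) pairs in program order."""
    overwritten = set()  # locations written later with no read in between
    kept = []
    for dest, srcs in reversed(updates):
        if dest in overwritten:
            continue                         # dead: value is never observed
        kept.append((dest, srcs))
        overwritten.add(dest)                # earlier writes to dest are dead...
        overwritten.difference_update(srcs)  # ...unless this update reads them
    kept.reverse()
    return kept

handler = [
    ("esi", ["esi", "A"]),
    ("edx", ["edi"]),
    ("edi", ["esi"]),
    ("edi", ["edi", "A"]),
    ("ecx", ["A"]),
    ("edi", ["edx", "ecx"]),
    ("ebx", ["A+64"]),
]
for dest, srcs in eliminate_dead_updates(handler):
    print(f"taint({dest}) = " + " | ".join(f"taint({s})" for s in srcs))
```

On this input the pass keeps five of the seven updates and removes exactly the two `taint(edi)` updates that precede the final `taint(edi)` assignment.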
Evaluation
Lifeguards
AddrCheck: unallocated memory access
Eraser: concurrency errors
MemCheck: AddrCheck + uninitialized read errors
TaintCheck: security errors
Lifeguard instrumentation platforms
DBI (Valgrind) & hardware-accelerated (LBA)
Decoupled lifeguard code on program paths of up to 8 branches
Lifeguard overhead reduction in Valgrind
[Figure: two panels (AddrCheck, MemCheck) plotting execution time normalized to coupled Lifeguard for Standard path optimizations (SPO) and SPO + dead handler elimination (DHE)]
AddrCheck: 24% reduction
MemCheck: 6% reduction
Limitations to improvements
• Instrumentation overhead
• No metadata access CSE
Results with hardware assisted instrumentation (LBA)
[Figure: four panels (AddrCheck, Eraser, MemCheck, TaintCheck) plotting execution time normalized to coupled Lifeguard for SPO, SPO + DHE, and SPO + DHE + Metadata access CSE; x-axis labels recovered from one panel: blast, pbunzip2, pbzip2, zchaff, Avg]
AddrCheck: 50% reduction
Eraser: 53% reduction
MemCheck: 42% reduction
TaintCheck: 38% reduction
Conclusions
Decoupling: enables optimization of lifeguard code on program paths
• Correctness checking at a path granularity
• Multi-versioned checking code to handle side exits
• Page protection for containing errors
Lifeguard domain knowledge: enables redundancy elimination beyond standard optimizations
• Better alias analysis
• Lifeguard-specific dead code & common subexpression elimination
Lifeguard overhead reductions
• Up to 24% on Valgrind
• Up to 53% on LBA