Decoupled Lifeguards: Enabling Path Optimizations
for Dynamic Correctness Checking Tools
Olatunji Ruwase*, Shimin Chen+, Phillip B. Gibbons+, Todd C. Mowry*
* School of Computer Science, Carnegie Mellon University
+ Intel Labs Pittsburgh
Carnegie Mellon
Ruwase, Chen, Gibbons and Mowry
Bug detection using Lifeguards
[Diagram labels: program, Lifeguard]
• Detect errors by monitoring execution of unmodified binary
• Exploit instruction-grained runtime information
• Block exploits before software patch
[Savage et al. '97, Newsome & Song '05, Nethercote et al. '07]
• Significant program slowdown
  • 10-100X using Dynamic Binary Instrumentation (DBI)
  • Valgrind, PIN, DynamoRIO
Why instruction grained Lifeguards are slow
program            TaintCheck lifeguard
mov %eax, A        taint(eax) = taint(A)
add %eax, B        taint(eax) |= taint(B)
mov C, %eax        taint(C) = taint(eax)
cmp %ecx, %eax
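The propagation rules on this slide can be sketched in a few lines of Python. This is only an illustration of the rules as written above, not TaintCheck's implementation: the `taint` dictionary, the location names, and the `on_mov`/`on_add` helpers are assumptions made for the example.

```python
# Hypothetical taint-propagation sketch; all names are illustrative.
taint = {}  # location name -> bool (True means tainted)

def t(loc):
    """Return the taint of a location; locations are untainted by default."""
    return taint.get(loc, False)

def on_mov(dst, src):
    """mov dst, src  ->  taint(dst) = taint(src)"""
    taint[dst] = t(src)

def on_add(dst, src):
    """add dst, src  ->  taint(dst) |= taint(src)"""
    taint[dst] = t(dst) or t(src)

# Replay the slide's path, assuming memory location B holds untrusted input.
taint["B"] = True
on_mov("eax", "A")   # mov %eax, A
on_add("eax", "B")   # add %eax, B
on_mov("C", "eax")   # mov C, %eax
# cmp %ecx, %eax has no propagation rule on the slide, so nothing runs for it.
print(taint["C"])
```

Each rule is a single line here, which is what makes the per-instruction cost in the following slides notable: the work per rule is small, but it runs for every monitored instruction.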
Why instruction grained Lifeguards are slow
Handler for memory-to-register copy instruction:
mov %ecx, %eax
shr %ecx, $16
mov %ecx, level1_index(,%ecx,4)
and %eax, 0xffff
shr %eax, $2
mov %eax, (%eax,%ecx,1)
mov reg_taint(%edx), %al
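The handler above appears to walk a two-level metadata table: the upper 16 address bits select a first-level entry and the lower 16 bits, shifted right by 2, select a slot inside the second-level chunk. The Python sketch below mirrors that address arithmetic. The dictionary-based `level1_index`, the lazily allocated chunks, and the one-entry-per-4-bytes layout are assumptions inferred from the shifts, not details confirmed by the slides.

```python
# Hypothetical two-level shadow-memory lookup; the layout is an assumption.
CHUNK_ENTRIES = 0x10000 >> 2   # one metadata entry per 4 program bytes

level1_index = {}  # high 16 address bits -> second-level chunk
reg_taint = {}     # register name -> metadata value

def metadata_slot(addr):
    """Map a 32-bit program address to (chunk, offset) in the metadata table."""
    hi = addr >> 16                                   # shr %ecx, $16
    chunk = level1_index.setdefault(hi, bytearray(CHUNK_ENTRIES))
    lo = (addr & 0xFFFF) >> 2                         # and 0xffff ; shr $2
    return chunk, lo

def load_taint(addr):
    chunk, lo = metadata_slot(addr)
    return chunk[lo]

def store_taint(addr, value):
    chunk, lo = metadata_slot(addr)
    chunk[lo] = value

def mem_to_reg_copy(reg, addr):
    """Memory-to-register copy: the register inherits the memory metadata."""
    reg_taint[reg] = load_taint(addr)

store_taint(0x0804A010, 1)
mem_to_reg_copy("eax", 0x0804A012)   # falls in the same 4-byte word
```

Most of the work is address translation rather than the metadata copy itself, which is why repeated accesses to the same or nearby addresses are an attractive optimization target.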
Why instruction grained Lifeguards are slow
[Animation residue. Labels: program, TaintCheck lifeguard. The program instructions (mov %eax, A; add %eax, B; mov C, %eax; cmp %ecx, %eax) are shown interleaved with their TaintCheck handler code, separated by "Switch execution context" steps.]
Optimizing Lifeguard code on program paths is hard
Instrumented program path (program instructions flush left, Lifeguard code indented)
    mov %ecx, %eax
    shr %ecx, $16
    mov %ecx, level1_index(,%ecx,4)
    and %eax, 0xffff
    shr %eax, $2
    mov %eax, (%eax,%ecx,1)
    mov reg_taint(%edx), %al
    Switch execution context
mov %eax, A
    Switch execution context
    mov %ecx, %eax
    shr %ecx, $16
    mov %ecx, level1_index(,%ecx,4)
    and %eax, 0xffff
    shr %eax, $2
    mov %eax, (%eax,%ecx,1)
    or reg_taint(%edx), %al
    Switch execution context
add %eax, B
    mov %ecx, %eax
    shr %ecx, $16
    mov %ecx, level1_index(,%ecx,4)
    and %eax, 0xffff
    shr %eax, $2
    mov %eax, (%eax,%ecx,1)
    mov %al, reg_taint(%edx)
    Switch execution context
mov C, %eax
cmp %ecx, %eax
Key obstacle is tight coupling of program & Lifeguard code
Decoupling Lifeguard execution
[Diagram labels: Instrumented program path, Unoptimized path handler]
Lifeguard specific optimizations on program path
Compose instruction handlers
[Diagram labels: Program path, Unoptimized path handler]
Lifeguard specific optimizations on program path
x86 instruction count of TaintCheck handler for mcf path
Original               86
Standard path opts     81 (95%)
Lifeguard path opts    47 (55%)

Compose instruction handlers
[Diagram labels: Program path, Unoptimized path handler, Optimized path handler]
Outline

Dynamic path optimization of Decoupled Lifeguards

Decoupling Lifeguards: Challenges and Solutions

Using lifeguard domain knowledge for path optimizations

Evaluation

Conclusions
Decoupling Lifeguards: Challenges and Solutions
Issue 1: When to run Lifeguard code
[Diagram labels: Program path, Optimized path handler]
• At end of path where data is available
Decoupling Lifeguards: Challenges and Solutions
Issue 2: How to pass data to Lifeguard
Marshal data
Buffer
Optimized path handler
Program path
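The two issues above (run the Lifeguard code at the end of the path, and pass it data through a buffer) can be pictured with the following Python sketch. It is a toy model under stated assumptions: the `buffer` queue, the tuple layout of the marshaled record, and the function names are all hypothetical, and the handler body simply reuses the three TaintCheck rules from the earlier example path.

```python
from collections import deque

buffer = deque()   # hypothetical buffer between program path and path handler
taint = {}         # location -> bool (True means tainted)

def run_program_path(addr_a, addr_b, addr_c):
    """Program side: marshal the run-time values the handler will need."""
    buffer.append((addr_a, addr_b, addr_c))
    # The original program instructions would execute here without
    # any Lifeguard code interleaved between them.

def path_handler():
    """Lifeguard side: one handler for the whole path, run at its end."""
    a, b, c = buffer.popleft()
    eax = taint.get(a, False)          # taint(eax) = taint(A)
    eax = eax or taint.get(b, False)   # taint(eax) |= taint(B)
    taint[c] = eax                     # taint(C) = taint(eax)
    taint["eax"] = eax

taint[0x2000] = True                   # assume B holds untrusted data
run_program_path(0x1000, 0x2000, 0x3000)
path_handler()
print(taint[0x3000])
```

Because the handler now sees all three rules at once, it can keep the intermediate `eax` metadata in a local variable instead of writing it back after every instruction, which is the kind of path-level optimization decoupling is meant to enable.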
Decoupling Lifeguards: Challenges and Solutions
Challenge 1: How to handle side exits
[Diagram: nodes numbered 1-4 appear on both the program path and the path handler. Labels: Program path, Optimized path handler, Path handler for side exits]
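One way to picture a separate path handler for side exits is to keep one handler version per exit point, each covering only the blocks that actually executed. The Python sketch below illustrates that idea only. The `make_handlers` function and the per-block handler list are hypothetical, and here each version is a plain composition of block handlers, whereas a real system would presumably optimize each version separately.

```python
def make_handlers(block_handlers):
    """Build one handler per exit point: version k covers blocks 0..k."""
    versions = {}
    for k in range(len(block_handlers)):
        prefix = block_handlers[: k + 1]

        def handler(state, prefix=prefix):
            # Run only the handlers for blocks that executed before the exit.
            for h in prefix:
                h(state)
            return state

        versions[k] = handler
    return versions

def make_block_handler(name):
    """Hypothetical per-block handler that just records that it ran."""
    def block_handler(state):
        state.append(name)
    return block_handler

blocks = [make_block_handler(f"block{i}") for i in range(1, 5)]
versions = make_handlers(blocks)

side_exit = versions[1]([])   # the path was left after block 2
full_path = versions[3]([])   # the path ran to completion
print(side_exit, full_path)
```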
Decoupling Lifeguards: Challenges and Solutions
Challenge 2: How to contain errors in the path
See paper for details of solution based on:
1. Page protection to prevent data corruption
2. Completing checks at function & system calls and indirect jumps
Program path
Optimized path handler
Outline

Dynamic path optimization of Decoupled Lifeguards

Decoupling Lifeguards: Challenges and Solutions

Using lifeguard domain knowledge for path optimizations

Evaluation

Conclusion
Lifeguard optimization opportunities
TaintCheck handler for mcf path
taint(esi) = taint(esi) | taint(A)
taint(edx) = taint(edi)
taint(edi) = taint(esi)
taint(edi) = taint(edi) | taint(B)
taint(ecx) = taint(C)
taint(edi) = taint(edx) | taint(ecx)
taint(ebx) = taint(D)
…

1. Alias analysis to reduce metadata accesses
2. Dead metadata update detection to eliminate instruction handlers

6 instructions to access metadata of program memory address:
mov %ecx, %eax
shr %ecx, $16
mov %ecx, level1_index(,%ecx,4)
and %eax, 0xffff
shr %eax, $2
mov %eax, (%eax,%ecx,1)
mov reg_taint(%edx), %al
Alias analysis for metadata accesses
mcf path (program)           TaintCheck handler for mcf path
add %esi, -0x24[%ebp]        taint(esi) = taint(esi) | taint(A)
mov %edx, %edi               taint(edx) = taint(edi)
mov %edi, %esi               taint(edi) = taint(esi)
sub %edi, -0x24[%ebp]        taint(edi) = taint(edi) | taint(B)
…
mov %ecx, -0x24[%ebp]        taint(ecx) = taint(C)
lea %edi, [%edx,%ecx,1]      taint(edi) = taint(edx) | taint(ecx)
mov %ebx, 0x1c[%ebp]         taint(ebx) = taint(D)
…                            …
Alias analysis for metadata accesses
mcf path (program)           TaintCheck handler for mcf path
add %esi, -0x24[%ebp]        taint(esi) = taint(esi) | taint(A)
mov %edx, %edi               taint(edx) = taint(edi)
mov %edi, %esi               taint(edi) = taint(esi)
sub %edi, -0x24[%ebp]        taint(edi) = taint(edi) | taint(A)
…
mov %ecx, -0x24[%ebp]        taint(ecx) = taint(A)
lea %edi, [%edx,%ecx,1]      taint(edi) = taint(edx) | taint(ecx)
mov %ebx, 0x1c[%ebp]         taint(ebx) = taint(A+64)
…                            …
Enables metadata access CSE optimization described in paper
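The renaming from A, B, C, D to A, A, A, A+64 follows from the operands themselves: three accesses use -0x24[%ebp] and the fourth uses 0x1c[%ebp], which is 0x1c - (-0x24) = 64 bytes higher. The Python sketch below reproduces that naming step. It is a simplified illustration, not the paper's analysis: the `analyze` function is hypothetical, and it assumes the base register is not written anywhere along the path.

```python
def analyze(accesses):
    """Name memory operands symbolically, e.g. 'A' or 'A+64'.

    accesses: list of (base_reg, displacement) operands along one path,
    assuming each base register keeps the same value for the whole path.
    """
    first_seen = {}                 # base_reg -> (symbol, first displacement)
    symbols = iter("ABCDEFGH")
    names = []
    for base, disp in accesses:
        if base not in first_seen:
            first_seen[base] = (next(symbols), disp)
        sym, disp0 = first_seen[base]
        delta = disp - disp0        # known constant offset from first access
        names.append(sym if delta == 0 else f"{sym}{delta:+d}")
    return names

# Memory operands of the mcf path, in program order.
mcf = [("ebp", -0x24), ("ebp", -0x24), ("ebp", -0x24), ("ebp", 0x1c)]
print(analyze(mcf))  # ['A', 'A', 'A', 'A+64']
```

Once operands share a symbol, a handler can compute the metadata address for A once and reuse it, which is the opening for the common subexpression elimination mentioned on the slide.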
Eliminating dead instruction handlers
TaintCheck handler for mcf path
taint(esi) = taint(esi) | taint(A)
taint(edx) = taint(edi)
taint(edi) = taint(esi)                 <- dead taint(edi) update
taint(edi) = taint(edi) | taint(A)      <- dead taint(edi) update
taint(ecx) = taint(A)
taint(edi) = taint(edx) | taint(ecx)
taint(ebx) = taint(A+64)
…
See paper for details of other optimizations, e.g. eliminating loop redundancies
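The dead taint(edi) updates can be found with an ordinary backward liveness pass over the handler list. The Python sketch below is a minimal version under conservative assumptions that are mine rather than the paper's: every register's metadata is treated as live at the end of the path, memory metadata is never eliminated, and the `(dst, srcs)` encoding of handlers is hypothetical.

```python
REGS = {"eax", "ebx", "ecx", "edx", "esi", "edi"}

def eliminate_dead(handlers, regs=REGS):
    """Drop register metadata updates that are overwritten before any use.

    handlers: list of (dst, srcs) pairs in program order.
    """
    live = set(regs)              # conservatively live at path exit
    kept = []
    for dst, srcs in reversed(handlers):
        if dst in regs and dst not in live:
            continue              # dead: a later update overwrites it unread
        kept.append((dst, srcs))
        if dst in regs:
            live.discard(dst)     # this update defines dst
        live.update(s for s in srcs if s in regs)
    kept.reverse()
    return kept

path = [
    ("esi", ("esi", "A")),        # taint(esi) = taint(esi) | taint(A)
    ("edx", ("edi",)),            # taint(edx) = taint(edi)
    ("edi", ("esi",)),            # taint(edi) = taint(esi)
    ("edi", ("edi", "A")),        # taint(edi) = taint(edi) | taint(A)
    ("ecx", ("A",)),              # taint(ecx) = taint(A)
    ("edi", ("edx", "ecx")),      # taint(edi) = taint(edx) | taint(ecx)
    ("ebx", ("A+64",)),           # taint(ebx) = taint(A+64)
]
kept = eliminate_dead(path)
print(len(path) - len(kept), "handlers eliminated")
```

On this path the pass removes the two middle taint(edi) updates, because the later `taint(edi) = taint(edx) | taint(ecx)` overwrites them without reading edi in between.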
Evaluation

Lifeguards
• AddrCheck: unallocated memory access
• Eraser: concurrency errors
• MemCheck: AddrCheck + uninitialized read errors
• TaintCheck: security errors

Lifeguard instrumentation platforms
• DBI (Valgrind) & Hardware accelerated (LBA)

Decoupled lifeguard code on program paths of up to 8 branches
Lifeguard overhead reduction in Valgrind
[Figure: two bar-chart panels, AddrCheck and MemCheck. Y-axis: execution time normalized to coupled Lifeguard (0.0 to 1.2). Series: Standard path optimizations (SPO); SPO + dead handler elimination (DHE). Annotations: 24% reduction (AddrCheck panel), 6% reduction (MemCheck panel).]
Limitations to improvements
• Instrumentation overhead
• No metadata access CSE
Results with hardware assisted instrumentation (LBA)
[Figure: four bar-chart panels (AddrCheck, Eraser, MemCheck, TaintCheck). Y-axis: execution time normalized to coupled Lifeguard. X-axis labels recovered: blast, pbunzip2, pbzip2, zchaff, Avg. Series: SPO; SPO + DHE; SPO + DHE + Metadata access CSE. Reduction annotations, in extraction order: 50%, 53%, 42%, 38% (the last one labeled TaintCheck).]
Conclusions

Decoupling: enables optimization of lifeguard code on program paths
• Correctness checking at a path granularity
• Multi-versioned checking code to handle side exits
• Page protection for containing errors

Lifeguard domain knowledge: enables redundancy elimination beyond standard optimizations
• Better alias analysis
• Lifeguard-specific dead code & common subexpression elimination

Lifeguard overhead reductions
• Up to 24% on Valgrind
• Up to 53% on LBA