Decoupled Lifeguards: Enabling Path Optimizations for Dynamic Correctness Checking Tools

Olatunji Ruwase*, Phillip B. Gibbons+, Shimin Chen+, Todd C. Mowry*
*School of Computer Science, Carnegie Mellon University
+Intel Labs Pittsburgh

Carnegie Mellon Ruwase, Chen, Gibbons and Mowry

Bug detection using Lifeguards

[Diagram labels: program, Lifeguard]

Detect errors by monitoring execution of unmodified binary
Exploit instruction-grained runtime information
Block exploits before software patch
[Savage et al. ’97, Newsome & Song ’05, Nethercote et al. ’07]
Significant program slowdown
    10 - 100X using Dynamic Binary Instrumentation (DBI)
    Valgrind, PIN, DynamoRIO

Why instruction grained Lifeguards are slow

    program             TaintCheck lifeguard
    mov %eax A          taint(eax) = taint(A)
    add %eax B          taint(eax) |= taint(B)
    mov C %eax          taint(C) = taint(eax)
    cmp %ecx, %eax

Handler for memory-to-register copy instruction:

    mov %ecx %eax
    shr %ecx $16
    mov %ecx level1_index(,%ecx,4)
    and %eax 0xffff
    shr %eax $2
    mov %eax (%eax,%ecx,1)
    mov reg_taint(%edx) %al

Why instruction grained Lifeguards are slow

[Figure: the program and TaintCheck lifeguard columns above, with the handler code and "Switch execution context" steps interleaved between the program instructions]

Optimizing Lifeguard code on program paths is hard

Instrumented program path:

    mov %ecx %eax
    shr %ecx $16
    mov %ecx level1_index(,%ecx,4)
    and %eax 0xffff
    shr %eax $2
    mov %eax (%eax,%ecx,1)
    mov reg_taint(%edx) %al
    Switch execution context
    mov %eax A
    Switch execution context
    mov %ecx %eax
    shr %ecx $16
    mov %ecx level1_index(,%ecx,4)
    and %eax 0xffff
    shr %eax $2
    mov %eax (%eax,%ecx,1)
    or reg_taint(%edx) %al
    Switch execution context
    add %eax B
    mov %ecx %eax
    shr %ecx $16
    mov %ecx level1_index(,%ecx,4)
    and %eax 0xffff
    shr %eax $2
    mov %eax (%eax,%ecx,1)
    mov %al reg_taint(%edx)
    Switch execution context
    mov C %eax
    cmp %ecx, %eax

Key obstacle is tight coupling of program & Lifeguard code

Decoupling Lifeguard execution

[Diagram labels: Unoptimized path handler, Instrumented program path]

Lifeguard specific optimizations on program path

Compose instruction handlers
[Diagram labels: Program path, Unoptimized path handler, Optimized path handler]

x86 instruction count of TaintCheck handler for mcf path:

    Original               86
    Standard path opts     81 (95%)
    Lifeguard path opts    47 (55%)

Outline

Dynamic path optimization of Decoupled Lifeguards
Decoupling Lifeguards: Challenges and Solutions
Using lifeguard domain knowledge for path optimizations
Evaluation
Conclusions

Decoupling Lifeguards: Challenges and Solutions

Issue 1: When to run Lifeguard code
    At end of path where data is available
[Diagram labels: Optimized path handler, Program path]

Decoupling Lifeguards: Challenges and Solutions

Issue 2: How to pass data to Lifeguard
[Diagram labels: Marshall data, Buffer, Optimized path handler, Program path]

Decoupling Lifeguards: Challenges and Solutions

Challenge 1: How to handle side exits
[Diagram: blocks numbered 1 to 4; labels: Optimized path handler, Path handler for side exits, Program path]

Decoupling Lifeguards: Challenges and Solutions

Challenge 2: How to contain errors in the path
See paper for details of solution based on:
1. Page protection to prevent data corruption
2. Completing checks at function & system calls and indirect jumps
[Diagram labels: Program path, Optimized path handler]

Outline

Dynamic path optimization of Decoupled Lifeguards
Decoupling Lifeguards: Challenges and Solutions
Using lifeguard domain knowledge for path optimizations
Evaluation
Conclusion

Lifeguard optimization opportunities

TaintCheck handler for mcf path:

    taint(esi) = taint(esi) | taint(A)
    taint(edx) = taint(edi)
    taint(edi) = taint(esi)
    taint(edi) = taint(edi) | taint(B)
    taint(ecx) = taint(C)
    taint(edi) = taint(edx) | taint(ecx)
    taint(ebx) = taint(D)
    …

6 instructions to access metadata of program memory address:

    mov %ecx %eax
    shr %ecx $16
    mov %ecx level1_index(,%ecx,4)
    and %eax 0xffff
    shr %eax $2
    mov %eax (%eax,%ecx,1)
    mov reg_taint(%edx) %al

1. Alias analysis to reduce metadata accesses
2. Dead metadata update detection to eliminate instruction handlers

Alias analysis for metadata accesses

    mcf path (program)             TaintCheck handler for mcf path
    add %esi -0x24[%ebp]           taint(esi) = taint(esi) | taint(A)
    mov %edx %edi                  taint(edx) = taint(edi)
    mov %edi %esi                  taint(edi) = taint(esi)
    sub %edi -0x24[%ebp]           taint(edi) = taint(edi) | taint(B)
    …
    mov %ecx -0x24[%ebp]           taint(ecx) = taint(C)
    lea %edi [%edx,%ecx,1]         taint(edi) = taint(edx) | taint(ecx)
    mov %ebx 0x1c[%ebp]            taint(ebx) = taint(D)
    …                              …

Alias analysis for metadata accesses

    mcf path (program)             TaintCheck handler for mcf path
    add %esi -0x24[%ebp]           taint(esi) = taint(esi) | taint(A)
    mov %edx %edi                  taint(edx) = taint(edi)
    mov %edi %esi                  taint(edi) = taint(esi)
    sub %edi -0x24[%ebp]           taint(edi) = taint(edi) | taint(A)
    …
    mov %ecx -0x24[%ebp]           taint(ecx) = taint(A)
    lea %edi [%edx,%ecx,1]         taint(edi) = taint(edx) | taint(ecx)
    mov %ebx 0x1c[%ebp]            taint(ebx) = taint(A+64)
    …                              …

Enables metadata access CSE optimization described in paper

Eliminating dead instruction handlers

TaintCheck handler for mcf path:

    taint(esi) = taint(esi) | taint(A)
    taint(edx) = taint(edi)
    taint(edi) = taint(esi)                  (dead taint(edi) update)
    taint(edi) = taint(edi) | taint(A)       (dead taint(edi) update)
    taint(ecx) = taint(A)
    taint(edi) = taint(edx) | taint(ecx)
    taint(ebx) = taint(A+64)
    …

See paper for details of other optimizations, e.g., eliminating loop redundancies

Evaluation

Lifeguards
    AddrCheck: unallocated memory access
    Eraser: concurrency errors
    MemCheck: AddrCheck + uninitialized read errors
    TaintCheck: security errors
Lifeguard instrumentation platforms
    DBI (Valgrind) & Hardware accelerated (LBA)
Decoupled lifeguard code on program paths of up to 8 branches

Lifeguard overhead reduction in Valgrind

[Charts: AddrCheck and MemCheck panels; y-axis: execution time normalized to coupled Lifeguard; series: Standard path optimizations (SPO), SPO + dead handler elimination (DHE)]

AddrCheck: 24% reduction
MemCheck: 6% reduction
Limitations to improvements
• Instrumentation overhead
• No metadata access CSE

Results with hardware assisted instrumentation (LBA)

[Charts: AddrCheck, Eraser, MemCheck and TaintCheck panels; y-axis: execution time normalized to coupled Lifeguard; series: SPO, SPO + DHE, SPO + DHE + Metadata access CSE; benchmark labels shown: blast, pbunzip2, pbzip2, zchaff, Avg]

AddrCheck: 50% reduction
Eraser: 53% reduction
MemCheck: 42% reduction
TaintCheck: 38% reduction

Conclusions

Decoupling: enables optimization of lifeguard code on program paths
    Correctness checking at a path granularity
    Multi-versioned checking code to handle side exits
    Page protection for containing errors
Lifeguard domain knowledge: enable redundancy elimination beyond standard optimizations
    Better alias analysis
    Lifeguard-specific dead code & common subexpression elimination
Lifeguard overhead reductions
    Up to 24% on Valgrind
    Up to 53% on LBA
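
The dead handler elimination shown on the "Eliminating dead instruction handlers" slide can be illustrated with a small backward pass over a path handler. This is only a sketch under stated assumptions, not the implementation from the paper: each handler is modeled as a hypothetical (dest, sources) pair meaning taint(dest) = OR of taint(sources), alias analysis is assumed to have already given every location a unique name, all metadata is conservatively treated as live at the end of the path, and side exits and lifeguard checks are ignored. The names `Handler`, `eliminate_dead_handlers` and `MCF_PATH` are illustrative.

```python
from typing import List, Set, Tuple

# One metadata update: taint(dest) = taint(src_1) | taint(src_2) | ...
# (hypothetical representation chosen for this sketch)
Handler = Tuple[str, Tuple[str, ...]]


def eliminate_dead_handlers(path: List[Handler]) -> List[Handler]:
    """Drop updates whose result is overwritten before any later read on the path.

    Assumption: every location is live at the end of the path, so an update
    is dead only if a later handler on the same path overwrites it first.
    """
    overwritten: Set[str] = set()  # rewritten later with no read in between
    kept: List[Handler] = []
    for dest, srcs in reversed(path):
        if dest in overwritten:
            continue  # dead: nothing reads this value before it is replaced
        kept.append((dest, srcs))
        overwritten.add(dest)                # earlier writes to dest are now dead
        overwritten.difference_update(srcs)  # unless this handler reads them
    kept.reverse()
    return kept


# The TaintCheck handler sequence for the mcf path, after alias analysis.
MCF_PATH: List[Handler] = [
    ("esi", ("esi", "A")),
    ("edx", ("edi",)),
    ("edi", ("esi",)),
    ("edi", ("edi", "A")),
    ("ecx", ("A",)),
    ("edi", ("edx", "ecx")),
    ("ebx", ("A+64",)),
]

optimized = eliminate_dead_handlers(MCF_PATH)
for dest, srcs in optimized:
    print(f"taint({dest}) = " + " | ".join(f"taint({s})" for s in srcs))
```

On this path the pass drops the two taint(edi) updates that the slide marks as dead and keeps the other five handlers. A real lifeguard would also have to keep updates that are observable at side exits or by checks, which this sketch does not model.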