Dyninst tutorial slides

Using Dyninst for Program Binary
Analysis and Instrumentation
Emily Jacobson
Paradyn Project
Paradyn / Dyninst Week
Madison, Wisconsin
April 29 - May 1, 2013
No Source Code — No Problem
Executables: a.out, prog.exe
Libraries: lib.so, lib.dll
With Dyninst we can:
o Find (stripped) code
  o in program binaries
  o in live processes
o Analyze code
  o functions
  o control-flow graphs
  o loop, dominator analyses
o Instrument code
  o statically (rewrite binary)
  o dynamically (instrument live process)
[Diagram: a live process made up of an executable and libraries 1 through N]
Using Dyninst for Analysis and Instrumentation
Choice of Static vs. Dynamic Instrumentation
Static Rewriting
o Amortize parsing and instrumentation time.
o Potential to generate more efficient modified binaries.
o 1st party response to runtime events
Dynamic Instrumentation
o Execute instrumentation at a particular time (oneTimeCode).
o Insert and remove instrumentation at run time.
o 3rd party response to runtime events
Example Dyninst Program
• Find memory leaks
• Add printfs to malloc, free
• Stackwalk malloc calls that are not freed
ChaosPro ver 3.1
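The leak-finding idea above comes down to bookkeeping on the mutator side: remember every address that malloc returns, forget it when free sees it, and report whatever is left. The following is a minimal sketch in plain C++ that does not use Dyninst at all. `LeakTracker`, `onMalloc`, and `onFree` are hypothetical names, and the integer `callSite` is only a stand-in for the stack walk a real tool would record.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical record for one live allocation.
// callSite stands in for a full stack walk of the malloc call.
struct Allocation {
    std::size_t size;
    std::uintptr_t callSite;
};

class LeakTracker {
public:
    // Call when malloc returns: remember the block and who asked for it.
    void onMalloc(std::uintptr_t addr, std::size_t size, std::uintptr_t callSite) {
        if (addr != 0) live_[addr] = Allocation{size, callSite};
    }

    // Call when free is entered: the block is no longer outstanding.
    // Freeing an unknown address is silently ignored.
    void onFree(std::uintptr_t addr) { live_.erase(addr); }

    // Addresses that were allocated but never freed, in ascending order.
    std::vector<std::uintptr_t> leaks() const {
        std::vector<std::uintptr_t> out;
        for (const auto& kv : live_) out.push_back(kv.first);
        return out;
    }

    // Total size of all outstanding blocks.
    std::size_t leakedBytes() const {
        std::size_t total = 0;
        for (const auto& kv : live_) total += kv.second.size;
        return total;
    }

private:
    std::map<std::uintptr_t, Allocation> live_;
};
```

In a Dyninst-based tool, `onMalloc` would be driven by instrumentation at malloc's exit points and `onFree` by instrumentation at free's entry point.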
Dyninst Components
Input: Binary Code
Requests: Analysis Requests, Instrumentation Requests, Stack Walk Requests
Components:
• Symbol Table Parser (SymtabAPI)
• Instruction Decoder (InstructionAPI)
• Code Parser (ParsingAPI)
• Stack Walker (StackwalkerAPI)
• Process Controller (ProcControlAPI)
• Instrumenter
• Code Generator
Process Control
• Several supported operating systems: Linux, Windows
• Broad functionality
  • Attach/create process
  • Monitor process status changes
  • Callbacks for fork/exec/exit
  • Mutatee operations: malloc, load library, inferior RPC
• Uses debugger interface
[Diagram: the analyst program (mutator) links the Dyninst library, whose Process Controller drives the monitored process (mutatee) through the debugger interface; the mutatee hosts the Dyninst runtime lib]
Dyninst’s Process Interface
http://paradyn.org/html/manuals.html
...
...
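Example: Create a ChaosPro.exe Process
> mutator.exe C:\Chaos\ChaosPro.exe

#include <cstdio>
#include "BPatch.h"
#include "BPatch_process.h"

BPatch bpatch;  // the mutator's single BPatch object

// Invoked by Dyninst when the mutatee exits.
static void exitCallback(BPatch_thread*, BPatch_exitType) {
    printf("About to exit\n");
}

int main(int argc, char *argv[]) {
    if (argc < 2) {
        fprintf(stderr, "Usage: %s prog_filename\n", argv[0]);
        return 1;
    }
    // Launch the mutatee; argv+1 becomes the mutatee's own argv.
    BPatch_process *proc =
        bpatch.processCreate( argv[1], (const char **)(argv + 1) );
    if (!proc) {
        fprintf(stderr, "Could not create %s\n", argv[1]);
        return 1;
    }
    bpatch.registerExitCallback( exitCallback );
    proc->continueExecution();
    // Block until the mutatee terminates.
    while ( ! proc->isTerminated() )
        bpatch.waitForStatusChange();
    return 0;
}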
Unified Abstractions
BPatch_addressSpace: add/remove instrumentation, lookups by address, allocate variables in mutatee
• BPatch_binaryEdit (a.out, libc.so on disk): write file
• BPatch_process (live process with a.out, libc.so): process state, threads, one-time instrumentation
Symbol Table Parsing
Where are malloc, free?
Supported formats: PE, ELF, XCOFF

Symbol      Address      Size
func1       0x0804cc84   100
variable1   0x0804cd00   4
func2       0x0804cd1d   500

[Diagram: Dyninst components inside the mutator, with the Symbol Table Parser reading the mutatee's chaospro.exe, msvcrt.dll, and runtime lib]
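A symbol table like the one on this slide answers two questions: where does a named symbol live, and which symbol covers a given address. The sketch below is a toy model in plain C++, not SymtabAPI. `Symbol`, `SymbolTable`, and `slideTable` are hypothetical names, and the only data used are the three rows shown on the slide.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

struct Symbol {
    std::string name;
    std::uint64_t address;
    std::uint64_t size;
};

// Toy symbol table that keeps its symbols sorted by address.
class SymbolTable {
public:
    explicit SymbolTable(std::vector<Symbol> syms) : syms_(std::move(syms)) {
        std::sort(syms_.begin(), syms_.end(),
                  [](const Symbol& a, const Symbol& b) { return a.address < b.address; });
    }

    // Linear search by name; returns nullptr if the name is absent.
    const Symbol* findByName(const std::string& name) const {
        for (const auto& s : syms_)
            if (s.name == name) return &s;
        return nullptr;
    }

    // Binary search for the symbol whose [address, address + size) contains addr.
    const Symbol* findByAddress(std::uint64_t addr) const {
        auto it = std::upper_bound(
            syms_.begin(), syms_.end(), addr,
            [](std::uint64_t a, const Symbol& s) { return a < s.address; });
        if (it == syms_.begin()) return nullptr;  // addr is below every symbol
        --it;                                     // last symbol starting at or before addr
        return (addr - it->address < it->size) ? &*it : nullptr;
    }

private:
    std::vector<Symbol> syms_;
};

// The three rows from the slide's example table.
SymbolTable slideTable() {
    std::vector<Symbol> rows;
    rows.push_back(Symbol{"func1", 0x0804cc84, 100});
    rows.push_back(Symbol{"variable1", 0x0804cd00, 4});
    rows.push_back(Symbol{"func2", 0x0804cd1d, 500});
    return SymbolTable(rows);
}
```

Lookup by address is the direction a stack walker needs when it turns return addresses into function names.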
Example: Find malloc
int main(int argc, char *argv[])
{
    ...
    BPatch_image* image = proc->getImage();
    // true = substring match, so "msvcrt" also matches "msvcrt.dll"
    BPatch_module* libc = image->findModule( "msvcrt", true );
    vector< BPatch_function* > * funcs =
        libc->findFunction( "malloc" );
    BPatch_function * bp_malloc = (*funcs)[0];
    Address start = bp_malloc->getBaseAddr();
    Address size = bp_malloc->getSize();
    printf( "malloc: [%lx %lx]\n",
            (unsigned long) start, (unsigned long) (start + size) );
    ...
}
[Diagram: the mutator with the Dyninst library; the mutatee with chaospro.exe, msvcrt.dll, and the runtime lib]
Decoding and Parsing of Binary Code
Get parameters, return values for malloc, free
[Diagram: Dyninst components inside the mutator, with the Instruction Decoder and Code Parser reading the mutatee's chaospro.exe, msvcrt.dll, and runtime lib]
Instruction Decoding
Supported architectures: IA32, AMD64, POWER

Raw bytes in the mutatee:
8b 04 99 20 e9 3d e0
09 e8 68 c0 45 be 79
5e 80 89 08 27 c0 73
1c 88 48 6a d8 6a d0
56 4b fe 92 57 af 40
0c b6 f2 64 32 f5 07
57 af 40 0c b6 f2 64
32 f5 07 b6 66 21 0c
85 a5 94 2b 20 fd 5b

Instruction Decoder output:
mov eax -> [ebx * 4 + ecx]

Abstract Syntax Tree:
mov
  eax
  [ebx * 4 + ecx]
    deref
      add
        mult
          ebx
          4
        ecx
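The operand tree on this slide, deref(add(mult(ebx, 4), ecx)), can be modeled as a small expression tree that is evaluated against register and memory state. The sketch below is a toy in plain C++ and does not use InstructionAPI. `Expr`, `Machine`, `eval`, and the helper functions are hypothetical names introduced only for illustration.

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <string>

// Toy expression tree for an operand such as [ebx * 4 + ecx].
struct Expr {
    enum Kind { Reg, Imm, Add, Mult, Deref };
    Kind kind;
    std::string regName;             // used when kind == Reg
    std::uint32_t value;             // used when kind == Imm
    std::shared_ptr<Expr> lhs, rhs;  // children; Deref uses lhs only
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr makeNode(Expr::Kind k, ExprPtr l = nullptr, ExprPtr r = nullptr) {
    auto e = std::make_shared<Expr>();
    e->kind = k;
    e->value = 0;
    e->lhs = l;
    e->rhs = r;
    return e;
}
ExprPtr makeReg(const std::string& name) {
    auto e = makeNode(Expr::Reg);
    e->regName = name;
    return e;
}
ExprPtr makeImm(std::uint32_t v) {
    auto e = makeNode(Expr::Imm);
    e->value = v;
    return e;
}

// Register file and sparse memory (address -> 32-bit word).
struct Machine {
    std::map<std::string, std::uint32_t> regs;
    std::map<std::uint32_t, std::uint32_t> mem;
};

// Evaluate the tree bottom-up; throws std::out_of_range on unknown registers or addresses.
std::uint32_t eval(const ExprPtr& e, const Machine& m) {
    switch (e->kind) {
        case Expr::Reg:   return m.regs.at(e->regName);
        case Expr::Imm:   return e->value;
        case Expr::Add:   return eval(e->lhs, m) + eval(e->rhs, m);
        case Expr::Mult:  return eval(e->lhs, m) * eval(e->rhs, m);
        case Expr::Deref: return m.mem.at(eval(e->lhs, m));
    }
    return 0;  // not reached
}

// The operand from the slide: deref(add(mult(ebx, 4), ecx)).
ExprPtr slideOperand() {
    return makeNode(Expr::Deref,
                    makeNode(Expr::Add,
                             makeNode(Expr::Mult, makeReg("ebx"), makeImm(4)),
                             makeReg("ecx")));
}
```

Evaluating the `lhs` child of the result of `slideOperand()` yields the effective address, and evaluating the whole tree yields the value stored there.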
Parsing
Parse-time analyses:
• Identify basic blocks, functions
• Build control-flow graph
• Operate on stripped code, but use symbol information opportunistically
[Diagram: raw bytes from the mutatee pass through the Instruction Decoder (mov eax -> [ebx * 4 + ecx] and its abstract syntax tree) into the Code Parser; supported architectures IA32, AMD64, POWER]
Binary Code Parsing
Task: instrument malloc at its entry and exit points, instrument free at its entry point
Subtask: find malloc and parse it

Symbols found in msvcrt.dll by the Symbol Table Parser:
malloc    77C2C407
free      77C2C21B
atoi      77C1BE7B
strcpy    77C46030
memmove   77C472B0

[Diagram: the Process Controller reads the mutatee (chaospro.exe, msvcrt.dll); the Instruction Decoder turns the bytes at malloc into instructions such as mov eax -> [ebx * 4 + ecx], which the Code Parser consumes]
Control Flow Traversal Parsing
• Function symbols may be sparse
• Executables must provide only one function address
• Libraries provide symbols for exported functions
• Parsing finds additional functions by following call edges

Function   Address range
_start     [80483b0 80483fa]
_init      [8048354 804836b]
_fini      [8048580 804859c]
main       [8048480 80484cf]
targ3d4    [80483d4 80483fa]
targ400    [8048400 804843e]
targ440    [8048440 8048468]
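Following call edges to find functions that have no symbols is a worklist traversal: start from the addresses that symbols provide, and every call target seen becomes a new function entry to parse. The sketch below is a toy in plain C++, not the Dyninst parser. `CallMap` and `discoverFunctions` are hypothetical names, and the map of call targets stands in for what decoding each function body would yield.

```cpp
#include <cstdint>
#include <map>
#include <queue>
#include <set>
#include <vector>

using Addr = std::uint32_t;

// Hypothetical stand-in for decoded code:
// function entry address -> call targets found inside that function.
using CallMap = std::map<Addr, std::vector<Addr>>;

// Start from entry addresses known from symbols, follow call edges,
// and return every function entry address reached.
std::set<Addr> discoverFunctions(const CallMap& calls, const std::vector<Addr>& seeds) {
    std::set<Addr> found;
    std::queue<Addr> work;
    for (Addr s : seeds)
        if (found.insert(s).second) work.push(s);

    while (!work.empty()) {
        Addr f = work.front();
        work.pop();
        auto it = calls.find(f);
        if (it == calls.end()) continue;  // no call targets known for this function
        for (Addr target : it->second)
            if (found.insert(target).second)  // first time this target is seen
                work.push(target);
    }
    return found;
}
```

Because each address enters the worklist at most once, recursive and mutually recursive calls do not cause the traversal to loop.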
Control Flow Graph
• Graph elements:
  • BPatch_function
  • BPatch_basicBlock
  • BPatch_edge
• Instrumentation points:
  • BPatch_point
      Address pointAddr;
      BPatch_procedureLocation type;
      enum { BPatch_entry,
             BPatch_exit,
             BPatch_subroutine,
             BPatch_address }
[Diagram: control-flow graphs with entry (E), call (C), and return (R) points marked]
Example: Find malloc’s Exit Points
Parsing is triggered automatically as needed.

vector< BPatch_function * > * funcs;
// Any one of these lookups can be used:
funcs = bp_image->getProcedures();            // every function in the image
funcs = bp_image->findFunction("malloc");     // search all modules
funcs = libc_mod->findFunction("malloc");     // search msvcrt.dll only

BPatch_function * bp_malloc = (*funcs)[0];
// findPoints accepts BPatch_entry, BPatch_subroutine, or BPatch_exit.
vector< BPatch_point* > * points =
    bp_malloc->findPoints( BPatch_exit );

[Diagram: malloc's control-flow graph with entry (E), call (C), and return (R) points; the mutatee contains chaospro.exe, msvcrt.dll, and kernel32.dll]
Instrumentation (at last!)
[Diagram: Dyninst components inside the mutator, with the Instrumenter and Code Generator acting on the mutatee's chaospro.exe, msvcrt.dll, and runtime lib]
Specifying Instrumentation Requests
An instrumentation request has two parts:
• what: a snippet (an abstract syntax tree)
• where: instrumentation points, such as the return (R) points of a function
[Diagram: instrumentation requests flow into the Instrumenter and Code Generator]
BPatch_Snippet Subclasses
• BPatch_sequence( vector< BPatch_Snippet* > items )
• BPatch_variableExpr()
• BPatch_constExpr( int value )
  BPatch_constExpr( char* value )
  BPatch_constExpr( void* value )
• BPatch_ifExpr( BPatch_boolExpr condition,
                 BPatch_Snippet then_clause,
                 BPatch_Snippet else_clause )
• BPatch_funcCallExpr( BPatch_function * func,
                       vector< BPatch_Snippet* > args )
• BPatch_paramExpr( int param_number )
• BPatch_retExpr()
BPatch_Snippet Classes
Example: Forming printf Snippet
Code to insert at the entry (E) of free(ptr):
printf( "free(%x)\n" , arg0 );

BPatch_funcCallExpr ( BPatch_function * func,
                      vector< BPatch_Snippet* > args )

Snippet tree:
BPatch_funcCallExpr
  BPatch_function bp_printf
  vector
    BPatch_constExpr "free(%x)\n"
    BPatch_paramExpr arg0(0)
Example: Instrument free w/ call to printf
BPatch_function * bp_free;
BPatch_function * bp_printf;
vector< BPatch_point * > entryPoints;
...
BPatch_constExpr arg0 ( "free(%x)\n" );   // format string
BPatch_paramExpr arg1 (0);                // first argument of free
vector< BPatch_Snippet * > printf_args;
printf_args.push_back( & arg0 );
printf_args.push_back( & arg1 );
BPatch_funcCallExpr callPrintf( *bp_printf,
                                printf_args );
// Batch the insertions so they are applied together.
proc->beginInsertionSet();
for ( size_t idx = 0;
      idx < entryPoints.size();
      idx++ )
    proc->insertSnippet( callPrintf,
                         *entryPoints[idx] );
proc->finalizeInsertionSet( true );   // true = apply atomically

Snippet tree, inserted at the entry (E) of free(ptr):
BPatch_funcCallExpr
  bp_printf
  vector
    BPatch_constExpr "free(%x)\n"
    BPatch_paramExpr arg0(0)
Using Variables
malloc instrumentation: save argument in a variable
• Find / create variable
bp_image->findVariable("global1");
bp_proc->malloc(*bp_image->findType("int"));
• Initialization instrumentation
• e.g., assignment at entry point of main
• Manipulation instrumentation
• e.g., arithmetic assignment expression
• Gather / print out values
• e.g., through callback instrumentation
Example: Instrumenting malloc
void * malloc ( size_t size )
{
    MALLOC_ARG = size;                 // inserted at the entry point (E)
    ...
    if (MALLOC_ARG > 100)              // inserted at each return point (R)
        printf("%x = malloc(%x)\n",
               retnValue,
               MALLOC_ARG);
}

Entry snippet tree:
BPatch_arithExpr
  BPatch_assign
  MALLOC_ARG
  BPatch_constExpr 1

Exit snippet tree:
BPatch_ifExpr
  BPatch_boolExpr
    BPatch_gt
    MALLOC_ARG
    BPatch_constExpr(100)
  BPatch_funcCallExpr
    BPatch_function bp_printf
    vector
      BPatch_constExpr "%x = malloc(%x)\n"
      BPatch_retExpr retnValue
      MALLOC_ARG
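The way the entry and exit snippets for malloc are composed from smaller expressions can be mimicked with ordinary closures. The sketch below is a toy model in plain C++. None of these types are the Dyninst classes, every name in it is hypothetical, and the logged text uses decimal numbers instead of the `%x` format from the slide.

```cpp
#include <functional>
#include <string>
#include <vector>

// State visible to a snippet while it runs inside the instrumented function.
struct Context {
    std::vector<long> params;      // arguments of the instrumented function
    long retValue = 0;             // its return value (meaningful at exit points)
    long mallocArg = 0;            // the MALLOC_ARG variable
    std::vector<std::string> log;  // stands in for printf output
};

// A snippet is anything that can be evaluated against a Context.
using Snippet = std::function<long(Context&)>;

Snippet constExpr(long v) { return [v](Context&) { return v; }; }
Snippet paramExpr(int n)  { return [n](Context& c) { return c.params.at(n); }; }
Snippet retExpr()         { return [](Context& c) { return c.retValue; }; }
Snippet mallocArgExpr()   { return [](Context& c) { return c.mallocArg; }; }

// MALLOC_ARG = rhs
Snippet assignMallocArg(Snippet rhs) {
    return [rhs](Context& c) -> long { return c.mallocArg = rhs(c); };
}
// a > b
Snippet greaterThan(Snippet a, Snippet b) {
    return [a, b](Context& c) -> long { return a(c) > b(c) ? 1 : 0; };
}
// if (cond) body
Snippet ifThen(Snippet cond, Snippet body) {
    return [cond, body](Context& c) -> long { return cond(c) ? body(c) : 0; };
}
// Records "<ret> = malloc(<arg>)" instead of calling printf.
Snippet logCall(Snippet ret, Snippet arg) {
    return [ret, arg](Context& c) -> long {
        c.log.push_back(std::to_string(ret(c)) + " = malloc(" +
                        std::to_string(arg(c)) + ")");
        return 0;
    };
}

// Entry snippet: MALLOC_ARG = size
Snippet entrySnippet() { return assignMallocArg(paramExpr(0)); }

// Exit snippet: if (MALLOC_ARG > threshold) log the call
Snippet exitSnippet(long threshold) {
    return ifThen(greaterThan(mallocArgExpr(), constExpr(threshold)),
                  logCall(retExpr(), mallocArgExpr()));
}
```

Running `entrySnippet()` and then `exitSnippet(100)` against the same `Context` mirrors the order in which the real instrumentation would fire during one call to malloc.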
Generating the Instrumentation Code
Instrumentation snippet:
BPatch_funcCallExpr
  bp_printf
  vector
    BPatch_constExpr "free(%x)\n"
    BPatch_paramExpr arg0(0)

The Instrumenter and Code Generator target IA32, AMD64, and POWER.

Code at the instrumented point:
mov eax -> [ebx * 4 + ecx]
mov
  eax
  [ebx * 4 + ecx]
    deref
      add
        mult
          ebx
          4
        ecx
Stack Walking
[Diagram: Dyninst components inside the mutator, with the Stack Walker acting on the mutatee's chaospro.exe, msvcrt.dll, and runtime lib]
Example: Stack Walk of malloc Call
• Choose instrumentation point
  • the exit points of malloc
• Insert callback instrumentation
  • use stopThreadExpr snippet
• Callback triggers stackwalk
  • BPatch_thread::getCallStack(…)
[Diagram: instrumentation at malloc's return (R) points in msvcrt.dll calls back into the mutator, where the Stack Walker walks the mutatee's stack]
Implementation Session
Code Coverage
• Create a mutator that counts function
invocations
• See description of the lab at
http://www.paradyn.org/tutorial/