Analyzing Memory Accesses
in x86 Executables
Gogul Balakrishnan
Thomas Reps
University of Wisconsin
Motivation
• Basic infrastructure for language-based security
– buffer-overrun detection
– information-flow vulnerabilities
– ...
• What if we do not have source code?
– viruses, worms, mobile code, etc.
– legacy code (w/o source)
• Limitations of existing tools
– overly conservative treatment of memory accesses
  ⇒ Many false positives
– non-conservative treatment of memory accesses
  ⇒ Many false negatives
[Slide labels: Crude VSA, VSA, No VSA]
Goal (1)
• Create an intermediate representation (IR) that is similar to the IR used in a compiler
– CFGs
– call graph
– used, killed, may-killed variables for CFG nodes
– points-to sets
• Why?
– a tool for a security analyst
– a general infrastructure for binary analysis
Goal (2)
• Scope: programs that conform to a “standard compilation model”
– data layout determined by compiler
– some variables held in registers
– global variables → absolute addresses
– local variables → offsets in esp-based stack frame
• Report violations
– violations of stack protocol
– return address modified within procedure
CodeSurfer/x86 Architecture
[Diagram: Binary → IDA Pro (Parse Binary, Build CFGs) → Connector (Value-set Analysis) → CodeSurfer (Build SDG, Browse) → Client Applications]
IR contents:
• CFGs
• call graph
• used, killed, may-killed variables for CFG nodes
• points-to sets
CodeSurfer/x86 Architecture
[Diagram: the same pipeline, with two annotations]
Whole-program analysis
• stubs are ok
Initial estimate of
• code vs. data
• procedures and call sites
• malloc sites
Outline
• Example
• Challenges
• Value-set analysis
• Performance
• [Future work]
Running Example
int arrVal=0, *pArray2;
int main() {
    int i, a[10], *p;
    /* Initialize pointers */
    pArray2 = &a[2];
    p = &a[0];
    /* Initialize Array */
    for(i = 0; i<10; ++i) {
        *p = arrVal;
        p++;
    }
    /* Return a[2] */
    return *pArray2;
}

; ebx ⇔ i
; ecx ⇔ variable p
        sub   esp, 40         ;adjust stack
        lea   edx, [esp+8]    ;
        mov   [4], edx        ;pArray2=&a[2]
        lea   ecx, [esp]      ;p=&a[0]
        mov   edx, [0]        ;
loc_9:  mov   [ecx], edx      ;*p=arrVal
        add   ecx, 4          ;p++
        inc   ebx             ;i++
        cmp   ebx, 10         ;i<10?
        jl    short loc_9     ;
        mov   edi, [4]        ;
        mov   eax, [edi]      ;return *pArray2
        add   esp, 40
        retn
Tutorial on x86 Instructions
• mov ecx, edx       ; ecx = edx
• mov ecx, [edx]     ; ecx = *edx
• mov [ecx], edx     ; *ecx = edx
• lea ecx, [esp+8]   ; ecx = &a[2]
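The four forms differ only in whether an operand names a register value or the memory cell that the register addresses. A minimal Python sketch of that distinction, assuming a toy machine whose registers and memory are plain dictionaries (the names `regs`, `mem`, and the helper functions are illustrative and not part of CodeSurfer/x86):

```python
# Toy machine state (assumption: flat memory, no segments, 0 for unwritten cells).
regs = {"ecx": 0, "edx": 0, "esp": 1000}
mem = {}

def mov_reg_reg(dst, src):
    """mov dst, src      ; dst = src"""
    regs[dst] = regs[src]

def mov_reg_mem(dst, addr_reg):
    """mov dst, [addr]   ; dst = *addr"""
    regs[dst] = mem.get(regs[addr_reg], 0)

def mov_mem_reg(addr_reg, src):
    """mov [addr], src   ; *addr = src"""
    mem[regs[addr_reg]] = regs[src]

def lea(dst, base_reg, disp):
    """lea dst, [base+disp] ; dst = base + disp (an address, no memory access)"""
    regs[dst] = regs[base_reg] + disp

lea("ecx", "esp", 8)        # ecx now holds the address esp+8
regs["edx"] = 42
mov_mem_reg("ecx", "edx")   # store 42 at that address
regs["edx"] = 0
mov_reg_mem("edx", "ecx")   # load it back through the pointer
```

Note that `lea` only computes an address, which is why it is the instruction that corresponds to the C expression `&a[2]`.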
Running Example – Address Space
[Diagram: runtime address space of the running example, high addresses at top, shown next to the assembly listing]
0ffffh
        return_address
        a (40 bytes)          data local to main (activation record)
        ...
4h      pArray2 (4 bytes)     global data
0h      arrVal (4 bytes)      global data
• No debugging information (the variable names and sizes above are not available in the executable)
Challenges (1)
• No debugging/symbol-table information
• Explicit memory addresses
– need something similar to C variables
– a-locs
• Only have an initial estimate of
– code, data, procedures, call sites, malloc sites
– extend IR on-the-fly
• disassemble data, add to CFG, . . .
• similar to elaboration of CFG/call-graph in a
compiler because of calls via function pointers
Challenges (2)
• Indirect-addressing mode
– need “pointer analysis”
– value-set analysis
• Pointer arithmetic
– need numeric analysis (e.g., range analysis)
– value-set analysis
• Checking for non-aligned accesses
– pointer forging?
– keep stride information in value-sets
Not Everything is Bad News !
• Multiple source languages OK
• Some optimizations make our task easier
– optimizers try to use registers, not memory
– deciphering memory operations is the hard part
Memory-regions
• An abstraction of the address space
• Idea: group similar runtime addresses
– collapse the runtime ARs for each procedure
[Diagram: a runtime stack holding several activation records of f and g, plus global data, collapsed into one region for f, one region for g, and one global region]
• Similarly,
– one region for all global data
– one region for each malloc site
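The collapsing step can be sketched as a mapping from concrete runtime addresses to (region, offset) pairs. The Python below is only an illustration under stated assumptions: the frame snapshot, the `global_limit` boundary, and all names are hypothetical, offset 0 of a procedure's region is taken to be the slot of its return address, and malloc-site regions are left out for brevity.

```python
from typing import List, NamedTuple, Tuple

class AbstractAddr(NamedTuple):
    region: str   # "GL" or a procedure name
    offset: int   # offset within that region

# Hypothetical runtime snapshot entry:
# (procedure, address of its return-address slot, frame size in bytes)
Frame = Tuple[str, int, int]

def abstract(addr: int, frames: List[Frame], global_limit: int) -> AbstractAddr:
    """Map a concrete address to its memory-region and offset."""
    if addr < global_limit:
        return AbstractAddr("GL", addr)
    for proc, base, size in frames:
        if base - size <= addr <= base:
            # Every activation record of proc shares one region.
            return AbstractAddr(proc, addr - base)
    raise ValueError(f"address {addr:#x} is in no known region")

# f calls g, which calls f again: two runtime activation records for f.
frames = [("f", 0xFF00, 16), ("g", 0xFEE0, 24), ("f", 0xFEC0, 16)]
outer = abstract(0xFF00 - 8, frames, global_limit=0x100)
inner = abstract(0xFEC0 - 8, frames, global_limit=0x100)
glob = abstract(0x4, frames, global_limit=0x100)
```

Both `outer` and `inner` come out as `("f", -8)`, which is the sense in which the runtime activation records of a procedure are grouped into a single region.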
Example – Memory-regions
[Diagram: the address space of the running example split into two memory-regions, shown next to the assembly listing]
Region for main
(main, 0)     ret_main
(main, -40)
Global Region
(GL,8)
(GL,0)
“Need Something Similar to C Variables”
• Standard compilation model
– some variables held in registers
– global variables → absolute addresses
– local variables → offsets in stack frame
• A-locs
– locations between consecutive addresses
– locations between consecutive offsets
– registers
• Use a-locs instead of variables in static analysis
– e.g., killed a-loc ≈ killed variable
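As a rough illustration of "locations between consecutive offsets", the following Python sketch cuts a region at the statically known offsets and names each piece. The naming scheme (prefix plus the hexadecimal magnitude of the offset) is an assumption chosen to reproduce the names used in the running example; the function `alocs` is hypothetical.

```python
from typing import List, Tuple

def alocs(prefix: str, offsets: List[int], end: int) -> List[Tuple[str, int, int]]:
    """Return (name, start offset, size in bytes) for each a-loc.

    offsets: offsets that appear explicitly in the code for one region
    end:     offset at which the last a-loc stops
    """
    points = sorted(set(offsets))
    result = []
    for lo, hi in zip(points, points[1:] + [end]):
        # Assumed naming: hex magnitude of the offset, e.g. -40 -> "28".
        result.append((f"{prefix}{abs(lo):x}", lo, hi - lo))
    return result

# Region for main: [esp] is offset -40, [esp+8] is offset -32, ret_main is at 0.
main_alocs = alocs("mainv_", [-40, -32], end=0)
# Global region: the direct addresses [0] and [4], ending at 8.
global_alocs = alocs("mem_", [0, 4], end=8)
```

Under these assumptions the 40-byte array `a` is seen as two a-locs of 8 and 32 bytes, because only the offsets of `a[0]` and `a[2]` occur explicitly in the code.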
Example – A-locs
[Diagram: the memory-regions annotated with the operands that address them, shown next to the assembly listing]
Region for main
(main, 0)     ret_main
(main, -32)   [esp+8]
(main, -40)   [esp]
Global Region
(GL,8)
(GL,4)        [4]
(GL,0)        [0]
Example – A-locs
[Diagram: the same memory-regions, with each location named as an a-loc]
Region for main
(main, 0)     ret_main
(main, -32)   mainv_20
(main, -40)   mainv_28
Global Region
(GL,8)
(GL,4)        mem_4
(GL,0)        mem_0
Example – A-locs
[Diagram: memory-regions and a-locs as on the previous slide; the listing now uses a-locs in place of memory operands]
; ebx ⇔ i
; ecx ⇔ variable p
        sub   esp, 40         ;adjust stack
        lea   edx, &mainv_20  ;
        mov   mem_4, edx      ;pArray2=&a[2]
        lea   ecx, &mainv_28  ;p=&a[0]
        mov   edx, mem_0      ;
loc_9:  mov   [ecx], edx      ;*p=arrVal
        add   ecx, 4          ;p++
        inc   ebx             ;i++
        cmp   ebx, 10         ;i<10?
        jl    short loc_9     ;
        mov   edi, mem_4      ;
        mov   eax, [edi]      ;return *pArray2
        add   esp, 40
        retn
Example – A-locs
locals: mainv_28, mainv_20 {a[0], a[2]}
globals: mem_0, mem_4 {arrVal, pArray2}
[Diagram: graph over edx, mem_4, edi, ecx, mainv_20, and mainv_28; in the listing, edx, mem_4, and edi receive &mainv_20, and ecx receives &mainv_28]
Value-Set Analysis
• Resembles a pointer-analysis algorithm
– interprets pointer-manipulation operations
– pointer arithmetic, too
• Resembles a numeric-analysis algorithm
– over-approximate the set of values/addresses
held by an a-loc
• range information
• stride information
– interprets arithmetic operations on sets of
values/addresses
Value-set
• An a-loc ≈ a variable
– the address of an a-loc
  (memory-region, offset within the region)
• An a-loc ≈ an aggregate variable
– addresses of elements of an a-loc
  (rgn, {o1, o2, …, on})
• Value-set = a set of such addresses
  {(rgn1, {o1, o2, …, on}), …, (rgnr, {o1, o2, …, om})}
  “r” – number of regions in the program
Value-set
• Set of addresses: {(rgn1, {o1, …, on}), …, (rgnr, {o1, …, om})}
• Idea: approximate {o1, …, ok} with a numeric domain
– {1, 3, 5, 9} represented as 2[0,4]+1
– Reduced Interval Congruence (RIC)
• common stride
• lower and upper bounds
• displacement
• Set of addresses is an r-tuple: (ric1, …, ricr)
– ric1: offsets in global region
– a set of numbers: (ric1, ⊥, …, ⊥)
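To make the RIC notation concrete, here is a small Python sketch of a value a[b,c]+d, read as the set {a·k + d | b ≤ k ≤ c}. The class and the `abstract` helper are illustrative assumptions (finite bounds and a non-negative stride only), not the representation used inside the tool.

```python
from dataclasses import dataclass
from math import gcd
from typing import Iterable, Set

@dataclass(frozen=True)
class RIC:
    stride: int   # common stride
    lo: int       # lower bound of the multiplier
    hi: int       # upper bound of the multiplier
    disp: int     # displacement

    def values(self) -> Set[int]:
        """Concretize: every number the RIC stands for."""
        return {self.stride * k + self.disp for k in range(self.lo, self.hi + 1)}

    def contains(self, n: int) -> bool:
        if self.stride == 0:
            return n == self.disp
        q, r = divmod(n - self.disp, self.stride)
        return r == 0 and self.lo <= q <= self.hi

def abstract(ns: Iterable[int]) -> RIC:
    """Over-approximate a non-empty finite set of integers by one RIC."""
    ns = set(ns)
    base = min(ns)
    g = 0
    for n in ns:
        g = gcd(g, n - base)          # largest stride dividing all differences
    if g == 0:                        # singleton set
        return RIC(0, 0, 0, base)
    return RIC(g, 0, (max(ns) - base) // g, base)

ric = abstract({1, 3, 5, 9})          # the slide's example: 2[0,4]+1
```

The result is an over-approximation: `ric.values()` is {1, 3, 5, 7, 9}, so 7 is included even though it was not in the original set.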
Example – Value-set analysis
[Diagram: memory-regions and a-locs of the running example, with program points 1 and 2 marked in the listing]
Region for main: (main, 0) ret_main; (main, -32) mainv_20; (main, -40) mainv_28
Global Region: (GL,8); (GL,4) mem_4; (GL,0) mem_0
Value-sets at point 1 (in the loop at loc_9):
ecx ↦ (⊥, 4[0,∞]-40)
ebx ↦ (1[0,9], ⊥)
esp ↦ (⊥, -40)
Value-sets at point 2 (at ;return *pArray2):
edi ↦ (⊥, -32)
esp ↦ (⊥, -40)
Example – Value-set analysis
[Diagram: memory-regions and a-locs as on the previous slide]
At point 1: ecx ↦ (⊥, 4[0,∞]-40)
At point 2: edi ↦ (⊥, -32)
Example – Value-set analysis
[Diagram: memory-regions and a-locs as on the previous slides]
At point 1: ecx ↦ (⊥, 4[0,∞]-40)
At point 2: edi ↦ (⊥, -32)
A stack-smashing attack?
– with no upper bound on ecx, the write mov [ecx], edx at point 1 may reach (main, 0), where ret_main is stored
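The question on this slide amounts to a membership test: can the set of offsets that ecx may hold include offset 0 of main's region, where ret_main lives? A hedged Python sketch, assuming a positive stride and aligned 4-byte writes (the function `may_hit` and the constant `RET_MAIN` are illustrative names):

```python
import math

RET_MAIN = 0  # offset of the return address in the region for main

def may_hit(stride: int, lo: int, hi: float, disp: int, target: int) -> bool:
    """True if target is in {stride*k + disp | lo <= k <= hi}.

    hi may be math.inf to model an unbounded RIC such as 4[0,∞]-40.
    Assumes stride > 0.
    """
    q, r = divmod(target - disp, stride)
    return r == 0 and lo <= q <= hi

# ecx ↦ (⊥, 4[0,∞]-40): offsets -40, -36, -32, ... with no upper bound.
unbounded = may_hit(4, 0, math.inf, -40, RET_MAIN)
# For comparison, a bounded set 4[0,9]-40 stops at offset -4.
bounded = may_hit(4, 0, 9, -40, RET_MAIN)
```

With the unbounded value-set the check succeeds (k = 10 gives offset 0), so a tool relying on it alone would have to report a possible overwrite of the return address.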
Affine-Relation Analysis
• Value-set domain is non-relational
– cannot capture relationships among a-locs
• Imprecise results
– e.g. no upper bound for ecx at loc_9
• ecx ↦ (⊥, 4[0,∞]-40)
        . . .
loc_9:  mov   [ecx], edx      ;*p=arrVal
        add   ecx, 4          ;p++
        inc   ebx             ;i++
        cmp   ebx, 10         ;i<10?
        jl    short loc_9     ;
        . . .
Affine-Relation Analysis
• Obtain affine relations via static analysis
• Use affine relations to improve precision
– e.g., at loc_9
  ecx = esp + (4 × ebx), ebx = ([0,9], ⊥), esp = (⊥, -40)
  ⇒ ecx = (⊥, -40) + 4([0,9])
  ⇒ ecx = (⊥, 4[0,9]-40)
  ⇒ upper bound for ecx at loc_9
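The refinement at loc_9 is ordinary arithmetic on RICs: scale the RIC of ebx by 4, then add the stack offset held by esp. A small Python sketch of just that step, assuming a simplified RIC with `scale` and `shift` operations (hypothetical names) and tracking only the component for main's region:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RIC:
    stride: int
    lo: int
    hi: int
    disp: int

    def scale(self, c: int) -> "RIC":
        """Multiply every represented value by the constant c."""
        return RIC(self.stride * c, self.lo, self.hi, self.disp * c)

    def shift(self, c: int) -> "RIC":
        """Add the constant c to every represented value."""
        return RIC(self.stride, self.lo, self.hi, self.disp + c)

ebx = RIC(1, 0, 9, 0)      # ebx = 1[0,9], a plain number
esp_offset = -40           # esp = (⊥, -40): offset -40 in the region for main

# Affine relation at loc_9: ecx = esp + 4 * ebx
ecx = ebx.scale(4).shift(esp_offset)
largest = ecx.stride * ecx.hi + ecx.disp
```

Here `ecx` is 4[0,9]-40 and `largest` is -4, so the highest offset written in the loop stays below (main, 0).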
Example – Value-set analysis
[Diagram: memory-regions and a-locs as on the earlier value-set analysis slides]
At point 1: ecx ↦ (⊥, 4[0,9]-40)
No stack-smashing attack reported
Affine-Relation Analysis
• Affine relation
– $x_1, x_2, \ldots, x_n$ – a-locs
– $a_0, a_1, \ldots, a_n$ – integer constants
– $a_0 + \sum_{i=1}^{n} a_i x_i = 0$
• Idea: determine affine relations on registers
– use such relations to improve precision
• Implemented using WPDS++
Performance
Program                      nProc    nInsts   Value-set analysis   Affine relations
                                                        (seconds)          (seconds)
javac                           36     3,555                   42                 36
cat (2.0.14)                   123     3,892                   51                 32
cut (2.0.14)                   129     4,329                   28                 50
grep (2.4.2)                   245    16,808                   85                 78
flex (2.5.4)                   239    23,373                  200                376
tar (1.13.19)                  587    50,347                  210
awk (3.1.0)                    595    69,927                1,507
winhlp32 (5.00.2195.2014)    1,018   108,380                2,002
Future Work
• Aggregate Structure Identification
– Ramalingam et al. [POPL 99]
– Ignore declarative information
– Identify fields from the access patterns
– Useful for
• improving the a-loc abstraction
• discovering type information
Future Work
[Diagram: aggregate structure identification applied to main's activation record AR[-40:-1] (40 bytes), which is split into parts of 8 and 32 bytes, with the 32-byte part split further into 4 and 28 bytes; shown next to the running-example listing]
Main Insights
• Combined numeric and pointer analysis
• Congruence (“stride”) information
– Ranges alone ⇒ false reports of pointer forging
• Affine relations used to improve precision
– Constraints among values of registers
– Loop conditions + affine relations
  ⇒ better bounds for an a-loc’s RICs
CodeSurfer/x86 Architecture
[Diagram: Binary → IDA Pro (Parse Binary, Build CFGs) → Connector (Value-set Analysis) → CodeSurfer (Build SDG, Browse) → Client Applications]
For more details
• Gogul Balakrishnan’s demo
• Gogul Balakrishnan’s poster
• Consult UW-TR 1486
  [http://www.cs.wisc.edu/~reps/#tr1486]