Deobfuscation of Virtualization-Obfuscated Software
Kevin Coogan, Gen Lu, Saumya Debray
Department of Computer Science
University of Arizona
Presenter: 張逸文
ADLab
Outline
Introduction
Deobfuscation
Experimental Evaluation
Related Work
Conclusion
Introduction(1/4)
Basics of Reverse Engineering
Compilation
Decompilation
Introduction(2/4)
Virtualization obfuscators
VMProtect, Code Virtualizer
{
VIRTUALIZER_START
your code
VIRTUALIZER_END
}
Introduction(3/4)
The virtualization-obfuscated programs are resistant to
static and dynamic analysis techniques
The executed code reveals only the structure and logic of the bytecode interpreter
Randomized VM
Outside-in approach
Reverse engineer the VM interpreter
Individual byte code instructions
Recover the logic
The structure of the interpreter meets certain requirements
Introduction(4/4)
Programs interact with the system through system calls
Identifying instructions that interact with the system
Not recovering the original instructions
Capturing behavior of the code
General: applicable in a wide range of cases
Deobfuscation
Static analysis vs. dynamic trace
Identifying instructions that are known to be part of the
original code
No information about the specific structure of the
interpreter
Deobfuscation
Overall approach:
1. Tracing tool: low-level execution trace
2. Identifying system calls and their arguments in the instruction trace, using a database
3. Identifying relevant instructions
4. Building a subtrace: the relevant subtrace
Deobfuscation
Value-based Dependence Analysis
Not recovering the original code
The process of deobfuscation must be semantics-preserving
Identifying instructions that affect the values of the arguments to
system calls
Slicing algorithms --- control-dependent
Data dependencies
Use-definition chains --- link instructions that use a variable to the
instruction that defines it
Problem:
Deobfuscation
Value-based dependence
if ( I defines a location l ∈ S ) {
I is marked as relevant;
l is removed from S;
the set of locations used by I is added to S; }
Problem: a pointer to a structure
I uses some locations l1, l2, …, ld
if ( I uses li ∈ P to define ld )
ld is added to P
if ( li accesses a memory location )
[li] is added to M
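The first rule above (the set S) can be sketched as a backward pass over the trace. This is a minimal illustration only, not the authors' implementation: the `Instr` record, its `defs`/`uses` fields, and the toy trace are all hypothetical, and the pointer rule (the sets P and M) is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical trace record: the locations an instruction defines and uses."""
    text: str
    defs: frozenset
    uses: frozenset


def relevant_instructions(trace, syscall_arg_locations):
    """Walk the trace backwards from a system call, keeping instructions
    that define a location currently in S."""
    S = set(syscall_arg_locations)      # locations whose values we still need to explain
    relevant = []
    for instr in reversed(trace):
        hit = instr.defs & S
        if hit:
            relevant.append(instr)      # I is marked as relevant
            S -= hit                    # l is removed from S
            S |= instr.uses             # locations used by I are added to S
    relevant.reverse()                  # restore execution order
    return relevant


# Toy trace (assumed): ebx holds the system call argument.
trace = [
    Instr("mov eax, 5", frozenset({"eax"}), frozenset()),
    Instr("mov ecx, 7", frozenset({"ecx"}), frozenset()),
    Instr("add eax, 1", frozenset({"eax"}), frozenset({"eax"})),
    Instr("mov ebx, eax", frozenset({"ebx"}), frozenset({"eax"})),
]
result = [i.text for i in relevant_instructions(trace, {"ebx"})]
print(result)
```

In this toy trace `mov ecx, 7` is dropped because nothing that reaches the argument in `ebx` depends on `ecx`. Removing `l` before adding the used locations matters for instructions such as `add eax, 1`, which both define and use the same location.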
Deobfuscation
Relevant Conditional Control Flow
Value-based dependence analysis doesn’t identify the associated
control flow instructions
How conditional control flow occurs
On the IA-32 architecture, by setting the condition code flags in the eflags
register
Not that simple!
Examining the target address
Equational Reasoning System: translate each instruction in the
dynamic trace into an equivalent set of equations
Deobfuscation
Equational Reasoning System
Identifies conditional dependencies
Each left-hand-side variable in an equation is numbered by the order in which its
instruction appears
Each right-hand-side variable is numbered by the instruction that defined
it
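The numbering scheme can be illustrated with a small sketch. The instruction format here, a `(dest, op, sources)` tuple, is a simplifying assumption made for this example, as is the use of index 0 for values that are live at the start of the trace. The real system handles full IA-32 semantics, which this does not attempt.

```python
def to_equations(trace):
    """Turn a list of (dest, op, sources) tuples into numbered equations.

    The left-hand side gets the position of its instruction in the trace.
    Each right-hand-side variable gets the number of the instruction that
    last defined it (0 if it was never defined in the trace)."""
    last_def = {}                       # location -> number of defining instruction
    equations = []
    for n, (dest, op, sources) in enumerate(trace, start=1):
        rhs = [
            f"{s}{last_def.get(s, 0)}" if isinstance(s, str) else str(s)
            for s in sources            # strings are locations, ints are constants
        ]
        equations.append(f"{dest}{n} = " + f" {op} ".join(rhs))
        last_def[dest] = n
    return equations


# Hypothetical three-instruction trace.
trace = [
    ("eax", "+", ["ebx", 4]),
    ("ecx", "*", ["eax", 2]),
    ("eax", "+", ["eax", "ecx"]),
]
equations = to_equations(trace)
for eq in equations:
    print(eq)
```

Because every variable carries the number of its defining instruction, substituting one equation into another is unambiguous even when a register such as `eax` is redefined several times.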
Example 1.
Deobfuscation
Example 2.
Example 3.
Indirect jump
Deobfuscation
Example 4.
Used in VMProtect
Target20 = index1*4+0x10000
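Taking the equation above at face value, the jump target is a function of the value defined at instruction 1, so it changes whenever `index` does. A short sketch that evaluates it for a few hypothetical index values (the function name and the sample values are assumptions for this example):

```python
def jump_target(index, base=0x10000, scale=4):
    """Evaluate Target = index*4 + 0x10000 for a given index value."""
    return index * scale + base


for index in range(3):
    print(index, hex(jump_target(index)))
```

Since the target is not a constant, the instruction that defines `index` has to be considered when deciding which control flow is relevant.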
Deobfuscation
Relevant Call-Return Control Flow
Identifying functions: the behavior of calls and returns
Knowing how they work allows one to use them for other purposes
Behavior of Function Calls and Returns
Deobfuscation
call replaced with push
registers
Cannot be resolved
Deobfuscation
Identification Approach
Call: a code address is saved at the call site
Return: the saved address is used for a control transfer at the return
point
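The two rules can be sketched over a simplified event trace. Everything about the representation is an assumption for illustration: events are `("store", location, value)` or `("transfer", location, target)` tuples, and `is_code_address` is a hypothetical predicate supplied by the caller. Matching on behavior rather than on opcodes means a `push` of a code address followed by a jump would be treated like a `call`.

```python
def match_calls_and_returns(trace, is_code_address):
    """Pair each 'call' (a code address saved to a location) with the
    'return' (a control transfer that uses that saved address).

    Returns a list of (call_index, return_index) pairs into the trace."""
    saved = {}                          # location -> (trace index, saved code address)
    pairs = []
    for i, event in enumerate(trace):
        kind = event[0]
        if kind == "store":
            _, loc, value = event
            if is_code_address(value):
                saved[loc] = (i, value)     # call: code address saved
            else:
                saved.pop(loc, None)        # location overwritten with plain data
        elif kind == "transfer":
            _, loc, target = event
            if loc in saved and saved[loc][1] == target:
                call_index, _ = saved.pop(loc)
                pairs.append((call_index, i))   # return: saved address used
    return pairs


# Hypothetical trace: code is assumed to live in [0x401000, 0x402000).
trace = [
    ("store", "stack:0xffc", 0x401005),     # return address saved
    ("store", "stack:0xff8", 7),            # ordinary data
    ("transfer", "stack:0xffc", 0x401005),  # jump through the saved address
]
pairs = match_calls_and_returns(trace, lambda v: 0x401000 <= v < 0x402000)
print(pairs)
```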
Deobfuscation
Relevant Dynamic Trace
Experimental Evaluation
Experimental Methodology
Compile original source code
Generate an original dynamic trace
Build an original subtrace
Virtualization-obfuscation technique
Generate an obfuscated dynamic trace
Build a relevant subtrace of the obfuscated trace
The obfuscated subtrace is matched to the original subtrace and
scores are produced
The relevance score and obfuscation score are calculated
Experimental Evaluation
VX Heavens website
Related Work
Deobfuscation of code obfuscated via virtualization
obfuscators
Rolles, Sharif, Falliere
Programming language community
Partial evaluation
Conclusions
Virtualization-obfuscated programs are difficult to reverse
engineer
We present a different approach to identifying the flow of
values to system call instructions