Deobfuscation of Virtualization-Obfuscated Software
Kevin Coogan, Gen Lu, Saumya Debray
Department of Computer Science
University of Arizona
Presenter: 張逸文
ADLab
Outline
 Introduction
 Deobfuscation
 Experimental Evaluation
 Related Work
 Conclusion
Introduction (1/4)
 Basics of Reverse Engineering
 Compilation
 Decompilation
Introduction (2/4)
 Virtualization obfuscators
 VMProtect, Code Virtualizer
{
    VIRTUALIZER_START
    /* your code */
    VIRTUALIZER_END
}
Introduction (3/4)
 Virtualization-obfuscated programs are resistant to static and dynamic analysis techniques
 The executed code reveals only the structure and logic of the bytecode interpreter
 Randomized VM
 Outside-in approach
 Reverse engineer the VM interpreter
 Individual byte code instructions
 Recover the logic
 Works only if the structure of the interpreter meets certain requirements
Introduction (4/4)
 Programs interact with the system through system calls
 Identifying instructions that interact with the system
 Not recovering the original instructions
 Capturing behavior of the code
 General: usable in a wide range of cases
Deobfuscation
 Static analysis vs. dynamic trace
 Identifying instructions that are known to be part of the
original code
 No information about the specific structure of the
interpreter
Deobfuscation
 Overall approach:
1. Tracing tool
 Low-level execution trace
2. Identifying system calls and their arguments
 Database
3. Instruction trace
 Relevant instructions
4. Building a subtrace
 Relevant subtrace
Deobfuscation
 Value-based Dependence Analysis
 Not recovering the original code
 The process of deobfuscation must be semantics-preserving
 Identifying instructions that affect the values of the arguments to
system calls
 Slicing algorithms: follow control dependences
 Data dependencies
 Use-definition chains: link instructions that use a variable to the instruction that defines it
 Problem:
Deobfuscation
 Value-based dependence
if (I defines a location l ∈ S) {
    I is marked as relevant;
    l is removed from S;
    the set of locations used by I is added to S;
}
 Problem: a pointer to a structure
I uses some locations l1, l2, …, ld
if (I uses li ∈ P to define ld)
    ld is added to P
if (li accesses a memory location)
    [li] is added to M
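The marking rule for the set S can be sketched as a backward pass over a recorded trace. This is a minimal illustration, not the authors' implementation: the `Instr` record, its `defs` and `uses` fields, and the toy trace are hypothetical, and the pointer sets P and M are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    text: str          # disassembly, for display only (hypothetical format)
    defs: frozenset    # locations written by the instruction
    uses: frozenset    # locations read by the instruction


def relevant_instructions(trace, syscall_args):
    """Walk the trace backward from a system call and mark every
    instruction that defines a location currently in the set S."""
    S = set(syscall_args)
    relevant = []
    for instr in reversed(trace):
        hit = instr.defs & S
        if hit:
            relevant.append(instr)   # I is marked as relevant
            S -= hit                 # l is removed from S
            S |= instr.uses          # locations used by I are added to S
    relevant.reverse()               # restore execution order
    return relevant


# Toy trace: the ecx instructions stand in for interpreter bookkeeping.
trace = [
    Instr("mov eax, 5", frozenset({"eax"}), frozenset()),
    Instr("mov ecx, 7", frozenset({"ecx"}), frozenset()),
    Instr("add eax, 1", frozenset({"eax"}), frozenset({"eax"})),
    Instr("mov ebx, eax", frozenset({"ebx"}), frozenset({"eax"})),
    Instr("inc ecx", frozenset({"ecx"}), frozenset({"ecx"})),
]
result = relevant_instructions(trace, {"ebx"})
```

Here `ebx` plays the role of a system call argument; only the three instructions that feed it are kept, and the `ecx` instructions are dropped.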
Deobfuscation
 Relevant Conditional Control Flow
 Value-based dependence analysis doesn’t identify the associated
control flow instructions
 How conditional control flow occurs
 IA-32 architecture: setting the condition code flags in the eflags register
 Not that simple!
 Examining the target address
 Equational reasoning system: translate each instruction in the dynamic trace into an equivalent set of equations
Deobfuscation
 Equational reasoning system
 Identifies conditional dependencies
 The left-hand-side variable of an equation is numbered by the position of its instruction in the trace
 Each right-hand-side variable is numbered by the instruction that defined it
 Example 1.
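The numbering rule can be illustrated with a small translator. This is only a sketch under assumptions: the trace is given as hypothetical `(destination, expression)` pairs rather than real IA-32 instructions, numbers are written after an underscore, and a variable never defined in the trace is given number 0.

```python
import re

# Identifiers only; the leading \b keeps "x10000" inside "0x10000" from matching.
IDENT = re.compile(r"\b[A-Za-z_]\w*")


def to_equations(trace):
    """Turn (dest, expr) pairs into equations with numbered variables."""
    last_def = {}      # variable name -> number of the instruction that defined it
    equations = []
    for n, (dest, expr) in enumerate(trace, start=1):
        # Right-hand side: each variable carries the number of its defining instruction.
        rhs = IDENT.sub(lambda m: f"{m.group(0)}_{last_def.get(m.group(0), 0)}", expr)
        # Left-hand side: numbered by the position of this instruction.
        equations.append(f"{dest}_{n} = {rhs}")
        last_def[dest] = n
    return equations


equations = to_equations([
    ("index", "input"),
    ("index", "index+1"),
    ("Target", "index*4+0x10000"),
])
```

Because each definition gets a fresh number, a location that is overwritten (here `index`) yields distinct variables, so later equations refer unambiguously to the value they actually read.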
Deobfuscation
 Example 2.
 Example 3.
 Indirect jump
Deobfuscation
 Example 4.
 Used in VMProtect
Target_20 = index_1 * 4 + 0x10000
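One way to read an equation like the one for the jump target is to follow its right-hand-side variables back through earlier equations and ask whether the chain reaches the condition code flags. The sketch below assumes a hypothetical `uses` map from each numbered variable to the variables on the right-hand side of its equation; the variable names and numbers are invented for illustration and are not taken from the slides.

```python
def depends_on_flags(target, uses):
    """Return True if `target` transitively depends on an eflags variable."""
    stack, seen = [target], set()
    while stack:
        var = stack.pop()
        if var in seen:
            continue
        seen.add(var)
        if var.startswith("eflags"):
            return True
        stack.extend(uses.get(var, ()))   # variables with no equation end the chain
    return False


# Hypothetical equations, reduced to "which variables appear on the right".
uses = {
    "eflags_5": {"eax_2", "ebx_3"},   # a comparison sets the flags
    "index_12": {"eflags_5"},         # flags are turned into a table index
    "Target_20": {"index_12"},        # target computed from index * 4 + base
    "Target_30": {"ecx_7"},           # a target that never touches the flags
}
conditional = depends_on_flags("Target_20", uses)
plain_dispatch = depends_on_flags("Target_30", uses)
```

Under these assumptions, the first indirect jump would be treated as conditional control flow because its target traces back to `eflags`, while the second would not.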
Deobfuscation
 Relevant Call-Return Control Flow
 Identifying functions: the behavior of calls and returns
 Knowing how they work allows one to use them for other purposes
 Behavior of Function Calls and Returns
Deobfuscation
call replaced with push
registers
Cannot be resolved
Deobfuscation
 Identification approach
 Call: a code address is saved at the call site
 Return: the saved address is used for a control transfer at the return point
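The call/return criterion can be sketched over a simplified trace. This is an assumption-laden illustration, not the authors' tool: the event tuples, the `code_range` bounds, and the addresses are hypothetical, and saved addresses are matched by value only, without tracking the location they are later loaded from.

```python
def find_call_return_pairs(trace, code_range):
    """Pair each saved code address with the later transfer that uses it.

    trace: list of events, either
        ("save", location, value)   value written to a register or memory cell
        ("jump", target, None)      control transfer to `target`
    Returns (call_index, return_index) pairs of trace positions.
    """
    lo, hi = code_range
    saved = {}     # saved code address -> trace index of the instruction that saved it
    pairs = []
    for i, (kind, a, b) in enumerate(trace):
        if kind == "save" and isinstance(b, int) and lo <= b < hi:
            saved[b] = i                      # looks like a call site
        elif kind == "jump" and a in saved:
            pairs.append((saved.pop(a), i))   # saved address used: a return
    return pairs


trace = [
    ("save", "stack[0x7ffc]", 0x401005),  # return address saved with a push
    ("jump", 0x402000, None),             # transfer into the callee
    ("save", "eax", 42),                  # ordinary data, not a code address
    ("jump", 0x401005, None),             # transfer back to the saved address
]
pairs = find_call_return_pairs(trace, (0x400000, 0x500000))
```

The point of matching on behavior rather than on opcodes is that the pair is found even when no `call` or `ret` instruction appears in the trace.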
Deobfuscation
 Relevant Dynamic Trace
Experimental Evaluation
 Experimental Methodology
 Compile original source code
 Generate an original dynamic trace
 Build an original subtrace
 Apply a virtualization-obfuscation technique
 Generate an obfuscated dynamic trace
 Build a relevant subtrace of the obfuscated trace
 The obfuscated subtrace is matched to the original subtrace and
scores are produced
 The relevance score and obfuscation score are calculated
Experimental Evaluation
 VX Heavens website
Related Work
 Deobfuscation of code obfuscated via virtualization
obfuscators
 Rolles, Sharif, Falliere
 Programming language community
 Partial evaluation
Conclusions
 Virtualization-obfuscated programs are difficult to reverse
engineer
 We present a different approach to identifying the flow of
values to system call instructions