Saving the World Wide Web from Vulnerable JavaScript
International Symposium on Software Testing and Analysis (ISSTA 2011)
Salvatore Guarnieri, Marco Pistoia, Omer Tripp, Julian Dolby, Stephen Teilhet, Ryan Berg
IBM Software Group and IBM T. J. Watson Research Center
www.research.ibm.com/labasec
JavaScript is present on many popular Web sites
Consequences of Taint Violations
• Read and write access to saved data in cookies and local data stores
• Read and write access to data in the web page
• Key loggers
• Impersonation
• Phishing via page modifications or redirects
// Getting data from the DOM
var el1 = document.getElementById("d1");
function foo() {
  var el2 = document.getElementById("d2");
  function bar() {
    var el3 = new Element();
    // Sanitizing some, but not all, of the data
    var s = encodeURIComponent(el2.innerText);
    document.write(s);
    // Writing untrusted data into web page
    el1.innerHTML = el2.innerText;
    document.location = el3.innerText;
  }
  bar();
}
foo();
function baz(a, b) {
  a.f = document.URL;
  // Writing unchecked data to the web page
  document.write(b.f);
}
var x = new Object();
baz(x, x);
Motivation
Sources, Sinks, and Sanitizers
Taint Analysis
Results
Rules
• A rule is a triple <Sources, Sinks, Sanitizers>
• Not all sources are valid for all sinks, and not all sanitizers are valid for all sinks
• Sources
– Seeds of untrusted data
– Field gets or returns of function calls
– Ex: document.URL
• Sinks
– Security-critical operations
– Field puts or parameters to function calls
– Ex: element.innerHTML
• Sanitizers
– Mark a flow as non-dangerous
– Function calls
– Ex: encodeURIComponent(str)
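One way to picture a rule triple is as plain data plus a check over a candidate flow. This is only an illustrative sketch: the rule name, the flow shape, and the isViolation helper are hypothetical and are not Actarus's actual rule format. Only the example source, sink, and sanitizer come from the slide.

```javascript
// Hypothetical encoding of one rule <Sources, Sinks, Sanitizers>.
const xssRule = {
  name: "XSS (illustrative)",
  sources: new Set(["document.URL"]),          // field gets / call returns
  sinks: new Set(["element.innerHTML"]),       // field puts / call parameters
  sanitizers: new Set(["encodeURIComponent"]), // calls that mark a flow non-dangerous
};

// A flow is a source, a sink, and the calls the value passed through on the way.
function isViolation(rule, flow) {
  if (!rule.sources.has(flow.source) || !rule.sinks.has(flow.sink)) {
    return false; // this rule does not pair that source with that sink
  }
  // The flow is safe only if a sanitizer valid for this rule is on the path.
  return !flow.through.some((call) => rule.sanitizers.has(call));
}

const raw = { source: "document.URL", sink: "element.innerHTML", through: [] };
const cleaned = { ...raw, through: ["encodeURIComponent"] };
console.log(isViolation(xssRule, raw));     // true
console.log(isViolation(xssRule, cleaned)); // false
```

Keeping sanitizers inside the rule, rather than in one global list, reflects the point that a sanitizer is valid only for certain sinks.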
Complexities of JavaScript
• Reflective property access
• Prototype chain property lookup
• Lexical scoping
• Function pointers
• eval and its relatives
eval("document.write('evil')");
[Slide code for the other items: overlapping animation frames, not recoverable from the transcript. Legible fragments: G.prototype = new F(); var a = new G(); write(g.bar);]
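The listed features can each be shown in a few lines. The sketch below is a hedged reconstruction of the kind of example the slide appears to show, not the slide's own code. To let it run outside a browser, write and taintedUrl are stand-ins (assumptions) for document.write and document.URL.

```javascript
// Stand-ins so the sketch runs outside a browser (assumptions, not slide code).
const written = [];
const write = (s) => written.push(s);
const taintedUrl = "tainted-url"; // plays the role of document.URL

// Reflective property access: the property name is computed at run time.
const obj = { foobar: taintedUrl };
const key = "foo" + "bar";
write(obj[key]);

// Prototype chain lookup: bar lives on the F instance used as G's prototype.
function F() { this.bar = taintedUrl; }
function G() {}
G.prototype = new F();
const g = new G();
write(g.bar);

// Lexical scoping and function pointers: the closure captures y,
// and k calls whatever function value it is handed.
function outer() {
  const y = taintedUrl;
  const m = function () { write(y); };
  const k = function (f) { f(); };
  k(m);
}
outer();

// eval: code that exists only as a string until run time.
eval("write('evil')");

console.log(written);
```

In each case the value reaching write cannot be found by reading a single statement, which is why a static analysis has to model property names, prototypes, closures, and call targets.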
Demand Driven Taint Analysis
• The seeds are the assignments to sources or return values from sources
• The analysis proceeds by tainting variables
• Variables consist of triplets:
– Static Single Assignment (SSA) variable ID
– Method where SSA variable is defined
– Access path
– Ex: (v7, m, <f, g>)
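A tainted variable of this shape can be held as a small record and tracked on a worklist. This is a minimal sketch under assumptions: the taintVar, keyOf, and taint helpers and the string key format are hypothetical and are not taken from the tool.

```javascript
// Hypothetical record for a tainted variable: (SSA id, method, access path).
function taintVar(ssaId, method, accessPath) {
  return { ssaId, method, accessPath };
}

// A canonical string key lets records be stored in a Set without duplicates.
function keyOf(v) {
  return `${v.ssaId}@${v.method}<${v.accessPath.join(",")}>`;
}

// Worklist of tainted variables; the Set of keys prevents revisiting.
const seen = new Set();
const worklist = [];
function taint(v) {
  const k = keyOf(v);
  if (!seen.has(k)) {
    seen.add(k);
    worklist.push(v);
  }
}

const example = taintVar("v7", "m", ["f", "g"]); // the slide's (v7, m, <f, g>)
taint(example);
taint(taintVar("v7", "m", ["f", "g"])); // same triple again, ignored
console.log(keyOf(example), worklist.length); // v7@m<f,g> 1
```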
Context Sensitive Taint Analysis
• Start from taint sources
• Propagate taint intraprocedurally through def-use chains
• Propagate taint forward interprocedurally
• Resolve aliasing by using Andersen alias analysis
• Record constraints on call sites, recursively
• In the final constraint-propagation graph, detect paths between sources and sinks not intercepted by sanitizers
[Diagram: methods m1(), m2(p1, p2, p3), m3(q1, q2)]
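The last step, finding source-to-sink paths that no sanitizer intercepts, can be sketched as a graph search. The graph, the node names, and the "src:", "san:", and "sink:" prefixes below are made up for illustration. The sketch leaves out context sensitivity and alias resolution entirely.

```javascript
// Hypothetical propagation graph: node -> successors.
const edges = {
  "src:document.URL": ["v1"],
  v1: ["v2", "san:encodeURIComponent"],
  "san:encodeURIComponent": ["v3"],
  v2: ["sink:innerHTML"],
  v3: ["sink:document.write"],
};

const isSanitizer = (n) => n.startsWith("san:");
const isSink = (n) => n.startsWith("sink:");

// Depth-first search from a source that refuses to walk through sanitizers,
// so every sink it reaches has an unsanitized path from the source.
function unsanitizedSinks(source) {
  const found = [];
  const visited = new Set();
  const stack = [source];
  while (stack.length > 0) {
    const node = stack.pop();
    if (visited.has(node) || isSanitizer(node)) continue;
    visited.add(node);
    if (isSink(node)) found.push(node);
    for (const next of edges[node] || []) stack.push(next);
  }
  return found;
}

console.log(unsanitizedSinks("src:document.URL")); // [ 'sink:innerHTML' ]
```

Here document.write is not reported because its only incoming path passes through the sanitizer node.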
Analysis Example
function foo(p1, p2) {
  // Taint variable: (v2, foo, <f, *>)
  // Install taint summary for foo: p2.f -> p1.f
  p1.f = p2.f;
}
var a = new Object();
var b = new Object();
b.f = window.location.toString();
var c = new Object();
var d = new Object();
d.f = "safe";
foo(a, b);
// Since d.f is not tainted, c.f will not be tainted
foo(c, d);
document.write(a.f); // This is a taint violation
document.write(c.f); // This is NOT a taint violation
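The effect of the summary p2.f -> p1.f at the two call sites can be simulated with a small table. This is a simplified sketch under assumptions: the summaries structure, the applySummary helper, and the use of "a.f"-style strings for tainted locations are hypothetical and much coarser than the access-path triples the analysis uses.

```javascript
// Hypothetical summary for foo(p1, p2): taint flows from p2.f to p1.f.
// Parameters are identified by zero-based position.
const summaries = {
  foo: [{ from: { param: 1, field: "f" }, to: { param: 0, field: "f" } }],
};

// b.f holds window.location.toString(), so it starts out tainted.
const taintedPaths = new Set(["b.f"]);

// Apply a callee's summary at one call site, mapping parameters to arguments.
function applySummary(callee, args) {
  for (const { from, to } of summaries[callee]) {
    const src = `${args[from.param]}.${from.field}`;
    if (taintedPaths.has(src)) {
      taintedPaths.add(`${args[to.param]}.${to.field}`);
    }
  }
}

applySummary("foo", ["a", "b"]); // b.f is tainted, so a.f becomes tainted
applySummary("foo", ["c", "d"]); // d.f is clean, so c.f stays clean

console.log(taintedPaths.has("a.f"), taintedPaths.has("c.f")); // true false
```

Because the summary is applied per call site, the taint entering through b does not leak into c, which is the context-sensitive behavior the example illustrates.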
Data Sets
• Developed a micro-benchmark suite of about 150 test scripts
• Downloaded Web pages and ran Actarus on them
Real World Data Set
• Crawled portions of top Alexa Web sites and downloaded pages to disk
• Ran Actarus on a sample of the saved pages
• Ran on over 12,000 pages
• Successfully analyzed over 9,000 pages
• ~22% failure due to a 4-minute timeout
Findings
• Several vulnerable Web sites were found
• Duplicates of vulnerabilities were found on many pages from the same site
• Some exploits were found in third-party code that was shared among several Web sites
• 40% true positive rate
• Vulnerabilities can be fixed with common sanitization routines
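As an illustration of a common sanitization routine, the sketch below escapes text before it is placed in HTML. It is a generic example, not a fix taken from the study. The escapeHtml name is hypothetical, and the right sanitizer always depends on the sink (HTML escaping suits innerHTML-style sinks, not URL or script contexts).

```javascript
// Escape the characters that are significant in HTML text and attribute values.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;") // must run first so later entities are not re-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

console.log(escapeHtml('<img src=x onerror="alert(1)">'));
// &lt;img src=x onerror=&quot;alert(1)&quot;&gt;
```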
Findings
Site  Unique True Positives  Total True Positives
A     7                      80
B     4                      12
C     4                      91
D     7                      13
E     2                      4
F     1                      200
G     1                      1
H     1                      114
I     3                      7
J     1                      3
K     1                      1
User Friendly Output
• Flows are highlighted and numbered in the source code
• JavaScript was pretty-printed to improve readability and usefulness of line numbers
Future Work
• Using string analysis to reduce false positives
• Making the analysis modular so library code does not have to be reanalyzed
Thank You