Defeating Memory Corruption Attacks
via Pointer Taintedness Detection
Shuo Chen†, Jun Xu‡, Nithin Nakka†, Zbigniew Kalbarczyk† and Ravi K. Iyer†
† Center for Reliable and High-Performance Computing, University of Illinois at Urbana-Champaign, U.S.A.
‡ Department of Computer Science, North Carolina State University, U.S.A.
IEEE International Conference on Dependable Systems and Networks
Yokohama, Japan, June 30, 2005
Introduction
Memory corruption attack


Major threat on the Internet
Current dominant form: Control data attack
Our contributions



Non-control data attacks are realistic
More general observation: pointer taintedness
A new architecture for detection
Outline
Non-control Data Attacks
The Concept of Tainted Pointers
Processor Architecture for Pointer Taintedness
Detection
Experimental Evaluation
Conclusion
Control Data Attack
Control data attack


a.k.a. control hijacking or code-injection attack
Dominant form of memory corruption attacks
[CERT and Microsoft Security Bulletin]
Control data (code pointers)


data used as targets of call, return and jump
widely understood as security-critical data
Many existing defenses: enforce security via
control data integrity
Control Data Attack – An Example
WU-FTPD format string attack
FTP_service():
Authentication; x = user ID; seteuid(x)
repeat: get an FTP command, e.g., SITE_EXEC(fmt), which calls printf(fmt,…)
Attack steps:
Embed malicious contents in input
Overwrite a return address
Execute malicious code: seteuid(0); exec("/bin/sh")
Non-Control-Data Attack: A Realistic Threat
Non-control data: data other than control data (code pointers);
these attacks corrupt application-specific data
Have not been seriously considered
We constructed non-control-data attacks against
a number of real world applications




Security compromise equivalent to that of control-data attacks
Root privilege on HTTP, SSH, Telnet and FTP servers
Corrupting user identity data, configuration data, user
input data, and decision-making data
Will appear in USENIX Security Symposium, Aug 2005
Non-Control Data Attack – An Example
WU-FTPD format string attack
FTP_service():
Authentication; x = user ID; seteuid(x)
repeat: get an FTP command, e.g., SITE_EXEC(fmt), which calls printf(fmt,…)
Attack steps:
Embed malicious contents in input
Overwrite x (saved user ID)
getdatasock() later calls seteuid(x) with the corrupted value:
getdatasock( ... ) {
    seteuid(0);
    setsockopt( ... );
    seteuid(x);
}
More Non-Control-Data Attacks
Against NULL HTTP server


Corrupt the configuration string of CGI-BIN path.
Run /bin/sh as a CGI program
Against SSH Communications SSH server


Corrupt a Boolean
Log in as root with an arbitrary password
Against GazTek HTTP server


Corrupt user URL input
Run /bin/sh as a CGI program
New threat calling for new defense

How can we defeat both control-data and non-control-data
attacks?
Pointer Taintedness Detection
Tainted pointers: code or data pointers derived
from malicious user input
Root cause of a large class of memory
corruption attacks (control-data or non-control-data)
Detection of tainted pointers

Defeats a large class of real-world memory corruption attacks, e.g.,
stack smashing, format string, heap corruption,
integer overflow
Internals of Stack Buffer Overflow Attacks
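Vulnerable code:
char buf[100];
strcpy(buf,user_input);
Stack layout (high to low addresses; the stack grows toward low addresses):
Return addr
Frame pointer
buf[99]
…
buf[1]
buf[0]
user_input is copied into buf starting at buf[0]; an overlong input overwrites the frame pointer or return address, which can therefore be tainted.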
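The overflow on this slide can be modeled with a small sketch. It assumes a flat 32-bit frame layout (buf at offsets 0..99, frame pointer at 100, return address at 104) and a shadow taint flag per byte. Frame, unchecked_copy and slot_tainted are hypothetical names used only for illustration, not part of the presented architecture.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed layout: buf[0..99], saved frame pointer, return address. */
enum { BUF_SIZE = 100, FP_OFF = 100, RA_OFF = 104, FRAME_SIZE = 108 };

typedef struct {
    unsigned char data[FRAME_SIZE];
    unsigned char taint[FRAME_SIZE];  /* 1 = byte written from user input */
} Frame;

/* Model of strcpy(buf, user_input): no bounds check against buf, and every
 * byte written is tagged as tainted. The copy is clipped to the modeled
 * frame so that the sketch itself stays memory-safe. */
static void unchecked_copy(Frame *f, const char *user_input) {
    size_t n = strlen(user_input) + 1;  /* include the terminating NUL */
    if (n > FRAME_SIZE)
        n = FRAME_SIZE;
    memcpy(f->data, user_input, n);
    memset(f->taint, 1, n);
}

/* Returns 1 if any byte of the 4-byte slot starting at off is tainted. */
static int slot_tainted(const Frame *f, size_t off) {
    assert(off + 4 <= FRAME_SIZE);
    return f->taint[off] | f->taint[off + 1] |
           f->taint[off + 2] | f->taint[off + 3];
}
```

With an input shorter than BUF_SIZE, slot_tainted(&f, RA_OFF) stays 0; with a 107-character input both the frame pointer and return address slots report tainted, which is the condition a detector would flag when the return address is used as a jump target.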
Runtime Pointer Taintedness Detection
A processor-architecture-level mechanism to
detect pointer taintedness

Implemented a taintedness-aware memory system
One-bit extension for each byte to indicate the taintedness of
the byte

Taintedness initialization
Tag every byte of data received from external input sources

Taintedness tracking
Taintedness is propagated by ALU instructions

Attack detection
An alert is raised when a tainted value is dereferenced (i.e., used as a pointer)
Implemented on the SimpleScalar processor simulator
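A minimal software sketch of the three steps (initialization, tracking, detection), not the hardware design itself. It assumes a 256-byte flat memory and little-endian 32-bit words; TaintedReg, input_bytes, load_word, alu_add and deref_alert are hypothetical names, and only a plain OR propagation rule is modeled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_SIZE 256u  /* assumed size of the modeled memory */

typedef struct {
    uint32_t value;  /* 32-bit register contents */
    uint8_t  taint;  /* 4 taint bits; bit i covers byte i of value */
} TaintedReg;

static uint8_t mem[MEM_SIZE];
static uint8_t mem_taint[MEM_SIZE];  /* 1 = byte came from external input */

/* Taintedness initialization: tag every byte received from outside. */
static void input_bytes(uint32_t addr, const uint8_t *src, uint32_t n) {
    assert(addr <= MEM_SIZE && n <= MEM_SIZE - addr);
    memcpy(&mem[addr], src, n);
    memset(&mem_taint[addr], 1, n);
}

/* Load a little-endian word together with its four taint bits. */
static TaintedReg load_word(uint32_t addr) {
    TaintedReg r = {0, 0};
    assert(addr <= MEM_SIZE - 4);
    for (uint32_t i = 0; i < 4; i++) {
        r.value |= (uint32_t)mem[addr + i] << (8 * i);
        r.taint |= (uint8_t)(mem_taint[addr + i] << i);
    }
    return r;
}

/* Taintedness tracking: the result carries the OR of the operands' tags. */
static TaintedReg alu_add(TaintedReg a, TaintedReg b) {
    TaintedReg r = { a.value + b.value, (uint8_t)(a.taint | b.taint) };
    return r;
}

/* Attack detection: 1 (alert) if a register about to be used as a
 * load/store address or jump target has any tainted byte. */
static int deref_alert(TaintedReg pointer) {
    return pointer.taint != 0;
}
```

In this model, a word loaded from bytes written by input_bytes raises an alert when passed to deref_alert, and so does any sum that includes it, while a word loaded from untouched memory does not.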
[Figure: processor pipeline extended for pointer taintedness detection. Data paths through the register file, ALU, pipeline registers (ID/EX, EX/MEM, MEM/WB) and the data memory load and store paths are 36 bits wide: 32 data bits plus 4 taintedness bits, one per 8-bit byte. The ALU taintedness tracking logic combines operand taintedness bits with a bitwise OR and adds shift-, AND-, XOR- and compare-specific logic. A jump pointer taintedness detector (checked on jr) and a data pointer taintedness detector (checked on load/store) raise alerts.]
Related Work on Taintedness
Perl security
Shankar and Wagner (2001)

Static analysis to uncover format string vulnerabilities
Our previous work on pointer taintedness (Aug. 2004)


A source code analysis technique to uncover pointer taintedness
vulnerabilities
Reasoning about taintedness at the machine-code level, relying on an
extended memory model
More recent work: Secure Program Execution (MIT),
Minos (UC-Davis) and TaintCheck (CMU) (late 2004 and
early 2005)



Similar memory model
Taintedness of control data
Pointer taintedness vs. control-data taintedness:
cause vs. result of memory corruption
Evaluation
Attack detection effectiveness


Synthetic vulnerable programs
Real-world network applications
Evaluation of false positives


Real-world network applications
SPEC 2000 benchmarks
Potential false negative scenarios
Attack Detection Effectiveness
First, test on synthetic vulnerable programs
All attacks (control/non-control data) are detected

Stack Buffer Overflow
Vulnerable program:
void exp1() {
    char buf[10];
    scanf("%s",buf);
}
Input data (network/console): aaaaaaaaaaaaaaaaaa
Violating instruction: 400a38: JR $31
Tainted data: $31 = 0x61616161

Heap Corruption Attack
Vulnerable program:
void exp2() {
    char * buf;
    buf = malloc(8);
    scanf("%s",buf);
    free(buf);
}
Input data (network/console): aaaaaaaaaaaaa
Violating instruction: 401dc0: LW $3,0($3)
Tainted data: $3 = 0x61616161

Format String Attack
Vulnerable program:
void exp3(int s) {
    char buf[100];
    recv(s,buf,100,0);
    printf(buf);
}
Input data (network/console): abcd%x%x%x%n
Violating instruction: 402d60: SW $21,0($3)
Tainted data: $3 = 0x64636261
Attack Detection Effectiveness (cont.)
Evaluation on real-world network applications
All attacks are detected
No difference between control-data attack and non-control-data
attack from the viewpoint of pointer taintedness

Application | Attack | Corrupted data | Result
WU-FTP server | Format string attack | Overwrite user identity data (non-control-data) | detected
GazTek HTTP server | Stack buffer overflow attack | Overwrite user input data (non-control-data) | detected
NULL HTTP server | Heap corruption attack | Overwrite configuration data (non-control-data) | detected
traceroute | Double free | Function pointer (control-data) | detected
Transparency and False Positive
No need for re-compilation; runs existing binary executables
Results from network applications: no false positives
Results from SPEC benchmarks

15 billion instructions without any false positive
Conclusion: No known false positive

Program | Program size | Total number of input bytes | Total number of instructions | Alert generated?
BZIP2 | 321KB | 1048KB | 5,951M | No
GCC | 4184KB | 77.7KB | 110M | No
GZIP | 485KB | 282KB | 6,926M | No
MCF | 304KB | 39.2KB | 1,653M | No
PARSER | 595KB | 743.0KB | 389M | No
VPR | 697KB | 6.4KB | 108M | No
Total | 6586KB | 2186KB | 15,139M | No
Potential False Negative Scenarios
Incorrect array index boundary check

Determining correct array size requires source code
analysis – very hard at binary level
Buffer overflow within the local frame


If no pointer is tainted, no alert is raised
Unlikely to cause severe security damage because
attacker-controllable location is very limited
Format string attack causing information leak


Allows inspection of some memory data words
Causes a security compromise if these words contain
security-critical secrets, e.g., keys and passwords
Integer Overflow Induced Array Index Out of Bound
void foo(unsigned int ui)
{
    int i = ui;
    if (i >= ArraySize)
        i = ArraySize - 1;
    array[i] = 1;
}
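A short sketch of why the bounds check in foo() is incomplete, and one possible repair. ARRAY_SIZE, index_signed_check and index_unsigned_check are hypothetical names. The sketch assumes a typical two's-complement target where converting an unsigned value above INT_MAX to int yields a negative number (the conversion is implementation-defined in C).

```c
#include <assert.h>
#include <limits.h>

#define ARRAY_SIZE 16  /* assumed array size, for illustration only */

/* Same logic as foo(): signed comparison with only an upper bound.
 * Returns the index that would be used for array[i]. */
static int index_signed_check(unsigned int ui) {
    int i = (int)ui;  /* negative for ui > INT_MAX on typical targets */
    if (i >= ARRAY_SIZE)
        i = ARRAY_SIZE - 1;
    return i;         /* a negative i passes the check unchanged */
}

/* One possible repair: compare while the value is still unsigned,
 * so very large inputs are clamped as well. */
static int index_unsigned_check(unsigned int ui) {
    if (ui >= (unsigned int)ARRAY_SIZE)
        ui = ARRAY_SIZE - 1;
    return (int)ui;
}
```

Calling index_signed_check(UINT_MAX) on such a target yields a negative index, so array[i] would write below the array, whereas index_unsigned_check(UINT_MAX) clamps to ARRAY_SIZE - 1.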
Buffer overflow causing critical flags to be corrupted
void bar () {
    int auth;
    char buf[100];
    auth = do_auth ();
    scanf("%s",buf);
    if (auth) grant_access();
}
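A sketch of why this case produces no alert, under an assumed frame layout (buf at offsets 0..99, auth at 100, frame pointer at 104, return address at 108, 4-byte int). Frame, read_input, access_granted and alert_on_return are hypothetical names. The point illustrated is that auth becomes tainted but is only tested, never used as a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed layout: buf[0..99], auth, saved frame pointer, return address. */
enum { AUTH_OFF = 100, FP_OFF = 104, RA_OFF = 108, FRAME_SIZE = 112 };

typedef struct {
    unsigned char data[FRAME_SIZE];
    unsigned char taint[FRAME_SIZE];  /* 1 = byte written from user input */
} Frame;

/* Model of scanf("%s", buf): writes n input bytes starting at buf[0] with
 * no bounds check against buf (clipped to the modeled frame). */
static void read_input(Frame *f, const char *input, size_t n) {
    if (n > FRAME_SIZE)
        n = FRAME_SIZE;
    memcpy(f->data, input, n);
    memset(f->taint, 1, n);
}

/* if (auth): the flag is only tested, never dereferenced. */
static int access_granted(const Frame *f) {
    int auth;
    memcpy(&auth, &f->data[AUTH_OFF], sizeof auth);
    return auth != 0;
}

/* The detector fires only when a tainted word is used as a pointer; in
 * this model the only pointer use is the return through the saved address. */
static int alert_on_return(const Frame *f) {
    int alert = 0;
    for (size_t i = 0; i < 4; i++)
        alert |= f->taint[RA_OFF + i];
    return alert;
}
```

With a 104-byte input, auth is overwritten and access_granted returns 1 even though the frame started with auth equal to 0, while alert_on_return stays 0 because the return address was never reached.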
Format string attack causing information leak
void leak() {
    int secret_key;
    char buf[12];
    recv(s,buf,12,0);
    printf(buf);    /* attacker input: "%x%x%x%x" */
}
Conclusions
Contributions:



Non-control-data attacks are a realistic threat
Memory corruption attacks, including control-data
attacks and non-control-data attacks, are due to
pointer taintedness
Proposed a runtime pointer taintedness detection
architecture: substantial improvement in security
coverage
Evaluation


transparent to existing applications
a near-zero false positive rate
We plan to implement this approach in the
Hardware framework for detection and recovery
Questions?
Another Motivating Example
NULL-HTTPD heap corruption attack
HTTP_service():
repeat: p=malloc(…); process HTTP header; HTTP_POST(): recv(p,…); free(p); *foo()
Attack steps:
Corrupt heap structure
Overwrite function pointer foo
Execute malicious code: seteuid(0); exec("/bin/sh")
Non-Control-Data Attack against WU-FTP Server
Overwrite an integer representing the user ID to
obtain root privilege on the server
int x;    /* saved user ID */
site_exec() {
    /* a format string vulnerability */
}
getdatasock( ... ) {
    seteuid(0);
    setsockopt( ... );
    seteuid(x);
}
Internals of Format String Attack
Vulnerable code:
recv(buf);
printf(buf); /* should be printf("%s",buf) */
Input: \xdd \xcc \xbb \xaa %d %d %d %n
Stack contents (high to low addresses; the stack grows toward low addresses): … %n %d %d %d 0xaabbccdd
fmt: format string pointer
ap: argument pointer
In vfprintf(): if (fmt points to "%n") then **ap = (character count)
*ap is a tainted value.
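A toy model of the %n step described on this slide. It is not the real vfprintf(): only the rule "each conversion consumes one stack word, and %n writes through the word that ap points to" is modeled. StackWord and format_walk_alert are hypothetical names, and the tainted flag stands in for the hardware taintedness bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t word;
    int tainted;  /* 1 if the word was received as input */
} StackWord;

/* Walks fmt, treating every "%<c>" as a conversion that consumes one of the
 * words ap steps through. Returns 1 (alert) if a %n would write through a
 * tainted word, 0 otherwise. "%%" and field widths are not modeled. */
static int format_walk_alert(const char *fmt, const StackWord *args,
                             size_t nargs) {
    size_t ap = 0;
    for (const char *p = fmt; *p != '\0'; p++) {
        if (p[0] != '%' || p[1] == '\0')
            continue;
        p++;                      /* p now points at the conversion letter */
        if (ap >= nargs)
            return 0;             /* ran out of modeled stack words */
        if (*p == 'n' && args[ap].tainted)
            return 1;             /* **ap = count, with *ap tainted */
        ap++;                     /* conversion consumes one word */
    }
    return 0;
}
```

With three untainted words followed by the tainted word 0xaabbccdd, the input from the slide (three %d conversions, then %n) reaches the tainted word exactly at %n, which is where a data pointer taintedness check would fire.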
Future Directions
Combination of static code analysis and
architecture support

To automatically derive predicates to be
checked by processor at runtime
Reliability and security support for
embedded systems


Migrate our current techniques to embedded
systems
New topics: cell phone viruses, reduced power
consumption, tamper-resistant hardware,
crypto and authentication hardware/software
[Pie chart: Buffer Overflow 44%, Other 33%, Heap Corruption 8%, Format String 7%, Integer Overflow 6%, Globbing 2%]