Enhancing Security of Real-World Systems with a Better Understanding of Threats Shuo Chen Ph.D.
Enhancing Security of Real-World Systems
with a Better Understanding of Threats
Shuo Chen
Ph.D. Candidate in Computer Science
Center for Reliable and High Performance Computing
University of Illinois at Urbana-Champaign
1
My Dissertation
Security Threat Analysis and Mitigations in Real-World
Systems
– How do errors in hardware and software impose security threats on
real-world systems? (common characteristics?)
– How effective are current defense techniques? (substantial
deficiencies?)
– How to build better defenses?
Analysis-centric research approach
– Study the impact of hardware memory errors on system security
– Software vulnerabilities reported in Bugtraq and CERT
databases, source code of vulnerable applications
– Current attack methods and defense techniques
– Analysis results motivate the development of new defense
techniques.
Many areas related to my dissertation
2
I as a System Hacker/Builder
Summer’01, Avaya Labs, Basking Ridge, NJ
– Port Libsafe to Windows NT/2000.
Summer’02, Bell Labs, Holmdel, NJ
– Detection of network denial of service attacks
– Hack FreeBSD TCP/IP, network card drivers
Summer’03, Microsoft Research, Redmond, WA
– Audit-enhanced authentication in Kerberos
– NTOS security subsystem, Kerberos, LSA, NTDLL
Summer’04, Microsoft Research, Redmond, WA
– A tracing technique to identify the dependencies of
Windows applications on Administrator privileges
– NTOS security subsystem, access/privilege checking,
application interactions with NTOS
3
Outline
Analyses: Analyzing and Identifying Security Threats
on Real-World Systems
– Security compromises due to HW/SW memory
corruptions
– A type of memory corruption attack currently
believed to be rare is a realistic threat.
– Deficiencies of current defense techniques
Solutions: New Defense Techniques Towards a Better
Security Protection
– A common characteristic of memory corruption
attacks: pointer taintedness
– A theorem proving based program analysis
– A runtime detection technique
4
Analyzing and Identifying Security
Threats on Real-World Systems
5
Threat of Hardware Memory Errors
Due to hardware memory errors, users can log in with arbitrary passwords
Attacker
Network server (FTP and SSH)
Due to hardware memory errors, packets can penetrate firewalls
Attacker
Firewall (IPChains and Netfilter)
Target host
Emulate random hardware memory errors
A stochastic model to estimate such threats in real environments
Motivate other researchers to conduct physical fault injections
– Java type system subverted due to random hardware memory errors.
6
Threat of Software Vulnerabilities
Breakdown by vulnerability class (figure):
– Buffer overflow: 44%
– Heap corruption: 8%
– Format string: 7%
– Integer overflow: 6%
– Globbing: 2%
– Other: 33%
CERT Advisories: 66% of vulnerabilities are low-level
memory errors in software.
Widely exploited by attackers, worms and viruses.
7
State Machine Model: WU-FTP Server Attack
[Figure: state machine of the attack]
Server states:
– FTP_service(): Authentication; x = user ID; seteuid(x)
– repeat: get an FTP command
– SITE_EXEC(fn): printf(fn,…)
Attack states:
– Embed malicious contents in input
– Overwrite a return address
– Execute malicious code: seteuid(0); exec("/bin/sh")
8
State Machine Model: NULL-HTTP Server Attack
[Figure: state machine of the attack]
Server states:
– HTTP_service(), repeat: p=malloc(…); process HTTP header;
HTTP_POST(); recv(p,…); free(p); *foo()
Attack states:
– Corrupt heap structure
– Overwrite function pointer foo
– Execute malicious code: seteuid(0); exec("/bin/sh")
9
Control Data Attack: Well-Known, Dominant
Control data:
– data used as targets of call, return and jump.
– widely understood as security critical elements
Control data attack: the most dominant form
of memory corruption attacks [CERT and
Microsoft Security Bulletin]
Many current defense techniques: to enforce
program control flow integrity to provide
security.
10
Non-control-data attacks
Currently very rare in reality.
One instance suggested by Young and
McHugh in 1987.
How applicable are such attacks
to real-world software?
– Not studied yet, but important.
11
An Important Question
Are attackers in general incapable of mounting non-control-data attacks against many real systems?
– PROBABLY NOT!
– Random hardware memory errors can subvert the security of
real-world systems with a non-negligible probability.
– Software vulnerabilities are more deterministic and more
amenable to attacks.
– Each attack exploiting software vulnerabilities is composed of
multiple primitive components, which allows potentially polymorphic
attacks. Dangerous.
12
Our Claim: General Applicability of
Non-control-data Attacks
We claim:
– Many real-world software applications are
susceptible to non-control-data attacks.
– The severity of the attack consequences is
equivalent to that due to control data attacks.
Validate the claim by constructing non-control-data attacks to get the root privilege on major
network servers
– FTP, HTTP, SSH and Telnet servers
– Over 1/3 of vulnerabilities in CERT advisories
Non-control-data attacks are realistic threats.
13
Non-control-data attack against WU-FTP
Server (via a format string bug)
int x;
FTP_service(...) {
  authenticate();                         /* x uninitialized, run as EUID 0 */
  x = user ID of the authenticated user;  /* x=109, run as EUID 0 */
  seteuid(x);                             /* x=109, run as EUID 109. Lose the root privilege! */
  while (1) {
    get_FTP_command(...); //vulnerable
      /* Get a special SITE EXEC command. Exploit a format string
         vulnerability. x=0, still run as EUID 109. */
      /* Get a data command (e.g., PUT) */
    if (a data command?)
      getdatasock(...);
      /* On return to the service loop, still runs as EUID 0 (root). */
  }
}
getdatasock( ... ) {
  seteuid(0);          /* x=0, run as EUID 0 */
  setsockopt( ... );
  seteuid(x);          /* x=0, run as EUID 0 */
}
Allow me to upload /etc/passwd. I can grant myself the root privilege!
Only corrupt an integer, not a control data attack.
14
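The attack on this slide can be sketched as a small simulation. This is only an illustration under stated assumptions: it does not call the real seteuid(), the effective UID is modeled by a field in a hypothetical struct, and the format string exploit is modeled by a direct assignment to x. The names sim_process, sim_seteuid, sim_getdatasock and sim_session are made up for this sketch and are not WU-FTP code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the server process; not real WU-FTP state. */
struct sim_process {
    int euid;     /* simulated effective user ID (0 = root) */
    int saved_x;  /* the global x that holds the authenticated user's ID */
};

/* Simulated seteuid(): in this model the switch always succeeds,
 * mirroring the slide, where the server is able to call seteuid(0). */
static void sim_seteuid(struct sim_process *p, int uid) {
    p->euid = uid;
}

/* Models getdatasock(): escalate temporarily, then restore from x. */
static void sim_getdatasock(struct sim_process *p) {
    sim_seteuid(p, 0);           /* root needed for setsockopt() */
    /* ... setsockopt(...) would run here ... */
    sim_seteuid(p, p->saved_x);  /* restore: trusts x without checking */
}

/* Runs one session. If attack is true, x is set to 0 between
 * authentication and the data command (stand-in for the exploit).
 * Returns the simulated effective UID after the data command. */
static int sim_session(int user_id, bool attack) {
    struct sim_process p = { .euid = 0, .saved_x = -1 };
    p.saved_x = user_id;
    sim_seteuid(&p, p.saved_x);  /* drop the root privilege */
    assert(p.euid == user_id);
    if (attack) {
        p.saved_x = 0;           /* only an integer is overwritten */
    }
    sim_getdatasock(&p);
    return p.euid;
}
```

Calling sim_session(109, false) leaves the simulated process at EUID 109, while sim_session(109, true) leaves it at EUID 0, although no return address or function pointer was modified.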
Non-control-hijacking attack against
NULL-HTTP Server (via a heap overflow bug)
Attack the configuration string of CGI-BIN path.
Mechanism of CGI
– suppose server name = www.foo.com
CGI-BIN = /usr/local/httpd/exe
– Requested URL = http://www.foo.com/cgi-bin/bar
– The server executes /usr/local/httpd/exe/bar
Our attack
– Exploit the vulnerability to overwrite CGI-BIN to /bin
– Request URL http://www.foo.com/cgi-bin/sh
– The server executes /bin/sh
The server gives me a root shell!
Only overwrite four characters in the CGI-BIN string.
Not a control data attack.
15
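A minimal sketch of why the CGI-BIN string is security critical, assuming the server builds the executable path by joining the configured directory and the requested name. The struct sim_config, its cgi_bin field and both functions are hypothetical names for illustration; the second call to sim_set_cgi_bin stands in for the heap overflow and is not how the real bug writes memory.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SIM_CGI_BIN_CAP 32

/* Hypothetical server configuration (field name is an assumption). */
struct sim_config {
    char cgi_bin[SIM_CGI_BIN_CAP];  /* directory holding CGI executables */
};

/* Sets the configured directory; also used to model the overwrite. */
static void sim_set_cgi_bin(struct sim_config *cfg, const char *dir) {
    assert(strlen(dir) < SIM_CGI_BIN_CAP);
    strcpy(cfg->cgi_bin, dir);
}

/* Builds the path the server would execute for /cgi-bin/<name>.
 * Returns 0 on success, -1 if the result does not fit in out. */
static int sim_cgi_path(const struct sim_config *cfg, const char *name,
                        char *out, size_t out_len) {
    int n = snprintf(out, out_len, "%s/%s", cfg->cgi_bin, name);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

With cgi_bin set to /usr/local/httpd/exe, a request for bar resolves to /usr/local/httpd/exe/bar. After cgi_bin is changed to /bin, a request for sh resolves to /bin/sh, with no control data touched.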
Non-control-data attack against SSH Communications
SSH Server (via an integer overflow bug)
void do_authentication(char *user, ...) {
  int auth = 0;                          /* auth = 0 */
  ...
  while (!auth) {                        /* auth = 0 */
    /* Get a packet from the client */
    type = packet_read();                /* auth = 1 */
    switch (type) {
    ...
    case SSH_CMSG_AUTH_PASSWORD:
      if (auth_password(user, password)) /* Password incorrect, but auth = 1 */
        auth = 1;
    case ...
    }
    if (auth) break;                     /* auth = 1 */
  }
  /* Perform session preparation. */
  do_authenticated(…);                   /* Logged in without correct password */
}
16
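The effect of corrupting the auth flag can be sketched as follows. This is a simulation under assumptions, not SSH server code: the memory error inside packet_read() is modeled by a boolean parameter that sets the flag directly, the expected password is a made-up constant, and all sim_ names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical password check; the expected value is an assumption. */
static bool sim_auth_password(const char *password) {
    return strcmp(password, "correct-password") == 0;
}

/* Models packet_read(): when exploit is true, the memory error sets
 * the caller's auth flag as a side effect of reading the packet. */
static void sim_packet_read(int *auth, bool exploit) {
    if (exploit) {
        *auth = 1;
    }
}

/* Simulates up to max_tries rounds of the authentication loop and
 * reports whether the session ends up treated as authenticated. */
static bool sim_do_authentication(const char *password, bool exploit,
                                  int max_tries) {
    assert(max_tries > 0);
    int auth = 0;
    for (int i = 0; i < max_tries && !auth; i++) {
        sim_packet_read(&auth, exploit);
        if (sim_auth_password(password)) {
            auth = 1;
        }
        /* if (auth) break; is expressed by the loop condition */
    }
    return auth != 0;
}
```

A wrong password is rejected when exploit is false, but the same wrong password is accepted when exploit is true, because the loop only inspects the flag.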
More non-control-hijacking attacks
Against NetKit Telnet server (default Telnet
server of Redhat Linux)
– Exploit a heap overflow bug
– Overwrite two strings:
/bin/login -h foo.com -p (normal scenario)
/bin/sh -h -p -p (attack scenario)
– The server runs /bin/sh when it tries to authenticate
the user.
Against GazTek HTTP server
– Exploit a stack buffer overflow bug
– Send a legitimate URL http://www.foo.com/cgi-bin/bar
– The server checks that “/..” is not embedded in the URL
– Exploit the bug to change the URL to
http://www.foo.com/cgi-bin/../../../../bin/sh
– The server executes /bin/sh
17
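The GazTek scenario is a check-then-use problem: the URL is validated once and then modified before it is used. A minimal sketch, assuming the check is a substring search for "/.." and modeling the stack overflow by rewriting the already-validated buffer. The names sim_url_is_safe, sim_handle and enum sim_result are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

enum sim_result { SIM_REJECTED, SIM_EXECUTED };

/* True if "/.." does not appear anywhere in the URL path. */
static bool sim_url_is_safe(const char *url) {
    return strstr(url, "/..") == NULL;
}

/* Validates the URL, then optionally overwrites the validated buffer
 * (stand-in for the overflow), then "executes" it without rechecking.
 * overwrite must not alias url; pass NULL for the normal scenario. */
static enum sim_result sim_handle(char *url, size_t cap,
                                  const char *overwrite) {
    assert(cap > 0);
    if (!sim_url_is_safe(url)) {
        return SIM_REJECTED;
    }
    if (overwrite != NULL) {
        snprintf(url, cap, "%s", overwrite);  /* models the corruption */
    }
    /* The server now uses url, trusting the earlier check. */
    return SIM_EXECUTED;
}
```

A request that already contains "/.." is rejected, but a legitimate request whose buffer is changed after the check is executed with the traversal sequence in place.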
Implications of Non-Control-Data Attacks
Control flow integrity is not a
sufficiently accurate approximation to
software security.
Many types of non-control data critical
to security
Once attackers have the incentive, they
are likely to succeed in non-control-data attacks.
18
Re-Examining Current Defense Techniques
Many of them are based on control flow
integrity
– Monitor system call sequences
– Protect control data
– Non-executable stack and heap
Pointer encryption (PointGuard)
Address space randomization
StackGuard, Libsafe and FormatGuard
Building a generic and secure defense
technique: still an open problem.
19
Pointer Taintedness Detection:
Towards a Better Security
Protection for Real-World Systems
20
Pointer Taintedness
Pointer Taintedness: a pointer value,
including a return address, is derived
from user input.
Most memory corruption attacks are due
to pointer taintedness.
Pointer taintedness: a unifying
perspective for reasoning about many
security attacks.
21
Most Memory Corruption Attacks are Due
to Pointer Taintedness
Format string attack
– Taint an argument pointer of functions such
as printf, sprintf and syslog.
Stack buffer overflow (stack smashing)
– Taint a frame pointer or a return address.
Heap corruption
– Taint the free-chunk doubly-linked list
maintaining the heap structure.
Globbing attack
– User input resides in a location that is used
as a pointer by the parent function of glob().
22
Internals of Stack Buffer Overflow Attacks
Vulnerable code:
char buf[100];
strcpy(buf,user_input);
Stack layout (figure), from high to low addresses:
Return addr
Frame pointer
buf[99]
…
buf[1]
buf[0]
user_input is copied into buf; the frame pointer or
return address can be tainted.
23
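The layout on this slide can be sketched with a flat byte array standing in for the stack frame, so that the overflow is simulated without writing outside any real object. The layout constants (a 100-byte buffer followed by a frame pointer slot and a return address slot, each the size of a pointer) and all sim_ names are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SIM_BUF_LEN   100              /* char buf[100] */
#define SIM_SLOT_LEN  sizeof(void *)   /* size of one saved slot */
#define SIM_FRAME_LEN (SIM_BUF_LEN + 2 * SIM_SLOT_LEN)

/* Simulated frame, low to high: buf, frame pointer, return address.
 * Copies input (including its NUL) the way strcpy would, but clamps
 * at the end of the simulated frame so the simulation itself stays
 * in bounds. Returns the number of bytes written beyond buf. */
static size_t sim_strcpy_into_frame(unsigned char *frame,
                                    const char *input) {
    assert(frame != NULL && input != NULL);
    size_t n = strlen(input) + 1;
    if (n > SIM_FRAME_LEN) {
        n = SIM_FRAME_LEN;
    }
    memcpy(frame, input, n);
    return n > SIM_BUF_LEN ? n - SIM_BUF_LEN : 0;
}

/* True once the spill reaches past the frame pointer slot. */
static bool sim_return_address_overwritten(size_t spill) {
    return spill > SIM_SLOT_LEN;
}
```

An input shorter than the buffer produces a spill of zero. An input of SIM_BUF_LEN + SIM_SLOT_LEN characters overwrites the frame pointer slot and reaches the first byte of the return address slot.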
Internals of Format String Attacks
Vulnerable code:
recv(buf);
printf(buf); /* should be printf("%s",buf) */
User input: \xdd \xcc \xbb \xaa %d %d %d %n
Stack layout (figure): the stack holds 0xaabbccdd followed by
%d %d %d %n; fmt is the format string pointer and ap is the
argument pointer.
In vfprintf():
if (fmt points to "%n")
then **ap = (character count)
*ap is a tainted value.
24
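A short sketch of the difference between passing user input as the format string and passing it as data. The helper sim_count_directives is a hypothetical, deliberately simplified scanner that only counts a '%' followed by another character (treating "%%" as a literal percent); it is not how vfprintf() parses formats. sim_print_safe shows the corrected call shape from the slide, using snprintf so the result can be inspected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Counts conversion directives a printf-style function would try to
 * interpret if s were used as the format string (simplified). */
static int sim_count_directives(const char *s) {
    int count = 0;
    for (size_t i = 0; s[i] != '\0'; i++) {
        if (s[i] != '%') {
            continue;
        }
        if (s[i + 1] == '%') {   /* "%%" is a literal percent sign */
            i++;
            continue;
        }
        if (s[i + 1] != '\0') {
            count++;
        }
    }
    return count;
}

/* Safe pattern: the user input is an argument, never the format. */
static int sim_print_safe(char *out, size_t out_len,
                          const char *user_input) {
    assert(out_len > 0);
    return snprintf(out, out_len, "%s", user_input);
}
```

For the input "%d %d %d %n" the scanner reports four directives that printf(buf) would act on, whereas sim_print_safe copies the same bytes to the output unchanged.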
Internals of Heap Corruption Attacks
Vulnerable code:
buf = malloc(1000);
recv(sock,buf,1024);
free(buf);
Heap layout (figure): Free chunk A, Allocated buffer buf (receives the
user input), Free chunk B (fd=A, bk=C), Free chunk C
In free():
B->fd->bk=B->bk;
B->bk->fd=B->fd;
When B->fd and B->bk are tainted, the effect of free() is to
write a user specified value to a user specified address.
25
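The two unlink statements from free() can be exercised in isolation. The sketch below assumes a chunk header that contains only the fd and bk links (real allocators store more fields), and it keeps the demonstration well defined by letting the "user specified" targets be ordinary chunk objects. struct sim_chunk and sim_unlink are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified free-chunk header: only the two list links. */
struct sim_chunk {
    struct sim_chunk *fd;
    struct sim_chunk *bk;
};

/* The unlink step shown on the slide, applied to chunk b. */
static void sim_unlink(struct sim_chunk *b) {
    assert(b != NULL && b->fd != NULL && b->bk != NULL);
    b->fd->bk = b->bk;   /* writes b->bk at the location chosen by b->fd */
    b->bk->fd = b->fd;   /* writes b->fd at the location chosen by b->bk */
}
```

With intact links (B.fd = A, B.bk = C), unlinking B connects A and C. If both links of B are replaced with attacker-chosen pointers, the same two statements store one chosen pointer into the object designated by the other.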
Building Defense Techniques
based on Pointer Taintedness
Static code analysis: analyze the
source code to extract the conditions
under which the possibility of pointer
taintedness exists.
– To uncover potential vulnerabilities
Runtime detection: monitor at runtime
whether a tainted value is
dereferenced as a pointer.
– To defeat memory corruption attacks
26
Static Analysis about Pointer Taintedness:
To Extract Security Specifications of Library Functions
IFIP International Information Security Conference 2004
27
Library function specifications are
crucial to secure programming
Library function specifications are specified
empirically
– printf(fmt,…), strcpy(d,s), free(p), glob(p),
strtok(s,del), savestr(p), ….
A unified reason why these specifications are
required
– Required to eliminate pointer taintedness.
Extraction of security specifications of a
function is reduced to a theorem proving task
Formal and complete specifications required
by compiler techniques to check application
source code for security.
28
Semantics of Pointer Taintedness
Formal definition of program semantics is required for
theorem proving.
– Currently defined using an equational logic framework
Taintedness-aware memory model
– The logic framework defines operations to fetch the content
and test the taintedness (true/false) of each memory
location.
Incorporate pointer taintedness into program
semantics
– Define program semantics at the assembly level to reason
about memory layout.
– Load/Store/ALU instructions: propagate taintedness from
source data to destination data.
– Input functions (scanf, recv and recvfrom)
Axiom: The memory locations in the receiving buffer are tainted
immediately after these function calls.
29
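The taintedness-aware memory model can be sketched in a few lines of C. This is an illustration under assumptions, not the equational-logic definition used in the talk: memory is a small byte array with one taint flag per byte, a register is a value plus a taint flag, an ALU result is tainted if either operand is tainted, and the input function is modeled by sim_recv. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SIM_MEM_SIZE 256

struct sim_memory {
    unsigned char data[SIM_MEM_SIZE];
    bool tainted[SIM_MEM_SIZE];   /* one taintedness flag per byte */
};

struct sim_value {                /* a register: value plus taint */
    unsigned char v;
    bool tainted;
};

static void sim_init(struct sim_memory *m) {
    memset(m, 0, sizeof *m);      /* all bytes zero and untainted */
}

/* Load: taintedness travels with the loaded value. */
static struct sim_value sim_load(const struct sim_memory *m, size_t addr) {
    assert(addr < SIM_MEM_SIZE);
    struct sim_value r = { m->data[addr], m->tainted[addr] };
    return r;
}

/* Store: the destination takes the taintedness of the source. */
static void sim_store(struct sim_memory *m, size_t addr, struct sim_value x) {
    assert(addr < SIM_MEM_SIZE);
    m->data[addr] = x.v;
    m->tainted[addr] = x.tainted;
}

/* ALU: assumed rule, result is tainted if any operand is tainted. */
static struct sim_value sim_add(struct sim_value a, struct sim_value b) {
    struct sim_value r = { (unsigned char)(a.v + b.v),
                           a.tainted || b.tainted };
    return r;
}

/* Input axiom: the receiving buffer is tainted right after the call. */
static void sim_recv(struct sim_memory *m, size_t addr,
                     const unsigned char *src, size_t len) {
    assert(addr <= SIM_MEM_SIZE && len <= SIM_MEM_SIZE - addr);
    memcpy(&m->data[addr], src, len);
    for (size_t i = 0; i < len; i++) {
        m->tainted[addr + i] = true;
    }
}

/* Detection rule: a tainted value must not be used as a pointer. */
static bool sim_deref_allowed(struct sim_value ptr) {
    return !ptr.tainted;
}
```

A byte received through sim_recv stays tainted after it is loaded, added to an untainted constant, and stored elsewhere, so sim_deref_allowed rejects it as an address, while values computed only from constants pass.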
Extracting Function Specifications
by Theorem Prover
– Input: C source code of a library function
– Automatically translated to formal semantic representation
– Theorem generation: for each pointer dereference in an
assignment, generate a theorem stating that the pointer is not tainted
– Theorem proving: yields a set of sufficient conditions that imply the
validity of the theorems. They are the security specifications of the
analyzed function.
30
Example: vfprintf()
int vfprintf (FILE *s, const char *format, va_list ap)
{ char * p, *q; int done,data,n,state;
  char buf[10];
  p=format; done=0; if (p==NULL) return 0; state=NO_PENDING;
  while (*p != 0) {
    if (state==NO_PENDING) {
      if (*p=='%') state=PENDING;
      else outchar(s,*p); }
    else {
      switch (*p) {
        case '%':
          outchar(s,'%');
          break;
        case 'd':
          data=va_arg (ap, int);
          if (data<0) { outchar(s,'-'); data=-data; }
          n=0;
          while (data>0 && n<10) {
            buf[n]=data%10+'0';  /* Theorem1: buf+n should not be a tainted value */
            data/=10;
            n++; }
          while (n>0) { n--; outchar(s,buf[n]); }
          break;
        case 's':
          q=va_arg (ap, char *);
          if (q==NULL) break;
          while (*q!=0) {
            outchar(s,*q);
            q++; }
          break;
        case 'n':
          q= va_arg(ap,void*) ;
          *(int*) q = done;      /* Theorem2: q should not be a tainted value */
          break;
        default:
          outchar(s,*p);
      }
      state=NO_PENDING;
    }
    p++;
  } return done;
}
31
Extracting the Specifications of vfprintf()
Try to prove the two theorems
The theorem prover cannot complete the proof initially
– the theorems are only valid under certain preconditions.
Add these preconditions as axioms to the theorem
prover.
Repeat until both theorems are proved.
Four preconditions are added: the specifications of
vfprintf (FILE *s, const char *format, va_list ap)
– ap never points to any location within the current function
frame.
– *ap never points to the location of variable ap, i.e., *ap ≠ &ap
– Suppose the memory segment that ap sweeps over is called
ap_activity_range, then *ap never points to any location
within ap_activity_range.
– No locations within ap_activity_range are tainted before
vfprintf() is called.
These preconditions suggest the scenario of the format string vulnerability.
32
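Two of the extracted preconditions can be phrased as simple predicates over a simulated address space. The sketch below is only an illustration of what the preconditions state; it is not the theorem-proving procedure. Addresses are modeled as array indices, the range that ap sweeps over is given as a start index and a length, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Precondition "no location within the range is tainted":
 * true if no byte in [start, start + len) is marked tainted. */
static bool sim_range_untainted(const bool *taint, size_t mem_size,
                                size_t start, size_t len) {
    assert(taint != NULL);
    assert(start <= mem_size && len <= mem_size - start);
    for (size_t i = 0; i < len; i++) {
        if (taint[start + i]) {
            return false;
        }
    }
    return true;
}

/* Helper for "*ap never points into the range": true if the
 * address ptr lies inside [start, start + len). */
static bool sim_points_into(size_t ptr, size_t start, size_t len) {
    return ptr >= start && ptr - start < len;
}
```

A caller of a vfprintf()-like function would need sim_range_untainted to hold for the argument area and sim_points_into to be false for every pointer argument read from it.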
Other Studied Examples
Function strcpy()
– Four security specifications indicating buffer overflow, buffer
overlapping and buffer underflow scenarios causing pointer
taintedness.
Function free() of a heap management system
– Seven security specifications are extracted, including several
specifications indicating heap corruption vulnerabilities.
Socket read functions of Apache HTTP Server and
NULL HTTP Server
– Apache function is proven to be free of pointer taintedness.
– Two (known) vulnerabilities are exposed in the theorem
proving process of NULL HTTP Server function.
33
Runtime Pointer Taintedness Detection:
To Defeat Memory Corruption Attacks
To appear in IEEE Conference on Dependable Systems and Networks, 2005.
34
The Technique
A processor architectural level mechanism
to detect pointer taintedness
– On SimpleScalar simulator
Implemented a taintedness-aware memory
system
Extended instructions to track taintedness
– To show the validity of pointer taintedness
concept on whole programs of real
applications
Network servers
SPEC 2000 integer benchmarks
35
Evaluations on Real-World Software
Evaluation
– Effectiveness of detection
– No false alarm in any application evaluated
– Transparent to applications
– A small number of potential attack scenarios
undetected.
Pointer taintedness detection can be applied
to the whole program of real software
– offers a substantial improvement on security
protection.
36
Conclusions
37
Conclusions
Many real-world software applications can be compromised by
corrupting non-control data.
– It is insufficient to rely on control flow integrity for
software security.
Pointer taintedness is a unifying perspective to
reason about most memory corruption
vulnerabilities/attacks.
Reasoning about pointer taintedness is a promising
direction to enhance security on real-world systems
– A theorem proving based code analysis approach
– A runtime pointer taintedness detection mechanism
38
Future Directions
Short term goals
– Provide a higher degree of automation for the theorem
proving technique.
– Reduce the intrusiveness of the runtime pointer
taintedness detection technique
Combine with the theorem proving technique. The processor
only checks function preconditions.
Long term goals
– Extract programming styles susceptible to security attacks.
Can compilers detect bad programming styles?
– Identify a broader range of non-traditional security
threats.
– Study historical data about how security vulnerabilities
were discovered, reported and patched.
– Decompose the behaviors of viruses, worms and rootkits
to a number of basic building blocks.
39
Summary of My Research Methodology
Analysis-centric approach
– A significant amount of effort in my dissertation is
on analysis.
– Starting from the reality (usually a mess) to define
problems!
I am a data analysis person
– Excited to analyze real data and incidents
– Tedious? Sometimes, but it is a step toward a lot of
fun.
– Rewarding? Definitely. Especially important for
systems research.
– Goal: strongly motivate research topics that solve
real-world problems.
40