An Overview of Vulnerability Research and Exploitation Peter Winter-Smith [peter@ngssoftware.com] Chris Anley [chris@ngssoftware.com]


Agenda

• Who we are
• What we do
• How we do it
• What we’ve found

Who Are NGS?

• Security Research, Software and Consultancy company
• UK (London) based, self-funded, no debt, hence freedom of action, independence
• 50+ employees
• UK Govt. CHECK scheme, also accredited to audit Visa + MasterCard sites
• You may have heard of:
  – “Slammer” worm (Jan 2003)
  – Oracle “Unbreakable” campaign (Feb 2002)
  – Sybase threats of legal action (April 2005)
  – Many, many more publicly disclosed vulnerabilities

What Do We Do?

Attack! (and advise on defences)

• Paid vulnerability research / product review for software vendors
• Paid network / application audit for wide range of companies
• Unpaid research into “interesting” major products
• Write and sell software to automate all of the above

What Are We Looking For?

• Behaviours (unintended and deliberate) in a piece of hardware, software or protocol that are “useful” to an attacker
• But, generally, “arbitrary code execution” flaws
• Buffer overflows, format string bugs, signedness errors, pointer overwrites, null overwrites, application-specific code execution flaws

Why arbitrary code execution?

• Execute attacker code in the context of the target, e.g. do everything the database can do
• Instant and total control of the target process, sometimes the target host, occasionally the target network domain
• Still growing – after 20 years of research – and still the most dangerous class of vulnerability

Buffer overflows 1989-2005

Buffer Overflows

• Data buffer of size x is allocated
• Data of size (x+y) is copied into buffer
• Data ‘after’ the buffer is overwritten
• Stack overflow – local variables, saved return address, parameters, exception handling data
• Heap overflow – heap management structures
• Net effect – redirection of execution, normally into the data that the attacker supplied
• The attacker is uploading and running a program with the privileges of the program they are attacking
• Several worms based on overflow attacks – Morris worm (1988), Code Red, Slammer, Witty, Apache-Slapper, Zotob
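The overwrite described above can be modelled without touching real memory. The sketch below is only an illustration: the 8-byte buffer, the 4-byte "saved return address" placed directly after it, and all function names are assumptions made for the example, not a real stack layout.

```python
BUF_SIZE = 8   # the "x" bytes allocated for the buffer (assumed size)
RET_SIZE = 4   # pretend saved return address stored right after the buffer

def new_frame(return_address):
    """Model a stack frame as [buffer][saved return address]."""
    return bytearray(BUF_SIZE) + return_address.to_bytes(RET_SIZE, "little")

def saved_return(frame):
    """Read back the modelled saved return address."""
    return int.from_bytes(frame[BUF_SIZE:BUF_SIZE + RET_SIZE], "little")

def unchecked_copy(frame, data):
    """Copies every byte of data, ignoring BUF_SIZE (strcpy-style)."""
    for i, b in enumerate(data):
        frame[i] = b

def checked_copy(frame, data):
    """Rejects input that does not fit in the buffer."""
    if len(data) > BUF_SIZE:
        raise ValueError("input larger than buffer")
    frame[:len(data)] = data

frame = new_frame(0x00401000)
# x + y bytes: 8 bytes of filler, then 4 bytes that land on the return address.
unchecked_copy(frame, b"A" * BUF_SIZE + (0x41414141).to_bytes(RET_SIZE, "little"))
print(hex(saved_return(frame)))
```

In this model the unchecked copy leaves the attacker's value where the return address used to be, while `checked_copy` refuses the same input.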

Format string bugs

• C string formatting functions are called the ‘printf’ family
• Safe: printf("My name is %s", name);
• Unsafe: printf(buff);
• The %n specifier writes the number of characters output so far to the variable its argument points to
• If the attacker includes more specifiers than there are arguments, printf pulls the extra values off the stack
• The attacker is able to write the value of their choice to the location of their choice, multiple times, and thereby redirect execution into their own, attacker-supplied code
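The way extra specifiers consume stack values, and the way %n turns a count into a write, can be shown with a toy formatter. This is a deliberately simplified model, not how a real C library implements printf: `toy_printf`, the `stack` list and the `memory` dictionary are hypothetical stand-ins for the process stack and writable memory.

```python
def toy_printf(fmt, stack, memory):
    """Toy model of printf: every specifier consumes the next stack slot.

    stack  - list of ints standing in for the values printf would read
    memory - dict standing in for writable memory (address -> value)
    """
    out = []
    written = 0   # characters output so far, which is what %n stores
    slot = 0      # index of the next stack slot to consume
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        if ch == "%" and i + 1 < len(fmt):
            spec = fmt[i + 1]
            i += 2
            if spec == "%":
                text = "%"
            elif spec == "x":
                text = format(stack[slot], "x")   # leaks a stack value
                slot += 1
            elif spec == "n":
                memory[stack[slot]] = written     # slot is used as an address
                slot += 1
                text = ""
            else:
                raise ValueError("unsupported specifier: %" + spec)
        else:
            text = ch
            i += 1
        out.append(text)
        written += len(text)
    return "".join(out)

# The caller passed no arguments, so both slots are whatever was on the stack.
stack = [0xdeadbeef, 0x1000]
memory = {}
leaked = toy_printf("AAAA%x%n", stack, memory)
print(leaked, memory)
```

In the model, `%x` discloses a value the caller never passed, and `%n` stores the running character count at an "address" taken from the stack. By padding the output, an attacker controls the count and therefore the value written.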

SQL injection

Occurs when a SQL Query is created by concatenating user-supplied data with string constants, e.g.

"Select * from users where user='" + username + "' and password='" + password + "'"

If user-supplied data can contain the string delimiter (single quote, '), you end up being able to insert arbitrary SQL into the query passed to the database, e.g. a username of '; drop table users-- results in

Select * from users where user=''; drop table users--

…which drops the ‘users’ table.
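A minimal sketch of the same flaw, using Python's built-in `sqlite3` module as a stand-in database. The table layout, the helper names and the use of `executescript` (chosen here because it accepts several statements in one string, as some database APIs do) are assumptions for illustration only.

```python
import sqlite3

def make_db():
    """Create a throwaway in-memory database with one user."""
    db = sqlite3.connect(":memory:")
    db.execute("create table users (user text, password text)")
    db.execute("insert into users values ('alice', 'secret')")
    db.commit()
    return db

def table_exists(db, name):
    row = db.execute(
        "select count(*) from sqlite_master where type='table' and name=?",
        (name,),
    ).fetchone()
    return row[0] == 1

def login_unsafe(db, username, password):
    """Builds the query by concatenation, exactly as in the slide."""
    query = ("Select * from users where user='" + username +
             "' and password='" + password + "'")
    db.executescript(query)   # runs every statement found in the string
    return query

def login_safe(db, username, password):
    """Parameterised query: user data is never parsed as SQL."""
    cur = db.execute(
        "Select * from users where user=? and password=?",
        (username, password),
    )
    return cur.fetchall()

hostile = "'; drop table users--"
db = make_db()
print(login_unsafe(db, hostile, ""))
print("users table still present:", table_exists(db, "users"))
```

With the parameterised version the same hostile string is treated as an ordinary (non-matching) username and the table is left alone.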

A Philosophical Moment

From a certain perspective, these are all the same bug.

Data in grammar A is interpreted in grammar B, e.g. a username becomes SQL, some string data becomes ‘stack’ or ‘heap’.

Other examples – CDONTS.Newmail ‘sender’ – an email address – is interpreted in the underlying SMTP protocol, so if the sender is [email protected]\n …the attacker can send arbitrary mail

Much of what we do relies on our understanding of these underlying grammars and subsequent ability to create valid phrases in ‘B’ that work in ‘A’

How Do We Do It?

• Methodology (though this is fluid)
• Create and use tools
• Perform analysis
  – Source code / architectural / protocol review
  – Dynamic instrumentation using hooks
  – Code coverage
• Automated dynamic fault analysis, aka “Fuzzing”

Fuzzers, Tools and Techniques

• Generally two different techniques for assessment – black-box and white-box
• White-box often leads to manual testing and very focused fuzzing
• Black-box often requires the use of analysis and monitoring tools more so than white-box
  – Code coverage tools
  – Global function interception

Code Coverage

• Works through one of two means – compiled into the binary, or added at runtime
• Allows for a good idea of level of coverage achieved during testing/fuzzing
• Has been combined with fuzzing tools to implement smart-fuzzing techniques which detect branching based on test-case
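As a rough illustration of coverage "added at runtime", the sketch below uses Python's `sys.settrace` hook to record which lines of one function each test case executes. It is an analogy for binary instrumentation, not the tooling described in the talk; `trace_lines` and the `parse` target are hypothetical.

```python
import sys

def trace_lines(func, *args):
    """Run func(*args) and return the set of line offsets executed inside it."""
    code = func.__code__
    hit = set()

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None                      # ignore every other function
        if event == "line":
            hit.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)               # always restore the old hook
    return hit

def parse(data):
    # Hypothetical target with a branch that only some inputs reach.
    if data.startswith(b"HDR"):
        return len(data) - 3
    return -1

with_header = trace_lines(parse, b"HDRabc")
without_header = trace_lines(parse, b"xyz")
print(sorted(with_header), sorted(without_header))
```

A fuzzer can compare these sets between test cases: an input that produces a line set not seen before has reached a new branch and is worth keeping for further mutation.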

Global Function Interception

• Good for fast understanding of process internals
• Enumerates each module and function in the process address space
• Redirects (hooks) each function, pointing it at a dynamically generated stub function that deduces and logs the parameters
• Continues execution where it left off (does not affect calling convention)
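The enumerate, hook, log and continue sequence can be sketched in Python by wrapping every function in a namespace. This only mirrors the idea; real interception of native code patches function entry points. `hook_all` and `fake_module` are hypothetical names.

```python
import functools
import types

def hook_all(namespace, log):
    """Replace every plain function in namespace with a logging stub."""
    def make_stub(name, original):
        @functools.wraps(original)
        def stub(*args, **kwargs):
            log.append((name, args, kwargs))    # record the parameters
            return original(*args, **kwargs)    # then continue as normal
        return stub

    for name in dir(namespace):                 # enumerate the "exports"
        original = getattr(namespace, name)
        if isinstance(original, types.FunctionType):
            setattr(namespace, name, make_stub(name, original))

# Hypothetical "module" standing in for a target's function table.
fake_module = types.SimpleNamespace(
    add=lambda a, b: a + b,
    greet=lambda who: "hello " + who,
)

calls = []
hook_all(fake_module, calls)
result = fake_module.add(2, 3)
print(result, calls)
```

The caller sees the same return value as before, and `calls` accumulates a trace of which functions ran and with what arguments.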

Fuzzing

• Fuzzing is a testing technique used to deliver a large number of test-cases of malformed input to the target application
• Fuzzers are generally either smart (application or protocol aware), or naïve (operate with very basic knowledge of the target)
• Fuzzers are becoming more intelligent!
• NGS use fuzzers on almost every engagement
  – Fault injection
  – Fuzzing frameworks (binary/text)
  – Smart fuzzers
  – RPC/COM fuzzers
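A naïve fuzzer in the sense used above can be very small. The sketch below mutates one byte of a valid seed at a time and records any test case that makes the target fail in an unexpected way. The `parse_record` target, its planted bug and the seed are invented for the example.

```python
import random

def parse_record(data):
    """Hypothetical target: 1-byte length, then that many payload bytes."""
    if len(data) < 1:
        raise ValueError("empty input")
    length = data[0]
    payload = data[1:1 + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    # Planted bug: the parser assumes the payload is never empty.
    return payload[0]

def mutate(seed, rng):
    """Naïve mutation: overwrite one random byte with a random value."""
    data = bytearray(seed)
    data[rng.randrange(len(data))] = rng.randrange(256)
    return bytes(data)

def fuzz(target, seed, iterations, rng):
    """Deliver mutated test cases; keep those raising unexpected exceptions."""
    crashes = []
    for _ in range(iterations):
        case = mutate(seed, rng)
        try:
            target(case)
        except ValueError:
            pass                       # graceful rejection, not a finding
        except Exception as exc:       # anything else counts as a fault
            crashes.append((case, type(exc).__name__))
    return crashes

crashes = fuzz(parse_record, b"\x03abc", 20000, random.Random(0))
print(set(crashes))
```

The only knowledge this fuzzer has of the target is one valid input, which is why it needs thousands of attempts to hit a single-byte condition. A protocol-aware fuzzer would try a zero length field directly.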

Fault Injection

• Good for fuzzing of very state-reliant applications (removes the need for re-negotiation of state)
• Allows very specific areas of target to be tested (on a function or function group basis)
• Debugs the target – saving state information including writeable memory pages and thread context information
• Delivers new test-case by process specific means (usually a buffer and a length)
• Allows test-case to run, logs any exceptions, and repeats
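The snapshot, inject, log and restore loop can be imitated in Python with `copy.deepcopy` taking the place of saved memory pages and thread context. The `Session` class, its password and the planted fault are all hypothetical, and a real implementation would work through a debugger rather than object copies.

```python
import copy

class Session:
    """Hypothetical stateful target: commands only work after login."""

    def __init__(self):
        self.authenticated = False
        self.history = []

    def login(self, password):
        self.authenticated = (password == "letmein")

    def handle(self, buf, length):
        if not self.authenticated:
            raise PermissionError("login first")
        data = buf[:length]
        self.history.append(data)
        if data.startswith(b"\xff"):
            raise RuntimeError("unhandled opcode")   # planted fault
        return len(data)

def inject(snapshot, cases):
    """Restore the saved state for each case, run it, and log exceptions."""
    log = []
    for case in cases:
        session = copy.deepcopy(snapshot)     # roll back to the snapshot
        try:
            session.handle(case, len(case))   # deliver a buffer and a length
        except Exception as exc:
            log.append((case, type(exc).__name__))
    return log

session = Session()
session.login("letmein")             # negotiate state once...
snapshot = copy.deepcopy(session)    # ...then save it at the point of interest
faults = inject(snapshot, [b"hello", b"\xff\x00", b""])
print(faults)
```

Because every test case starts from the same saved state, the login step is never repeated and one test case cannot contaminate the next.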

Fuzzing Frameworks

• NGS have several flexible frameworks which allow us to implement fuzzers for a given protocol with relative ease
• NGS have two main frameworks, one for text based protocols, one for binary based protocols
• The frameworks have to support the idea of relationships between data elements (e.g. lengths, checksums, encoded data)
• The frameworks are implemented as DLLs which give their output as data, allowing it to be used through whatever delivery method is most suitable (e.g. fault injection, TCP session)
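The point about relationships between data elements is easiest to see with a length field and a checksum: if a fuzzer changes the payload without recomputing them, the target rejects the packet before any interesting code runs. The packet layout below (2-byte length, payload, CRC32) is invented for illustration and is not an NGS format.

```python
import struct
import zlib

def build_packet(payload):
    """Hypothetical layout: 2-byte big-endian length, payload, CRC32."""
    header = struct.pack(">H", len(payload))
    checksum = struct.pack(">I", zlib.crc32(header + payload))
    return header + payload + checksum

def parse_packet(packet):
    """Reject packets whose length or checksum does not match the payload."""
    (length,) = struct.unpack(">H", packet[:2])
    payload, checksum = packet[2:-4], packet[-4:]
    if length != len(payload):
        raise ValueError("bad length")
    if struct.unpack(">I", checksum)[0] != zlib.crc32(packet[:-4]):
        raise ValueError("bad checksum")
    return payload

def test_cases(payloads):
    """Yield raw packets with the dependent fields recomputed per payload."""
    for payload in payloads:
        yield build_packet(payload)

payloads = [b"", b"A" * 1000, b"%n%n%n%n"]
cases = list(test_cases(payloads))
print([len(c) for c in cases])
```

The generator returns plain bytes, so the same test cases could be written to a socket or handed to a fault injector, which loosely mirrors the "output as data" design described above.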

Smart Fuzzing

• Smart fuzzing is the name given to more state/protocol aware fuzzers
• Often, naïve fuzzing is not enough to find issues – it cannot get deep enough inside the target
• Smart fuzzers which NGS have implemented include
  – Code coverage based fuzzers
  – Protocol/state aware fuzzers using the fuzzing frameworks
  – Fuzzers relying on heuristics (e.g. able to deduce lengths, blocks within blocks, etc.)

RPC and COM Fuzzing

• It is not uncommon for an application under assessment to operate over RPC or COM
• Due to the nature of COM, it is relatively easy to implement fuzzers to query the component about its methods and properties
• NGS have implemented a fuzzer to automatically fuzz COM components with a given set of generated malformed arguments and properties

COM Fuzzing

• COM interfaces disclose a large amount of information about the target component
• ITypeInfo interface is used to gain information about a method’s parameters
• IPropertyBag2 is used to gain information about properties which can be queried or set
• IDispatch is used to call the method with a new test-case
• Iterates through all methods and properties

RPC Fuzzing

• RPC is an interesting case, as data transported to the target application has to be formatted using NDR encoding before being fuzzed
• NDR is flexible, and will often successfully decode badly formed data and structures due to the way in which NDR is formatted
• NGS have written a naïve fuzzer which will construct well formed calls with random arguments to a given interface over TCP/Named Pipes/LPC
• Disadvantage – many RPC applications are heavily state based and require knowledge of what is going on behind the scenes before getting deep inside the code

RPC Smart Fuzzing

• It is trivial to extract a large amount of data relating to the interface from the MIDL generated format string (compiled into the RPC client and server)
• The format string is a byte representation of the interface used by the NDR functions to correctly decode and encode data
• The extracted interface holds information about the data types expected, and context handles, etc.
• NGS have constructed an RPC fuzzer which can parse the IDL file and build structures and well formed calls to the interface, retaining context information – which assists in getting deeper inside the application

What Kinds of Issues Have We Found?

• SQL UDP (used by the ‘Slammer’ worm)
• NetDDE, LexPPS
• Sun directory server
• MySQL auth bypass
• Apache chunked encoding

Typical Issues We See

• Stack overflow
• Heap overflow
• Off By One overflow
• Integer wrap
• Other Signedness Error
• Canonicalisation issues
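Integer wrap and signedness errors come from fixed-width arithmetic, which Python integers do not have, so the sketch below emulates 32-bit behaviour with a mask. The function names and the example values are assumptions chosen to make the wrap visible.

```python
MASK32 = 0xFFFFFFFF

def alloc_size_32(count, elem_size):
    """Size a 32-bit C program would compute for count * elem_size."""
    return (count * elem_size) & MASK32

def checked_alloc_size(count, elem_size):
    """Refuse requests whose true size does not fit in 32 bits."""
    size = count * elem_size
    if size > MASK32:
        raise OverflowError("allocation size wraps")
    return size

def as_signed32(value):
    """Reinterpret the low 32 bits of value as a signed integer."""
    value &= MASK32
    return value - 0x100000000 if value & 0x80000000 else value

# Integer wrap: a huge element count yields a tiny allocation.
count, elem_size = 0x40000001, 4
wrapped = alloc_size_32(count, elem_size)

# Signedness: a length of 0xFFFFFFFF passes a signed "length < 64" check,
# yet is enormous when later used as an unsigned copy size.
length = 0xFFFFFFFF
passes_signed_check = as_signed32(length) < 64

print(wrapped, passes_signed_check)
```

In both cases a later copy trusts a size that no longer matches the buffer, which turns the arithmetic slip into one of the overflows listed above.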

The Future

• Only some subclasses of flaw are amenable to automated discovery.

• Ultimately “security” is a human concept, as is “bug”. This is why, in our experience, skilled humans make better attackers than even the very best commercial toolset.

• Vulnerabilities, in the sense we’ve used the word today, aren’t going away in the foreseeable future.
