General approach to exploit detection and signature generation White-box Gray-box

General approach to exploit detection and
signature generation

White-box

Needs the source code

Gray-box

More accurate, but needs to monitor a program's execution flow

Black-box

Detect and analyze an exploit using the outputs of a vulnerable program.
Packet vaccine approach

A black-box approach.
Faster, but makes little use of data format information.

ShieldGen approach

Gray-box approach
A general gray-box approach is inherently specific to the attack input used in the data flow analysis.
ShieldGen generalizes attack-specific, symbolic predicate-based signatures to cover significantly more attack variants, using data format-informed probing of the oracle.
Packet Vaccine: Black-box Exploit
Detection and Signature Generation
Xiaofeng Wang, Zhuowei Li, Jun Xu, Michael K. Reiter, Chongkyung Kil, Jong Youl Choi
Presented by Zhaosheng Zhu
Outline







Introduction to Packet Vaccine
Related work
Design of the packet vaccine mechanism
Implementation and Evaluation
Application (Good Points)
Limitations (Bad Points)
Conclusion
Introduction to Packet Vaccine


The principle of a vaccine
Packet vaccine:

Identify anomalous tokens in packet payloads
Randomize the contents of those tokens to get a vaccine
Generate a signature when the vaccine causes an exception
Design of the packet vaccine
mechanism
Design: 1. Vaccine Generation

Build a target address set:




$T = [b_s - a_{us},\ b_s] \cup [b_h,\ b_h + a_{uh}] \cup S$
Aggregate the application payloads of the packets in one session into a dataflow, and carry out proper decoding
For every byte sequence in the dataflow that looks like an address in T, do the replacement
Construct vaccine packets using the new dataflows
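The vaccine generation steps can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the layout constants, the 4-byte little-endian token width, the reading of the set's terms as stack and heap bases with size bounds, the choice to scramble only the most significant byte, and the names `in_target_set` and `make_vaccine` are all assumptions made for the example.

```python
import random
import struct

# Hypothetical memory layout; a real monitor would read it from the target process.
STACK_BASE = 0xBFFFF000   # b_s (assumed to be the stack base)
STACK_MAX = 0x00800000    # a_us: assumed upper bound on stack size
HEAP_BASE = 0x08050000    # b_h (assumed to be the heap base)
HEAP_MAX = 0x01000000     # a_uh: assumed upper bound on heap size
OTHER = {0x0804A010}      # S: other addresses of interest (assumed)


def in_target_set(addr):
    """True if addr lies in T = [bs - aus, bs] U [bh, bh + auh] U S."""
    in_stack = STACK_BASE - STACK_MAX <= addr <= STACK_BASE
    in_heap = HEAP_BASE <= addr <= HEAP_BASE + HEAP_MAX
    return in_stack or in_heap or addr in OTHER


def make_vaccine(dataflow, rng):
    """Replace every address-like 4-byte token with a value outside T.

    Returns the vaccine bytes and a list of (offset, injected_value) pairs.
    """
    out = bytearray(dataflow)
    injected = []
    i = 0
    while i + 4 <= len(out):
        (value,) = struct.unpack_from("<I", out, i)
        if not in_target_set(value):
            i += 1
            continue
        # Scramble the most significant byte until the token leaves T.
        scrambled = value
        while in_target_set(scrambled):
            scrambled = (value & 0x00FFFFFF) | (rng.randrange(256) << 24)
        struct.pack_into("<I", out, i, scrambled)
        injected.append((i, scrambled))
        i += 4
    return bytes(out), injected
```

The returned `(offset, injected_value)` pairs are kept so that a later fault address can be matched against the tokens that were randomized.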
Example
Design: 2. Exploit Detection and
Vulnerability Diagnosis


Correlate each byte sequence that equals the forensic string with the exception
Validation test




Randomize all byte sequences
Generate new vaccine
Check
Repeat
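A rough Python sketch of the correlation and validation loop, under stated assumptions: `run_probe` is a hypothetical callback that injects a vaccine built from the given token values and reports the faulting address, and the range used for fresh random values is assumed to be unmapped in the target process. Neither detail comes from the slides.

```python
import random


def correlate(fault_addr, injected):
    """Offsets of injected tokens whose value equals the faulting address."""
    return [off for off, value in injected if value == fault_addr]


def validate(run_probe, offsets, rng, rounds=3):
    """Re-randomize the suspect tokens and check that the fault follows them.

    run_probe is a hypothetical callback: it takes {offset: value}, injects
    the resulting vaccine, and returns the faulting address (or None).
    """
    for _ in range(rounds):
        value = rng.randrange(0xC0000000, 0x100000000)  # assumed unmapped range
        fault = run_probe({off: value for off in offsets})
        if fault != value:
            return False
    return True
```

If the faulting address tracks the randomized value across several rounds, the token at that offset is very likely the one that hijacks control flow.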
Design: 3. Signature
Generation



Construct packet vaccines, or probes, by randomizing address-like strings
Detect an exploit by observing a memory exception upon packet vaccine injection
Generate signatures by finding the bytes in the attack input that cannot take random values
Byte-based vaccine injection

Can be parallelized in most cases
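Byte-based probing can be illustrated with a short Python sketch. It assumes a hypothetical oracle `triggers_exception(probe)` that replays a probe against the vulnerable program and reports whether the memory exception still occurs; the number of trials per byte is also an arbitrary choice for the example.

```python
import random


def probe_bytes(attack, triggers_exception, rng, trials=4):
    """Return {offset: byte} for bytes that cannot take random values.

    A byte is kept in the signature if at least one random replacement
    makes the exploit stop triggering the exception.
    """
    fixed = {}
    for i, original in enumerate(attack):
        alternatives = [b for b in range(256) if b != original]
        for _ in range(trials):
            probe = attack[:i] + bytes([rng.choice(alternatives)]) + attack[i + 1:]
            if not triggers_exception(probe):
                fixed[i] = original
                break
    return fixed
```

Each probe changes a single byte and does not depend on the result of any other probe, which is why this kind of probing can usually be run in parallel.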
Implementation





The target address set is extracted from proc files
The process monitor is developed using ptrace
Kernel-mode access is necessary to read CR2 (the x86 register that holds the faulting address)
Signature generation:
- Prober
- Verifier
Sequential vaccine injection (performance penalty)
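One way to obtain stack and heap ranges on Linux is to parse `/proc/<pid>/maps`. The sketch below is an assumption about how such extraction could look; the slides say only that proc files are used, and the helper names `parse_maps` and `read_regions` are hypothetical.

```python
def parse_maps(text):
    """Return {"[stack]": (start, end), "[heap]": (start, end)} from maps text."""
    regions = {}
    for line in text.splitlines():
        fields = line.split()
        # Anonymous mappings have no pathname column, so they have 5 fields.
        if len(fields) >= 6 and fields[5] in ("[stack]", "[heap]"):
            start, end = (int(part, 16) for part in fields[0].split("-"))
            regions[fields[5]] = (start, end)
    return regions


def read_regions(pid):
    """Read the live mappings of a process (Linux only)."""
    with open(f"/proc/{pid}/maps") as handle:
        return parse_maps(handle.read())
```

Keeping the parser separate from the file read makes it possible to exercise the parsing logic on captured text without a live process.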
Evaluation



Linux exploits
Windows-based exploits: Code Red II
Heap-based overflow
Evaluation

Comparison with MEP signatures



MEP signatures contain richer information
Quality of MEP diminishes with the availability of multiple exploit instances and application information
MEP is slower
Application
An architecture to protect Internet servers using packet vaccine
Application (good points)

Fast

Up to an order of magnitude faster than gray-box approaches

Effective

Does not use source code
Immune to interference

Low overhead

No need to install anything on the host
Lightweight collector
Limitations

Its main probing scheme randomizes each byte rather than leveraging data format information
- Works more reliably for text-based protocols than for binary ones, because it lacks protocol knowledge of binary data formats.
- The paper only briefly mentions the benefit of leveraging protocol specifications.
- It is unclear what type of protocol specification language is considered and how protocol specifications would be leveraged.

Can only detect control-flow hijacking attacks
- Cannot detect exploits of the WMF vulnerability
Conclusion


Packet vaccine is a fast, black-box technique for exploit detection
But it is not good enough in some cases. Given the input data format, there is a better approach: ShieldGen.
ShieldGen: Automatic Data Patch Generation
for Unknown Vulnerabilities
with Informed Probing
Weidong Cui, Marcus Peinado, Helen J. Wang, Michael E. Locasto
Presented by Zhaosheng Zhu
Outline






What is ShieldGen
Related work and Comparison
System Design
Evaluation and Performance
Some future work
Conclusion
What is ShieldGen



A system for automatically generating a
data patch or a vulnerability signature
for an unknown vulnerability.
Leverages knowledge of the data format
Uses a data patch instead of a traditional software patch.
ShieldGen system overview
Related work




Polygraph
- Significant false negatives and false positives
Nemean
- Generalization is dependent on the attack instance.
COVERS
- Signatures do not contain any protocol context.
Packet vaccine
- Randomizes each byte rather than leveraging data format information. Not efficient enough.
- Can only detect control-flow hijacking attacks
The Oracle: a Zero-Day Attack
Detector
Uses Vigilante’s zero-day detector
- Based on dynamic data flow analysis
- Implements three vulnerability conditions



Arbitrary execution control (AEC)
Arbitrary code execution (ACE)
Arbitrary function arguments (AFA)
Data Format Spec and Data
Analyzer

Two assumptions about the input data

Data formats are known
No encryption or obfuscation is used.
Two types of analyzers


File data: application level protocol, host-based
Network data



High-speed parsing w/ preprocessed protocol parser
E.g., binpac and GAPA
We use GAPA as our Data analyzer
System design

Design goals



No false positives
Minimizing the number of false negatives
Minimizing the number of probes.
Data patch generation
Some methods to reduce
probes



Recognizing iterative elements
Obeying protocol semantics and reducing illegitimate probes.
The vulnerability predicate is highly likely to depend only on the last message
Probe generation algorithm

Three Steps



Buffer Overrun heuristic for character strings
Iteration removal
Eliminating irrelevant field conditions
Buffer overrun heuristics

If the offending byte lies in the middle of a byte or Unicode string, then ShieldGen diagnoses a buffer overrun and adds the following condition as a refinement:
sizeof(buffer) > offendingByte offset - bufferStart offset
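The refinement can be made concrete with a small Python sketch. The `Field` record, the reading of "in the middle" as "inside the field and past its first byte", and the function name `overrun_refinement` are assumptions for illustration, not details from the ShieldGen paper.

```python
from collections import namedtuple

# Hypothetical parsed-field record: name, start offset, size in bytes, kind.
Field = namedtuple("Field", "name start size kind")


def overrun_refinement(fields, offending_offset):
    """Return (field_name, threshold) if the offending byte is inside a string.

    The refined condition then reads: sizeof(field) > threshold,
    where threshold = offending_offset - field.start.
    """
    for field in fields:
        is_string = field.kind in ("bytes", "unicode")
        # "Middle" is taken to mean: inside the field, past its first byte.
        inside = field.start < offending_offset < field.start + field.size
        if is_string and inside:
            return field.name, offending_offset - field.start
    return None
```

For a 200-byte string field starting at offset 1 and an offending byte at offset 97, the sketch yields the condition sizeof(name) > 96, so any input whose string field is longer than 96 bytes would be filtered.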
Iteration removal



Many popular input formats include arbitrary sequences of largely independent elements (records). Any input that contains a malicious record is an attack.
Generate probes by removing some of the iterative elements.
Iterative elements can be removed if the probes still exploit the vulnerability successfully.
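A minimal Python sketch of iteration removal, assuming a hypothetical oracle `is_exploit(records)` that rebuilds an input from the given records and reports whether the attack still succeeds. The greedy one-record-at-a-time strategy is a simplification chosen for the example.

```python
def remove_iterations(records, is_exploit):
    """Greedily drop records while the remaining input is still an exploit."""
    kept = list(records)
    i = 0
    while i < len(kept):
        trial = kept[:i] + kept[i + 1:]
        if is_exploit(trial):
            kept = trial      # record i was irrelevant; retry the same index
        else:
            i += 1            # record i is needed; move on
    return kept
```

This uses one probe per record, and the records that survive are the ones the signature needs to describe.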
Eliminating irrelevant field
conditions


Construct probes over the remaining data fields to eliminate don’t-care fields and to find additional values of the data fields for which the attack succeeds.
Evaluate one field at a time
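Evaluating one field at a time might look like the following Python sketch. The dictionary message model, the candidate value lists, and the oracle `is_exploit(message)` are hypothetical; treating a field as don't-care after a finite sample of alternatives is a simplification made for the example.

```python
def probe_fields(attack, candidates, is_exploit):
    """Vary one field at a time while the others keep their attack values.

    attack:     {field: value} of the original attack instance
    candidates: {field: [alternative values to try]}
    Returns (dont_care, extra_values).
    """
    dont_care = set()
    extra_values = {}
    for name, alternatives in candidates.items():
        working = [v for v in alternatives if is_exploit({**attack, name: v})]
        if alternatives and len(working) == len(alternatives):
            dont_care.add(name)            # every tried value still exploits
        elif working:
            extra_values[name] = working   # some additional values exploit
    return dont_care, extra_values
```

Fields for which no alternative value works stay pinned to their original attack value in the resulting signature.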
Evaluation

Run ShieldGen for three well known
vulnerabilities



SQL vulnerability
RPC vulnerability
WMF (Windows Metafile) vulnerability
Filter quality of ShieldGen

For a larger sample of real-world
vulnerabilities
Failure cases and analysis



Complex conditions
Unchecked array indices
Other special cases
Future work

Quality of the data format specification


In our scheme the quality of data format
specification matters.
Complex filter conditions
Future work

Probing time


Reference VM is preferred
Attacks not delivered by the last
message
Conclusion


Leverage data format information to construct new attack instances
Generate high-quality vulnerability signatures


Fewer don’t care fields
Fewer false negatives
Thanks!
