NetShield: Matching a Large Vulnerability Signature for High Performance Network Defense



NetShield: Matching a Large Vulnerability Signature Ruleset for High Performance Network Defense

Yan Chen
Department of Electrical Engineering and Computer Science
Northwestern University
Lab for Internet & Security Technology (LIST)
http://list.cs.northwestern.edu
Background
NIDS/NIPS (Network Intrusion Detection/Prevention System) operation
[Diagram: packets flow through the NIDS/NIPS, which consults a signature DB and emits security alerts]
• Accuracy
• Speed
• Attack Coverage
IDS/IPS Overview
State Of The Art
Regular expression (regex) based approaches
• Used by: Cisco IPS, Juniper IPS, open source Snort
• Example: .*Abc.*\x90+de[^\r\n]{30}
Pros
• Can efficiently match multiple sigs simultaneously, through DFA
• Can describe the syntactic context
Cons
• Limited expressive power
• Cannot describe the semantic context
• Inaccurate
Cannot combat Conficker!
State Of The Art
Vulnerability Signature [Wang et al. 04]
• Vulnerability: design flaws enable bad inputs to lead the program to a bad state
[Diagram: a bad input drives the program from a good state into the bad (vulnerability) state]
Blaster Worm (WINRPC) Example:
BIND:
rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00\x00
&& context[0].abstract_syntax.uuid==UUID_RemoteActivation
BIND-ACK:
rpc_vers==5 && rpc_vers_minor==1
CALL:
rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00\x00
&& stub.RemoteActivationBody.actual_length>=40
&& matchRE(stub.buffer, /^\x5c\x00\x5c\x00/)
Pros
• Directly describe semantic context
• Very expressive, can express the vulnerability condition exactly
• Accurate
Cons
• Slow!
• Existing approaches all use sequential matching
• Require protocol parsing
Motivation of NetShield
[Figure: speed vs. accuracy. State-of-the-art regex-sig IDSes sit at high speed but low accuracy, bounded by the theoretical accuracy limitation of regex; existing vulnerability-sig IDSes sit at high accuracy but low speed; NetShield targets high speed and high accuracy]
Motivation
• Desired Features for Signature-based NIDS/NIPS
– Accuracy (especially for IPS): regex cannot capture the vulnerability condition well!
– Speed
– Coverage: large ruleset

             Regular Expression | Vulnerability (Shield [sigcomm'04])
Accuracy     Relatively Poor    | Much Better
Speed        Good               | ??
Memory       OK                 | ??
Coverage     Good               | ??

(The ?? cells are the focus of this work.)
Vulnerability Signature Studies
• Use protocol semantics to express vulnerabilities
• Defined on a sequence of PDUs & one predicate for each PDU
– Example: ver==1 && method=="put" && len(buf)>300
• Data representations
– For all the vulnerability signatures we studied, we only need numbers and strings
– Number operators: ==, >, <, >=, <=
– String operators: ==, match_re(.,.), len(.)
Blaster Worm (WINRPC) Example:
BIND:
rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00\x00
&& context[0].abstract_syntax.uuid==UUID_RemoteActivation
BIND-ACK:
rpc_vers==5 && rpc_vers_minor==1
CALL:
rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00\x00
&& stub.RemoteActivationBody.actual_length>=40 && matchRE(stub.buffer, /^\x5c\x00\x5c\x00/)
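The predicate model above (numbers and strings plus a handful of operators) can be sketched as follows; the field names and the dict-based PDU layout here are illustrative, not NetShield's actual API.

```python
import re

# One PDU predicate from the slide: ver==1 && method=="put" && len(buf)>300.
# Fields arrive as a dict produced by the protocol parser (illustrative layout).
def match_pdu(pdu):
    return (pdu["ver"] == 1
            and pdu["method"] == "put"
            and len(pdu["buf"]) > 300)

# The string operators listed above: ==, match_re(.,.), len(.)
def match_re(s, pattern):
    return re.search(pattern, s) is not None

pdu = {"ver": 1, "method": "put", "buf": "A" * 400}
print(match_pdu(pdu))   # True: all three conditions hold
print(match_re("\x5c\x00\x5c\x00rest", r"^\x5c\x00\x5c\x00"))   # True
```

Evaluating such predicates one signature at a time is exactly the sequential matching the following slides set out to avoid.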
Research Challenges
• Matching thousands of vulnerability signatures simultaneously
– Regex rules can be merged into a single DFA, but vulnerability signature rules cannot be easily combined
– Sequential matching cannot match multiple sigs simultaneously
• Need high speed protocol parsing
Outline
• Motivation and NetShield Overview
• High Speed Matching for Large Rulesets
• High Speed Parsing
• Evaluation
• Research Contributions
NetShield Overview
Matching Problem Formulation
• Suppose we have n signatures, defined on k matching dimensions (matchers)
– A matcher is a two-tuple (field, operation), or a four-tuple for associative array elements
– Translate the n signatures to an n by k table
– This translation unlocks the potential of matching multiple signatures simultaneously

Rule 4: URI.Filename=="fp40reg.dll" && len(Headers["host"])>300

RuleID | Method == | Filename ==  | Header ==; LEN
1      | DELETE    | *            | *
2      | POST      | Header.php   | *
3      | *         | awstats.pl   | *
4      | *         | fp40reg.dll  | name=="host"; len(value)>300
5      | *         | *            | name=="User-Agent"; len(value)>544
Matching Problem Formulation
• Challenges for the Single PDU matching problem (SPM)
– Large number of signatures n
– Large number of matchers k
– Large number of "don't cares"
– Cannot reorder matchers arbitrarily: buffering constraint
– Field dependency
• Arrays, associative arrays
• Mutually exclusive fields
Matching Algorithms
Candidate Selection Algorithm
1. Pre-computation: decides the rule order and matcher order
2. Decomposition: match each matcher separately and iteratively combine the results efficiently
Per-matcher data structures:
• Integer range checking → balanced binary search tree
• String exact matching → Trie
• Regex → DFA (XFA)
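The per-matcher structures named above might look like the sketch below, with Python stand-ins: sorted thresholds plus `bisect` instead of a balanced binary search tree, and a hash map instead of a trie. Class and rule names are illustrative.

```python
import bisect

# Integer comparison matcher for predicates like len(value) > T.
# The slide suggests a balanced BST; sorted thresholds plus bisect
# give the same O(log n) lookup in this sketch.
class GreaterThanMatcher:
    def __init__(self, rules):               # rules: [(rule_id, threshold)]
        self.rules = sorted(rules, key=lambda r: r[1])
        self.thresholds = [t for _, t in self.rules]

    def match(self, value):
        # return all rules whose threshold is strictly below the value
        idx = bisect.bisect_left(self.thresholds, value)
        return {rid for rid, _ in self.rules[:idx]}

# String exact matching: the slide suggests a trie; a hash map is an
# equivalent stand-in for this sketch.
class ExactMatcher:
    def __init__(self, rules):                # rules: [(rule_id, literal)]
        self.table = {}
        for rid, lit in rules:
            self.table.setdefault(lit, set()).add(rid)

    def match(self, value):
        return self.table.get(value, set())

m = GreaterThanMatcher([(4, 300), (5, 544)])  # rules 4 and 5 from the table
print(m.match(450))   # {4}: only the len(value)>300 rule fires
f = ExactMatcher([(2, "Header.php"), (3, "awstats.pl"), (4, "fp40reg.dll")])
print(f.match("fp40reg.dll"))   # {4}
```

Each matcher thus reports a set of rule IDs; the next step is combining those sets across all k matchers.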
Step 1: Pre-Computation
• Optimize the matcher order based on the buffering constraint & field arrival order
• Rule reorder:
[Diagram: rules 1..n sorted so that rules requiring matcher 1 come first, then rules requiring matcher 2 but indifferent to matcher 1, then rules that don't care about matchers 1 & 2]
Step 2: Iterative Matching
PDU = {Method=POST, Filename=fp40reg.dll, Header: name="host", len(value)=450}

RuleID | Method == | Filename ==  | Header ==; LEN
1      | DELETE    | *            | *
2      | POST      | Header.php   | *
3      | *         | awstats.pl   | *
4      | *         | fp40reg.dll  | name=="host"; len(value)>300
5      | *         | *            | name=="User-Agent"; len(value)>544

S1 = {2}: candidates after matching column 1 (Method==)
S2 = (S1 ∩ A2) ∪ B2 = {} ∪ {4} = {4}
S3 = (S2 ∩ A3) ∪ B3 = {4}

Here A_{i+1} is the set of rules that don't care about matcher i+1, and B_{i+1} is the set of rules that require matcher i+1 and match the PDU.
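The S1 → S3 walk above can be reproduced with a small sketch; the rule table is hard-coded from the slide, and `dont_care(i)` / `matched(i, value)` stand in for the pre-computed A and B sets of each matcher.

```python
# Rules from the slide, indexed by matcher: method==, filename==, header.
# '*' means don't care; the header matcher pairs a name with a length bound.
rules = {
    1: ("DELETE", "*", "*"),
    2: ("POST", "Header.php", "*"),
    3: ("*", "awstats.pl", "*"),
    4: ("*", "fp40reg.dll", ("host", 300)),
    5: ("*", "*", ("User-Agent", 544)),
}
pdu = {"method": "POST", "filename": "fp40reg.dll",
       "header": ("host", 450)}

def dont_care(i):
    # A_{i+1}: rules indifferent to matcher i
    return {rid for rid, r in rules.items() if r[i] == "*"}

def matched(i, value):
    # B_{i+1}: rules that require matcher i and whose condition holds
    hits = set()
    for rid, r in rules.items():
        cond = r[i]
        if cond == "*":
            continue
        if i < 2:
            ok = (cond == value)
        else:                       # header matcher: name match + length bound
            name, min_len = cond
            ok = (value[0] == name and value[1] > min_len)
        if ok:
            hits.add(rid)
    return hits

# S_{i+1} = (S_i ∩ A_{i+1}) ∪ B_{i+1}
S = matched(0, pdu["method"])                 # S1 = {2}
for i, field in [(1, "filename"), (2, "header")]:
    S = (S & dont_care(i)) | matched(i, pdu[field])
print(S)   # {4}, matching the slide's S3
```

A real implementation would of course pre-compute the A sets and answer each B query via the per-matcher data structures, rather than scan the rule table.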
Complexity Analysis
• Merging complexity
– Need k-1 merging iterations
– For each iteration, merge complexity is O(n) in the worst case, since S_i can have O(n) candidates for worst-case rulesets
– For real-world rulesets, the number of candidates is a small constant, so each merge is O(1)
– For real-world rulesets the total is therefore O(k), which is the optimum we can get
• Measured: three HTTP traces give avg(|Si|) < 0.04; two WINRPC traces give avg(|Si|) < 1.5
Refinement and Extension
• SPM improvements
– Allow negative conditions
– Handle array cases
– Handle associative array cases
– Handle mutually exclusive cases
• Extend to Multiple PDU Matching (MPM)
– Allow checkpoints
Outline
• Motivation
• High Speed Matching for Large Rulesets
• High Speed Parsing
• Evaluation
• Research Contributions
Observations
• PDU → parse tree; leaf nodes are numbers, strings, or array elements
• Observation 1: Only need to parse the fields related to signatures (mostly leaf nodes)
• Observation 2: Traditional recursive descent parsers, which need one function call per node, are too expensive
Efficient Parsing with State Machines
• Studied eight protocols: HTTP, FTP, SMTP, eMule, BitTorrent, WINRPC, SNMP and DNS, as well as their vulnerability signatures
• Common relationships among leaf nodes: (a) Sequential, (b) Branch, (c) Loop, (d) Derive
• Pre-construct parsing state machines based on parse trees and vulnerability signatures
• Design UltraPAC, an automated fast parser generator
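A table-driven parser of this flavor might look like the sketch below: each state consumes a fixed number of bytes and either records a field some signature needs or skips it, with no per-node function calls. The state table loosely follows the WINRPC common header shown on the next slide, but the layout here is illustrative, not UltraPAC's generated code.

```python
import struct

# Each state: (field name, width in bytes, keep?). Keep only the
# signature-relevant fields and skip the rest (Observation 1).
STATES = [
    ("rpc_vers",       1, True),
    ("rpc_vers_minor", 1, True),
    ("ptype",          1, True),
    ("pfc_flags",      1, False),   # not referenced by any signature here
    ("packed_drep",    4, True),
    ("frag_length",    2, True),
]

def parse(buf):
    fields, off = {}, 0
    for name, width, keep in STATES:   # one flat loop, no recursion (Obs. 2)
        chunk = buf[off:off + width]
        if keep:
            fields[name] = chunk if width > 1 else chunk[0]
        off += width
    return fields

pdu = bytes([5, 1, 11, 3]) + b"\x10\x00\x00\x00" + struct.pack("<H", 72)
f = parse(pdu)
print(f["rpc_vers"], f["rpc_vers_minor"])        # 5 1
print(f["packed_drep"] == b"\x10\x00\x00\x00")   # True
```

A generated parser would additionally branch on `ptype` and loop over repeated elements, which is where the parsing variables of the next slide come in.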
Example for WINRPC
• Rectangles are states; parsing variables: R0 .. R4
• 0.61 instructions/byte for the BIND PDU
[State machine: the common header is parsed as fixed-width fields: rpc_vers (1 byte), rpc_vers_minor (1), ptype (1), pfc_flags (1), packed_drep (4), frag_length (2), plus merged skip states. The machine then branches on Bind vs. Bind-ACK. For Bind it reads ncontext (1 byte), skips padding, sets R2 ← 0 and R3 ← ncontext, and loops over the context entries (ID: 2 bytes, n_tran_syn: 1, padding: 1, UUID: 16, UUID_ver: 4, tran_syn: 20*R4 bytes), incrementing R2 each iteration while R2 ≤ R3]
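The counter-driven loop in the diagram (R2 ← 0, R3 ← ncontext, advance the counter once per context entry) can be sketched as below. The 20-byte entry layout (2B ID, 1B n_tran_syn, 1B padding, 16B UUID) is an assumption made for illustration, suggested by the edge labels in the figure.

```python
# Sketch of the diagram's loop state: R2 counts parsed context entries,
# R3 holds ncontext, and one fixed-size entry is consumed per iteration.
def parse_contexts(buf, ncontext):
    r2, off, contexts = 0, 0, []
    r3 = ncontext
    while r2 < r3:                    # diagram: R2++ until the counter is done
        entry = buf[off:off + 20]     # hypothetical 20-byte entry
        contexts.append({
            "id": int.from_bytes(entry[0:2], "little"),
            "uuid": entry[4:20].hex(),   # the UUID a signature would test
        })
        off += 20
        r2 += 1
    return contexts

buf = bytes(range(20)) + bytes(range(20, 40))
out = parse_contexts(buf, 2)
print(len(out))        # 2
print(out[0]["id"])    # 256  (bytes 0x00, 0x01, little-endian)
```

Keeping the loop position in explicit variables like R2/R3 is what lets the parser suspend and resume across packet boundaries without recursion.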
Outline
• Motivation
• High Speed Matching for Large Rulesets
• High Speed Parsing
• Evaluation
• Research Contributions
Evaluation Methodology
Fully implemented prototype
• 12,000 lines of C++ and 3,000 lines of Python
• Can run on both Linux and Windows
• Deployed at a university data center with up to 106Mbps
Traces and setup
• 26GB+ traces from Tsinghua Univ. (TH), Northwestern (NU) and DARPA
• Run on a P4 3.8GHz single core PC w/ 4GB memory
• After TCP reassembly; PDUs preloaded in memory
• For HTTP we have 794 vulnerability signatures which cover 973 Snort rules
• For WINRPC we have 45 vulnerability signatures which cover 3,519 Snort rules
Parsing Results

Trace                              | TH DNS | TH WINRPC | NU WINRPC | TH HTTP | NU HTTP | DARPA HTTP
Binpac throughput (Gbps)           | 0.31   | 1.41      | 1.11      | 2.10    | 14.2    | 1.69
Our parser throughput (Gbps)       | 3.43   | 16.2      | 12.9      | 7.46    | 44.4    | 6.67
Speedup ratio                      | 11.2   | 11.5      | 11.6      | 3.6     | 3.1     | 3.9
Max. memory per connection (bytes) | 15     | 15        | 15        | 14      | 14      | 14
Matching Results

Trace                              | TH WINRPC | NU WINRPC | TH HTTP | NU HTTP | DARPA HTTP
Sequential matching (Gbps)         | 10.68     | 9.23      | 0.34    | 0.28    | 1.85
CS matching (Gbps)                 | 14.37     | 10.61     | 2.63    | 2.37    | 17.63
Matching-only time speedup ratio   | 4         | 1.8       | 11.3    | 11.7    | 8.8
Avg # of candidates                | 1.16      | 1.48      | 0.033   | 0.038   | 0.0023
Max. memory per connection (bytes) | 27        | 27        | 20      | 20      | 20
Scalability and Accuracy Results
Rule scaling results
[Plot: throughput (Gbps, 0 to 4) vs. # of rules used (0 to 800); performance decreases gracefully]
Accuracy
• Created two polymorphic WINRPC exploits which bypass the original Snort rules but are detected accurately by our scheme
• For a 10-minute "clean" HTTP trace, Snort reported 42 alerts while NetShield reported 0 alerts; manual verification showed the 42 alerts are false positives
Research Contribution
Make vulnerability signatures a practical solution for NIDS/NIPS

          Regular Expression | Existing Vul. IDS | NetShield
Accuracy  Poor               | Good              | Good
Speed     Good               | Poor              | Good
Memory    Good               | ??                | Good
Coverage  Good               | ??                | Good

• Multiple sig. matching → candidate selection algorithm
• Parsing → parsing state machine
• Achieves high speed with much better accuracy
Build a better Snort alternative!
Q&A
Thanks!
Comparing With Regex
• Memory for 973 Snort rules: DFA 5.29GB (XFA, 863 rules: 1.08MB), NetShield 2.3MB
• Per-flow memory: XFA 36 bytes, NetShield 20 bytes
• Throughput: XFA 756Mbps, NetShield 1.9+Gbps
(*XFA [SIGCOMM08][Oakland08])
Measure Snort Rules
• Semi-manually classify the rules:
1. Group by CVE-ID
2. Manually look at each vulnerability
• Results
– 86.7% of rules can be improved by protocol semantic vulnerability signatures
– Most of the remaining rules (9.9%) are related to web DHTML and scripts, which are not suitable for a signature based approach
– On average, 4.5 Snort rules are reduced to one vulnerability signature
– For binary protocols the reduction ratio is much higher than for text based ones
• For netbios.rules the ratio is 67.6
Matcher order
S_{i+1} = (S_i ∩ A_{i+1}) ∪ B_{i+1}
• The intersection with A_{i+1} reduces S_{i+1}; the union with B_{i+1} enlarges S_{i+1}
• Merging overhead is |S_i| membership tests in A_{i+1} (O(1) each with a hash table)
• |A_{i+1} ∪ B_{i+1}| is fixed, so putting the matcher later reduces B_{i+1}
Matcher order optimization
• Worth buffering only if estmaxB(Mj) <= MaxB
• For Mi in AllMatchers:
– Try to clear all the Mj in the buffer with estmaxB(Mj) <= MaxB
– Buffer Mi if estmaxB(Mi) > MaxB
– When len(Buf) > Buflen, remove the Mj with minimum estmaxB(Mj)
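Read as pseudocode, the heuristic above might be implemented like the sketch below; `est_max_b` stands for the slide's estmaxB estimate, and `MAX_B`/`BUF_LEN` are placeholder values for the slide's MaxB and Buflen thresholds. The interpretation that an evicted matcher is emitted immediately is an assumption.

```python
MAX_B = 10     # placeholder for the slide's MaxB threshold
BUF_LEN = 4    # placeholder for the slide's Buflen capacity

def order_matchers(all_matchers, est_max_b):
    """Greedy sketch of the slide's heuristic: emit a matcher once its
    estmaxB is small enough, otherwise buffer it; when the buffer
    overflows, force out the matcher with minimum estmaxB."""
    order, buf = [], []
    for m in all_matchers:
        # try to clear buffered matchers whose estmaxB is acceptable
        for b in [b for b in buf if est_max_b(b) <= MAX_B]:
            buf.remove(b)
            order.append(b)
        if est_max_b(m) > MAX_B:
            buf.append(m)
            if len(buf) > BUF_LEN:
                victim = min(buf, key=est_max_b)
                buf.remove(victim)
                order.append(victim)   # forced out of the buffer
        else:
            order.append(m)
    order.extend(buf)                  # emit whatever is still buffered
    return order

cost = {"m1": 3, "m2": 50, "m3": 2}.get   # illustrative estmaxB values
print(order_matchers(["m1", "m2", "m3"], cost))   # ['m1', 'm3', 'm2']
```

The net effect is that expensive-to-merge matchers (large estimated B sets) are pushed as late in the order as the buffer allows.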
Backup Slides
Experiences
• Work in progress
– In collaboration with MSR, apply semantic-rich analysis to cloud Web service profiling, to understand why services are slow and how to improve them
• Interdisciplinary research
• Student mentoring (three undergraduates, six junior graduate students)
Future Work
• Near term
– Web security (browser security, web server security)
– Data center security
– High speed network intrusion prevention system with hardware support
• Long term research interests
– Combating professional profit-driven attackers will be a continuous arms race
– Online applications (including Web 2.0 applications) become more complex and vulnerable
– Network speed keeps increasing, which demands highly scalable approaches
Research Contributions
• Demonstrate that vulnerability signatures can be applied to NIDS/NIPS, which can significantly improve the accuracy of current NIDS/NIPS
• Propose the candidate selection algorithm for matching a large number of vulnerability signatures efficiently
• Propose the parsing state machine for fast protocol parsing
• Implement the NetShield prototype
Motivation
• Network security has been recognized as the single most important attribute of corporate networks, according to an AT&T survey of 395 senior executives
• Many new emerging threats make the situation even worse
Candidate merge operation
[Diagram: S_i is split into S_i ∩ A_{i+1}, the candidates that don't care about matcher i+1 and are kept directly, and the candidates that require matcher i+1, which are kept only if they appear in B_{i+1}]
A Vulnerability Signature Example
• Data representations
– For all the vulnerability signatures we studied, we only need numbers and strings
– Number operators: ==, >, <, >=, <=
– String operators: ==, match_re(.,.), len(.)
• Example signature for the Blaster worm:
BIND:
rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00\x00
&& context[0].abstract_syntax.uuid==UUID_RemoteActivation
BIND-ACK:
rpc_vers==5 && rpc_vers_minor==1
CALL:
rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00\x00
&& stub.RemoteActivationBody.actual_length>=40 && matchRE(stub.buffer, /^\x5c\x00\x5c\x00/)
System Framework
[Diagram: streaming packet data enters on the data path and feeds the critical-path modules: signature matching engines (content-based signature matching and protocol semantic signature matching) and reversible k-ary sketch monitoring. On the control path (non-critical modules), local sketch records are sent out for aggregation and combined with remote aggregated sketch records in sketch based statistical anomaly detection (SSAD); traffic to unused IP blocks feeds honeynets/honeyfarms; Token Based Signature Generation (TOSG) and Length Based Signature Generation (LESG) supply the matching engines. Part I: sketch-based monitoring & detection (scalability); Part II: polymorphic worm signature generation (accuracy & fast adaptation); Part III: signature matching (accuracy, scalability & coverage); Part IV: Network Situational Awareness]
Example of Vulnerability Signatures
• At least 75% of vulnerabilities are due to buffer overflow
• Sample vulnerability signature: the length of the field corresponding to the vulnerable buffer > a certain threshold
• Intrinsic to the buffer overflow vulnerability and hard to evade
[Diagram: a protocol message field overflowing the vulnerable buffer]
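A length-based signature of this kind reduces to a single comparison against the vulnerable buffer's safe bound; the bound below is illustrative, echoing the actual_length>=40 style of check in the CALL signature.

```python
VULN_BUF_BOUND = 40   # illustrative bound, in the style of actual_length>=40

def overflow_suspect(field_bytes, bound=VULN_BUF_BOUND):
    # Intrinsic to the overflow: any input longer than the vulnerable
    # buffer can overflow it, regardless of byte content (hard to evade).
    return len(field_bytes) >= bound

print(overflow_suspect(b"A" * 64))   # True
print(overflow_suspect(b"ok"))       # False
```

Because only the length matters, polymorphic payload encodings do not help an attacker evade this check.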