Security services and the IXP
Wu-chang Feng
[email protected]
Systems Software Laboratory
Dept. of Computer Science and Engineering
About the project..
• 6 months old
– Just started, pardon the vapor
• Supported by Intel (12/2001) and ETIC (4/2002)
– Graduate Students
• Francis Chang: [email protected]
• Deepa Srinivasan: [email protected]
• Jin Choi (1/2003): [email protected]
– Undergraduate Interns from Charles Consel’s group
• Ludovic Martorel
• Damien Berger
Talk outline
• IXP and network security research
• Packet classification
• Packet classification caching strategies
• Curriculum
The IXP and network security research
A research opportunity
• IXP
– Provides an open high-speed networking platform
– Research enabler
• Analyzing packet classification/routing algorithms
• Analyzing packet classification/routing lookup caching algorithms
• Security functions
– Sandbox to test and compare algorithms on a real platform
IXP and research
• Quickly becoming the ns (network simulator) of experimental networking systems
– Open hardware
– Open software
• What’s needed?
• A library of reference implementations and benchmarks
– IP route lookup (longest-prefix match) algorithms
– General packet classification algorithms
– Route and classification lookup caching algorithms
– Security functions
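As an illustration of the first library item, here is a minimal sketch of longest-prefix match using a one-bit-at-a-time binary trie. It is not taken from the project; the class name `PrefixTrie` and its methods are hypothetical, and a production lookup would use a compressed or multi-bit structure instead.

```python
import ipaddress


class PrefixTrie:
    """Binary trie over IPv4 prefix bits (illustrative, one node per bit)."""

    def __init__(self):
        # Each node is a dict: keys 0 and 1 are children, "hop" is the next hop.
        self.root = {}

    def insert(self, prefix, next_hop):
        net = ipaddress.ip_network(prefix)
        bits = int(net.network_address)
        node = self.root
        for i in range(net.prefixlen):
            bit = (bits >> (31 - i)) & 1
            node = node.setdefault(bit, {})
        node["hop"] = next_hop

    def lookup(self, address):
        bits = int(ipaddress.ip_address(address))
        node = self.root
        best = node.get("hop")  # default route, if one was inserted
        for i in range(32):
            node = node.get((bits >> (31 - i)) & 1)
            if node is None:
                break
            if "hop" in node:
                best = node["hop"]  # remember the longest match so far
        return best


table = PrefixTrie()
table.insert("0.0.0.0/0", "default")
table.insert("10.0.0.0/8", "A")
table.insert("10.1.0.0/16", "B")
print(table.lookup("10.1.2.3"))
```

The lookup walks at most 32 nodes, which is why such a trie costs O(W) memory accesses for a W-bit address.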
Our focus: Security
• Borrow and use liberally…
– Princeton (VERA)
– Columbia (NetBind)
– Georgia Tech (IDS)
– Utah (Emulab)
– Others..
• Build what’s missing
– Range of full packet classifiers
– Range of lookup caching algorithms
– Merging the goals of research and education
• A security-focused IXP laboratory course
• Eventually, examine additional security services
– Anomaly detection
– Content filtering
– etc.
Packet classification
Student: Deepa Srinivasan
Packet classification
• Use the IXP and open-source tools to
– Compare full packet classification algorithms
– Benchmark algorithms via real rule sets and real traffic traces
– Explore adaptive packet classifiers
A hard, but well-studied problem
• What are the key issues?
– Storage
– Search time
– Update time
• General filter matching problem ~ problems in computational geometry
– N = number of filters or rules, d = number of dimensions
– Requires
• O(log N) time with O(N^d) space
OR
• O((log N)^(d-1)) time with O(N) space
• Classic space-time tradeoff problem
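To make the tradeoff concrete, the sketch below plugs numbers into the two bounds, reading them as O(log N) time with O(N^d) space and O((log N)^(d-1)) time with O(N) space. The rule count and dimension count are illustrative assumptions, constant factors are ignored, and the helper name `bounds` is hypothetical.

```python
import math


def bounds(n_rules, dims):
    """Return (fast, compact) asymptotic costs, ignoring constant factors."""
    log_n = math.log2(n_rules)
    fast = {"time": log_n, "space": n_rules ** dims}
    compact = {"time": log_n ** (dims - 1), "space": n_rules}
    return fast, compact


# Assumed example: 1024 rules over a five-dimensional (5-tuple) classifier.
fast, compact = bounds(1024, 5)
print(fast)
print(compact)
```

With these inputs the fast option needs on the order of 2^50 units of storage, while the compact option needs about 10,000 steps per lookup, so neither extreme is practical at line rate.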
A space-time tradeoff example
• Hierarchical tries: slow and compact
• Set-pruning tries: fast and large
[Figure: Hierarchical trie (figure should terminate at R2)]
[Figure: Set-pruning trie]
A space-time tradeoff example
• Hierarchical tries vs. Set-pruning tries (worst-case)
| Algorithm | Time | Storage | Updates | Notes |
|---|---|---|---|---|
| Linear search | N | N | 1 | Simple, poor scaling, iptables |
| Hierarchical trie | W^d | NdW | d^2 W | Backtracking search |
| Set-pruning trie | dW | N^d | N^d | Fast retrieval at the cost of storage. Good for relatively static classifiers. |

N = number of rules, W = width of dimension, d = number of dimensions
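The linear-search baseline in the table can be sketched in a few lines: rules are checked in order and the first match wins, so lookup time grows with N. The `Rule` fields and the `classify` function are hypothetical names chosen for illustration, and only three of the usual five dimensions are shown.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Rule:
    src: str        # source prefix, e.g. "10.0.0.0/8"
    dst: str        # destination prefix
    dports: range   # destination port range
    action: str


def classify(rules, src, dst, dport, default="DROP"):
    """Return the action of the first matching rule (O(N) comparisons)."""
    s = ipaddress.ip_address(src)
    d = ipaddress.ip_address(dst)
    for rule in rules:
        if (s in ipaddress.ip_network(rule.src)
                and d in ipaddress.ip_network(rule.dst)
                and dport in rule.dports):
            return rule.action
    return default


rules = [
    Rule("10.0.0.0/8", "192.168.1.0/24", range(80, 81), "ACCEPT"),
    Rule("0.0.0.0/0", "192.168.1.0/24", range(0, 1024), "DROP"),
    Rule("0.0.0.0/0", "0.0.0.0/0", range(0, 65536), "ACCEPT"),
]
print(classify(rules, "10.1.1.1", "192.168.1.5", 80))
```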
Packet classification
• Approaches
– Generic classifiers
• Optimized for best worst-case performance
– Heuristic classifiers
• Take advantage of structure in rule sets (as done with IP router lookups)
• Trade worst-case speed, storage, and update time for speed and storage in the common case
– Hardware classifiers
• Throw hardware and parallel processing at the problem
• Serves as a wish-list for the IXP
– Is a hardware-based packet classification engine in the works?
– Can I go home?
– Will I need to shoot myself when the IXP4xxx comes out?
So many algorithms, so little time…
• Which one to choose?
– Hierarchical tries with backtracking search
– Set-pruning tries
– Bit vector, Fractional cascading [Lakshman98]
– Aggregated bit vector [Baboescu00]
– Grid of tries, Cross-producting [Srinivasan98]
– Area-based quadtrees [Buddhikot99]
– Fat inverted segment tree [Feldman00]
– Tuple-space search [Srinivasan99]
– Recursive flow classification [Gupta99]
– Hierarchical intelligent cuttings [Gupta00]
• Performance and cost a function of
– d = number of dimensions
– W = width of dimensions
– N = number of rules
– l = number of levels in tree (FIS-tree only)
Summary of schemes [Gupta00]
| Algorithm | Time | Storage | Updates | Notes |
|---|---|---|---|---|
| Linear search | N | N | 1 | Simple, poor scaling |
| Hierarchical trie | W^d | NdW | d^2 W | |
| Set-pruning trie, Cross-producting | dW | N^d | N^d | Fast retrieval at the cost of storage. Good for relatively static classifiers. |
| Grid-of-tries | W^(d-1) | NdW | NdW | Rebuild for each update; could be used for the last 2 dimensions of a multi-dimensional hierarchical trie. |
| AQT | aW | NW | a * N^(1/a) | a is a tunable integer parameter |
| FIS-tree | (l + 1) W | l x N^(1 + 1/l) | -- | Tree must be recomputed on update |
| RFC | d | N^d | --- | Not suitable for large sets of rules (> 6000); preprocessing and large storage space. 10Gbps line rates in hardware and 2.5Gbps rates in software. |
| Hierarchical Intelligent Cuttings | d | N^d | --- | Parameters can be tuned to trade off query time against storage requirements. |
| Tuple-space search | M | N | 1 | Performs well for multiple dimensions if the number of tuples (i.e. hash entries) is small. Only supports prefixes; generic rules increase storage complexity. |
| Ternary CAM | 1 | N | 1* | Simple; good for small classifiers; costly |
| Bit vector | dW + N/memwidth | dN^2 | --- | Incremental updates not supported; good for multiple dimensions and a small number of rules |

N = # of rules, W = width of dimensions, d = # of dimensions, l = levels of tree, M = # of tuples
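The tuple-space row can be illustrated with a two-dimensional sketch: rules are grouped by their pair of prefix lengths, each group is a hash table, and a lookup probes one table per tuple (hence the M in the time column). The class `TupleSpace`, the helper `top_bits`, and the lower-number-wins priority convention are assumptions made for this example.

```python
import ipaddress


def top_bits(addr, length):
    """Keep the top `length` bits of a 32-bit address."""
    return addr >> (32 - length) if length else 0


class TupleSpace:
    def __init__(self):
        # (src_len, dst_len) -> {(src_bits, dst_bits): (priority, action)}
        self.tables = {}

    def add(self, src, dst, priority, action):
        s, d = ipaddress.ip_network(src), ipaddress.ip_network(dst)
        key = (top_bits(int(s.network_address), s.prefixlen),
               top_bits(int(d.network_address), d.prefixlen))
        self.tables.setdefault((s.prefixlen, d.prefixlen), {})[key] = (priority, action)

    def classify(self, src, dst):
        s = int(ipaddress.ip_address(src))
        d = int(ipaddress.ip_address(dst))
        best = None
        for (s_len, d_len), table in self.tables.items():  # one probe per tuple
            hit = table.get((top_bits(s, s_len), top_bits(d, d_len)))
            if hit is not None and (best is None or hit[0] < best[0]):
                best = hit  # lower number means higher priority (assumed)
        return best[1] if best else None


ts = TupleSpace()
ts.add("10.0.0.0/8", "192.168.0.0/16", 2, "DROP")
ts.add("10.1.0.0/16", "192.168.1.0/24", 1, "ACCEPT")
ts.add("0.0.0.0/0", "0.0.0.0/0", 9, "LOG")
print(ts.classify("10.1.2.3", "192.168.1.9"))
```

Three rules with three distinct length pairs produce three tables here; real rule sets tend to reuse a small number of length combinations, which is what keeps M small.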
Is there a winner?
• Not really, it depends on….
– Rule sets
– Incoming traffic characteristics
– Metric desired (average vs. worst-case lookup time)
– Hardware cost (memory, ternary CAM)
• How much chip area did that 16-entry CAM on the IXP2xxx take?
Adaptive packet classifiers
• Hypothesis
– Value in adaptation
– Reconfigure for high speed based on the amount of memory and the rule set, given a fixed hardware configuration and performance metric
• Approach
– Implement a small set of classifiers
– Build modules that translate ipchains/iptables/netfilter rule sets into data structures of individual classifiers
– Study adaptation policies for classifiers based on rule analysis
– Implement seamless switching between implementations (i.e. double buffering [Partridge98])
– Performance evaluation using
• Library of publicly available rule sets
• Public traffic trace
• An Emulab with loadable IXPs 
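The double-buffering idea for seamless switching can be sketched as follows: a standby classifier is built off to the side while lookups keep using the active one, and the switch is a single reference assignment. `SwitchableClassifier` and the two toy builders are hypothetical; on the IXP the equivalent would presumably be a pointer to a table in shared memory rather than a Python object.

```python
import threading


class SwitchableClassifier:
    """Holds an active classifier and swaps in a rebuilt one in one step."""

    def __init__(self, classifier):
        self._active = classifier
        self._rebuild_lock = threading.Lock()  # serializes rebuilds, not lookups

    def classify(self, packet):
        return self._active(packet)  # reads the active reference once

    def rebuild(self, build, rules):
        with self._rebuild_lock:
            standby = build(rules)   # construct the new structure off-line
            self._active = standby   # publish it with one assignment


def build_linear(rules):
    # Toy builder: scan (port, action) pairs in order.
    return lambda port: next((a for p, a in rules if p == port), "DROP")


def build_hashed(rules):
    # Toy builder: same semantics for unique ports, backed by a dict.
    table = dict(rules)
    return lambda port: table.get(port, "DROP")


rules = [(80, "ACCEPT"), (22, "ACCEPT")]
engine = SwitchableClassifier(build_linear(rules))
before = engine.classify(80)
engine.rebuild(build_hashed, rules + [(443, "ACCEPT")])
after = engine.classify(443)
print(before, after)
```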
Classification lookup caching
Student: Francis Chang
Caching and IP route lookups
• IP destination-based routing
– A one-dimensional packet classifier
• Caching instrumental in building gigabit IP routers
– Full lookup extremely expensive to support at high rates
– Cache of 12,000 entries gives 95% hit rate [Jain86, Feldmeier88, Heimlich90, Jain90, Newman97, Partridge98]
– “A 50 Gb/s IP Router” [Partridge98]
• Switched interconnection fabric
• Alpha 21164-based forwarding cards (separate from line cards)
• First-level on-chip caches Icache=8kB (2048 instructions), Dcache=8kB
• Secondary on-chip cache=96kB
– Fits 12000 entry route cache in memory
– About 8 bytes (64 bits) per entry: 12,000 entries × 8 bytes ≈ 94 kB, within the 96 kB cache
• Tertiary cache=16MB (full, double-buffered route table)
Caching and multi-dimension lookups
• Flow-based firewalls
– A five-dimensional packet classifier
• Caching even more important
– Full classification algorithms will not run anywhere near line speed on the current incarnation of the IXP
– Inherently harder to do
– Much lower hit rates [Xu00]
– Rule and traffic dependent
Current approaches
• Direct-mapped hashing with LRU replacement
– Typical for IP route caches [Partridge98]
• Parallel hashing and searching with set-associative hardware [Xu00]
– ASIC solution with parallel processing and a fixed LRU replacement scheme
• Proprietary vendor solutions
– ?
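The first approach above can be sketched as a direct-mapped flow cache: the flow's 5-tuple is hashed to a single slot, and with only one entry per slot a miss simply overwrites the occupant after running the full classifier. `DirectMappedCache` and the `slow_path` callback are hypothetical names, and Python's built-in `hash` stands in for whatever hash function the hardware would use.

```python
class DirectMappedCache:
    def __init__(self, slots, slow_path):
        self.slots = [None] * slots   # each slot holds (flow, result) or None
        self.slow_path = slow_path    # full classifier, called on a miss
        self.hits = 0
        self.misses = 0

    def lookup(self, flow):
        index = hash(flow) % len(self.slots)
        entry = self.slots[index]
        if entry is not None and entry[0] == flow:
            self.hits += 1
            return entry[1]
        self.misses += 1
        result = self.slow_path(flow)
        self.slots[index] = (flow, result)  # overwrite whatever was there
        return result


def full_classifier(flow):
    # Stand-in for an expensive five-dimensional lookup.
    proto, src, sport, dst, dport = flow
    return "ACCEPT" if dport == 80 else "DROP"


cache = DirectMappedCache(slots=1024, slow_path=full_classifier)
flow = ("tcp", "10.0.0.1", 40000, "192.168.1.5", 80)
first = cache.lookup(flow)
second = cache.lookup(flow)
print(first, second, cache.hits, cache.misses)
```

Two flows that hash to the same slot evict each other on every packet, which is the weakness that set-associative designs address.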
Class-based caching
• Structure of application traffic can provide useful information
• W. Feng, F. Chang, W. Feng, J. Walpole, “Provisioning On-line Games: A Traffic Analysis of a Busy Counter-Strike Server”
– Packet load of an on-line game server over 10ms intervals
Observations
• Game traffic
– Large number of periodic packets
– Extremely small packet sizes
– Persistent flows
– Small number of clients per server
– Without caching, a packet classification disaster
– With caching, a poster-child for LFU replacement?
• Web traffic
– Bursty, heavy-tailed packet arrival
– Many more clients per server
– Small number of packets per flow
Goal of study
• Attack the packet classification caching problem
• Resource requirements and data structures for high performance packet classification caches
• “Segregate, Hash, and Cache”
– Understand traffic characteristics
– Examine hierarchical class-based partitioning of cache
– Examine class-based partitioning of classification function (i.e. MEv2)
– Examine alternative replacement algorithms per class such as LFU
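A minimal sketch of the "Segregate, Hash, and Cache" idea, assuming a cache split into per-class partitions with a different replacement policy in each: LFU for persistent game flows and LRU for short web flows. The segregation rule (UDP means game, everything else means web) and all class names are illustrative assumptions, not the study's design.

```python
from collections import Counter, OrderedDict


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)         # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used


class LFUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}
        self.uses = Counter()

    def get(self, key):
        if key not in self.entries:
            return None
        self.uses[key] += 1
        return self.entries[key]

    def put(self, key, value):
        if key not in self.entries and len(self.entries) >= self.capacity:
            victim = min(self.entries, key=lambda k: self.uses[k])
            del self.entries[victim]          # evict least frequently used
            del self.uses[victim]
        self.entries[key] = value
        self.uses[key] += 1


class ClassBasedCache:
    def __init__(self, partitions, traffic_class):
        self.partitions = partitions          # class name -> cache
        self.traffic_class = traffic_class    # flow -> class name

    def get(self, flow):
        return self.partitions[self.traffic_class(flow)].get(flow)

    def put(self, flow, value):
        self.partitions[self.traffic_class(flow)].put(flow, value)


def by_protocol(flow):
    # Assumed segregation rule: UDP flows are game traffic, the rest is web.
    return "game" if flow[0] == "udp" else "web"


cache = ClassBasedCache({"game": LFUCache(2), "web": LRUCache(2)}, by_protocol)
cache.put(("udp", "player1"), "ACCEPT")
cache.put(("udp", "player2"), "ACCEPT")
cache.get(("udp", "player1"))
cache.put(("udp", "player3"), "ACCEPT")   # evicts player2, the least used
print(cache.get(("udp", "player1")), cache.get(("udp", "player2")))
```

Partitioning keeps a burst of short web flows from evicting the long-lived game flows, because each class only competes for its own slots.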
Curriculum
Student: Jin Choi
An IXP course for OGI/OHSU
• Goal
– Spread the IXP gospel
– Provide students with experience on a modern networking platform
• Train (and test drive) potential Ph.D. students
• Train future Intel employees
– 171 OGI/OHSU alums @ Intel
– Intel is the single largest employer of OGI/OHSU graduates
Approach
• Ask for help
– Dirk & Raj (PCs, IXP boards, and support)
– Ken Mackenzie (course material and advice)
• Keep it simple
• Align with security research project
• Ask for feedback
– Curriculum completed
– Guide and slide presentation available at http://www.cse.ogi.edu/~wuchang/ixp/
– Course will be offered as CSE58?: Networking Practicum
– Scheduled for Spring 2003
The course itself
• Errata
– Weekly 3-hour sessions
– Dedicated laboratory of 10 IXP workstations
• Cloned via Norton Ghost
• Week #1
– Conceptual framework
– IXP architecture
• Hardware: StrongARM, memory resources, micro-engines
• Software: ACEs, microACEs
• Week #2
– Introduce Linux/Windows2000/VMware, and the IXP platform
– Remedial Linux network administration material
• ifconfig, route, netstat, ipchains, ping, traceroute, arp etc.
– Learn the IXP environment setup/configuration
• Building core components on Linux using standard GNU toolchain
• Building microcode using microengine toolchain on Windows2000
The course itself (cont.)
• Week #3
– Build and run the L3 forwarder application
• Test with external sources and sinks
• Week #4
– Add a packet counter to the L3 forwarder
• Makes sure that everyone with a CS degree from OGI/OHSU has programmed in assembly code at some point.
• Week #5
– In-line port filter
• Add microcode to block TCP segments based on destination port
– Code review of L3 forwarder to design full port filter
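The Week #5 exercise is written in microcode, but the decision it makes can be sketched in Python: read the IPv4 header length, confirm the protocol is TCP, and compare the destination port against a blocked set. The function name `should_block` is hypothetical, and treating non-first fragments as not blockable is a simplifying assumption of this sketch.

```python
import struct

IPPROTO_TCP = 6


def should_block(ip_packet, blocked_ports):
    """Return True if the packet is a TCP segment to a blocked destination port."""
    if len(ip_packet) < 20 or ip_packet[0] >> 4 != 4:
        return False                              # not a complete IPv4 header
    header_len = (ip_packet[0] & 0x0F) * 4        # IHL is in 32-bit words
    (flags_frag,) = struct.unpack_from("!H", ip_packet, 6)
    if flags_frag & 0x1FFF:
        return False                              # later fragment: no TCP header
    if ip_packet[9] != IPPROTO_TCP or len(ip_packet) < header_len + 4:
        return False
    _sport, dport = struct.unpack_from("!HH", ip_packet, header_len)
    return dport in blocked_ports


def make_packet(protocol, dport):
    # Minimal 20-byte IPv4 header (checksum left as zero) plus 20 payload bytes.
    header = bytes([0x45, 0, 0, 40, 0, 0, 0, 0, 64, protocol, 0, 0,
                    10, 0, 0, 1, 10, 0, 0, 2])
    return header + struct.pack("!HH", 12345, dport) + bytes(16)


print(should_block(make_packet(IPPROTO_TCP, 80), {80}))
```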
The course itself (cont.)
• Week #6
– Full port filtering functionality
• Pass port numbers to be blocked as arguments
• SRAM management (allocation and initialization of multistride trie in the core component, access to data structure from the microengine)
• Add logic in core component to handle port filtering of exceptional packets
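A multistride trie for 16-bit port numbers can be sketched with two 8-bit strides: a 256-entry root indexed by the high byte, pointing to 256-entry leaves indexed by the low byte. The build step corresponds to what the core component would allocate and initialize, and the lookup corresponds to the two reads the microengine would perform. The stride widths and function names are assumptions; the slides do not specify the course's actual layout.

```python
STRIDE = 8
FANOUT = 1 << STRIDE   # 256 entries per level


def build_port_trie(blocked_ports):
    """Build step: allocate leaves only for high bytes that are in use."""
    root = [None] * FANOUT
    for port in blocked_ports:
        high, low = port >> STRIDE, port & (FANOUT - 1)
        if root[high] is None:
            root[high] = bytearray(FANOUT)
        root[high][low] = 1
    return root


def is_blocked(root, port):
    """Lookup step: at most two table reads per packet."""
    leaf = root[port >> STRIDE]
    return leaf is not None and leaf[port & (FANOUT - 1)] == 1


trie = build_port_trie([80, 443, 6000])
print(is_blocked(trie, 443), is_blocked(trie, 8080))
```

Compared with a flat 65,536-entry table, the two-level layout only spends memory on leaves that contain at least one blocked port.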
The course itself (cont.)
• Week #7-#10
– Propose and implement functions of their own for a final project
• Packet classifiers
• Classification lookup caching
Questions
Future work
• Support for high-speed intrusion and anomaly detection (E-boxes and A-boxes)
– Content-based filters
• Basic network-level filters (Snort)
• Application-specific filters (Bro)
– Usage-based filters
• Accounting
• Logging
What makes sense on an IXP?
• Function-based decomposition used in security
– Common Intrusion Detection Framework (CIDF) [Porras01]
• Event generators (E-boxes)
– produce entries based on filtered activities
• Event databases (D-boxes)
– store events in a persistent manner
• Event analyzers (A-boxes)
– synthesize higher-level activity based on individual range of events
• Response units (R-boxes)
– perform actions based on events
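The four CIDF roles can be sketched as a small pipeline. This is only an illustration of the decomposition: the function and class names are hypothetical, the in-memory list stands in for persistent storage, and the port-probe threshold is an arbitrary example of an analysis rule.

```python
from collections import Counter


def event_generator(packets, watched_ports):
    """E-box: emit an event for each packet that hits a watched port."""
    for src, dport in packets:
        if dport in watched_ports:
            yield {"src": src, "dport": dport}


class EventDatabase:
    """D-box: store events (a list stands in for persistent storage)."""

    def __init__(self):
        self.events = []

    def store(self, event):
        self.events.append(event)


def analyze(database, threshold):
    """A-box: flag sources that generated at least `threshold` events."""
    counts = Counter(event["src"] for event in database.events)
    return [src for src, n in counts.items() if n >= threshold]


def respond(suspects, blocklist):
    """R-box: act on the analysis by adding suspects to a blocklist."""
    blocklist.update(suspects)
    return blocklist


packets = [("10.0.0.9", 23), ("10.0.0.9", 22), ("10.0.0.7", 80), ("10.0.0.9", 23)]
db = EventDatabase()
for event in event_generator(packets, watched_ports={22, 23}):
    db.store(event)
blocked = respond(analyze(db, threshold=3), set())
print(blocked)
```

Of the four roles, the E-box sits on the per-packet path and the A-box consumes its events, which matches the future-work focus on E-boxes and A-boxes.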