NetViewer: A Network Traffic
Visualization and Analysis Tool
Seong Soo Kim
A. L. Narasimha Reddy
Electrical and Computer Engineering
Texas A&M University
USENIX LISA’05
Contents
• Introduction and Motivation
• Our Approach
• NetViewer’s Architecture
• NetViewer’s Functionality
• Evaluation of NetViewer
• Conclusion
Attack/ Anomaly
- Single attacker (DoS)
- Multiple Attackers (DDoS)
- Multiple Victims (Worms, viruses)
Aggregate Packet header data as signals
Image based anomaly/attack detectors
Motivation (1)
• Previous studies looked at the behavior of individual flows
These become ineffective with DDoS => aggregate analysis
• Link speeds are increasing
- currently at Gb/s, soon to be at 10~100 Gb/s
Need simple, effective mechanisms
• Packet inspection can’t be expensive
• Can we make the mechanisms simple enough to implement at line speeds?
Motivation (2)
• Signature (rule)-based approaches are
tailored to known attacks
- Become ineffective when traffic patterns
or attacks change
New threats are constantly emerging
Quick identification of network anomalies
is necessary to contain threats
• Can we design general mechanisms for
attack detection that work in real-time?
Our Approach (1)
• Look at aggregate information of traffic
- Collect data over a large duration (order of
seconds)
- Can be higher if necessary
• Use sampling to reduce the cost of
processing
• Process aggregate data to detect anomalies
- Individual flows may look normal => look at
the aggregate picture
Our Approach (2) - Environment
[Figure: network environment. Attackers and victims are connected across the Internet through ingress, core and egress routers; two points in the diagram are labeled A.D.]
NetViewer’s Architecture
• Packet Parser : Collects and filters raw packets and traffic data from packet header traces or NetFlow records.
• Signal Computing Engine : Analyzes the statistical properties of aggregate traffic distributions.
• Detection Engine : Sets thresholds through statistical measures of the traffic signal.
• Visualization Engine : Applies image processing and displays traffic signals and images.
• Alerting Engine : Detects and identifies attacks and anomalies in real-time.
[Figure: The block diagram of NetViewer. Network Traffic => Packet Parser => Statistical Analysis & Anomaly Detection => Visualization & Alerting => Detection Report]
Packet Parser (1)
• Packet headers carry a rich set of information
- Data : Packet counts, byte counts, the number of flows
- Domain : Source/destination address, source/destination Port
numbers, protocol numbers
• Processing traffic headers poses challenges.
- Discrete spaces
- Large domains
- 2^32 IPv4 addresses
- 2^16 port numbers
Need mechanisms to reduce the domain size
Need mechanisms to generate useful signals
Packet Parser (2) –
Data structure for reducing domain size
• 2 dimensional arrays count[i][j]
- To record the packet count for the address j in the ith field of the IP address
• Normalized packet counts (n is the sampling instant)
$p_{ijn} = \frac{\mathrm{count}[i][j][n]}{\sum_{j=0}^{255} \mathrm{count}[i][j][n]}, \quad i = 0,1,2,3, \quad j = 0,\ldots,255$
• Effects
- Constant, small memory regardless of the number of packets, 2^32 (4G) => 4*256 (1K)
- Running time O(n) to O(lg n)
- Somewhat reversible hash function
Packet Parser (3) –
Data structure for reducing domain size
• Simple example
[Figure: four count arrays, one per IP address byte, each with an axis from 0 to 255; after Flow1 is recorded, one entry in each array holds 3.]
• IP of Flow1 = 165.91.212.255, Packet1 = 3
• IP of Flow2 = 64.58.179.230, Packet2 = 2
• IP of Flow3 = 216.239.51.100, Packet3 = 1
• IP of Flow4 = 211.40.179.102, Packet4 = 10
• IP of Flow5 = 203.255.98.2, Packet5 = 2
Packet Parser (3) –
Data structure for reducing domain size
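• Simple example
[Figure: the four count arrays (axis 0 to 255) after all five flows are recorded.]
- Byte 0: count[0][64] = 2, count[0][165] = 3, count[0][203] = 2, count[0][211] = 10, count[0][216] = 1
- Byte 1: count[1][40] = 10, count[1][58] = 2, count[1][91] = 3, count[1][239] = 1, count[1][255] = 2
- Byte 2: count[2][51] = 1, count[2][98] = 2, count[2][179] = 12, count[2][212] = 3
- Byte 3: count[3][2] = 2, count[3][100] = 1, count[3][102] = 10, count[3][230] = 2, count[3][255] = 3
• IP of Flow1 = 165.91.212.255, Packet1 = 3
• IP of Flow2 = 64.58.179.230, Packet2 = 2
• IP of Flow3 = 216.239.51.100, Packet3 = 1
• IP of Flow4 = 211.40.179.102, Packet4 = 10
• IP of Flow5 = 203.255.98.2, Packet5 = 2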
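The count[i][j] structure and the five-flow example can be reproduced with a short sketch. This is an illustration only, not NetViewer's code: the function names `new_counts`, `record` and `normalize` are made up for this example, and a single sampling instant is assumed.

```python
from typing import List

NUM_BYTES = 4      # an IPv4 address has four bytes
NUM_VALUES = 256   # each byte takes a value in 0..255


def new_counts() -> List[List[int]]:
    """Return a zeroed count[i][j] table: 4 rows of 256 counters."""
    return [[0] * NUM_VALUES for _ in range(NUM_BYTES)]


def record(count: List[List[int]], ip: str, packets: int) -> None:
    """Add `packets` to the counter selected by each of the four address bytes."""
    for i, part in enumerate(ip.split(".")):
        count[i][int(part)] += packets


def normalize(count: List[List[int]]) -> List[List[float]]:
    """Divide every row by its own total, giving p[i][j] for one sample."""
    result = []
    for row in count:
        total = sum(row)
        result.append([c / total if total else 0.0 for c in row])
    return result


# The five flows from the slide: (IP address, packet count).
flows = [
    ("165.91.212.255", 3),
    ("64.58.179.230", 2),
    ("216.239.51.100", 1),
    ("211.40.179.102", 10),
    ("203.255.98.2", 2),
]

count = new_counts()
for ip, packets in flows:
    record(count, ip, packets)
p = normalize(count)

print(count[2][179])  # 12: Flow2 and Flow4 share the third byte 179
```

Memory stays at 4 * 256 counters however many flows are recorded, which is the "constant, small memory" effect listed above.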
Signal Computing Engine
• Correlation
- To measure the strength of the linear relationship between adjacent sampling instants
$C_{ijn} = \frac{\mathrm{count}[i][j][n]}{\sum_{j=0}^{255} \mathrm{count}[i][j][n]} \times \frac{\mathrm{count}[i][j][n-1]}{\sum_{j=0}^{255} \mathrm{count}[i][j][n-1]}, \quad i = 0,1,2,3, \quad j = 0,\ldots,255$
• Delta
– The difference of traffic intensity
– It is remarkable at the instants when attacks begin and end
$\Delta p_{ijn} = \frac{\mathrm{count}[i][j][n]}{\sum_{j=0}^{255} \mathrm{count}[i][j][n]} - \frac{\mathrm{count}[i][j][n-1]}{\sum_{j=0}^{255} \mathrm{count}[i][j][n-1]}, \quad i = 0,1,2,3, \quad j = 0,\ldots,255$
• Scene change Analysis
– Variance of pixel intensities in the image
$S = \left[ \frac{1}{1024} \sum_{i=0}^{3} \sum_{j=0}^{255} \left( p_{ijn} - \bar{p}_{ijn} \right)^2 \right]^{1/2}$, where $p_{ijn}$ are pixel intensities and $\bar{p}_{ijn} = \frac{1}{1024} \sum_{i=0}^{3} \sum_{j=0}^{255} p_{ijn}$
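The three signals can be sketched in a few lines. The sketch assumes that the correlation is the product of the normalized counts at adjacent sampling instants, that delta is their difference, and that S is the root of the mean squared deviation over all 1024 pixels; the function names are illustrative and not taken from NetViewer.

```python
import math
from typing import List

Image = List[List[float]]  # 4 rows (address bytes) x 256 columns (byte values)


def correlation(p_now: Image, p_prev: Image) -> Image:
    """Element-wise product of normalized counts at instants n and n-1."""
    return [[a * b for a, b in zip(r1, r0)] for r1, r0 in zip(p_now, p_prev)]


def delta(p_now: Image, p_prev: Image) -> Image:
    """Element-wise difference of normalized counts at instants n and n-1."""
    return [[a - b for a, b in zip(r1, r0)] for r1, r0 in zip(p_now, p_prev)]


def scene_change(p: Image) -> float:
    """Root of the mean squared deviation of all pixel intensities."""
    pixels = [v for row in p for v in row]
    mean = sum(pixels) / len(pixels)
    return math.sqrt(sum((v - mean) ** 2 for v in pixels) / len(pixels))


# Two hypothetical samples: evenly spread traffic, and traffic that all
# goes to a single value in every address byte.
uniform = [[1.0 / 256] * 256 for _ in range(4)]
concentrated = [[1.0 if j == 0 else 0.0 for j in range(256)] for _ in range(4)]

print(scene_change(uniform), scene_change(concentrated))
```

Evenly spread traffic gives S close to zero, while concentrated traffic gives a larger S, which is the property the thresholds on S rely on.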
Detecting Engine –
Threshold setting
• From generated distribution signals (S), derive statistical thresholds
- High threshold $T_H$ : Traffic distribution less correlated than usual
- Low threshold $T_L$ : Traffic distribution more uniform than usual
$X \sim N(\mu, \sigma^2) \Rightarrow \Pr(\mu - 3.0\sigma < X < \mu + 3.0\sigma) \approx 99.7\%$
$\text{traffic status} = \begin{cases} \text{semi-random}, & \text{if } S > T_H \\ \text{normal}, & \text{if } T_L \le S \le T_H \\ \text{random}, & \text{if } S < T_L \end{cases}$
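A minimal sketch of the threshold test, assuming the thresholds sit three standard deviations above and below the mean of S and that the mean and standard deviation are tracked with an EWMA, as the monitoring slide mentions. The smoothing weight `alpha`, the class name `EwmaStats` and the function name `classify` are hypothetical.

```python
import math


class EwmaStats:
    """Exponentially weighted running mean and variance of the signal S."""

    def __init__(self, alpha: float = 0.1) -> None:
        self.alpha = alpha      # hypothetical smoothing weight
        self.mean = 0.0
        self.var = 0.0
        self.seeded = False

    def update(self, s: float) -> None:
        """Fold one new sample of S into the running statistics."""
        if not self.seeded:
            self.mean, self.var, self.seeded = s, 0.0, True
            return
        diff = s - self.mean
        self.mean += self.alpha * diff
        self.var = (1.0 - self.alpha) * (self.var + self.alpha * diff * diff)

    @property
    def std(self) -> float:
        return math.sqrt(self.var)


def classify(s: float, mean: float, std: float, k: float = 3.0) -> str:
    """Compare S with T_H = mean + k*std and T_L = mean - k*std."""
    t_high = mean + k * std
    t_low = mean - k * std
    if s > t_high:
        return "semi-random"
    if s < t_low:
        return "random"
    return "normal"


stats = EwmaStats()
for sample in [0.20, 0.21, 0.19, 0.20, 0.22]:   # made-up values of S
    stats.update(sample)
print(classify(0.5, stats.mean, stats.std))
```

In practice the statistics would presumably be updated only with samples judged normal, so that an ongoing attack does not drag the thresholds along with it; that policy is not stated in the slides.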
Visualization Engine
• Treat the traffic data as images
• Apply image processing based analysis
Image Generation
[Figure 2 (a) 1 dimension: 16 x 16 pixel blocks for IP byte 0, IP byte 1, IP byte 2 and IP byte 3, with pixels numbered 0 to 255 row by row; shown for the source IP address and the destination IP address.]
[Figure 2 (b) 2 dimension: 256 x 256 pixel blocks per IP byte, with pixels indexed (source IP address, destination IP address) from (0, 0) to (255, 255).]
Figure 2. The visualization of network traffic signal in IP address
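The two layouts in Figure 2 can be sketched as follows. The sketch assumes that the 1-dimensional view folds the 256 values of one address byte into 16 rows of 16 pixels, and that the 2-dimensional view uses the source byte as the row and the destination byte as the column. The function names and the sample flows are made up for illustration.

```python
from typing import List, Sequence, Tuple


def byte_image(p_row: Sequence[float]) -> List[List[float]]:
    """Fold the 256 normalized counts of one address byte into 16 rows of 16."""
    if len(p_row) != 256:
        raise ValueError("expected 256 values, one per byte value")
    return [list(p_row[r * 16:(r + 1) * 16]) for r in range(16)]


def pair_image(flows: Sequence[Tuple[str, str, int]],
               byte_index: int = 0) -> List[List[float]]:
    """Build a 256 x 256 image for one address byte.

    Rows are indexed by the source byte and columns by the destination byte;
    each pixel is that pair's share of all packets in the sample.
    """
    image = [[0.0] * 256 for _ in range(256)]
    total = sum(packets for _, _, packets in flows)
    if total == 0:
        return image
    for src, dst, packets in flows:
        s = int(src.split(".")[byte_index])
        d = int(dst.split(".")[byte_index])
        image[s][d] += packets / total
    return image


# Hypothetical sample: one source probing three destinations.
sample = [
    ("10.0.0.1", "20.0.0.1", 1),
    ("10.0.0.1", "30.0.0.1", 1),
    ("10.0.0.1", "40.0.0.1", 2),
]
img = pair_image(sample)
print([d for d, v in enumerate(img[10]) if v > 0])  # [20, 30, 40]
```

With this orientation a single source contacting many destinations lights up one row, and many sources contacting one destination light up one column.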
Generated various traffic Images
• Image reveals the
characteristics of traffic
– Normal behavior mode
– A single target (DoS)
– Semi-random target : a
subnet is fixed and the other
portion of the address changes
(Prefix-based attacks)
– Random target :
horizontal (Worm) and
vertical scan (DDoS)
Alerting Engine
• Scrutinize the statistical quantities – correlation and
delta
• Identify the IP addresses of suspicious attackers and
victims
• Lead to some form of a detection signal
• Generate the detection report
NetViewer’s Functionality
• Traffic Profiling
– General information of current network traffic
• Monitoring
– Monitor traffic distribution signal (S) over the latest time-window
• Anomaly Reporting
– Image-based traffic in the source/destination IP address domain
and the 2-dimensional domain
• Auxiliary Function
– Multidimensional Image
– Attack Tracking
– Automatic Spoofed Address Masking
Traffic Profiling Function (1)
Traffic Profiling Function (2)
• Understanding the general nature of the traffic at the monitoring point
• Bandwidth in Kbps and Kpps (packets per sec.)
• Protocol : the proportion occupied by each traffic protocol in percent
• Top 5 flows : the topmost 5 flows in packet count or byte count or flow number
– Based on an LRU (Least Recently Used) policy cache
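One way to read "top 5 flows from an LRU cache" is a bounded flow table that evicts the least recently seen flow when full and is ranked on demand. The sketch below follows that reading; the class name `FlowCache`, its capacity and the flow keys are hypothetical, and NetViewer's actual cache may differ.

```python
from collections import OrderedDict
from typing import List, Tuple


class FlowCache:
    """Bounded table of per-flow packet counts with LRU eviction."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.flows: "OrderedDict[str, int]" = OrderedDict()

    def add(self, key: str, packets: int = 1) -> None:
        """Count packets for a flow, evicting the least recently used if full."""
        if key in self.flows:
            self.flows[key] += packets
            self.flows.move_to_end(key)          # mark as most recently used
            return
        if len(self.flows) >= self.capacity:
            self.flows.popitem(last=False)       # drop least recently used
        self.flows[key] = packets

    def top(self, n: int = 5) -> List[Tuple[str, int]]:
        """Return the n flows with the highest packet counts."""
        ranked = sorted(self.flows.items(), key=lambda kv: kv[1], reverse=True)
        return ranked[:n]


cache = FlowCache(capacity=2)
for flow in ["a", "b", "a", "c"]:                # made-up flow keys
    cache.add(flow)
print(cache.top())
```

The same table could rank by byte count or flow number by storing those quantities instead of packets.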
Monitoring Function (1)
Monitoring Function (2)
• Traffic distribution signal (S) over the latest time-window
- 3 kinds of selected signals – S of packet count, S of byte count, S of flow count
• Source IP : packet count distribution signal in the source IP address domain
• Source FLOW : the number of flow distribution signal in the source IP address domain
• Source PORT : packet count distribution signal in the source IP port domain
• MULTIDIMENSIONAL : multiple components of the above signals in source domain
• Pr : the anomalous probability of current traffic under Gaussian distribution
• Signal : the distribution signal computed by $S = \left[ \frac{1}{1024} \sum_{i=0}^{3} \sum_{j=0}^{255} \left( p_{ijn} - \bar{p}_{ijn} \right)^2 \right]^{1/2}$
– illustrated with dotted vertical lines of 3σ level
• μ and σ : mean value and standard deviation of distribution signal using EWMA
Anomaly Reporting Function (1)
Anomaly Reporting Function (2)–
normal network traffic
• Use variance of pixel
intensities
– Distribution of traffic
over the observed
domain
• During anomalies, the
traffic distribution differs
from that of normal
traffic
– Higher correlation (DoS)
– Lower correlation
(worms)
Anomaly Reporting Function (3)–
semi-random targeted attacks
Anomaly Reporting Function (4)–
random targeted attacks
• Worm propagation type attack
• DDoS propagation type attack
Anomaly Reporting Function (5)–
complicated attacks
• Complicated and mixed attack pattern
• The horizontal (dotted or solid) line => a specific source scanning destination addresses.
• The vertical line => random sources attacking a specific destination.
Anomaly Reporting Function (6)–
Summary of Visual representation of traffic
• Worm attacks – horizontal line in 2D image
• DDoS attacks – vertical line in 2D image
=> Line detection algorithm
• Visual images look different in different traffic modes
• Motion prediction can lead to attack prediction
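The slides do not say which line detection algorithm is used. As a stand-in, the sketch below flags any row or column of a 2D image in which at least a given fraction of pixels is non-zero. It assumes rows are indexed by source and columns by destination; the function name `detect_lines` and the threshold `min_fill` are made up.

```python
from typing import List, Tuple


def detect_lines(image: List[List[float]],
                 min_fill: float = 0.5) -> Tuple[List[int], List[int]]:
    """Return (rows, cols) whose share of non-zero pixels is at least min_fill."""
    n_rows = len(image)
    n_cols = len(image[0])
    rows = [
        r for r in range(n_rows)
        if sum(1 for v in image[r] if v > 0) >= min_fill * n_cols
    ]
    cols = [
        c for c in range(n_cols)
        if sum(1 for r in range(n_rows) if image[r][c] > 0) >= min_fill * n_rows
    ]
    return rows, cols


# Hypothetical 8 x 8 images: one source hitting every destination (worm-like),
# and every source hitting one destination (DDoS-like).
worm = [[1.0 if r == 3 else 0.0 for _ in range(8)] for r in range(8)]
ddos = [[1.0 if c == 5 else 0.0 for c in range(8)] for _ in range(8)]

print(detect_lines(worm))  # ([3], [])
print(detect_lines(ddos))  # ([], [5])
```

A horizontal hit names the scanning source byte and a vertical hit names the targeted destination byte, matching the worm and DDoS patterns listed above.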
Anomaly Reporting Function (7)
Anomaly Reporting Function (7) - Identification
• Identify IP using statistical measures
• Black list
**************************************************************
[ Time : Tue 10-14-2003 05:12:00 ]
--------------------------------------------------------------
Source IP[1] 134. correlation = 17.48% possession = 18.77% delta = 2.50%
Source IP[1] 141. correlation = 4.33% possession = 3.94% delta = 0.79%
Source IP[1] 155. correlation = 58.20% possession = 56.80% delta = 2.84%
Source IP[1] 210. correlation = 5.66% possession = 6.51% delta = 1.60%
Source IP[2] 75. correlation = 17.47% possession = 18.77% delta = 2.51%
Source IP[2] 110. correlation = 4.62% possession = 5.25% delta = 1.21%
Source IP[2] 223. correlation = 4.31% possession = 3.94% delta = 0.78%
Source IP[2] 230. correlation = 58.21% possession = 56.84% delta = 2.76%
Source IP[3] 7. correlation = 15.59% possession = 17.02% delta = 2.74%
Source IP[3] 14. correlation = 53.99% possession = 52.31% delta = 3.41%
Source IP[4] 41 correlation = 15.16% possession = 16.36% delta = 2.30%
Source IP[4] 50 correlation = 52.58% possession = 50.83% delta = 3.54%
--------------------------------------------------------------
Identified No. 1st = 4, 2nd = 4, 3rd = 2, 4th = 2
==============================================================
Destination IP[1] 18. correlation = 4.37% possession = 3.88% delta = 1.01%
Destination IP[1] 128. correlation = 6.08% possession = 7.01% delta = 1.75%
Destination IP[1] 131. correlation = 53.65% possession = 52.33% delta = 2.67%
Destination IP[2] 181. correlation = 56.03% possession = 54.00% delta = 4.15%
Destination IP[4] 26 correlation = 3.89% possession = 3.58% delta = 0.65%
--------------------------------------------------------------
Identified No. 1st = 3, 2nd = 1, 3rd = 0, 4th = 1
==============================================================
* Identified Suspicious Source IP address(es)
134.75.7.41 correlation = 17.48% possession = 18.77% delta = 2.50%
141.223.xxx.xxx correlation = 4.33% possession = 3.94% delta = 0.79%
155.230.14.50 correlation = 58.20% possession = 56.80% delta = 2.84%
210.xxx.xxx.xxx correlation = 5.66% possession = 6.51% delta = 1.60%
--------------------------------------------------------------
* Identified Suspicious Destination IP address(es)
18.xxx.xxx.xxx correlation = 4.37% possession = 3.88% delta = 1.01%
128.xxx.xxx.xxx correlation = 6.08% possession = 7.01% delta = 1.75%
131.181.xxx.xxx correlation = 53.65% possession = 52.33% delta = 2.67%
**************************************************************
The detection report of anomaly identification.
Flow-based Network Traffic
• Visual representation based on the number of flows
– The number of flows in the address domain.
– The black lines illustrate more concentrated traffic intensity.
– This analysis is effective for revealing flood-type attacks.
Port-based Network Traffic
• Visual representation based on port numbers
– Normalized packet counts in the port-number domain.
– This analysis is effective for revealing port-scan-type attacks.
• Normal network traffic
• Attack traffic: SQL Slammer worm
• Port 1434 (decimal) = 0x059A => high byte 5, low byte 154 (0x9A)
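The port arithmetic on this slide suggests that a 16-bit port number is split into two bytes in the same way an address is split into four. A minimal sketch under that assumption follows; the names `port_bytes`, `record_port` and `port_count` are illustrative, and the packet count is made up.

```python
from typing import List, Tuple


def port_bytes(port: int) -> Tuple[int, int]:
    """Split a 16-bit port number into (high byte, low byte)."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port must fit in 16 bits")
    return port >> 8, port & 0xFF


# Two rows of 256 counters: row 0 for the high byte, row 1 for the low byte.
port_count: List[List[int]] = [[0] * 256 for _ in range(2)]


def record_port(port: int, packets: int = 1) -> None:
    """Add `packets` to the counters selected by both bytes of the port."""
    high, low = port_bytes(port)
    port_count[0][high] += packets
    port_count[1][low] += packets


record_port(1434, packets=100)   # port 1434, the SQL Slammer example above
print(port_bytes(1434))          # (5, 154)
```

A burst of traffic to one port then appears as a single bright pixel in each of the two byte images.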
Multidimensional Visualization
• Study multi-dimensional signals in IP address
i) packet counts => R
ii) number of flows => G
iii) the correlation of packet counts => B
• Comprehensive characteristics.
• Diverse analysis.
Evaluation in Address-based signals
Time       D.        TP β (1)         FP α (2)         NP β (3)  NP α (4)  LR (5)        NLR (6)
Real-time  SA        81.5% (637/782)  0.06% (2/3563)   76.3%     0.15%     1451.2/508.7  0.19/0.24
           DA        87.1% (681/782)  0.42% (15/3563)  88.4%     0.15%     206.9/589.3   0.13/0.12
           (SA, DA)  94.2% (737/782)  0.48% (17/3563)  _         _         197.5         0.06
1. True Positive rate by 3σ, the number of detections / the number of anomalies.
2. False Positive rate by 3σ
3. Expected true positive rate by NP test
4. Expected false positive rate by NP test
5. Likelihood Ratio in measurement by 3σ / LR in NP test
6. Negative Likelihood Ratio by 3σ / NLR in NP test
• The NP test shows slightly higher performance than 3σ
• The 2-dimensional signal is better than the 1-dimensional signals.
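The LR and NLR columns follow from the true and false positive rates. The sketch below assumes the standard definitions LR = TP / FP and NLR = (1 - TP) / (1 - FP), which reproduce the tabulated values for the SA row; the function names are illustrative.

```python
from typing import Tuple


def rate(hits: int, total: int) -> float:
    """Fraction of `total` cases counted as hits."""
    return hits / total


def likelihood_ratios(tp_rate: float, fp_rate: float) -> Tuple[float, float]:
    """Return (LR, NLR) for the given true and false positive rates."""
    lr = tp_rate / fp_rate
    nlr = (1.0 - tp_rate) / (1.0 - fp_rate)
    return lr, nlr


# SA row: 637 of 782 anomalies detected, 2 false alarms in 3563 normal samples.
tp = rate(637, 782)
fp = rate(2, 3563)
lr, nlr = likelihood_ratios(tp, fp)
print(f"TP = {tp:.1%}, FP = {fp:.2%}, LR = {lr:.1f}, NLR = {nlr:.2f}")
# TP = 81.5%, FP = 0.06%, LR = 1451.2, NLR = 0.19
```

The same two lines applied to the DA row (681/782 and 15/3563) give an LR near 206.9, so the large gap between the SA and DA ratios comes almost entirely from the false positive rate.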
Port-based signals
Time       D.        TP β             FP α            NP β   NP α   LR            NLR
Real-time  SP        83.4% (652/782)  0.14% (5/3563)  94.9%  0.07%  594.1/1428.8  0.17/0.05
           DP        96.2% (752/782)  0.17% (6/3563)  90.5%  0.14%  571.1/630.4   0.04/0.09
           (SP, DP)  96.8% (757/782)  0.25% (9/3563)  _      _      383.2         0.03
• Port-based signal could be a powerful signal
• Particularly useful for probing/scanning attacks
Multidimensional signals
Time         D.      TP β             FP α             LR     NLR
Real-time    (S, D)  97.1% (759/782)  0.62% (22/3563)  157.2  0.03
Post-mortem  (S, D)  97.4% (762/782)  0.34% (12/3563)  289.3  0.03
• Combined with three distinct image-based signals : address-based, flow-based and port-based
• Improve the detection rates considerably
• It is possible to detect complicated attacks using various signals
Attack Tracking - Motion prediction
Automatic Spoofed address Masking
• Mask addresses unassigned by IANA – especially in the 1st byte
• Blue-colored polygons indicate the reserved IP addresses –
there should be no pixels matching the unassigned space
• Destination IP : normal traffic
• Source IP : SQL Slammer traffic using (randomly) spoofed addresses
Comparison with IDS
• An intrusion detection system (IDS) is signature-based, whereas our approach is measurement-based.
– Compares traffic with predefined rules
– Needs to be updated with the latest rules.
• Snort as the representative IDS.
• Both show similar detection on the TAMU trace.
• Snort is superior in identification
– But it missed heavy traffic sources and new patterns
– It required more processing time.
Advantages
• Not looking for specific known attacks
• Generic mechanism
• Works in real-time
– Latencies of a few samples
– Simple enough to be implemented inline
• Windows and Unix versions are released at
http://dropzone.tamu.edu/~skim/netviewer.html
• Comments to
[email protected] or
[email protected]
Conclusion
• We studied the feasibility of analyzing packet header data
as images for detecting traffic anomalies.
• We evaluated the effectiveness of our approach in real-time mode using network traffic.
• Real-time traffic analysis and monitoring is feasible
– Simple enough to be implemented inline
• Can rely on many tools from the image processing area
– More robust offline analysis possible
– Concise for logging and playback
Thank you !!
Bloom Filtering
Identification (2)
: Entire IP address level
• Step 1: Employ 4 independent hash functions as a Bloom filter, h1(am), h2(am), h3(am), h4(am).
• Step 2: Concatenation of suspicious IP bytes using e-vicinity. Continue to the 4th byte.
• Step 3: Membership query of generated 4-byte IP address
=> Automatic containment for identified attacks
[Figure: The Flowchart in concatenation of Identification. Decision and action boxes, in order:]
- Detect anomaly? (Y/N)
- Select an identified entry in 1st byte, n = 2
- Find the nearest identified entry in the nth byte
- Difference of 1st and nth < 20% of 1st byte entry? Y: Concatenate. N: Represent by xxx
- n = 4? N: n = n + 1
- Query membership : concatenated 4 byte & stored entries in Bloom filter? N: Discard
- Search all identified entries in 1st byte? (Y/N)
- End
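Steps 1 and 3 can be sketched with a small Bloom filter. This is an illustration under stated assumptions: the four hash functions are derived from SHA-256 with different prefixes, the filter size is arbitrary, and the class name `BloomFilter` is made up; the slides do not specify any of these. The concatenation in Step 2 is not reproduced here.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Bit-array membership filter with a fixed number of hash functions."""

    def __init__(self, num_bits: int = 1 << 16, num_hashes: int = 4) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item: str) -> Iterator[int]:
        """Yield one bit position per hash function for this item."""
        for k in range(self.num_hashes):
            digest = hashlib.sha256(f"{k}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        """Step 1: store an observed address by setting its bits."""
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        """Step 3: an item may be present only if all of its bits are set."""
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


# Addresses actually observed in the sample are stored in the filter.
seen = BloomFilter()
for address in ["155.230.14.50", "134.75.7.41"]:
    seen.add(address)

# A concatenated candidate is kept only if the filter reports it as seen.
print("155.230.14.50" in seen)   # True
print("155.75.7.50" in seen)
```

A Bloom filter never misses a stored address but can report an unseen one with small probability, so a candidate that passes the query is likely, not certain, to have appeared in the traffic.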
Processing and memory complexity
• Two samples of packet header data 2*P, where P is the size of the sample data
• Summary information (DCT coefficients etc.) over samples S
• Total space requirement O(P+S)
• P is 2^32 => 4*256 = 1024 (1D), 2^64 => 256K (2D)
• S is 32*32 => 16
=> Memory requires 258K
• Processing O(P+S)
• Update 4 counters per domain
• Per-packet data-plane cost low.