Transcript X1 Application: Computational Fluid Dynamics
MINDS: Data Mining Based Network Intrusion Detection System
Vipin Kumar
Army High Performance Computing Research Center, University of Minnesota
http://www.cs.umn.edu/research/minds/
Team Members: Eric Eilertson, Paul Dokas, Levent Ertoz, Ben Mayer, Aleksandar Lazarevic, Michael Steinbach, George Simon, Varun Chandola, Mark Shaneck, Jaideep Srivastava, Zhi-Li Zhang, Yongdae Kim, Vipin Kumar
AHPCRC
Information Assurance
The sophistication and severity of cyber attacks are increasing
ARL, the Army, DoD and other U.S. government agencies are major targets for sophisticated state-sponsored cyber terrorists
Cyber strategies can be a major force multiplier and equalizer
Across DoD, computer assets have been compromised and information has been stolen, putting technological advantage and battlefield superiority at risk
[Figure: Incidents reported to the Computer Emergency Response Team/Coordination Center (CERT/CC); vertical axis from 0 to 90000 incidents]
Security mechanisms always have inevitable vulnerabilities
Firewalls are not sufficient to ensure security in computer networks
Insider attacks
Spread of SQL Slammer worm 10 minutes after its deployment
Information Assurance
• Intrusion Detection System
– Combination of software and hardware that attempts to perform intrusion detection
– Raises the alarm when possible intrusion happens
• Traditional intrusion detection system (IDS) tools are based on signatures of known attacks
• Limitations
– Signature database has to be manually revised for each new type of discovered intrusion
– Substantial latency in deployment of newly created signatures across the computer system
– They cannot detect emerging cyber threats
– Not suitable for detecting policy violations and insider abuse
– Do not provide understanding of network traffic
– Generate too many false alarms
Example of SNORT rule
( MS SQL “Slammer” worm ) any -> udp port 1434 (content:"|81 F1 03 01 04 9B 81 F1 01|"; content:"sock"; content:"send")
www.snort.org
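As a minimal sketch of what the content-matching part of such a rule does, the Python below checks a packet against the three content patterns shown in the rule above. The function name `matches_slammer_rule`, the packet representation, and the sample payload are hypothetical illustrations; this is not SNORT's matching engine, and real rules carry header fields and options not modeled here.

```python
# Hypothetical illustration of signature-based matching; not SNORT's actual engine.
SLAMMER_CONTENTS = [
    bytes.fromhex("81F10301049B81F101"),  # the |81 F1 03 01 04 9B 81 F1 01| byte pattern
    b"sock",
    b"send",
]

def matches_slammer_rule(protocol: str, dst_port: int, payload: bytes) -> bool:
    """Return True if the packet is UDP to port 1434 and contains every pattern."""
    if protocol != "udp" or dst_port != 1434:
        return False
    return all(pattern in payload for pattern in SLAMMER_CONTENTS)

# Made-up payload that embeds all three patterns.
sample = b"\x04" + bytes.fromhex("81F10301049B81F101") + b"..sock..send"
print(matches_slammer_rule("udp", 1434, sample))
```

A variant of the worm that changes any of these byte sequences no longer matches, which is the limitation the slide points to.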
Data Mining for Intrusion Detection
• Increased interest in data mining based intrusion detection
– Attacks for which it is difficult to build signatures
– Unforeseen/Unknown/Emerging attacks
• Misuse detection
– Building predictive models from labeled data sets (instances are labeled as “normal” or “intrusive”) to identify known intrusions
– High accuracy in detecting many kinds of known attacks
– Cannot detect unknown and emerging attacks
• Anomaly detection
– Detect novel attacks as deviations from “normal” behavior
– Potential high false alarm rate: previously unseen (yet legitimate) system behaviors may also be recognized as anomalies
Data Mining for Intrusion Detection
Misuse Detection – Building Predictive Models

Training Set
Tid  SrcIP          Start time  Dest IP         Dest Port  Number of bytes  Attack
1    206.135.38.95  11:07:20    160.94.179.223  139        192              No
2    206.163.37.95  11:13:56    160.94.179.219  139        195              Yes
3    206.163.37.95  11:14:29    160.94.179.217  139        180              No
4    206.163.37.95  11:14:30    160.94.179.255  139        199              Yes
5    206.163.37.95  11:14:32    160.94.179.254  139        19               No
6    206.163.37.95  11:14:35    160.94.179.253  139        177              No
7    206.163.37.95  11:14:36    160.94.179.252  139        172              No
8    206.163.37.95  11:14:38    160.94.179.251  139        285              Yes
9    206.163.37.95  11:14:41    160.94.179.250  139        195              No
10   206.163.37.95  11:14:44    160.94.179.249  139        163              No

Learn Classifier → Model

Test Set
Tid  SrcIP          Start time  Dest IP         Number of bytes  Attack
1    206.163.37.81  11:17:51    160.94.179.208  150              ?
2    206.163.37.99  11:18:10    160.94.179.235  208              ?
3    206.163.37.55  11:34:35    160.94.179.221  195              ?
4    206.163.37.37  11:41:37    160.94.179.253  199              ?
5    206.163.37.41  11:55:19    160.94.179.244  181              ?

Summarization of attacks using association rules
Rules Discovered:
{Src IP = 206.163.37.95, Dest Port = 139, Bytes ∈ [150, 200]} --> {ATTACK}

Anomaly Detection
Key Technical Challenges
Large data size
High dimensionality
Temporal nature of the data
Skewed class distribution
Data preprocessing
On-line analysis
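The rule discovered above can be read as a predicate over connection records. The sketch below, a minimal illustration rather than the MINDS implementation, encodes that rule and applies it to a few (source IP, destination port, bytes) triples taken from the training rows; the `Flow` type and the function name `rule_fires` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Flow:
    src_ip: str
    dst_port: int
    num_bytes: int

def rule_fires(flow: Flow) -> bool:
    """{Src IP = 206.163.37.95, Dest Port = 139, Bytes in [150, 200]} --> ATTACK."""
    return (
        flow.src_ip == "206.163.37.95"
        and flow.dst_port == 139
        and 150 <= flow.num_bytes <= 200
    )

flows = [
    Flow("206.135.38.95", 139, 192),  # different source IP: rule does not fire
    Flow("206.163.37.95", 139, 195),  # all three conditions hold: rule fires
    Flow("206.163.37.95", 139, 19),   # bytes below the range
    Flow("206.163.37.95", 139, 285),  # bytes above the range
]
for f in flows:
    print(f, "ATTACK" if rule_fires(f) else "no match")
```

A single hand-readable rule like this is easy to audit, but it only covers what the labeled training data contained, which is why the slide lists skewed class distribution among the challenges.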
MINDS – Minnesota INtrusion Detection System
[Figure: MINDS system. Components shown: network; data capturing device (NetFlow tools, tcpdump); filtering; feature extraction; known attack detection (detected known attacks); anomaly detection (anomaly scores, detected novel attacks); human analyst (labels); association pattern analysis (summary and characterization of attacks)]
• Data mining based intrusion detection system
• Incorporated into Interrogator architecture at ARL Center for Intrusion Monitoring and Protection (CIMP)
– Helps analyze data from multiple sensors at DoD sites around the country
– MINDS anomalies are used as the primary key when viewing related alerts from other tools (SNORT, Jids, etc.)
• MINDS is the first effective anomaly intrusion detection system used by ARL
• Routinely detects attacks and intrusive behavior not detected by widely used intrusion detection systems
– Insider Abuse / Policy Violations / Worms / Scans
Feature Extraction Module
• Three groups of features
– Basic features of individual TCP connections
  • Source & destination IP: Features 1 & 2
  • Source & destination port: Features 3 & 4
  • Protocol: Feature 5
  • Duration: Feature 6
  • Bytes per packet: Feature 7
  • Number of bytes: Feature 8
– Time based features
  • For the same source (destination) IP address, number of unique destination (source) IP addresses inside the network in last T seconds: Features 9 (13)
  • Number of connections from source (destination) IP to the same destination (source) port in last T seconds: Features 11 (15)
– Connection based features
  • For the same source (destination) IP address, number of unique destination (source) IP addresses inside the network in last N connections: Features 10 (14)
  • Number of connections from source (destination) IP to the same destination (source) port in last N connections: Features 12 (16)
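To make the time based features concrete, here is a minimal sketch of one of them: for each flow, the number of unique destination IPs contacted by the same source IP in the last T seconds. The function name, the tuple layout, and the choice to count the current flow are assumptions for illustration; the "inside the network" filter from the slide is not modeled.

```python
from collections import deque

def unique_dsts_in_window(flows, window_seconds):
    """flows: list of (timestamp, src_ip, dst_ip), sorted by timestamp.

    Returns, for each flow, how many distinct destination IPs the same
    source contacted within the last `window_seconds` (current flow included).
    """
    recent = {}   # src_ip -> deque of (timestamp, dst_ip) still inside the window
    counts = []
    for ts, src, dst in flows:
        window = recent.setdefault(src, deque())
        window.append((ts, dst))
        # Drop entries older than T seconds relative to the current flow.
        while window[0][0] < ts - window_seconds:
            window.popleft()
        counts.append(len({d for _, d in window}))
    return counts

flows = [(0, "a", "x"), (1, "a", "y"), (2, "b", "x"), (3, "a", "x"), (10, "a", "z")]
print(unique_dsts_in_window(flows, window_seconds=5))
```

The connection based variant would keep the last N flows per source instead of a time window, which helps with slow scans that a fixed T would miss.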
Detection of Anomalies on Real Network Data
• Anomalies/attacks picked by MINDS include scanning activities, worms, and non-standard behavior such as policy violations and insider attacks.
• Many of the attacks detected by MINDS were already on the CERT/CC list of recent advisories and incident notes.
• Some illustrative examples of intrusive behavior detected using MINDS at U of M
Scans
– Detected scanning for Microsoft DS service on port 445/TCP
  • Undetected by SNORT since the scanning was non-sequential (very slow). Rule added to SNORT in September 2002
– Detected scanning for Oracle server
  • Undetected by SNORT because the scanning was hidden within another Web scanning
– Detected a distributed Windows networking scan from multiple source locations
Policy Violations
– Identified machine running Microsoft PPTP VPN server on non-standard ports
  • Undetected by SNORT since the collected GRE traffic was part of the normal traffic
– Identified compromised machines running FTP servers on non-standard ports, which is a policy violation
  • Example of anomalous behavior following a successful Trojan horse attack
– Detected computers on the network apparently communicating with outside computers over a VPN or on IPv6
Worms
– Detected several instances of Slapper worm that were not identified by SNORT since they were variations of existing worm code
– Detected unsolicited ICMP ECHOREPLY messages to a computer previously infected with Stacheldraht worm (a DDoS agent)
Typical Anomaly Detection Output
– January 26, 2003 (48 hours after the “slammer” worm)

score     srcIP          sPort  dstIP          dPort  protocol
37674.69  63.150.X.253   1161   128.101.X.29   1434   17
26676.62  63.150.X.253   1161   160.94.X.134   1434   17
24323.55  63.150.X.253   1161   128.101.X.185  1434   17
21169.49  63.150.X.253   1161   160.94.X.71    1434   17
19525.31  63.150.X.253   1161   160.94.X.19    1434   17
19235.39  63.150.X.253   1161   160.94.X.80    1434   17
17679.1   63.150.X.253   1161   160.94.X.220   1434   17
8183.58   63.150.X.253   1161   128.101.X.108  1434   17
7142.98   63.150.X.253   1161   128.101.X.223  1434   17
5139.01   63.150.X.253   1161   128.101.X.142  1434   17
4048.49   142.150.Y.101  0      128.101.X.127  2048   1
4008.35   200.250.Z.20   27016  128.101.X.116  4629   17
3657.23   202.175.Z.237  27016  128.101.X.116  4148   17
3450.9    63.150.X.253   1161   128.101.X.62   1434   17
3327.98   63.150.X.253   1161   160.94.X.223   1434   17
2796.13   63.150.X.253   1161   128.101.X.241  1434   17
2693.88   142.150.Y.101  0      128.101.X.168  2048   1
2683.05   63.150.X.253   1161   160.94.X.43    1434   17
2444.16   142.150.Y.236  0      128.101.X.240  2048   1
2385.42   142.150.Y.101  0      128.101.X.45   2048   1
2114.41   63.150.X.253   1161   160.94.X.183   1434   17
2057.15   142.150.Y.101  0      128.101.X.161  2048   1
1919.54   142.150.Y.101  0      128.101.X.99   2048   1
1634.38   142.150.Y.101  0      128.101.X.219  2048   1
1596.26   63.150.X.253   1161   128.101.X.160  1434   17
1513.96   142.150.Y.107  0      128.101.X.2    2048   1
1389.09   63.150.X.253   1161   128.101.X.30   1434   17
1315.88   63.150.X.253   1161   128.101.X.40   1434   17
1279.75   142.150.Y.103  0      128.101.X.202  2048   1
1237.97   63.150.X.253   1161   160.94.X.32    1434   17
1180.82   63.150.X.253   1161   128.101.X.61   1434   17

Remaining columns: flags is 16 and bytes is [0,1829) for every row; packets is [0,2) or [2,4). The per-feature contribution columns 1 to 16 are 0 in most cells; the main nonzero entries are in column 9 (0.81 to 0.83) and column 11 (0.56 to 0.59).

– Anomalous connections that correspond to the “slammer” worm (dPort 1434, protocol 17)
– Anomalous connections that correspond to the ping scan (protocol 1)
– Connections corresponding to UM machines connecting to “half-life” game servers (sPort 27016)
Summarization Using Association Patterns
[Figure: Anomaly Detection System → ranked connections (attack / normal) → Discriminating Association Pattern Generator → update → Knowledge Base]
1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack
R1: TCP, DstPort=1863 → Attack
…
R100: TCP, DstPort=80 → Normal
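One simple way to picture a discriminating pattern such as R1 is an attribute value that is frequent among the top-ranked (anomalous) connections and rare among normal ones. The sketch below is a minimal, single-attribute illustration under that assumption; the function name, the `min_support` and `min_ratio` thresholds, and the sample records are hypothetical, and the actual MINDS generator mines multi-attribute patterns with its own measures.

```python
from collections import Counter

def discriminating_items(anomalous, normal, min_support=0.5, min_ratio=2.0):
    """Return (item, support_in_anomalous, support_in_normal) for attribute
    values that are frequent in `anomalous` and comparatively rare in `normal`.
    Each record is a dict of attribute -> value."""
    def support(records):
        if not records:
            return {}
        counts = Counter(item for rec in records for item in set(rec.items()))
        return {item: c / len(records) for item, c in counts.items()}

    sup_a, sup_n = support(anomalous), support(normal)
    found = []
    for item, sa in sup_a.items():
        sn = sup_n.get(item, 0.0)
        if sa >= min_support and (sn == 0.0 or sa / sn >= min_ratio):
            found.append((item, sa, sn))
    return sorted(found, key=lambda t: (-t[1], t[0]))

anomalous = [{"proto": "tcp", "dst_port": 1863}] * 3 + [{"proto": "tcp", "dst_port": 80}]
normal = [{"proto": "tcp", "dst_port": 80}] * 4
print(discriminating_items(anomalous, normal))
```

Here `proto=tcp` is frequent in both sets and is therefore dropped, while `dst_port=1863` is kept as a candidate summary of the anomalous connections.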
Typical MINDS Output
[Table: summarized MINDS output. Columns: score, c1, c2, src IP, sPort, dst IP, dPort, protocol, flags, packets, bytes, and per-feature contributions 1 to 16. Fields generalized by the summarization appear as "----" or "xxx.xxx.xxx.xxx".]
Analyst annotations on the summaries:
– UM computer connecting to a remote FTP server, running on port 5002
– Summarized TCP reset packets received from 64.156.X.74, which is a victim of a DoS attack; we were observing backscatter, i.e. replies to spoofed packets
– Summarization of FTP scan from a computer in Columbia, 200.75.X.2
– Summary of IDENT lookups, where a remote computer tries to get user name
– Summarization of a USENET server transferring a large amount of data
Typical MINDS Output
[Table: summarized MINDS output. Columns: score, c1, c2, src IP, sPort, dst IP, dPort, prot, flags, packets, bytes, and per-feature contributions 1 to 14.]
Analyst annotations on the summaries:
– UM computers doing bulk transfers
– Attack on Real-Media server (Reported by CERT on September 9, 2003, RealNetworks media server RTSP protocol parser buffer overflow)
– 8200/tcp traffic related to gotomypc.com which allows users to remotely control a desktop (involves a third party)
– Mysterious traffic currently being investigated
Typical MINDS Output
[Table: summarized MINDS output. Columns: score, c1, c2, src IP, sPort, dst IP, dPort, protocol, flags, packets, bytes, and per-feature contributions 1 to 16.]
Analyst annotations on the summaries:
– UMN computers doing bulk transfers
– 160.94.122.142 is running a rogue FTP server on 60000/TCP
– UMN computers doing large transfers via BitTorrent to many outside hosts
– This computer is scanning for computers on port 139/TCP. Majority of the packets are 192 bytes or 144 bytes, except for the second summary (score 88.2)
– UMN computer running a RealMedia server that was not known to the analyst
– Odd looking P2P traffic to/from a UMN computer (potentially KaZaA or Gnutella)
– The remote computer was scanning for 57/TCP, where RESET packets are sent back from computers that do not have 57/TCP open.
Scan Detection
• Despite the importance of scan detection, its value is often overlooked
– Lack of good tools for scan detection
  • Existing methods either miss stealth scans or give too many false alarms
• Fast scans are easy to catch using existing schemes, but stealth scans are very difficult to recognize
• MINDS employs our new methodology for detecting network scans
– Makes use of powerful new heuristics
  • Only considers flows with a small number of packets
  • Only considers scans in a subnet (not the whole internet)
– Makes effective use of usage information
  • Touches to rare IP / port combinations are more suspicious than others
  • A scanner will hit machines where the service is not available, resulting in a low count
• Very low false alarm rate
– Evaluation of 36 million flows over a 30-minute window at the University of Minnesota showed 2583 alarms but only 22 false alarms
– Evaluation on an hour of data at the ARL reported 1150 scans, but only 5 false alarms
• Routinely finds compromised machines at ARL-CIMP
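The heuristics listed above can be sketched roughly as follows: ignore flows with many packets, and count, per source, the distinct IP / port combinations it touched where no service is known to be available. This is only an illustration of the idea, not the MINDS scan detector; the function name, the `known_services` set, and both thresholds are hypothetical.

```python
from collections import defaultdict

def find_scanners(flows, known_services, max_packets=3, min_touches=3):
    """flows: iterable of (src_ip, dst_ip, dst_port, packets).
    known_services: set of (dst_ip, dst_port) pairs observed to offer a service.
    Returns the set of sources that touched at least `min_touches` distinct
    IP/port pairs with no known service, using only small flows."""
    touches = defaultdict(set)
    for src, dst, port, packets in flows:
        if packets > max_packets:
            continue  # heuristic: only flows with a small number of packets
        if (dst, port) not in known_services:
            touches[src].add((dst, port))
    return {src for src, hit in touches.items() if len(hit) >= min_touches}

known = {("10.0.0.5", 80)}
flows = [
    ("s1", "10.0.0.1", 445, 1), ("s1", "10.0.0.2", 445, 1), ("s1", "10.0.0.3", 445, 2),
    ("s2", "10.0.0.5", 80, 40),   # large flow to a real service
    ("s3", "10.0.0.5", 80, 2),    # small flow, but the service exists
]
print(find_scanners(flows, known))
```

Because the count is over distinct targets rather than flows per second, a slow scan accumulates the same evidence as a fast one, given a long enough observation period.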
Detecting Suspicious Ports for Possible Worm Activity
• We find destinations located within the network for which there is a high connection failure rate on specific ports for inbound, non-scan connections
• Then we find ports on which there are many such destinations
• The existence of these ports indicates a potential worm or slow scan
• This warrants targeted and more detailed data collection and analysis that cannot be done easily on the entire data
– Packet content analysis
– Signature generation
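The two-step procedure above can be sketched as a pair of aggregations: first a failure rate per destination IP / port pair, then a count of failing destinations per port. The function name and the three thresholds are hypothetical choices for illustration, and the input is assumed to be already restricted to inbound, non-scan connections.

```python
from collections import defaultdict

def suspicious_ports(connections, min_attempts=5, min_failure_rate=0.8,
                     min_destinations=3):
    """connections: iterable of (dst_ip, dst_port, failed) tuples.
    Returns {port: set of destinations} for ports on which many internal
    destinations show a high connection failure rate."""
    stats = defaultdict(lambda: [0, 0])  # (dst_ip, port) -> [attempts, failures]
    for dst, port, failed in connections:
        entry = stats[(dst, port)]
        entry[0] += 1
        entry[1] += int(failed)

    # Step 1: destinations with a high failure rate on a specific port.
    failing = defaultdict(set)
    for (dst, port), (attempts, failures) in stats.items():
        if attempts >= min_attempts and failures / attempts >= min_failure_rate:
            failing[port].add(dst)

    # Step 2: ports on which there are many such destinations.
    return {p: d for p, d in failing.items() if len(d) >= min_destinations}

conns = [(dst, 445, True) for dst in ("a", "b", "c") for _ in range(5)]
conns += [("a", 80, False)] * 5
print(suspicious_ports(conns))
```

The ports returned are candidates for the targeted follow-up the slide mentions, such as packet content analysis.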
IP / port pairs for which a large percentage of connections failed
IP / port pairs for which a large percentage of connections failed (only for ports with many hits)
[Figure: map of the IPv4 address space by first octet (0 to 255). Labeled blocks: GE, HP, Apple, CSC, MIT, IBM, ATT, Xerox, Cable, Ford, Halliburton, Merit Networks, PSI, Eli Lily, Am Rad Digi Com, Interop Show Net, Nortel, Prudential, DuPont, Chrysler, Merck, AOL, APNIC (Asia), RIPE (Europe), LACNIC (Lat. Am.), Japan Inet, SITA (French). Legend: US Military, USPS, ARIN, UK Government, IANA Reserved, Multicast, Private Use, Loopback, Public Data Network]
Port 80:
– 999 unique sources (Min: 1, Max: 28, Avg: 1)
– 1126 unique destinations (Min: 1, Max: 55, Avg: 1)
– 1516 total flows involved
– 1472 scan flows on port 80 (found by scan detector)
Port 445:
– 7982 unique sources (Min: 1, Max: 16, Avg: 1)
– 6184 unique destinations (Min: 1, Max: 28, Avg: 1)
– 9930 total flows involved
– 9406 scan flows on port 445 (found by scan detector)
Clustering
• Useful for detecting modes of behavior
– Shared Nearest Neighbor (SNN) clustering works quite well at determining modes of behavior
  • Not distracted by “noise” in the data
• SNN is CPU intensive, O(N^2)
• Requires storing an N x K matrix
– K (number of neighbors) is typically between 10 and 20
– K should be about the size of the smallest expected mode
• Clustered 850,000 connections collected over one hour at one US Army Fort
• Took 10 hours using three quad-CPU 2.8 GHz servers and four 2 GHz workstations (16 CPUs in total)
• Required around 100 MB of memory per PE for the distance calculations
– 500 MB of memory for the final clustering step on a single PE
• Found 3135 clusters
– Largest clusters around 500 records, smallest cluster 10 records
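As a minimal sketch of the shared nearest neighbor idea, the code below builds each point's K-nearest-neighbor list (the N x K structure mentioned above, computed with O(N^2) distance evaluations) and defines the similarity of two points as the number of neighbors they share, counted only when each appears in the other's list. The function names and the toy points are hypothetical, and the full SNN clustering step that follows is not shown.

```python
import math

def k_nearest(points, k):
    """Brute-force K-nearest-neighbor sets: O(N^2) distance computations."""
    neighbors = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        neighbors.append(set(others[:k]))
    return neighbors

def snn_similarity(neighbors, i, j):
    """Number of shared neighbors, counted only if i and j are in each
    other's nearest-neighbor lists."""
    if i not in neighbors[j] or j not in neighbors[i]:
        return 0
    return len(neighbors[i] & neighbors[j])

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
nbrs = k_nearest(points, k=2)
print(snn_similarity(nbrs, 0, 1), snn_similarity(nbrs, 0, 3))
```

Points in different dense groups share no neighbors, so their similarity is zero regardless of raw distance, which is what makes the approach tolerant of noise.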
Detecting Large Modes of Network Traffic Using Clustering
Large clusters of VPN traffic (hundreds of connections)
Used between forts for secure sharing of data and working remotely

Start Time                Duration  Src IP  Src Port  Dst IP  Dst Port  Proto  TTL  Packets  Bytes
20040407.10:00:00.428036  0:00:00   B       -1        A       -1        gre    237  1        556
20040407.10:00:00.685520  0:00:03   B       -1        A       -1        gre    237  1        556
20040407.10:00:00.748920  0:00:00   B       -1        A       -1        gre    237  1        556
20040407.10:01:44.138057  0:00:00   B       -1        A       -1        gre    237  1        556
20040407.10:01:59.267932  0:00:00   B       -1        A       -1        gre    237  1        96
20040407.10:02:44.937575  0:00:01   B       -1        A       -1        gre    237  1        556
20040407.10:04:00.717395  0:00:00   B       -1        A       -1        gre    237  1        556
20040407.10:04:30.976627  0:00:01   B       -1        A       -1        gre    237  1        556
20040407.10:04:46.106233  0:00:00   B       -1        A       -1        gre    237  1        556
20040407.10:05:46.715539  0:00:00   B       -1        A       -1        gre    237  1        556
20040407.10:06:16.975202  0:00:01   B       -1        A       -1        gre    237  1        556
20040407.10:06:32.105013  0:00:00   B       -1        A       -1        gre    237  1        556

Start Time                Duration  Src IP  Src Port  Dst IP  Dst Port  Proto  TTL  Packets  Bytes
20040407.10:00:40.685522  0:00:03   A       -1        B       -1        gre    237  1        96
20040407.10:00:58.748922  0:00:00   A       -1        B       -1        gre    237  1        96
20040407.10:01:44.138059  0:00:00   A       -1        B       -1        gre    237  1        96
20040407.10:02:14.678442  0:00:00   A       -1        B       -1        gre    237  1        96
20040407.10:02:44.937577  0:00:01   A       -1        B       -1        gre    237  1        96
20040407.10:03:15.308206  0:00:00   A       -1        B       -1        gre    237  1        96
20040407.10:04:30.976629  0:00:01   A       -1        B       -1        gre    237  1        96
20040407.10:06:16.975204  0:00:01   A       -1        B       -1        gre    237  1        96
20040407.10:06:32.105015  0:00:00   A       -1        B       -1        gre    237  1        96
20040407.10:06:47.234837  0:00:00   A       -1        B       -1        gre    237  1        96
20040407.10:07:02.367471  0:00:00   A       -1        B       -1        gre    237  1        96
20040407.10:07:17.494574  0:00:00   A       -1        B       -1        gre    237  1        96
Detecting Unusual Modes of Network Traffic Using Clustering
Clusters Involving GoToMyPC.com (Army Data)
Policy violation, allows remote control of a desktop

Start Time                Duration  Src IP  Src Port  Dst IP  Dst Port  Proto  TTL  Flags     Packets  Bytes
20040407.10:00:10.428036  0:00:00   B       4125      A       8200      tcp    123  ***AP*SF  5        248
20040407.10:00:40.685520  0:00:03   B       4127      A       8200      tcp    123  ***AP*SF  5        248
20040407.10:00:58.748920  0:00:00   B       4138      A       8200      tcp    123  ***AP*SF  5        248
20040407.10:01:44.138057  0:00:00   B       4141      A       8200      tcp    123  ***AP*SF  5        248
20040407.10:01:59.267932  0:00:00   B       4143      A       8200      tcp    123  ***AP*SF  5        248
20040407.10:02:44.937575  0:00:01   B       4149      A       8200      tcp    123  ***AP*SF  5        248
20040407.10:04:00.717395  0:00:00   B       4163      A       8200      tcp    123  ***AP*SF  5        248
20040407.10:04:30.976627  0:00:01   B       4172      A       8200      tcp    123  ***AP*SF  5        248
20040407.10:04:46.106233  0:00:00   B       4173      A       8200      tcp    123  ***AP*SF  5        248
20040407.10:05:46.715539  0:00:00   B       4178      A       8200      tcp    123  ***AP*SF  5        248
20040407.10:06:16.975202  0:00:01   B       4180      A       8200      tcp    123  ***AP*SF  5        248
20040407.10:06:32.105013  0:00:00   B       4181      A       8200      tcp    123  ***AP*SF  5        248

Start Time                Duration  Src IP  Src Port  Dst IP  Dst Port  Proto  TTL  Flags     Packets  Bytes
20040407.10:00:40.685522  0:00:03   A       8200      B       4127      tcp    123  ***AP*SF  4        211
20040407.10:00:58.748922  0:00:00   A       8200      B       4138      tcp    123  ***AP*SF  4        211
20040407.10:01:44.138059  0:00:00   A       8200      B       4141      tcp    123  ***AP*SF  4        211
20040407.10:02:14.678442  0:00:00   A       8200      B       4145      tcp    123  ***AP*SF  4        211
20040407.10:02:44.937577  0:00:01   A       8200      B       4149      tcp    123  ***AP*SF  4        211
20040407.10:03:15.308206  0:00:00   A       8200      B       4153      tcp    123  ***AP*SF  4        211
20040407.10:04:30.976629  0:00:01   A       8200      B       4172      tcp    123  ***AP*SF  4        211
20040407.10:06:16.975204  0:00:01   A       8200      B       4180      tcp    123  ***AP*SF  4        211
20040407.10:06:32.105015  0:00:00   A       8200      B       4181      tcp    123  ***AP*SF  4        211
20040407.10:06:47.234837  0:00:00   A       8200      B       4182      tcp    123  ***AP*SF  4        211
20040407.10:07:02.367471  0:00:00   A       8200      B       4183      tcp    123  ***AP*SF  4        211
20040407.10:07:17.494574  0:00:00   A       8200      B       4184      tcp    123  ***AP*SF  4        211
Detecting Unusual Modes of Network Traffic Using Clustering
Clusters involving mysterious ping and SNMP traffic

Start Time                Duration  Src IP  Src Port  Dst IP  Dst Port  Proto  TTL  ICMP Type  ICMP Code  # Packets  # Bytes
20040407.10:01:00.181261  0:00:00   A       1176      B       161       udp    123  -          -          1          95
20040407.10:01:23.183183  0:00:00   A       -1        B       -1        icmp   123  8          0          1          84
20040407.10:02:54.182861  0:00:00   A       1514      B       161       udp    123  -          -          1          95
20040407.10:03:03.196850  0:00:00   A       -1        B       -1        icmp   123  8          0          1          84
20040407.10:04:45.179841  0:00:00   A       -1        B       -1        icmp   123  8          0          1          84
20040407.10:06:27.180037  0:00:00   A       -1        B       -1        icmp   123  8          0          1          84
20040407.10:09:48.420365  0:00:00   A       -1        B       -1        icmp   123  8          0          1          84
20040407.10:11:04.420353  0:00:00   A       3013      B       161       udp    123  -          -          1          95
20040407.10:11:30.420766  0:00:00   A       -1        B       -1        icmp   123  8          0          1          84
20040407.10:12:47.421054  0:00:00   A       3329      B       161       udp    123  -          -          1          95
20040407.10:13:12.423653  0:00:00   A       -1        B       -1        icmp   123  8          0          1          84
20040407.10:14:53.420635  0:00:00   A       -1        B       -1        icmp   123  8          0          1          84

Start Time                Duration  Src IP  Src Port  Dst IP  Dst Port  Proto  TTL  ICMP Type  ICMP Code  # Packets  # Bytes
20040407.10:01:00.181488  0:00:00   B       161       A       1176      udp    63   -          -          1          103
20040407.10:01:23.183291  0:00:00   B       -1        A       -1        icmp   254  0          0          1          84
20040407.10:01:55.180590  0:00:00   B       161       A       1326      udp    63   -          -          1          234
20040407.10:02:54.184537  0:00:00   B       161       A       1514      udp    63   -          -          1          134
20040407.10:03:03.196958  0:00:00   B       -1        A       -1        icmp   254  0          0          1          84
20040407.10:04:45.179965  0:00:00   B       -1        A       -1        icmp   254  0          0          1          84
20040407.10:05:09.180542  0:00:00   B       161       A       1927      udp    63   -          -          1          234
20040407.10:06:27.180159  0:00:00   B       -1        A       -1        icmp   254  0          0          1          84
20040407.10:09:48.420410  0:00:00   B       -1        A       -1        icmp   254  0          0          1          84
20040407.10:11:30.420773  0:00:00   B       -1        A       -1        icmp   254  0          0          1          84
20040407.10:13:12.423663  0:00:00   B       -1        A       -1        icmp   254  0          0          1          84
20040407.10:14:53.421019  0:00:00   B       -1        A       -1        icmp   254  0          0          1          84
Detecting Unusual Modes of Network Traffic Using Clustering
Clusters involving unusual repeated ftp sessions
Further investigations revealed a misconfigured Army computer was trying to contact Microsoft

Start Time                Duration  Src IP  Src Port  Dst IP  Dst Port  Proto  TTL  Flags     Packets  Bytes
20040407.10:10:57.097108  0:00:00   A       3004      B       21        tcp    123  ***AP*SF  7        318
20040407.10:11:27.113230  0:00:00   A       3007      B       21        tcp    123  ***AP*SF  7        318
20040407.10:11:37.111176  0:00:00   A       3008      B       21        tcp    123  ***AP*SF  7        318
20040407.10:11:57.118231  0:00:00   A       3011      B       21        tcp    123  ***AP*SF  7        318
20040407.10:12:17.125220  0:00:00   A       3013      B       21        tcp    123  ***AP*SF  7        318
20040407.10:12:37.132428  0:00:00   A       3015      B       21        tcp    123  ***AP*SF  7        318
20040407.10:13:17.146391  0:00:00   A       3020      B       21        tcp    123  ***AP*SF  7        318
20040407.10:13:37.153713  0:00:00   A       3022      B       21        tcp    123  ***AP*SF  7        318
20040407.10:14:47.178228  0:00:00   A       3031      B       21        tcp    123  ***AP*SF  7        318
20040407.10:15:47.199100  0:00:00   A       3040      B       21        tcp    123  ***AP*SF  7        318
20040407.10:16:07.206450  0:00:00   A       3042      B       21        tcp    123  ***AP*SF  7        318

Start Time                Duration  Src IP  Src Port  Dst IP  Dst Port  Proto  TTL  Flags     Packets  Bytes
20040407.10:00:06.627895  0:00:01   B       21        A       2924      tcp    123  ***AP*SF  7        449
20040407.10:00:16.633872  0:00:01   B       21        A       2925      tcp    123  ***AP*SF  7        449
20040407.10:00:36.638794  0:00:01   B       21        A       2927      tcp    123  ***AP*SF  7        449
20040407.10:01:16.652664  0:00:01   B       21        A       2932      tcp    123  ***AP*SF  7        449
20040407.10:01:26.659694  0:00:01   B       21        A       2933      tcp    123  ***AP*SF  7        449
20040407.10:01:56.666816  0:00:01   B       21        A       2937      tcp    123  ***AP*SF  7        449
20040407.10:02:06.670680  0:00:01   B       21        A       2938      tcp    123  ***AP*SF  7        449
20040407.10:02:56.687932  0:00:01   B       21        A       2944      tcp    123  ***AP*SF  7        449
20040407.10:03:26.698413  0:00:01   B       21        A       2947      tcp    123  ***AP*SF  7        449
20040407.10:04:06.712495  0:00:01   B       21        A       2952      tcp    123  ***AP*SF  7        449
20040407.10:05:06.733731  0:00:01   B       21        A       2961      tcp    123  ***AP*SF  7        449
20040407.10:06:16.758442  0:00:01   B       21        A       2969      tcp    123  ***AP*SF  7        449
[Figure labels. Analysis types: Header Analysis; Packet-Based Signature Detection; Session-Based Signature Detection; Behavior Analysis (MINDS). Detected activity: Simple Scans; Scans with Target Responses; Scans with Automatic Virus Attacks; Anomaly Detection and New Attacks; New and Variant Attacks; Viruses and Worms; Compromises]
Army Research Laboratory (ARL), supported by the AHPCRC and the MINDS initiative, successfully monitors and analyzes network data to protect ARL and its Army and DoD customer infospace
Current MINDS Research and Development Work
• Correlation of suspicious events across network sites
– Helps detect sophisticated attacks not identifiable by single site analyses
– Scalable anomaly detection
– Distributed correlation algorithms
– Grids & middleware
• Analysis of long term data (months/years)
– Uncover suspicious stealth activities (e.g. insiders leaking/modifying information)