ICSI Detection/Defense
ICSI Work on Detection/Defense
Vern Paxson, Nicholas Weaver, et al
September 20, 2005
Overview
• Forensic analysis of “Witty”
• Internet “Situational Awareness”
• Scan detection
• Detecting “Triggers”
• Preliminary: signature white-listing
• Students: Abhishek Kumar (Georgia Tech), Vinod
Yegneswaran (UWisc), Jaeyeon Jung (MIT), Juan
Caballero (CMU), Jayanthkumar Kannan (UCB),
Christian Kreibich (Cambridge)
Forensic Analysis of Witty
• March 2004 (flaw announced previous day)
• Single UDP packet - stateless spreading
• Exploited flaw in the passive analysis of
Internet Security Systems products
• Payload: slowly corrupt random disk blocks
• Telescope data from UCSD/CAIDA /8
– Also UWisc /8, sampled 1-in-10
Witty Abstract Pseudo-code
1. Seed the PRNG using system time.
2. Send 20,000 copies of self to randomly selected
   destinations.
3. Open physical disk chosen randomly between 0 .. 7.
4. If success:
5.     Overwrite a randomly chosen block on this disk.
6.     Goto line 1.
7. Else:
8.     Goto line 2.
More Detailed Pseudo-code
srand(seed) { X ← seed }
rand() { X ← X*214013 + 2531011; return X }
main()
1. srand(get_tick_count());
2. for(i=0;i<20,000;i++)
3.     dest_ip ← rand()[0..15] || rand()[0..15]
4.     dest_port ← rand()[0..15]
5.     packetsize ← 768 + rand()[0..8]
6.     packetcontents ← top-of-stack
7.     sendto()
8. if(open_physical_disk(rand()[13..15] ))
9.     write(rand()[0..14] || 0x4e20)
10.    goto 1
11. else goto 2
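The per-packet parameter generation in lines 2 to 5 of the pseudo-code can be mirrored offline as a small simulation. This is only a sketch under stated assumptions: the 32-bit wrap-around of X and the reading of `[lo..hi]` as bit positions counted from the most significant bit are assumptions (the latter chosen to agree with the "top 16 bits" wording used later), and the names `Lcg`, `bits`, `packet_params`, and `batch` are hypothetical. Nothing is sent anywhere; the code only produces numbers.

```python
A, C, MASK32 = 214013, 2531011, 0xFFFFFFFF


class Lcg:
    """Linear congruential generator from the pseudo-code (assumed 32-bit state)."""

    def __init__(self, seed):
        self.x = seed & MASK32          # srand(seed) { X <- seed }

    def rand(self):
        self.x = (self.x * A + C) & MASK32
        return self.x


def bits(x, lo, hi):
    """Bits lo..hi of a 32-bit value, counting bit 0 as the MSB (assumption)."""
    width = hi - lo + 1
    return (x >> (31 - hi)) & ((1 << width) - 1)


def packet_params(prng):
    """One loop iteration: four rand() calls yield address, port, and size."""
    ip_hi = bits(prng.rand(), 0, 15)
    ip_lo = bits(prng.rand(), 0, 15)
    dest_ip = (ip_hi << 16) | ip_lo
    dest_port = bits(prng.rand(), 0, 15)
    size = 768 + bits(prng.rand(), 0, 8)
    return dest_ip, dest_port, size


def batch(seed, count=20_000):
    """Parameters of the packets in one batch, for a given seed."""
    prng = Lcg(seed)
    return [packet_params(prng) for _ in range(count)]
```

Because the generator is deterministic, two runs with the same seed yield identical batches, which is the property the forensic analysis relies on.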
Witty Becomes Deterministic
• Given top 16 bits of linear congruential pseudo-random
number generator, can brute-force possible bottom bits to
recover the pseudo-random state
• Keys to the kingdom: infectee operation effectively
becomes deterministic (except for pesky reseeding) with
packets carrying an implicit sequence number
• So, for example, we can compute each infectee’s local
access bandwidth even in the presence of heavy packet
loss (since Windows’ sendto() call is blocking)
– Just based on sequence number of packets seen @ telescope and
the amount of data sent between them
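The brute-force step in the first bullet can be sketched as follows. Each packet exposes the top 16 bits of three consecutive generator states (the two halves of the destination address and the destination port), so trying all 2^16 possible low halves of the first state and checking the next two outputs usually leaves a single candidate. The function names and the 32-bit state width are assumptions for illustration, not the authors' code.

```python
A, C, MASK32 = 214013, 2531011, 0xFFFFFFFF


def step(x):
    """Advance the LCG by one rand() call (assumed 32-bit state)."""
    return (x * A + C) & MASK32


def recover_states(hi0, hi1, hi2):
    """Return every full state X0 whose top 16 bits are hi0 and whose
    next two successors have top 16 bits hi1 and hi2."""
    matches = []
    for low in range(1 << 16):          # brute-force the unknown bottom bits
        x0 = (hi0 << 16) | low
        x1 = step(x0)
        if x1 >> 16 != hi1:
            continue
        x2 = step(x1)
        if x2 >> 16 == hi2:
            matches.append(x0)
    return matches
```

For a packet with destination address `ip` and port `port`, a call would look like `recover_states(ip >> 16, ip & 0xFFFF, port)`.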
Inferred Access Bandwidth of Individual Witty
Infectees
Precise Bandwidth Estimation vs. Rates
Measured by Telescope
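The bandwidth inference behind these plots can be sketched roughly as below: given the recovered generator states of two packets seen at the telescope, walk the generator forward to count the packets sent in between and sum their sizes, then divide by the elapsed time. The state convention (each argument is the state used for the first half of that packet's destination address), the use of payload sizes without header overhead, and all function names are assumptions.

```python
A, C, MASK32 = 214013, 2531011, 0xFFFFFFFF


def step(x):
    """Advance the LCG by one rand() call (assumed 32-bit state)."""
    return (x * A + C) & MASK32


def top_bits(x, count):
    """Most significant `count` bits of a 32-bit value."""
    return x >> (32 - count)


def payload_between(x_first, x_last, max_packets=20_000):
    """Count packets and payload bytes from the first packet (inclusive)
    up to the last packet (exclusive). Returns None if x_last is not reached."""
    x, total = x_first, 0
    for n in range(max_packets + 1):
        if x == x_last:
            return n, total
        x_size = step(step(step(x)))     # 4th rand() call of this packet
        total += 768 + top_bits(x_size, 9)
        x = step(x_size)                 # 1st rand() call of the next packet
    return None


def access_bandwidth_bps(x_first, x_last, t_first, t_last):
    """Estimated sending rate in bits per second, or None if undefined."""
    found = payload_between(x_first, x_last)
    if found is None or t_last <= t_first:
        return None
    _, total = found
    return total * 8 / (t_last - t_first)
```

The estimate does not depend on how many of the intervening packets the telescope actually saw, which is why it holds up under heavy loss.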
srand(seed) { X ← seed }
rand() { X ← X*214013 + 2531011; return X }
main()
1. srand(get_tick_count());
2. for(i=0;i<20,000;i++)
3.     dest_ip ← rand()[0..15] || rand()[0..15]    }
4.     dest_port ← rand()[0..15]                   } 4 calls to rand()
5.     packetsize ← 768 + rand()[0..8]             } per loop
6.     packetcontents ← top-of-stack
7.     sendto()
8. if(open_physical_disk(rand()[13..15] ))         } Plus one more every 20,000
                                                   } packets, if disk open fails
9.     write(rand()[0..14] || 0x4e20)              } Or complete reseeding if not
10.    goto 1
11. else goto 2
Witty Infectee Reseeding Events
• Recall every 20,000 packets, Witty burns a random
number picking a disk to open & trash. For packets
with state Xi and Xj:
– If from the same batch of 20,000 then
• j - i = 0 mod 4
– If from separate but adjacent batches, for which Witty did not
reseed, then
• j - i = 1 mod 4
(but which of the 100s/1000s of intervening packets marked the phase
shift?)
– If from batches across which Witty reseeded, then no
apparent relationship.
• Lets us find the phase of Witty reseeding events …
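The three cases above can be expressed as a small classifier over two recovered states. This is a sketch, not the authors' tooling: it simply walks the generator forward from Xi looking for Xj, and the search limit, function names, and result labels are assumptions.

```python
A, C, MASK32 = 214013, 2531011, 0xFFFFFFFF


def step(x):
    """Advance the LCG by one rand() call (assumed 32-bit state)."""
    return (x * A + C) & MASK32


def steps_between(xi, xj, limit=200_000):
    """Number of rand() calls leading from state xi to xj, or None if
    xj is not reached within `limit` calls."""
    x = xi
    for n in range(limit + 1):
        if x == xj:
            return n
        x = step(x)
    return None


def classify(xi, xj, limit=200_000):
    """Map the distance j - i (mod 4) to the cases listed on the slide."""
    n = steps_between(xi, xj, limit)
    if n is None:
        return "no relationship found"      # reseeded, or simply too far apart
    if n % 4 == 0:
        return "same batch"
    if n % 4 == 1:
        return "adjacent batches, no reseed"
    return "other phase"                    # not covered by the slide's cases
```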
Finding Each Infectee’s Random Seed
• Given the phase of reseeding events …
• … plus the fact that Witty uses uptime (in
msec) for its entropy …
• thus its seeds increase linearly with time …
• plus some computational geometry …
= We can extract each infectee’s random seed
• I.e. we know its uptime
• And, by observing times it didn’t reseed, how
many disks it has attached
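One ingredient of seed recovery can be illustrated with a sketch: the generator is invertible, so a recovered state can be stepped backward and earlier states tested for plausibility as an uptime in milliseconds. This is not the computational-geometry method mentioned above; it is a simpler stand-in, and on its own it yields many false candidates that would still have to be narrowed using the reseeding phase and the linear growth of seeds over time. The function names and the uptime bound are assumptions. `pow(A, -1, m)` requires Python 3.8 or later.

```python
A, C, MASK32 = 214013, 2531011, 0xFFFFFFFF
A_INV = pow(A, -1, 1 << 32)     # modular inverse exists because A is odd


def step(x):
    """Advance the LCG by one rand() call (assumed 32-bit state)."""
    return (x * A + C) & MASK32


def step_back(x):
    """Undo one rand() call."""
    return ((x - C) * A_INV) & MASK32


def candidate_seeds(x, max_steps, max_uptime_ms):
    """Walk backward from state x and collect (steps_back, state) pairs
    whose state is small enough to be a plausible uptime in msec."""
    candidates = []
    for n in range(1, max_steps + 1):
        x = step_back(x)
        if x <= max_uptime_ms:
            candidates.append((n, x))
    return candidates
```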
Uptime of 750 Witty Infectees
Disk Drives Per Witty Infectee
[Bar chart. x-axis: number of drives (1 to 7); y-axis: % of infectees with that number of drives (0 to 60)]
Given Exact Values of Seeds Used for Reseeding …
• More generally, we know every packet each infectee
sent
– Can compare this to when new infectees show up
– i.e. Who-Infected-Whom
Infection Attempts That Were Too Early, Too Late, or Just Right
[Figure label: Infector/Infectee Signature]
Witty is Incomplete
• Recall that the LCG PRNG generates a complete
orbit over a permutation of 0..2^32-1.
• But: Witty author didn’t use all 32 bits of a
single PRNG value
– dest_ip ← (Xi)[0..15] || (Xi+1)[0..15]
– This does not generate a complete orbit!
• Misses 10% of the address space
• Visits 10% of the addresses (exactly) twice
• So, were 10% of the potential infectees
protected?
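The effect is easy to reproduce at toy scale. The sketch below shrinks the generator to a 16-bit state (reducing the same multiplier and increment modulo 2^16, which still gives a full-period generator) and builds 16-bit "addresses" from the top 8 bits of consecutive states. Enumerating the real 2^32 orbit works the same way but is too slow for a short example. The scaled-down parameters and the function names are assumptions, and the toy percentages need not match the 10% figures above.

```python
from collections import Counter

MOD = 1 << 16                 # toy state space (real Witty: 2**32)
HALF = 8                      # toy "top half" width (real Witty: 16)
A = 214013 % MOD
C = 2531011 % MOD


def step(x):
    """Advance the scaled-down LCG by one call."""
    return (x * A + C) % MOD


def coverage():
    """Walk the whole orbit once and count how often each address,
    built from the top halves of consecutive states, is produced.
    Returns (number of addresses never produced, {times_visited: count})."""
    visits = Counter()
    x = 0
    for _ in range(MOD):
        nxt = step(x)
        addr = ((x >> HALF) << HALF) | (nxt >> HALF)
        visits[addr] += 1
        x = nxt
    missed = MOD - len(visits)
    multiplicity = Counter(visits.values())
    return missed, dict(multiplicity)
```

Since the orbit produces exactly as many addresses as exist, every address that is missed is balanced by an extra visit somewhere else.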
Time When Infectees Seen At Telescope
[Figure not reproduced. Annotations:]
• Doubly-scanned infectees infected faster
• In fact, some are infected extremely quickly!
• Unscanned infectees still get infected!
How Do Unscanned Infectees Become Infected?
• Multihomed host infected via another address
• DHCP or NAT aliasing
• But what about the extra-quick ones?
• Either they were passively infected and had a
large cross-section
• Or they were known in advance to the
attacker
Uptime of 750 Witty Infectees
[Figure annotation: Part of a group of 135 infectees from same /16]
Time When Infectees Seen At Telescope
[Figure annotation: Most also belong to that /16]
Witty Started With A “Hit List”
• Initial infectees exhibit super-exponential
growth ⇒ they weren’t found by random
scanning
– (And can in fact show large-scale passive infection
unlikely)
• Prevalent /16 = U.S. military base
• Attacker knew of ISS security software
installation at military site ⇒ ISS insider
(or ex-insider)
– Fits with very rapid development of worm after
public vulnerability disclosure
Are All The Worms In Fact Executing
Witty?
• Answer: No.
• One “infectee” probes addresses not on the orbit,
each of the form A.B.A.B rather than A.B.C.D.
• Each probe contains Witty contagion, but lacks
randomized payload size.
• Shows up very near beginning of trace.
⇒ Patient Zero - machine attacker used to
launch Witty. (Really, Patient Negative One.)
• European retail ISP.
• Communicated to law enforcement.
Implications of Witty Forensics
• Provided a degree of worm attribution
– (truth be told, doesn’t require the full analysis)
• Powerful demonstration of opportunistic
measurement and exploiting structure
• Very labor intensive
• A one-trick pony?
Internet “Situational Awareness”
• Separate from ICSI honeyfarm, at LBL we
operate a 2,560-address honeynet w/ honeyd
responders
• Basic question: how do we tell when it sees
something new …
– … and interesting
• Idea:
1. Characterize “background radiation” in abstract terms
2. Remove any matches, consider remainder “new” …
3. … except first run for a few months to converge on full
set of abstractions
Internet “Situational Awareness”, cont’d
• It doesn’t work.
– There is constant churn in what arrives that’s new
– Though often with very minor variations
• In principle removable, but need better meta-abstractions
for doing so
• Basic question #2: What can we say about
an “event” seen by the honeynet?
– Is it a worm, a botnet, a misconfiguration?
– If a botnet, could it be more than one? Is the
scanning coordinated? How large a region is the
scan targeting?
Internet “Situational Awareness”, cont’d
• It doesn’t work ... Yet.
• Significant noise problems
+ Significant modalities & variations
+ Calibration difficulties
Scan Detection
• TRW (Threshold Random Walk) very
effective at detecting random scanners …
– … at least, at a site’s border
– (we now have some enterprise traces to evaluate)
• What about non-random scanning worms?
– Topological, meta-server
• Idea: detect anomalously high fan-out rate
– But with what detection threshold? Too low and
busy hosts trigger false positives. Too high and
worm can fly under the radar.
Applying Sequential Hypothesis Testing
to Rate-based Detection
• Idea: per-host, learn its past rate of
contacting new hosts
– This becomes its Bayesian prior for non-infection
– Hypothesize higher rate for infected hosts
– As new contacts made, apply SHT to decision
between infection/non-infection
• Benefits:
– No single fixed detection threshold
– Host’s behavior somewhat integrated over
multiple time scales by updates to SHT
RBS (Rate-Based Seq. Hyp. Testing)
• Math based on Poisson arrivals for hosts contacting
new destinations (not too bad an assumption)
• Evaluated on partial enterprise traces
– Proxies for topological scanners: internal security scanner,
web crawlers, printer manager, service monitor
– Prior for benign fan-out rate: 3.8 Hz
• Preliminary: works fairly well, ≈ 1 FP/hr
– Also assess hybrid, RBS+TRW
• But:
– FP high enough to make automatic response problematic
– Topological worm can still spread very fast @ 3.8 Hz if
avoids TRW’s failure detection
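A minimal sketch of a rate-based sequential hypothesis test under the Poisson assumption is shown below. With Poisson arrivals, the gaps between first contacts are exponentially distributed, so each new contact updates a log-likelihood ratio between an "infected" rate and a "benign" rate, compared against Wald's two thresholds. The benign rate of 3.8 Hz comes from the slide; the infected rate of 38 Hz, the target detection and false-positive probabilities, and the function name are assumptions chosen only for illustration.

```python
import math


def rbs_decide(interarrivals, rate_benign=3.8, rate_infected=38.0,
               p_detect=0.99, p_false=0.01):
    """Sequential test over the gaps (in seconds) between a host's
    successive contacts to new destinations.
    Returns (decision, number_of_contacts_used)."""
    upper = math.log(p_detect / p_false)               # accept "infected"
    lower = math.log((1 - p_detect) / (1 - p_false))   # accept "benign"
    gain = math.log(rate_infected / rate_benign)
    llr = 0.0
    for n, gap in enumerate(interarrivals, start=1):
        # log of [infected exponential density / benign exponential density]
        llr += gain - (rate_infected - rate_benign) * gap
        if llr >= upper:
            return "infected", n
        if llr <= lower:
            return "benign", n
    return "pending", len(interarrivals)
```

In practice a host judged benign would have the test restarted so that later behavior is still evaluated; that bookkeeping is omitted here.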
DNS-Based Scan Detection
• Previous work: watch DNS traffic to detect random-address
scanners because not preceded by name lookup
• Idea (preliminary): for non-random scanning worms,
use a site’s DNS server to gain insight into what
can’t otherwise be seen
– The hope: even if scanning activity occurs within an
unmonitored subnet, for topological worms will still often be
preceded by DNS lookup that is seen at DNS server
• Assessed on traces from LBL’s name servers
• Problem: there are a lot of hosts with significant
DNS fan-out (also, surprisingly, a lot of failure to
cache previous answers)
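The "previous work" heuristic in the first bullet can be sketched as a small tracker: remember which addresses each host has received in DNS answers, and count connections to addresses it never looked up. The class name, method names, and threshold are hypothetical, and a real monitor would also expire old answers and handle caching.

```python
from collections import defaultdict


class DnsScanDetector:
    """Flags hosts that contact many addresses without a preceding lookup."""

    def __init__(self, threshold=5):
        self.threshold = threshold
        self.resolved = defaultdict(set)     # host -> IPs seen in DNS answers
        self.unresolved = defaultdict(set)   # host -> IPs contacted w/o lookup

    def on_dns_answer(self, host, ip):
        """Record that `host` received `ip` in a DNS answer."""
        self.resolved[host].add(ip)

    def on_connection(self, host, dst_ip):
        """Record an outgoing connection; return True once the host has
        contacted `threshold` distinct addresses it never resolved."""
        if dst_ip not in self.resolved[host]:
            self.unresolved[host].add(dst_ip)
        return len(self.unresolved[host]) >= self.threshold
```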
DNS-Based Scan Detection, cont’d
• Another idea: analyze DNS lookups to spot potential
contact graphs
– I.e., A looks up B which then looks up C which looks up D
• Somewhat more promising, but:
– Needs to work on short chains, since trouble likely grows
exponentially with chain-length
– Trace evaluation finds clusters of hosts that frequently look
each other up. Need to distinguish these from true contact
graphs (by training? by a “tell”?)
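The chain idea (A looks up B, which then looks up C, which looks up D) can be sketched as a search over a lookup log. The input format, the time window between successive lookups, and the function name are assumptions; the sketch makes no attempt to separate benign clusters of mutually chatty hosts from true contact graphs, which is the open problem noted above.

```python
from collections import defaultdict


def find_chains(lookups, min_len=3, window=60.0):
    """lookups: iterable of (timestamp, client, target) tuples.
    Returns host chains [A, B, C, ...] in which each host looks up the
    next one within `window` seconds after it was itself looked up,
    keeping only chains with at least `min_len` lookups."""
    by_client = defaultdict(list)
    for t, client, target in lookups:
        by_client[client].append((t, target))

    chains = []

    def extend(chain, t_last):
        host = chain[-1]
        extended = False
        for t, target in by_client.get(host, []):
            if t_last < t <= t_last + window and target not in chain:
                extended = True
                extend(chain + [target], t)
        if not extended and len(chain) - 1 >= min_len:
            chains.append(chain)

    for t, client, target in lookups:
        extend([client, target], t)
    return chains
```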
Detecting “Triggers”
• Observation: many forms of successful attack/abuse
manifest as incoming traffic to a host H triggering H to
initiate/receive connections it otherwise wouldn’t:
– “Phone home” signal on successful exploits
  • Also done by opening up a new port that’s probed by
  attackware to determine success
– Incoming worm traffic triggers outgoing scanning
– Incoming email/IRC triggers outgoing email/IRC
• Idea: such triggers manifest as apparently unrelated
connections occurring closer in time than should
happen just due to chance
Detecting “Triggers”, cont’d
• Mathematical framework assumes that application
sessions are well-modeled as a Poisson process.
• Compute probability that two independent Poisson
processes would occur as close together as
observed. If low, flag as anomalous.
• Requires recognizing known session structure, e.g.,
FTP user connection + FTP data connections … +
optional ident connection. Or: SMTP in to known
server (again w/ optional ident) that leads to SMTP
out as it forwards it.
– We codified 39 of these
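One simple way to make the second bullet concrete, offered as a sketch rather than as the framework actually used: if sessions of some type start at a host as a Poisson process with rate λ, the chance that one starts within Δ seconds after an unrelated event is 1 - exp(-λΔ). A small value means the observed closeness is unlikely to be coincidence. The one-sided window, the significance level, and the function names are assumptions.

```python
import math


def coincidence_probability(rate_per_sec, gap_sec):
    """Probability that an independent Poisson process with the given rate
    produces at least one arrival within `gap_sec` seconds."""
    return 1.0 - math.exp(-rate_per_sec * gap_sec)


def is_suspicious(rate_per_sec, gap_sec, alpha=0.01):
    """Flag a pair of sessions whose closeness would occur by chance
    with probability below `alpha`."""
    return coincidence_probability(rate_per_sec, gap_sec) < alpha
```

For example, an outgoing session type that normally starts about once an hour but appears 2 seconds after an incoming connection gives a chance probability of roughly 0.0006 and would be flagged.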
Detecting “Triggers”, cont’d
• This works! … in terms of finding “hidden causality”,
i.e., connections that are related even though not
part of one of the recognized sessions.
• This doesn’t work! … in terms of assuming that such
hidden causality reflects abuse.
– Instead, it nearly always means we’ve found a new type of
(benign) application session.
– Prevalence could be skewed by degree to which LBL’s
traffic includes a very diverse set of applications.
– We got the FP rate down to a few dozen per day; not good
enough. Serves as good anomaly signal but not actionable.
• We’re now thinking about recasting in terms of
automatically discovering session structure.
Signature White-listing
• Problem: when automatically distilling signatures
(e.g., from honeypot traffic), how do we ensure that
the signature doesn’t reflect benign/common
protocol elements?
– E.g., USER-AGENT: Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98)
• Idea: run signature distillation over large corpus of
mostly benign traffic, identify frequently occurring
protocol elements for white-listing
• Status: basic algorithms developed, preliminary test
on HTTP traces promising …
– … with key questions being how will it scale to sufficiently
large datasets …
– … and will this suffice to construct a complete enough list?
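The white-listing idea can be sketched with plain frequency counting: treat each benign session as a set of protocol elements (for HTTP, header lines), keep the elements that occur in at least some fraction of sessions, and strip those from candidate signatures. This stands in for the signature-distillation algorithms mentioned above, which are not described here; the function names, input format, and cutoff are assumptions.

```python
from collections import Counter


def build_whitelist(benign_sessions, min_fraction=0.05):
    """benign_sessions: list of sessions, each a list of protocol elements.
    Returns the elements present in at least `min_fraction` of sessions."""
    session_freq = Counter()
    for session in benign_sessions:
        session_freq.update(set(session))    # count each element once per session
    cutoff = min_fraction * len(benign_sessions)
    return {element for element, n in session_freq.items() if n >= cutoff}


def filter_signature(elements, whitelist):
    """Drop white-listed elements from a candidate signature."""
    return [e for e in elements if e not in whitelist]
```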
(Additional Slides Re Witty Analysis)