Your Botnet is my Botnet: Analysis of a Botnet Takeover
BRETT STONE-GROSS, MARCO COVA, LORENZO
CAVALLARO, BOB GILBERT, MARTIN SZYDLOWSKI,
RICHARD KEMMERER, CHRISTOPHER KRUEGEL,
AND GIOVANNI VIGNA
PRESENTATION BY SAM KLOCK
Background
Botnet: network of
machines compromised
by malware (bots) under
control of external agent
Botmaster: agent
controlling a botnet
Command and control
(C&C): mechanism by
which botmaster controls
a botnet
Motivation
Botnets: big and growing security issue on the
Internet
More broadband Internet access makes them easier to build
Wealth of information transported makes them profitable
Sizeable botnets can participate in large-scale malicious acts
We want to know more about them
How do they grow?
What can they do?
How do we address the threats existing and potential botnets
pose?
How do we preempt their growth (address user
vulnerabilities)?
Problem
Analyzing botnets is difficult
Topologies vary: top-down, P2P,
random
Protocols and goals vary
Sizes vary widely
Several techniques are typical
Passive analysis: collect spam
likely sent from bots; analyze
query patterns to DNS/DNSBL;
examine network traffic
Can’t scale to entire Internet
Some metrics only work for
botnets engaging in certain
activities
Infiltration: join the botnet and
monitor
Most botnets avoid supplying
information to member bots
Images: Wang, Sparks, Zou, “An Advanced Hybrid Peer-to-Peer Botnet”, in
IEEE Transactions on Dependable and Secure Computing, 7(2): 113-127.
Approach
Hijack the botnet
Idea: investigate botnet C&C, then tamper with it
Learn about botnet behavior from perspective of botmaster
Two ways to accomplish
Seize botmaster’s C&C machines
Law enforcement’s job
Better: collaborate with DNS providers
Goal: redirect C&C traffic to us
Then mimic C&C behavior
Approach depends on targeted botnet
Target: Torpig
“One of the most advanced pieces of crimeware ever
created”
Mainly harvests personal information
Opens ports for HTTP and SOCKS on victim machines
Useful for anonymous browsing, sending spam
Not yet clear what Torpig does with them
Good candidate for DNS-based hijacking
Centralized C&C
Bots identify C&C via domain names
Communication via HTTP
Torpig vs. Others
Torpig has interesting characteristics
Domain flux
Bot identifiers
A lot of harvested information
Implementable protocol
Past attempts:
Conficker: no bot identifiers, protocol authentication
Size estimation is hard
Without authentication: no data
Kraken: no data collection (spam sending)
Little insight into data harvesting
Torpig Characteristics
Basic idea: Trojan horse
built on a rootkit
Uses Mebroot
Attack via drive-by-download
Vulnerable web server
Vulnerable client/OS
Install Mebroot, then install
Torpig malware
(0) Inject URL
(1) Client HTTP GET
(2) Deliver injected URL
(3) Client HTTP GET from
DbD server
(4) Download & run
Mebroot
Torpig Characteristics (cont’d)
(5) Fetch Torpig libraries
(6) Configure, monitor
(7) Execute man-in-the-browser
phishing
Bot Behavior
Periodic C&C
communication
~20 minutes
Uploads harvested data
Server responds okn or okc
Man-in-the-browser more
complex
List of targeted URLs
Requests sent to special
injection server
Bypasses SSL, certificates, etc.
Can be hijacked (not attempted
here)
Hijacking Torpig
Domain flux
Related to fast flux
C&C hidden behind shifting
domains
Bots generate list of domains
to check periodically
Iterate through list; stop on
valid response
Domain generation
algorithm (DGA) reverse-engineered
Figure: pseudocode for daily DGA
Botmaster didn’t register
domains in advance: big
weakness
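The domain-flux loop described above can be sketched as follows. This is an illustrative toy, not Torpig's reverse-engineered algorithm: the hash-based generator, the TLD list, and the names `candidate_domains` and `find_cc` are all assumptions. It shows only the core idea that a bot and a defender can compute the same date-seeded list, so whoever registers a domain first receives the bot traffic.

```python
import hashlib
from datetime import date
from typing import Callable, Optional

TLDS = ("com", "net", "biz")  # assumed for illustration


def candidate_domains(day: date, per_day: int = 3) -> list[str]:
    """Derive a deterministic list of domain names from the date (toy DGA)."""
    domains = []
    for i in range(per_day):
        seed = f"{day.isoformat()}-{i}".encode()
        label = hashlib.sha256(seed).hexdigest()[:10]
        domains.append(f"{label}.{TLDS[i % len(TLDS)]}")
    return domains


def find_cc(day: date, is_valid: Callable[[str], bool]) -> Optional[str]:
    """Walk the list in order and stop at the first domain that answers validly."""
    for domain in candidate_domains(day):
        if is_valid(domain):
            return domain
    return None
```

Because the list depends only on the date, anyone who knows the algorithm can precompute future domains, which is what makes unregistered domains a weakness.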
Hijacking Torpig (cont’d)
Conceptually simple given
the DGA, the protocol, and
botmaster carelessness
Register domains first
Mimic protocol
(encryption easily broken)
Not a general approach
Conficker: 50,000
domains per day
Nondeterministic
Estimated cost: > $91.3m
per year
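A quick check of the cost figure, as a sketch. The per-domain price is an assumption: $5 is simply the value that reproduces the slide's $91.3m from 50,000 domains per day, and it is not stated in these slides.

```python
DOMAINS_PER_DAY = 50_000      # from the slide
DAYS_PER_YEAR = 365
PRICE_PER_DOMAIN_USD = 5.00   # assumption, back-computed from the $91.3m figure


def yearly_registration_cost(domains_per_day: int, price_usd: float) -> float:
    """Cost of registering every generated domain for a full year."""
    return domains_per_day * DAYS_PER_YEAR * price_usd


cost = yearly_registration_cost(DOMAINS_PER_DAY, PRICE_PER_DOMAIN_USD)
print(f"${cost / 1e6:.2f}m per year")  # prints "$91.25m per year"
```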
In practice:
Two different hosting
providers, two different
registrars
Redundancy
Apache handled requests
Collected data
downloaded, then
removed from hosts
Total: 8.7 GB Apache logs,
69 GB pcap
Up three weeks, collected
ten days
Hijacking Torpig (cont’d)
Legal/ethical
implications
Botnet is a criminal
instrument
Precedent in past
research
Follow-through:
No new config (okn only)
Shared data with DoD,
FBI, ISPs
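The "okn only" policy can be sketched as a minimal HTTP endpoint. This is a hypothetical illustration, not the researchers' Apache setup: the handler, the port, and the function names are assumptions, and the obfuscation layer of the real protocol is omitted. The point is that the server acknowledges every upload and never sends a configuration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

ACK_NO_CONFIG = b"okn"  # plain acknowledgement; "okc" would carry a new config


def sinkhole_reply(_request_body: bytes) -> bytes:
    """Always acknowledge without configuration, whatever the bot uploads."""
    return ACK_NO_CONFIG


class SinkholeHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        # Read the uploaded body so the connection is drained cleanly.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        reply = sinkhole_reply(body)
        self.send_response(200)
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)


def run(port: int = 8080) -> None:
    """Serve on localhost until interrupted (port is an arbitrary choice)."""
    HTTPServer(("127.0.0.1", port), SinkholeHandler).serve_forever()
```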
Torpig Data Format
Communication via HTTP
POST
URL: bot ID (nid), header
Body: stolen data
Header info:
ts
ip
hport, sport
os, cn
bld, ver
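One way to model the header fields, as a sketch. The slides do not show the wire encoding, so this assumes the header has already been decoded into `key=value` pairs; the `TorpigHeader` type, `parse_header`, the field glosses in the comments, and every value in the sample string are invented for illustration.

```python
from dataclasses import dataclass
from urllib.parse import parse_qsl


@dataclass(frozen=True)
class TorpigHeader:
    nid: str    # bot identifier
    ts: int     # timestamp field
    ip: str     # IP address field
    hport: int  # HTTP proxy port (assumed reading of "hport")
    sport: int  # SOCKS proxy port (assumed reading of "sport")
    os: str     # operating system field
    cn: str     # locale/country field
    bld: str    # build
    ver: str    # version


def parse_header(query: str) -> TorpigHeader:
    """Parse an already-decoded header given as key=value pairs."""
    f = dict(parse_qsl(query))
    return TorpigHeader(
        nid=f["nid"], ts=int(f["ts"]), ip=f["ip"],
        hport=int(f["hport"]), sport=int(f["sport"]),
        os=f["os"], cn=f["cn"], bld=f["bld"], ver=f["ver"],
    )


# Invented sample values, for illustration only.
sample = "nid=0123456789ABCDEF&ts=0&ip=192.0.2.10&hport=8080&sport=1080&os=5.1&cn=US&bld=dxtrbc&ver=1"
header = parse_header(sample)
```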
Torpig Data Collected
Analysis: Botnet Size
nid may be used to
count bots
Computed from HDD
model/serial
Not completely unique:
couple with os, cn, bld,
ver
Subtract researchers,
probes, casual machines
Found 182,800 likely
infected hosts
Identifying researchers
Intuition: analyze in
controlled environment
Use virtual machine
VMs have default
hardware specs (HDD
model/serial)
Eliminate nids computed
from VM defaults
Discounted 40 hosts
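The counting rule above can be sketched as a set operation. This is a minimal illustration under stated assumptions: `Submission`, `count_bots`, and the toy records are hypothetical, and the removal of probes and casual machines is left out.

```python
from typing import Iterable, NamedTuple


class Submission(NamedTuple):
    nid: str
    os: str
    cn: str
    bld: str
    ver: str


def count_bots(submissions: Iterable[Submission], vm_nids: frozenset[str]) -> int:
    """Count distinct (nid, os, cn, bld, ver) tuples, skipping nids derived from VM defaults."""
    return len({s for s in submissions if s.nid not in vm_nids})


toy = [
    Submission("A1", "xp", "US", "dxtrbc", "1"),
    Submission("A1", "xp", "US", "dxtrbc", "1"),  # same bot reporting again
    Submission("A1", "xp", "IT", "mentat", "1"),  # nid collision, different machine
    Submission("VM", "xp", "US", "dxtrbc", "1"),  # researcher's virtual machine
]
print(count_bots(toy, frozenset({"VM"})))  # prints 2
```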
Analysis: Botnet Size (cont’d)
Much more accurate than IP
counting
DHCP churn causes overcount
706 machines: > 100 IPs
One host: 694 unique IPs
NAT causes undercount
1,247,642 unique IPs vs.
182,800 est. bots
Traffic characteristics
Peaks at 9am PST, troughs
9pm PST
Within hour: unique IPs ≈
unique bots
Within day: unique IPs >
unique bots
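A toy illustration of why IP counts and bot counts diverge. The records and the helper `count_ips_and_bots` are invented; only the two totals in the final ratio come from the slide.

```python
from typing import Iterable


def count_ips_and_bots(records: Iterable[tuple[str, str]]) -> tuple[int, int]:
    """Return (unique IPs, unique bot IDs) for (bot_id, ip) records."""
    records = list(records)
    return len({ip for _, ip in records}), len({bot for bot, _ in records})


# DHCP churn: one bot seen from three addresses, so IP counting overcounts.
churn = [("bot-a", "198.51.100.1"), ("bot-a", "198.51.100.2"), ("bot-a", "198.51.100.3")]
# NAT: two bots behind one address, so IP counting undercounts.
nat = [("bot-b", "203.0.113.7"), ("bot-c", "203.0.113.7")]

print(count_ips_and_bots(churn))  # prints (3, 1)
print(count_ips_and_bots(nat))    # prints (1, 2)

# Ratio of the slide's totals: unique IPs per estimated bot.
print(f"{1_247_642 / 182_800:.1f}")  # prints 6.8
```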
Analysis: Botnet Growth
Most bots in U.S.,
Germany, Italy
Intuition: targeted
websites mainly English,
German, Italian
IP counting overestimates
Italian/German infections
Found 49,294 new
infections
Most on Jan 25, 27
How? ts = 0
Analysis: Botnet as Service
Why bld?
Twelve different values
Some values more active
than others
dxtrbc: 5,432,528
submissions
mentat: 1,582,547
submissions
Features do not seem to
differ from build to build
Explanation: customers
Treat bld as identifier for
customers
Output can be processed
per customer, according to
payment and needs
Q: Paper doesn’t
mention distribution of
builds over members.
Could build activity be
attributable to that?
Analysis: Stolen Data
Institutional data
8,310 accounts, 410
institutions
Paypal (1,770)
Poste Italiane (765)
Capital One (314)
E*Trade (304)
Chase (214)
310 institutions: < 10
accounts
Notifying victims:
complicated
38% of credentials stolen
from browser password managers
Analysis: Stolen Data (cont’d)
Credit cards
Checked prefixes, used Luhn
heuristic
Found 1,660 unique
debit/credit card numbers
1,056 Visa
447 MasterCard
81 American Express
36 Maestro
24 Discover
49% in U.S., 12% in Italy, 8%
in Spain, rest in 40 others
86%: only one card number
One case: 30 numbers
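The Luhn heuristic mentioned above is a standard checksum, sketched here. The function name is an assumption, the issuer-prefix check is omitted, and the inputs are well-known test numbers, not harvested data.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the product
        total += d
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))  # prints True
print(luhn_valid("4111 1111 1111 1112"))  # prints False
```

Passing the checksum only means a string is well formed as a card number; it says nothing about whether the card exists or is active.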
Value (via Symantec):
$0.10 to $25 per card
$10 to $1000 per account
$83k to $8.3m over ten
days: profitable
Assumes all data is fresh
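The value range follows from multiplying the counts by the quoted price bands. A minimal sketch, taking the 1,660 cards from this slide and the 8,310 accounts from the previous one; the function name is an assumption.

```python
CARDS = 1_660      # unique card numbers
ACCOUNTS = 8_310   # accounts at financial institutions


def haul_value(card_price: float, account_price: float) -> float:
    """Total value in USD if every card and account sells at the given price."""
    return CARDS * card_price + ACCOUNTS * account_price


low = haul_value(0.10, 10)
high = haul_value(25, 1_000)
print(f"${low:,.0f} to ${high:,.0f}")  # prints "$83,266 to $8,351,500"
```

Bank accounts dominate both ends of the range; cards contribute less than 1% of either bound.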
Analysis: Proxies and Other Uses
HTTP/SOCKS proxies
20.2% of machines publicly
accessible
Looked at 10,000 most
active IPs
Most likely to be used
Checked IPs against
Spamhaus list
One is known spammer
244 flagged as proxies or
malware-infected
Conclusion: usable, but
can’t claim current use
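Blocklist checks of this kind are usually done as DNSBL queries: reverse the IPv4 octets, append the list's zone, and resolve the name. The sketch below only builds the query name. The choice of the `zen.spamhaus.org` zone and the function name are assumptions; the slide says only "Spamhaus list".

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str = "zen.spamhaus.org") -> str:
    """Build the DNS name to resolve when checking an IPv4 address against a DNSBL."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")  # validates the address
    return ".".join(reversed(octets)) + "." + zone


print(dnsbl_query_name("192.0.2.99"))  # prints "99.2.0.192.zen.spamhaus.org"
```

An address is listed if the name resolves (conventionally to an address in 127.0.0.0/8) and unlisted if the lookup returns NXDOMAIN.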
Distributed denial-of-service
(DDoS)
Question: how much
bandwidth?
Looked up connection types
for IPs via ip2location
65% of analyzable IPs used
cable/DSL
Low baseline of 435 kbps
upstream: 19 Gbps total
Add in corporate connections
(22%) – much higher
Caveat: could not look up
for two-thirds of hosts
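The bandwidth estimate is a product of host count and per-host upstream rate. The slide gives the rate and the total but not the host count, so this sketch inverts the formula to show what count the two figures imply; the function names are assumptions and the implied count is derived, not quoted.

```python
def aggregate_gbps(hosts: int, upstream_kbps: float) -> float:
    """Aggregate upstream bandwidth in Gbps for `hosts` machines."""
    return hosts * upstream_kbps * 1_000 / 1e9


def implied_hosts(total_gbps: float, upstream_kbps: float) -> float:
    """Number of hosts needed to reach `total_gbps` at the given per-host rate."""
    return total_gbps * 1e9 / (upstream_kbps * 1_000)


print(round(implied_hosts(19, 435)))  # prints 43678
```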
Analysis: Passwords
Sophos poll (March
2009): 33% of Internet
users use poor password
practices (n = 676)
Torpig supplied a lot of
passwords: we can
validate
297,262 user/password
pairs from 52,540
machines
28% reused passwords
for 368,501 sites, similar
to Sophos
Password strength
Fed 173,686 unique passwords
to John the Ripper
65 minutes: ~56,000 cracked
(simple replacement)
+10 minutes: ~14,000 cracked
(wordlist)
+24 hours: ~30,000 cracked
(brute force)
40% cracked in < 75 mins
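The 40% figure can be checked from the stage counts above. A small sketch using the slide's approximate numbers; the names `STAGES` and `cumulative_progress` are assumptions.

```python
UNIQUE_PASSWORDS = 173_686  # from the slide

# (mode, additional minutes, approximate passwords cracked in that stage)
STAGES = [
    ("simple replacement", 65, 56_000),
    ("wordlist", 10, 14_000),
    ("brute force", 24 * 60, 30_000),
]


def cumulative_progress(stages, total):
    """Yield (mode, minutes elapsed, cumulative cracked, cumulative fraction)."""
    minutes = cracked = 0
    for mode, extra_minutes, newly_cracked in stages:
        minutes += extra_minutes
        cracked += newly_cracked
        yield mode, minutes, cracked, cracked / total


for mode, minutes, cracked, fraction in cumulative_progress(STAGES, UNIQUE_PASSWORDS):
    print(f"{mode:>18}: {minutes:>5} min, {cracked:>7,} cracked ({fraction:.1%})")
```

After the first 75 minutes about 70,000 of 173,686 passwords are recovered (roughly 40%); the additional day of brute force brings the total to about 100,000 (roughly 58%).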
Conclusion
Contributions:
Comprehensive analysis of Torpig
Insight into victims
Usability of botnets for fun, profit, attack
Lessons:
IP-counting wildly imprecise. Do not use it
User culture is a big problem
Lots of passwords were guessed easily in this sample
Intuition: users do not understand usage risks
Solution: educate, educate, educate
Coordination with registrars, hosting facilities, victim
institutions, law enforcement is hard
Makes redressing victims difficult
Solution: regulatory intervention