Reclaiming Network-wide Visibility Using Ubiquitous End System Monitors E Cooke*, R Mortier (mort), A Donnelly, P Barham, R Isaacs Systems and Networking, Microsoft.


Reclaiming Network-wide Visibility
Using Ubiquitous End System Monitors
E Cooke*, R Mortier (mort), A Donnelly, P Barham, R Isaacs
Systems and Networking, Microsoft Research Cambridge
(* University of Michigan)
1
6 November 2015
The Visibility Crisis
• Visibility into the network is essential
for management: security/availability
• Problem: increasingly opaque traffic
and complex application behavior
obscure our view of the network
• Even collecting every packet at an
upstream router is not enough!
2
Opaque Traffic
• Encryption and tunneling
– Application-level (SSL)
– Network-level (IPSec)
• Where:
– Inter-domain visibility (operations outsourcing)
– Intra-domain visibility (IDS/IPS, flow-based
anomaly detection)
• Experiment: in an 8-day enterprise traffic trace, 93% of the collected
packets were IPSec-encapsulated!
3
Complex Application Behavior
• Example: “checking your email”
– Connect to authentication server to get
credentials
– Authenticate to mail server
– Connect to different services to download mail,
headers, attachments
– While concurrently synchronising address
book, calendar, etc.
• Result: very challenging to reconstruct
application behavior from packets/flows
• Other examples: Skype, Kazaa
4
Back to the Edge
• We must re-think where we measure
• Only end-systems can correctly
attach semantics to the traffic
they send and receive
• Solution: develop a scalable edge-based monitoring platform
5
Edge-Based Flow Monitoring
• Lots of good work on network
monitoring from the edge:
– Neti@Home, ForNet, DIMES, Spoofer
• We have a different objective:
1. Collect every flow on the network
2. Attach application-level semantics to
each flow (e.g. process name, userid)
6
Approach
• Place a monitoring daemon on every end-system in a network
• Each monitor records all flows it sends or
receives
[Figure: enterprise network with routers (R); traffic is captured and stored directly on end-systems]
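The approach above can be illustrated with a minimal sketch of how a monitor might aggregate what it observes into flows. It assumes the monitor receives one event per packet carrying a 5-tuple and a byte count; `PacketEvent` and `FlowTable` are hypothetical names for illustration, not part of the prototype.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PacketEvent:
    """One observed packet (hypothetical event shape, for illustration only)."""
    srcip: str
    dstip: str
    sport: int
    dport: int
    proto: str
    nbytes: int


class FlowTable:
    """Aggregates packet events into per-flow byte and packet counters."""

    def __init__(self):
        # key: (srcip, dstip, sport, dport, proto) -> [bytes, pkts]
        self._flows = defaultdict(lambda: [0, 0])

    def observe(self, ev: PacketEvent) -> None:
        """Add one packet to the counters of the flow it belongs to."""
        key = (ev.srcip, ev.dstip, ev.sport, ev.dport, ev.proto)
        counters = self._flows[key]
        counters[0] += ev.nbytes
        counters[1] += 1

    def snapshot(self):
        """Return {flow_key: (bytes, pkts)} for all flows seen so far."""
        return {k: tuple(v) for k, v in self._flows.items()}
```

Each direction of a connection is kept as a separate flow here, which matches the two per-direction records shown in the prototype's log example.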
7
Feasibility Questions
1. Where can you deploy monitors?
2. How many end-systems must be
instrumented?
3. What data should be collected?
4. How can that data be accessed?
5. What is the performance impact on
end-system monitors?
6. Security/Privacy Implications?
8
Prototype
• To help answer these questions we
constructed a prototype:
– User-space monitoring daemon
– Based on Event Tracing for Windows
(ETW) facility
– Logs observed flows:

ts        srcip          dstip          sport  dport  proto  bytes  pkts  PID   App
11323298  160.128.6.59   160.120.7.201  2323   1005   TCP    1923   603   1374  outlook.exe
11323321  160.120.7.201  160.128.6.59   1005   2323   TCP    13724  1000  1374  outlook.exe
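The logged fields map onto a small record type. The sketch below assumes a whitespace-separated, one-record-per-line layout, which is an assumption made for illustration (the slides do not specify the on-disk format); `FlowRecord` and `parse_flow_line` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    """One logged flow, using the field names from the slide."""
    ts: int
    srcip: str
    dstip: str
    sport: int
    dport: int
    proto: str
    nbytes: int
    pkts: int
    pid: int
    app: str


def parse_flow_line(line: str) -> FlowRecord:
    """Parse one whitespace-separated record (assumed layout, not confirmed)."""
    f = line.split()
    if len(f) != 10:
        raise ValueError(f"expected 10 fields, got {len(f)}")
    return FlowRecord(int(f[0]), f[1], f[2], int(f[3]), int(f[4]),
                      f[5], int(f[6]), int(f[7]), int(f[8]), f[9])


rec = parse_flow_line(
    "11323298 160.128.6.59 160.120.7.201 2323 1005 TCP 1923 603 1374 outlook.exe"
)
```

The `PID` and `App` fields are what distinguish these records from ordinary router-exported flow data: they can only be filled in on the end-system itself.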
9
Where to deploy
• Most practical in environments with
direct control over end-systems:
– Enterprises
– Governments
• Integrate monitoring daemon into
standard client/server OS-images
• Not targeted at home networks, broadband
ISPs, etc.
10
How many end-systems?
• A few hosts contribute most of the traffic
• If we randomly choose 50% of hosts we get 75% of the bytes
[Figure: total byte coverage vs. percentage of hosts instrumented, from an 8-day packet trace from an enterprise network]
11
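A coverage curve of this kind could be estimated from a trace along the following lines. This is a sketch under stated assumptions, not the authors' analysis: a flow is counted as covered if either endpoint runs a monitor (since each monitor records what its host sends or receives), and the function names and input shape are hypothetical.

```python
import random


def byte_coverage(flows, instrumented):
    """Fraction of bytes seen when only `instrumented` hosts run a monitor.

    `flows` is a list of (src_host, dst_host, nbytes) tuples. A flow counts
    as covered if either endpoint is instrumented (assumption).
    """
    total = covered = 0
    for src, dst, nbytes in flows:
        total += nbytes
        if src in instrumented or dst in instrumented:
            covered += nbytes
    return covered / total if total else 0.0


def mean_coverage(flows, hosts, fraction, trials=200, seed=0):
    """Average coverage over random choices of `fraction` of the hosts."""
    rng = random.Random(seed)
    hosts = list(hosts)
    k = round(fraction * len(hosts))
    return sum(
        byte_coverage(flows, set(rng.sample(hosts, k))) for _ in range(trials)
    ) / trials
```

Sweeping `fraction` from 0 to 1 and plotting `mean_coverage` would give a curve with the same axes as the slide's figure.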
Performance Impact
• Flows per second:
– Typical max: ~200 flows/sec
– Typical mean: ~10 flows/sec
• Disk/CPU cost measurements: if we write flows to
disk every 30s then across all systems:
– Mean: 0.73 kB/s
– Max: 71.7 kB/s
• Over one week, total bytes:
– Mean: 64 kB
– Max: ~1.5 GB
[Figure: flows per second, from an 8-day packet trace from an enterprise network]
12
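Writing flows to disk at a fixed interval can be sketched as a small buffering writer. This is an illustrative sketch only: `BufferedFlowWriter`, its `sink` callable, and the injectable `clock` are hypothetical, and the prototype's actual logging path is not described in the slides.

```python
import time


class BufferedFlowWriter:
    """Buffers flow records in memory and flushes them at a fixed interval."""

    def __init__(self, sink, interval_s=30.0, clock=time.monotonic):
        self._sink = sink            # callable taking a list of records
        self._interval = interval_s  # 30s matches the interval on the slide
        self._clock = clock          # injectable so the logic can be tested
        self._buf = []
        self._last_flush = clock()

    def add(self, record) -> None:
        """Buffer a record; flush if the interval has elapsed."""
        self._buf.append(record)
        if self._clock() - self._last_flush >= self._interval:
            self.flush()

    def flush(self) -> None:
        """Hand buffered records to the sink and restart the interval."""
        if self._buf:
            self._sink(self._buf)
            self._buf = []
        self._last_flush = self._clock()
```

In this sketch a flush is only triggered when a record arrives, so an idle host writes nothing; a real daemon would more likely use a timer so that buffered records are not held indefinitely.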
Novel Applications
• Network auditing:
– Determine applications/users using expensive WAN link
• Data-centre management:
– Per-user/per-virtual machine packet accounting
• Capacity Planning:
– Use historical data to predict future network usage
• Network Forensics:
– Application-level intrusion forensics across systems
• Anomaly Detection:
– Produce detailed reports of abnormal application usage
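As one example, the network auditing case could be served by a simple aggregation over logged flow records. The sketch assumes records are available as dicts with `dstip`, `bytes`, and `app` keys, and that traffic crossing the WAN link can be identified by a destination prefix; both are assumptions for illustration, as is the function name `bytes_by_app`.

```python
from collections import Counter
from ipaddress import ip_address, ip_network


def bytes_by_app(records, wan_prefix):
    """Sum bytes per application for flows whose destination is in `wan_prefix`.

    `records` is an iterable of dicts with 'dstip', 'bytes' and 'app' keys
    (assumed shape, mirroring the logged fields).
    Returns (app, total_bytes) pairs, largest first.
    """
    net = ip_network(wan_prefix)
    totals = Counter()
    for r in records:
        if ip_address(r["dstip"]) in net:
            totals[r["app"]] += r["bytes"]
    return totals.most_common()
```

The same aggregation keyed on a user identifier instead of `app` would answer the per-user variant of the question.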
13
Thank You
Questions?
14
Security/Privacy
• Privacy: Storing personal information:
– Flow-level data is already collected on many
networks today
– System only collects data on what a host
already sends or receives
• Security: A malicious user could corrupt
the flow store. Possible mitigations:
– Correlate flows across hosts to find anomalies
– Have the hypervisor/host OS do the data logging
– Store flows in central repositories
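Cross-host correlation could work roughly as follows: when both endpoints of a flow are instrumented, each should have logged it, so a record present on only one side is worth a closer look. This is a minimal sketch with a hypothetical input shape and function name, not a description of an implemented detector.

```python
def unmatched_flows(reports):
    """Find flows reported by one endpoint but not confirmed by the other.

    `reports` maps a reporting host IP to the set of flow keys
    (srcip, dstip, sport, dport, proto) that host logged. Flows whose peer
    is not instrumented (absent from `reports`) cannot be checked and are
    skipped.
    """
    suspicious = []
    for host, flows in reports.items():
        for key in flows:
            srcip, dstip = key[0], key[1]
            peer = dstip if host == srcip else srcip
            if peer in reports and key not in reports[peer]:
                suspicious.append((host, peer, key))
    return sorted(suspicious)
```

A mismatch does not by itself identify which host's store was tampered with; it only narrows the investigation to the pair of hosts involved.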
15