Transcript Document

Resource-Freeing Attacks:
Improve Your Cloud Performance
(at Your Neighbor's Expense)
(Venkat)anathan Varadarajan,
Thawan Kooburat,
Benjamin Farley,
Thomas Ristenpart,
and Michael Swift
DEPARTMENT OF COMPUTER SCIENCES
1
Public Clouds (EC2, Azure, Rackspace, …)
Multi-tenancy
Different customers’
virtual machines (VMs)
share same server
Why multi-tenancy?
Improved resource utilization
2
Implications of Multi-tenancy
• VMs share many resources
– CPU, cache, memory, disk, network, etc.
• Virtual Machine Managers (VMM)
– Goal: Provide Isolation
• Deployed VMMs don’t
perfectly isolate VMs
– Side-channels [Ristenpart et al. ’09, Zhang et al. ’12]
Today: Performance degraded by other customers
3
Contention in Xen
3x-6x Performance loss → Higher cost
[Figure: Performance Degradation (%), y-axis 0 to 600, for contention over CPU, Net, Disk, and Cache. Annotations: "Work-conserving scheduling", "Non-work-conserving CPU scheduling".]
Local Xen Testbed
Machine: Intel Xeon E5430, 2.66 GHz
CPU: 2 packages each with 2 cores
Cache Size: 6MB per package
4
What can a tenant do?
Ask provider for better isolation
… requires overhaul of the cloud
Pack up VM and move
(See our SOCC 2012 paper)
… but not all workloads are cheap
to move
This work: Greedy customer can recover
performance by interfering with other tenants
Resource-Freeing Attack
5
Resource-freeing attacks (RFAs)
• What is an RFA?
• RFA case studies
1. Two highly loaded web server VMs
2. Last Level Cache (LLC) bound VM and
highly loaded webserver VM
• Demonstration on Amazon EC2
6
The Setting
Victim:
– One or more VMs
– Public interface (e.g., HTTP)
Beneficiary:
– VM whose performance we want
to improve
Helper:
– Mounts the attack
Beneficiary and victim fighting
over a target resource
[Diagram: Victim VM, Beneficiary VM, Helper]
7
Example: Network Contention
• Beneficiary & Victim
– Apache webservers hosting static and
dynamic (CGI) web pages.
– Half the network bandwidth
• Target Resource: Network Bandwidth
• Work-conserving scheduler
What can you do?
[Diagram: Local Xen test bed with Beneficiary, Victim, Clients, Net]
8
Ways to Reduce Contention?
Break into victim VM and disable it
The good:
frees up resources used by victim
But:
• Requires knowledge of
vulnerability
• Drastic
• Easy to detect
[Diagram: Local Xen test bed with Helper, Beneficiary, Victim, Clients, Net]
9
Ways to Reduce Contention?
Do a simple DoS attack?
Backfires: May increase the
contention
This may NOT free up target
resources
[Diagram: Local Xen test bed with Helper, Beneficiary, Victim, Clients, Net. Label: "SYN flood".]
10
Recipe for a Successful RFA
Shift resource usage away from the target resource
towards the bottleneck resource
Shift resource usage
via public interface
[Figure: Proportion of CPU usage vs. Proportion of Network usage. Labels: "Static pages", "CPU intensive dynamic pages", "Push towards CPU bottleneck", "Reduce target resource usage", "Limits".]
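The recipe on this slide can be illustrated with a toy model. This is a minimal sketch, not the authors' measurement setup: the function name, the per-request CPU and bandwidth costs, and the proportional-slowdown rule under CPU saturation are all assumptions chosen only to show the direction of the effect.

```python
def victim_usage(static_rps, cgi_rps, cpu_capacity=1.0,
                 static_cpu=0.0002, cgi_cpu=0.01,
                 static_kb=100.0, cgi_kb=2.0):
    """Toy model of a web server VM (all costs are hypothetical).

    static_cpu / cgi_cpu: CPU-seconds needed per request.
    static_kb / cgi_kb:   kilobytes sent per response.
    Returns (bandwidth_kb_per_s, cpu_demand).
    """
    cpu_demand = static_rps * static_cpu + cgi_rps * cgi_cpu
    # Assumption: once CPU demand exceeds capacity, every request class
    # is slowed down by the same factor.
    scale = min(1.0, cpu_capacity / cpu_demand) if cpu_demand > 0 else 1.0
    bandwidth = scale * (static_rps * static_kb + cgi_rps * cgi_kb)
    return bandwidth, cpu_demand


# Victim serving only static pages: network heavy, CPU light.
bw_no_rfa, cpu_no_rfa = victim_usage(static_rps=2000, cgi_rps=0)

# Same victim while a helper adds CPU intensive CGI requests.
bw_rfa, cpu_rfa = victim_usage(static_rps=2000, cgi_rps=300)

print(f"no RFA: {bw_no_rfa:.0f} KB/s, CPU demand {cpu_no_rfa:.2f}")
print(f"w/ RFA: {bw_rfa:.0f} KB/s, CPU demand {cpu_rfa:.2f}")
```

In this model the CGI requests push the victim into a CPU bottleneck, so it serves fewer static pages and uses less of the target resource (network bandwidth).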
11
An RFA in Our Example
Result in our testbed:
Increases beneficiary’s share of
bandwidth: 50% → 85%
No RFA: 1800 page requests/sec
W/ RFA: 3026 page requests/sec
[Diagram: Helper sends CGI Request; labels: CPU Utilization, Clients, Net]
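As a quick check on the testbed numbers above, the relative gains can be computed directly. The helper function name is hypothetical; the inputs are the figures quoted on this slide.

```python
def relative_gain(before, after):
    """Fractional increase going from `before` to `after`."""
    return after / before - 1.0


# Figures quoted on the slide.
throughput_gain = relative_gain(1800, 3026)   # page requests/sec
share_gain = relative_gain(0.50, 0.85)        # share of bandwidth

print(f"throughput gain: {throughput_gain:.1%}")
print(f"bandwidth share gain: {share_gain:.1%}")
```

The two gains come out close to each other (roughly 68% and 70%), which is what one would expect if the beneficiary's throughput is limited mainly by its share of network bandwidth.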
12
Resource-freeing attacks
1) Send targeted requests to victim
2) Shift resource use from the target to a bottleneck
Can we mount RFAs when target
resource is CPU cache?
Shared CPU Cache:
– Ubiquitous: Almost all workloads need cache
– Hardware controlled: Not easily isolated via
software
– Performance Sensitive: High performance cost!
13
Cache Contention
[Figure: Cache Performance Degradation (%), y-axis 0 to 250, vs. Webserver Request Rate, x-axis up to 3000. Annotation: "RFA Goal".]
14
Case Study: Cache vs. Network
• Victim: Apache webserver hosting static
and dynamic (CGI) web pages
• Beneficiary: Synthetic cache bound
workload (LLCProbe)
• Target Resource: Cache
• No cache isolation:
– ~3x slower when sharing
cache with webserver
[Diagram: Local Xen test bed; Beneficiary and Victim, each on a Core, sharing the Cache ($$$); Clients, Net]
15
Cache vs. Network
Victim webserver frequently
interrupts, pollutes the cache
– Reason: Xen gives higher
priority to VM consuming
less CPU time
[Diagram: two Cores sharing the Cache ($$$); Clients, Net]
[Figure: Cache state time line for a heavily loaded web server. Labels: "Webserver receives a request", "Beneficiary starts to run", "decreased cache efficiency", "cache state".]
16
Cache vs. Network w/ RFA
RFA helps in two ways:
1. Webserver loses its
priority.
2. Reducing the capacity
of webserver.
[Diagram: two Cores sharing the Cache ($$$); Clients, Net; Helper sends CGI Request]
[Figure: Cache state time line for a heavily loaded web server under RFA. Labels: "Webserver receives a request", "Beneficiary starts to run", "cache state".]
17
RFA: Performance Improvement
RFA intensities – time in ms per second
60% Performance Improvement
[Figure annotations: 196% slowdown, 86% slowdown]
18
RFA Effect on Interruptions
Beneficiary: LLCProbe
[Figure annotations: 40%, 85%]
19
RFA Effect on Victim’s capacity
Decreases with
increasing RFA
intensity
20
Experiments on Amazon EC2
Multiple Accounts
Co-resident VMs from our accounts:
Stand-ins for victim and beneficiary
Separate instances for
helper and web clients
Instance type: m1.small
# of co-resident pairs: 9 (23 total instances)
Machine type: Intel Xeon E5507 with 4MB LLC
No direct interaction with any
other customers
Indirect interaction akin to
normal usage cases
21
LLCProbe Synthetic Benchmark
Highest performance
improvement of 13%,
recovering 33% of
performance lost.
Average performance
improvement: 6%
RFA improved performance of LLCProbe
on all experimental EC2 instances!
22
mcf from SPEC-CPU
3% performance improvement =
35% reduction in performance loss
10% slowdown
6% slowdown
On average RFA improved performance
across all SPEC workloads!
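The EC2 slides quote two related quantities: a "performance improvement" and the share of lost performance that is recovered. A minimal sketch of how the two can differ is shown here. The definitions and the runtimes are assumptions for illustration only; they are not taken from the talk and are not meant to reproduce the reported percentages.

```python
def improvement(contended, with_rfa):
    """Runtime reduction relative to the contended run (assumed definition)."""
    return (contended - with_rfa) / contended


def recovered_loss(baseline, contended, with_rfa):
    """Share of the contention-induced slowdown that is won back
    (assumed definition)."""
    return (contended - with_rfa) / (contended - baseline)


# Hypothetical runtimes in seconds: alone, with a noisy neighbor, with an RFA.
baseline, contended, with_rfa = 100.0, 150.0, 130.0

print(f"improvement:    {improvement(contended, with_rfa):.1%}")
print(f"recovered loss: {recovered_loss(baseline, contended, with_rfa):.1%}")
```

Because the recovered share is measured against the lost performance only, it is always larger than the improvement measured against total runtime, which is why a modest improvement can correspond to a sizable reduction in performance loss.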
23
Discussion: Practical Aspects
RFA case studies used CPU intensive
CGI requests
– Alternative: DoS vulnerabilities
(e.g., hash-collision attacks)
Identifying co-resident victims
– Easy on most clouds
(Co-resident VMs have predictable
internal IP addresses)
No public interface?
– Paper discusses possibilities for RFAs
24
Conclusion
Resource-Freeing Attacks
– Interfere with victim to shift
resource use
– Proof-of-concept of efficacy in
public clouds
Open questions:
– Other RFAs?
– Countermeasures: Detection, stricter
isolation, smarter scheduling?
25
References
[MMSys10] Sean K. Barker and Prashant Shenoy. “Empirical evaluation of
latency-sensitive application performance in the cloud.” In MMSys, 2010.
[Security10] Thomas Moscibroda and Onur Mutlu. “Memory performance
attacks: Denial of memory service in multi-core systems.” In Usenix Security
Symposium, 2007.
[CCS09] T. Ristenpart, E. Tromer, H. Shacham, and S. Savage. “Hey, you, get off
my cloud: exploring information leakage in third party compute clouds.” In
CCS, 2009.
26
Backup Slides
27
Discussion: Countermeasures
Detection?
– May be hard to differentiate RFA from legitimate requests
Stricter Isolation?
– Works but expensive
Contention-aware scheduling
– Not yet used in public IaaS
28
Discussion: Economies
• Cost of RFA
– Helper instance, and
– RFA traffic.
• Co-resident helper
– An efficient implementation of helper can run inside the
attacker’s VM.
– Current helper implementation consumes 15 Kbps of
network bandwidth and a CPU utilization of 0.7%.
• Multiplex a single helper instance for many beneficiaries.
• Note: Currently, internal EC2 network traffic is free of cost.
29
Identifying Co-resident VMs
• Identifying the public interface:
– Predictable numerical distance between internal
IP addresses in public clouds.
– Identifying port used by the victim application
(standard ports like http(s), etc.).
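The "numerical distance between internal IP addresses" heuristic can be sketched with Python's standard `ipaddress` module. The function names, the example addresses, and the distance threshold are hypothetical; the slides do not state a threshold, and a small distance is only a hint that needs confirming by other means.

```python
import ipaddress


def ip_distance(a, b):
    """Absolute numerical distance between two IPv4 addresses."""
    return abs(int(ipaddress.ip_address(a)) - int(ipaddress.ip_address(b)))


def possibly_coresident(a, b, max_distance=8):
    """Heuristic hint only; max_distance is an assumed, illustrative value."""
    return ip_distance(a, b) <= max_distance


# Hypothetical internal addresses.
print(ip_distance("10.0.1.5", "10.0.1.9"))          # 4
print(possibly_coresident("10.0.1.5", "10.0.1.9"))  # True
print(possibly_coresident("10.0.1.5", "10.0.3.5"))  # False
```

A positive hint from this check would still need verification, for example with packet round-trip times or resource contention experiments.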
30
Experiment:
Measuring Resource Contention
• Synthetic workloads
31
Other RFAs
• RFAs are not limited to the presented case
studies.
• LLC vs. Disk
– Sending spurious, random disk requests
asynchronously to create a bottleneck for the
shared disk resource.
• Memory vs. Disk
– Similar to the above RFA
32
Discussion: More on Practical Aspects
• Work-conserving vs. Non-work-conserving
schedulers
– Public cloud environments are expected to
manage resources in a non-work-conserving
fashion.
– E.g., Net vs. Net RFA won’t work on Amazon EC2.
• Simulated client workload
– What is the effect of RFA in the presence of
multiple independent client requests originating
from numerous clients?
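The first bullet on this slide (work-conserving vs. non-work-conserving) can be illustrated with a toy allocator. This is a sketch under stated assumptions: max-min fair sharing stands in for a work-conserving scheduler, a fixed equal cap stands in for a non-work-conserving one, and the demand figures are hypothetical. It does not model Xen or EC2.

```python
def allocate(demands, capacity, work_conserving):
    """Split `capacity` among tenants with the given demands.

    Non-work-conserving: each tenant is capped at an equal fixed share.
    Work-conserving: unused share is redistributed (max-min fairness).
    """
    n = len(demands)
    if not work_conserving:
        cap = capacity / n
        return [min(d, cap) for d in demands]

    alloc = [0.0] * n
    remaining = float(capacity)
    active = [i for i in range(n) if demands[i] > 0]
    while active and remaining > 1e-12:
        share = remaining / len(active)
        done = [i for i in active if demands[i] - alloc[i] <= share]
        if not done:
            # Nobody can be fully satisfied: split what is left evenly.
            for i in active:
                alloc[i] += share
            break
        for i in done:
            remaining -= demands[i] - alloc[i]
            alloc[i] = demands[i]
        active = [i for i in active if i not in done]
    return alloc


# Tenant 0 is the beneficiary, tenant 1 the victim (hypothetical demands).
print(allocate([100, 100], 100, work_conserving=True))   # before RFA
print(allocate([100, 15], 100, work_conserving=True))    # victim demand drops
print(allocate([100, 15], 100, work_conserving=False))   # fixed caps
```

With fixed caps, the bandwidth the victim gives up is not handed to the beneficiary, which matches the note above that a Net vs. Net RFA is not expected to work under non-work-conserving resource management.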
33
Xen Internals
• Domain-0
– Privileged domain, direct
access to I/O devices.
– All I/O requests go
through Dom-0
• Xen scheduler internals
– Boost priority for
interactive workloads
[Diagram: VMs and Dom0 on the Hypervisor, above Cores, N/W, cache, memory, and Disk. Label: "Incoming request".]
34
Experiment:
Measuring Resource Contention
• On a local Xen test bed
Local Xen Test bed
Machine: Intel Xeon E5430, 2.66 GHz
Packages: 2, 2 cores per package
LLC Size: 6MB per package
[Figure: Performance Degradation (%), y-axis 0 to 600, for Observed Workloads against Conflicting Workloads (CPU, Net, Disk, Memory, Cache). Annotations: "Some have huge performance degradation", "Not all resources conflict".]
[Diagram: VMs on Cores sharing LLC, memory, N/W, and Disk]
35
Boost Priority and Interruptions
Victim: Webserver
Beneficiary: LLCProbe
[Figure annotations: 40%, 95%, 85%, < 30%]
Fewer interruptions → Higher cache efficiency
36
Demonstration on EC2
• Problem #1: Achieving Co-residence
– Launching multiple instances simultaneously from
two or more accounts.
• Problem #2: Verifying Co-residency
– Numerical distance between internal IP addresses
[CCS09].
– Faster packet round-trip times.
– Using resource contention experiments.
37
Normalized Performance on EC2
Aggregate
performance
degradation is within 5
performance points
On average, all
SPEC workloads
benefited from RFA
[Figure annotations: Baseline; Higher is better; 6%]
38