Transcript slides

SPANStore: Cost-Effective Geo-Replicated
Storage Spanning Multiple Cloud Services
Zhe Wu, Michael Butkiewicz, Dorian Perkins,
Ethan Katz-Bassett, Harsha V. Madhyastha
UC Riverside and USC
Geo-distributed Services for Low Latency
•2
Cloud Services Simplify Geo-distribution
•3
Need for Geo-Replication
Data uploaded by a user may be viewed/edited
by users in other locations
• Social networking (Facebook, Twitter)
• File sharing (Dropbox, Google Docs)
 Geo-replication of data is necessary
Isolated storage service in each cloud data center
Application needs to handle replication itself
•4
Geo-replication on Cloud Services
Lots of recent work on enabling geo-replication
• Walter(SOSP’11), COPS(SOSP’11), Spanner(OSDI’12),
Gemini(OSDI’12), Eiger(NSDI’13)…
• Faster performance or stronger consistency
An added consideration on cloud services: minimizing cost
•5
Outline
Problem and motivation
SPANStore overview
Techniques for reducing cost
Evaluation
•6
SPANStore
Key-value store (GET/PUT interface) spanning cloud storage services
Main objective: minimize cost
Satisfy application requirements
• Latency SLOs
• Consistency (Eventual vs. sequential consistency)
• Fault-tolerance
•7
SPANStore Overview
[Diagram: data centers A-D, each running SPANStore. The application library in data center A reads/writes data based on the optimal replication policy, performs metadata lookups, issues requests to SPANStore in the other data centers, and data/ACKs are returned.]
•8
SPANStore Overview
[Diagram: the Placement Manager takes the SPANStore characterization (inter-DC latencies, pricing policies) and the application input (latency, consistency, and fault-tolerance requirements; workload), and pushes the resulting replication policy to the SPANStore instances co-located with the application in data centers A-D.]
Outline
Problem and motivation
SPANStore overview
Techniques for reducing cost
Evaluation
•10
Questions to be addressed for every object:
• Where to store replicas
• How to execute PUTs and GETs
Cloud Storage Service Cost
Storage service cost =
Storage cost (the amount of data stored)
+ Request cost (the number of PUT and GET requests issued)
+ Data transfer cost (the amount of data transferred out of the data center)
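To make the cost model concrete, here is a minimal Python sketch; the prices and usage figures are illustrative placeholders, not any provider's actual price sheet.

```python
# Illustrative sketch of the storage service cost model above.
# All prices and usage numbers are hypothetical.

def storage_service_cost(gb_stored, num_gets, num_puts, gb_out,
                         storage_per_gb, get_per_10k, put_per_1k,
                         transfer_per_gb):
    """Storage + request + data transfer cost for one billing period."""
    storage  = gb_stored * storage_per_gb
    requests = (num_gets / 10_000) * get_per_10k + (num_puts / 1_000) * put_per_1k
    transfer = gb_out * transfer_per_gb
    return storage + requests + transfer

# Example: 100 GB stored, 1M GETs, 100K PUTs, 500 GB transferred out.
print(storage_service_cost(100, 1_000_000, 100_000, 500,
                           storage_per_gb=0.095, get_per_10k=0.004,
                           put_per_1k=0.005, transfer_per_gb=0.12))
```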
•12
Low Latency SLO Requires High Replication in a Single-Cloud Deployment
[Figure: for each AWS region, the number of data centers within a 100 ms latency bound in an S3-only deployment; meeting the bound for clients everywhere requires replicas in almost every region.]
•13
Technique 1: Harness Multiple Clouds
[Figure: the same measurement as the previous slide, comparing S3-only against S3+Azure+GCS; with a 100 ms latency bound, using all three clouds places more data centers within the bound of every AWS region, so fewer replicas are needed.]
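The measurement behind this comparison can be captured with a small sketch: given an inter-DC latency matrix, count how many candidate data centers lie within the latency bound of each client region. The data center names and latencies below are made up for illustration.

```python
# Count candidate data centers within a latency bound of each client
# region. Latencies (ms) are illustrative, not measured values.
LATENCY_BOUND_MS = 100

# latencies[client_region][candidate_dc] = latency in ms
latencies = {
    "US-West":      {"S3-US-West": 20,  "S3-EU": 150, "Azure-US-West": 25, "GCS-US": 30},
    "EU":           {"S3-US-West": 140, "S3-EU": 15,  "Azure-EU": 20,      "GCS-EU": 25},
    "Asia-Pacific": {"S3-US-West": 120, "S3-EU": 250, "Azure-Asia": 30,    "GCS-Asia": 40},
}

def within_bound(client, candidates):
    return [dc for dc in candidates
            if latencies[client].get(dc, float("inf")) <= LATENCY_BOUND_MS]

s3_only    = {dc for dcs in latencies.values() for dc in dcs if dc.startswith("S3")}
all_clouds = {dc for dcs in latencies.values() for dc in dcs}

for client in latencies:
    print(client,
          "S3-only:", len(within_bound(client, s3_only)),
          "S3+Azure+GCS:", len(within_bound(client, all_clouds)))
```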
•14
Price Discrepancies across Clouds
Cloud region  | Storage price ($/GB) | Data transfer price ($/GB) | GET requests ($/10,000) | PUT requests ($/1,000)
S3 US West    | 0.095                | 0.12                       | 0.004                   | 0.005
Azure Zone 2  | 0.095                | 0.19                       | 0.001                   | 0.0001
GCS           | 0.085                | 0.12                       | 0.01                    | 0.01
…             | …                    | …                          | …                       | …
Leveraging discrepancies judiciously can reduce cost
•15
Range of Candidate Replication Policies
Strategy 1: single replica in cheapest storage cloud
High latencies
•16
Range of Candidate Replication Policies
Strategy 2: few replicas to reduce latencies
High data transfer cost
•17
Range of Candidate Replication Policies
Strategy 3: replicated everywhere
High storage cost
High latencies and cost of PUTs
Optimal replication policy depends on:
1. Application requirements
2. Workload properties
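A tiny sketch of why no single strategy wins, using a rough cost model with made-up prices and workloads: a GET-heavy workload favors more replicas (less cross-DC transfer for reads), while a PUT-heavy workload favors fewer replicas (less storage and fewer writes to propagate).

```python
# Compare two candidate replication policies for one object under two
# hypothetical workloads. All prices and numbers are illustrative.
N_DCS    = 5        # data centers the object is accessed from
STORAGE  = 0.10     # $/GB stored
TRANSFER = 0.12     # $/GB transferred between data centers
OBJ_GB   = 0.001    # 1 MB object

def cost(num_replicas, gets, puts):
    """Rough cost: storage + transfer for remote GETs + PUT propagation."""
    storage = num_replicas * OBJ_GB * STORAGE
    remote_get_fraction = 1 - num_replicas / N_DCS   # GETs served remotely
    get_transfer = gets * remote_get_fraction * OBJ_GB * TRANSFER
    put_transfer = puts * (num_replicas - 1) * OBJ_GB * TRANSFER
    return storage + get_transfer + put_transfer

for name, gets, puts in [("GET-heavy", 10_000, 100),
                         ("PUT-heavy", 100, 10_000)]:
    print(f"{name}: single replica = ${cost(1, gets, puts):.2f}, "
          f"replicate everywhere = ${cost(N_DCS, gets, puts):.2f}")
```

With these numbers the GET-heavy workload is cheaper when replicated everywhere, while the PUT-heavy workload is cheaper with a single replica, which is exactly why the policy must depend on workload properties.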
•18
High Variability of Individual Objects
Analyze predictability of Twitter workload
[Figure: CDF of analyzed hours vs. relative prediction error for five individual users' workloads; per-user error can be as high as 1000%, and a sizeable fraction of hours (20-60%) have error higher than 50-100%.]
Estimate workload based on same hour in previous week
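The prediction error plotted above can be illustrated with a small sketch: predict this hour's request count for an object from the same hour in the previous week and measure the relative error. The request counts below are synthetic.

```python
# Relative error of a "same hour, previous week" workload prediction.
# Request counts are synthetic, just to show the metric.

def relative_error(predicted, actual):
    """|predicted - actual| / actual, as plotted on the x-axis above."""
    return abs(predicted - actual) / actual

last_week = [120, 80, 5, 300]    # hourly GET counts, previous week
this_week = [100, 90, 60, 310]   # hourly GET counts, this week

for prev, curr in zip(last_week, this_week):
    print(f"predicted={prev:4d} actual={curr:4d} "
          f"relative error={relative_error(prev, curr):.2f}")
```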
•19
Technique 2: Aggregate Workload
Prediction per Access Set
Observation: stability in aggregate workload
• Diurnal and weekly patterns
Classify objects by access set (see the sketch after this list):
• Set of data centers from which object is accessed
Leverage application knowledge of sharing pattern
• Dropbox/Google Docs know users that share a file
• Facebook controls every user’s news feed
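A minimal sketch of the access-set classification; the object names, data centers, and per-hour counts are hypothetical.

```python
# Group objects by their access set (the set of data centers from which
# each object is accessed) and aggregate per-hour workload per group.
from collections import defaultdict

# object -> (access set, GETs this hour); all values are hypothetical
objects = {
    "photo-1": ({"US-East", "EU"}, 40),
    "photo-2": ({"US-East", "EU"}, 15),
    "doc-7":   ({"Asia"}, 3),
    "doc-9":   ({"US-East", "EU"}, 22),
}

per_access_set = defaultdict(int)
for access_set, gets in objects.values():
    per_access_set[frozenset(access_set)] += gets

for access_set, total_gets in per_access_set.items():
    print(sorted(access_set), "aggregate GETs this hour:", total_gets)
```

The aggregate per access set is what gets predicted (same hour, previous week) and fed to the placement manager, rather than each object's own noisy workload.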
•20
Technique 2: Aggregate Workload
Prediction per Access Set
[Figure: CDF of analyzed hours vs. relative prediction error for the aggregate workload ("All users") compared with Users 1-5; the aggregate workload is more stable and predictable.]
Estimate workload based on same hour in previous week
•21
Optimizing Cost for GETs and PUTs
Use cheap (request + data transfer) data centers
[Diagram: a GET is routed to the replica in the data center with the cheapest combined request and data transfer cost.]
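A minimal sketch of this choice, with illustrative prices: among the data centers holding a replica (and, in SPANStore, only those that also satisfy the GET latency SLO), serve the GET from the one with the lowest combined request and data transfer cost.

```python
# Pick which replica should serve a GET: minimize per-GET request cost
# plus data transfer cost for the object's size. Prices are illustrative.
OBJ_GB = 0.0005   # 0.5 MB object

# Per-replica pricing: (GET price per 10,000 requests, egress $/GB)
replicas = {
    "S3-US-West": (0.004, 0.12),
    "Azure-EU":   (0.001, 0.19),
    "GCS-Asia":   (0.010, 0.12),
}

def per_get_cost(get_per_10k, transfer_per_gb):
    return get_per_10k / 10_000 + OBJ_GB * transfer_per_gb

cheapest = min(replicas, key=lambda dc: per_get_cost(*replicas[dc]))
print("Serve GETs from:", cheapest)
```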
•22
Technique 3: Relay Propagation
Asynchronous propagation (no latency constraint)
[Diagram: propagating a PUT to multiple replicas over inter-DC links priced at $0.12-$0.25 per GB; forwarding through an intermediate replica with cheap outgoing links costs less than sending directly over the expensive links.]
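A small sketch of the relay idea: instead of sending an asynchronous PUT directly from the origin to every replica, greedily build a propagation tree in which each replica receives the data over the cheapest available link from a data center that already has it. The per-GB prices below are illustrative; the actual propagation paths in SPANStore are chosen by the placement manager's optimization.

```python
# Relay propagation sketch: build a cheap propagation tree for an
# asynchronous PUT. Edge weights are $/GB transfer prices (illustrative).

price = {
    "A": {"B": 0.25, "C": 0.12, "D": 0.20},
    "B": {"A": 0.25, "C": 0.19, "D": 0.25},
    "C": {"A": 0.12, "B": 0.19, "D": 0.19},
    "D": {"A": 0.20, "B": 0.25, "C": 0.19},
}

def propagation_tree(origin, replicas):
    have = {origin}                       # data centers already holding the data
    need = set(replicas) - have
    edges, total = [], 0.0
    while need:
        # Attach the remaining replica reachable over the cheapest link.
        src, dst = min(((s, d) for s in have for d in need),
                       key=lambda e: price[e[0]][e[1]])
        edges.append((src, dst))
        total += price[src][dst]
        have.add(dst)
        need.remove(dst)
    return edges, total

edges, relay_cost = propagation_tree("A", {"B", "C", "D"})
direct_cost = sum(price["A"][r] for r in ("B", "C", "D"))
print("relay tree:", edges, f"cost ${relay_cost:.2f}/GB")
print(f"direct propagation cost ${direct_cost:.2f}/GB")
```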
•23
Technique 3: Relay Propagation
Asynchronous propagation (no latency constraint)
Synchronous propagation (bounded by the PUT latency SLO)
[Diagram: the same topology as the previous slide; relaying a synchronous PUT through an intermediate data center can violate the latency SLO, unlike the asynchronous case.]
•24
Summary
Insights to reduce cost
• Multi-cloud deployment
• Use aggregate workload per access set
• Relay propagation
Placement manager uses an ILP to combine these insights (simplified sketch after this list)
Other techniques
• Metadata management
• Two-phase locking protocol
• Asymmetric quorum set
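A highly simplified stand-in for the placement manager's decision, using exhaustive search instead of an ILP; the data centers, prices, latencies, and workload are all hypothetical, and the real formulation also decides how PUTs and GETs are routed and propagated.

```python
# Toy stand-in for the placement manager: enumerate candidate replica
# sets and pick the cheapest one that keeps every accessing data center
# within the GET latency SLO. SPANStore solves the joint problem as an ILP.
from itertools import combinations

CANDIDATES = ["S3-US", "Azure-EU", "GCS-Asia"]
storage_price  = {"S3-US": 0.095, "Azure-EU": 0.095, "GCS-Asia": 0.085}  # $/GB
transfer_price = {"S3-US": 0.12,  "Azure-EU": 0.19,  "GCS-Asia": 0.12}   # $/GB out
latency_ms = {  # accessing (application) data center -> candidate storage DC
    "app-US": {"S3-US": 20,  "Azure-EU": 140, "GCS-Asia": 170},
    "app-EU": {"S3-US": 140, "Azure-EU": 20,  "GCS-Asia": 220},
}
GET_SLO_MS, OBJ_GB, GETS_PER_APP_DC = 100, 0.001, 5_000

def meets_slo(replicas):
    return all(min(lat[r] for r in replicas) <= GET_SLO_MS
               for lat in latency_ms.values())

def cost(replicas):
    storage = sum(storage_price[r] for r in replicas) * OBJ_GB
    transfer = 0.0
    for lat in latency_ms.values():
        nearest = min(replicas, key=lambda r: lat[r])   # GETs go to nearest replica
        transfer += GETS_PER_APP_DC * OBJ_GB * transfer_price[nearest]
    return storage + transfer

feasible = [rs for k in range(1, len(CANDIDATES) + 1)
            for rs in combinations(CANDIDATES, k) if meets_slo(rs)]
best = min(feasible, key=cost)
print("replica placement:", best, "estimated cost: $%.4f" % cost(best))
```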
•25
Outline
Problem and motivation
SPANStore overview
Techniques for reducing cost
Evaluation
•26
Evaluation
Scenario
• Application is deployed on EC2
• SPANStore is deployed across S3, Azure and GCS
Simulations to evaluate cost savings
Deployment to verify application requirements
• Retwis
• ShareJS
•27
Simulation Settings
Compare SPANStore against
• Replicate everywhere
• Single replica
• Single cloud deployment
Application requirements
• Sequential consistency
• PUT SLO: the minimum SLO that replicating everywhere can satisfy
• GET SLO: the minimum SLO that a single replica can satisfy
•28
SPANStore Enables Cost Savings across
Disparate Workloads
#1: big objects, more GETs (lots of data transfers from replicas)
#2: big objects, more PUTs (lots of data transfers to replicas)
#3: small objects, more GETs (lots of GET requests)
#4: small objects, more PUTs (lots of PUT requests)
[Figure: CDF of access sets vs. (cost with replicate everywhere) / (cost with SPANStore) for the four workloads; the savings come from reducing data transfer, relay propagation, and price discrepancies of GET and PUT requests.]
•29
Deployment Settings
Retwis
• Scale down the Twitter workload
• GET: read timeline
• PUT: make post
• Insert: read a follower's timeline and append the post to it (see the sketch after this list)
Requirements:
• Eventual consistency
• 90%ile PUT/GET SLO = 100ms
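A hypothetical sketch of how these Retwis-style operations could map onto a GET/PUT key-value interface like SPANStore's; the client class, key layout, and helper names are assumptions for illustration, not the actual Retwis or SPANStore code.

```python
# Hypothetical mapping of Retwis-style operations onto a GET/PUT interface.
import json

class FakeKVStore:
    """Stand-in for the SPANStore client library (GET/PUT interface)."""
    def __init__(self):
        self._data = {}
    def get(self, key):
        return self._data.get(key, "[]")
    def put(self, key, value):
        self._data[key] = value

store = FakeKVStore()

def make_post(user, text):                 # PUT: make post
    store.put(f"post:{user}:{text[:10]}", text)

def read_timeline(user):                   # GET: read timeline
    return json.loads(store.get(f"timeline:{user}"))

def insert_into_timeline(follower, text):  # Insert: GET + PUT on the timeline
    timeline = json.loads(store.get(f"timeline:{follower}"))
    timeline.append(text)
    store.put(f"timeline:{follower}", json.dumps(timeline))

make_post("alice", "hello world")
insert_into_timeline("bob", "alice: hello world")
print(read_timeline("bob"))
```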
•30
SPANStore Meets SLOs
[Figure: CDF of GET, PUT, and Insert latencies (0-200 ms); at the 90th percentile, GETs and PUTs are within the 100 ms SLO, and Inserts are within the Insert SLO.]
•31
Conclusions
SPANStore
• Minimize cost while satisfying latency, consistency
and fault-tolerance requirements
Use multiple cloud providers for greater data
center density and pricing discrepancies
Judiciously determine replication policy based
on workload properties and application needs
•32