Dealer: Application-aware Request
Splitting for Interactive Cloud Applications
Mohammad Hajjat
Purdue University
Joint work with:
Shankar P N (Purdue), David Maltz (Microsoft), Sanjay Rao (Purdue)
and Kunwadee Sripanidkulchai (NECTEC Thailand)
Performance of Interactive Applications
- Interactive apps have stringent requirements on user response time
  - Amazon: every 100ms of latency costs 1% in sales
  - Google: a 0.5-second increase in delay causes traffic and revenue to drop by 20%
- Tail importance: SLAs are defined on the 90th-percentile and higher response times
Cloud Computing: Benefits and Challenges
- Benefits:
  - Elasticity
  - Cost savings
  - Geo-distribution: service resilience, disaster recovery, better user experience, etc.
- Challenges:
  - Performance is variable: [Ballani'11], [Wang'10], [Li'10], [Mangot'09], etc.
  - Even worse, data-centers fail, at an average cost of $5,600 per minute!
Approaches for Handling Cloud Performance Variability
- Autoscaling?
  - Can't tackle storage problems, network congestion, etc.
  - Slow: tens of minutes in public clouds
- DNS-based and server-based redirection?
  - Overloads the remote DC
  - Wastes local resources
  - DNS-based schemes may take hours to react
Contributions
- Introduce Dealer to help interactive multi-tier applications respond to transient performance variability in the cloud
  - Split requests at component granularity (rather than at the granularity of entire DCs)
  - Pick the best combination of component replicas (potentially across multiple DCs) to serve each individual request
- Benefits over naïve approaches:
  - Handles a wide range of cloud variability (performance problems, network congestion, workload spikes, failures, etc.)
  - Short time-scale adaptation (tens of seconds to a few minutes)
  - Improves the performance tail (90th percentile and higher):
    o Under natural cloud dynamics: > 6x
    o Over redirection schemes, e.g., DNS-based load-balancers: > 3x
Outline
- Introduction
- Measurement and Observations
- System Design
- Evaluation
Performance Variability in Multi-tier Interactive Applications
[Figure: Thumbnail application architecture: a front-end (FE) of IIS Web Roles behind a load balancer, Worker Role components BL1 and BL2, and a back-end (BE) of Worker Roles, connected through queues and blob storage]
- Multi-tier apps may consist of hundreds of components
- Deploy each app on 2 DCs simultaneously
Performance Variability in Multi-tier Interactive Applications
[Figure: box plots (25th percentile, median, 75th percentile, and outliers) of delays for the FE, BL1, BE, and BL2 components]
Observations
- Replicas of a component are uncorrelated
- Few components show poor performance at any time
- Performance problems are short-lived; 90% last < 4 mins
[Figure: per-component delay time series for FE, BL1, BE, and BL2]
Outline
- Introduction
- Measurement and Observations
- System Design
- Evaluation
Dealer Approach: Per-Component Re-routing
[Figure: a GTM in front of two DCs, each hosting replicas of components C1 through Cn; Dealer splits traffic between the replicas at every component]
- Split requests at each component dynamically
- Serve each request using a combination of replicas across multiple DCs
Dealer System Overview
[Figure: Dealer runs alongside the application; a GTM directs users to DCs, each hosting replicas of components C1 through Cn]
Dealer High-Level Design
[Figure: Dealer's modules: Determine Delays, Compute Split-Ratios, Dynamic Capacity Estimation, and Stability, interfacing with the Application]
Determining Delays
- Monitoring:
  - Instrument apps to record component processing times and inter-component delays
  - Use X-Trace for instrumentation, which tags each request with a global ID
  - Automate integration using Aspect-Oriented Programming (AOP)
  - Push logs asynchronously to reduce overhead
- Active probing:
  - Send requests along lightly used links and components
  - Use workload generators (e.g., Grinder)
  - Heuristics for faster recovery by biasing towards better paths
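The monitoring idea above can be sketched as a small decorator that tags each request with a global ID and records per-component processing time. All names here are hypothetical stand-ins for the X-Trace/AOP instrumentation the talk describes; a real deployment would push the log records asynchronously rather than append to a list:

```python
import functools
import time
import uuid

# In-memory log store standing in for Dealer's asynchronous log push.
LOGS = []

def instrument(component):
    """Record a component's processing time, tagged with a global request ID."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(request, *args, **kwargs):
            # Attach a global ID on first entry (X-Trace-style propagation).
            request.setdefault("trace_id", uuid.uuid4().hex)
            start = time.perf_counter()
            result = fn(request, *args, **kwargs)
            LOGS.append({
                "trace_id": request["trace_id"],
                "component": component,
                "processing_time": time.perf_counter() - start,
            })
            return result
        return wrapper
    return decorator

@instrument("FE")
def front_end(request):
    # Placeholder for real front-end work.
    return {"status": "ok"}

req = {}
front_end(req)
print(LOGS[0]["component"])  # FE
```

With AOP, the same wrapper would be woven in automatically instead of applied by hand to every component entry point.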
Determining Delays
[Figure: delay-estimation pipeline: Monitoring and Probing feed Combine Estimates, followed by Stability & Smoothing]
- Delay matrix D[,]: component processing and inter-component communication delays
- Transaction matrix T[,]: transaction rates between components
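As an illustration (with made-up measurements), the two matrices can be built by aggregating per-request logs: entries keyed (i, i) hold component processing times, entries keyed (i, j) hold inter-component delays, and T counts transactions per component pair:

```python
from collections import defaultdict

# Hypothetical per-request measurements: (src, dst, delay_seconds).
# src == dst denotes a component's own processing time.
records = [
    ("FE", "FE", 0.010), ("FE", "BL1", 0.004),
    ("BL1", "BL1", 0.030), ("FE", "BL1", 0.006),
]

sums = defaultdict(float)
counts = defaultdict(int)
for src, dst, delay in records:
    sums[(src, dst)] += delay
    counts[(src, dst)] += 1

# Delay matrix D: mean delay per component pair.
D = {pair: sums[pair] / counts[pair] for pair in sums}
# Transaction matrix T: transactions per component pair
# (divide by the measurement window to get a rate).
T = dict(counts)

print(round(D[("FE", "BL1")], 6))  # 0.005
print(T[("FE", "BL1")])            # 2
```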
Calculating Split Ratios
[Figure: example application graph from the user through components C1 through C5 (FE, BE, BL1, BL2), with each component Ci replicated across the two DCs as Ci1 and Ci2]
Calculating Split Ratios
- Given:
  - Delay matrix D[im, jn]
  - Transaction matrix T[i, j]
  - Capacity matrix C[i, m] (capacity of component i in data-center m)
- Goal:
  - Find split-ratios TF[im, jn]: the number of transactions between each pair of component replicas Cim and Cjn such that overall delay is minimized
- Algorithm:
  - A greedy algorithm that assigns requests to the best-performing combination of replicas (across DCs)
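A minimal greedy sketch of this assignment, under assumed delays and capacities (all numbers hypothetical, and a two-component app for brevity): enumerate replica combinations from fastest to slowest end-to-end delay, and assign requests up to each path's residual capacity:

```python
import itertools

# Hypothetical inputs for a 2-component app replicated in 2 DCs.
components = ["FE", "BE"]
dcs = [0, 1]
proc = {("FE", 0): 10, ("FE", 1): 25, ("BE", 0): 30, ("BE", 1): 12}   # ms
link = {(0, 0): 1, (0, 1): 40, (1, 0): 40, (1, 1): 1}                 # ms between DCs
cap = {("FE", 0): 50, ("FE", 1): 100, ("BE", 0): 100, ("BE", 1): 60}  # req/s
demand = 80                                                           # req/s to split

def path_delay(assignment):
    """End-to-end delay of one combination of replicas (one DC per component)."""
    d = sum(proc[(c, m)] for c, m in zip(components, assignment))
    d += sum(link[(a, b)] for a, b in zip(assignment, assignment[1:]))
    return d

# Greedy: walk combinations from fastest to slowest, assigning as many
# requests as the most-constrained replica on the path allows.
split = {}
remaining = demand
residual = dict(cap)
for combo in sorted(itertools.product(dcs, repeat=len(components)), key=path_delay):
    if remaining <= 0:
        break
    room = min(residual[(c, m)] for c, m in zip(components, combo))
    take = min(room, remaining)
    if take > 0:
        split[combo] = take
        remaining -= take
        for c, m in zip(components, combo):
            residual[(c, m)] -= take

print(split)  # {(1, 1): 60, (0, 0): 20}
```

With these numbers, the all-in-DC1 path is fastest but BE's replica there saturates at 60 req/s, so the remaining 20 req/s spill over to the all-in-DC0 path, which is exactly the overflow behavior per-component splitting is meant to capture.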
Other Design Aspects
- Dynamic capacity estimation:
  - Develop an algorithm to dynamically capture component capacities
  - Prevent components from getting overloaded by re-routed traffic
- Stability, at multiple levels:
  - Smooth matrices with a Weighted Moving Average (WMA)
  - Damp split-ratios by a factor to avoid abrupt shifts
- Integration with apps:
  - Can be integrated with any app, even stateful ones (e.g., StockTrader)
  - Provide generic pull/push APIs
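A minimal sketch of the two stability mechanisms; the WMA weight and damping factor below are assumed values, not from the talk:

```python
ALPHA = 0.3   # assumed weight of the newest measurement in the moving average
DAMP = 0.25   # assumed maximum fraction of the gap closed per adaptation round

def smooth(old, new, alpha=ALPHA):
    """Weighted moving average, applied entry-wise to a delay matrix."""
    return {k: alpha * new[k] + (1 - alpha) * old[k] for k in old}

def damp(current, target, factor=DAMP):
    """Move split-ratios only part of the way toward the new target."""
    return {k: current[k] + factor * (target[k] - current[k]) for k in current}

# A sudden delay spike is absorbed gradually rather than all at once.
old_delay = {("FE", "BE"): 20.0}
new_delay = {("FE", "BE"): 60.0}
print(smooth(old_delay, new_delay))

# A full traffic flip is spread over several rounds instead of one shift.
current_split = {("BE", 0): 1.0, ("BE", 1): 0.0}
target_split = {("BE", 0): 0.0, ("BE", 1): 1.0}
print(damp(current_split, target_split))  # {('BE', 0): 0.75, ('BE', 1): 0.25}
```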
Outline
- Introduction
- Measurement and Observations
- System Design
- Evaluation
Evaluation
- Real multi-tier, interactive apps:
  - Thumbnails: photo processing, data-intensive
  - StockTrader: stock trading, delay-sensitive, stateful
- 2 Azure data-centers in the US
- Workload:
  - Real workload trace from a large campus ERP application
  - DaCapo benchmark
- Comparison with existing schemes:
  - DNS-based redirection
  - Server-based redirection
- Performance-variability scenarios (single fault-domain failure, storage latency, transaction-mix change, etc.)
Running In the Wild
- Evaluate Dealer under natural cloud dynamics
- Explore the inherent performance variability in cloud environments
[Figure: response-time comparison showing more than a 6x difference]
Running In the Wild
[Figure: per-component traffic splits across the FE, BE, and BL components in data-centers A and B]
Dealer vs. GTM
- Global Traffic Managers (GTMs) use DNS to route users to the closest DC (based on client IP)
- The best-performing DC ≠ the closest DC (as measured by RTT)
- Results: more than 3x improvement for the 90th percentile and higher
Dealer vs. Server-level Redirection
- Re-routes entire requests at the granularity of DCs (e.g., via HTTP 302)
[Figure: a request redirected from DC A to DC B with an HTTP 302 response]
Evaluating Against Server-level Redirection
[Figure: the FE, BE, BL1, and BL2 component replicas in each of the two DCs]
Conclusions
- Dealer: a novel technique to handle cloud variability in multi-tier interactive apps
- Per-component re-routing: dynamically split user requests across replicas in multiple DCs at component granularity
- Handles transient cloud variability: performance problems in cloud services, workload spikes, failures, etc.
- Short time-scale adaptation: tens of seconds to a few minutes
- Performance-tail improvement:
  o Under natural cloud dynamics: > 6x
  o Over coarse-grained redirection (e.g., DNS-based GTM): > 3x
Questions?