Dealer: Application-aware Request Splitting for Interactive Cloud Applications
Mohammad Hajjat, Purdue University
Joint work with: Shankar P N (Purdue), David Maltz (Microsoft), Sanjay Rao (Purdue) and Kunwadee Sripanidkulchai (NECTEC Thailand)

Performance of Interactive Applications
Interactive apps place stringent requirements on user response time.
Amazon: every 100 ms of latency costs 1% in sales.
Google: a 0.5-second delay causes traffic and revenue to drop by 20%.
Tail importance: SLAs are defined on the 90th-percentile and higher response times.

Cloud Computing: Benefits and Challenges
Benefits: elasticity; cost savings; geo-distribution (service resilience, disaster recovery, better user experience, etc.).
Challenges: performance is variable [Ballani'11], [Wang'10], [Li'10], [Mangot'09], etc. Even worse, data centers fail, at an average cost of $5,600 per minute.

Approaches for Handling Cloud Performance Variability
Autoscaling? It cannot tackle storage problems, network congestion, etc., and it is slow: tens of minutes in public clouds.
DNS-based and server-based redirection? These overload the remote DC, waste local resources, and DNS-based schemes may take hours to react.

Contributions
We introduce Dealer to help interactive multi-tier applications respond to transient variability in cloud performance:
Split requests at component granularity (rather than at the granularity of an entire DC).
Pick the best combination of component replicas (potentially across multiple DCs) to serve each individual request.
Benefits over naive approaches:
o Handles a wide range of variability in the cloud (performance problems, network congestion, workload spikes, failures, etc.)
o Short-time-scale adaptation (tens of seconds to a few minutes)
o Improves the performance tail (90th percentile and higher):
o Under natural cloud dynamics: > 6x
o Over redirection schemes (e.g., DNS-based load balancers): > 3x

Outline: Introduction; Measurement and Observations; System Design; Evaluation

Performance Variability in Multi-tier Interactive Applications
Thumbnail application: a front-end (FE) Web Role running IIS behind a load balancer, a back-end (BE) Worker Role, and two business-logic Worker Roles (BL1, BL2), connected through queues and blob storage.
Multi-tier apps may consist of hundreds of components.
We deploy each app on 2 DCs simultaneously.

[Figure: box plots (25th percentile, median, 75th percentile) of response times for FE, BE, BL1, and BL2, with pronounced outliers.]

Observations
Replicas of a component are uncorrelated.
Few components show poor performance at any given time.
Performance problems are short-lived; 90% last less than 4 minutes.

Outline: Introduction; Measurement and Observations; System Design; Evaluation

Dealer Approach: Per-Component Re-routing
[Figure: components C1 through Cn replicated in two DCs behind a GTM.]
Split requests at each component dynamically.
Serve each request using a combination of replicas across multiple DCs.

Dealer System Overview
[Figure: a GTM in front of components C1 through Cn, each replicated across DCs and paired with a Dealer module.]

Dealer High-Level Design
The application feeds measurements into two stages, Determine Delays and Compute Split-Ratios, supported by Stability mechanisms and Dynamic Capacity Estimation.

Determining Delays
Monitoring: instrument apps to record component processing time and inter-component delay. We use X-Trace for instrumentation, which tags each request with a global ID; integration is automated using Aspect-Oriented Programming (AOP); logs are pushed asynchronously to reduce overhead.
Active probing: send requests along lightly used links and components using workload generators (e.g., Grinder); heuristics speed up recovery by biasing probes towards better paths.
Monitoring and probing estimates are combined, then smoothed for stability. Two matrices result:
o Delay matrix D[,]: component
processing and inter-component communication delay
o Transaction matrix T[,]: the transaction rate between components

Calculating Split Ratios
[Figure: a user request flows from the FE replicas (C11, C12) through the BE replicas (C31, C32) to the BL replicas (C41/C42, C51/C52) across two DCs.]
Given:
o Delay matrix D[im, jn]
o Transaction matrix T[i, j]
o Capacity matrix C[i, m]: the capacity of component i in data center m
Goal: find split ratios TF[im, jn], the number of transactions between each pair of component replicas Cim and Cjn, such that overall delay is minimized.
Algorithm: a greedy algorithm that assigns requests to the best-performing combination of replicas (across DCs).

Other Design Aspects
Dynamic capacity estimation: we develop an algorithm that dynamically captures component capacities, preventing components from being overloaded by re-routed traffic.
Stability, at multiple levels: smooth the matrices with a Weighted Moving Average (WMA); damp the split ratios by a factor to avoid abrupt traffic shifts.
Integration with apps: Dealer can be integrated with any app, including stateful ones (e.g., StockTrader), and provides generic pull/push APIs.

Outline: Introduction; Measurement and Observations; System Design; Evaluation

Evaluation
Real multi-tier, interactive apps: Thumbnails (photo processing, data-intensive) and StockTrader (stock trading, delay-sensitive, stateful).
Deployed in 2 Azure data centers in the US.
Workload: a real workload trace from a large campus ERP app, and the DaCapo benchmark.
Comparison with existing schemes: DNS-based redirection and server-based redirection.
Performance-variability scenarios: single fault-domain failure, storage latency, transaction-mix change, etc.

Running in the Wild
We evaluate Dealer under natural cloud dynamics to explore the inherent performance variability of cloud environments: more than a 6x difference.
[Figure: the FE-BE-BL replica combinations Dealer selects over time across DCs A and B.]

Dealer vs.
GTM
Global Traffic Managers (GTMs) use DNS to route users to the closest DC by IP. However, the best-performing DC is not necessarily the closest DC as measured by RTT. Results: more than a 3x improvement for the 90th percentile and higher.

Dealer vs. Server-level Redirection
Server-level redirection re-routes the entire request at the granularity of DCs, e.g., via HTTP 302 between DC A and DC B.
[Figure: response times under Dealer vs. server-level redirection for the components FE, BE, BL1, and BL2.]

Conclusions
Dealer is a novel technique to handle cloud variability in multi-tier interactive apps:
o Per-component re-routing: dynamically split user requests across replicas in multiple DCs at component granularity.
o Handles transient cloud variability: performance problems in cloud services, workload spikes, failures, etc.
o Short-time-scale adaptation: tens of seconds to a few minutes.
o Performance-tail improvement:
o Under natural cloud dynamics: > 6x
o Over coarse-grain redirection (e.g., DNS-based GTM): > 3x

Questions?
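The split-ratio machinery described in the design section can be sketched roughly in Python. This is a minimal illustration, not the authors' implementation: the WMA weights, the damping factor, and the spill-to-next-best greedy rule are all assumptions; only the overall shape (smooth the delay estimates, damp ratio changes, greedily fill the best replica up to its estimated capacity) comes from the talk.

```python
def wma(history, weights=(0.6, 0.3, 0.1)):
    """Weighted moving average over the most recent delay samples,
    newest weighted heaviest. The weights are an assumption; the
    talk only says the matrices are smoothed with a WMA."""
    recent = history[-len(weights):][::-1]          # newest first
    ws = weights[:len(recent)]
    return sum(w * x for w, x in zip(ws, recent)) / sum(ws)

def damp(old_ratio, new_ratio, factor=0.5):
    """Move only part of the way toward the newly computed split
    ratio, avoiding abrupt traffic shifts (factor is hypothetical)."""
    return old_ratio + factor * (new_ratio - old_ratio)

def greedy_split(delay, capacity, demand):
    """Greedily place one component's request volume on its replicas:
    fill the lowest-delay replica up to its estimated capacity, then
    spill over to the next best. Returns per-replica traffic fractions."""
    assignment = {r: 0.0 for r in delay}
    remaining = demand
    for r in sorted(delay, key=delay.get):          # best replica first
        take = min(remaining, capacity[r])
        assignment[r] = take
        remaining -= take
        if remaining <= 0:
            break
    return {r: v / demand for r, v in assignment.items()}

# Replica names and numbers are illustrative. DC-A's BL replica is
# faster but capacity-limited, so traffic spills over to DC-B.
ratios = greedy_split(
    delay={"BL@DC-A": wma([42.0, 40.0, 41.0]), "BL@DC-B": 95.0},
    capacity={"BL@DC-A": 80.0, "BL@DC-B": 200.0},
    demand=100.0)
# ratios == {"BL@DC-A": 0.8, "BL@DC-B": 0.2}
```

In Dealer proper, the greedy choice operates over combinations of replicas across tiers using D[im, jn] and T[i, j], and the capacities come from the dynamic capacity estimator rather than fixed numbers as here.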