Dynamic Provisioning for Multi-tier Internet Applications
Bhuvan Urgaonkar, Prashant Shenoy, Abhishek Chandra, Pawan Goyal
University of Massachusetts, University of Minnesota, Veritas Software India Pvt. Ltd.
UNIVERSITY OF MASSACHUSETTS, AMHERST – Department of Computer Science

Internet Applications
  Proliferation of Internet applications
    o Examples: auction site, online game, online store
  Growing significance in personal, business affairs
  Focus: Internet server applications

Multi-tiered Internet Applications
  [Figure: requests pass through a load balancer to the http, J2EE, and database tiers]
  Internet applications: multiple tiers
    o Example: 3 tiers: HTTP, J2EE app server, database
  Replicable components
    o Individual tiers: partially or fully replicable
    o Example: clustered HTTP, J2EE server, shared-nothing db
  Each tier uses a dispatcher: load balancing

Internet Workloads Are Dynamic
  Multi-time-scale variations
    o Time-of-day, hour-of-day
  Flash crowds
  Key issue: How to provide desired response time under varying workloads?
  [Figures: workload over time at two scales, Time (days) and Time (hours); y-axes: Request Rate (req/min), Arrivals per min]

Internet Data Center
  Internet applications run on data centers
  Server farms
    o Provide computational and storage resources
  Applications share data center resources
  Problem: How should the platform allocate resources to absorb workload variations?

Our Provisioning Approach
  Flexible queuing theoretic model
    o Captures all tiers in the application
  Predictive provisioning
    o Long-term workload variations
  Reactive provisioning
    o Short-term variations, flash crowds

Talk Outline
  Introduction
  Internet data center model
  Existing provisioning approaches
  Dynamic capacity provisioning
  Implementation and evaluation
  Summary

Data Center Model
  [Figure: data center hosting several applications, e.g. retail Web site, streaming]
  Dedicated hosting: each application runs on a subset of servers in the data center
    o Subsets are mutually exclusive: no server sharing
  Data center hosts multiple applications
  Free server pool: unused servers

Single-tier Provisioning
  Single tier provisioning well studied [Muse]
  Non-trivial to extend to multiple tiers
  Strawman #1: use single-tier provisioning independently at each tier
  Problem: independent tier provisioning may not increase goodput
  [Figure, before: 14 req/s offered; tier capacities C=15, C=10, C=10.1; 10 req/s served, 4 req/s dropped]
  [Figure, after the middle tier is raised to C=20: tier capacities C=15, C=20, C=10.1; 10.1 req/s served, 3.9 req/s dropped]

Model-based Provisioning
  Black box approach
    o Treat application as a black box
    o Measure response time from outside
    o Increase allocation if response time > SLA
    o Use a model to determine how much to allocate
  Strawman #2: use black box for multi-tier apps
  Problems:
    o Unclear which tier needs more capacity
    o May not increase goodput if bottleneck tier is not replicable
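
The numbers in the strawman example can be checked with a short sketch. It assumes a simple fluid model, which the slides do not state: each tier forwards at most its capacity and drops the rest. The function names `served_rates` and `goodput` are illustrative only.

```python
def served_rates(offered, capacities):
    """Return the request rate leaving each tier of a pipeline.

    Assumption (not from the slides): a tier forwards min(input, capacity)
    and silently drops the excess.
    """
    rates = []
    rate = offered
    for capacity in capacities:
        rate = min(rate, capacity)
        rates.append(rate)
    return rates


def goodput(offered, capacities):
    """Goodput is the rate that survives the last tier."""
    return served_rates(offered, capacities)[-1]


# Before: the middle tier (C=10) is the bottleneck.
before = goodput(14.0, [15.0, 10.0, 10.1])
# After adding capacity only to the middle tier (C=20): the bottleneck
# shifts to the last tier (C=10.1), so goodput barely improves.
after = goodput(14.0, [15.0, 20.0, 10.1])
print(before, after)
```

Under this assumption, raising the middle tier from 10 to 20 req/s moves goodput only from 10 to 10.1 req/s, which is why provisioning has to look at all tiers together.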

Provisioning Multi-tier Apps
  Approach: holistic view of multi-tier application
    o Determine tier-specific capacity independently
    o Allocate capacity by looking at all tiers (and other apps)
  Predictive provisioning
    o Long-term provisioning: time scale of hours
    o Maintain long-term workload statistics
    o Predict and provision for the next few hours
  Reactive provisioning
    o Short-term provisioning: time scale of several minutes
    o React to "current" workload trends
    o Correct errors of long-term provisioning
    o Handle flash crowds (inherently unpredictable)

Predictive Provisioning
  Workload predictor
    o Predicts workload based on past observations
  Application model
    o Infers capacity needed to handle given workload
  [Figure: past workload -> Predictor -> predicted workload -> Model -> required capacity; the Model also takes the response time target]

Workload Prediction
  Long-term workload monitoring and prediction
    o Monitor workload for multiple days
    o Maintain a histogram for each hour of the day
    o Capture time of day effects
  Forecast based on
    o Observed workload for that hour in the past
    o Observed workload for the past few hours of the current day
  Predict a high percentile of expected workload
  [Figure: per-hour workload histograms for Mon, Tue, Wed, and Today]

Model-based Capacity Inference
  [Figure: predicted workload λ_pred feeding a chain of G/G/1 queues]
  Queuing theoretic application model
    o Each individual server is a G/G/1 queue
    o \lambda \ge \left[ E(s) + \frac{\sigma_a^2 + \sigma_b^2}{2\,(E(r) - E(s))} \right]^{-1}
      (E(s): mean service time; E(r): mean response time; \sigma_a^2, \sigma_b^2: variances of inter-arrival and service time)
  Derive per-tier E(r) from end-to-end SLA
  Monitor other parameters and determine λ (per-server capacity)
  Use predicted workload λ_pred to determine # servers per tier
    o Assumes perfect load balancing in each tier

Reactive Provisioning
  [Figure: time series of λ_actual and λ_pred; when the prediction error λ_error exceeds τ, invoke the reactor and allocate servers]
  Idea: react to current conditions
    o Useful for capturing significant short-term fluctuations
    o Can correct errors in predictions
  Track error between long-term predictions and actual workload
    o Allocate additional servers if error exceeds a threshold
    o Account for prediction errors
  Can be invoked if request drop rate exceeds a threshold
    o Handles sudden flash crowds
  Operates over time scale of a few minutes
  Pure reactive provisioning: lags workload
    o Reactive + predictive more effective!

Prototype Data Center
  [Figure: control plane and server nodes, each node running a nucleus and applications on its OS; labels: resource monitoring, parameter estimation, dynamic provisioning]
  40+ Linux servers
  Gigabit switches
  Multi-tier applications
    o Auction (RUBiS)
    o Bulletin-board (RUBBoS)
    o Apache, Tomcat (replicable)
    o MySQL database

Only Predictive Provisioning
  Auction application RUBiS
  Factor of 4 increase in 30 min
  [Figures: Workload, arrivals per min vs. time (min); Response time, resp time (msec) vs. time (min)]
  Predictor fails during the interval [15, 30] min, resulting in under-provisioning
  Response time violations occur

Only Reactive Provisioning
  Auction application RUBiS
  Factor of 4 increase in 30 min
  [Figures: Workload, arrivals per min vs. time (min); Response time, resp time (msec) vs. time (min)]
  Response time shows oscillatory behavior
  Several response time violations occur

Predictive + Reactive Provisioning
  Auction application RUBiS
  Factor of 4 increase in 30 min
  [Figures: Workload, arrivals per min vs. time (min); Server allocations, number of web and app servers vs. time (min); Response time, resp time (msec) vs. time (min)]
  Server allocations increased to match increased workload
  Response time kept below 2 seconds

Summary
  Dynamic provisioning for multi-tier applications
  Flexible queuing theoretic model
    o Captures all tiers in the application
  Predictive provisioning
  Reactive provisioning
  Implementation and evaluation on a Linux cluster

Thank you!
  More information at: http://www.cs.umass.edu/~bhuvan
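
As a rough illustration of the predictor-plus-model pipeline described in the Predictive Provisioning and Model-based Capacity Inference slides, the sketch below picks a high percentile of past per-hour workload and converts it into a server count for one tier using the G/G/1 bound. It is a sketch under stated assumptions, not the authors' implementation: the nearest-rank percentile stands in for the per-hour histogram, all function names are hypothetical, and the sample numbers are invented for the example.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile (0 < p <= 100) of a non-empty list.

    Stand-in for the per-hour histogram on the slides (an assumption).
    """
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def per_server_capacity(mean_service, target_response,
                        var_interarrival, var_service):
    """Request rate one G/G/1 server can sustain within target_response.

    Implements lambda = 1 / (E(s) + (var_a + var_b) / (2 * (E(r) - E(s)))).
    """
    if target_response <= mean_service:
        raise ValueError("target response time must exceed mean service time")
    queueing = (var_interarrival + var_service) / (
        2.0 * (target_response - mean_service))
    return 1.0 / (mean_service + queueing)


def servers_needed(predicted_rate, capacity):
    """Servers for one tier, assuming perfect load balancing in the tier."""
    return math.ceil(predicted_rate / capacity)


# Invented history: request rates (req/s) seen for this hour on past days.
history = [40, 55, 60, 48, 52, 70, 45, 58, 62, 50]
predicted = percentile(history, 90)

# Invented per-tier parameters, all in seconds (variances in seconds^2).
capacity = per_server_capacity(mean_service=0.02, target_response=0.1,
                               var_interarrival=0.0004, var_service=0.0004)
print(predicted, servers_needed(predicted, capacity))
```

Provisioning for a high percentile rather than the mean trades some extra servers for fewer under-provisioned hours; the reactive mechanism is still needed for whatever error remains.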