Budget-based Control for Interactive Services with Partial Execution
Yuxiong He, Zihao Ye, Qiang Fu, Sameh Elnikety
Microsoft Research

Motivation
• Interactive services specify stringent SLAs on response time
• Long response times cause user dissatisfaction and revenue loss
• Important to bound response time (e.g. mean, 95th percentile)
• Address two challenges
  – Adapt to a dynamic and changing environment
  – Achieve high response quality
GOAL: Develop a self-managed scheduling system to meet the response time target while achieving high quality.

Existing Techniques (1)
• Static admission control approach
  – Define a fixed queue length limit; drop requests when the queue is full.
• Issues
  – Only works under a static system.
  – Determining an appropriate queue length for every setting and load is challenging.
    • Small queue length => underutilized resources
    • Large queue length => long response time
  – Cannot adapt to a dynamic and changing environment.

Existing Techniques (2)
• Classic feedback control approach
  – Feedback control on queue length
    • Decrease queue length when response time is above target
  – Issue
    • Dropping requests results in degraded quality
    • Does not consider partial execution of requests

Partial Execution & Response Quality
• Incomplete execution of requests may still return meaningful partial results
• Many interactive services support partial execution
  – Web search, web server, video streaming, finance server
• Quality profile
  – A function that maps request execution time to response quality
[Figure: quality vs. time, with the request completion time marked: (a) without partial execution, (b) with partial execution]

Our Contributions
• Propose a budget-based control model for interactive services with partial execution
  – Use feedback control to meet response time target
  – Apply optimization procedure to improve response quality
    • Exploit partial execution and request quality profile
• Evaluation
  – Implementation at Bing search server
  – Simulation on
finance server

Budget-based Control Model
• Control variable
  – Budget: amount of computation time for all pending requests
• Control mechanism
  – Determine the budget based on response time feedback
  – Control budget to meet response time
• Optimization procedure
  – Given a budget, assign processing time to requests
  – Exploit partial results of a query
  – Scheduling to improve quality

Control Mechanism
• Basic idea
  – If response time is larger than target, use a smaller budget
  – If response time is smaller than target, use a larger budget
• Criteria
  – Meet response time target accurately and quickly
  – Incur little runtime overhead.

Control Mechanism: Background
• Integral control
  – Adjust budget based on the difference between the observed and target response time
  – Advantage: eliminates steady-state error
  – Limitation: response is slow (long settling time)
• Adaptive control
  – Model estimator + linear quadratic optimal controller
  – Advantage: quick adaptation, fast response
  – Limitation: computationally expensive, steady-state error

Control Mechanism: Hybrid Control
• Combine the integral and adaptive control
• Run adaptive control periodically at a coarse-grain time interval
• Use integral control at the execution of each request for fine-grain adjustment
• Meet our goal
  – Quick and accurate adaptation
  – Little runtime overhead.

Optimization Procedure
• Objective: maximize total response quality
• Input: budget, pending requests
• Output: processing time assigned to requests
• Optimization procedure depends on the application

Bing index server
• Core part of Bing search
  – For a user query, match and rank docs, return top results
• Concave quality profile
  – The first half of request execution yields a higher quality gain than the second half.
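
The integral-control rule from the Control Mechanism slides (shrink the budget when the observed response time is above the target, grow it when below) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `IntegralBudgetController`, the gain `ki`, the budget bounds, and the numbers in the usage lines are all hypothetical, and the adaptive part of the hybrid controller (model estimator plus linear quadratic controller) is left out.

```python
from dataclasses import dataclass


@dataclass
class IntegralBudgetController:
    """Hypothetical integral controller on the budget (all values in ms)."""
    target_ms: float               # response time target
    budget_ms: float               # current budget for all pending requests
    ki: float = 0.5                # integral gain (assumed; would need tuning)
    min_budget_ms: float = 1.0     # assumed lower bound on the budget
    max_budget_ms: float = 1000.0  # assumed upper bound on the budget

    def update(self, observed_ms: float) -> float:
        """Adjust the budget by the response time error and return it."""
        error = self.target_ms - observed_ms  # positive when under target
        self.budget_ms += self.ki * error     # accumulate error (integral action)
        # Clamp so the budget stays within the assumed bounds.
        self.budget_ms = max(self.min_budget_ms,
                             min(self.max_budget_ms, self.budget_ms))
        return self.budget_ms


# Example values are illustrative only.
controller = IntegralBudgetController(target_ms=35.0, budget_ms=100.0)
after_slow = controller.update(observed_ms=45.0)  # above target: budget shrinks
after_fast = controller.update(observed_ms=25.0)  # below target: budget grows
print(after_slow, after_fast)
```

Because the update only adds a scaled error to the running budget, it is cheap enough to run on every request. That fits the role the slides give integral control in the hybrid scheme: fine-grain adjustment between the periodic, costlier adaptive-control runs.
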
[Figure: quality profile of the Bing index server; x-axis: normalized processing time (0 to 1), y-axis: quality (0 to 1)]

Optimization Procedure for Index Server
• Run the portion of requests with higher gain
• Prevent long requests from starving short ones
• Combine two techniques
  – Reservation at light load:
    • Reserve time for later requests in the queue based on mean service demand
  – Equal sharing at heavy load:
    • Allocate resource equally among requests

Evaluation
• Implemented and evaluated at Bing index server
  – Meet response time target and achieve high quality
• Simulation study on finance server
  – Double system throughput at desired quality

Bing Index Server
• Implementations
  – BudgetIS
    • Feedback control on budget
    • Hybrid control + optimization procedure
  – QueueIS
    • Feedback control on queue length
• Evaluation
  – Production trace

Compare Queue vs. Budget Approach
[Figure: mean response time (ms) vs. QPS (200 to 500) for BudgetIS and QueueIS; plot label: "Mean response time = 35ms"]
[Figure: average quality vs. QPS (200 to 500) for BudgetIS and QueueIS]
Budget approach
• Meet response time accurately
• Achieve high quality

Conclusion
• Propose a budget-based control optimization model for interactive services with partial execution
  – Hybrid control mechanism to meet response time target
  – Optimization procedure to improve response quality
• Evaluation
  – Implemented and evaluated at Bing index server
    • Meet response time target and achieve high quality
  – Simulation study on finance server
    • Double system throughput at desired quality

Thank you & Questions?
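
The two allocation techniques named on the "Optimization Procedure for Index Server" slide (reservation at light load, equal sharing at heavy load) can be illustrated with a small sketch. The slides do not give the exact algorithm, so everything here is an assumption made for illustration: the function name `assign_processing_time`, the light/heavy test (`budget >= n * mean_demand`), and the capping of each allocation at the request's demand.

```python
from typing import List


def assign_processing_time(budget: float,
                           demands: List[float],
                           mean_demand: float) -> List[float]:
    """Split a budget (ms) among pending requests, given in queue order.

    Hypothetical rule: if the budget covers the mean demand of every
    pending request, treat the load as light and use reservation;
    otherwise share the budget equally.
    """
    n = len(demands)
    if n == 0:
        return []

    if budget >= n * mean_demand:
        # Reservation (light load): each request may run as long as it
        # wants, minus one mean demand held back per later request.
        allocation = []
        remaining = budget
        for i, demand in enumerate(demands):
            reserved = (n - i - 1) * mean_demand
            granted = min(demand, max(0.0, remaining - reserved))
            allocation.append(granted)
            remaining -= granted
        return allocation

    # Equal sharing (heavy load): same share for every request, capped
    # at its demand. Redistributing unused share is omitted for brevity.
    share = budget / n
    return [min(demand, share) for demand in demands]


# Example values are illustrative only.
light = assign_processing_time(100.0, [90.0, 10.0, 20.0], mean_demand=20.0)
heavy = assign_processing_time(30.0, [90.0, 10.0, 20.0], mean_demand=20.0)
print(light, heavy)
```

In the light-load call the long first request is cut short so that the later requests keep their reserved time, which is one way to stop long requests from starving short ones. In the heavy-load call every request gets the same share; with a concave quality profile, that share covers the early, higher-gain part of each request's execution.
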