Budget-based Control for Interactive Services with Partial Execution Yuxiong He, Zihao Ye, Qiang Fu, Sameh Elnikety Microsoft Research.


Budget-based Control for Interactive
Services with Partial Execution
Yuxiong He, Zihao Ye, Qiang Fu, Sameh Elnikety
Microsoft Research
1
Motivation
• Interactive services specify stringent SLAs on response time
• Long response times cause user dissatisfaction and revenue
loss
• Important to bound response time (e.g. mean, 95th percentile)
• Address two challenges
• Adapt to a dynamic, changing environment
• Achieve high response quality
GOAL:
Develop a self-managed scheduling system to meet
response time target while achieving high quality.
2
Existing Techniques (1)
• Static admission control approach
– Define a fixed queue length limit; drop requests when
queue is full.
• Issues
– Only works under a static system.
– Determining an appropriate queue-length for every
setting and load is challenging.
• Small queue length => underutilize resources
• Large queue length => long response time
– Cannot adapt to a dynamic, changing environment.
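As an illustration of the static approach above, here is a minimal Python sketch of a fixed queue-length limit. The class name `StaticAdmissionQueue` and the `limit` parameter are hypothetical and chosen only for this example; the slides do not describe an implementation.

```python
from collections import deque


class StaticAdmissionQueue:
    """Fixed-limit admission control: drop arrivals when the queue is full."""

    def __init__(self, limit):
        self.limit = limit      # fixed queue-length limit (hypothetical parameter)
        self.queue = deque()
        self.dropped = 0        # number of requests rejected so far

    def offer(self, request):
        """Admit the request if there is room; otherwise drop it."""
        if len(self.queue) >= self.limit:
            self.dropped += 1
            return False
        self.queue.append(request)
        return True

    def take(self):
        """Return the next request to serve, or None if the queue is empty."""
        return self.queue.popleft() if self.queue else None


q = StaticAdmissionQueue(limit=2)
results = [q.offer(r) for r in ("a", "b", "c")]  # the third arrival finds the queue full
```

Because `limit` never changes, the same value has to serve every load level, which is the tuning problem the slide points out.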
3
Existing Techniques (2)
• Classic feedback control approach:
– Feedback control on queue length
• Decrease queue length when response time is above
target
– Issue
• Dropping requests results in degraded quality
• Does not consider partial execution of requests
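A rough sketch of feedback control on queue length might look as follows. The slide only states that the limit is decreased when response time is above target; the symmetric increase, the fixed `step`, and the bounds are assumptions made for this example.

```python
def adjust_queue_limit(limit, observed_ms, target_ms,
                       step=1, min_limit=1, max_limit=1000):
    """Return a new queue-length limit from one response-time observation.

    step, min_limit and max_limit are hypothetical tuning parameters.
    """
    if observed_ms > target_ms:
        # Too slow: admit fewer requests.
        return max(min_limit, limit - step)
    if observed_ms < target_ms:
        # Headroom available: admit more requests (assumed behavior).
        return min(max_limit, limit + step)
    return limit


shrunk = adjust_queue_limit(10, observed_ms=40, target_ms=35)
grown = adjust_queue_limit(10, observed_ms=30, target_ms=35)
```

Note that the only lever here is dropping whole requests, so every correction costs quality.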
4
Partial Execution & Response Quality
• Incomplete execution of requests may still return
meaningful partial results
• Many interactive services support partial execution
– Web search, web server, video streaming, finance server
• Quality profile
– A function that maps request execution time to response quality
[Figure: quality vs. time, with the request completion time marked; (a) without partial execution, (b) with partial execution]
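The two cases can be sketched as Python functions. The step profile follows directly from "no partial results"; the quadratic form used for the partial-execution case is only a hypothetical concave shape chosen for illustration, not a profile taken from the slides.

```python
def quality_without_partial(t, demand):
    """Step profile: no quality until the request runs to completion."""
    return 1.0 if t >= demand else 0.0


def quality_with_partial(t, demand):
    """Hypothetical concave profile: quality grows with execution time,
    fastest at the start, and reaches 1.0 at completion."""
    x = min(max(t / demand, 0.0), 1.0)   # normalized processing time in [0, 1]
    return 1.0 - (1.0 - x) ** 2


half_step = quality_without_partial(5, 10)
half_partial = quality_with_partial(5, 10)
```

Under this assumed shape, a request stopped halfway returns nothing without partial execution but retains most of its quality with it.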
5
Our Contributions
• Propose a budget-based control model for
interactive services with partial execution
– Use feedback control to meet response time target
– Apply optimization procedure to improve response quality
• Exploit partial execution and request quality profile
• Evaluation
– Implementation at Bing search server
– Simulation on finance server
6
Budget-based Control Model
• Control Variable
– Budget: amount of computation time for all pending
requests
• Control mechanism
– Determine the budget based on response time
feedback
– Control budget to meet response time
• Optimization procedure
– Given a budget, assign processing time to requests
– Exploit partial results of a query
– Scheduling to improve quality
7
Control Mechanism
• Basic idea
– If response time is above the target, decrease the budget
– If response time is below the target, increase the budget
• Criteria
– Meet response time target accurately and quickly
– Incur little runtime overhead.
8
Control Mechanism: Background
• Integral control
– Adjust budget based on the difference between the
observed and target response time
– Advantage: eliminate steady-state error
– Limitation: response is slow (long settling time)
• Adaptive control
– Model estimator + Linear quadratic optimal controller
– Advantage: quick adaptation, fast response
– Limitation: computationally expensive, steady-state error
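A minimal sketch of the integral control idea, assuming a single scalar gain: each observation moves the budget by an amount proportional to the error, so corrections accumulate over time. The class name, the `gain` value, and the lower bound are assumptions for illustration; the slides do not give the controller's parameters.

```python
class IntegralBudgetController:
    """Integral control: accumulate (target - observed) into the budget."""

    def __init__(self, target_ms, gain, initial_budget_ms, min_budget_ms=0.0):
        self.target_ms = target_ms
        self.gain = gain                      # hypothetical integral gain
        self.budget_ms = initial_budget_ms
        self.min_budget_ms = min_budget_ms    # hypothetical lower bound

    def update(self, observed_ms):
        """Adjust and return the budget after one response-time observation."""
        error = self.target_ms - observed_ms  # negative when the response is too slow
        self.budget_ms = max(self.min_budget_ms,
                             self.budget_ms + self.gain * error)
        return self.budget_ms


ctrl = IntegralBudgetController(target_ms=35, gain=0.5, initial_budget_ms=100.0)
after_slow = ctrl.update(45)      # above target: budget shrinks
after_on_target = ctrl.update(35) # zero error: budget unchanged
after_fast = ctrl.update(25)      # below target: budget grows
```

The budget stops changing only when the error is zero, which is why this form removes steady-state error; a small gain keeps it stable but makes it slow to settle.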
9
Control Mechanism: Hybrid Control
• Combine the integral and adaptive control
• Run adaptive control periodically at a coarse-grained time interval
• Use integral control at the execution of each
request for fine-grained adjustment
• Meet our goal
– Quick and accurate adaptation
– Little runtime overhead.
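The structure of the hybrid scheme can be sketched as below. This is only an outline under stated assumptions: the coarse-grained period is counted in requests rather than time, and the adaptive controller (model estimator plus linear quadratic optimal controller) is replaced by a pluggable callable. The stand-in `proportional_rescale` is a hypothetical placeholder, not the adaptive controller from the talk.

```python
class HybridBudgetController:
    """Integral step per request, plus a periodic coarse-grained step."""

    def __init__(self, target_ms, gain, initial_budget_ms, period, coarse_step):
        self.target_ms = target_ms
        self.gain = gain                # hypothetical integral gain
        self.budget_ms = initial_budget_ms
        self.period = period            # requests between coarse steps (assumption)
        self.coarse_step = coarse_step  # callable standing in for adaptive control
        self.history = []               # (budget, observed) pairs in this period

    def on_request_done(self, observed_ms):
        self.history.append((self.budget_ms, observed_ms))
        if len(self.history) >= self.period:
            # Coarse-grained step: recompute the budget from the period's history.
            self.budget_ms = self.coarse_step(self.history, self.target_ms)
            self.history = []
        else:
            # Fine-grained step: cheap integral correction.
            self.budget_ms += self.gain * (self.target_ms - observed_ms)
        self.budget_ms = max(0.0, self.budget_ms)
        return self.budget_ms


def proportional_rescale(history, target_ms):
    """Hypothetical stand-in: scale the mean budget by target / mean observed."""
    mean_budget = sum(b for b, _ in history) / len(history)
    mean_observed = sum(r for _, r in history) / len(history)
    if mean_observed <= 0:
        return mean_budget
    return mean_budget * target_ms / mean_observed


hybrid = HybridBudgetController(35, 0.5, 100.0, period=3,
                                coarse_step=proportional_rescale)
b1 = hybrid.on_request_done(45)  # integral step
b2 = hybrid.on_request_done(45)  # integral step
b3 = hybrid.on_request_done(45)  # third request triggers the coarse step
```

The design point is the split in cost: the expensive step runs rarely, while the per-request step is a single multiply-add.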
10
Optimization Procedure
• Objective: maximize total response quality
• Input: budget, pending requests
• Output: assigned processing time to requests
• Optimization procedure depends on applications
Bing index server
• Core part of Bing search
– For a user query, match and rank docs, return top results
• Concave quality profile
– The first half of request execution yields a higher quality gain
than the second half.
[Figure: quality (0 to 1) vs. normalized processing time (0 to 1)]
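One way to see why a concave profile favors spreading the budget: assume, purely for illustration, that all n pending requests share the same concave quality function q and that a budget B is split into processing times t_1, ..., t_n. Jensen's inequality then bounds the total quality by the equal split. The symbols q, B, n, and t_i are notation introduced here, not taken from the slides.

```latex
\[
\sum_{i=1}^{n} q(t_i)
\;\le\;
n\, q\!\left(\frac{1}{n}\sum_{i=1}^{n} t_i\right)
= n\, q\!\left(\frac{B}{n}\right),
\qquad \text{subject to } \sum_{i=1}^{n} t_i = B,\quad t_i \ge 0 .
\]
```

Equality holds when every t_i equals B/n. This simplified argument ignores differing service demands across requests, so it motivates equal sharing rather than proving it optimal in the real system.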
12
Optimization Procedure for Index Server
• Run the portion of requests with higher gain
• Prevent long requests from starving short ones
• Combine two techniques
– Reservation at light load:
• Reserve time for later requests in the queue based on
mean service demand
– Equal sharing at heavy load:
• Allocate resource equally among requests
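The two techniques can be sketched as follows. The slides do not say how light and heavy load are distinguished or how unused shares are handled, so the threshold (budget at least n times the mean service demand) and the redistribution of leftover time in `equal_share` are assumptions made for this example; all function names are hypothetical.

```python
def reserve_for_later(budget_ms, demands_ms, mean_demand_ms):
    """Light load: serve in queue order, holding back the mean service
    demand for each request still waiting behind the current one."""
    alloc, remaining = [], budget_ms
    n = len(demands_ms)
    for i, demand in enumerate(demands_ms):
        reserved = mean_demand_ms * (n - i - 1)
        grant = min(demand, max(0, remaining - reserved))
        alloc.append(grant)
        remaining -= grant
    return alloc


def equal_share(budget_ms, demands_ms):
    """Heavy load: split the budget equally, capped at each request's demand;
    time a short request does not need is shared among the rest."""
    alloc = [0.0] * len(demands_ms)
    remaining = budget_ms
    order = sorted(range(len(demands_ms)), key=lambda i: demands_ms[i])
    for k, i in enumerate(order):
        share = remaining / (len(order) - k)
        alloc[i] = min(demands_ms[i], share)
        remaining -= alloc[i]
    return alloc


def allocate(budget_ms, demands_ms, mean_demand_ms):
    """Pick a technique using an assumed load threshold."""
    if not demands_ms:
        return []
    if budget_ms >= len(demands_ms) * mean_demand_ms:
        return reserve_for_later(budget_ms, demands_ms, mean_demand_ms)
    return equal_share(budget_ms, demands_ms)


light = allocate(50, [40, 10, 10], mean_demand_ms=10)
heavy = allocate(30, [5, 20, 20], mean_demand_ms=20)
```

In the light-load case the long request at the head of the queue is trimmed so the two short requests behind it still run in full; in the heavy-load case each request gets the early, high-gain portion of its execution.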
13
Evaluation
• Implemented and evaluated at Bing index server
– Meet response time target and achieve high quality
• Simulation study on finance server
– Double system throughput at desired quality
14
Bing Index Server
• Implementations
– BudgetIS
• Feedback control on budget
• Hybrid control + optimization procedure
– QueueIS
• Feedback control on queue length
• Evaluation
– Production trace
15
Compare Queue vs. Budget Approach
[Figure: mean response time (ms, 0 to 50) vs. QPS (200 to 500) for BudgetIS and QueueIS; annotation: mean response time = 35ms]
[Figure: average quality (0.65 to 1) vs. QPS (200 to 500) for BudgetIS and QueueIS]
Budget approach
• Meet response time accurately
• Achieve high quality
16
Conclusion
• Propose a budget-based control optimization model
for interactive services with partial execution
– Hybrid control mechanism to meet response time target
– Optimization procedure to improve response quality
• Evaluation
– Implemented and evaluated at Bing index server
• Meet response time target and achieve high quality
– Simulation study on finance server
• Double system throughput at desired quality
17
Thank you & Questions?
18