Scheduling in Server Farms
Mor Harchol-Balter
Computer Science Dept
Carnegie Mellon University
Outline
I. Review of scheduling in single-server (today)
II. Supercomputing: Router → FCFS hosts (today)
III. Web server farm model: Router → PS hosts (tomorrow)
IV. Towards Optimality …: SRPT & Router → SRPT hosts (tomorrow)
Metric: Mean Response Time, E[T]
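Single Server Model (M/G/1)
Poisson arrival process w/ rate $\lambda$
Load $\rho = \lambda E[X] < 1$
X: job size (service requirement)
[Figure: job size distributions with different squared coefficients of variation $C_X^2$; legible labels: $C_X^2 = 8$, 1, ½, ¼]
Bounded Pareto: $\Pr\{\text{Job size} > x\} \sim 1/x^{\alpha}$, for $k \le x \le p$
• Supercomputing job sizes [Schroeder, Harchol-Balter 00]
• D.F.R. (decreasing failure rate)
• $C_X^2 = \mathrm{Var}(X)/E[X]^2 \approx 50$ is typical
• Top-heavy: top 1% of jobs make up half the load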
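The Bounded Pareto quantities on this slide can be explored numerically. The sketch below uses the standard closed-form partial moments of a Bounded Pareto density on $[k, p]$ with exponent $\alpha$. The function names and the parameter values in the usage lines are illustrative assumptions, not taken from the slides.

```python
import math

def partial_moment(j, a, b, k, p, alpha):
    """Integral of x^j f(x) dx over [a, b] for a Bounded Pareto(k, p, alpha) density
    f(x) = alpha * k^alpha * x^(-alpha-1) / (1 - (k/p)^alpha)."""
    norm = alpha * k**alpha / (1.0 - (k / p) ** alpha)
    if math.isclose(j, alpha):
        return norm * math.log(b / a)
    return norm * (b ** (j - alpha) - a ** (j - alpha)) / (j - alpha)

def squared_cv(k, p, alpha):
    """Squared coefficient of variation C^2 = E[X^2] / E[X]^2 - 1."""
    m1 = partial_moment(1, k, p, k, p, alpha)
    m2 = partial_moment(2, k, p, k, p, alpha)
    return m2 / m1**2 - 1.0

def load_share_of_largest(frac, k, p, alpha):
    """Fraction of total work contributed by the largest `frac` of jobs."""
    tail = (k / p) ** alpha
    # Size x with Pr{X > x} = frac, from the Bounded Pareto tail formula.
    cutoff = k / (frac * (1.0 - tail) + tail) ** (1.0 / alpha)
    total = partial_moment(1, k, p, k, p, alpha)
    return partial_moment(1, cutoff, p, k, p, alpha) / total

# Illustrative parameters only (assumed, not from the talk).
print(squared_cv(1.0, 1e6, 1.1))
print(load_share_of_largest(0.01, 1.0, 1e6, 1.1))
```

Varying `alpha` and the range `[k, p]` shows how quickly both the variability and the share of load carried by the largest jobs grow as the tail gets heavier.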
Scheduling Single Server (M/G/1)
Poisson arrival process, load $\rho < 1$, huge variance in job size.
Question: Order these scheduling policies for mean response time, E[T]:
1. FCFS (First-Come-First-Served, non-preemptive)
2. PS (Processor-Sharing, preemptive)
3. SJF (Shortest-Job-First, a.k.a. SPT, non-preemptive)
4. SRPT (Shortest-Remaining-Processing-Time, preemptive)
5. LAS (Least Attained Service, a.k.a. FB, preemptive)
Scheduling Single Server (M/G/1)
Poisson arrival process, load $\rho < 1$, huge variance in job size.
LOW E[T] → HIGH E[T]: SRPT < LAS < PS < SJF < FCFS
• SRPT: OPT for all arrival sequences [Schrage 67].
• LAS: Requires D.F.R. [Righter, Shanthikumar 89].
• PS: Insensitive to $E[X^2]$.
• SJF: Surprisingly bad ($E[X^2]$ term).
• FCFS: $\sim E[X^2]$ (shorts caught behind longs).
No "Starvation!" Even the biggest jobs prefer SRPT to PS [Bansal, Harchol-Balter 01], [Wierman, Harchol-Balter 03]:
THM: $E[T(x)]^{SRPT} < E[T(x)]^{PS}$ for all x, for Bounded Pareto, $\rho < .9$.
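To make the FCFS versus SRPT comparison concrete on a single arrival sequence, here is a minimal single-server sketch. The job list format `(arrival_time, size)` and the function names are assumptions made for illustration. It compares two of the five policies on one sample path and is not an M/G/1 analysis.

```python
import heapq

def mean_response_fcfs(jobs):
    """Mean response time under non-preemptive FCFS.
    jobs: list of (arrival_time, size), sorted by arrival time."""
    t = 0.0
    total = 0.0
    for arrival, size in jobs:
        t = max(t, arrival) + size      # job starts when the server frees up
        total += t - arrival
    return total / len(jobs)

def mean_response_srpt(jobs):
    """Mean response time under preemptive SRPT on one server."""
    jobs = sorted(jobs)
    heap = []                           # entries: (remaining_size, arrival_time)
    t, total, i, n = 0.0, 0.0, 0, len(jobs)
    while i < n or heap:
        if not heap:                    # server idle: jump to the next arrival
            t = max(t, jobs[i][0])
        while i < n and jobs[i][0] <= t:
            heapq.heappush(heap, (jobs[i][1], jobs[i][0]))
            i += 1
        remaining, arrival = heapq.heappop(heap)
        next_arrival = jobs[i][0] if i < n else float("inf")
        run = min(remaining, next_arrival - t)   # run until done or next arrival
        t += run
        if run < remaining:
            heapq.heappush(heap, (remaining - run, arrival))  # possibly preempted
        else:
            total += t - arrival
    return total / n

jobs = [(0, 10), (1, 1)]                # a long job, then a short one behind it
print(mean_response_fcfs(jobs))         # 10.0
print(mean_response_srpt(jobs))         # 6.0
```

The short job waits behind the long one under FCFS, while SRPT preempts the long job at time 1 and delays it by only one time unit.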
Effect of Variability
[Figure: E[T] versus $C^2$ under PS, LAS, and SRPT; Bounded Pareto job sizes; load $\rho$]
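The role of variability can be seen from two standard M/G/1 results: the Pollaczek-Khinchine mean for FCFS, which contains an $E[X^2]$ term, and the PS mean, which depends only on $E[X]$. The sketch below evaluates both. The function names and the numeric inputs are illustrative assumptions.

```python
def mg1_fcfs_mean_response(lam, mean_size, c2):
    """M/G/1/FCFS: E[T] = E[X] + lam * E[X^2] / (2 * (1 - rho)),
    with E[X^2] = (C^2 + 1) * E[X]^2."""
    rho = lam * mean_size
    if not 0 <= rho < 1:
        raise ValueError("need load rho < 1")
    second_moment = (c2 + 1.0) * mean_size**2
    return mean_size + lam * second_moment / (2.0 * (1.0 - rho))

def mg1_ps_mean_response(lam, mean_size):
    """M/G/1/PS: E[T] = E[X] / (1 - rho), independent of C^2."""
    rho = lam * mean_size
    if not 0 <= rho < 1:
        raise ValueError("need load rho < 1")
    return mean_size / (1.0 - rho)

# Same load, increasing variability (values chosen only for illustration).
for c2 in (1.0, 10.0, 50.0):
    print(c2, mg1_fcfs_mean_response(0.5, 1.0, c2), mg1_ps_mean_response(0.5, 1.0))
```

At $C^2 = 1$ the two formulas agree. As $C^2$ grows, only the FCFS value increases.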
Closed vs. Open Systems
[Diagram: Open System vs. Closed System (users cycle through Think, Send, Receive)]
QUESTION: What's the effect of scheduling?
Closed vs. Open Systems
[Figure: E[T] versus load $\rho$ under FCFS and SRPT; panels: Open System Results, Closed System Results]
Closed & open systems analyzed under same job size distribution, with same average load.
[Schroeder, Wierman, Harchol-Balter, NSDI 06]
Summary Single-Server
Single-server system: $\rho < 1$; X: job size, highly variable.
LESSONS LEARNED:
• Smart scheduling greatly improves mean response time.
• Variability of job size distribution is key.
• Closed system sees much reduced effect.
Multiserver Model
Server farms:
+ Cheap
+ Scalable capacity
[Diagram: incoming jobs (Poisson process) → Router (routing/assignment policy) → hosts, each with its own scheduling policy]
2 Policy Decisions: routing (assignment) policy and scheduling policy. (Sometimes the scheduling policy is fixed: legacy system.)
Outline
I. Review of scheduling in single-server
II. Supercomputing: Router → FCFS hosts
III. Web server farm model: Router → PS hosts
IV. Towards Optimality …: SRPT & Router → SRPT hosts
Metric: Mean Response Time, E[T]
Supercomputing Model
[Diagram: Poisson process → Router (routing/assignment policy) → FCFS hosts]
Jobs are not preemptible.
Jobs processed in FCFS order.
Assume hosts are identical.
Jobs i.i.d. ~ G: highly variable size distribution.
Size may or may not be known. Initially assume known.
Q: Compare Routing Policies for E[T]?
Setting: supercomputing model (Poisson process → Router → FCFS hosts); jobs i.i.d. ~ G, highly variable.
1. Round-Robin
2. Join-Shortest-Queue: go to host w/ fewest # jobs.
3. Least-Work-Left: go to host with least total work. Equivalent to M/G/k/FCFS.
4. Central-Queue-Shortest-Job (M/G/k/SJF): host grabs shortest job when free.
5. Size-Interval Splitting: jobs are split up by size among hosts.
A: Size-Interval Splitting: best so far
Setting: supercomputing model (Router → FCFS hosts), highly variable job sizes. Policies listed from high E[T] (1) to low E[T] (5):
1. Round-Robin
2. Join-Shortest-Queue: go to host w/ fewest # jobs.
3. Least-Work-Left: go to host with least total work. Equivalent to M/G/k/FCFS.
4. Central-Queue-Shortest-Job (M/G/k/SJF): host grabs shortest job when free.
5. Size-Interval Splitting: jobs are split up by size among hosts.
[Harchol-Balter, Crovella, Murta, JPDC 99]
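Three of these routing policies can be tried on a fixed arrival sequence with a small simulator for identical FCFS hosts. This is a sketch under stated assumptions: the `choose_host` callback signature, the function names, and the hand-picked job list are all hypothetical, and one short sequence says nothing about steady-state E[T].

```python
import bisect

def mean_response_fcfs_farm(jobs, num_hosts, choose_host):
    """jobs: list of (arrival_time, size) sorted by arrival time.
    choose_host(index, size, work_left) returns the host for each job."""
    free_at = [0.0] * num_hosts          # time at which each host drains its queue
    total = 0.0
    for index, (arrival, size) in enumerate(jobs):
        work_left = [max(0.0, f - arrival) for f in free_at]
        host = choose_host(index, size, work_left)
        free_at[host] = max(free_at[host], arrival) + size   # FCFS at the host
        total += free_at[host] - arrival
    return total / len(jobs)

def round_robin(index, size, work_left):
    return index % len(work_left)

def least_work_left(index, size, work_left):
    return min(range(len(work_left)), key=work_left.__getitem__)

def size_interval(cutoffs):
    """Host i serves sizes in [cutoffs[i-1], cutoffs[i]); needs len(cutoffs)+1 hosts."""
    def choose(index, size, work_left):
        return bisect.bisect_right(cutoffs, size)
    return choose

jobs = [(0, 100), (0, 100), (1, 1), (2, 1)]   # two long jobs, then two short ones
print(mean_response_fcfs_farm(jobs, 2, round_robin))            # 99.75
print(mean_response_fcfs_farm(jobs, 2, least_work_left))        # 99.75
print(mean_response_fcfs_farm(jobs, 2, size_interval([10])))    # 75.5
```

On this sequence the short jobs get stuck behind long ones under Round-Robin and Least-Work-Left, while the size cutoff isolates them on their own host at the cost of queueing the two long jobs together.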
Routing Policies: Remarks
Policies listed from high E[T] (1) to low E[T] (5):
1. Round-Robin
2. Join-Shortest-Queue: go to host w/ fewest # jobs.
3. Least-Work-Left: go to host with least total work. Equivalent to M/G/k/FCFS.
4. Central-Queue-Shortest-Job (M/G/k/SJF): host grabs shortest job when free.
5. Size-Interval Splitting: jobs are split up by size among hosts.
Central-Queue:
+ Good utilization of servers.
+ Some isolation for smalls.
Size-Interval: WAY better!
- Worse utilization of servers.
+ Great isolation for smalls!
[Harchol-Balter, Crovella, Murta, JPDC 99]
Size-Interval Splitting
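[Diagram: Size Interval Routing; the job size axis x is divided into intervals S, M, L, XL, one per host]
Question: How to choose the size cutoffs?
"To Balance Load or Not to Balance Load?"
Work contributed by each size interval, where f is the job size density:
$$\int_{S} x f(x)\,dx, \qquad \int_{M} x f(x)\,dx, \qquad \int_{L} x f(x)\,dx, \qquad \int_{XL} x f(x)\,dx$$
Balancing load means choosing the cutoffs so that these four integrals are equal.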
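For a Bounded Pareto job size density on $[k, p]$ with exponent $\alpha$, the load-balancing cutoffs have a closed form, because $\int_a^b x f(x)\,dx$ is proportional to $b^{1-\alpha} - a^{1-\alpha}$ (or to $\log(b/a)$ when $\alpha = 1$). The sketch below computes them. It covers only the load-balancing choice, the function names are assumptions, and the parameters in the usage lines are illustrative.

```python
import math

def interval_load(a, b, k, p, alpha):
    """Integral of x f(x) dx over [a, b] for a Bounded Pareto(k, p, alpha) density."""
    norm = alpha * k**alpha / (1.0 - (k / p) ** alpha)
    if math.isclose(alpha, 1.0):
        return norm * math.log(b / a)
    return norm * (b ** (1.0 - alpha) - a ** (1.0 - alpha)) / (1.0 - alpha)

def balanced_cutoffs(num_hosts, k, p, alpha):
    """Cutoffs k = x_0 < x_1 < ... < x_h = p giving every host the same load."""
    cuts = []
    for i in range(num_hosts + 1):
        frac = i / num_hosts
        if math.isclose(alpha, 1.0):
            x = k * (p / k) ** frac
        else:
            e = 1.0 - alpha
            x = (k**e + frac * (p**e - k**e)) ** (1.0 / e)
        cuts.append(x)
    return cuts

# Illustrative parameters (assumed): 4 hosts for the S, M, L, XL intervals.
cuts = balanced_cutoffs(4, 1.0, 1e6, 1.5)
loads = [interval_load(cuts[i], cuts[i + 1], 1.0, 1e6, 1.5) for i in range(4)]
print(cuts)
print(loads)   # four (nearly) equal values
```

Shifting these cutoffs down or up is one way to experiment with deliberately unbalanced load.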
Size-Interval Splitting
[Diagram: Size Interval Routing by job size x across FCFS hosts, with short (s) and long (L) jobs queued]
Answer: Recent research for the case of Bounded Pareto job size, $\Pr\{X > x\} \sim x^{-\alpha}$:
• $\alpha < 1$: UNBALANCE, favor smalls
• $\alpha = 1$: BALANCE LOAD
• $\alpha > 1$: UNBALANCE, favor larges
[Harchol-Balter, Vesilo, 06+], [Glynn, Harchol-Balter, Ramanan, 06+]
Beyond Size-Interval Splitting
[Diagram: Size Interval Routing by job size x across FCFS hosts, with short (s) and long (L) jobs queued]
Q: Is Size-Interval Splitting as good as it gets?
Size-Interval Splitting with Stealing
Answer: Allow Cycle Stealing!
Size Interval Routing with Cycle Stealing (two FCFS hosts):
• Short host (S): send Shorts here.
• Long host (L): send Longs here. But, if idle, send a Short.
Gain to Shorts is high. Pain to Longs is very small.
Cycle stealing analysis is very hard: Harrison, Borst, Williams, Fayolle, Iasnogorodski, Konheim, Meilijson, Melkman, Cohen, Boxma, van Uitert, Jelenkovic, Foley, McDonald, …
New easy approach: Dimensionality Reduction (2D → 1D) [Harchol-Balter, Osogami, Scheller-Wolf, Squillante SPAA03]
What if Don’t Know Job Size?
Setting: Router → FCFS hosts, highly variable job sizes.
1. Round-Robin
2. Join-Shortest-Queue: go to host w/ fewest # jobs.
3. Least-Work-Left: go to host with least total work. Equivalent to M/G/k/FCFS.
4. Central-Queue-Shortest-Job (M/G/k/SJF): host grabs shortest job when free.
5. Size-Interval Splitting: jobs are split up by size among hosts.
Q: What can we do to minimize E[T] when we don't know job size?
The TAGS algorithm: "Task Assignment by Guessing Size"
[Diagram: outside arrivals → Host 1 (size limit s) → Host 2 (size limit m) → Host 3]
Answer: When a job reaches the size limit for its host, it is killed and restarted from scratch at the next host.
[Harchol-Balter, JACM 02]
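The kill-and-restart rule can be traced for a single job. The sketch below follows one job through the hosts and reports where it finishes and how much total processing it consumed, including the work thrown away at earlier hosts. The function name and the example limits are assumptions for illustration. Queueing at the hosts is not modeled.

```python
def tags_path(size, limits):
    """Follow one job through TAGS.
    limits[i] is the size limit at host i; the last host (index len(limits))
    has no limit. Returns (finishing_host, total_work_consumed)."""
    work = 0.0
    for host, limit in enumerate(limits):
        if size <= limit:
            return host, work + size     # job completes at this host
        work += limit                    # killed at the limit; restarts from scratch
    return len(limits), work + size

limits = [10, 100]                       # illustrative values for s and m
print(tags_path(5, limits))              # (0, 5.0)
print(tags_path(50, limits))             # (1, 60.0)
print(tags_path(500, limits))            # (2, 610.0)
```

The router never needs to know the job size: the job's own running time reveals which host it belongs to, at the price of some wasted work for larger jobs.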
Results of Analysis
[Figure: results for Random, Least-Work-Left, and TAGS; panels: high variability, lower variability]
Summary – Part I
Single-server system: $\rho < 1$; X: job size, highly variable.
LESSONS LEARNED:
• Smart scheduling greatly improves mean response time.
• Variability of job size distribution is key.
• Closed system sees much reduced effect.
Supercomputing (Router → FCFS hosts)
LESSONS LEARNED:
• Greedy routing policies, like JSQ and LWL, are poor.
• To combat variability, need size-interval splitting.
• By isolating smalls, can achieve effects of smart single-server policies.
• Load UN-balancing.
• Don't need to know size.
Tomorrow …
I. Review of scheduling in single-server (M/GI/1)
II. Supercomputing/Manufacturing: Router → FCFS hosts
III. Web server farm model: Router → PS hosts
IV. Towards Optimality …: SRPT & Router → SRPT hosts
Homework
Scheduling in Multiserver Systems PART II
Mor Harchol-Balter
Computer Science Dept
Carnegie Mellon University
Outline
I. Review of scheduling in single-server
II. Supercomputing: Router → FCFS hosts
III. Web server farm model: Router → PS hosts
IV. Towards Optimality …: SRPT & Router → SRPT hosts
Web Server Farm Model
[Diagram: Poisson process → Router (routing policy) → PS hosts]
Router examples: Cisco Local Director, IBM Network Dispatcher, Microsoft SharePoint, F5 Labs BIG/IP
HTTP requests are immediately dispatched to a server.
Requests are fully preemptible.
Commodity servers are used; they do Processor-Sharing.
Jobs i.i.d. ~ G: highly variable size distribution, 7 orders of magnitude difference in job size [Crovella, Bestavros 98].
Q: Compare Routing Policies for E[T]?
Setting: web server farm (Router → PS hosts), high variance job size.
1. Random
2. Join-Shortest-Queue: go to host w/ fewest # jobs.
3. Least-Work-Left: go to host with least total work.
4. Size-Interval Splitting: jobs are split up by size among hosts.
For FCFS hosts, these ran from high E[T] (1) to low E[T] (4). For PS hosts: ?
(Central-Queue policies aren't possible for PS farms.)
Q: Compare Routing Policies for E[T]?
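Setting: Router → PS hosts.
1. Random
2. Join-Shortest-Queue: go to host w/ fewest # jobs.
3. Least-Work-Left: go to host with least total work.
4. Size-Interval Splitting: jobs are split up by size among hosts.
RAND vs. SIZE. Answer: Same for E[T], but not great. Also, want to balance load!
$$E[T]^{M/G/1/PS} = \frac{1}{\lambda} \cdot \frac{\rho}{1-\rho}$$
$$E[T]^{PS\ farm} = \sum_i p_i \cdot \frac{1}{\lambda p_i} \cdot \frac{\rho}{1-\rho}$$
Here $p_i$ is the fraction of jobs routed to host i, so host i sees arrival rate $\lambda p_i$, and each host runs at load $\rho$.
JSQ vs. LWL. Answer: Shortest Queue is greedier & better.
[Figure: E[T] under JSQ and LWL, 8 servers]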
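The claim that Random and Size-Interval routing give the same E[T] on a PS farm with balanced load can be checked with the M/G/1/PS mean, $E[T] = E[X]/(1-\rho)$, applied per host. In this sketch the per-host load is computed as $\lambda p_i E[X_i]$, where $E[X_i]$ is the mean size of jobs sent to host i. The function name and the two-host numbers are assumptions chosen so that both hosts carry equal load.

```python
import math

def ps_farm_mean_response(lam, probs, mean_sizes):
    """E[T] for a farm of PS hosts fed by splitting a Poisson stream of rate lam.
    probs[i]: fraction of jobs sent to host i.
    mean_sizes[i]: mean size of the jobs sent to host i."""
    total = 0.0
    for p_i, mean_i in zip(probs, mean_sizes):
        rho_i = lam * p_i * mean_i            # load at host i
        if rho_i >= 1.0:
            raise ValueError("host overloaded")
        total += p_i * mean_i / (1.0 - rho_i)  # p_i * E[T at host i]
    return total

lam = 0.5
# Random: half the jobs to each host, overall mean size 1.8 at both.
rand = ps_farm_mean_response(lam, [0.5, 0.5], [1.8, 1.8])
# Size-interval with balanced load: 90% small jobs (mean 1), 10% large (mean 9).
size = ps_farm_mean_response(lam, [0.9, 0.1], [1.0, 9.0])
print(rand, size, math.isclose(rand, size))
```

Both configurations put load 0.45 on each host, and the two means coincide, as the formula on the slide predicts.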
Prior Analysis of JSQ Routing
All prior JSQ analysis assumes FCFS servers.
[Diagram: JSQ router → FCFS hosts]
2-server:
[Kingman 61], [Flatto, McKean 77], [Wessels, Adan, Zijm 91], [Foschini, Salz 78], [Knessl, Matkowsky, Schuss, Tier 87], [Conolly 84], [Rao, Posner 87], [Blanc 87], [Grassmann 80], [Muntz, Lui, Towsley 95], [Cohen, Boxma 83]
>2-server approximations:
[Nelson, Philips, Sigmetrics 89], [Nelson, Philips, Perf. Eval. 93], [Lin, Raghavendra, TPDS 96]
[Nelson, Philips] Idea
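[Diagram: JSQ router → FCFS hosts]
$$E[\text{Waiting Time}] = E[\text{JobSize}] \cdot \sum_n \Pr\{n \text{ total in all}\} \cdot E[\text{length shortest queue} \mid n \text{ total}]$$
Assume $\Pr\{n \text{ total in all}\}$ is $p_n^{M/M/k}$.
Assume $E[\text{length shortest queue} \mid n \text{ total}]$ is $n/k$.
k: #servers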
First Analysis of JSQ for PS
[Gupta, Harchol-Balter, Sigman, Whitt, 06+]
[Diagram: Poisson process → JSQ router → PS hosts]
Near insensitivity to $C^2$:
PS server farm w/ General service ≈ PS server farm w/ Exponential service = FCFS server farm w/ Exponential service
Single-queue equivalence: for PS server farm w/ Exponential service,
Multiserver system = Single queue w/ contingent arrival rates
Summary so far
Supercomputing (Router → FCFS hosts)
LESSONS LEARNED:
• Greedy routing policies, like JSQ and LWL, are poor.
• To combat variability, need size-interval splitting.
• By isolating smalls, can achieve effects of smart single-server policies.
• Load UN-balancing.
• Don't need to know size.
Web server farm (Router → PS hosts)
LESSONS LEARNED:
• JSQ routing is good!
• Job size variability not a problem.
• Load Balancing.
Outline
I. Review of scheduling in single-server (M/GI/1)
II. Supercomputing/Manufacturing: Router → FCFS hosts
III. Web server farm model: Router → PS hosts
IV. Towards Optimality …: SRPT & Router → SRPT hosts
What is Optimal Routing/Scheduling?
[Diagram: incoming jobs → Router (routing policy) → hosts, each with its own scheduling policy]
2 Policy Decisions: routing policy and scheduling policy.
Assume no restrictions:
• Jobs are fully preemptible.
• Can have central queue if want it, or not.
• Know job size (of course don't know future jobs ...)
What is Optimal Routing/Scheduling?
Central-Queue-SRPT
Recall: on a single server, SRPT minimizes E[T] on every sample path! [Schrage 67]
Question: Central-Queue-SRPT looks pretty good! Does it minimize E[T]?
Central-Queue-SRPT
Answer: This does not minimize E[T] on every arrival sequence.
Bad Arrival Sequence:
@time 0: 2 jobs of size $2^9$, 1 job of size $2^{10}$
@time $2^{10}$: 2 jobs of size $2^8$, 1 job of size $2^9$
@time $2^{10} + 2^9$: 2 jobs of size $2^7$, 1 job of size $2^8$, etc.
[Figure: schedules for this sequence under OPT and under Central-Queue-SRPT; under Central-Queue-SRPT a job is preempted]
Central-Queue-SRPT
Adversarial (Worst-Case) Guarantees:
THM [Leonardi, Raz, STOC 97]: Central-Queue-SRPT is
$$O\!\left(\log \min\left\{\frac{\text{biggest size}}{\text{smallest size}},\ \frac{\#\text{jobs}}{\#\text{servers}}\right\}\right)$$
competitive for E[T], and no online policy can improve upon this by more than a constant factor.
Remarks:
• log(biggest/smallest) could be factor 7 in practice!
• Closest stochastic result analyzes only central-queue w/ priorities: [Harchol-Balter, Wierman, Osogami, Scheller-Wolf, QUESTA 05]
What is Optimal Routing/Scheduling with Immediate Dispatch?
[Diagram: incoming jobs → Router (routing policy) → hosts, each with its own scheduling policy]
2 Policy Decisions: routing policy and scheduling policy.
Practical Assumption: jobs must be immediately dispatched!
• Jobs are fully preemptible within queue.
• Know job size.
What is Optimal Routing/Scheduling when Immediately Dispatch?
[Diagram: incoming jobs → Router (immediately dispatch jobs) → SRPT hosts]
Claim: The optimal routing/scheduling pair, given immediate dispatch, uses SRPT at the hosts. (Assuming an optimal pair exists.)
PROOF:
Let A be an optimal routing/scheduling pair with respect to E[T].
Suppose, for contradiction, that A does not use SRPT at the hosts.
Let policy pair B mimic A with respect to A's dispatching of jobs to hosts. I.e., policy B may be different from A, but sends the same jobs to the same hosts at the same times as A.
But after the dispatching, B does SRPT scheduling at the hosts.
Each host sees the same arrival sequence under A and B, and SRPT is optimal on a single server for every arrival sequence [Schrage 67]. Thus B improves upon A with respect to E[T]. Contradiction!
IMPACT: Claim ⇒ narrow search to policies with SRPT at hosts.
In search of good Immediate Dispatch Routing
[Diagram: incoming jobs → Router (immediately dispatch jobs) → SRPT hosts]
Q: What should the immediate dispatch routing policy be, given SRPT scheduling at hosts?
In search of good Immediate Dispatch Routing … why not obvious
[Diagram: jobs immediately dispatched by JSQ to SRPT hosts]
Bad Arrival Sequence:
@time 0: job of size 10 arrives.
@time 0+: job of size 1000 arrives.
@time 0++: job of size 10 arrives.
@time 0+++: job of size 1 arrives.
@time 1+++: job of size 1000 arrives.
[Figure: assignment of these jobs to the hosts under OPT and under JSQ/SRPT]
Smart Immediate Dispatch Policy
[Diagram: incoming jobs → Router (immediately dispatch) → SRPT hosts]
Answer: IMD Algorithm due to [Avrahami, Azar 03]:
• Split jobs into size classes.
• Assign each incoming job to server w/ fewest # jobs in that class.
Remarks:
• Worst-case guarantee: $O\!\left(\log \min\left\{\frac{\text{biggest}}{\text{smallest}},\ \#\text{jobs}\right\}\right)$
• Immediate Dispatching is "as good as" Central-Queue-SRPT.
• Similar policy proposed by [Wu, Down 06] for heavy-traffic setting.
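The two steps of the dispatch rule can be sketched as a small dispatcher. The details here are assumptions that the slide does not specify: size classes are taken to be powers of two (class $\lfloor \log_2 \text{size} \rfloor$), the count is the number of jobs of that class dispatched so far to each server, and ties go to the lowest-numbered server. The class name `SizeClassDispatcher` is hypothetical, and this is not presented as the exact algorithm of [Avrahami, Azar 03].

```python
import math
from collections import defaultdict

class SizeClassDispatcher:
    """Immediate dispatch by size class (illustrative sketch, see assumptions above)."""

    def __init__(self, num_servers):
        self.num_servers = num_servers
        # counts[c][s] = number of class-c jobs sent to server s so far
        self.counts = defaultdict(lambda: [0] * num_servers)

    def dispatch(self, size):
        size_class = math.floor(math.log2(size))     # assumed power-of-two classes
        per_server = self.counts[size_class]
        server = min(range(self.num_servers), key=per_server.__getitem__)
        per_server[server] += 1
        return server

# The bad arrival sequence for JSQ/SRPT from the earlier slide, on 2 servers.
dispatcher = SizeClassDispatcher(2)
print([dispatcher.dispatch(size) for size in (10, 1000, 10, 1, 1000)])
# [0, 0, 1, 0, 1]
```

Under these assumptions the two jobs of size 1000 land on different servers, and so do the two jobs of size 10, because each class is spread across servers independently of the others.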
Some Key Points
Supercomputing (Router → FCFS hosts)
• Need size-interval splitting to combat job size variability and enable good performance.
Web server farm model (Router → PS hosts)
• Job size variability is not an issue.
• Greedy, JSQ, performs well.
Towards Optimality … (Central-Queue-SRPT & Router → SRPT hosts)
• Both these have similar worst-case E[T].
• Almost exclusively worst-case analysis, so hard to compare with above results.
• Need stochastic research here!
If you want to know more …
My class lectures are all available online.
15-849 Performance Modeling
** Highly-recommended for CS theory, Math, TEPPER, and ACO doctoral students
Queueing theory is an old area of mathematics which has recently become very hot. The goal of queueing theory has always been to improve the design/performance of systems, e.g. networks, servers, memory, disks, distributed systems, etc., by finding smarter schemes for allocating resources to jobs.
In this class we will study the beautiful mathematical techniques used in queueing theory, including stochastic analysis, discrete-time and continuous-time Markov chains, renewal theory, product-forms, transforms, supplementary random variables, fluid theory, scheduling theory, matrix-analytic methods, and more. Throughout we will emphasize realistic workloads, in particular heavy-tailed workloads. This course is packed with open problems -- problems which if solved are not just interesting theoretically, but which have huge applicability to the design of computer systems today.
Instructor: Mor Harchol-Balter, www.cs.cmu.edu/~harchol/
References
N. Avrahami and Y. Azar, “Minimizing Total flow time and total completion time with immediate dispatching.” SPAA 2003, pp. 11-18.
N. Bansal and M. Harchol-Balter, "Analysis of SRPT scheduling: Investigating unfairness," Proceedings of ACM Sigmetrics 2001.
P. Barford and M. Crovella, “Generating representative web workloads for network and server performance evaluation,” ACM Sigmetrics 1998, pp. 151-160.
J. Blanc, “A note on waiting times in systems with queues in parallel,” J. Appl. Prob., Vol. 24, 1987, pp. 540-546.
S. Borst, O. Boxma, and P. Jelenkovic, “Reduced load equivalence and induced burstiness in GPS queues with long-tailed traffic flows,” Queueing Systems, Vol. 43, 2003, pp. 274-285.
S. Borst, O. Boxma, and M. van Uitert, “The asymptotic workload behavior of two coupled queues,” Queueing Systems, Vol. 43, 2003, pp. 81-102.
J.W. Cohen and O. Boxma, Boundary Value Problems in Queueing System Analysis, North Holland, 1983.
B. Conolly, “The Autostrada queueing problem,” J. Appl. Prob., Vol. 21, 1984, pp. 394-403.
M. Crovella and A. Bestavros, “Self-similarity in world wide web traffic: evidence and possible causes,” Proceedings of the 1996 ACM Sigmetrics International Conference on Measurement and Modeling of Computer Systems, May 1996, pp. 160-169.
D. Down and R. Wu, “Multi-layered round robin scheduling for parallel servers,” Queueing Systems: Theory and Applications, Vol. 53, No. 4, 2006, pp. 177-188.
G. Fayolle and R. Iasnogorodski, “Two coupled processors: the reduction to a Riemann-Hilbert problem,” Zeitschrift fur Wahrscheinlichkeitstheorie und verwandte Gebiete, Vol. 47, 1979, pp. 325-351.
L. Flatto and H.P. McKean, “Two queues in parallel,” Communication on Pure and Applied Mathematics, Vol. 30, 1977, pp. 255-263.
R. Foley and D. McDonald, “Exact asymptotics of a queueing network with a cross-trained server,” Proceedings of INFORMS Annual Meeting, October 2003, pp. MD-062.
G. Foschini and J. Salz, “A basic dynamic routing problem and diffusion,” IEEE Transactions on Communications, Vol. Com-26, No. 3, March 1978.
P. Glynn, M. Harchol-Balter, K. Ramanan, “Heavy-traffic approach to optimizing size-interval task assignment,” Work in progress, 2006.
W. Grassmann, "Transient and steady state results for two parallel queues," Omega, vol. 8, 1980, pp. 105-112.
V. Gupta, M. Harchol-Balter, K. Sigman, and W. Whitt, “Analysis of join-the shortest-queue policy for web server farms.” In submission, 2006.
M. Harchol-Balter and A. Downey. "Exploiting process lifetime distributions for dynamic load balancing," Proceedings of ACM Sigmetrics '96 Conference on Measurement and Modeling of Computer Systems , May 1996, pp. 13-24.
M. Harchol-Balter, M. Crovella, and C. Murta, "On choosing a task assignment policy for a distributed server system," Journal of Parallel and Distributed Computing , vol. 59, no. 2, Nov. 1999, pp. 204-228.
M. Harchol-Balter, C. Li, T. Osogami, A. Scheller-Wolf, and M. Squillante, “Cycle stealing under immediate dispatch task assignment,” Proceedings of the Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA), June 2003, pp. 274-285.
M. Harchol-Balter, B. Schroeder, N. Bansal, M. Agrawal. "Size-based scheduling to improve web performance." ACM Transactions on Computer Systems, Vol. 21, No. 2, May 2003, pp. 207-233.
M. Harchol-Balter and R. Vesilo, “Optimal cutoffs for size-interval task assignment,” Work in progress, 2006.
M. Harchol-Balter, A. Wierman, T. Osogami, and A. Scheller-Wolf, "Multi-server queueing systems with multiple priority classes," Queueing Systems: Theory and Applications (QUESTA), Vol. 51, No. 3-4, 2005, pp. 331-360.
J. Kingman, “Two similar queues in parallel,” Biometrika, Vol. 48, 1961, pp. 1316-1323.
A. Konheim, I. Meilijson, and A. Melkman, “Processor-sharing of two parallel lines,” J. Appl. Prob., Vol. 18, 1981, pp. 952-956.
C. Knessl, B. Matkowsky, Z. Schuss, and C. Tier, “Two parallel M/G/1 queues where arrivals join the system with the smaller buffer content,” IEEE Transactions on Communications, Vol. Com-35, No. 11,1987, pp. 1153-1158.
S. Leonardi and D. Raz, “Approximating total flow time on parallel machines,” ACM Symposium on Theory of Computing (STOC), 1997.
H. Lin, and C. Raghavendra, “An approximate analysis of the join the shortest queue (JSQ) policy”, IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 3, March 1996.
J. Lui, R. Muntz, D. Towsley, “Bounding the mean response time of the minimum expected delay routing policy: an algorithmic approach,” IEEE Transactions on Computers, Vol. 44, No. 12, Dec 1995.
S. Muthukrishnan, R. Rajaraman, A. Shaheen, and J. Gehrke, “Online scheduling to minimize average stretch,” Proceedings of the 40th Annual Symposium on Foundations of Computer Science, October 1999, pp. 433.
R. Nelson and T. Philips, “An approximation to the response time for shortest queue routing,” ACM SIGMETRICS Performance Evaluation Review, Vol. 17 No. 1, May 1989, pp. 181-189.
R. Nelson and T. Philips, “An approximation for the mean response time for shortest queue routing with general interarrival and service times,” Performance Evaluation, Vol. 17, No. 2, March 1993, pp. 123-139.
T. Osogami, M. Harchol-Balter, and A. Scheller-Wolf, “Analysis of cycle stealing with switching cost,” Proceedings of the ACM Sigmetrics, June 2003, pp. 184-195.
T. Osogami, M. Harchol-Balter, and A. Scheller-Wolf. "Analysis of cycle stealing with switching times and thresholds" Performance Evaluation, Vol. 61, No. 4, 2005, pp. 347-369.
B. Rao and M. Posner, “Algorithmic and approximate analysis of the shorter queue,” Model Naval Research Logistics, Vol. 34, 1987, pp. 381-398.
R. Righter and J. Shanthikumar, “Scheduling multiclass single server queueing systems to stochastically maximize the number of successful departures," Probability in the Engineering and Informational Sciences, Vol. 3, 1989, pp. 323-333.
L.E. Schrage, “A proof of the optimality of the shortest processing remaining time discipline,” Operations Research, Vol. 16, 1968, pp. 678-690.
B. Schroeder and M. Harchol-Balter, "Evaluation of task assignment policies for supercomputing servers: The case for load unbalancing and fairness," 9th IEEE Symposium on High Performance Distributed Computing (HPDC '00) , August 2000.
B. Schroeder, A. Wierman, and M. Harchol-Balter. "Closed versus open system models: A cautionary tale,” Proceedings of NSDI , 2006.
A. Shaikh, J. Rexford, and K. Shin, “Load-sensitive routing of long-lived IP flows,” Proceedings of SIGCOMM, September, 1999.
J. Wessels, I. Adan, and W. Zijm, “Analysis of the asymmetric shortest queue problem,” Queueing Systems, Vol. 8, 1991, pp. 1-58.
A. Wierman and M. Harchol-Balter. "Classifying scheduling policies with respect to higher moments of conditional response time." Proceedings of ACM Sigmetrics 2005 Conference on Measurement and Modeling of Computer Systems.
A. Wierman and M. Harchol-Balter. "Nearly insensitive bounds on SMART scheduling." Proceedings of ACM Sigmetrics 2005 Conference on Measurement and Modeling of Computer Systems.
A. Wierman and M. Harchol-Balter, "Classifying scheduling policies with respect to unfairness in an M/GI/1," Proceedings of ACM Sigmetrics 2003 Conference on Measurement and Modeling of Computer Systems , June 2003.