Tradeoffs? - Rice University

A Load-Balanced Switch with an Arbitrary Number of Linecards
Offense
Anwis Das
High Level Goal
• Design a scalable, fault-tolerant router
• Operate at 100 Tb/s, 40 times more than current standards
• Clearly a challenge
Tradeoffs?
• Architecture is based on the load-balanced Birkhoff-von Neumann switch
– Essentially a load balancer followed by an input-buffered switch
• Types of switches
– Input-buffered switches
– Output-buffered switches
– Combined input-output switches (CIOS)
• But is throughput all that matters?
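The "load balancer followed by a switch" structure can be sketched in a few lines. This is a minimal illustration, not the paper's design: it assumes the usual fixed round-robin connection patterns (input i to intermediate port (i + t) mod N, intermediate port j to output (j + t) mod N), at most one arriving cell per input per slot, and a cell that may cross both stages in the same slot. The function name and data layout are hypothetical.

```python
from collections import deque


def simulate_load_balanced_switch(n, arrivals, slots):
    """Simulate a two-stage load-balanced switch with fixed round-robin patterns.

    n        -- number of ports (assumed equal at every stage)
    arrivals -- dict: time slot -> list of (input_port, output_port) cells
    slots    -- number of time slots to simulate
    Returns a list of (departure_slot, input_port, output_port).
    """
    # voq[j][k] holds cells waiting at intermediate port j for output k.
    voq = [[deque() for _ in range(n)] for _ in range(n)]
    departures = []
    for t in range(slots):
        # Stage 1 (load balancer): input i is wired to intermediate (i + t) % n,
        # regardless of where the cell is ultimately going.
        for i, k in arrivals.get(t, []):
            j = (i + t) % n
            voq[j][k].append((i, k))
        # Stage 2 (switch): intermediate j is wired to output (j + t) % n.
        # No scheduler runs; the pattern depends only on the slot number.
        for j in range(n):
            k = (j + t) % n
            if voq[j][k]:
                src, dst = voq[j][k].popleft()
                departures.append((t, src, dst))
    return departures
```

For example, a single cell from input 0 to output 2 on a 3-port switch waits at intermediate port 0 until the fixed pattern connects that port to output 2:

```python
print(simulate_load_balanced_switch(3, {0: [(0, 2)]}, 6))  # [(2, 0, 2)]
```

Because neither stage looks at the traffic, nothing in this sketch can reserve bandwidth for a particular flow, which is the point the following slides press on.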
Quality of Service
• The load-balanced BvN switch cannot guarantee any rate of service to any flow
"Providing Guaranteed Rate Services in the Load Balanced Birkhoff-von Neumann switch", Chang et al., Infocom 2003.
• Such guarantees are required if a router wishes to implement certain classes of QoS, such as Expedited Forwarding in DiffServ
Flexibility
• The original frame-based scheduling allowed for flow guarantees
• In this architecture, everything is fixed
• Impossible to guarantee any bandwidth to any flow without running the scheduling algorithm again
Average Delay and Delay Variability
• “frame based scheduling suffers from an important drawback: it often results in large cell delays and large delay variability”
– Isaac Keslassy in “On Guaranteed Smooth Scheduling for Input-Queued Switches”, Infocom 2003
– Failed to mention that the packet mis-sequencing problem was already solved: “Load Balanced Birkhoff-von Neumann Switches, part II: multi-stage buffering”, Computer Communications 2002
– The problem is even worse in this paper because of the solution it adopts for the packet mis-sequencing problem
• Scheduling is not smooth
– Average delay high, burstiness, low short-term
Linecards and Delay
• Delay is proportional to frame size, and frame size is proportional to the number of linecards
– Delay is proportional to the number of linecards!!
– Large groups of linecards => lots of linecards => large delay!!
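The chain of proportionalities on this slide can be written out explicitly. The symbols below are notation introduced here for illustration only: D is the delay, F the frame size, N the number of linecards, and c_1, c_2 are unspecified positive constants, not values taken from the paper.

```latex
\[
D = c_1 F, \qquad F = c_2 N
\quad\Longrightarrow\quad
D = c_1 c_2 \, N
\]
```

Under these assumptions, doubling the number of linecards doubles the delay.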
Fault Tolerance
• Authors claim that the lack of a centralized scheduler leads to greater fault tolerance: Agree
• Ironically, the paper discusses how to improve fault tolerance
– Linecards, MEMS switches, or connectors are more likely to fail than a centralized scheduler
• Proposed solution:
– Run an algorithm to figure out a static MEMS configuration
• Too slow!!! (50 seconds vs. 50 milliseconds)
– A polynomial-time algorithm means nothing in practice
– Authors partially failed in what they set out to do
Crux of Paper
• Outline - Part 1
– L-L, L-G, G-G (“easily deduce”)
• Outline - Part 2
– G-G (Interesting, but more about this later)
– G-L, L-L (Uninteresting) and similar work already done: “Load Balanced Birkhoff-von Neumann Switches, part I: one-stage buffering”, Computer Communications 2002
• Invoking Santa’s principle, not much original work