Dionysus: Dynamic Scheduling of Network Updates


Dynamic Scheduling of Network Updates
Based on the slides by Xin Jin, Hongqiang Harry Liu, Rohan Gandhi, Srikanth Kandula, Ratul Mahajan, Ming Zhang, Jennifer Rexford, and Roger Wattenhofer
Presented by Sergio Rivera
SDN: Paradigm Shift in Networking
• Direct and centralized updates of forwarding rules in switches
• Many benefits
– Traffic engineering [B4, SWAN]
– Flow scheduling [Hedera, DevoFlow]
– Access control [Ethane, vCRIB]
– Device power management [ElasticTree]
• These systems operate by updating the data plane state of the network (periodically or by triggers)
Motivation
• Time to update switch rules varies and leads to slow network updates:
– Switch hardware capabilities
– Control load on the switch
– Nature of updates
– RPC delays (not that much…)
• Controlled experiments explore the impact of four factors:
– # rules to be updated
– Priorities of the rules
– Type of rules (insertion vs. modification)
– Control load
Observations
[Figure: per-rule update latency vs. number of rules inserted]
• Single-priority rules: 3.3 ms per-rule update time
• Random-priority rules: per-rule update time reaches 18 ms when inserting 600 rules
Observations (cont’d).
[Figure: per-rule modification latency]
• Modifying rules (TCAM starts with 600 single-priority rules): updating 100 rules takes 11 ms per-rule modification time (insert & delete)
• Even in controlled conditions, while control activities were performed, switch update time varies significantly!
• Time to update is highly variable: the 99th percentile can be >10x the 50th percentile
Network Update is Challenging
• Requirement 1: fast
– Slow updates imply longer periods of congestion or packet loss
• Requirement 2: consistent
– No congestion, no blackholes, no loops, etc.
[Figure: controller issuing updates to the network]
What is a Consistent Network Update?
[Figure: current and target states of a network with switches A-E carrying flows F1: 5, F2: 5, F3: 10, F4: 5, and a transient state reached when updates are applied asynchronously]
• Asynchronous updates can cause congestion
• Need to carefully order update operations
Existing Solutions are Slow
• Existing solutions are static [ConsistentUpdate’12, SWAN’13, zUpdate’13]
– They pre-compute an order for update operations (e.g., static Plan A: F3 → F2 → F4 → F1)
• Downside: they do not adapt to runtime conditions
– Slow in the face of highly variable operation completion times
Static Schedules can be Slow
[Figure: completion timelines for static Plan A (F3, F2, F4, F1) and static Plan B (F3, F2, F1, F4) under different operation completion times; each plan wins in some runs and loses in others]
No static schedule is a clear winner under all conditions!
Dynamic Schedules are Adaptive and Fast
[Figure: the dynamic plan finishes earlier than both static Plan A (F3, F2, F4, F1) and static Plan B (F3, F2, F1, F4)]
A dynamic schedule adapts to actual conditions!
Challenges of Dynamic Update Scheduling
• Exponential number of orderings
• Cannot completely avoid planning
[Figure: current and target states with flows F1-F5 across switches A-E. Greedily moving F2 first leads to a deadlock; moving F1 first, then F3, then F2 yields a consistent update plan]
Dionysus Pipeline
• Balance planning and opportunism using a two-stage
approach
Current State + Target State + Consistency Property → Dependency Graph Generator → Update Scheduler → Network
• Dependency graph generator: encodes valid orderings
• Update scheduler: determines a fast order based on the constraints of the consistency property (CP); the scheduler itself is independent of the CP
Dependency Graph Generation
[Figure: current and target states of the example network with flows F1-F5]
Dependency graph elements:
• Node
– Operation node
– Resource node (link capacity, switch table size)
– Path node: groups operations and link-capacity resources
• Edge: dependencies between nodes
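As an illustration, the elements above can be captured in a few lines of Python (a hypothetical sketch, not the paper's implementation; path nodes are omitted for brevity, and the example wiring is reconstructed from the scheduling walkthrough later in the deck):

```python
from dataclasses import dataclass, field

@dataclass
class OperationNode:
    name: str                # e.g. "Move F1"

@dataclass
class ResourceNode:
    name: str                # e.g. link "A-E" or a switch table
    free: int                # currently free capacity / table entries

@dataclass
class DependencyGraph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)   # (src, dst, units)

    def add(self, node):
        self.nodes[node.name] = node
        return node

    def link(self, src, dst, units):
        # resource -> operation: the operation needs `units` of the resource;
        # operation -> resource: the operation releases `units` to the resource
        self.edges.append((src.name, dst.name, units))

# The example graph from these slides: link A-E starts with 5 free units,
# D-E with 0; Move F1 and Move F2 each need 5 units of A-E; Move F1
# releases 5 to D-E; Move F3 needs D-E and releases A-E.
g = DependencyGraph()
ae = g.add(ResourceNode("A-E", free=5))
de = g.add(ResourceNode("D-E", free=0))
f1, f2, f3 = (g.add(OperationNode(f"Move F{i}")) for i in (1, 2, 3))
g.link(ae, f1, 5)
g.link(ae, f2, 5)
g.link(f1, de, 5)
g.link(de, f3, 5)
g.link(f3, ae, 5)
```

Note the cycle A-E → Move F1 → D-E → Move F3 → A-E: this is exactly the structure the scheduler must handle later.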
Dependency Graph Generation (cont’d)
[Figure: the dependency graph for the example is built step by step. Operation nodes Move F1, Move F2, and Move F3 come first; then resource nodes for links A-E (5 free) and D-E (0 free); then the weighted edges: A-E supplies 5 units each to Move F1 and Move F2, Move F1 releases 5 units to D-E, D-E supplies 5 units to Move F3, and Move F3 releases 5 units back to A-E]
Dependency Graph Generation
• The dependency graph captures dependencies but still leaves scheduling flexibility
• Two challenges remain in dynamically scheduling updates:
– Resolving cycles in the dependency graph
– Deciding, at any given time, which subset of rules to issue first
Dependency Graph Generation
• Supported scenarios
– Tunnel-based forwarding: WANs
– WCMP forwarding: data center networks
• Supported consistency properties
– Loop freedom
– Blackhole freedom
– Packet coherence
– Congestion freedom
Tunnel-Based Forwarding
• A flow is forwarded along one or more tunnels
• The ingress switch tags each packet with the tunnel identifier
• The ingress switch splits incoming traffic across the tunnels based on configured weights
• Offers loop freedom and packet coherence by design
• Blackhole freedom is guaranteed if:
– A tunnel is fully established before the ingress switch puts any traffic on it
– All traffic is removed from the tunnel before the tunnel is deleted
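The two blackhole-freedom rules above induce a natural three-phase ordering. A minimal sketch (hypothetical names, ignoring capacity constraints):

```python
def blackhole_free_plan(old_tunnels, new_tunnels):
    """Order tunnel operations so traffic never enters a half-built tunnel.

    Phase 1: fully establish every tunnel that only the target state uses.
    Phase 2: shift the ingress traffic split onto the target tunnels.
    Phase 3: delete tunnels that only the old state used (now drained).
    """
    adds = [("add", t) for t in sorted(new_tunnels - old_tunnels)]
    shift = [("set-ingress-weights", tuple(sorted(new_tunnels)))]
    dels = [("delete", t) for t in sorted(old_tunnels - new_tunnels)]
    return adds + shift + dels

plan = blackhole_free_plan({"t1", "t2"}, {"t2", "t3"})
# [("add", "t3"), ("set-ingress-weights", ("t2", "t3")), ("delete", "t1")]
```

All tunnel additions come before the traffic shift, and all deletions after it, so no packet is ever sent into a tunnel that is not fully installed.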
Tunnel-Based Dependency Graph
[Figure: tunnel-based dependency graph for the example. Path nodes p2 and p3; link-capacity resource nodes S1-S2: 0, S1-S4: 10, S2-S5: 5, S4-S5: 10; switch-table resource nodes S1: 50, S2: 50, S4: 50, S5: 50; tunnel add/delete operation nodes A through G, with edges weighted 1 (table entries) and 5 (link capacity)]
WCMP Forwarding
• Switches at every hop match on packet headers and split flows over multiple next hops with configured weights
• Packet coherence is based on version numbers:
– The ingress switch tags each packet with a version number
– Downstream switches handle packets based on the embedded version number
– Tagging ensures packets never mix configurations
• Blackhole freedom and loop freedom are also possible:
– They do not require versioning
– They can be encoded in the dependency graph
– (the authors omit the description of graph construction for these conditions)
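The version-tagging idea can be illustrated with a toy model (a hypothetical sketch, not a real switch API):

```python
class VersionedSwitch:
    """Toy WCMP switch: rules are keyed by configuration version, so a
    packet tagged with version v is always handled by configuration v
    and never sees a mix of old and new rules."""

    def __init__(self):
        self.rules = {}   # version -> {dst: [(next_hop, weight), ...]}

    def install(self, version, table):
        self.rules[version] = table

    def next_hops(self, packet_version, dst):
        return self.rules[packet_version][dst]

# S2's weight change from the example: [(S3, 0.5), (S5, 0.5)] -> [(S3, 1.0)]
s2 = VersionedSwitch()
s2.install(1, {"S5": [("S3", 0.5), ("S5", 0.5)]})   # old configuration
s2.install(2, {"S5": [("S3", 1.0)]})                # new configuration

# In-flight packets stamped with version 1 by the ingress still see the
# old split; packets stamped with version 2 see only the new one.
old_split = s2.next_hops(1, "S5")
new_split = s2.next_hops(2, "S5")
```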
WCMP Dependency Graph
Weight changes:
@S1: [(S2, 1), (S4, 0)] -> [(S2, 0.5), (S4, 0.5)]
@S2: [(S3, 0.5), (S5, 0.5)] -> [(S3, 1), (S5, 0)]
@S4: has only one next hop; no change needed
[Figure: WCMP dependency graph with operation nodes X, Y, Z; path nodes p2, p3; resource nodes S4-S5: 10, S1-S4: 10, S1-S2: 0, S2-S5: 5, S2: 50]
Dionysus Pipeline
Current State + Target State + Consistency Property → Dependency Graph Generator (encode valid orderings) → Update Scheduler (determine a fast order) → Network
Dionysus Scheduling
• Scheduling as a resource allocation problem
5
A-E: 5
5
5
Move
F3
Move
F1
5
Move
F2
5
D-E: 0
• How to allocate available resources to operations to minimize
the update time?
32
Dionysus Scheduling
• Scheduling as a resource allocation problem
Move
F2
A-E: 0
5
5
Move
F3
Move
F1
5
Deadlock!
5
D-E: 0
33
Dionysus Scheduling
• Scheduling as a resource allocation problem
5
A-E: 0
Move
F2
5
Move
F1
Move
F3
5
5
D-E: 0
34
Dionysus Scheduling
• Scheduling as a resource allocation problem
5
A-E: 0
Move
F2
5
Move
F3
5
D-E: 5
35
Dionysus Scheduling
• Scheduling as a resource allocation problem
5
A-E: 0
Move
F2
5
Move
F3
36
Dionysus Scheduling
• Scheduling as a resource allocation problem
5
A-E: 5
Move
F2
37
Dionysus Scheduling
• Scheduling as a resource allocation problem
Move
F2
Done!
38
Dionysus Scheduling
• Scheduling as a resource allocation problem
• NP-complete under link-capacity and switch-table-size constraints
• Approach
– If the dependency graph is a DAG, a feasible schedule always exists; critical-path scheduling resolves ties
– Otherwise, convert the graph to a virtual DAG (the general case includes cycles)
– Rate-limit flows to resolve deadlocks
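A toy version of the scheduling loop shows why the allocation order matters on the earlier example. This is a hedged sketch, not the paper's algorithm: operations run one at a time, consume and release atomically, and the `cpl` priorities are illustrative stand-ins for the critical-path lengths described below.

```python
def schedule(free, needs, releases, cpl):
    """free: resource -> units; needs/releases: op -> {resource: units};
    cpl: op -> priority (higher runs first, mimicking CPL ordering)."""
    done, pending = [], set(needs)
    while pending:
        runnable = [op for op in sorted(pending, key=lambda o: -cpl[o])
                    if all(free[r] >= u for r, u in needs[op].items())]
        if not runnable:
            return done, "deadlock"
        op = runnable[0]                       # highest-priority ready op
        for r, u in needs[op].items():         # consume resources
            free[r] -= u
        for r, u in releases[op].items():      # op finished: release resources
            free[r] += u
        pending.remove(op)
        done.append(op)
    return done, "ok"

# Slide example: A-E starts with 5 free units, D-E with 0.
needs = {"F1": {"A-E": 5}, "F2": {"A-E": 5}, "F3": {"D-E": 5}}
releases = {"F1": {"D-E": 5}, "F2": {}, "F3": {"A-E": 5}}
order, status = schedule({"A-E": 5, "D-E": 0}, needs, releases,
                         cpl={"F1": 3, "F2": 1, "F3": 2})
# order == ["F1", "F3", "F2"]; giving A-E to F2 first would have deadlocked
```

In real Dionysus, resources are released only after the data plane confirms an operation, and allocation can be partial; this sketch only captures the ordering intuition.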
Critical-Path Scheduling
• Calculate the critical-path length (CPL) for each node
– Extension: assign a larger weight to an operation node if we know in advance that its switch is slow
• Resources are allocated to operation nodes with larger CPLs first
[Figure: example graph with operation nodes Move F1 (CPL=3), Move F3 (CPL=2), Move F2 (CPL=1), Move F4 (CPL=1) and resource nodes A-B and C-D]
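The CPL computation itself is a longest-path pass over the DAG. A sketch under the common convention that operation nodes weigh 1 and resource/path nodes weigh 0 (node names hypothetical):

```python
from functools import lru_cache

def critical_path_lengths(children, weight):
    """CPL(n) = weight(n) + max(CPL(c) for children c), or weight(n) alone
    for leaves.  children: node -> list of successors (must be a DAG);
    weight: node -> weight (e.g. 1 for operation nodes, 0 for resources)."""

    @lru_cache(maxsize=None)
    def cpl(node):
        succ = children.get(node, [])
        return weight[node] + (max(map(cpl, succ)) if succ else 0)

    return {n: cpl(n) for n in weight}

# A chain of two dependent moves: OpA frees resource R1 needed by OpB,
# which frees R2 needed by OpC.  OpA heads the longest remaining chain,
# so it should receive resources first.
children = {"OpA": ["R1"], "R1": ["OpB"], "OpB": ["R2"], "R2": ["OpC"]}
weight = {"OpA": 1, "R1": 0, "OpB": 1, "R2": 0, "OpC": 1}
cpls = critical_path_lengths(children, weight)
# {"OpA": 3, "R1": 2, "OpB": 2, "R2": 1, "OpC": 1}
```

The slides' extension (larger weights for operations on known-slow switches) only changes the `weight` map.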
Handling Cycles
• Convert the graph to a virtual DAG
– Treat each strongly connected component (SCC) as a single virtual node
• Run critical-path scheduling on the virtual DAG
– Weight w_i of an SCC: its number of operation nodes
[Figure: the cycle through Move F1, Move F3, and links A-E and D-E collapses into one virtual node of weight 2; Move F2 forms a virtual node of weight 1]
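Building the virtual DAG starts with finding the SCCs. A self-contained sketch using Kosaraju's algorithm (one standard choice; the paper does not mandate a particular SCC algorithm), run on the cyclic example from these slides, where an SCC's weight is its number of operation nodes:

```python
def sccs(graph):
    """Strongly connected components of `graph` (node -> successor list),
    via Kosaraju's algorithm, written iteratively to avoid recursion limits."""
    order, seen = [], set()
    for start in graph:                      # pass 1: record finish order
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, it = stack[-1]
            for nxt in it:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:                            # all successors explored
                order.append(node)
                stack.pop()
    rev = {n: [] for n in graph}             # pass 2: flood the reversed graph
    for u, vs in graph.items():
        for v in vs:
            rev[v].append(u)
    comps, assigned = [], set()
    for start in reversed(order):
        if start in assigned:
            continue
        comp, work = set(), [start]
        assigned.add(start)
        while work:
            u = work.pop()
            comp.add(u)
            for v in rev[u]:
                if v not in assigned:
                    assigned.add(v)
                    work.append(v)
        comps.append(comp)
    return comps

# The cyclic dependency from the scheduling example: A-E feeds Move F1 and
# Move F2; Move F1 releases D-E; D-E feeds Move F3; Move F3 releases A-E.
graph = {
    "A-E": ["Move F1", "Move F2"],
    "Move F1": ["D-E"],
    "D-E": ["Move F3"],
    "Move F3": ["A-E"],
    "Move F2": [],
}
components = sccs(graph)
# One SCC {A-E, Move F1, D-E, Move F3} (2 operation nodes -> weight 2)
# and a trivial one {Move F2} (weight 1).
```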
Deadlocks may still happen
• The algorithm may fail to find a feasible ordering even when one exists (a consequence of the problem's hardness)
• In some cases, no feasible solution exists at all, even if the target state is valid
• Deadlocks are resolved by reducing flow rates (informing rate limiters)
Resolving Deadlocks
• K*: the maximum number of flows rate-limited at each step (affects throughput)
• Node centrality decides the order of iteration
• Resource consumption by path nodes or tunnel-add operations must happen within the same SCC
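A minimal sketch of the rate-limiting step. Assumptions to flag: node degree stands in for the centrality measure, and a limited flow's full rate on its paths is released at once; the paper's RateLimit algorithm is more careful than this.

```python
def break_deadlock(free, holding, centrality, k_star=1):
    """Rate-limit up to k_star flows, chosen by descending centrality,
    releasing the capacity they hold so stuck operations can proceed.
    free: link -> spare capacity; holding: flow -> {link: current rate}."""
    victims = sorted(holding, key=lambda f: -centrality[f])[:k_star]
    for flow in victims:
        for link, rate in holding[flow].items():
            free[link] += rate          # the rate limiter frees this capacity
            holding[flow][link] = 0
    return victims

# Flow A (highest centrality) holds 4 units on links feeding R1 and R2.
free = {"R1": 0, "R2": 0}
holding = {"A": {"R1": 4, "R2": 4}, "B": {"R2": 4}}
victims = break_deadlock(free, holding, centrality={"A": 2, "B": 1}, k_star=1)
# victims == ["A"]; R1 and R2 each regain 4 units of free capacity
```

This mirrors the walkthrough that follows: limiting the most central node frees capacity on several resources at once, letting the rest of the SCC drain.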
Resolving Deadlocks (cont’d)
• Example with K* = 1 on an SCC with operation nodes A, B, C, path nodes P1-P7, and resource nodes R1, R2, R3 (all with 0 free capacity):
– By centrality, node A is selected; its traffic on P6 and P7 is reduced by 4 units, which releases 4 units of free capacity on R1 and R2
– A then has no children and no longer belongs to the SCC; C is scheduled, which partially schedules B
– C moves 4 units from P3 to P4; after C finishes, the remaining operation B is scheduled
– For node A, its rate on P5 is increased as R3 receives the free capacity released by P4
– No deadlock remains!
Evaluation: Testbed – TE Case
[Figure: tunnel install/remove timeline with straggler switches (+500 ms latency)]
Evaluation: Testbed – Failure Recovery Case
[Figure: tunnel install/remove timeline with straggler switches (+500 ms latency)]
Evaluation: Large-Scale Simulations
• Focus on congestion freedom as the consistency property
• WAN: dataset from a large WAN that interconnects O(50) sites
o Traffic logs collected on routers at 5-minute intervals
o 288 traffic matrices collected (each is a different state)
o Tunnel-based routing is used
• DC: dataset from a large data center network with several hundred switches
o Traffic traces collected by logging socket events on all servers, aggregated into ToR-to-ToR flows over 5-minute intervals
o 288 traffic matrices collected per day
o The 1500 largest flows are chosen (40-60% of all traffic)
o Switch rule memory size is set to 1500
• Alternative approaches: OneShot (opportunistic) and SWAN (static)
Evaluation: Update Time
[Figure: update-time comparison at the 50th, 90th, and 99th percentiles]
Dionysus outperforms SWAN in both configurations. For instance, in WAN TE:
• Normal: 57% - 49% - 52% faster
• Straggler: 88% - 84% - 81% faster
Evaluation: Link Oversubscription
(WAN Failure Recovery – Random Link)
• Dionysus causes the least oversubscription among the three approaches
• It reduces oversubscription by 41% (normal) and 42% (straggler) relative to SWAN
• Update time: 45% (normal) and 82% (straggler) faster at the 99th percentile
Evaluation: Deadlocks
• Planning-based approaches lead to no deadlocks
• The opportunistic approach deadlocks 90% (WAN) and 70% (DC) of the time
• The WAN topology is less regular than the DC topology (i.e., it yields more complex dependencies)
Evaluation: Deadlocks (cont’d)
• K* = 5 in the RateLimit algorithm
• 10% memory slack = a memory size of 1100 when the switch is loaded with 1000 rules
Conclusion
• Dionysus provides fast, consistent network updates through dynamic scheduling
– Dependency graph: compactly encodes orderings
– Scheduler: dynamically schedules operations
Dionysus enables more agile SDN control loops

Thanks!