PortLand: A Scalable Fault-Tolerant Layer 2 Data Center Network Fabric
B97703099, Finance (third year), 婁瀚升
Outline
• Introduction
• Background
• Design
• Implementation
• Conclusion
Introduction
• Conventional LAN designs are insufficient at data center scale
• Requirements for a data center network:
– VMs can migrate without changing their IP addresses
– No switch configuration needed before deployment
– Any host can communicate efficiently with any other host
– No forwarding loops
– Fault tolerance and fast recovery
Introduction (cont.)
• A Layer 2-style fabric is needed
– Layer 3 alone is not workable:
• VM migration forces an IP address change
• Switches need subnet configuration, kept synchronized with DHCP
• A TTL is needed to limit forwarding loops
• Broadcast-based routing protocols should be avoided
Background: Data Center Network
• Topology
• Forwarding
– Layer 3: IP addresses assigned hierarchically
• Routing updates are broadcast (routes around failures, but adds overhead)
• Switch subnet configuration and DHCP synchronization are error-prone
• VM migration is not possible without an IP address change
– Layer 2: forwarding on flat MAC addresses
• Single spanning tree problem (paths are not the shortest)
• Broadcasts reach the entire fabric
– VLAN:
• Resources must be pre-assigned (reduces flexibility and scalability)
• Switches need to maintain state for each VLAN
Background: Data Center Network (cont.)
• End-host virtualization
– Does not work in a Layer 3 setting
– Can ARP solve the problem?
Fat Tree Network
• Multi-rooted
• Three stages: edge, aggregation, core
• Built from k-port switches:
– k^3/4 end hosts
– 5k^2/4 individual k-port switches
– k individual pods
– each pod: k^2/4 hosts
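The counts above follow directly from the port count k. A minimal sketch that evaluates them; the function name `fat_tree_sizes` and the dictionary keys are illustrative choices, not part of PortLand:

```python
def fat_tree_sizes(k: int) -> dict:
    """Return the component counts of a fat tree built from k-port switches."""
    if k < 2 or k % 2:
        raise ValueError("k must be a positive even number of ports")
    return {
        "pods": k,                   # k individual pods
        "hosts_per_pod": k * k // 4, # k^2/4 hosts in each pod
        "hosts": k ** 3 // 4,        # k^3/4 end hosts in total
        "switches": 5 * k * k // 4,  # 5k^2/4 k-port switches in total
    }


for k in (4, 48):
    print(k, fat_tree_sizes(k))
```

For k = 48 this gives 27,648 hosts served by 2,880 switches.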
Design: Fabric Manager
• Centralized manager
• Functions:
– Maintains soft state about the network configuration (e.g., topology)
– Responsible for:
• ARP resolution
• Fault tolerance
• Multicast
• Holds only soft state (no hard state, e.g., the number of switches)
Design: Pseudo MAC (PMAC)
• Assigned to end hosts
• Encodes location information:
– Location: hosts in the same pod share the same prefix
– Pod number → position number
• End hosts keep their own actual MAC (AMAC) addresses
• Location Discovery Protocol (LDP)
– Used when assigning PMACs
– PMAC format: pod.position.port.vmid
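A minimal sketch of packing pod.position.port.vmid into a 48-bit address. The field widths (16, 8, 8 and 16 bits) come from the PortLand paper rather than from these slides, and the helper names `encode_pmac` and `decode_pmac` are hypothetical:

```python
# Field widths per the PortLand paper (not stated on the slide):
# pod 16 bits, position 8 bits, port 8 bits, vmid 16 bits.
FIELDS = (("pod", 16), ("position", 8), ("port", 8), ("vmid", 16))


def encode_pmac(pod: int, position: int, port: int, vmid: int) -> str:
    """Pack the four location fields into a colon-separated 48-bit PMAC."""
    values = (pod, position, port, vmid)
    for (name, bits), value in zip(FIELDS, values):
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name} does not fit in {bits} bits")
    packed = (pod << 32) | (position << 24) | (port << 16) | vmid
    return ":".join(f"{b:02x}" for b in packed.to_bytes(6, "big"))


def decode_pmac(pmac: str) -> tuple:
    """Unpack a PMAC string into (pod, position, port, vmid)."""
    packed = int(pmac.replace(":", ""), 16)
    return (packed >> 32) & 0xFFFF, (packed >> 24) & 0xFF, (packed >> 16) & 0xFF, packed & 0xFFFF


print(encode_pmac(2, 1, 0, 1))
print(decode_pmac("00:02:01:00:00:01"))
```

Because the pod number occupies the leading bytes, all hosts in one pod share the same PMAC prefix.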
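Design: Proxy-based ARP
• Ethernet: ARP requests are broadcast to all hosts in the same Layer 2 domain
• PortLand resolves ARP through the fabric manager (FM) for communication inside the data center
• If the FM has no IP-to-PMAC mapping available
→ fall back to broadcast via a core switch (O(k) state)
• VM migration support
– The FM sends an invalidation message to the VM's old position
– Hosts that still send to the invalidated PMAC
→ receive the new PMAC address, which updates the host's ARP cache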
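The lookup, fallback and migration steps above can be sketched as a small in-memory table. This is an illustration under simplifying assumptions, not the PortLand implementation: the class `FabricManager`, its methods and `handle_arp` are hypothetical names, and real message passing between switches and the FM is omitted.

```python
from typing import Dict, Optional, Tuple


class FabricManager:
    """Toy fabric manager holding soft-state IP-to-PMAC mappings."""

    def __init__(self) -> None:
        self._ip_to_pmac: Dict[str, str] = {}

    def register(self, ip: str, pmac: str) -> None:
        self._ip_to_pmac[ip] = pmac

    def resolve(self, ip: str) -> Optional[str]:
        return self._ip_to_pmac.get(ip)

    def migrate(self, ip: str, new_pmac: str) -> Optional[str]:
        """Record the new PMAC and return the old one, which must be invalidated."""
        old_pmac = self._ip_to_pmac.get(ip)
        self._ip_to_pmac[ip] = new_pmac
        return old_pmac


def handle_arp(fm: FabricManager, target_ip: str) -> Tuple[str, Optional[str]]:
    """Answer from the FM if possible, otherwise signal a broadcast fallback."""
    pmac = fm.resolve(target_ip)
    if pmac is None:
        return ("broadcast", None)
    return ("reply", pmac)


fm = FabricManager()
print(handle_arp(fm, "10.2.1.5"))            # no mapping yet, so broadcast
fm.register("10.2.1.5", "00:02:01:00:00:01")
print(handle_arp(fm, "10.2.1.5"))            # answered by the FM
print(fm.migrate("10.2.1.5", "00:03:00:01:00:01"))  # old PMAC to invalidate
```

The value returned by `migrate` stands in for the invalidation message sent to the VM's old position.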
Design: Distributed Location Discovery
• Location Discovery Protocol (LDP)
• No administrative configuration (no manual setup)
• Location Discovery Message (LDM):
– Sent by switches
– Carries several pieces of information
– Switches learn their level in order from the LDMs they exchange:
edge → aggregation → core
Design: Distributed Location Discovery (cont.)
• Location Discovery Message (LDM):
– Position number acquisition:
• The edge switch proposes a randomly chosen number
• The proposal is verified by aggregation switches
– Pod number acquisition:
• Assigned by the FM, then propagated within the pod
– Exception: non-existence
• The LDM is not correct
• Disable the suspicious port
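Position number acquisition can be sketched as propose-and-verify. This is a simplified illustration with assumptions stated up front: a single aggregation switch does the verification (the paper requires agreement from several), positions are drawn from 0 to k/2 - 1, and the names `AggregationSwitch` and `acquire_position` are hypothetical.

```python
import random


class AggregationSwitch:
    """Toy verifier that remembers which edge switch holds each position."""

    def __init__(self) -> None:
        self._granted = {}  # position number -> edge switch id

    def verify(self, edge_id: str, position: int) -> bool:
        # Grant the position if it is free or already held by this edge switch.
        owner = self._granted.setdefault(position, edge_id)
        return owner == edge_id


def acquire_position(edge_id: str, aggregation: AggregationSwitch,
                     k: int, rng: random.Random) -> int:
    """Propose position numbers in random order until one is verified."""
    candidates = list(range(k // 2))  # assumed range: one slot per edge switch in a pod
    rng.shuffle(candidates)
    for position in candidates:
        if aggregation.verify(edge_id, position):
            return position
    raise RuntimeError("no free position number in this pod")


rng = random.Random(0)
agg = AggregationSwitch()
print(acquire_position("edge-a", agg, k=4, rng=rng))
print(acquire_position("edge-b", agg, k=4, rng=rng))
```

With k = 4 the two edge switches of a pod always end up with distinct positions, whichever order the random proposals arrive in.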
Design: Loop-free Forwarding
• Avoids relying on a spanning tree
• Downward forwarding is separated from upward forwarding: once a packet starts travelling down, it never travels back up
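The up-then-down rule can be checked on a path expressed as a sequence of switch levels. A minimal sketch; the function name `is_up_down_path` and the level encoding are illustrative assumptions:

```python
LEVEL = {"edge": 0, "aggregation": 1, "core": 2}


def is_up_down_path(levels: list) -> bool:
    """Return True if the path climbs first and never climbs again after descending."""
    going_down = False
    for prev, curr in zip(levels, levels[1:]):
        step = LEVEL[curr] - LEVEL[prev]
        if step not in (-1, 1):
            return False  # each hop must move exactly one level up or down
        if step < 0:
            going_down = True
        elif going_down:
            return False  # climbing after a descent could form a loop
    return True


print(is_up_down_path(["edge", "aggregation", "core", "aggregation", "edge"]))
print(is_up_down_path(["edge", "aggregation", "edge", "aggregation"]))
```

The first path is valid inter-pod forwarding; the second is rejected because it goes back up after descending.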
Design: Fault Tolerance
• Unicast fault detection and action
• Multicast fault detection and action
Design: Comparison
Implementation: System
Implementation: Evaluation
Conclusion
• Commercial use
• Data center network protocol