Network Services for Enhanced Cloud Computing
T. V. Lakshman
Bell Labs
(Jointly with F. Hao, S. Mukherjee, H. Song)
Network Support For Cloud Computing: Scenario 1

“The mobile platforms … are so powerful now that you can build client apps that
do magical things that are connected with the cloud … don’t limit your imagination
to this set of problems”
- Eric Schmidt, Google CEO, Oct. 2009
[Figure: user attaches to the cloud to get service; the cloud service provider creates a VM to serve the user; two data centers]
Transparent VM migration across WAN without losing service continuity
© Alcatel-Lucent 2009. All rights reserved.
Network Support For Cloud Computing: Scenario 2
[Figure: two data centers hosting VMs, with a sudden surge of demand]
Transparent VM migration across WAN to allow resource sharing
Benefits of VM migration across multiple networks/data centers
 For Cloud Service Provider
 Service migration across data centers
 Load balancing within and across data centers
 Performance optimization
 Green computing
 For Cloud Users
 Faster access
 Efficient data delivery
Why Not Use Mobile IP to Handle VM Migration?
 All traffic anchored at home agent (HA)
 Triangular routing increases delay and burdens the network
 Fine for end-user devices with low traffic volume, but not for servers
 MIPv6 requires correspondent nodes to support MIP
 Transition from IPv4 to IPv6??
 End users will be a mixture of IPv4 and IPv6 clients
[Figure: User 1 and User 2 reach VM 1 and VM 2 via the HA; all traffic goes through the anchor point]
Network Architecture with Central Control
 Forwarding Element (FE)
 Handles all data plane functions
 Sets up a virtual backplane between each other as necessary
 Can be distributed across WAN
 Forwarding element or router with APIs for central control
 Centralized Controller (CC)
 Controls routing and signaling for mobile VM IP prefixes
 Computes and installs forwarding table for each FE
 Centralized architectures: SoftRouter (HotNets 2004), OpenFlow, RCP, 4D
FE is the first layer-3 access and aggregation point for mobile VMs
Acts as a loosely coupled router with FEs as line cards and CCs as control plane
[Figure: VICTOR architecture. A CC reaches FE1 to FE5 over control channels; FEs are connected by tunnels through the Internet; a client and VM1, VM2 attach to FEs]
Enabling Seamless Migration Of Virtual Machines Within Cloud
 Virtual machine (VM) location and migration should be transparent to customers
 Migrate VMs without losing connectivity
 Enable seamless migration using central controller architecture
 Centralized Controller (CC) registers VM location, coordinates VM movement
 Forwarding elements (FEs) distributed in different cloud data centers
 FEs announce public VM prefixes externally from all data centers
 External packets first reach closest FE, then tunneled to actual destination
 VM location is made transparent to external network
[Figure: a CC controlling FEs located in three data centers]
All Rights Reserved © Alcatel-Lucent 2008
Our Approach
 Traditional virtualization approach
 Slice and isolate resources in a physical router
 Each slice acts as a different router
 Virtual router with distributed forwarding elements managed by
logically centralized controller
 Similar in concept to [SoftRouter/OpenFlow/RCP/4D]
 Logically combine multiple physical devices to form a virtual
router
 A physical device mimics a virtual line card with multiple
virtual ports
 Virtual line cards are interconnected to mimic a virtual
backplane
 Dedicated facilities (e.g., for data centers of a cloud service provider)
 MPLS bandwidth-guaranteed paths
 Tunnels through the public Internet
Routing of Packets to VMs In Centrally Controlled Architecture
 External Routing
 Mobile VM IP prefixes announced from all FEs to external network
 To the external network, all FEs controlled by a CC appear as “one sink” for
all mobile VMs that it supports
 External routers access centrally controlled router through the FE closest
to them
 Internal Routing
 VM registers with local FE
 CC maintains “global view” of all VM locations
 Each FE maintains a forwarding table
 local bindings for locally registered VMs
 foreign bindings for remotely registered VMs
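The internal-routing bullets can be sketched as two small data structures. This is only an illustration under stated assumptions, not the authors' implementation: the names `ForwardingElement`, `CentralController` and `register`, the example address, and the choice to push bindings eagerly to every FE are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ForwardingElement:
    """One FE and its forwarding table (hypothetical layout)."""
    name: str
    local: set = field(default_factory=set)      # VMs registered at this FE
    foreign: dict = field(default_factory=dict)  # VM IP -> name of the FE hosting it


class CentralController:
    """Hypothetical CC that keeps the "global view" of VM locations."""

    def __init__(self, fes):
        self.fes = {fe.name: fe for fe in fes}
        self.location = {}  # VM IP -> FE name

    def register(self, vm_ip, fe_name):
        """Record a VM registration and refresh the bindings on every FE.

        Pushing to every FE is an assumption of this sketch; the slide
        does not say when foreign bindings are installed.
        """
        self.location[vm_ip] = fe_name
        for fe in self.fes.values():
            if fe.name == fe_name:
                fe.foreign.pop(vm_ip, None)
                fe.local.add(vm_ip)        # local binding
            else:
                fe.local.discard(vm_ip)
                fe.foreign[vm_ip] = fe_name  # foreign binding


fe1, fe2 = ForwardingElement("FE1"), ForwardingElement("FE2")
cc = CentralController([fe1, fe2])
cc.register("10.10.0.11", "FE1")
```

After the call, FE1 holds a local binding for the VM and FE2 holds a foreign binding pointing at FE1.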
Packet Forwarding
 VM  client
 FE receives a packet destined to an
external IP address
 Packet is directly sent out by looking
up external forwarding table
 Client  VM
VM 2
 Client sends packet to closest FE
VM 1
 FE tunnels packet VM’s local FE
 Local FE strips off tunnel header and
delivers packet to VM
 VM  VM
 Packets for VM with local binding are
directly forwarded
 Packets for VM with foreign binding
are forwarded to the current FE
 Packet discarded if no binding is found
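The forwarding cases on this slide reduce to a short lookup order. The sketch below is one hedged reading of it: the function name `forward`, the returned action labels, and the mobile VM prefix `10.10.0.0/16` are assumptions made for the example and do not come from the slides.

```python
import ipaddress

# Assumed prefix reserved for mobile VMs (illustrative only).
MOBILE_PREFIX = ipaddress.ip_network("10.10.0.0/16")


def forward(dst_ip, local, foreign):
    """Return the action an FE takes for a packet addressed to dst_ip.

    local:   set of VM IPs registered at this FE
    foreign: dict mapping VM IP -> name of the FE currently hosting it
    """
    if ipaddress.ip_address(dst_ip) not in MOBILE_PREFIX:
        # Destination is outside the mobile VM prefix: use the external table.
        return ("external", dst_ip)
    if dst_ip in local:
        # Local binding: deliver to the VM (a tunnel header, if any, is stripped first).
        return ("deliver", dst_ip)
    if dst_ip in foreign:
        # Foreign binding: tunnel to the FE that currently hosts the VM.
        return ("tunnel", foreign[dst_ip])
    # Mobile VM address with no binding: discard.
    return ("drop", dst_ip)


local = {"10.10.0.11"}
foreign = {"10.10.0.36": "FE2"}
actions = [forward(ip, local, foreign)
           for ip in ("10.10.0.11", "10.10.0.36", "10.10.0.99", "192.0.2.7")]
```

The same function covers client-to-VM and VM-to-VM traffic, since both end in a binding lookup at the FE that receives the packet.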
VM Migration across Data Centers
[Figure: a VM moves from its old location to a new location in another data center; Data Centers 1, 2 and 3 are shown, with arrows numbered 1 to 5 matching the steps]
1) Start copy between old and new locations
2) VM sends an ARP
3) Local FE receives the ARP and sends the message to CC
4) CC installs new routing entry in local FE for the VM
5) CC installs new routing entry in the old FE
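Steps 3 to 5 can be sketched as a single controller-side handler. This is a minimal illustration under assumptions: the function name `handle_arp_from_vm` and the table layout are hypothetical, and only the new local FE and the old FE are updated because those are the only two installs the slide lists.

```python
def handle_arp_from_vm(tables, location, vm_ip, new_fe):
    """React to an ARP relayed to the CC by the VM's new local FE.

    tables:   FE name -> {"local": set of VM IPs, "foreign": {VM IP: FE name}}
    location: VM IP -> FE name (the CC's view of where each VM is)
    Returns the name of the old FE, or None if the VM did not move.
    """
    old_fe = location.get(vm_ip)
    if old_fe == new_fe:
        return None  # same FE as before: nothing to install

    # Step 4: install the entry in the (new) local FE.
    tables[new_fe]["foreign"].pop(vm_ip, None)
    tables[new_fe]["local"].add(vm_ip)

    # Step 5: point the old FE at the new location.
    if old_fe is not None:
        tables[old_fe]["local"].discard(vm_ip)
        tables[old_fe]["foreign"][vm_ip] = new_fe

    location[vm_ip] = new_fe
    return old_fe


tables = {
    "FE1": {"local": {"10.10.0.11"}, "foreign": {}},
    "FE2": {"local": set(), "foreign": {"10.10.0.11": "FE1"}},
}
location = {"10.10.0.11": "FE1"}
moved_from = handle_arp_from_vm(tables, location, "10.10.0.11", "FE2")
```

Once the old FE holds a foreign binding, packets that still arrive there are tunneled on to the new FE, which is what keeps existing sessions reachable during the move.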

Mobile VMs must have IP addresses that do not conflict with any other hosts in the cloud

VMs with destination NAT-ed addresses are moved by allocating non-conflicting private
addresses to the mobile VMs
VM migration within a data center is similar in principle but simpler
Experimental Prototype
[Figure: testbed with a Controller, Server1 (10.2.3.1), Server2 (10.4.5.1), Server3 (10.6.7.1), a VM (10.10.0.11), FE1, FE2 and FE3 (4-port NetFPGA), and access points AP1 and AP2]
 Prototype based on Linux (FC 9)
 All FEs are controlled by CC
 FE2 and FE3 have 4-port NetFPGA GbE card
 Developed new OpenFlow controller to support
 Mobile node registration
 Layer 3 routing
 VM Migration
 VM migrated from Server1 to Server2
 Ping VM from Server3 at 0.01 sec interval
 Packet loss = 350 → 3.5 sec connectivity interruption
 Same downtime over LAN migration → negligible overhead
 Physical Host Migration
 Mobile PC changed attachment from AP1 to AP2
 Ping mobile PC from Server3 at 0.01 sec interval
 Packet loss = 1-2 → 0.01-0.02 sec connectivity interruption
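The interruption figures follow from multiplying the number of lost pings by the ping interval. A short check of that arithmetic, using only the numbers reported on the slide (the helper name `interruption_seconds` is made up for this example):

```python
def interruption_seconds(lost_pings, interval_s):
    """Estimate the connectivity gap as lost pings times the ping interval."""
    return lost_pings * interval_s


vm_migration = interruption_seconds(350, 0.01)  # VM moved from Server1 to Server2
host_low = interruption_seconds(1, 0.01)        # mobile PC, best case
host_high = interruption_seconds(2, 0.01)       # mobile PC, worst case
```

The estimate is only as fine as the probe interval, so with 0.01 sec pings the physical-host result cannot be resolved below 0.01 sec.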
[Figure label: Mobile PC, 10.10.0.36]
Details in ACM CCR, ACM SIGCOMM VISA workshop paper
Network Support For Cloud Computing: Scenario 3

Enterprises need extra computing capacity off-and-on to accommodate variation in demand
Home users need an extra server to support/interact with various devices in the home network
[Figure: VMs in two data centers serving an Enterprise Network and a Home Network]
Transparent cloud computing service to enable seamless integration of
computing resources between cloud and user
Transparent Cloud Computing -- Challenges
 Address mapping: Address space of cloud-based resources must be
mapped to enterprise address space
 Isolation: Customers should only see their network extension in the
cloud and should be isolated from other customers using the cloud
 Location independence: Virtual machines running customer applications should be movable between customer sites and anywhere in the data center
 Policy control: each customer can change its policy settings for the cloud resources on the fly
 Scalability: service scale restricted only by total resources available, not dependent on customer composition
 A few large enterprises vs. many small businesses or individual users
Isolation Using VLANs
 Servers are partitioned into LANs or VLANs, connected by L2 switches
 LANs and VLANs are connected by routers
 VLANs can be extended across routers via VLAN trunks (tunnels)
 To support virtual private network for enterprise:
 Use VLAN to isolate customers and avoid IP address conflict
 VLAN trunking to expand one subnet across L3 routers
 L3 IPSec tunnel used between enterprise edge and data center edge
[Figure: VPN connecting user to cloud; a VLAN for each customer; VMs attached to a virtual switch in the hypervisor]
Central Control Based Architecture
 Partition data center network into smaller domains
 Use VLANs to isolate customers within a domain
 No “global” VLANs
 VLAN IDs reused across domains
 Use router with central control to “glue” different domains together
 FEs forward traffic between domains
 CC stores mapping between user and their VLANs in each domain
 Per-user policy control
 Middleboxes attached to FEs
 Policy routing enforced by FEs
 CC stores per-customer policy
 User can configure their policy on-the-fly
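Because VLANs are scoped to a domain, the CC can hand out IDs per domain and reuse the same ID elsewhere. The sketch below illustrates that bookkeeping under assumptions: the class `VlanMap`, its method `vlan_for`, the domain and customer names, and the starting ID of 2 are all hypothetical. The upper bound 4094 reflects the 12-bit 802.1Q VLAN ID field, where 0 and 4095 are reserved.

```python
class VlanMap:
    """Hypothetical CC-side table: per-domain mapping of customer -> VLAN ID."""

    def __init__(self, first_id=2, last_id=4094):
        self.first_id = first_id
        self.last_id = last_id
        self.by_domain = {}  # domain -> {customer: vlan_id}

    def vlan_for(self, domain, customer):
        """Return the customer's VLAN ID in this domain, allocating one if needed."""
        table = self.by_domain.setdefault(domain, {})
        if customer not in table:
            used = set(table.values())
            for vid in range(self.first_id, self.last_id + 1):
                if vid not in used:
                    table[customer] = vid
                    break
            else:
                raise RuntimeError(f"no free VLAN ID in domain {domain}")
        return table[customer]


vmap = VlanMap()
acme_edge1 = vmap.vlan_for("edge-1", "acme")
globex_edge1 = vmap.vlan_for("edge-1", "globex")
globex_edge2 = vmap.vlan_for("edge-2", "globex")
```

Within `edge-1` the two customers get distinct IDs, while `edge-2` starts again from the first free ID, so the limit on VLAN IDs applies per domain rather than to the whole data center.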
Transparent Cloud Computing Using Central Control
Mapping cloud-based resources into customer networks
• Uses a controller (CC) in data center that controls a set of forwarding elements (FEs)
• CC stores address mapping, policy rules, VM locations
• FEs resolve addresses, enforce policies, forward packets
• MAC-in-MAC tunnel between FEs
• Middleboxes (FW, LB, etc.) attached to FE
• Virtual MAC address for each VM
• Same customer across multiple domains sees one logical network
Edge domain
• Each edge domain partitioned into different VLANs
• One VLAN per customer subnet
• VLAN ID reused across domains
• Customer site network is a special edge domain
Core domain
• Core domain transports packets between edge domains
• No VLAN, flat L2 network
[Figure: data center with Controller and Prov/NMS; FE1 to FE4 connect edge domains of VMs through the core domain, with FW, LB, ... middleboxes attached to FEs; customer sites attach via VPLS and IPsec]
Network-Cloud Joint Resource Allocation
User Resource Request and Allocation
Cloud Broker
Provide users with all the resources needed for a service (network, computation, storage, …)
Cloud brokers offer this service by partnering with cloud and network providers and providing brokering services amongst the providers
[Figure: the user request (user requirement) goes to the cloud broker, which returns an allocation drawn from network service providers and cloud service providers (infrastructure services, software services, platforms, …)]
Example Request Scenario 1: Online Game Networks On Demand
 A group wants to set up a gaming session on demand from different points of attachment into the network
 Request for group specifies network service needs, game service, server need
 Group sends the gaming session set-up request to a broker or service provider
 The provider sets up a “virtual private network” for the group on demand
 The game is provisioned in cloud-resident servers and plug-ins are installed on users’ browsers* if the user does not have the game console
 The resources are allocated flexibly so that new users can be easily provisioned into
the session and/or session’s properties can be modified (e.g., more bandwidth,
lower delay, etc)
*http://www.onlive.com/service.html
Example Request Scenario 2: Supercomputing For The Masses*
 Application: Offering immense computing power to any interested party
 User requests computing resources at different locations
 User specifies locations where data resides and network needs for data access
 Users require network resources along with computation resources.
 User request sent to a broker or service provider which coordinates allocation
of network and computation resources
 The provider allocates computing and storage resources
 The provider allocates network resources to move data around the computing cloud
 The resources are allocated flexibly so that new tasks can be easily provisioned into
the system, resources taken out when not in use
*Supercomputing for the Masses: http://www.nytimes.com/2009/11/23/technology/23compute.html
Offering Cloud Brokering With Service Provider And Cloud Resources
 Cloud providers and service providers partner to support service
 Participating cloud and service providers:
 Publish their resources to participants using a common language
– computing, storage, networking, special services, applications, etc.
 Need a universal publish, subscribe and manage model for specifying resources
 Brokering service:
 Matches available resources to user requests dynamically
 Provides a value-added service by using pricing, congestion, location, traffic information
 Coordinates provisioning of requested resources and presents an integrated network-IT service
to users
 A user request specifies:
 Resource category, load and duration
 Connectivity needs and location constraints
 Traffic treatment …..
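The publish-and-match flow described on this slide can be sketched with two record types and a selection function. This is a deliberately small illustration, not a proposed resource language: the classes `Offer` and `Request`, their fields, the provider names and the prices are all hypothetical, and the matcher uses only category, capacity, location and price, whereas the slide also mentions congestion and traffic information.

```python
from dataclasses import dataclass, field


@dataclass
class Offer:
    """A resource published by a participating provider (illustrative fields)."""
    provider: str
    category: str   # e.g. "compute" or "storage"
    location: str
    capacity: int   # units currently available
    price: float    # price per unit


@dataclass
class Request:
    """What a user asks the broker for (illustrative fields)."""
    category: str
    load: int
    duration_h: int
    locations: set = field(default_factory=set)  # empty set means "anywhere"


def match(request, offers):
    """Return the cheapest offer that satisfies the request, or None."""
    candidates = [
        o for o in offers
        if o.category == request.category
        and o.capacity >= request.load
        and (not request.locations or o.location in request.locations)
    ]
    return min(candidates, key=lambda o: o.price, default=None)


offers = [
    Offer("cloud-a", "compute", "NY", capacity=40, price=0.12),
    Offer("cloud-b", "compute", "CA", capacity=100, price=0.10),
    Offer("cloud-c", "compute", "NY", capacity=80, price=0.15),
]
best = match(Request("compute", load=50, duration_h=24, locations={"NY"}), offers)
none_found = match(Request("storage", load=1, duration_h=1), offers)
```

In this example `cloud-b` is cheapest but outside the requested location and `cloud-a` lacks capacity, so the broker settles on `cloud-c`.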
Challenges: How To Choose Resources For Allocation To User Requests?
User requests network and cloud resources
Multiple cloud providers have available resources
How to choose resources using dynamic pricing, connectivity, traffic and routing needs?
[Figure: the user requirement is mapped onto cloud and network resources (VMs and disks spread over several data centers)]
Network state and available resources are used to decide which resources to allocate, satisfying service-level requirements and making best use of the network
Resource Mapping Challenges: Mapping one user request
How to Optimally Place Allocated Virtual Resources?
 User input: resource and communication needs
 Run MapReduce for computation
 Read from large databases at S1, S2
User input used for optimizing choice of resources allocated to user
Better performance for user application
Least cost network service for user
[Figure: user input and the resulting allocation and mapping. User input: “map” stages at S1 and S2 (30 and 50 VMs, 3 TB and 5 TB of storage), a “reduce” stage (8 VMs, 2 TB), and 2 web server VMs placed anywhere between S1 and S2. Allocation across data centers in CA, NY and IA, with connectivity between sites: 50 VMs on one rack with 5 TB storage on the same LAN; 30 VMs on one rack with 3 TB storage on the same LAN; two racks, one with 8 VMs and one with 2 VMs, with 2 TB storage]
Mapping Multiple Users: Optimally Mapping Multiple Virtual Networks and Cloud Resources Into Network
 Game network:
 Traffic demand not known point-to-point.
 Game server location & capacity to be fixed, receiver location & capacity fixed, …
 QoS guaranteed session creation
 Very low latency.
 Low jitter and low bandwidth
 E-science cloud network:
 Traffic matrix: A->C: 10Mbps, B->D: 20Mbps, …
 Known traffic, low burstiness
[Figure: the E-Science Network and the Game Network are each deployed onto the Physical Network, which contains storage, compute servers and game servers; nodes are labeled A to E]
Challenges ….
 How do you take the user request and allocate the requested resources from those available from multiple cloud providers and connectivity providers?
 Allocation must meet service requirements and maximize resource usage
 How do you provision and instantiate these networks and resources on very short time scales?
 How do you handle a large number of set-ups and tear-downs when the requests are far more complex than connections?
 Keeping track of the current state of resources and updating it in a distributed manner
 What happens when different requested components belong to different domains?
 Standard description of resources offered and customer demand?
 Processing, memory, storage, location, bandwidth, routing, …
 Standardized, unambiguous and expressive request specification