Hybrid Reliable Multicast with TCP-XM

Karl Jeacle
Jon Crowcroft
Marinho P Barcellos, UNISINOS
Stefano Pettini, ESA

Rationale

- Would like to achieve high-speed bulk data delivery to multiple sites
- Multicasting would make sense
- Existing multicast research has focused on sending to a large number of receivers
- But Grid is an applied example where sending to a moderate number of receivers would be extremely beneficial

CoNEXT 2005

Multicast availability

- Deployment is a problem!
- Protocols have been defined and implemented
- Valid concerns about scalability; much FUD
- “Chicken & egg” means limited coverage
- Clouds of native multicast
- But can’t reach all destinations via multicast
- So applications abandon multicast in favour of unicast
- What if we could multicast when possible…
- …but fall back to unicast when necessary?

Multicast TCP?

- TCP
  - “single reliable stream between two hosts”
- Multicast TCP
  - “multiple reliable streams from one to n hosts”
- May seem a little odd, but there is precedent…
  - TCP-XMO – Liang & Cheriton
  - M-TCP – Mysore & Varghese
  - M/TCP – Visoottiviseth et al
  - PRMP – Barcellos et al
  - SCE – Talpade & Ammar

ACK implosion


Building Multicast TCP

- Want to test multicast/unicast TCP approach
- But new protocol == kernel change
- Widespread test deployment difficult
- Build new TCP-like engine
- Encapsulate packets in UDP
- Run in userspace
- Performance is sacrificed…
- …but widespread testing now possible

TCP/IP/UDP/IP

[Figure: protocol stacks for the Sending Application and the Receiving Application, each layered as TCP / IP / UDP / IP. Labels on the figure: “If natively implemented” and “test deployment”.]

TCP engine

- Where does initial TCP come from?
- Could use BSD or Linux
- Extracting from kernel could be problematic
- More compact alternative
- lwIP = Lightweight IP
- Small but fully RFC-compliant TCP/IP stack
- lwIP + multicast extensions = “TCP-XM”

TCP-XM overview

- Primarily aimed at push applications
- Sender initiated – advance knowledge of receivers
- Opens sessions to n destination hosts simultaneously
- Unicast is used when multicast is not available
- TCP header options used to exchange multicast info
- API changes
  - Sender incorporates multiple destination and group addresses
  - Receiver requires no changes
- TCP friendly, by definition

TCP

[Figure: TCP message sequence between Sender and Receiver. Labels on the Sender side: SYN, ACK, DATA, FIN, ACK. Labels on the Receiver side: SYNACK, ACK, ACK, FIN.]

TCP-XM

[Figure: one Sender with Receiver 1, Receiver 2 and Receiver 3.]

TCP-XM connection

- Connection
  - User connects to multiple unicast destinations
  - Multiple TCP PCBs created
  - Independent 3-way handshakes take place
  - SSM or random ASM group address allocated (if not specified in advance by user/application)
  - Group address sent as TCP option
  - Ability to multicast depends on TCP option

TCP Group Option

    Field                     Size
    kind=50                   1 byte
    len=6                     1 byte
    Multicast Group Address   4 bytes

- New group option sent in all TCP-XM SYN packets
- Non TCP-XM hosts will ignore it (no option in SYNACK)
- Presence implies multicast capability
- Sender will automatically revert to unicast when the option is not returned
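
The option layout above can be sketched in C as follows. This is an illustrative sketch only, not the TCP-XM source: the names xm_put_group_option and xm_find_group_option are hypothetical, and the group address is assumed to be an IPv4 address carried as four bytes in network order. The scan follows the standard TCP option encoding (kind 0 ends the list, kind 1 is a one-byte no-op, all others carry a length byte).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define XM_OPT_GROUP_KIND 50  /* option kind shown on the slide */
#define XM_OPT_GROUP_LEN   6  /* kind + len + 4-byte group address */

/* Write the group option into buf (at least 6 bytes available).
 * group holds the IPv4 group address in network byte order.
 * Returns the number of bytes written. */
size_t xm_put_group_option(uint8_t *buf, const uint8_t group[4])
{
    buf[0] = XM_OPT_GROUP_KIND;
    buf[1] = XM_OPT_GROUP_LEN;
    memcpy(&buf[2], group, 4);
    return XM_OPT_GROUP_LEN;
}

/* Scan a TCP options area for the group option.
 * Returns 1 and fills group[] if found, 0 if absent or malformed.
 * A return of 0 is the case where the sender would stay on unicast. */
int xm_find_group_option(const uint8_t *opts, size_t len, uint8_t group[4])
{
    size_t i = 0;
    while (i < len) {
        uint8_t kind = opts[i];
        if (kind == 0)                    /* End of Option List */
            break;
        if (kind == 1) {                  /* No-Operation, single byte */
            i++;
            continue;
        }
        if (i + 1 >= len)                 /* truncated: no length byte */
            break;
        uint8_t olen = opts[i + 1];
        if (olen < 2 || i + olen > len)   /* malformed length */
            break;
        if (kind == XM_OPT_GROUP_KIND && olen == XM_OPT_GROUP_LEN) {
            memcpy(group, &opts[i + 2], 4);
            return 1;
        }
        i += olen;                        /* skip an unrelated option */
    }
    return 0;
}
```

A receiver that does not understand kind 50 simply skips it by its length byte, which is why no change is needed on non TCP-XM hosts.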

TCP-XM transmission

- Data transfer
  - Data replicated/enqueued on all send queues
  - PCB variables dictate transmission mode
  - Data packets are multicast (if possible)
  - Retransmissions are unicast
  - Auto fall back/forward to unicast/multicast
- Close
  - Connections closed as per unicast TCP

TCP-XM protocol states


Fall back / fall forward

- TCP-XM principle
  - “Multicast if possible, unicast when necessary”
- Initial transmission mode is group unicast
  - Ensures successful initial data transfer
- Fall forward to multicast on positive feedback
  - Typically after ~75K of unicast data
- Fall back to unicast on repeated mcast failure

Multicast feedback Option

    Field                          Size
    kind=51                        1 byte
    len=3                          1 byte
    % Multicast Packets Received   1 byte

- New feedback option sent in ACKs from receiver
- Only used between TCP-XM hosts
- Indicates % of last n packets received via multicast
- Used by sender to fall forward to multicast transmission
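
A rough sketch of how the feedback option and the mode switch could fit together is given below. It is not the TCP-XM implementation: the function names, the enum, and both thresholds (XM_FORWARD_PCT and XM_MAX_MCAST_RTX) are assumptions chosen for illustration; the talk does not state the actual values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define XM_OPT_FEEDBACK_KIND 51  /* option kind shown on the slide */
#define XM_OPT_FEEDBACK_LEN   3  /* kind + len + 1-byte percentage */

/* Hypothetical thresholds, not taken from the talk. */
#define XM_FORWARD_PCT   90  /* fall forward at or above this percentage */
#define XM_MAX_MCAST_RTX  3  /* fall back after this many mcast retransmits */

enum xm_tx_mode { XM_TX_UNICAST, XM_TX_MULTICAST };

/* Receiver side: write the feedback option into buf (3 bytes).
 * pct is clamped to 100. Returns the number of bytes written. */
size_t xm_put_feedback_option(uint8_t *buf, uint8_t pct)
{
    buf[0] = XM_OPT_FEEDBACK_KIND;
    buf[1] = XM_OPT_FEEDBACK_LEN;
    buf[2] = pct > 100 ? 100 : pct;
    return XM_OPT_FEEDBACK_LEN;
}

/* Sender side: pick the next transmission mode for one receiver.
 * Falls forward on good multicast feedback, falls back after
 * repeated multicast retransmissions, otherwise keeps the mode. */
enum xm_tx_mode xm_next_mode(enum xm_tx_mode cur, uint8_t feedback_pct,
                             uint8_t mcast_rtx)
{
    if (cur == XM_TX_UNICAST && feedback_pct >= XM_FORWARD_PCT)
        return XM_TX_MULTICAST;
    if (cur == XM_TX_MULTICAST && mcast_rtx >= XM_MAX_MCAST_RTX)
        return XM_TX_UNICAST;
    return cur;
}
```

Keeping the decision in one pure function makes the policy easy to test separately from the packet path.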

TCP-XM reception

- Receiver
  - No API-level changes
  - Normal TCP listen
  - Auto-IGMP join on TCP-XM connect
  - Accepts data on both unicast/multicast ports
- tcp_input() accepts:
  - packets addressed to existing unicast destination…
  - …but now also those addressed to multicast group
- Tracks how the last n segments were received (unicast/multicast)
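
One way the receiver could track how the last n segments arrived is a fixed-size ring buffer with a running count. The sketch below assumes n = 128 to match the msegrcv[128] array in the PCB; the struct and function names are hypothetical and the real bookkeeping in TCP-XM may differ.

```c
#include <assert.h>
#include <stdint.h>

#define XM_SEG_WINDOW 128  /* assumed n, matching msegrcv[128] in the PCB */

struct xm_seg_history {
    uint8_t  is_mcast[XM_SEG_WINDOW]; /* 1 if that segment came via multicast */
    uint16_t next;                    /* slot to overwrite next */
    uint16_t count;                   /* valid entries, up to XM_SEG_WINDOW */
    uint16_t mcast;                   /* valid entries that are multicast */
};

/* Record one received segment, evicting the oldest once the window is full. */
void xm_seg_record(struct xm_seg_history *h, int via_mcast)
{
    if (h->count == XM_SEG_WINDOW)
        h->mcast -= h->is_mcast[h->next];  /* oldest entry leaves the window */
    else
        h->count++;
    h->is_mcast[h->next] = via_mcast ? 1 : 0;
    h->mcast += h->is_mcast[h->next];
    h->next = (uint16_t)((h->next + 1) % XM_SEG_WINDOW);
}

/* Percentage (0..100) of tracked segments that arrived via multicast. */
uint8_t xm_seg_mcast_percent(const struct xm_seg_history *h)
{
    if (h->count == 0)
        return 0;
    return (uint8_t)((100u * h->mcast) / h->count);
}
```

The value returned by xm_seg_mcast_percent is what would be placed in the one-byte feedback option.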

API changes

- Only relevant if natively implemented!
- Sender API changes
  - New connection type
  - Connect to port on array of destinations
  - Single write sends data to all hosts
- TCP-XM in use:

    conn = netconn_new(NETCONN_TCPXM);
    netconn_connectxm(conn, remotedest, numdests, group, port);
    netconn_write(conn, data, len, …);

PCB changes

- Every TCP connection has an associated Protocol Control Block (PCB)
- TCP-XM adds:

    struct tcp_pcb {
      …
      struct tcp_pcb *firstpcb;    /* first of the mpcbs */
      struct tcp_pcb *nextm;       /* next tcpxm pcb */
      enum tx_mode txmode;         /* unicasting or multicasting */
      u8_t nrtxm;                  /* number of retransmits for multicast */
      u32_t nrtxmtime;             /* time since last retransmit */
      u32_t mbytessent;            /* total bytes sent via multicast */
      u32_t mbytesrcvd;            /* total bytes received via multicast */
      u32_t ubytessent;            /* total bytes sent via unicast */
      u32_t ubytesrcvd;            /* total bytes received via unicast */
      struct segrcv msegrcv[128];  /* ismcast boolean for last n segs */
      u8_t msegrcvper;             /* % of last segs received via mcast */
      u8_t msegrcvcnt;             /* counter for segs recvd via mcast */
      u8_t msegsntper;             /* % of last segs delivered via mcast */
    };

Linking PCBs

[Figure: chain of PCBs M1, U1, M2, M3, U2, each with a next pointer and a nextm pointer.]

- *next points to the next TCP session
- *nextm points to the next TCP session that’s part of a particular TCP-XM connection
- Minimal timer and state machine changes
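
The two chains can be illustrated with a stripped-down structure. This is a sketch under assumptions: struct xm_pcb stands in for lwIP's struct tcp_pcb and keeps only the linking fields named on the slides, and xm_link and xm_member_count are hypothetical helpers, not functions from the TCP-XM source.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct tcp_pcb (linking fields only). */
struct xm_pcb {
    struct xm_pcb *next;     /* next TCP session of any kind */
    struct xm_pcb *firstpcb; /* first PCB of this TCP-XM connection */
    struct xm_pcb *nextm;    /* next PCB in the same TCP-XM connection */
};

/* Append pcb to the TCP-XM connection whose first member is head. */
void xm_link(struct xm_pcb *head, struct xm_pcb *pcb)
{
    struct xm_pcb *p = head;
    head->firstpcb = head;          /* the head is its own first member */
    while (p->nextm != NULL)
        p = p->nextm;
    p->nextm = pcb;
    pcb->firstpcb = head;
    pcb->nextm = NULL;
}

/* Count the members of the TCP-XM connection that pcb belongs to. */
size_t xm_member_count(const struct xm_pcb *pcb)
{
    size_t n = 0;
    for (const struct xm_pcb *p = pcb->firstpcb; p != NULL; p = p->nextm)
        n++;
    return n;
}
```

With the chain from the figure, linking M2 and M3 to M1 leaves U1 on the global next chain only, so ordinary unicast sessions are untouched by per-connection walks.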

What happens to the cwin?

- Multiple receivers
- Multiple PCBs
- Multiple congestion windows
- Default to min(cwin)
  - i.e. send at rate of slowest receiver
- Is this really so bad?
  - Compare to time taken for n unicast transfers
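
The min(cwin) rule amounts to a walk over the PCBs of one TCP-XM connection. The sketch below assumes a simplified structure (struct xm_cw_pcb with a nextm link and a 32-bit cwnd field) and a hypothetical helper name; lwIP's real PCB and window types differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a PCB: only the fields this example needs. */
struct xm_cw_pcb {
    struct xm_cw_pcb *nextm; /* next PCB in the same TCP-XM connection */
    uint32_t cwnd;           /* congestion window for this receiver, bytes */
};

/* Smallest congestion window across all receivers of one connection.
 * Returns 0 for an empty chain. */
uint32_t xm_min_cwnd(const struct xm_cw_pcb *first)
{
    uint32_t min = UINT32_MAX;
    if (first == NULL)
        return 0;
    for (const struct xm_cw_pcb *p = first; p != NULL; p = p->nextm) {
        if (p->cwnd < min)
            min = p->cwnd;
    }
    return min;
}
```

On the comparison raised above: if the n unicast transfers were run one after another, a single transfer paced by the slowest receiver would take about as long as the slowest of them alone, which is one term of their sum.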

LAN speed


LAN efficiency


WAN speed


WAN efficiency


Multiple TCP-XM flows


TCP-XM vs TCP flows


Protocol performance


Protocol efficiency


Grid multicast?

- How can multicast be used in a Grid environment?
- TCP-XM is a new multicast-capable protocol
- Globus is the de facto Grid middleware
- Would like TCP-XM support in Globus…

Globus XIO

- eXtensible Input Output library
  - Allows “i/o plugins” to Globus
- API
  - Single POSIX-like API / set of semantics
  - Simple open/close/read/write API
- Driver abstraction
  - Hides protocol details / Allows for extensibility
  - Stack of 1 transport & n transform drivers
  - Drivers can be selected at runtime

XIO architecture


XIO implementation


XIO/XM driver specifics

- Two important XIO data structures
  1. Handle
     - Returned to user when XIO framework ready
     - Used for all open/close/read/write calls
     - lwIP netconn connection structure used
  2. Attribute
     - Used to set XIO driver specific parameters…
     - … and TCP-XM protocol-specific options
     - List of destination addresses

XIO code example

    // init stack
    globus_xio_stack_init(&stack, NULL);

    // load drivers onto stack
    globus_xio_driver_load("tcpxm", &txdriver);
    globus_xio_stack_push_driver(stack, txdriver);

    // init attributes
    globus_xio_attr_init(&attr);
    globus_xio_attr_cntl(attr, txdriver, GLOBUS_XIO_TCPXM_SET_REMOTE_HOSTS, hosts, numhosts);

    // create handle
    globus_xio_handle_create(&handle, stack);

    // send data
    globus_xio_open(&handle, NULL, target);
    globus_xio_write(handle, "hello\n", 6, 1, &nbytes, NULL);
    globus_xio_close(handle, NULL);

One-to-many issues

- Stack assumes one-to-one connections
  - XIO user interface requires modification
  - Needs support for one-to-many protocols
  - Minimal user API changes
  - Framework changes more significant
- GSI is one-to-one
  - Authentication with peer on connection setup
  - But cannot authenticate with n peers
  - Need some form of “GSI-M”

Driver availability

- Multicast transport driver for Globus XIO
- Requires Globus 3.2 or later
- Source code online
  - Sample client
  - Sample server
  - Driver installation instructions
- http://www.cl.cam.ac.uk/~kj234/xio/

mcp & mcpd

- Multicast file transfer application using TCP-XM
- ‘mcpd &’ on servers
- ‘mcp file host1 host2… hostN’ on client
- http://www.cl.cam.ac.uk/~kj234/mcp/
- Full source code online
- FreeBSD, Linux, Solaris

All done!

- Thanks for listening!
- Questions?