Hybrid Reliable Multicast with TCP-XM
Karl Jeacle, Jon Crowcroft, Marinho P Barcellos (UNISINOS), Stefano Pettini (ESA)
Rationale
  Would like to achieve high-speed bulk data delivery to multiple sites
  Multicasting would make sense
  Existing multicast research has focused on sending to a large number of receivers
  But Grid is an applied example where sending to a moderate number of receivers would be extremely beneficial
CoNEXT 2005
Multicast availability
  Deployment is a problem!
  Protocols have been defined and implemented
  Valid concerns about scalability; much FUD
  “Chicken & egg” means limited coverage
  Clouds of native multicast
  But can’t reach all destinations via multicast
  So applications abandon in favour of unicast
  What if we could multicast when possible…
  …but fall back to unicast when necessary?
Multicast TCP?
  TCP: “single reliable stream between two hosts”
  Multicast TCP: “multiple reliable streams from one to n hosts”
  May seem a little odd, but there is precedent…
    TCP-XMO – Liang & Cheriton
    M-TCP – Mysore & Varghese
    M/TCP – Visoottiviseth et al
    PRMP – Barcellos et al
    SCE – Talpade & Ammar
ACK implosion
Building Multicast TCP
  Want to test multicast/unicast TCP approach
  But new protocol == kernel change
  Widespread test deployment difficult
  Build new TCP-like engine
  Encapsulate packets in UDP
  Run in userspace
  Performance is sacrificed…
  …but widespread testing now possible
TCP/IP/UDP/IP
  [Diagram: sending and receiving applications, each over the stack TCP / IP / UDP / IP; labels: “If natively implemented” and “test deployment”]
TCP engine
  Where does initial TCP come from?
  Could use BSD or Linux
  Extracting from kernel could be problematic
  More compact alternative
  lwIP = Lightweight IP
  Small but fully RFC-compliant TCP/IP stack
  lwIP + multicast extensions = “TCP-XM”
TCP-XM overview
  Primarily aimed at push applications
  Sender initiated – advance knowledge of receivers
  Opens sessions to n destination hosts simultaneously
  Unicast is used when multicast not available
  Options headers used to exchange multicast info
  API changes
    Sender incorporates multiple destination and group addresses
    Receiver requires no changes
  TCP friendly, by definition
TCP
  [Diagram: message exchange between Sender and Receiver. Sender sends SYN, ACK, DATA, FIN, ACK; Receiver sends SYNACK, ACK, ACK, FIN]
TCP-XM
  [Diagram: one Sender and Receiver 1, Receiver 2, Receiver 3]
TCP-XM connection
  Connection
    User connects to multiple unicast destinations
    Multiple TCP PCBs created
    Independent 3-way handshakes take place
    SSM or random ASM group address allocated (if not specified in advance by user/application)
    Group address sent as TCP option
    Ability to multicast depends on TCP option
TCP Group Option
  Option layout:
    kind=50 (1 byte) | len=6 (1 byte) | Multicast Group Address (4 bytes)
  New group option sent in all TCP-XM SYN packets
  Non TCP-XM hosts will ignore (no option in SYNACK)
  Presence implies multicast capability
  Sender will automatically revert to unicast
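The group option layout (kind=50, len=6, 4-byte group address) can be sketched as a pair of encode/scan helpers. This is a minimal illustration, not the lwIP/TCP-XM source: the function names are hypothetical, and the address is passed as four bytes already in network order. The scan loop follows the standard TCP option format (kind 0 ends the list, kind 1 is a one-byte NOP, every other option carries a length byte).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TCPXM_OPT_GROUP_KIND 50 /* option kind from the slide */
#define TCPXM_OPT_GROUP_LEN   6 /* kind + len + 4-byte group address */

/* Hypothetical helper: write the group option into buf.
 * Returns the number of bytes written, or 0 if buf is too small. */
size_t tcpxm_write_group_opt(uint8_t *buf, size_t buflen, const uint8_t group[4])
{
    if (buflen < TCPXM_OPT_GROUP_LEN) {
        return 0;
    }
    buf[0] = TCPXM_OPT_GROUP_KIND;
    buf[1] = TCPXM_OPT_GROUP_LEN;
    memcpy(&buf[2], group, 4);
    return TCPXM_OPT_GROUP_LEN;
}

/* Hypothetical helper: scan a TCP options area for the group option.
 * Returns 1 and copies the address into group if found, 0 otherwise.
 * A return of 0 corresponds to the "revert to unicast" case. */
int tcpxm_find_group_opt(const uint8_t *opts, size_t optlen, uint8_t group[4])
{
    size_t i = 0;
    while (i < optlen) {
        uint8_t kind = opts[i];
        if (kind == 0) {            /* end of option list */
            break;
        }
        if (kind == 1) {            /* NOP: single byte, no length field */
            i++;
            continue;
        }
        if (i + 1 >= optlen) {      /* truncated: no room for a length byte */
            break;
        }
        uint8_t len = opts[i + 1];
        if (len < 2 || i + len > optlen) { /* malformed length */
            break;
        }
        if (kind == TCPXM_OPT_GROUP_KIND && len == TCPXM_OPT_GROUP_LEN) {
            memcpy(group, &opts[i + 2], 4);
            return 1;
        }
        i += len;                   /* skip an option we do not care about */
    }
    return 0;
}
```

A peer that does not understand kind 50 simply skips it, which is why a missing option in the SYNACK can be read as "no multicast capability".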
TCP-XM transmission
  Data transfer
    Data replicated/enqueued on all send queues
    PCB variables dictate transmission mode
    Data packets are multicast (if possible)
    Retransmissions are unicast
    Auto fall back/forward to unicast/multicast
  Close
    Connections closed as per unicast TCP
TCP-XM protocol states
Fall back / fall forward
  TCP-XM principle
    “Multicast if possible, unicast when necessary”
  Initial transmission mode is group unicast
    Ensures successful initial data transfer
  Fall forward to multicast on positive feedback
    Typically after ~75K unicast data
  Fall back to unicast on repeated mcast failure
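The fall back / fall forward rule can be sketched as a small per-receiver mode update. This is a rough sketch under stated assumptions, not the TCP-XM implementation: the field names follow the PCB slide (txmode, ubytessent, msegsntper, nrtxm), but the three thresholds, the two-value mode enum, and the reset of the byte counter on fall back are all hypothetical choices made for illustration.

```c
#include <stdint.h>

enum tx_mode { TX_UNICAST, TX_MULTICAST };

/* All three thresholds are assumptions made for this sketch. */
#define XM_FALL_FORWARD_BYTES 75000u /* "~75K" from the slide; exact value assumed */
#define XM_FALL_FORWARD_PCT      50u /* assumed meaning of "positive feedback" */
#define XM_FALL_BACK_NRTX         3u /* assumed meaning of "repeated failure" */

struct xm_state {
    enum tx_mode txmode;  /* current transmission mode */
    uint32_t ubytessent;  /* bytes sent via unicast */
    uint8_t  msegsntper;  /* % of recent segments delivered via multicast */
    uint8_t  nrtxm;       /* retransmits while in multicast mode */
};

/* Re-evaluate the mode after feedback or a retransmit; returns the new mode. */
enum tx_mode xm_update_mode(struct xm_state *s)
{
    if (s->txmode == TX_UNICAST) {
        /* Fall forward: enough unicast data sent and the receiver reports
         * that multicast packets are arriving. */
        if (s->ubytessent >= XM_FALL_FORWARD_BYTES &&
            s->msegsntper >= XM_FALL_FORWARD_PCT) {
            s->txmode = TX_MULTICAST;
            s->nrtxm = 0;
        }
    } else if (s->nrtxm >= XM_FALL_BACK_NRTX) {
        /* Fall back: multicast keeps failing for this receiver. */
        s->txmode = TX_UNICAST;
        s->ubytessent = 0; /* assumption: restart the unicast run */
    }
    return s->txmode;
}
```

Starting every receiver in unicast mode means a transfer still succeeds when multicast never works. The only cost is that the first stretch of data is sent once per receiver.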
Multicast feedback Option
  Option layout:
    kind=51 (1 byte) | len=3 (1 byte) | % Multicast Packets Received (1 byte)
  New feedback option sent in ACKs from receiver
  Only used between TCP-XM hosts
  Indicates % of last n packets received via multicast
  Used by sender to fall forward to multicast transmission
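The three-byte feedback option can be sketched the same way. The helper names below are hypothetical, and clamping or rejecting values above 100 is an assumption of this sketch rather than documented behaviour.

```c
#include <stddef.h>
#include <stdint.h>

#define TCPXM_OPT_FEEDBACK_KIND 51 /* option kind from the slide */
#define TCPXM_OPT_FEEDBACK_LEN   3 /* kind + len + 1-byte percentage */

/* Hypothetical helper: the receiver writes the feedback option for an ACK.
 * Returns the number of bytes written, or 0 if buf is too small. */
size_t tcpxm_write_feedback_opt(uint8_t *buf, size_t buflen, uint8_t percent)
{
    if (buflen < TCPXM_OPT_FEEDBACK_LEN) {
        return 0;
    }
    if (percent > 100) {
        percent = 100; /* assumption: clamp out-of-range values */
    }
    buf[0] = TCPXM_OPT_FEEDBACK_KIND;
    buf[1] = TCPXM_OPT_FEEDBACK_LEN;
    buf[2] = percent;
    return TCPXM_OPT_FEEDBACK_LEN;
}

/* Hypothetical helper: the sender reads the option back from an ACK.
 * Returns the percentage (0..100), or -1 if opt is not a well-formed
 * feedback option. */
int tcpxm_read_feedback_opt(const uint8_t *opt, size_t optlen)
{
    if (optlen < TCPXM_OPT_FEEDBACK_LEN ||
        opt[0] != TCPXM_OPT_FEEDBACK_KIND ||
        opt[1] != TCPXM_OPT_FEEDBACK_LEN ||
        opt[2] > 100) {
        return -1;
    }
    return opt[2];
}
```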
TCP-XM reception
  Receiver
    No API-level changes
    Normal TCP listen
    Auto-IGMP join on TCP-XM connect
    Accepts data on both unicast/multicast ports
    tcp_input() accepts:
      packets addressed to existing unicast destination…
      …but now also those addressed to multicast group
    Tracks how last n segs received (u/m)
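Tracking how the last n segments arrived can be sketched with a ring buffer of flags and a running count. The window of 128 matches the msegrcv[128] array listed on the PCB slide. The struct layout and function names are hypothetical, and the real code may compute the percentage differently.

```c
#include <stdint.h>

#define XM_NSEGS 128 /* window size; matches msegrcv[128] on the PCB slide */

/* Hypothetical tracker for how the last XM_NSEGS segments arrived. */
struct xm_rcv_track {
    uint8_t  ismcast[XM_NSEGS]; /* 1 if the segment arrived via multicast */
    uint16_t next;              /* slot to overwrite next */
    uint16_t count;             /* valid entries, at most XM_NSEGS */
    uint16_t mcount;            /* multicast entries currently in the window */
};

/* Record one received segment, evicting the oldest once the window is full. */
void xm_track_segment(struct xm_rcv_track *t, int via_mcast)
{
    if (t->count == XM_NSEGS) {
        t->mcount -= t->ismcast[t->next]; /* drop the oldest entry's share */
    } else {
        t->count++;
    }
    t->ismcast[t->next] = via_mcast ? 1 : 0;
    t->mcount += t->ismcast[t->next];
    t->next = (uint16_t)((t->next + 1) % XM_NSEGS);
}

/* Percentage of tracked segments received via multicast (0 when empty).
 * This is the value a receiver would place in the feedback option. */
uint8_t xm_mcast_percent(const struct xm_rcv_track *t)
{
    if (t->count == 0) {
        return 0;
    }
    return (uint8_t)((100u * t->mcount) / t->count);
}
```

Keeping a running count makes the percentage O(1) per ACK instead of rescanning the window each time.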
API changes
  Only relevant if natively implemented!
  Sender API changes
    New connection type
    Connect to port on array of destinations
    Single write sends data to all hosts
  TCP-XM in use:
    conn = netconn_new(NETCONN_TCPXM);
    netconn_connectxm(conn, remotedest, numdests, group, port);
    netconn_write(conn, data, len, …);
PCB changes
  Every TCP connection has an associated Protocol Control Block (PCB)
  TCP-XM adds:
    struct tcp_pcb {
      …
      struct tcp_pcb *firstpcb;    /* first of the mpcbs */
      struct tcp_pcb *nextm;       /* next tcpxm pcb */
      enum tx_mode txmode;         /* unicasting or multicasting */
      u8_t nrtxm;                  /* number of retransmits for multicast */
      u32_t nrtxmtime;             /* time since last retransmit */
      u32_t mbytessent;            /* total bytes sent via multicast */
      u32_t mbytesrcvd;            /* total bytes received via multicast */
      u32_t ubytessent;            /* total bytes sent via unicast */
      u32_t ubytesrcvd;            /* total bytes received via unicast */
      struct segrcv msegrcv[128];  /* ismcast boolean for last n segs */
      u8_t msegrcvper;             /* % of last segs received via mcast */
      u8_t msegrcvcnt;             /* counter for segs recvd via mcast */
      u8_t msegsntper;             /* % of last segs delivered via mcast */
    };
Linking PCBs
  [Diagram: PCB list M1, U1, M2, M3, U2, each PCB with next and nextm pointers]
  *next points to the next TCP session
  *nextm points to the next TCP session that’s part of a particular TCP-XM connection
  Minimal timer and state machine changes
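The two link fields can be illustrated by walking one TCP-XM connection while skipping unrelated sessions. This sketch uses a trimmed-down struct with only the fields it needs. It assumes every member PCB has firstpcb set and that the nextm chain ends in NULL, neither of which the slides state explicitly. The function name is hypothetical.

```c
#include <stddef.h>

enum tx_mode { TX_UNICAST, TX_MULTICAST };

/* Trimmed-down PCB for illustration; the real struct has many more fields. */
struct tcp_pcb {
    struct tcp_pcb *next;     /* next TCP session of any kind */
    struct tcp_pcb *firstpcb; /* first PCB of this TCP-XM connection */
    struct tcp_pcb *nextm;    /* next PCB in the same TCP-XM connection */
    enum tx_mode txmode;      /* unicasting or multicasting */
};

/* Hypothetical helper: count the receivers in the TCP-XM connection that pcb
 * belongs to, and how many of them are currently in multicast mode. */
size_t xm_count_members(const struct tcp_pcb *pcb, size_t *nmcast)
{
    size_t n = 0;
    *nmcast = 0;
    /* Start from firstpcb so the walk works from any member of the set. */
    for (const struct tcp_pcb *p = pcb->firstpcb; p != NULL; p = p->nextm) {
        n++;
        if (p->txmode == TX_MULTICAST) {
            (*nmcast)++;
        }
    }
    return n;
}
```

Following nextm visits M1, M2 and M3 in the diagram and never touches U1 or U2, which stay reachable only through next.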
What happens to the cwin?
  Multiple receivers
    Multiple PCBs
    Multiple congestion windows
  Default to min(cwin)
    i.e. send at rate of slowest receiver
  Is this really so bad?
    Compare to time taken for n unicast transfers
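The min(cwin) rule and the comparison with n unicast transfers can be sketched as follows. The struct is a hypothetical, trimmed-down PCB. The timing helpers use a deliberately simple model that is an assumption of this sketch: each receiver has a fixed achievable rate, and the n unicast transfers run one after another.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down PCB: only the fields this sketch needs. */
struct xm_pcb {
    struct xm_pcb *nextm; /* next PCB in the same TCP-XM connection */
    uint32_t cwnd;        /* this receiver's congestion window, in bytes */
};

/* Smallest congestion window across the connection; 0 for an empty list. */
uint32_t xm_min_cwnd(const struct xm_pcb *first)
{
    if (first == NULL) {
        return 0;
    }
    uint32_t min = first->cwnd;
    for (const struct xm_pcb *p = first->nextm; p != NULL; p = p->nextm) {
        if (p->cwnd < min) {
            min = p->cwnd;
        }
    }
    return min;
}

/* One multicast transfer paced by the slowest receiver.
 * Requires n >= 1 and every rate > 0. */
double xm_time_multicast(const double *rate, size_t n, double size)
{
    double slowest = rate[0];
    for (size_t i = 1; i < n; i++) {
        if (rate[i] < slowest) {
            slowest = rate[i];
        }
    }
    return size / slowest;
}

/* n unicast transfers run back to back (assumed model).
 * Requires every rate > 0. */
double xm_time_unicast_seq(const double *rate, size_t n, double size)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        total += size / rate[i];
    }
    return total;
}
```

Under this model, sending 100 MB to receivers that can take 10, 5 and 2 MB/s takes 50 s at the slowest receiver's rate, against 80 s for three sequential unicast transfers. The sequential total always includes the slowest transfer, so pacing to the minimum is never worse in this model.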
LAN speed
LAN efficiency
WAN speed
WAN efficiency
Multiple TCP-XM flows
TCP-XM vs TCP flows
Protocol performance
Protocol efficiency
Grid multicast?
  How can multicast be used in Grid environment?
  TCP-XM is new multicast-capable protocol
  Globus is de-facto Grid middleware
  Would like TCP-XM support in Globus…
Globus XIO
  eXtensible Input Output library
    Allows “i/o plugins” to Globus API
  Single POSIX-like API / set of semantics
    Simple open/close/read/write API
  Driver abstraction
    Hides protocol details / Allows for extensibility
    Stack of 1 transport & n transform drivers
    Drivers can be selected at runtime
XIO architecture
XIO implementation
XIO/XM driver specifics
  Two important XIO data structures
  1. Handle
    Returned to user when XIO framework ready
    Used for all open/close/read/write calls
    lwIP netconn connection structure used
  2. Attribute
    Used to set XIO driver specific parameters…
    …and TCP-XM protocol-specific options
    List of destination addresses
XIO code example
  // init stack
  globus_xio_stack_init(&stack, NULL);
  // load drivers onto stack
  globus_xio_driver_load("tcpxm", &txdriver);
  globus_xio_stack_push_driver(stack, txdriver);
  // init attributes
  globus_xio_attr_init(&attr);
  globus_xio_attr_cntl(attr, txdriver, GLOBUS_XIO_TCPXM_SET_REMOTE_HOSTS, hosts, numhosts);
  // create handle
  globus_xio_handle_create(&handle, stack);
  // send data
  globus_xio_open(&handle, NULL, target);
  globus_xio_write(handle, "hello\n", 6, 1, &nbytes, NULL);
  globus_xio_close(handle, NULL);
One-to-many issues
  Stack assumes one-to-one connections
    XIO user interface requires modification
    Needs support for one-to-many protocols
    Minimal user API changes
    Framework changes more significant
  GSI is one-to-one
    Authentication with peer on connection setup
    But cannot authenticate with n peers
    Need some form of “GSI-M”
Driver availability
  Multicast transport driver for Globus XIO
  Requires Globus 3.2 or later
  Source code online
    Sample client
    Sample server
    Driver installation instructions
  http://www.cl.cam.ac.uk/~kj234/xio/
mcp & mcpd
  Multicast file transfer application using TCP-XM
  ‘mcpd &’ on servers
  ‘mcp file host1 host2… hostN’ on client
  http://www.cl.cam.ac.uk/~kj234/mcp/
  Full source code online
  FreeBSD, Linux, Solaris
All done!
Thanks for listening!
Questions?