Transcript Slide 1

FABRIC WP1.2 Broadband Data Path: Protocols and Processor Interface

Bonn, 20/09/07

Ralph Spencer, The University of Manchester

Contents:

• Outline
• WP1.2.1 Broadband Protocols
• WP1.2.2 Broadband data processing interface

20 September 2007 Fabric: WP1.2 Broadband data path Slide #2

Outline


WP1.2.1 Protocols

• Investigation of suitable protocols for real time e-VLBI in EVN context
• 1 FTE funded from EXPReS, RA: Stephen Kershaw
• Contributed work over last year funded by ESLEA project
• Strategic document May 2006
• Protocols performance report (interim) June 2007

WP1.2.2 Broadband Data Processor interface

• Interface to e-MERLIN correlator
• 4 Gbps input (from Onsala)
• 4 x 1 Gbps output to JIVE (SA1, EXPReS)
• 2 FTE (EXPReS + FABRIC), Jonathan Hargreaves (since Dec 2006)
• Using iBOBs: Xilinx Virtex-II FPGAs
• e-MERLIN station boards: Xilinx Virtex-4s

20 September 2007 Fabric: WP1.2 Broadband data path Slide #3

WP1.2.1 Protocols: What’s in the report?

• TCP_delay – constant bit rate data transfer over TCP
  • Reaction to lost packets – data delayed
  • Can catch up, but needs large data buffers and adequate link bandwidth
  • Impractical; an alternative protocol is needed
• VLBI_UDP
  • UDP-based transfer system using ring buffers
  • Allows selective packet dropping
  • Implementation on PCs works; tests with the correlator
  • Implemented on Mk5As – code diverged (JIVE/JBO versions)
  • Both work at 512 Mbps
  • 1 Gbps tests…..

• DCCP
  • Datagram Congestion Control Protocol
  • In the Linux kernel; uses a selectable congestion control algorithm (CCID)
  • Needs a suitable CCID for e-VLBI
  • Further work needed if it is to be used in e-VLBI

20 September 2007 Fabric: WP1.2 Broadband data path Slide #4
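The VLBI_UDP idea above rests on constant-bit-rate data carried over UDP with explicit sequence numbers, so the receiver knows exactly which packets were dropped and can substitute fill data rather than stall (selective packet dropping). The following is a minimal sketch of that pattern, not the VLBI_UDP code itself; the port number, payload size and ring-buffer depth are arbitrary example values.

```python
# Minimal sketch of CBR data over UDP with sequence numbers and fill-in for
# lost packets (illustrative only -- not the VLBI_UDP implementation).
import socket
import struct
import time

PORT = 50000                 # example port
PAYLOAD = 8192               # bytes of data per packet (example value)
FILL = b"\x11" * PAYLOAD     # pattern left in place of a lost packet
RING_SLOTS = 1024            # ring buffer depth in packets (example value)

def send_cbr(dest_ip, rate_bps, n_packets):
    """Send n_packets of PAYLOAD bytes at roughly rate_bps using spaced sends."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    spacing = PAYLOAD * 8.0 / rate_bps           # seconds between packets
    data = b"\x00" * PAYLOAD
    next_send = time.perf_counter()
    for seq in range(n_packets):
        sock.sendto(struct.pack("!Q", seq) + data, (dest_ip, PORT))
        next_send += spacing
        while time.perf_counter() < next_send:   # crude software pacing
            pass

def receive_into_ring(n_packets):
    """Receive packets into a ring buffer; slots of lost packets keep FILL."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", PORT))
    sock.settimeout(2.0)
    ring = [FILL] * RING_SLOTS
    lost = 0
    expected = 0
    for _ in range(n_packets):
        try:
            pkt, _addr = sock.recvfrom(65536)
        except socket.timeout:
            break                                # sender finished or path broken
        seq = struct.unpack("!Q", pkt[:8])[0]
        lost += max(0, seq - expected)           # gap in sequence numbers => drops
        expected = max(expected, seq + 1)
        ring[seq % RING_SLOTS] = pkt[8:]         # real data overwrites the fill
    return ring, lost
```

In the real system the ring buffer is drained at a constant rate towards the correlator or Mk5 unit; here it is just a Python list, which is enough to show why a dropped packet costs one slot of fill data rather than a stalled stream.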

WP1.2.1 Protocols: What’s now/next?

• Work on TCP-delay completed; VLBI_UDP ideas incorporated into Haro's/Arpad's code – 512 Mbps successful on Mk5As – Stephen
• Bottleneck on VLBI_UDP identified: selective packet dropping implemented (can run 1024 Mbps VLBI over 1 GE) – Simon
• Work on multi-destination protocols initiated – Stephen
• VSI-E implemented, trans-Atlantic tests underway – Tony
• 10 Gbps tests undertaken on the GÉANT2 research network; tests to Onsala being planned – Rich

20 September 2007 Fabric: WP1.2 Broadband data path Slide #5

4 Gbit flows over GÉANT2

• Set up a 4 Gigabit Lightpath between GÉANT2 PoPs
  • Collaboration with DANTE
  • GÉANT2 Testbed: London – Prague – London
  • and London – Amsterdam – Frankfurt – Prague – Paris – London
  • PCs in the DANTE London PoP with 10 Gigabit NICs
• VLBI tests:
  • UDP performance: throughput, jitter, packet loss, 1-way delay, stability
  • Continuous (days) data flows – VLBI_UDP and udpmon
  • Multi-Gigabit TCP performance with current kernels
  • Multi-Gigabit CBR over TCP/IP
  • Experience for FPGA Ethernet packet systems
• DANTE interests:
  • Multi-Gigabit TCP performance
  • The effect of (Alcatel 1678 MCC 10GE port) buffer size on bursty TCP using BW-limited Lightpaths
  • 10 Gigabit London – New York Alcatel-Ciena interoperability

20 September 2007 Fabric: WP1.2 Broadband data path Slide #6
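The UDP performance tests listed above follow the same basic recipe as udpmon: send fixed-size datagrams with a controlled inter-packet spacing and compare the offered rate (packet bits divided by spacing) with what arrives. The sketch below is only an illustration of that recipe, not the udpmon tool; the pacing loop, packet size and destination are example choices.

```python
# Sketch of a udpmon-style spaced-packet sender (example only, not udpmon):
# the offered rate is fixed by the packet size and the gap between sends.
import socket
import time

def spaced_udp_send(dest_ip, port, pkt_bytes, spacing_us, n_packets):
    """Send n_packets of pkt_bytes each, one every spacing_us microseconds,
    and return the rate actually offered in Gbit/s."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    payload = b"\x55" * pkt_bytes
    spacing = spacing_us * 1e-6
    start = time.perf_counter()
    next_send = start
    for _ in range(n_packets):
        sock.sendto(payload, (dest_ip, port))
        next_send += spacing
        while time.perf_counter() < next_send:   # software pacing; real tests use
            pass                                 # carefully tuned hosts or hardware
    elapsed = time.perf_counter() - start
    return n_packets * pkt_bytes * 8 / elapsed / 1e9

# For example, 8972-byte datagrams sent every ~17 us offer about
# 8972 * 8 / 17e-6 = 4.2 Gbit/s, the region probed in the plots that follow.
```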

The GÉANT2 Testbed


• 10 Gigabit SDH backbone
• Alcatel 1678 MCCs
• GE and 10GE client interfaces


• Node locations: London, Amsterdam, Paris, Prague, Frankfurt
• Lightpath routing allows paths of different RTT to be made
• The PCs are located in London

20 September 2007 Fabric: WP1.2 Broadband data path Slide #7


4 Gbps on GÉANT: UDP Throughput

• Kernel 2.6.20-web100_pktd-plus
• Myricom 10G-PCIE-8A-R (fibre)
• rx-usecs=25, coalescence ON
• MTU 9000 bytes
• Max throughput 4.199 Gbit/s
• Sending host: 3 CPUs idle; for packet spacings <8 µs, 1 CPU is >90% in kernel mode, inc. ~10% soft int
• Receiving host: 3 CPUs idle; for packet spacings <8 µs, 1 CPU is ~37% in kernel mode, inc. ~9% soft int

[Plots (exp2-1_prag_15May07): UDP throughput (Mbit/s) and sending/receiving CPU load (%) vs spacing between frames (µs), for packet sizes from 1000 to 8972 bytes.]

20 September 2007 Fabric: WP1.2 Broadband data path Slide #11

4 Gig Flows on GÉANT: UDP Flow Stability


• Kernel 2.6.20-web100_pktd-plus
• Myricom 10G-PCIE-8A-R (fibre)
• Coalescence OFF
• MTU 9000 bytes
• Packet spacing 18 µs
• Trials send 10 M packets
• Ran for 26 hours
• Throughput very stable: 3.9795 Gbit/s

[Plot (exp2-1_w18_i500_udpmon_21May): throughput (Mbit/s) vs time during the transfer (s), staying within ~1 Mbit/s of 3980 Mbit/s over the run.]

• Occasional trials have packet loss ~40 in 10M – investigating

• Our thanks go to all our collaborators
• DANTE really provided "Bandwidth on Demand" – a record 6 hours! including:
  • Driving to the PoP
  • Installing the PCs
  • Provisioning the Light-path

20 September 2007 Fabric: WP1.2 Broadband data path Slide #14

Alcatel Buffer size: Method

• Classic bottleneck: 10 Gbit/s input, 4 Gbit/s output
• Use udpmon to send a stream of spaced UDP packets
• Measure the packet number of the first lost frame, N_1lost, as a function of the packet spacing w
• Q_len = N_1lost (P - w·R_out), so 1/N_1lost = P/Q_len - w·(R_out/Q_len)   (P = packet size, R_out = output rate)
• Slope of 1/N_1lost vs w gives buffer size ~57 kBytes

20 September 2007 Fabric: WP1.2 Broadband data path Slide #15
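Since 1/N_1lost is linear in w with slope -R_out/Q_len and intercept P/Q_len, the buffer size follows from a straight-line fit to the measured first-loss packet numbers. A minimal sketch of that fit is below (assuming NumPy; the measurement arrays are to be filled in from the udpmon runs, nothing here is measured data):

```python
# Estimate the bottleneck buffer size Q_len from first-loss measurements.
# Model: 1/N_1lost = P/Q_len - w * (R_out/Q_len), so a linear fit of 1/N_1lost
# against packet spacing w gives Q_len from either the slope or the intercept.
import numpy as np

def buffer_size_from_first_loss(w_seconds, n_first_loss, packet_bits, r_out_bps):
    """w_seconds: packet spacings; n_first_loss: packet number of first lost
    frame at each spacing; packet_bits: P; r_out_bps: bottleneck output rate."""
    w = np.asarray(w_seconds, dtype=float)
    inv_n = 1.0 / np.asarray(n_first_loss, dtype=float)
    slope, intercept = np.polyfit(w, inv_n, 1)     # inv_n ~ intercept + slope * w
    q_from_slope = -r_out_bps / slope              # Q_len = -R_out / slope     (bits)
    q_from_intercept = packet_bits / intercept     # Q_len = P / intercept      (bits)
    return q_from_slope / 8, q_from_intercept / 8  # both in bytes

# Usage, once measured (w, N_1lost) pairs are available from the udpmon runs:
#   q1, q2 = buffer_size_from_first_loss(w_list, n_list,
#                                        packet_bits=P_bits, r_out_bps=4e9)
```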

WP1.2.2 Processor Interface

• UC Berkeley iBOB design (Dan Werthimer)
• 10 tested iBOBs delivered to JBO in June 2007
• Firmware being developed – Jonathan

• Priority: 10 GE data transfer through CX4 connector
• iBOB connects via VSI-H to EVLA/e-MERLIN station board
• Prototype station board tested at Penticton – new version will be produced
• Delivery of SBs to JBO expected after end of year
• Fringe tests will need correlator cards – some time in 2008?

20 September 2007 Fabric: WP1.2 Broadband data path Slide #16

Connection to e-MERLIN

[Block diagram: at JBO, four e-MERLIN correlator Station Boards connect through VSI-to-ZDOK adapters to four iBOBs; each iBOB sends 1 Gbps over CX4 to a switch, and on via a switch at JIVE to VLBI Mk5B receivers. In the return direction, an ADC and iBOB at Onsala send 4 Gbps over CX4 (or fibre if > 15 m) through switches to an iBOB and a fifth Station Board of the e-MERLIN correlator at JBO.]

20 September 2007 Fabric: WP1.2 Broadband data path Slide #17

iBOB under test

20 September 2007 Fabric: WP1.2 Broadband data path Slide #18

iBOB Test Configuration

• iBOB configured as a network testing device
• CX4, 10 Gbps, up to 15 m, to a Network PC or a Switch; optional second CX4
• JTAG, RS232 and 10/100 Ethernet connections
• Local PC: download FPGA firmware over JTAG; local monitoring over RS232
• Remote PC: remote login to the Network PC to run tests from JBO, Manchester or elsewhere; removed when firmware is stable

20 September 2007 Fabric: WP1.2 Broadband data path Slide #19

iBOB test set up

20 September 2007 Fabric: WP1.2 Broadband data path Slide #20

Simulink Design for Generating Bursts of UDP Packets

20 September 2007 Fabric: WP1.2 Broadband data path Slide #21

UDP Throughput vs. Packet Spacing

PC:
  • Kernel 2.6.20-web100_pktd-plus
  • Myricom 10G-PCIE-8A-R CX4 NIC
  • rx-usecs=25, coalescence ON
  • MTU 9000 bytes
  • UDP packets
  • Max throughput 9.4 Gbit/s

iBOB:
  • Packet 8234 bytes: 8192 data + 42 header
  • 100 MHz clock
  • Max rate 6.6 Gbit/s
  • See plot: 6.44 Gbit/s achieved

20 September 2007 Fabric: WP1.2 Broadband data path Slide #22
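As a rough consistency check on the iBOB numbers above, and assuming (this is not stated on the slide) that the FPGA moves 8 bytes per 100 MHz clock cycle, the datapath limit works out at about 6.4 Gbit/s, in line with the 6.44 Gbit/s seen and just under the quoted 6.6 Gbit/s maximum:

```python
# Hypothetical back-of-envelope check: assumes a 64-bit datapath at the stated
# 100 MHz clock (the bus width is an assumption, not given on the slide).
bytes_per_cycle = 8                     # assumed datapath width
clock_hz = 100e6                        # from the slide
limit_gbps = bytes_per_cycle * clock_hz * 8 / 1e9
print(f"Assumed datapath limit ~ {limit_gbps:.1f} Gbit/s")   # ~6.4 Gbit/s
```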

Current status

• Using Network PC to test 10 Gbps capability of iBOB
• Can ARP, PING and send and receive UDP packets using software running on the iBOB's PowerPC.

• 10 Gbps packets sent using FPGA hardware
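A simple receive-side counter of the kind the Network PC can run during these tests is sketched below; it is hypothetical (example port, no interpretation of the iBOB packet contents), and a Python process will itself drop packets well below 10 Gbps, so the real measurements rely on udpmon and kernel counters:

```python
# Sketch of a receive-rate counter for the Network PC (example port; the iBOB
# packet contents are not interpreted). Counts packets and bytes in a window.
import socket
import time

def measure_receive_rate(port=50001, seconds=10.0):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", port))
    sock.settimeout(1.0)
    pkts = 0
    nbytes = 0
    end = time.perf_counter() + seconds
    while time.perf_counter() < end:
        try:
            data, _addr = sock.recvfrom(65536)
        except socket.timeout:
            continue
        pkts += 1
        nbytes += len(data)
    print(f"{pkts} packets, {nbytes * 8 / seconds / 1e9:.2f} Gbit/s received")
```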

Next few weeks:

• UDP network tests
• Develop VSI-E control protocols using Linux

Next 6 months

• iBOB to iBOB transmission over a network using a modified RTP packet header. Algorithms to buffer and re-order late packets in the receiver need to be developed and tested (a reorder-buffer sketch follows this list).

• Develop algorithms on a Xilinx development board to:
  • remove the e-MERLIN delay model,
  • remove the n × 10 kHz offset,
  • filter a 128 MHz band into VLBI-compatible sub-bands.

• Implement on the Virtex 4 SX35 chips on the station board.
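For the packet re-ordering item at the top of this list, a minimal sketch of a receive-side reorder buffer is shown below. It is illustrative only: the eventual implementation is planned for FPGA firmware with a modified RTP header, whereas here the sequence number is passed in directly and the window size and fill pattern are arbitrary.

```python
# Sketch of a reorder buffer for sequence-numbered packets arriving out of
# order. Window size and fill pattern are illustrative; the real design is
# intended for FPGA firmware using a modified RTP header.
FILL = b"\x00" * 8192      # substituted for packets that never arrive
WINDOW = 64                # how far ahead of the release point packets may queue

class ReorderBuffer:
    def __init__(self):
        self.next_seq = 0              # next sequence number to release
        self.pending = {}              # seq -> payload, for packets ahead of next_seq

    def push(self, seq, payload):
        """Accept one packet; return the list of in-order payloads now releasable."""
        if seq < self.next_seq:
            return []                  # too late: its slot has already gone out
        self.pending[seq] = payload
        released = []
        # Release contiguous packets; if the buffer grows beyond WINDOW while
        # waiting for a missing packet, give up on it and emit fill data instead.
        while self.next_seq in self.pending or len(self.pending) > WINDOW:
            released.append(self.pending.pop(self.next_seq, FILL))
            self.next_seq += 1
        return released
```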

20 September 2007 Fabric: WP1.2 Broadband data path Slide #23

Questions?

[Photo: Monty, Midnight and Maroon, Nov 2006]

Contact information: [email protected]

EXPReS is made possible through the support of the European Commission (DG-INFSO), Sixth Framework Programme, Contract #026642

20 September 2007 Fabric: WP1.2 Broadband data path Slide #24