Transcript Slide 1
DataTAG project Status & Perspectives
Olivier MARTIN, CERN
GNEW'2004 workshop, 15 March 2004, CERN, Geneva

Slide 2: Presentation outline
Project overview; testbed characteristics and evolution; major networking achievements; where are we?; Lambda Grids; networking testbed requirements; acknowledgements; conclusions.

Slide 3: DataTAG mission (TransAtlantic Grid)
EU-US Grid network research: high-performance transport protocols, inter-domain QoS, advance bandwidth reservation.
EU-US Grid interoperability.
Sister project to the EU DataGrid.

Slide 4: Project partners
http://www.datatag.org

Slide 5: Funding agencies and cooperating networks

Slide 6: EU collaborators
Brunel University, CERN, CLRC, CNAF, NIKHEF, PPARC, UvA, University of Manchester, University of Padova, University of Milano, University of Torino, UCL, DANTE, INFN, INRIA.

Slide 7: US collaborators
ANL, Northwestern University, Caltech, UIC, Fermilab, University of Chicago, FSU, University of Michigan, Globus, Indiana, SLAC, Wisconsin, StarLight.

Slide 8: Workplan
WP1: Establishment of a high-performance intercontinental Grid testbed (CERN).
WP2: High-performance networking (PPARC).
WP3: Bulk data transfer validations and application performance monitoring (UvA).
WP4: Interoperability between Grid domains (INFN).
WP5 & WP6: Dissemination and project management (CERN).

Slide 9: DataTAG/WP4 framework and relationships
HEP applications and other experiments; integration; HICB/HIJTB; interoperability standardization.

Slide 10: Testbed evolution
The DataTAG testbed evolved from a simple 2.5 Gb/s Layer 3 testbed (Sept. 2002) into an extremely rich multi-vendor 10 Gb/s Layer 2/Layer 3 testbed (Sept. 2003): Alcatel, Chiaro, Cisco, Juniper, PRocket.
Exclusive access to the testbed is granted through an advance testbed reservation application.
Direct extensions to Amsterdam UvA/SURFnet (10G) and Lyon INRIA/VTHD (2.5G).
Layer 2 extension to INFN/CNAF over GEANT and GARR using Juniper's CCC.
Layer 2 extension to the OptiPuter project at UCSD (University of California San Diego) through Abilene and CENIC under way.
First L2/L3 transatlantic testbed with native 10 Gigabit Ethernet access.
Slides 11-13: DataTAG testbed diagrams and equipment
[Phase 1 (2.5 Gbps) diagram: Geneva and Chicago PoPs with Linux PCs; Alcatel 7770 and 1670, Juniper M10, Cisco 7609 and Extreme S5i equipment (r06gva, r06chi, r05gva, r05chi, r04chi, s01chi); ONS 15454 multiplexers; 2.5 Gb/s STM-16 transatlantic circuit (T-Systems) plus STM-16 backup/projects circuit (Colt), STM-16 (FranceTelecom) and STM-64 (GC) links, and 1 G Ethernet; extensions to VTHD/INRIA, SURFnet, CESNET, CNAF and GEANT.]
[Phase 2 (10 Gbps, simplified) diagram: 10 Gb/s optical wave (T-Systems) between Geneva and StarLight/Chicago; Juniper T320, Juniper M10, Cisco 7606/7609/6509, Force10 and Alcatel 7770 equipment; 10 G Ethernet, 2.5 G STM-16 and 10 G STM-64 links towards GEANT, Abilene and VTHD/INRIA; Linux PC clusters at both ends.]
Testbed equipment: Alcatel, Chiaro, Cisco, Juniper, PRocket.

Slide 14: Main networking achievements (1)
Internet land speed records have been beaten one after the other by the DataTAG project partners and/or teams closely associated with DataTAG:
ATLAS Canada lightpath experiments during iGRID2002 (Gigabit Ethernet) and Telecom World 2003 (10 Gigabit Ethernet, aka WAN PHY).
New Internet2 land speed record (I2 LSR) by the NIKHEF/Caltech team (SC2002).
FAST, GridDT, HS-TCP and Scalable TCP experiments (DataTAG partners & Caltech).
Intel 10 GigE tests between CERN (Geneva) and SLAC (Sunnyvale) (CERN, Caltech, Los Alamos National Laboratory, SLAC): 2.38 Gb/s sustained rate, single flow, 1 TB in one hour; I2 LSR awarded during the Internet2 Spring member meeting (April 2003).

Slide 15: ATLAS Canada lightpath trials
TRIUMF Vancouver & CERN Geneva through Amsterdam NetherLight.
"A full Terabyte of real data was transferred at rates equivalent to a full CD (680 MB) in under 8 seconds and a DVD in under 1 minute" (Wade Hong et al., 09/2002).
Subsequent 10 GigE WAN PHY experiments during Telecom World 2003 brought effective data transfer rates below one second per CD.

Slide 16: 10 GigE data transfer trial
On Feb. 27-28, 2003, a terabyte of data was transferred in 3700 seconds by S. Ravot of Caltech between the Level3 PoP in Sunnyvale, near SLAC, and CERN, through the TeraGrid router at StarLight, from memory to memory, with a single TCP/IPv4 stream. This translates to an average rate of 2.38 Gb/s (using large windows and 9 kB "jumbo frames"). It beat the former record by a factor of ~2.5 and used the 2.5 Gb/s link at 99% efficiency (a back-of-the-envelope check of these figures follows slide 17 below).
Huge distributed effort: 10-15 highly skilled people monopolized for several weeks!

Slide 17: 10G DataTAG testbed extension to Telecom World 2003 and Abilene/CENIC
On September 15, 2003, the DataTAG project became the first transatlantic testbed offering direct 10 GigE access, using Juniper's Layer 2 VPN/10 GigE emulation.
Sponsors: Cisco, HP, Intel, OPI (Geneva's Office for the Promotion of Industries & Technologies), Services Industriels de Geneve, Telehouse Europe, T-Systems.
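Back-of-the-envelope check of the slide 16 figures. This minimal Python sketch recomputes the average rate from the quoted terabyte-in-3700-seconds transfer and estimates the TCP window (bandwidth-delay product) a single stream needs to fill such a pipe; the ~180 ms Geneva-Sunnyvale round-trip time is an assumed illustrative value, not a figure taken from the slides.

```python
# Back-of-the-envelope check of the 10 GigE data transfer trial figures
# (slide 16).  The RTT below is an ASSUMED value, not from the slides.

bytes_moved = 2 ** 40      # "a terabyte", read here as 1 TiB
duration_s = 3700          # transfer duration quoted on the slide
rtt_s = 0.180              # ASSUMED Geneva <-> Sunnyvale round-trip time

avg_gbps = bytes_moved * 8 / duration_s / 1e9
print(f"average rate: {avg_gbps:.2f} Gb/s")          # ~2.38 Gb/s, as quoted

# A single TCP stream only fills a long fat pipe if its window covers the
# bandwidth-delay product, hence the "large windows" and autotuning work.
bdp_mbytes = avg_gbps * 1e9 / 8 * rtt_s / 1e6
print(f"window needed at {rtt_s * 1000:.0f} ms RTT: ~{bdp_mbytes:.0f} MB")
```

The tens-of-megabytes window estimate is why the slide stresses large windows; standard TCP also recovers very slowly after a single loss at this window size, which motivates the alternative stacks listed on slide 20.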
Slide 18: Main networking achievements (2)
The latest IPv4 & IPv6 I2 LSRs were awarded, live from the Internet2 fall member meeting in Indianapolis, to Caltech & CERN during Telecom World 2003:
May 6, 2003: 987 Mb/s single TCP/IPv6 stream.
October 1, 2003: 5.44 Gb/s single TCP/IPv4 stream between Geneva and Chicago, i.e. 1.1 TB in 26 minutes, or one 680 MB CD in 1 second.
More records have been established by Caltech & CERN since then:
November 6, 2003: 5.64 Gb/s single TCP/IPv4 stream between Geneva and Los Angeles (CENIC PoP) across DataTAG and Abilene.
November 11, 2003: 4 Gb/s single TCP/IPv6 stream between Geneva and Phoenix (Arizona) through Los Angeles.
February 24, 2004: 6.25 Gb/s with 9 streams for 638 seconds, i.e. half a terabyte transferred between CERN in Geneva and the CENIC PoP in Los Angeles across DataTAG and Abilene.

Slide 19: Internet2 land speed record history (IPv4 & IPv6)
[Charts: evolution of the I2 LSR in terabit-meters/second and in Gb/s, for IPv4 and IPv6, from March 2000 to November 2003 (a short sketch of this metric follows slide 22 below); impact of a single multi-Gb/s flow on the Abilene backbone.]

Slide 20: Significance of I2 LSRs to the Grid?
Essential to establish the feasibility of multi-gigabit/second single-stream IPv4 & IPv6 data transfers: over dedicated testbeds in a first phase, then across academic & research backbones, and last but not least across campus networks; disk to disk rather than memory to memory; study the impact of high-performance TCP on disk servers.
Next steps: above 6 Gb/s expected soon between CERN and Los Angeles (Caltech/CENIC PoP) across DataTAG & Abilene; the goal is to reach 10 Gb/s with new PCI Express buses.
Study alternatives to standard TCP (Reno): non-TCP transport (Tsunami, SABUL/VDT); HS-TCP, Scalable TCP, H-TCP, FAST, Grid-DT, Westwood+, etc.

Slide 21: Main networking achievements (3)
QoS: [Diagram: Geneva Juniper M10 with AF and BE traffic classes carried over Layer 2 VLANs; 1 GE bottleneck with IP QoS configured.]
Advance bandwidth reservation: GARA extensions, AAA extensions.

Slide 22: Where are we?
The DataTAG project came up at exactly the right time:
Back in late 2000, 2.5 Gb/s looked futuristic, and 10 GigE, especially host interfaces, did not really exist.
However, it was already very clear that the standard TCP stack (Reno/NewReno) was problematic; much hope was placed on autotuning (Web100/Net100) and ECN/RED-like solutions.
Actual bit error rates of transatlantic circuits were over-estimated.
Over-provisioned R&D backbones such as Abilene, CANARIE and GEANT are in much better shape than expected, but for how long? One of the strongest demonstrations made by DataTAG is the extreme vulnerability of production R&D backbones in the presence of high-performance flows (i.e. 10 GigE or even less).
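Referring back to slides 18-19: the I2 LSR metric plotted there is throughput multiplied by the terrestrial distance of the path, in terabit-meters/second. A minimal sketch for the 1 October 2003 record, assuming a round 7,000 km Geneva-Chicago path length (an illustrative figure, not the officially measured route distance):

```python
# Illustrative I2 LSR metric (throughput x distance) for the 1 Oct 2003 record.
# The path length is an ASSUMED round figure, not the official route distance.

throughput_gbps = 5.44     # single-stream IPv4 record, Geneva -> Chicago
path_km = 7000             # ASSUMED path length

tbit_meters_per_s = throughput_gbps / 1000 * path_km * 1000
print(f"~{tbit_meters_per_s:,.0f} terabit-meters/second")     # ~38,000

# Sanity check of the slide 18 volume figure: 5.44 Gb/s sustained for 26 min.
tb_moved = throughput_gbps / 8 * 26 * 60 / 1000
print(f"~{tb_moved:.1f} TB in 26 minutes")                     # ~1.1 TB
```

Normalizing by distance is what lets a transatlantic record be compared fairly against shorter, faster paths, which is why the chart reports terabit-meters/second alongside raw Gb/s.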
Slide 23: Where are we (cont)?
For many years the Wide Area Network has been the bottleneck; this is no longer the case in many countries, which makes the deployment of data-intensive Grid infrastructures possible in principle, e.g. EGEE, the DataGrid successor.
Recent I2 LSR records show, for the first time ever, that the network can be truly transparent and that throughput is only limited by the end hosts and/or campus network infrastructures.
The challenge has shifted from getting adequate bandwidth to deploying adequate LAN and cybersecurity infrastructure, as well as making effective use of it.
Non-trivial transport protocol issues still need to be resolved; the only encouraging sign is that this is now widely recognized, but we are still quite far from converging on a practical solution.

Slide 24: Layer 1/2/3 networking (1)
Conventional Layer 3 technology is no longer fashionable because of its high associated costs, e.g. 200-300 kUSD for a 10G router interface, and the implied use of shared backbones.
The use of Layer 1 or Layer 2 technology is very attractive because it helps to solve a number of problems, e.g. the 1500-byte Ethernet frame size (Layer 1), protocol transparency (Layers 1 & 2), and minimum functionality, hence, in theory, much lower costs (Layers 1 & 2).

Slide 25: Layer 1/2/3 networking (2)
"Lambda Grids" are becoming very popular.
Pros: a circuit-oriented model like the telephone network, hence no need for complex transport protocols; lower equipment costs (i.e. "in theory" a factor of 2 or 3 per layer); the concept of a dedicated end-to-end lightpath is very elegant.
Cons: "end to end" is still very loosely defined, i.e. site to site, cluster to cluster or really host to host; higher circuit costs; scalability; additional middleware to deal with circuit set-up and tear-down, etc. Extending dynamic VLAN functionality to the campus network is a potential nightmare!

Slide 26: "Lambda Grids": what does it mean?
Clearly different things to different people, hence the "apparently easy" consensus!
Conservatively, on-demand "site to site" connectivity. Where is the innovation? What does it solve in terms of transport protocols? Where are the savings? Fewer interfaces needed (customer), but more standby/idle circuits needed (provider).
Economics from the service provider vs the customer perspective: traditionally, switched services have been very expensive (usage vs flat charge), with the break-even point between switched and leased circuits at a few hours of use per day (a worked break-even sketch follows slide 27 below). Why would this change? If there are no savings, why bother?
More advanced: cluster to cluster, which implies even more active circuits in parallel. Even more advanced: host to host, all-optical. Is it realistic?

Slide 27: Networking testbed requirements
Multi-vendor: unless a particular research group is specifically interested in the behaviour of TCP in the presence of out-of-order packets, running high-performance TCP tests across a Juniper M160 backbone is pretty useless; achievable IPv6 performance varies widely between vendors, and MPLS & QoS implementations also vary widely; interoperability.
Dynamic: implies manpower & money.
Partitionable: reservation application.
Reconfigurable: avoid manual recabling, which implies an electronic or optical switch/patch panel.
Extensible: extensions to other networks, which implies collaboration.
Not limited to network equipment: must also include high-performance servers, high-performance disks & NICs, and coordination with other testbeds.
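The break-even comparison referred to on slide 26: the switched (on-demand) model only pays off if the circuit is used for fewer hours per day than the point where cumulative per-hour charges reach the flat leased-line charge. The tariff numbers below are purely hypothetical, chosen only to illustrate the "few hours/day" order of magnitude mentioned on the slide; no real carrier prices are implied.

```python
# Hypothetical switched vs leased break-even; all prices are made-up values.

leased_per_month = 18_000.0    # ASSUMED flat monthly charge for a leased circuit
switched_per_hour = 200.0      # ASSUMED per-hour charge for an on-demand circuit
days_per_month = 30

break_even_h_per_day = leased_per_month / (switched_per_hour * days_per_month)
print(f"break-even usage: {break_even_h_per_day:.1f} hours/day")   # 3.0

# Below this daily usage the on-demand ("Lambda Grid") model is cheaper;
# above it, a permanently leased circuit wins.
```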
Slide 28: Acknowledgements
The project would not have accumulated so many successes without the active participation of our North American colleagues, in particular Caltech/DoE, University of Illinois/NSF, iVDGL, StarLight, Internet2/Abilene and CANARIE, and of our European sponsors and colleagues as well, in particular the European Union's IST programme, DANTE/GEANT, GARR, SURFnet and VTHD.
The GNEW'2004 workshop is yet another example of successful collaboration between Europe and the USA.