International Networks and the US


Transcript: International Networks and the US

Global Lambdas and Grids for Particle Physics in the LHC Era

Harvey B. Newman

California Institute of Technology, SC2005, Seattle, November 14-18, 2005

Beyond the SM: Great Questions of Particle Physics and Cosmology

You Are Here.

1. Where does the pattern of particle families and masses come from?

2. Where are the Higgs particles; what is the mysterious Higgs field?

3. Why do neutrinos and quarks oscillate?

4. Is Nature Supersymmetric?

5. Why is any matter left in the universe?

6. Why is gravity so weak?

7. Are there extra space-time dimensions?

Composition of the universe: dark energy 72%, dark matter 23%, free H and He 4%, stars 0.5%, neutrinos 0.3%, other elements 0.03%

We do not know what makes up 95% of the universe.

Large Hadron Collider CERN, Geneva: 2007 Start

 

pp collisions at √s = 14 TeV, L = 10^34 cm^-2 s^-1; 27 km tunnel in Switzerland & France

Experiments: CMS and TOTEM (pp, general purpose; heavy ions), ATLAS, ALICE (heavy ions), LHCb (B-physics)

5000+ Physicists, 250+ Institutes, 60+ Countries

Physics goals: Higgs, SUSY, Extra Dimensions, CP Violation, QG Plasma, the Unexpected

LHC Data Grid Hierarchy

CERN/Outside Resource Ratio ~1:2; Tier0/(ΣTier1)/(ΣTier2) ~1:1:1

Experiment and Online System: ~PByte/sec raw data rate; 10-40 Gbps outbound

Tier 0 +1 (CERN Center: PBs of disk, tape robot): receives ~150-1500 MBytes/sec

Tier 1 centers (FNAL, IN2P3, RAL, INFN): ~10 Gbps links

Tier 2 centers: ~1-10 Gbps onward to the Tier 3 institutes

Tier 3 (Institutes) and Tier 4 (Workstations): physics data cache at 1 to 10 Gbps

Tens of Petabytes by 2007-8; an Exabyte ~5-7 years later.

Emerging Vision: A Richly Structured, Global Dynamic System
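As a rough cross-check of the rates quoted in this hierarchy, the short Python sketch below converts sustained rates into daily volumes. It uses only figures from the slide (150-1500 MBytes/sec into the Tier 0 +1 center, a ~10 Gbps Tier 1 link) and decimal units; it is illustrative arithmetic, not part of the original material.

SECONDS_PER_DAY = 86_400

def mbytes_per_sec_to_tb_per_day(rate_mb_s):
    # 1 TB = 1e6 MB (decimal units)
    return rate_mb_s * SECONDS_PER_DAY / 1e6

def gbps_to_tb_per_day(rate_gbps):
    # Gbit/s -> GByte/s -> TB/day, assuming the link is kept full
    return rate_gbps / 8 * SECONDS_PER_DAY / 1e3

for rate in (150, 1500):   # Tier 0 +1 ingest rates quoted on the slide
    print(f"{rate:5d} MB/s sustained ~= {mbytes_per_sec_to_tb_per_day(rate):6.1f} TB/day")
print(f"A filled 10 Gbps link ~= {gbps_to_tb_per_day(10):6.1f} TB/day")

At these rates (roughly 13-130 TB/day per stream, ~108 TB/day per filled 10 Gbps link), the slide's projection of tens of Petabytes by 2007-8 follows directly.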

Long Term Trends in Network Traffic Volumes: 300-1000X/10Yrs

ESnet Accepted Traffic, 1990-2005 (as of May 2005): exponential growth, +82%/Year for the last 15 years; 400X per decade (R. Cottrell, W. Johnston)
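As a small consistency check on the growth figures above (illustrative arithmetic only):

# +82% per year, compounded over a decade, reproduces the "400X per decade" figure.
growth_per_year = 1.82
print(f"+82%/year over 10 years ~= {growth_per_year ** 10:.0f}x per decade")   # ~400x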

Progress in Steps

  

SLAC Traffic Growth in Steps: ~10X/4 Years.

Projected: ~2 Terabits/s by ~2014

Summer 2005: 2 x 10 Gbps links, one for production, one for R&D

Internet2 Land Speed Record (LSR)

IPv4 multi-stream record with FAST TCP: 6.86 Gbps x 27 kkm, Nov. 2004

IPv6 record: 5.11 Gbps between Geneva and Starlight: Jan. 2005

Disk-to-disk Marks: 536 Mbytes/sec (Windows); 500 Mbytes/sec (Linux)

End System Issues: PCI-X Bus, Linux Kernel, NIC Drivers, CPU
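The end-system issues above are largely a matter of keeping a single TCP stream full over a long, fast path. The sketch below computes the bandwidth-delay product for a 10 Gbps link; the round-trip times are illustrative values I have assumed, not measurements from the record runs.

def bdp_mbytes(rate_gbps, rtt_ms):
    # bytes that must be in flight = (bits/s / 8) * round-trip time
    return rate_gbps * 1e9 / 8 * (rtt_ms / 1e3) / 1e6

for rate_gbps, rtt_ms, label in [(10.0, 120.0, "transatlantic-scale RTT"),
                                 (10.0, 250.0, "transpacific-scale RTT")]:
    mb = bdp_mbytes(rate_gbps, rtt_ms)
    print(f"{rate_gbps:.0f} Gbps x {rtt_ms:.0f} ms ({label}): ~{mb:.0f} MB in flight")

Buffering hundreds of MB per stream in the kernel, NIC and I/O bus is what makes the host itself, not just the network, the limiting factor.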

Internet2 LSRs (blue = HEP): 7.2 Gbps x 20.7 kkm

[Chart: Internet2 LSR, single IPv4 TCP stream, Apr 2002 - Nov 2004: 0.4 Gbps x 12,272 km; 0.9 Gbps x 10,978 km; 2.5 Gbps x 10,037 km; 5.4 Gbps x 7,067 km; 5.6 Gbps x 10,949 km; 4.2 Gbps x 16,343 km; 6.6 Gbps x 16,500 km; 7.21 Gbps x 20,675 km]

[Diagram: the Nov. 2004 record network]

NB: Manufacturers’ Roadmaps for 2006: One Server Pair to One 10G Link

HENP Bandwidth Roadmap for Major Links (in Gbps)

Year  Production             Experimental             Remarks
2001  0.155                  0.622-2.5                SONET/SDH
2002  0.622                  2.5                      SONET/SDH; DWDM; GigE integration
2003  2.5                    10                       DWDM; 1 + 10 GigE integration
2005  10                     2-4 x 10                 Switch; provisioning
2007  2-4 x 10               ~10 x 10; 40 Gbps        1st Gen. Grids
2009  ~10 x 10 or 1-2 x 40   ~5 x 40 or ~20-50 x 10   40 Gbps switching
2011  ~5 x 40 or ~20 x 10    ~25 x 40 or ~100 x 10    2nd Gen. Grids; Terabit networks
2013  ~Terabit               ~MultiTbps               ~Fill one fiber

Continuing Trend: ~1000 Times Bandwidth Growth Per Decade; HEP: Co-Developer as well as Application Driver of Global Nets

LHCNet, ESnet Plan 2006-2009: 20-80 Gbps US-CERN, ESnet MANs, IRNC

[Map: ESnet/LHCNet topology with hubs (SEA, SNV, SDG, ALB, ELP, DEN, CHI, NYC, DC, ATL), metropolitan area rings, major DOE Office of Science sites, and links to AsiaPac, Australia, Japan and Europe (GEANT2, SURFnet, IN2P3)]

Production IP ESnet core: 10 Gbps enterprise IP traffic; ESnet 2nd core (Science Data Network): 30-50 Gbps, 40-60 Gbps circuit transport; high-speed cross connects with Internet2/Abilene

LHCNet US-CERN wavelength triangle: 10/05: 10G CHI + 10G NYC; 2007: 20G + 20G; 2009: ~40G + 40G; LHCNet Data Network 2 to 8 x 10 Gbps US-CERN

ESnet MANs to FNAL and BNL; dark fiber (60 Gbps) to FNAL; NSF/IRNC circuits; GVA-AMS connection via SURFnet or GEANT2

Global Lambdas for Particle Physics Caltech/CACR and FNAL/SLAC Booths

Preview global-scale data analysis of the LHC Era (2007-2020+), using next-generation networks and intelligent grid systems

Using state of the art WAN infrastructure and Grid-based Web service frameworks, based on the LHC Tiered Data Grid Architecture

Using a realistic mixture of streams: organized transfer of multi-TB event datasets, plus numerous smaller flows of physics data that absorb the remaining capacity.

The analysis software suites are based on the Grid-enabled Analysis Environment (GAE) developed at Caltech and U. Florida, as well as Xrootd from SLAC and dCache from FNAL

Monitored by Caltech’s MonALISA global monitoring and control system

Global Lambdas for Particle Physics Caltech/CACR and FNAL/SLAC Booths

We used twenty-two [*] 10 Gbps waves to carry bidirectional traffic between Fermilab, Caltech, SLAC, BNL, CERN and other partner Grid Service sites including: Michigan, Florida, Manchester, Rio de Janeiro (UERJ) and Sao Paulo (UNESP) in Brazil, Korea (KNU), and Japan (KEK)

Results

151 Gbps peak, 100+ Gbps of throughput sustained for hours: 475 Terabytes of physics data transported in < 24 hours

131 Gbps measured by the SCInet BWC team on 17 of our waves

Using real physics applications and production as well as test systems for data access, transport and analysis: bbcp, xrootd, dCache, and GridFTP; and grid analysis tool suites

An optimized Linux kernel for TCP-based protocols, including Caltech's FAST TCP

Far surpassing our previous SC2004 BWC record of 101 Gbps

[*] 15 waves at the Caltech/CACR booth and 7 at the FNAL/SLAC booth

Monitoring NLR, Abilene/HOPI, LHCNet, USNet, TeraGrid, PWave, SCInet, Gloriad, JGN2, WHREN, other international R&E nets, and 14,000+ Grid nodes simultaneously (I. Legrand)

Switch and Server Interconnections at the Caltech Booth (#428)

15 10G Waves

72 nodes with 280+ Cores

64 10G Switch Ports: 2 Fully Populated Cisco 6509Es

45 Neterion 10 GbE NICs

200 SATA Disks

 

40 Gbps (20 HBAs) to StorCloud

Thursday - Sunday setup

http://monalisa-ul.caltech.edu:8080/stats?page=nodeinfo_sys

Fermilab

Our BWC data sources are the production storage systems and file servers used by:

CDF

US CMS Tier 1

Sloan Digital Sky Survey

Each of these produces, stores and moves multi-TB to PB-scale data: tens of TB per day

~600 GridFTP servers (of thousands) directly involved

bbcp RAM-disk to RAM-disk transfer (CERN to Chicago): 3 TBytes of physics data transferred in 2 hours

[Chart: bbcp memory-to-memory throughput with a 16 MB window and 2 streams, sampled in units of 5 seconds]
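A minimal sketch of how such a transfer might be driven, assuming bbcp's -w (window size) and -s (number of streams) options and placeholder host names; the closing arithmetic only restates the figure quoted above (3 TBytes in 2 hours).

import subprocess

# Placeholder endpoints; a real run would name the actual CERN and Chicago hosts.
src = "cern-host.example.org:/dev/shm/physics.dat"
dst = "chicago-host.example.org:/dev/shm/"

# -w: TCP window size, -s: parallel streams. Check both options against the
# installed bbcp version before relying on them.
subprocess.run(["bbcp", "-w", "16M", "-s", "2", src, dst], check=True)

# Sanity check on the quoted figure: 3 TBytes in 2 hours averages to about
avg_mb_s = 3e12 / (2 * 3600) / 1e6
print(f"3 TB in 2 hours ~= {avg_mb_s:.0f} MB/s sustained")   # ~417 MB/s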

Xrootd Server Performance

[Chart (A. Hanushevsky): single-server linear scaling; network I/O in MB/sec, percent CPU remaining, and events/sec processed vs. number of concurrent jobs (50-400)]

Excellent Across WANs

Scientific Results

Ad hoc Analysis of Multi-TByte Archives

 

Immediate exploration spurs novel discovery approaches

Linear Scaling

Hardware Performance

Deterministic Sizing

High Capacity

Thousands of clients

Hundreds of Parallel Streams

Very Low Latency

12 µs + Transfer Cost

Device + NIC Limited

Xrootd Clustering

[Diagram: the client sends "open file X" to the redirector (head node, over nodes A, B, C) and is told "go to C"; C, a supervisor (sub-redirector) over data servers D, E, F, forwards the open and replies "go to F"; the data servers form the cluster]

Client sees all servers as xrootd data servers

Unbounded Clustering

Self-organizing

Total Fault Tolerance

Automatic real-time reorganization

Result

Minimum Admin Overhead

Better Client CPU Utilization

More results in less time at less cost
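As a concrete illustration of the single-namespace view described above, the PyROOT sketch below opens a file through a redirector; the redirector host and file path are hypothetical, and the redirection itself is handled entirely inside the xrootd client library.

import ROOT  # PyROOT; requires a ROOT installation with xrootd support

# Hypothetical redirector host and file path, for illustration only.
url = "root://redirector.example.org//store/demo/events.root"

f = ROOT.TFile.Open(url)      # the xrootd client follows the redirection internally
if f and not f.IsZombie():
    f.ls()                    # browse the remote file as if it were local
    f.Close()
else:
    print("could not open", url)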

Remote Sites: Caltech, UFL, Brazil, ...

[Diagram: ROOT Analysis clients at the remote sites connected through GAE Services]

Authenticated users automatically discover and initiate multiple transfers of physics datasets (ROOT files) through secure Clarens-based GAE services.

Transfers are monitored through MonALISA.

Once data arrives at the target sites, (remote) analysis can be started by authenticated users using the ROOT analysis framework.

Using the Clarens ROOT viewer or the COJAC event viewer, data from remote sites can be presented transparently to the ROOT analysis.

SC|05 Abilene and HOPI Waves

GLORIAD: 10 Gbps Optical Ring Around the Globe by March 2007

China, Russia, Korea, Japan, US, Netherlands partnership

GLORIAD circuits today:

10 Gbps Hong Kong-Daejeon-Seattle

10 Gbps Seattle-Chicago-NYC (CANARIE contribution to GLORIAD)

622 Mbps Moscow-AMS-NYC

2.5 Gbps Moscow-AMS

155 Mbps Beijing-Khabarovsk-Moscow

2.5 Gbps Beijing-Hong Kong

1 GbE NYC-Chicago (CANARIE)

US: NSF IRNC Program

ESLEA/UKLight SC|05 Network Diagram

6 X 1 GE OC-192

KNU (Korea) Main Goals

  

Uses the 10 Gbps GLORIAD link from Korea to the US (called BIG-GLORIAD), also part of UltraLight

Try to saturate this BIG-GLORIAD link with servers and cluster storage connected at 10 Gbps

Korea is planning to be a Tier-1 site for the LHC experiments

[Map: Korea - BIG-GLORIAD - U.S.]

KEK (Japan) at SC05: 10GE Switches on the KEK-JGN2-StarLight Path

JGN2: 10G Network Research Testbed

Operational since 4/04; 10 Gbps L2 between Tsukuba and Tokyo Otemachi

10 Gbps IP to StarLight since August 2004

10 Gbps L2 to StarLight since September 2005

Otemachi-Chicago OC-192 link replaced by 10GE WAN PHY in September 2005

Brazil HEPGrid: Rio de Janeiro (UERJ) and Sao Paulo (UNESP)

“Global Lambdas for Particle Physics”

A Worldwide Network & Grid Experiment

  

We have Previewed the IT Challenges of Next Generation Science at the High Energy Frontier (for the LHC and other major programs)

Petabyte-scale datasets

 

Tens of national and transoceanic links at 10 Gbps (and up)

100+ Gbps of aggregate data transport sustained for hours; we reached a Petabyte/day transport rate for real physics data (a short rate check in code follows this list)

We set the scale and learned to gauge the difficulty of the global networks and transport systems required for the LHC mission

But we set up, shook down and successfully ran the system in <1 week

We have substantive take-aways from this marathon exercise:

An optimized Linux (2.6.12 + FAST + NFSv4) kernel for data transport; after 7 full kernel-build cycles in 4 days

A newly optimized application-level copy program, bbcp, that matches the performance of iperf under some conditions

Extension of Xrootd, an optimized low-latency file access application for clusters, across the wide area

Understanding of the limits of 10 Gbps-capable systems under stress
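As the rate check referenced above, a small conversion using only the figures quoted in these slides (illustrative arithmetic, not new results):

SECONDS_PER_DAY = 86_400

# A sustained Petabyte per day expressed as a line rate:
pb_day_gbps = 1e15 * 8 / SECONDS_PER_DAY / 1e9
print(f"1 PB/day    ~= {pb_day_gbps:.0f} Gbps sustained")    # ~93 Gbps

# The 475 TB moved in under 24 hours corresponds to an average of at least:
avg_gbps = 475e12 * 8 / SECONDS_PER_DAY / 1e9
print(f"475 TB/24 h ~= {avg_gbps:.0f} Gbps average")         # ~44 Gbps

Both numbers sit comfortably inside the 100+ Gbps sustained and 151 Gbps peak rates reported earlier.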

“Global Lambdas for Particle Physics”

A Worldwide Network & Grid Experiment

We are grateful to our many network partners: SCInet, LHCNet, Starlight, NLR, Internet2’s Abilene and HOPI, ESnet, UltraScience Net, MiLR, FLR, CENIC, Pacific Wave, UKLight, TeraGrid, Gloriad, AMPATH, RNP, ANSP, CANARIE and JGN2.

And to our partner projects: US CMS, US ATLAS, D0, CDF, BaBar, US LHCNet, UltraLight, LambdaStation, Terapaths, PPDG, GriPhyN/iVDGL, LHCNet, StorCloud, SLAC IEPM, ICFA/SCIC and Open Science Grid

Our Supporting Agencies: DOE and NSF

And for the generosity of our vendor supporters, especially Cisco Systems, Neterion, HP, IBM, and many others, who have made this possible

And the Hudson Bay Fan Company…

Extra Slides Follow

Global Lambdas for Particle Physics Analysis SC|05 Bandwidth Challenge Entry

Caltech, CERN, Fermilab, Florida, Manchester, Michigan, SLAC, Vanderbilt, Brazil, Korea, Japan, et al.

CERN's Large Hadron Collider experiments: Data/Compute/Network Intensive

Discovering the Higgs, SuperSymmetry, or Extra Space-Dimensions with a Global Grid

Worldwide Collaborations of Physicists Working Together, while Developing Next-Generation Global Network and Grid Systems

[Diagram: Clarens architecture. A Clarens client reaches, over the network via XML-RPC, SOAP, Java RMI or JSON-RPC on http/https, a web server hosting Clarens (ACL, X.509, discovery); behind it sit the analysis sandbox, 3rd-party application services, catalog, storage and datasets]

      

Authentication

Access control on Web Services.

Remote file access (and access control on files).

Discovery of Web Services and Software.

Shell service: shell-like access to remote machines (managed by access control lists).

Proxy certificate functionality

Virtual Organization management and role management.

User's point of access to a Grid system.

Provides an environment where the user can:

Access Grid resources and services.

Execute and monitor Grid applications.

Collaborate with other users.

One-stop shop for Grid needs.

Portals can lower the barrier for users to access Web Services and use Grid-enabled applications.
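To make the Web-service access pattern concrete, here is a minimal, hypothetical sketch of a Python client speaking XML-RPC over https to a Clarens-style endpoint. The URL and the catalog/transfer method names are illustrative placeholders, not the actual Clarens API, which also handles X.509-based authentication and access control.

import xmlrpc.client

# Hypothetical endpoint; a real Clarens deployment would also require the
# user's X.509 grid credentials for authentication and access control.
endpoint = "https://gae-portal.example.org:8443/clarens/"
server = xmlrpc.client.ServerProxy(endpoint)

# Standard XML-RPC introspection, if the server enables it.
print(server.system.listMethods())

# Illustrative, hypothetical method names standing in for catalogue browsing
# and dataset transfer requests; the real Clarens/GAE calls differ.
datasets = server.catalog.list("/store/demo")
job_id = server.transfer.start(datasets[0], "root://redirector.example.org//store/")
print("transfer submitted:", job_id)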