Integrating Universities and Laboratories
In National Cyberinfrastructure
Paul Avery
University of Florida
[email protected]
PASI Lecture
Mendoza, Argentina
May 17, 2005
Outline of Talk
• Cyberinfrastructure and Grids
• Data intensive disciplines and Data Grids
• The Trillium Grid collaboration: GriPhyN, iVDGL, PPDG
• The LHC and its computing challenges
• Grid3 and the Open Science Grid
• A bit on networks
• Education and Outreach
• Challenges for the future
• Summary
Presented from a physicist’s perspective!
Cyberinfrastructure (cont)
• Software: programs, services, instruments, data, information, knowledge, applicable to specific projects, disciplines, and communities.
• Cyberinfrastructure: layer of enabling hardware, algorithms, software, communications, institutions, and personnel. A platform that empowers researchers to innovate and eventually revolutionize what they do, how they do it, and who participates.
• Base technologies: computation, storage, and communication components that continue to advance in raw capacity at exponential rates.
[Paraphrased from NSF Blue Ribbon Panel report, 2003]
Challenge: Creating and operating advanced cyberinfrastructure and integrating it in science and engineering applications.
Cyberinfrastructure and Grids
• Grid: geographically distributed computing resources configured for coordinated use
  - Fabric: physical resources & networks provide raw capability
  - Ownership: resources controlled by owners and shared w/ others
  - Middleware: software ties it all together: tools, services, etc.
• Enhancing collaboration via transparent resource sharing
[Figure: the US-CMS "Virtual Organization"]
Data Grids & Collaborative Research
• Team-based 21st century scientific discovery
  - Strongly dependent on advanced information technology
  - People and resources distributed internationally
• Dominant factor: data growth (1 Petabyte = 1000 TB)
  - 2000: ~0.5 Petabyte
  - 2005: ~10 Petabytes
  - 2010: ~100 Petabytes
  - 2015-7: ~1000 Petabytes?
  How to collect, manage, access and interpret this quantity of data?
• Drives need for powerful linked resources: "Data Grids"
  - Computation: massive, distributed CPU
  - Data storage and access: distributed hi-speed disk and tape
  - Data movement: international optical networks
• Collaborative research and Data Grids
  - Data discovery, resource sharing, distributed analysis, etc.
Examples of Data Intensive Disciplines
• High energy & nuclear physics
  - Belle, BaBar, Tevatron, RHIC, JLAB
  - Large Hadron Collider (LHC): primary driver
• Astronomy
  - Digital sky surveys, "Virtual" Observatories
  - VLBI arrays: multiple-Gb/s data streams
• Gravity wave searches
  - LIGO, GEO, VIRGO, TAMA, ACIGA, …
• Earth and climate systems
  - Earth Observation, climate modeling, oceanography, …
• Biology, medicine, imaging
  - Genome databases
  - Proteomics (protein structure & interactions, drug delivery, …)
  - High-resolution brain scans (1-10 μm, time dependent)
Our Vision & Goals
• Develop the technologies & tools needed to exploit a Grid-based cyberinfrastructure
• Apply and evaluate those technologies & tools in challenging scientific problems
• Develop the technologies & procedures to support a permanent Grid-based cyberinfrastructure
• Create and operate a persistent Grid-based cyberinfrastructure in support of discipline-specific research goals
End-to-end
GriPhyN + iVDGL + Particle Physics Data Grid (PPDG) = Trillium

Our Science Drivers
• Experiments at Large Hadron Collider
  - New fundamental particles and forces
  - 100s of Petabytes, 2007 - ?
• High Energy & Nuclear Physics expts
  - Top quark, nuclear matter at extreme density
  - ~1 Petabyte (1000 TB), 1997 - present
• LIGO (gravity wave search)
  - Search for gravitational waves
  - 100s of Terabytes, 2002 - present
• Sloan Digital Sky Survey
  - Systematic survey of astronomical objects
  - 10s of Terabytes, 2001 - present
[Chart: the four drivers arranged by data growth and community growth, 2001-2009]
Grid Middleware: Virtual Data Toolkit
[Diagram: VDT build & test process. Sources (CVS) feed the NMI build & test facility (Condor pool, 22+ operating systems); builds and tests produce binaries, which are packaged with patching into a Pacman cache, RPMs, and GPT source bundles. Many contributors.]
A unique laboratory for testing, supporting, deploying, packaging, upgrading, & troubleshooting complex sets of software!
VDT Growth Over 3 Years
www.griphyn.org/vdt/
[Chart: number of VDT components vs. time, Jan-02 through Apr-05, rising from ~10 (VDT 1.0: Globus 2.0b, Condor 6.3.1) to ~35 (VDT 1.3.x). Milestones: VDT 1.1.7 (switch to Globus 2.2), VDT 1.1.8 (first real use by LCG), VDT 1.1.11 (Grid3).]
Components of VDT 1.3.5
• Globus 3.2.1
• Condor 6.7.6
• RLS 3.0
• ClassAds 0.9.7
• Replica 2.2.4
• DOE/EDG CA certs
• ftsh 2.0.5
• EDG mkgridmap
• EDG CRL Update
• GLUE Schema 1.0
• VDS 1.3.5b
• Java
• Netlogger 3.2.4
• Gatekeeper-Authz
• MyProxy 1.11
• KX509
• System Profiler
• GSI OpenSSH 3.4
• MonALISA 1.2.32
• PyGlobus 1.0.6
• MySQL
• UberFTP 1.11
• DRM 1.2.6a
• VOMS 1.4.0
• VOMS Admin 0.7.5
• Tomcat
• PRIMA 0.2
• Certificate Scripts
• Apache
• jClarens 0.5.3
• New GridFTP Server
• GUMS 1.0.1
Collaborative Relationships: A CS + VDT Perspective
[Diagram: Computer Science Research → Virtual Data Toolkit (techniques & software) → Tech Transfer → Larger Science Community, with requirements and prototyping & experiments feeding back, and production deployment flowing out to U.S. Grids, international partners, and outreach.]
• Partner science, networking and outreach projects (Globus, Condor, NMI, iVDGL, PPDG; EU DataGrid, LHC Experiments; QuarkNet, CHEPREO, Digital Divide)
• Other linkages: work force, CS researchers, industry
U.S. “Trillium” Grid Partnership
• Trillium = PPDG + GriPhyN + iVDGL
  - Particle Physics Data Grid: $12M (DOE) (1999 - 2006)
  - GriPhyN: $12M (NSF) (2000 - 2005)
  - iVDGL: $14M (NSF) (2001 - 2006)
• Basic composition (~150 people)
  - PPDG: 4 universities, 6 labs
  - GriPhyN: 12 universities, SDSC, 3 labs
  - iVDGL: 18 universities, SDSC, 4 labs, foreign partners
  - Expts: BaBar, D0, STAR, JLab, CMS, ATLAS, LIGO, SDSS/NVO
• Coordinated internally to meet broad goals
  - GriPhyN: CS research, Virtual Data Toolkit (VDT) development
  - iVDGL: Grid laboratory deployment using VDT, applications
  - PPDG: "end to end" Grid services, monitoring, analysis
  - Common use of VDT for underlying Grid middleware
  - Unified entity when collaborating internationally
Goal: Peta-scale Data Grids for Global Science
[Diagram: production teams, workgroups, and single researchers use interactive user tools; these drive virtual data tools, request planning & scheduling tools, and request execution & management tools, built on resource management, security and policy, and other Grid services; transforms act on raw data sources and distributed resources (code, storage, CPUs, networks). Scale: PetaOps, Petabytes, performance.]
Sloan Digital Sky Survey (SDSS): Using Virtual Data in GriPhyN
[Chart: galaxy cluster size distribution from Sloan data; number of clusters vs. number of galaxies per cluster, log-log scale.]
The LIGO Scientific Collaboration (LSC) and the LIGO Grid
• LIGO Grid: 6 US sites + 3 EU sites (Birmingham and Cardiff/UK, AEI/Germany)
• iVDGL has enabled LSC to establish a persistent production grid
[Map: LIGO Grid sites; LHO and LLO are the observatory sites; LSC (LIGO Scientific Collaboration) sites are iVDGL supported.]
Large Hadron Collider &
its Frontier Computing
Challenges
Large Hadron Collider (LHC) @ CERN
• 27 km tunnel in Switzerland & France
• Experiments: ATLAS, CMS, ALICE, LHCb, TOTEM
• Search for:
  - Origin of mass
  - New fundamental forces
  - Supersymmetry
  - Other new particles
• 2007 - ?
CMS: “Compact” Muon Solenoid
[Photo of the CMS detector; the "inconsequential humans" in the picture give the scale.]
LHC Data Rates: Detector to Storage
• Detector output: 40 MHz collision rate, ~TBytes/sec (physics filtering begins here)
• Level 1 Trigger (special hardware): 75 kHz, 75 GB/sec
• Level 2 Trigger (commodity CPUs): 5 kHz, 5 GB/sec
• Level 3 Trigger (commodity CPUs): 100 Hz, 0.15 - 1.5 GB/sec
• Raw data to storage (+ simulated data)
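To connect these rates to the storage numbers quoted later, a rough sketch in Python; the ~1e7 seconds of effective running per year is an assumed figure, not a number from the slide:

# Rough raw-data volume implied by the Level 3 output rate above.
seconds_per_year = 1.0e7                                  # assumed effective running time per year
for label, gb_per_s in [("low", 0.15), ("high", 1.5)]:    # Level 3 output range, from the slide
    petabytes = gb_per_s * seconds_per_year / 1.0e6       # 1 PB = 1e6 GB
    print(f"{label}: ~{petabytes:.1f} PB of raw data per year")

This lands at roughly 1.5 to 15 PB per year of raw data, consistent with the "10s of Petabytes/yr" quoted for the LHC Grid once simulation and derived data are added.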
Complexity: Higgs Decay to 4 Muons
(+30 minimum bias events)
• All charged tracks with pt > 2 GeV
• Reconstructed tracks with pt > 25 GeV
• 10^9 collisions/sec, selectivity: 1 in 10^13
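A quick check of what that selectivity means in practice, straight arithmetic from the two numbers on the slide:

collision_rate = 1.0e9      # collisions per second (slide)
selectivity = 1.0e-13       # one selected event per 1e13 collisions (slide)
per_day = collision_rate * selectivity * 86400
print(f"~{per_day:.0f} candidate events per day")   # roughly 9 per day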
LHC: Petascale Global Science
• Complexity: millions of individual detector channels
• Scale: PetaOps (CPU), 100s of Petabytes (data)
• Distribution: global distribution of people & resources

BaBar/D0 example (2004): 700+ physicists, 100+ institutes, 35+ countries
CMS example (2007): 5000+ physicists, 250+ institutes, 60+ countries
LHC Global Data Grid (2007+)
• 5000 physicists, 60 countries
• 10s of Petabytes/yr by 2008
• 1000 Petabytes in < 10 yrs?
[Diagram, CMS experiment: the online system feeds the Tier 0 CERN computer center at 150-1500 MB/s; Tier 1 national centers (Korea, Russia, UK, USA) connect at 10-40 Gb/s; Tier 2 university centers (U Florida, Caltech, UCSD) connect at >10 Gb/s; Tier 3 institutions (FIU, Iowa, Maryland) at 2.5-10 Gb/s; Tier 4 is physics caches and PCs.]
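To get a feel for the link speeds in the figure, a naive transfer-time sketch; the 100 TB dataset size and the 70% usable-bandwidth factor are illustrative assumptions, not numbers from the slide:

def transfer_days(size_tb, link_gbps, efficiency=0.7):
    """Naive estimate of transfer time; 'efficiency' is an assumed usable fraction of the link."""
    size_bits = size_tb * 8.0e12
    seconds = size_bits / (link_gbps * 1.0e9 * efficiency)
    return seconds / 86400.0

for name, gbps in [("Tier1 to Tier2 at 10 Gb/s", 10), ("Tier2 to Tier3 at 2.5 Gb/s", 2.5)]:
    print(f"{name}: {transfer_days(100, gbps):.1f} days for 100 TB")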
University Tier2 Centers
• Tier2 facility
  - Essential university role in extended computing infrastructure
  - 20 - 25% of Tier1 national laboratory, supported by NSF
  - Validated by 3 years of experience (CMS, ATLAS, LIGO)
• Functions
  - Perform physics analysis, simulations
  - Support experiment software
  - Support smaller institutions
• Official role in Grid hierarchy (U.S.)
  - Sanctioned by MOU with parent organization (ATLAS, CMS, LIGO)
  - Selection by collaboration via careful process
  - Local P.I. with reporting responsibilities
Grids and Globally Distributed Teams
• Non-hierarchical: chaotic analyses + productions
• Superimpose significant random data flows
Grid3 and
Open Science Grid
Grid3: A National Grid Infrastructure
• 32 sites, 4000 CPUs: universities + 4 national labs
• Part of LHC Grid, running since October 2003
• Sites in US, Korea, Brazil, Taiwan
• Applications in HEP, LIGO, SDSS, Genomics, fMRI, CS
[Map of Grid3 sites, including Brazil]
http://www.ivdgl.org/grid3
Grid3 World Map
Grid3 Components
• Computers & storage at ~30 sites: 4000 CPUs
• Uniform service environment at each site
  - Globus Toolkit: provides basic authentication, execution management, data movement
  - Pacman: installs numerous other VDT and application services
• Global & virtual organization services
  - Certification & registration authorities, VO membership services, monitoring services
• Client-side tools for data access & analysis
  - Virtual data, execution planning, DAG management, execution management, monitoring
• IGOC: iVDGL Grid Operations Center
• Grid testbed: Grid3dev
  - Middleware development and testing, new VDT versions, etc.
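A minimal sketch of what that uniform service environment looks like from a user's side, driving the Globus Toolkit command-line clients from Python; the gatekeeper and GridFTP hostnames below are placeholders, not real Grid3 endpoints:

import subprocess

# Authentication: create a short-lived proxy from the user's grid certificate.
subprocess.run(["grid-proxy-init", "-valid", "12:00"], check=True)

# Execution management: run a trivial job through a (hypothetical) gatekeeper.
subprocess.run(["globus-job-run", "gatekeeper.example.edu/jobmanager-condor",
                "/bin/hostname"], check=True)

# Data movement: third-party GridFTP copy between two (hypothetical) storage servers.
subprocess.run(["globus-url-copy",
                "gsiftp://se1.example.edu/data/run01.root",
                "gsiftp://se2.example.edu/data/run01.root"], check=True)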
Grid3 Applications
CMS experiment         | p-p collision simulations & analysis
ATLAS experiment       | p-p collision simulations & analysis
BTeV experiment        | p-p collision simulations & analysis
LIGO                   | Search for gravitational wave sources
SDSS                   | Galaxy cluster finding
Bio-molecular analysis | Shake n Bake (SnB) (Buffalo)
Genome analysis        | GADU/Gnare
fMRI                   | Functional MRI (Dartmouth)
CS Demonstrators       | Job Exerciser, GridFTP, NetLogger
www.ivdgl.org/grid3/applications
Usage: CPUs (Grid3 Shared Use Over 6 Months)
[Chart: CPU usage by VO over six months, showing the ATLAS DC2 and CMS DC04 data challenges; annotation at Sep 10.]
Grid3 Production Over 13 Months
U.S. CMS 2003 Production
• 10M p-p collisions; largest ever
  - 2x simulation sample
  - ½ manpower
• Multi-VO sharing
Grid3 as CS Research Lab: e.g., Adaptive Scheduling
• Adaptive data placement in a realistic environment (K. Ranganathan)
• Enables comparisons with simulations
[Charts: average number of idle CPUs, total I/O traffic (MBytes), and total response time (seconds) across Grid3 sites (BNL_ATLAS, FNAL_CMS, IU_ATLAS_Tier2, UFlorida-Grid3, UFlorida-PG, UCSanDiegoPG, UBuffalo-CCR, UM_ATLAS, Vanderbilt, ANL_HEP, CalTech-PG, CalTech-Grid3) for three placement policies: At Data, Least Loaded, and Cost Function; 22-28 Jan.]
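A toy sketch of the three placement policies being compared in those plots; the site list, loads, and transfer penalty below are invented for illustration and are not taken from the study:

# Each site: idle CPUs and whether it already holds the job's input data.
sites = {
    "SiteA": {"idle_cpus": 40, "has_data": True},
    "SiteB": {"idle_cpus": 120, "has_data": False},
    "SiteC": {"idle_cpus": 5,  "has_data": True},
}

def at_data(sites):
    # Prefer a site that already holds the input data.
    return max((s for s, v in sites.items() if v["has_data"]),
               key=lambda s: sites[s]["idle_cpus"])

def least_loaded(sites):
    # Ignore data location; pick the site with the most idle CPUs.
    return max(sites, key=lambda s: sites[s]["idle_cpus"])

def cost_function(sites, transfer_penalty=50):
    # Trade off free CPUs against the cost of moving the data.
    def score(s):
        v = sites[s]
        return v["idle_cpus"] - (0 if v["has_data"] else transfer_penalty)
    return max(sites, key=score)

print(at_data(sites), least_loaded(sites), cost_function(sites))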
Grid3 Lessons Learned
• How to operate a Grid as a facility
  - Tools, services, error recovery, procedures, docs, organization
  - Delegation of responsibilities (project, VO, service, site, …)
  - Crucial role of Grid Operations Center (GOC)
• How to support people-to-people relations
  - Face-to-face meetings, phone cons, 1-1 interactions, mail lists, etc.
• How to test and validate Grid tools and applications
  - Vital role of testbeds
• How to scale algorithms, software, process
  - Some successes, but "interesting" failure modes still occur
• How to apply distributed cyberinfrastructure
  - Successful production runs for several applications
Grid3 → Open Science Grid
• Iteratively build & extend Grid3
  - OSG-0 → OSG-1 → OSG-2 → …
  - Shared resources, benefiting broad set of disciplines
  - Grid middleware based on Virtual Data Toolkit (VDT)
  - Emphasis on "end to end" services for applications
• Grid3 → OSG collaboration
  - Computer and application scientists
  - Facility, technology and resource providers (labs, universities)
• Further develop OSG
  - Partnerships and contributions from other sciences, universities
  - Incorporation of advanced networking
  - Focus on general services, operations, end-to-end performance
• Aim for Summer 2005 deployment
http://www.opensciencegrid.org
OSG Organization
[Organization chart: an Advisory Committee and the OSG Council (all members above a certain threshold; Chair, officers) oversee an Executive Board (8-15 representatives; Chair, Officers) supported by Core OSG Staff (a few FTEs, manager). Technical Groups sponsor Activities (Activity 1, 2, …). Stakeholders include universities, labs, service providers, sites, researchers, VOs, research Grid projects, and enterprise.]
OSG Technical Groups & Activities
• Technical Groups address and coordinate technical areas
  - Propose and carry out activities related to their given areas
  - Liaise & collaborate with other peer projects (U.S. & international)
  - Participate in relevant standards organizations
  - Chairs participate in Blueprint, Integration and Deployment activities
• Activities are well-defined, scoped tasks contributing to OSG
  - Each Activity has deliverables and a plan
  - … is self-organized and operated
  - … is overseen & sponsored by one or more Technical Groups
TGs and Activities are where the real work gets done
OSG Technical Groups
Governance                          | Charter, organization, by-laws, agreements, formal processes
Policy                              | VO & site policy, authorization, priorities, privilege & access rights
Security                            | Common security principles, security infrastructure
Monitoring and Information Services | Resource monitoring, information services, auditing, troubleshooting
Storage                             | Storage services at remote sites, interfaces, interoperability
Support Centers                     | Infrastructure and services for user support, helpdesk, trouble ticket
Education / Outreach                | Training, interface with various E/O projects
Networks (new)                      | Including interfacing with various networking projects
OSG Activities
Blueprint                      | Defining principles and best practices for OSG
Deployment                     | Deployment of resources & services
Provisioning                   | Connected to deployment
Incidence response             | Plans and procedures for responding to security incidents
Integration                    | Testing & validating & integrating new services and technologies
Data Resource Management (DRM) | Deployment of specific Storage Resource Management technology
Documentation                  | Organizing the documentation infrastructure
Accounting                     | Accounting and auditing use of OSG resources
Interoperability               | Primarily interoperability between
Operations                     | Operating Grid-wide services
Connections to European Projects:
LCG and EGEE
The Path to the OSG Operating Grid
[Diagram: releases flow from the OSG Integration Activity (readiness plan, software & packaging, VO application software installation, middleware interoperability, functionality & scalability tests, application validation, metrics & certification) to a release candidate and release description, then through the OSG Deployment Activity (service deployment, resources, effort) to the OSG Operations-Provisioning Activity, with feedback between stages.]
OSG Integration Testbed
[Map of OSG Integration Testbed sites, including Brazil]
Status of OSG Deployment
• OSG infrastructure release accepted for deployment
  - US CMS MOP "flood testing" successful
  - D0 simulation & reprocessing jobs running on selected OSG sites
  - Others in various stages of readying applications & infrastructure (ATLAS, CMS, STAR, CDF, BaBar, fMRI)
• Deployment process underway: end of July?
  - Open OSG and transition resources from Grid3
  - Applications will use growing ITB & OSG resources during transition
http://osg.ivdgl.org/twiki/bin/view/Integration/WebHome
Interoperability & Federation
• Transparent use of federated Grid infrastructures is a goal
  - There are sites that appear as part of "LCG" as well as part of OSG/Grid3
  - D0 bringing reprocessing to LCG sites through an adaptor node
  - CMS and ATLAS can run their jobs on both LCG and OSG
• Increasing interaction with TeraGrid
  - CMS and ATLAS sample simulation jobs are running on TeraGrid
  - Plans for TeraGrid allocation for jobs running in the Grid3 model: group accounts, binary distributions, external data management, etc.
Networks
Evolving Science Requirements for Networks
(DOE High Performance Network Workshop)

Science Areas               | Today: End2End Throughput    | 5 Years: End2End Throughput     | 5-10 Years: End2End Throughput      | Remarks
High Energy Physics         | 0.5 Gb/s                     | 100 Gb/s                        | 1000 Gb/s                           | High bulk throughput
Climate (Data & Computation)| 0.5 Gb/s                     | 160-200 Gb/s                    | N x 1000 Gb/s                       | High bulk throughput
SNS NanoScience             | Not yet started              | 1 Gb/s                          | 1000 Gb/s + QoS for control channel | Remote control and time critical throughput
Fusion Energy               | 0.066 Gb/s (500 MB/s burst)  | 0.2 Gb/s (500 MB/20 sec. burst) | N x 1000 Gb/s                       | Time critical throughput
Astrophysics                | 0.013 Gb/s (1 TB/week)       | N*N multicast                   | 1000 Gb/s                           | Computational steering and collaborations
Genomics Data & Computation | 0.091 Gb/s (1 TB/day)        | 100s of users                   | 1000 Gb/s + QoS for control channel | High throughput and steering

See http://www.doecollaboratory.org/meetings/hpnpw/
UltraLight: Advanced Networking in Applications
• Funded by ITR 2004
• 10 Gb/s+ network
  - Caltech, UF, FIU, UM, MIT
  - SLAC, FNAL
  - Int'l partners
  - Level(3), Cisco, NLR
UltraLight: New Information System
• A new class of integrated information systems
  - Includes networking as a managed resource for the first time
  - Uses "hybrid" packet-switched and circuit-switched optical network infrastructure
  - Monitor, manage & optimize network and Grid systems in real time
• Flagship applications: HEP, eVLBI, "burst" imaging
  - "Terabyte-scale" data transactions in minutes
  - Extend real-time eVLBI to the 10 - 100 Gb/s range
• Powerful testbed
  - Significant storage, optical networks for testing new Grid services
• Strong vendor partnerships
  - Cisco, Calient, NLR, CENIC, Internet2/Abilene
Education and Outreach
iVDGL, GriPhyN Education/Outreach Basics
• $200K/yr
• Led by UT Brownsville
• Workshops, portals, tutorials
• New partnerships with QuarkNet, CHEPREO, LIGO E/O, …
June 2004 Grid Summer School
• First of its kind in the U.S. (South Padre Island, Texas)
  - 36 students, diverse origins and types (M, F, MSIs, etc.)
• Marks new direction for U.S. Grid efforts
  - First attempt to systematically train people in Grid technologies
  - First attempt to gather relevant materials in one place
  - Today: students in CS and physics
  - Next: students, postdocs, junior & senior scientists
• Reaching a wider audience
  - Put lectures, exercises, video on the web
  - More tutorials, perhaps 2-3/year
  - Dedicated resources for remote tutorials
  - Create "Grid Cookbook", e.g. Georgia Tech
• Second workshop: July 11-15, 2005
  - South Padre Island again
QuarkNet/GriPhyN e-Lab Project
http://quarknet.uchicago.edu/elab/cosmic/home.jsp
Student Muon Lifetime Analysis in
GriPhyN/QuarkNet
CHEPREO: Center for High Energy Physics Research and Educational Outreach
Florida International University
• Physics Learning Center
• CMS Research
• iVDGL Grid Activities
• AMPATH network (S. America)
Funded September 2003; $4M initially (3 years); MPS, CISE, EHR, INT
Grids and the Digital Divide
Rio de Janeiro, Feb. 16-20, 2004
Background
• World Summit on Information Society
• HEP Standing Committee on Inter-regional Connectivity (SCIC)
Themes
• Global collaborations, Grids and addressing the Digital Divide
• Focus on poorly connected regions
Next meeting: Daegu, Korea, May 23-27, 2005
[Screenshot: workshop web site (bulletins, registration, travel and program information)]
Partnerships Drive Success
• Integrating Grids in scientific research
  - "Lab-centric": activities center around large facility
  - "Team-centric": resources shared by distributed teams
  - "Knowledge-centric": knowledge generated/used by a community
• Strengthening the role of universities in frontier research
  - Couples universities to frontier data intensive research
  - Brings front-line research and resources to students
  - Exploits intellectual resources at minority or remote institutions
• Driving advances in IT/science/engineering
  - Domain sciences ↔ Computer Science
  - Universities ↔ Laboratories
  - Scientists ↔ Students
  - NSF projects ↔ NSF projects
  - NSF ↔ DOE
  - Research communities ↔ IT industry
Fulfilling the Promise of Next Generation Science
• Supporting permanent, national-scale Grid infrastructure
  - Large CPU, storage and network capability crucial for science
  - Support personnel, equipment maintenance, replacement, upgrade
  - Tier1 and Tier2 resources a vital part of infrastructure
  - Open Science Grid a unique national infrastructure for science
• Supporting the maintenance, testing and dissemination of advanced middleware
  - Long-term support of the Virtual Data Toolkit
  - Vital for reaching new disciplines & for supporting large international collaborations
• Continuing support for HEP as a frontier challenge driver
  - Huge challenges posed by LHC global interactive analysis
  - New challenges posed by remote operation of Global Accelerator Network
Fulfilling the Promise (2)
• Creating even more advanced cyberinfrastructure
  - Integrating databases in large-scale Grid environments
  - Interactive analysis with distributed teams
  - Partnerships involving CS research with application drivers
• Supporting the emerging role of advanced networks
  - Reliable, high performance LANs and WANs necessary for advanced Grid applications
• Partnering to enable stronger, more diverse programs
  - Programs supported by multiple Directorates, a la CHEPREO
  - NSF-DOE joint initiatives
  - Strengthen ability of universities and labs to work together
• Providing opportunities for cyberinfrastructure training, education & outreach
  - Grid tutorials, Grid Cookbook
  - Collaborative tools for student-led projects & research
Summary
• Grids enable 21st century collaborative science
  - Linking research communities and resources for scientific discovery
  - Needed by global collaborations pursuing "petascale" science
• Grid3 was an important first step in developing US Grids
  - Value of planning, coordination, testbeds, rapid feedback
  - Value of learning how to operate a Grid as a facility
  - Value of building & sustaining community relationships
• Grids drive need for advanced optical networks
• Grids impact education and outreach
  - Providing technologies & resources for training, education, outreach
  - Addressing the Digital Divide
• OSG: a scalable computing infrastructure for science?
  - Strategies needed to cope with increasingly large scale
Grid Project References
• Open Science Grid: www.opensciencegrid.org
• Grid3: www.ivdgl.org/grid3
• Virtual Data Toolkit: www.griphyn.org/vdt
• GriPhyN: www.griphyn.org
• iVDGL: www.ivdgl.org
• PPDG: www.ppdg.net
• UltraLight: ultralight.cacr.caltech.edu
• Globus: www.globus.org
• Condor: www.cs.wisc.edu/condor
• LCG: www.cern.ch/lcg
• EU DataGrid: www.eu-datagrid.org
• EGEE: www.eu-egee.org
• CHEPREO: www.chepreo.org
Extra Slides
GriPhyN Goals
• Conduct CS research to achieve vision
  - Virtual Data as unifying principle
  - Planning, execution, performance monitoring
• Disseminate through Virtual Data Toolkit
  - A "concrete" deliverable
• Integrate into GriPhyN science experiments
  - Common Grid tools, services
• Educate, involve, train students in IT research
  - Undergrads, grads, postdocs
  - Underrepresented groups
iVDGL Goals
• Deploy a Grid laboratory
  - Support research mission of data intensive experiments
  - Provide computing and personnel resources at university sites
  - Provide platform for computer science technology development
  - Prototype and deploy a Grid Operations Center (iGOC)
• Integrate Grid software tools
  - Into computing infrastructures of the experiments
• Support delivery of Grid technologies
  - Hardening of the Virtual Data Toolkit (VDT) and other middleware technologies developed by GriPhyN and other Grid projects
• Education and Outreach
  - Lead and collaborate with Education and Outreach efforts
  - Provide tools and mechanisms for underrepresented groups and remote regions to participate in international science projects
CMS: Grid Enabled Analysis Architecture
• Clients talk standard protocols (HTTP, SOAP, XML-RPC) to the Clarens "Grid Services Web Server"
  - Typical clients: ROOT, Web browser, …
  - Client-side functions: discovery, ACL management, certificate-based access, ROOT (analysis tool), Python, Cojac (detector viz) / IGUANA (CMS viz)
• Simple Web service API allows simple or complex analysis clients
• Key features: global scheduler, catalogs, monitoring, Grid-wide execution service
• Clarens portal hides complexity
[Diagram: behind the Grid Services Web Server sit a scheduler (Sphinx), catalogs (metadata/RefDB, virtual data/Chimera, replica), fully-abstract, partially-abstract and fully-concrete planners, data management (MCRunjob, MOPDB, BOSS), monitoring (MonALISA), applications (ORCA, ROOT, FAMOS, POOL), an execution priority manager, and a VDT server providing Grid-wide execution.]
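Because the server speaks standard protocols, a client can be tiny. A hypothetical sketch using Python's built-in XML-RPC support; the endpoint URL and the analysis.submit method are placeholders, not the actual Clarens API:

import xmlrpc.client

# Hypothetical Clarens-style "Grid Services Web Server" endpoint.
server = xmlrpc.client.ServerProxy("https://clarens.example.edu:8443/clarens")

# Standard XML-RPC introspection, then an illustrative analysis request.
services = server.system.listMethods()
job_id = server.analysis.submit({"dataset": "/CMS/HiggsWW", "cuts": "pt>20"})
print(services, job_id)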
“Virtual Data”: Derivation & Provenance
• Most scientific data are not simple "measurements"
  - They are computationally corrected/reconstructed
  - They can be produced by numerical simulation
• Science & eng. projects are more CPU and data intensive
  - Programs are significant community resources (transformations)
  - So are the executions of those programs (derivations)
• Management of dataset dependencies critical!
  - Derivation: instantiation of a potential data product
  - Provenance: complete history of any existing data product
  - Previously: manual methods
  - GriPhyN: automated, robust tools
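A minimal sketch of why recording derivations pays off: with a catalog of (transformation, inputs) for each derived product, finding everything downstream of a bad input (say, a miscalibrated dataset) is mechanical. The catalog entries here are invented for illustration:

# derived dataset -> (transformation, [input datasets])
catalog = {
    "hits.db":   ("writeHits",    ["fz_file"]),
    "digis.db":  ("writeDigis",   ["hits.db"]),
    "muons.ana": ("muonAnalysis", ["digis.db", "muon_calib_v3"]),
}

def needs_recompute(bad_input, catalog):
    """Return every derived product that depends, directly or indirectly, on bad_input."""
    stale = set()
    changed = True
    while changed:
        changed = False
        for product, (_, inputs) in catalog.items():
            if product not in stale and any(i == bad_input or i in stale for i in inputs):
                stale.add(product)
                changed = True
    return stale

print(needs_recompute("muon_calib_v3", catalog))   # {'muons.ana'}
print(needs_recompute("fz_file", catalog))         # all three products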
Virtual Data Example: HEP Analysis
[Diagram: a tree of derived data products branching by selection: decay = bb; decay = WW (WW → leptons); decay = ZZ; mass = 160, decay = WW; each followed by other cuts. A scientist adds a new derived data branch (decay = WW, WW → e, Pt > 20, other cuts) & continues the analysis.]
Packaging of Grid Software: Pacman
• Language: define software environments
• Interpreter: create, install, configure, update, verify environments
• Version 3.0.2 released Jan. 2005
• Combine and manage software from arbitrary sources
  - LCG/Scram, ATLAS/CMT, CMS DPE/tar/make, LIGO/tar/make, Globus/GPT, NPACI/TeraGrid/tar/make, D0/UPS-UPD, Commercial/tar/make, OpenSource/tar/make
• "1 button install": reduce burden on administrators
  % pacman -get iVDGL:Grid3
• Remote experts define installation/config/updating for everyone at once
[Diagram: Pacman caches (iVDGL, VDT, UCHEP, LIGO, DZero, ATLAS, CMS/DPE, NPACI) feeding a single pacman install]
Virtual Data Motivations
"I’ve found some interesting data, but I need to know exactly what corrections were applied before I can trust it."
"I’ve detected a muon calibration error and want to know which derived data products need to be recomputed."
"I want to search a database for 3 muon events. If a program that does this analysis exists, I won’t have to write one from scratch."
"I want to apply a forward jet analysis to 100M events. If the results already exist, I’ll save weeks of computation."
[Diagram: the Virtual Data Catalog (VDC) supports describe, discover, reuse, and validate operations.]
Background: Data Grid Projects
Driven primarily by HEP applications

U.S. funded projects
• GriPhyN (NSF)
• iVDGL (NSF)
• Particle Physics Data Grid (DOE)
• UltraLight
• TeraGrid (NSF)
• DOE Science Grid (DOE)
• NEESgrid (NSF)
• NSF Middleware Initiative (NSF)

EU, Asia projects
• EGEE (EU)
• LCG (CERN)
• DataGrid
• EU national projects
• DataTAG (EU)
• CrossGrid (EU)
• GridLab (EU)
• Japanese, Korea projects

Many projects driven/led by HEP + CS
• Many 10s x $M brought into the field
• Large impact on other sciences, education
Muon Lifetime Analysis Workflow
(Early) Virtual Data Language
CMS "Pipeline": pythia_input → pythia.exe → cmsim_input → cmsim.exe → writeHits → writeDigis
begin v /usr/local/demo/scripts/cmkin_input.csh
file i ntpl_file_path
file i template_file
file i num_events
stdout cmkin_param_file
end
begin v /usr/local/demo/binaries/kine_make_ntpl_pyt_cms121.exe
pre
cms_env_var
stdin cmkin_param_file
stdout cmkin_log
file o ntpl_file
end
begin v /usr/local/demo/scripts/cmsim_input.csh
file i ntpl_file
file i fz_file_path
file i hbook_file_path
file i num_trigs
stdout cmsim_param_file
end
begin v /usr/local/demo/binaries/cms121.exe
condor copy_to_spool=false
condor getenv=true
stdin cmsim_param_file
stdout cmsim_log
file o fz_file
file o hbook_file
end
begin v /usr/local/demo/binaries/writeHits.sh
condor getenv=true
pre orca_hits
file i fz_file
file i detinput
file i condor_writeHits_log
file i oo_fd_boot
file i datasetname
stdout writeHits_log
file o hits_db
end
begin v /usr/local/demo/binaries/writeDigis.sh
pre orca_digis
file i hits_db
file i oo_fd_boot
file i carf_input_dataset_name
file i carf_output_dataset_name
file i carf_input_owner
file i carf_output_owner
file i condor_writeDigis_log
stdout writeDigis_log
file o digis_db
end
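Each begin v ... end block above is one transformation; its file i lines are inputs, its file o and stdout lines are outputs, and shared files chain the stages into a DAG. A small illustrative sketch of that dependency structure (stage names follow the pipeline above; the topological sort is generic):

# stage -> stages it depends on (via shared files), following the CMS pipeline above
pipeline = {
    "cmkin_input": [],
    "pythia":      ["cmkin_input"],
    "cmsim_input": ["pythia"],
    "cmsim":       ["cmsim_input"],
    "writeHits":   ["cmsim"],
    "writeDigis":  ["writeHits"],
}

def topological_order(dag):
    """Return the stages in an order that respects every dependency."""
    order, done = [], set()
    def visit(node):
        if node in done:
            return
        for dep in dag[node]:
            visit(dep)
        done.add(node)
        order.append(node)
    for node in dag:
        visit(node)
    return order

print(topological_order(pipeline))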
QuarkNet Portal Architecture
http://quarknet.uchicago.edu/elab/cosmic/home.jsp
• Simpler interface for non-experts
• Builds on Chiron portal
Integration of GriPhyN and iVDGL
• Both funded by NSF large ITRs, overlapping periods
  - GriPhyN: CS research, Virtual Data Toolkit (9/2000 - 9/2005)
  - iVDGL: Grid laboratory, applications (9/2001 - 9/2006)
• Basic composition
  - GriPhyN: 12 universities, SDSC, 4 labs (~80 people)
  - iVDGL: 18 institutions, SDSC, 4 labs (~100 people)
  - Expts: CMS, ATLAS, LIGO, SDSS/NVO
• GriPhyN (Grid research) vs iVDGL (Grid deployment)
  - GriPhyN: 2/3 "CS" + 1/3 "physics" (0% H/W)
  - iVDGL: 1/3 "CS" + 2/3 "physics" (20% H/W)
• Many common elements
  - Common Directors, Advisory Committee, linked management
  - Common Virtual Data Toolkit (VDT)
  - Common Grid testbeds
  - Common Outreach effort
GriPhyN Overview
[Diagram: researchers and production managers, with science review, compose requests over applications and instruments; virtual data (params, discovery, sharing, composition) flows through the Chimera virtual data system; planning (Pegasus planner, DAGman) and Grid services map work onto the Grid fabric (Globus Toolkit, Condor, Ganglia, etc.) and storage elements for production, execution, and analysis; together these pieces constitute the Virtual Data Toolkit.]
Chiron/QuarkNet Architecture
Cyberinfrastructure
“A new age has dawned in scientific & engineering research,
pushed by continuing progress in computing, information, and
communication technology, & pulled by the expanding
complexity, scope, and scale of today’s challenges. The
capacity of this technology has crossed thresholds that now
make possible a comprehensive “cyberinfrastructure” on which
to build new types of scientific & engineering knowledge
environments & organizations and to pursue research in new
ways & with increased efficacy.”
[NSF Blue Ribbon Panel report, 2003]
Fulfilling the Promise of
Next Generation Science
Our multidisciplinary partnership of physicists,
computer scientists, engineers, networking specialists
and education experts, from universities and
laboratories, has achieved tremendous success in
creating and maintaining general purpose
cyberinfrastructure supporting leading-edge science.
But these achievements have occurred in the context
of overlapping short-term projects. How can we
ensure the survival of valuable existing cyberinfrastructure while continuing to address new
challenges posed by frontier scientific and engineering
endeavors?
Production Simulations on Grid3
US-CMS Monte Carlo Simulation
Used = 1.5 x US-CMS resources
[Chart: cumulative simulated events vs. time, split between USCMS and non-USCMS resources]