Integrating Universities and Laboratories in National Cyberinfrastructure
Paul Avery
University of Florida
[email protected]
PASI Lecture
Mendoza, Argentina
May 17, 2005
PASI: Mendoza, Argentina (May 17, 2005)
Paul Avery
1
Outline of Talk
  Cyberinfrastructure and Grids
  Data intensive disciplines and Data Grids
  The Trillium Grid collaboration: GriPhyN, iVDGL, PPDG
  LHC and its computing challenges
  Grid3 and the Open Science Grid
  A bit on networks
  Education and Outreach
  Challenges for the future
  Summary
Presented from a physicist’s perspective!
Cyberinfrastructure (cont)
Software: programs, services, instruments, data, information, knowledge, applicable to specific projects, disciplines, and communities.
Cyberinfrastructure: layer of enabling hardware, algorithms, software, communications, institutions, and personnel. A platform that empowers researchers to innovate and eventually revolutionize what they do, how they do it, and who participates.
Base technologies: computation, storage, and communication components that continue to advance in raw capacity at exponential rates.
[Paraphrased from NSF Blue Ribbon Panel report, 2003]
Challenge: creating and operating advanced cyberinfrastructure and integrating it in science and engineering applications.
Cyberinfrastructure and Grids
Grid: geographically distributed computing resources configured for coordinated use
  Fabric: physical resources & networks provide raw capability
  Ownership: resources controlled by owners and shared w/ others
  Middleware: software ties it all together: tools, services, etc.
Enhancing collaboration via transparent resource sharing
  Example: the US-CMS “Virtual Organization”
Data Grids & Collaborative Research
Team-based 21st century scientific discovery
  Strongly dependent on advanced information technology
  People and resources distributed internationally
Dominant factor: data growth (1 Petabyte = 1000 TB)
  2000: ~0.5 Petabyte
  2005: ~10 Petabytes
  2010: ~100 Petabytes
  2015-7: ~1000 Petabytes?
  How to collect, manage, access and interpret this quantity of data?
Drives need for powerful linked resources: “Data Grids”
  Computation: massive, distributed CPU
  Data storage and access: distributed hi-speed disk and tape
  Data movement: international optical networks
Collaborative research and Data Grids
  Data discovery, resource sharing, distributed analysis, etc.
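The scale problem above can be made concrete with back-of-the-envelope arithmetic (a sketch; the 10 Gb/s link and full-utilization assumption are illustrative, not figures from the talk):

```python
# Back-of-the-envelope: how long does it take to move a petabyte?
# Assumes an ideal, fully utilized link (no protocol overhead).

def transfer_days(data_petabytes: float, link_gbps: float) -> float:
    """Days needed to move `data_petabytes` over a `link_gbps` link."""
    bits = data_petabytes * 1e15 * 8          # PB -> bits
    seconds = bits / (link_gbps * 1e9)        # bits / (bits per second)
    return seconds / 86400                    # seconds -> days

# Moving the ~10 PB expected around 2005 over a single 10 Gb/s optical link:
print(f"{transfer_days(10, 10):.0f} days")    # ~93 days at full utilization
```

Even an ideal 10 Gb/s link needs months per 10 PB, which is why Data Grids distribute storage and replicate data rather than ship everything to one place.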
Examples of Data Intensive Disciplines
High energy & nuclear physics
  Belle, BaBar, Tevatron, RHIC, JLAB
  Primary driver: Large Hadron Collider (LHC)
Astronomy
  Digital sky surveys, “Virtual” Observatories
  VLBI arrays: multiple-Gb/s data streams
Gravity wave searches
  LIGO, GEO, VIRGO, TAMA, ACIGA, …
Earth and climate systems
  Earth observation, climate modeling, oceanography, …
Biology, medicine, imaging
  Genome databases
  Proteomics (protein structure & interactions, drug delivery, …)
  High-resolution brain scans (1-10 µm, time dependent)
Our Vision & Goals
  Develop the end-to-end technologies & tools needed to exploit a Grid-based cyberinfrastructure
  Apply and evaluate those technologies & tools in challenging scientific problems
  Develop the technologies & procedures to support a permanent Grid-based cyberinfrastructure
  Create and operate a persistent Grid-based cyberinfrastructure in support of discipline-specific research goals

GriPhyN + iVDGL + DOE Particle Physics Data Grid (PPDG) = Trillium
Our Science Drivers
Experiments at Large Hadron Collider
  New fundamental particles and forces
  100s of Petabytes, 2007 – ?
High Energy & Nuclear Physics expts
  Top quark, nuclear matter at extreme density
  ~1 Petabyte (1000 TB), 1997 – present
LIGO (gravity wave search)
  Search for gravitational waves
  100s of Terabytes, 2002 – present
Sloan Digital Sky Survey
  Systematic survey of astronomical objects
  10s of Terabytes, 2001 – present
[Chart: data growth and community growth both rising over 2001 – 2009, with LHC at the high end]
Grid Middleware: Virtual Data Toolkit
NMI build & test process for the VDT:
  Sources (CVS) → Build & Test Condor pool (22+ Op. Systems) → Build → Binaries → Test
  Packaging & patching into a Pacman cache; RPMs; GPT src bundles
  Many contributors
A unique laboratory for testing, supporting, deploying, packaging, upgrading, & troubleshooting complex sets of software!
VDT Growth Over 3 Years
www.griphyn.org/vdt/
[Chart: number of VDT components, Jan-02 through Apr-05, growing from ~5 to ~35]
  VDT 1.0: Globus 2.0b, Condor 6.3.1
  VDT 1.1.7: switch to Globus 2.2
  VDT 1.1.8: first real use by LCG
  VDT 1.1.11: Grid3
  VDT 1.1.x, 1.2.x, 1.3.x series
Components of VDT 1.3.5
  Globus 3.2.1, Condor 6.7.6, RLS 3.0, ClassAds 0.9.7, Replica 2.2.4, DOE/EDG CA certs
  ftsh 2.0.5, EDG mkgridmap, EDG CRL Update, GLUE Schema 1.0, VDS 1.3.5b, Java
  Netlogger 3.2.4, Gatekeeper-Authz, MyProxy 1.11, KX509, System Profiler
  GSI OpenSSH 3.4, Monalisa 1.2.32, PyGlobus 1.0.6, MySQL, UberFTP 1.11, DRM 1.2.6a
  VOMS 1.4.0, VOMS Admin 0.7.5, Tomcat, PRIMA 0.2, Certificate Scripts, Apache
  jClarens 0.5.3, New GridFTP Server, GUMS 1.0.1
Collaborative Relationships: A CS + VDT Perspective
Computer Science Research → (techniques & software) → Virtual Data Toolkit → (tech transfer) → Larger Science Community
  Inputs: requirements, prototyping & experiments
  Partner science projects: EU DataGrid, LHC Experiments
  Partner Grid & networking projects: Globus, Condor, NMI, iVDGL, PPDG
  Partner outreach projects: QuarkNet, CHEPREO, Dig. Divide
  Other linkages: work force, CS researchers, industry, U.S. Grids, int’l outreach, production deployment
U.S. “Trillium” Grid Partnership
Trillium = PPDG + GriPhyN + iVDGL
  Particle Physics Data Grid: $12M (DOE) (1999 – 2006)
  GriPhyN: $12M (NSF) (2000 – 2005)
  iVDGL: $14M (NSF) (2001 – 2006)
Basic composition (~150 people)
  PPDG: 4 universities, 6 labs
  GriPhyN: 12 universities, SDSC, 3 labs
  iVDGL: 18 universities, SDSC, 4 labs, foreign partners
  Expts: BaBar, D0, STAR, Jlab, CMS, ATLAS, LIGO, SDSS/NVO
Coordinated internally to meet broad goals
  GriPhyN: CS research, Virtual Data Toolkit (VDT) development
  iVDGL: Grid laboratory deployment using VDT, applications
  PPDG: “End to end” Grid services, monitoring, analysis
  Common use of VDT for underlying Grid middleware
  Unified entity when collaborating internationally
Goal: Peta-scale Data Grids for Global Science
  Users: production teams, workgroups, single researchers
  Tools: interactive user tools; virtual data tools; request planning & scheduling tools; request execution & management tools
  Services: resource management services; security and policy services; other Grid services
  Performance: PetaOps, Petabytes
  Fabric: transforms, raw data sources, distributed resources (code, storage, CPUs, networks)
Sloan Digital Sky Survey (SDSS): Using Virtual Data in GriPhyN
[Chart: Sloan data, galaxy cluster size distribution; number of clusters (1 to 100000, log scale) vs. number of galaxies (1 to 100, log scale)]
The LIGO Scientific Collaboration (LSC) and the LIGO Grid
LIGO Grid: 6 US sites + 3 EU sites (Birmingham & Cardiff/UK, AEI/Golm, Germany)
  iVDGL has enabled LSC to establish a persistent production grid
  LHO, LLO: observatory sites
  LSC (LIGO Scientific Collaboration) sites: iVDGL supported
Large Hadron Collider & its Frontier Computing Challenges
Large Hadron Collider (LHC) @ CERN
27 km tunnel in Switzerland & France (2007 – ?)
Experiments: ATLAS, CMS, ALICE, LHCb, TOTEM
Search for
  Origin of mass
  New fundamental forces
  Supersymmetry
  Other new particles
CMS: “Compact” Muon Solenoid
Inconsequential humans (shown for scale)
LHC Data Rates: Detector to Storage
Physics filtering chain:
  Collisions at 40 MHz: ~TBytes/sec
  Level 1 Trigger (special hardware): 75 KHz, 75 GB/sec
  Level 2 Trigger (commodity CPUs): 5 KHz, 5 GB/sec
  Level 3 Trigger (commodity CPUs): 100 Hz, 0.15 – 1.5 GB/sec raw data to storage (+ simulated data)
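These rates imply the per-level data volumes by simple multiplication; the sketch below assumes a nominal ~1 MB raw event size (a typical LHC ballpark, not a figure stated on the slide):

```python
# Trigger cascade: event rate times nominal event size gives data rate.
# EVENT_SIZE_MB is an assumed ballpark, not a number from the talk; the
# slide's 0.15 - 1.5 GB/sec at Level 3 reflects larger stored events.
EVENT_SIZE_MB = 1.0

levels = [
    ("Level 1 (special hardware)", 75_000),   # 75 KHz
    ("Level 2 (commodity CPUs)",    5_000),   # 5 KHz
    ("Level 3 (commodity CPUs)",      100),   # 100 Hz, raw data to storage
]

for name, rate_hz in levels:
    gb_per_s = rate_hz * EVENT_SIZE_MB / 1000   # MB/s -> GB/s
    print(f"{name:28s} {rate_hz:>7,} Hz  ~{gb_per_s:.2f} GB/s")

# Overall rate reduction achieved by the trigger chain, 40 MHz -> 100 Hz:
reduction = 40_000_000 // 100
print(f"trigger reduction: 1 in {reduction:,}")   # 1 in 400,000
```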
Complexity: Higgs Decay to 4 Muons
(+30 minimum bias events)
All charged tracks with pt > 2 GeV
Reconstructed tracks with pt > 25 GeV
10^9 collisions/sec, selectivity: 1 in 10^13
LHC: Petascale Global Science
  Complexity: millions of individual detector channels
  Scale: PetaOps (CPU), 100s of Petabytes (data)
  Distribution: global distribution of people & resources
BaBar/D0 example (2004): 700+ physicists, 100+ institutes, 35+ countries
CMS example (2007): 5000+ physicists, 250+ institutes, 60+ countries
LHC Global Data Grid (2007+)
  5000 physicists, 60 countries
  10s of Petabytes/yr by 2008
  1000 Petabytes in < 10 yrs?
CMS Experiment: Online System → Tier 0 (CERN Computer Center) at 150 - 1500 MB/s
  Tier 1 (10-40 Gb/s links): Korea, Russia, UK, USA
  Tier 2 (>10 Gb/s links): U Florida, Caltech, UCSD
  Tier 3 (2.5-10 Gb/s links): FIU, Iowa, Maryland
  Tier 4: physics caches, PCs
University Tier2 Centers
Tier2 facility
  Essential university role in extended computing infrastructure
  20 – 25% of Tier1 national laboratory, supported by NSF
  Validated by 3 years of experience (CMS, ATLAS, LIGO)
Functions
  Perform physics analysis, simulations
  Support experiment software
  Support smaller institutions
Official role in Grid hierarchy (U.S.)
  Sanctioned by MOU with parent organization (ATLAS, CMS, LIGO)
  Selection by collaboration via careful process
  Local P.I. with reporting responsibilities
Grids and Globally Distributed Teams
Non-hierarchical: Chaotic analyses + productions
Superimpose significant random data flows
Grid3 and the Open Science Grid
Grid3: A National Grid Infrastructure
  32 sites, 4000 CPUs: universities + 4 national labs
  Part of LHC Grid, running since October 2003
  Sites in US, Korea, Brazil, Taiwan
  Applications in HEP, LIGO, SDSS, Genomics, fMRI, CS
http://www.ivdgl.org/grid3
Grid3 World Map
Grid3 Components
Computers & storage at ~30 sites: 4000 CPUs
Uniform service environment at each site
  Globus Toolkit: provides basic authentication, execution management, data movement
  Pacman: installs numerous other VDT and application services
Global & virtual organization services
  Certification & registration authorities, VO membership services, monitoring services
Client-side tools for data access & analysis
  Virtual data, execution planning, DAG management, execution management, monitoring
IGOC: iVDGL Grid Operations Center
Grid testbed: Grid3dev
  Middleware development and testing, new VDT versions, etc.
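The “DAG management” above refers to Condor DAGMan-style workflow handling: jobs run only after the jobs they depend on finish. A minimal sketch of the idea (illustrative only, not the DAGMan API; the pipeline names are invented):

```python
# Minimal DAG workflow runner: release each job only after its parents
# finish, as DAGMan-style execution management does (illustrative only).
from collections import deque

def run_dag(parents: dict[str, list[str]]) -> list[str]:
    """Return an execution order respecting job dependencies."""
    children = {job: [] for job in parents}
    pending = {}
    for job, deps in parents.items():
        pending[job] = len(deps)
        for dep in deps:
            children[dep].append(job)
    ready = deque(job for job, n in pending.items() if n == 0)
    order = []
    while ready:
        job = ready.popleft()
        order.append(job)             # "submit" the job here
        for child in children[job]:
            pending[child] -= 1
            if pending[child] == 0:
                ready.append(child)
    if len(order) != len(parents):
        raise RuntimeError("cycle in DAG")
    return order

# A toy CMS-style pipeline: generate -> simulate -> digitize -> analyze
dag = {"generate": [], "simulate": ["generate"],
       "digitize": ["simulate"], "analyze": ["digitize"]}
print(run_dag(dag))  # ['generate', 'simulate', 'digitize', 'analyze']
```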
Grid3 Applications
  CMS experiment: p-p collision simulations & analysis
  ATLAS experiment: p-p collision simulations & analysis
  BTeV experiment: p-p collision simulations & analysis
  LIGO: search for gravitational wave sources
  SDSS: galaxy cluster finding
  Bio-molecular analysis: Shake n Bake (SnB) (Buffalo)
  Genome analysis: GADU/Gnare
  fMRI: functional MRI (Dartmouth)
  CS demonstrators: job exerciser, GridFTP, NetLogger
www.ivdgl.org/grid3/applications
Usage: CPUs
Grid3 Shared Use Over 6 Months
[Chart: shared CPU usage by VO, with peaks for ATLAS DC2 and CMS DC04; spike around Sep 10]
Grid3 Production Over 13 Months
U.S. CMS 2003 Production
  10M p-p collisions; largest ever
  2x simulation sample, ½ manpower
  Multi-VO sharing
Grid3 as CS Research Lab: E.g., Adaptive Scheduling
  Adaptive data placement in a realistic environment (K. Ranganathan)
  Enables comparisons with simulations
[Charts: total response time (seconds), total I/O traffic (MBytes), and average number of idle CPUs for three placement policies (Least Loaded, At Data, Cost Function), measured 22-Jan through 28-Jan across Grid3 sites: FNAL_CMS, IU_ATLAS_Tier2, CalTech-PG, CalTech-Grid3, ANL_HEP, Vanderbilt, UM_ATLAS, UFlorida-Grid3, UFlorida-PG, UCSanDiegoPG, UBuffalo-CCR, BNL_ATLAS]
Grid3 Lessons Learned
How to operate a Grid as a facility
  Tools, services, error recovery, procedures, docs, organization
  Delegation of responsibilities (project, VO, service, site, …)
  Crucial role of Grid Operations Center (GOC)
How to support people-people relations
  Face-face meetings, phone cons, 1-1 interactions, mail lists, etc.
How to test and validate Grid tools and applications
  Vital role of testbeds
How to scale algorithms, software, process
  Some successes, but “interesting” failure modes still occur
How to apply distributed cyberinfrastructure
  Successful production runs for several applications
Grid3 → Open Science Grid
Iteratively build & extend Grid3
  Grid3 → OSG-0 → OSG-1 → OSG-2 → …
  Shared resources, benefiting broad set of disciplines
  Grid middleware based on Virtual Data Toolkit (VDT)
  Emphasis on “end to end” services for applications
OSG collaboration
  Computer and application scientists
  Facility, technology and resource providers (labs, universities)
Further develop OSG
  Partnerships and contributions from other sciences, universities
  Incorporation of advanced networking
  Focus on general services, operations, end-to-end performance
Aim for Summer 2005 deployment
http://www.opensciencegrid.org
OSG Organization
  Advisory Committee
  OSG Council (all members above a certain threshold; Chair, officers)
  Executive Board (8-15 representatives; Chair, Officers)
  Core OSG Staff (few FTEs, manager)
  Technical Groups and Activities (activity 1, activity 2, …)
  Participants: universities, labs, sites, researchers, VOs, service providers, research Grid projects, enterprise
OSG Technical Groups & Activities
Technical Groups address and coordinate technical areas
  Propose and carry out activities related to their given areas
  Liaise & collaborate with other peer projects (U.S. & international)
  Participate in relevant standards organizations
  Chairs participate in Blueprint, Integration and Deployment activities
Activities are well-defined, scoped tasks contributing to OSG
  Each Activity has deliverables and a plan
  … is self-organized and operated
  … is overseen & sponsored by one or more Technical Groups
TGs and Activities are where the real work gets done
OSG Technical Groups
  Governance: charter, organization, by-laws, agreements, formal processes
  Policy: VO & site policy, authorization, priorities, privilege & access rights
  Security: common security principles, security infrastructure
  Monitoring and Information Services: resource monitoring, information services, auditing, troubleshooting
  Storage: storage services at remote sites, interfaces, interoperability
  Support Centers: infrastructure and services for user support, helpdesk, trouble ticket
  Education / Outreach: training, interface with various E/O projects
  Networks (new): including interfacing with various networking projects
OSG Activities
  Blueprint: defining principles and best practices for OSG
  Deployment: deployment of resources & services
  Provisioning: connected to deployment
  Incidence response: plans and procedures for responding to security incidents
  Integration: testing & validating & integrating new services and technologies
  Data Resource Management (DRM): deployment of specific Storage Resource Management technology
  Documentation: organizing the documentation infrastructure
  Accounting: accounting and auditing use of OSG resources
  Interoperability: primarily interoperability between Grids
  Operations: operating Grid-wide services
Connections to European Projects:
LCG and EGEE
The Path to the OSG Operating Grid
OSG Integration Activity
  Readiness plan; software & packaging; functionality & scalability tests; application validation; middleware interoperability; feedback to releases
OSG Deployment Activity
  Readiness plan adopted; release description; release candidate; metrics & certification
OSG Operations-Provisioning Activity
  Service deployment; resources; effort; VO application software installation
OSG Integration Testbed
[Map: integration testbed sites, including Brazil]
Status of OSG Deployment
OSG infrastructure release accepted for deployment.
US CMS MOP “flood testing” successful
D0 simulation & reprocessing jobs running on selected OSG sites
Others in various stages of readying applications & infrastructure
(ATLAS, CMS, STAR, CDF, BaBar, fMRI)
Deployment process underway: End of July?
Open OSG and transition resources from Grid3
Applications will use growing ITB & OSG resources during
transition
http://osg.ivdgl.org/twiki/bin/view/Integration/WebHome
Interoperability & Federation
Transparent use of federated Grid infrastructures a goal
  There are sites that appear as part of “LCG” as well as part of OSG/Grid3
  D0 bringing reprocessing to LCG sites through adaptor node
  CMS and ATLAS can run their jobs on both LCG and OSG
Increasing interaction with TeraGrid
  CMS and ATLAS sample simulation jobs are running on TeraGrid
  Plans for TeraGrid allocation for jobs running in Grid3 model: with group accounts, binary distributions, external data management, etc.
Networks
Evolving Science Requirements for Networks
(DOE High Perf. Network Workshop)

Science Area                 | End2End Throughput Today     | 5 Years                           | 5-10 Years                          | Remarks
High Energy Physics          | 0.5 Gb/s                     | 100 Gb/s                          | 1000 Gb/s                           | high bulk throughput
Climate (Data & Computation) | 0.5 Gb/s                     | 160-200 Gb/s                      | N x 1000 Gb/s                       | high bulk throughput
SNS NanoScience              | not yet started              | 1 Gb/s                            | 1000 Gb/s + QoS for control channel | remote control and time critical throughput
Fusion Energy                | 0.066 Gb/s (500 MB/s burst)  | 0.2 Gb/s (500 MB / 20 sec. burst) | N x 1000 Gb/s                       | time critical throughput
Astrophysics                 | 0.013 Gb/s (1 TB/week)       | N*N multicast                     | 1000 Gb/s                           | computational steering and collaborations
Genomics Data & Computation  | 0.091 Gb/s (1 TB/day)        | 100s of users                     | 1000 Gb/s + QoS for control channel | high throughput and steering
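The parenthesized “today” figures convert consistently to the table’s Gb/s values (simple arithmetic; 1 TB = 10^12 bytes assumed):

```python
# Sanity-check the table's unit conversions (assuming 1 TB = 1e12 bytes).
def gbps(bytes_moved: float, seconds: float) -> float:
    """Average rate in Gb/s to move `bytes_moved` bytes in `seconds`."""
    return bytes_moved * 8 / seconds / 1e9

DAY, WEEK = 86_400, 7 * 86_400
print(f"1 TB/day  = {gbps(1e12, DAY):.3f} Gb/s")   # ~0.093 (table: 0.091 Gb/s)
print(f"1 TB/week = {gbps(1e12, WEEK):.3f} Gb/s")  # ~0.013 (table: 0.013 Gb/s)
```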
See http://www.doecollaboratory.org/meetings/hpnpw/
UltraLight: Advanced Networking in Applications
Funded by ITR2004
10 Gb/s+ network
  Caltech, UF, FIU, UM, MIT
  SLAC, FNAL
  Int’l partners
  Level(3), Cisco, NLR
UltraLight: New Information System
A new class of integrated information systems
  Includes networking as a managed resource for the first time
  Uses “hybrid” packet-switched and circuit-switched optical network infrastructure
  Monitor, manage & optimize network and Grid systems in real time
Flagship applications: HEP, eVLBI, “burst” imaging
  “Terabyte-scale” data transactions in minutes
  Extend real-time eVLBI to the 10 – 100 Gb/s range
Powerful testbed
  Significant storage, optical networks for testing new Grid services
Strong vendor partnerships
  Cisco, Calient, NLR, CENIC, Internet2/Abilene
Education and Outreach
iVDGL, GriPhyN Education/Outreach
Basics
  $200K/yr
  Led by UT Brownsville
  Workshops, portals, tutorials
  New partnerships with QuarkNet, CHEPREO, LIGO E/O, …
June 2004 Grid Summer School
First of its kind in the U.S. (South Padre Island, Texas)
  36 students, diverse origins and types (M, F, MSIs, etc.)
Marks new direction for U.S. Grid efforts
  First attempt to systematically train people in Grid technologies
  First attempt to gather relevant materials in one place
  Today: students in CS and physics
  Next: students, postdocs, junior & senior scientists
Reaching a wider audience
  Put lectures, exercises, video on the web
  More tutorials, perhaps 2-3/year
  Dedicated resources for remote tutorials
  Create “Grid Cookbook”, e.g. Georgia Tech
Second workshop: July 11–15, 2005
  South Padre Island again
QuarkNet/GriPhyN e-Lab Project
http://quarknet.uchicago.edu/elab/cosmic/home.jsp
Student Muon Lifetime Analysis in GriPhyN/QuarkNet
CHEPREO: Center for High Energy Physics Research and Educational Outreach
Florida International University
  Physics Learning Center
  CMS research
  iVDGL Grid activities
  AMPATH network (S. America)
Funded September 2003
  $4M initially (3 years)
  MPS, CISE, EHR, INT
Grids and the Digital Divide
Rio de Janeiro, Feb. 16-20, 2004
Background
  World Summit on Information Society
  HEP Standing Committee on Inter-regional Connectivity (SCIC)
Themes
  Global collaborations, Grids and addressing the Digital Divide
  Focus on poorly connected regions
Next meeting: Daegu, Korea, May 23-27, 2005
Partnerships Drive Success
Integrating Grids in scientific research
  “Lab-centric”: activities center around large facility
  “Team-centric”: resources shared by distributed teams
  “Knowledge-centric”: knowledge generated/used by a community
Strengthening the role of universities in frontier research
  Couples universities to frontier data intensive research
  Brings front-line research and resources to students
  Exploits intellectual resources at minority or remote institutions
Driving advances in IT/science/engineering
  Domain sciences ↔ computer science
  Universities ↔ laboratories
  Scientists ↔ students
  NSF projects ↔ NSF projects
  NSF ↔ DOE
  Research communities ↔ IT industry
Fulfilling the Promise of Next Generation Science
Supporting permanent, national-scale Grid infrastructure
  Large CPU, storage and network capability crucial for science
  Support personnel, equipment maintenance, replacement, upgrade
  Tier1 and Tier2 resources a vital part of infrastructure
  Open Science Grid a unique national infrastructure for science
Supporting the maintenance, testing and dissemination of advanced middleware
  Long-term support of the Virtual Data Toolkit
  Vital for reaching new disciplines & for supporting large international collaborations
Continuing support for HEP as a frontier challenge driver
  Huge challenges posed by LHC global interactive analysis
  New challenges posed by remote operation of Global Accelerator Network
Fulfilling the Promise (2)
Creating even more advanced cyberinfrastructure
  Integrating databases in large-scale Grid environments
  Interactive analysis with distributed teams
  Partnerships involving CS research with application drivers
Supporting the emerging role of advanced networks
  Reliable, high performance LANs and WANs necessary for advanced Grid applications
Partnering to enable stronger, more diverse programs
  Programs supported by multiple Directorates, a la CHEPREO
  NSF-DOE joint initiatives
  Strengthen ability of universities and labs to work together
Providing opportunities for cyberinfrastructure training, education & outreach
  Grid tutorials, Grid Cookbook
  Collaborative tools for student-led projects & research
Summary
Grids enable 21st century collaborative science
  Linking research communities and resources for scientific discovery
  Needed by global collaborations pursuing “petascale” science
Grid3 was an important first step in developing US Grids
  Value of planning, coordination, testbeds, rapid feedback
  Value of learning how to operate a Grid as a facility
  Value of building & sustaining community relationships
Grids drive need for advanced optical networks
Grids impact education and outreach
  Providing technologies & resources for training, education, outreach
  Addressing the Digital Divide
OSG: a scalable computing infrastructure for science?
  Strategies needed to cope with increasingly large scale
Grid Project References
  Open Science Grid     www.opensciencegrid.org
  Grid3                 www.ivdgl.org/grid3
  Virtual Data Toolkit  www.griphyn.org/vdt
  GriPhyN               www.griphyn.org
  iVDGL                 www.ivdgl.org
  PPDG                  www.ppdg.net
  UltraLight            ultralight.cacr.caltech.edu
  Globus                www.globus.org
  Condor                www.cs.wisc.edu/condor
  LCG                   www.cern.ch/lcg
  EU DataGrid           www.eu-datagrid.org
  EGEE                  www.eu-egee.org
  CHEPREO               www.chepreo.org
Extra Slides
GriPhyN Goals
Conduct CS research to achieve vision
  Virtual Data as unifying principle
  Planning, execution, performance monitoring
Disseminate through Virtual Data Toolkit
  A “concrete” deliverable
Integrate into GriPhyN science experiments
  Common Grid tools, services
Educate, involve, train students in IT research
  Undergrads, grads, postdocs
  Underrepresented groups
iVDGL Goals
Deploy a Grid laboratory
  Support research mission of data intensive experiments
  Provide computing and personnel resources at university sites
  Provide platform for computer science technology development
  Prototype and deploy a Grid Operations Center (iGOC)
Integrate Grid software tools
  Into computing infrastructures of the experiments
Support delivery of Grid technologies
  Hardening of the Virtual Data Toolkit (VDT) and other middleware technologies developed by GriPhyN and other Grid projects
Education and Outreach
  Lead and collaborate with Education and Outreach efforts
  Provide tools and mechanisms for underrepresented groups and remote regions to participate in international science projects
CMS: Grid Enabled Analysis Architecture
Analysis clients (typical clients: ROOT, web browser, …) talk standard protocols (HTTP, SOAP, XML-RPC) to a “Grid Services Web Server” (Clarens)
  Simple web service API allows simple or complex analysis clients
  Clarens portal hides complexity
  Client capabilities: discovery, ACL management, cert.-based access; ROOT (analysis tool), Python, Cojac (detector viz), IGUANA (CMS viz)
Services behind the web server
  Scheduler: Sphinx; job preparation: MCRunjob; batch & monitoring: BOSS, MonALISA
  Catalogs: metadata (RefDB), virtual data (Chimera), replica; data management: MOPDB
  Planners: fully-abstract → partially-abstract → fully-concrete
  Applications: ORCA, FAMOS, ROOT, POOL
  Execution: priority manager, Grid-wide execution service (VDT-Server)
Key features: global scheduler, catalogs, monitoring, Grid-wide execution service
“Virtual Data”: Derivation & Provenance
Most scientific data are not simple “measurements”
  They are computationally corrected/reconstructed
  They can be produced by numerical simulation
Science & eng. projects are more CPU and data intensive
  Programs are significant community resources (transformations)
  So are the executions of those programs (derivations)
Management of dataset dependencies critical!
  Derivation: instantiation of a potential data product
  Provenance: complete history of any existing data product
  Previously: manual methods
  GriPhyN: automated, robust tools
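The derivation/provenance bookkeeping can be sketched in a few lines (a toy model, not the actual Chimera/Virtual Data Catalog schema; the catalog contents reuse names from the VDL pipeline slide, and `my_cuts` is an invented transformation):

```python
# Toy provenance catalog: each derived product records the program
# (transformation) and inputs that produced it (its derivation).
derivations = {
    # product: (transformation, [inputs])
    "ntuple":   ("pythia.exe", ["pythia_input"]),
    "hits":     ("cmsim.exe",  ["ntuple"]),
    "digis":    ("writeDigis", ["hits"]),
    "analysis": ("my_cuts",    ["digis"]),   # invented for illustration
}

def provenance(product: str) -> list[str]:
    """Complete derivation history of an existing data product."""
    if product not in derivations:
        return []                            # raw input, no history
    transform, inputs = derivations[product]
    history = [f"{product} <- {transform}({', '.join(inputs)})"]
    for inp in inputs:
        history += provenance(inp)
    return history

def downstream(changed: str) -> set[str]:
    """Products that must be recomputed if `changed` is corrected."""
    out = set()
    for product, (_, inputs) in derivations.items():
        if changed in inputs:
            out |= {product} | downstream(product)
    return out

print(provenance("digis"))
print(downstream("ntuple"))   # {'hits', 'digis', 'analysis'}
```

This is exactly the “muon calibration error” use case on the Virtual Data Motivations slide: `downstream` answers which derived products a correction invalidates.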
Virtual Data Example: HEP Analysis
[Derivation tree: a simulated sample (mass = 160) branches on decay mode (decay = bb, decay = WW, decay = ZZ), then on decay products (WW → leptons, WW → e…), then on cuts (Pt > 20, other cuts)]
Scientist adds a new derived data branch & continues analysis
Packaging of Grid Software: Pacman
  Language: define software environments
  Interpreter: create, install, configure, update, verify environments
  Version 3.0.2 released Jan. 2005
Combine and manage software from arbitrary sources
  LCG/Scram, ATLAS/CMT, CMS DPE/tar/make, LIGO/tar/make, Globus/GPT, NPACI/TeraGrid/tar/make, D0/UPS-UPD, Commercial/tar/make, OpenSource/tar/make
“1 button install”: reduce burden on administrators
  % pacman -get iVDGL:Grid3
  Remote experts define installation/config/updating for everyone at once
  Caches: iVDGL, VDT, UCHEP, LIGO, DZero, ATLAS, CMS/DPE, NPACI
Virtual Data Motivations
“I’ve found some interesting data, but I need to know exactly what corrections were applied before I can trust it.”
“I’ve detected a muon calibration error and want to know which derived data products need to be recomputed.”
“I want to search a database for 3 muon events. If a program that does this analysis exists, I won’t have to write one from scratch.”
“I want to apply a forward jet analysis to 100M events. If the results already exist, I’ll save weeks of computation.”
The Virtual Data Catalog (VDC) supports all four: describe, discover, reuse, validate
Background: Data Grid Projects
Driven primarily by HEP applications
U.S. funded projects
  GriPhyN (NSF), iVDGL (NSF), Particle Physics Data Grid (DOE), UltraLight, TeraGrid (NSF), DOE Science Grid (DOE), NEESgrid (NSF), NSF Middleware Initiative (NSF)
EU, Asia projects
  EGEE (EU), LCG (CERN), DataGrid (EU), EU national projects, DataTAG (EU), CrossGrid (EU), GridLab (EU), Japanese and Korean projects
Many projects driven/led by HEP + CS
Many 10s x $M brought into the field
Large impact on other sciences, education
Muon Lifetime Analysis Workflow
(Early) Virtual Data Language
CMS “Pipeline”: pythia_input → pythia.exe → cmsim_input → cmsim.exe → writeHits → writeDigis
begin v /usr/local/demo/scripts/cmkin_input.csh
file i ntpl_file_path
file i template_file
file i num_events
stdout cmkin_param_file
end
begin v /usr/local/demo/binaries/kine_make_ntpl_pyt_cms121.exe
pre cms_env_var
stdin cmkin_param_file
stdout cmkin_log
file o ntpl_file
end
begin v /usr/local/demo/scripts/cmsim_input.csh
file i ntpl_file
file i fz_file_path
file i hbook_file_path
file i num_trigs
stdout cmsim_param_file
end
begin v /usr/local/demo/binaries/cms121.exe
condor copy_to_spool=false
condor getenv=true
stdin cmsim_param_file
stdout cmsim_log
file o fz_file
file o hbook_file
end
begin v /usr/local/demo/binaries/writeHits.sh
condor getenv=true
pre orca_hits
file i fz_file
file i detinput
file i condor_writeHits_log
file i oo_fd_boot
file i datasetname
stdout writeHits_log
file o hits_db
end
begin v /usr/local/demo/binaries/writeDigis.sh
pre orca_digis
file i hits_db
file i oo_fd_boot
file i carf_input_dataset_name
file i carf_output_dataset_name
file i carf_input_owner
file i carf_output_owner
file i condor_writeDigis_log
stdout writeDigis_log
file o digis_db
end
QuarkNet Portal Architecture
Simpler interface for non-experts
Builds on Chiron portal
Integration of GriPhyN and iVDGL
Both funded by NSF large ITRs, overlapping periods
  GriPhyN: CS research, Virtual Data Toolkit (9/2000–9/2005)
  iVDGL: Grid laboratory, applications (9/2001–9/2006)
Basic composition
  GriPhyN: 12 universities, SDSC, 4 labs (~80 people)
  iVDGL: 18 institutions, SDSC, 4 labs (~100 people)
  Expts: CMS, ATLAS, LIGO, SDSS/NVO
GriPhyN (Grid research) vs iVDGL (Grid deployment)
  GriPhyN: 2/3 “CS” + 1/3 “physics” (0% H/W)
  iVDGL: 1/3 “CS” + 2/3 “physics” (20% H/W)
Many common elements
  Common directors, Advisory Committee, linked management
  Common Virtual Data Toolkit (VDT)
  Common Grid testbeds
  Common Outreach effort
GriPhyN Overview
Researcher: composition, params, data & exec. discovery, sharing; science review closes the loop
Applications and instruments feed the Chimera virtual data system
Planning: Pegasus planner + DAGman produce executable plans; production manager coordinates
Execution on Grid fabric: Globus Toolkit, Condor, Ganglia, etc., with storage elements providing Grid services
All delivered through the Virtual Data Toolkit
Chiron/QuarkNet Architecture
Cyberinfrastructure
“A new age has dawned in scientific & engineering research,
pushed by continuing progress in computing, information, and
communication technology, & pulled by the expanding
complexity, scope, and scale of today’s challenges. The
capacity of this technology has crossed thresholds that now
make possible a comprehensive “cyberinfrastructure” on which
to build new types of scientific & engineering knowledge
environments & organizations and to pursue research in new
ways & with increased efficacy.”
[NSF Blue Ribbon Panel report, 2003]
Fulfilling the Promise of
Next Generation Science
Our multidisciplinary partnership of physicists,
computer scientists, engineers, networking specialists
and education experts, from universities and
laboratories, has achieved tremendous success in
creating and maintaining general purpose
cyberinfrastructure supporting leading-edge science.
But these achievements have occurred in the context
of overlapping short-term projects. How can we
ensure the survival of valuable existing cyberinfrastructure while continuing to address new
challenges posed by frontier scientific and engineering
endeavors?
Production Simulations on Grid3
US-CMS Monte Carlo Simulation
  Used ≈ 1.5 × dedicated US-CMS resources
[Chart: USCMS vs. Non-USCMS contributions over time]