
CyberInfrastructure and the Office of CyberInfrastructure (OCI)
Presented to the SURA Information Technology Committee Meeting

José Muñoz, Ph.D.
Deputy Office Director and Senior Science Advisor
Office of CyberInfrastructure
Outline
• NSF organizational changes
• CyberInfrastructure (CI) at NSF
• Strategic Planning
• New HPC Acquisition
• Related OCI activities
• Summary
CyberInfrastructure (CI) Governance
• Dr. Dan Atkins (University of Michigan) selected as new OCI Director
  • chaired NSF's Blue-Ribbon Advisory Panel on Cyberinfrastructure, which produced the 2003 report Revolutionizing Science and Engineering Through Cyberinfrastructure
  • tenure begins June 2006
• CyberInfrastructure Council (CIC) created
  • NSF ADs and ODs, chaired by Dr. Bement (NSF Director)
  • CIC responsible for shared stewardship and ownership of NSF's cyberinfrastructure portfolio
• Advisory Committee for NSF's CI activities and portfolio created
  • candidate members have been identified
  • first meeting June 2006?
CI Governance
• NSF High-End Computing Coordinating Group
  • representatives from each NSF Directorate
  • chaired by OCI
  • meets every other week
• SCI-to-OCI realignment
  • SCI was moved from CISE to the Office of the Director and became the Office of CyberInfrastructure (OCI)
    • budget transferred
    • ongoing projects and personnel transferred
  • OCI is focused on provision of "production-quality" CI for research and education
  • CISE remains focused on its basic CS research and education mission
    • CI software and hardware
    • collaboration with other NSF Directorates
• CI Strategic Planning process underway
  • NSF CyberInfrastructure Vision document
Office of CyberInfrastructure
[Organization chart]
• Office Director (Interim): Debra Crawford
• Deputy Office Director: José Muñoz
• Administrative staff: Judy Hayden, Priscilla Bezdek, Mary Daley, Irene Lombardo, Allison Smith
• Program Directors: Guy Almes, Doug Gatchell, Miriam Heller, Fillia Makedon, Steve Meacham, Kevin Thompson, Frank Scioli (SBE POC), Vittal Rao, plus a vacancy (Software)
• Portfolio shown on the chart: ETF Resource Provider awards (ANL, IU, Purdue, ORNL, TACC, SDSC, NCSA, PSC), SDSC and NCSA Core, ETF GIG, HPC Acquisition, MRI, REU Sites, STI, NMI Development and Integration, CCG NMI, CyberSecurity, CI-TEAM, EPSCoR, GriPhyN, DISUN, EIN, IRNC, Condor, OptIPuter, SBE CyberTools
CI Vision Document
Four interrelated plans:
• High Performance Computing
• Data, Data Analysis & Visualization
• Collaboratories, Observatories & Virtual Organizations
• Learning & Workforce Development
Strategic Plan (FY 2006 – 2010)
• Ch. 1: Call to Action
• Strategic plans for:
  • Ch. 2: High Performance Computing
  • Ch. 3: Data, Data Analysis & Visualization
  • Ch. 4: Collaboratories, Observatories and Virtual Organizations
  • Ch. 5: Learning and Workforce Development
• Completed in Summer 2006
CyberInfrastructure Budgets
• OCI Budget: $127M (FY06); FY07 request: $182.42M
• [Pie charts: NSF FY 2006 CI budget split between OCI and the research directorates, and the OCI budget by program area: HPC Acquisition, ETF, CORE, NMI, TESTBEDS, IRNC, CI-TEAM]
• Chart note: HPC hardware acquisitions, O&M, and user support as a fraction of NSF's overall CI budget
Recent CI Activities
• HPC Solicitation released September 27, 2005
  • performance benchmarks identified November 9, 2005
  • proposals were due February 10, 2006
• Continuing work on CI Vision Document
  • four NSF-wide SWOT teams developing strategic and implementation plans covering various aspects of CI
  • http://www.nsf.gov/dir/index.jsp?org=OCI
• Ongoing interagency discussions
  • Committee on Science
  • Office of Science and Technology Policy
  • agency-to-agency (DARPA, HPCMOD, DOE/OS, NNSA, NIH)
Acquisition Strategy
[Chart: science and engineering capability (logarithmic scale) versus fiscal year, FY06 – FY10]
HPC Acquisition Activities
• HPC acquisition will be driven by the needs of the S&E community
• RFI held for interested Resource Providers and HPC vendors on 9 Sep 2005
• First in a series of HPC S&E requirements workshops held 20-21 Sep 2005
  • attended by 77 scientists and engineers
  • generated the Application Benchmark Questionnaire
TeraGrid: What Is It?
TeraGrid:
(1) Provides a unified user environment to support high-capability, production-quality cyberinfrastructure services for science and engineering research.
(2) Provides new S&E opportunities by enabling new ways of using distributed resources and services.
• Integration of services provided by grid technologies
• Distributed, open architecture
• GIG responsible for integration:
  • software integration (including the common software stack, CTSS)
  • base infrastructure (security, networking, and operations)
  • user support
  • community engagement (including the Science Gateways activities)
• 8 Resource Providers (with separate awards): PSC, TACC, NCSA, SDSC, ORNL, Indiana, Purdue, Chicago/ANL
  • several other institutions participate in TeraGrid as subawardees of the GIG
  • new sites may join as Resource Partners
• Examples of services include:
  • HPC
  • data collections
  • visualization servers
  • portals
NSF Middleware Initiative (NMI)
Develop, deploy and sustain a set of reusable and expandable middleware functions that benefit many science and engineering applications in a networked environment:
• open standards
• international collaboration
• sustainable
• scalable and securable
• Program solicitations between 2001 and 2004 funded over 40 development awards and a series of integration awards
  • integration award highlights include the NMI GRIDS Center (e.g. Build and Test), Campus Middleware Services (e.g. Shibboleth) and the nanoHUB
• OCI made a major award in middleware in November 2005 to Foster/Kesselman:
  • "Community Driven Improvement of Globus Software", $13.3M over 5 years
2005 IRNC Awards
International Research Network Connections (IRNC) awards:
• TransPAC2 (U.S. – Japan and beyond)
• GLORIAD (U.S. – China – Russia – Korea)
• TransLight/PacificWave (U.S. – Australia)
• TransLight/StarLight (U.S. – Europe)
• WHREN (U.S. – Latin America)
Learning and Our 21st Century CI Workforce
CI-TEAM: Demonstration Projects
• Input: 70 projects / 101 proposals / 17 (24%) projects were collaborative
• Outcomes:
  • 15.7% success rate: 11 projects (14 proposals) awarded up to $250,000 over 1-2 years
  • related to BIO, CISE, EHR, ENG, GEO, MPS
• Broadening participation for CI workforce development:
  • Alvarez (FIU) – CyberBridges
  • Crasta (VaTech) – Project-Centric Bioinformatics
  • Fortson (Adler) – CI-Enabled 21st c. Astronomy Training for HS Science Teachers
  • Fox (IU) – Bringing Minority Serving Institution Faculty into CI & e-Science Communities
  • Gordon (OSU) – Leveraging CI to Scale-Up a Computational Science U/G Curriculum
  • Panoff (Shodor) – Pathways to CyberInfrastructure: CI through Computational Science
  • Takai (SUNY Stony Brook) – High School Distributed Search for Cosmic Rays (MARIACHI)
• Developing and implementing CI resources for CI workforce development:
  • DiGiano (SRI) – Cybercollaboration Between Scientists and Software Developers
  • Figueiredo (UFl) – In-VIGO/Condor-G MW for Coastal & Estuarine Science CI Training
  • Regli (Drexel) – CI for Creation and Use of Multi-Disciplinary Engineering Models
  • Simpson (PSU) – CI-Based Engineering Repositories for Undergraduates (CIBER-U)
How it all fits together…
[Diagram mapping OCI investments (ETF, CORE, NMI, CI-TEAM, IRNC) onto the CI Vision areas: HPC, Data, Learning & Workforce Development (LWD), and Collaboratories, Observatories & Virtual Organizations (COVO)]
NSF HECURA 2004
• FY 2004 NSF/DARPA/DOE activity focused on research in:
  • languages
  • compilers
  • libraries
• 100 proposals submitted in July 2005
  • 82 projects submitted by 57 US academic institutions and nonprofit organizations
    • includes no-cost national lab and industrial lab collaborators
• Nine projects were awarded, covering:
  • tools and libraries for high-end computing
  • resource management
  • reliability of high-end systems
NSF HECURA – 2005/2006
Focus:
• I/O, file and storage systems design for efficient, high-throughput data storage, retrieval and management in the HEC environment
• hardware and software tools for design, simulation, benchmarking, performance measurement and tuning of file and storage systems
HECURA – 2005/2006 Scope (CISE)
• file systems research
• future file-system-related protocols
• I/O middleware
• quality of service
• security
• management, reliability, and availability at scale
• archives/backups as extensions to file systems
• novel storage devices for the I/O stack
• I/O architectures
• hardware and software tools for design and simulation of I/O, file and storage systems
• efficient benchmarking, tracing, performance measurement and tuning tools for I/O, file and storage systems (a minimal example follows this list)
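To make the benchmarking item above concrete, the sketch below is a minimal sequential write-throughput micro-benchmark in C. The file name, block size, and block count are arbitrary illustrative choices, not anything specified by HECURA; real HEC I/O benchmarks must also account for caching, parallel access, and file-system-specific behavior.

    /* Minimal sketch: sequential write-throughput micro-benchmark.
     * Illustrative only; not part of any HECURA program requirement. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/time.h>
    #include <unistd.h>

    #define BLOCK_SIZE (1 << 20)   /* 1 MiB per write */
    #define NUM_BLOCKS 1024        /* 1 GiB total     */

    int main(void)
    {
        char *buf = malloc(BLOCK_SIZE);
        if (!buf) { perror("malloc"); return 1; }
        memset(buf, 0xAB, BLOCK_SIZE);

        int fd = open("bench.dat", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) { perror("open"); return 1; }

        struct timeval t0, t1;
        gettimeofday(&t0, NULL);
        for (int i = 0; i < NUM_BLOCKS; i++) {
            if (write(fd, buf, BLOCK_SIZE) != BLOCK_SIZE) { perror("write"); return 1; }
        }
        fsync(fd);                 /* include the time to flush to stable storage */
        gettimeofday(&t1, NULL);
        close(fd);

        double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
        double mib  = (double)NUM_BLOCKS * BLOCK_SIZE / (1 << 20);
        printf("wrote %.0f MiB in %.2f s (%.1f MiB/s)\n", mib, secs, mib / secs);

        free(buf);
        return 0;
    }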
Benchmarking
• Broad inter-agency interest
• Use of benchmarking for performance prediction
  • valuable when target systems are not readily available because they are:
    • inaccessible (e.g. secure)
    • not yet built at sufficient scale
    • in various stages of design
• Useful for "what-if" analysis
  • Suppose I double the memory on my Red Storm?
• Nirvana (e.g. Snavely/SDSC):
  • abstract away the application: application signatures
    • platform independent
  • abstract away the hardware: platform signatures
  • convolve the signatures to provide an assessment (see the sketch below)
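The convolution idea can be read as: predicted runtime = sum over operation classes of (operation count from the application signature) × (per-operation cost from the platform signature). The C sketch below is only a schematic of that idea; the operation classes, structure names, and numbers are invented for illustration and are not taken from the actual Snavely/SDSC framework.

    /* Schematic of signature convolution for performance prediction.
     * Application signature: counts of abstract operations (platform independent).
     * Platform signature: measured cost per operation on a given machine.
     * Predicted time = sum over operation classes of count * cost.
     * All names and numbers are hypothetical. */
    #include <stdio.h>

    enum { FLOP, MEM_STRIDE1, MEM_RANDOM, NET_MSG, NUM_OPS };

    typedef struct { double counts[NUM_OPS]; } app_signature;      /* from tracing the application */
    typedef struct { double cost[NUM_OPS];   } platform_signature; /* from machine probes          */

    static double convolve(const app_signature *app, const platform_signature *plat)
    {
        double seconds = 0.0;
        for (int op = 0; op < NUM_OPS; op++)
            seconds += app->counts[op] * plat->cost[op];
        return seconds;
    }

    int main(void)
    {
        /* hypothetical application: flops, stride-1 loads, random accesses, messages */
        app_signature app = { { 1e12, 4e11, 5e9, 2e6 } };

        /* hypothetical machines: seconds per operation */
        platform_signature current = { { 2.5e-10, 5e-10, 8e-9, 6e-6 } };
        platform_signature what_if = { { 2.5e-10, 5e-10, 4e-9, 6e-6 } }; /* e.g. faster memory */

        printf("predicted on current machine: %.1f s\n", convolve(&app, &current));
        printf("predicted after upgrade:      %.1f s\n", convolve(&app, &what_if));
        return 0;
    }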
HPC Benchmarking
HPC Challenge Benchmarks (http://icl.cs.utk.edu/hpcc/):
1. HPL – the Linpack TPP benchmark, which measures the floating point rate of execution for solving a linear system of equations.
2. DGEMM – measures the floating point rate of execution of double precision real matrix-matrix multiplication.
3. STREAM – a simple synthetic benchmark program that measures sustainable memory bandwidth (in GB/s) and the corresponding computation rate for simple vector kernels (see the sketch after this list).
4. PTRANS (parallel matrix transpose) – exercises communications where pairs of processors communicate with each other simultaneously. It is a useful test of the total communications capacity of the network.
5. RandomAccess – measures the rate of integer random updates of memory (GUPS).
6. FFTE – measures the floating point rate of execution of double precision complex one-dimensional Discrete Fourier Transform (DFT).
7. Communication bandwidth and latency – a set of tests to measure latency and bandwidth of a number of simultaneous communication patterns; based on b_eff (effective bandwidth benchmark).
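For a sense of what item 3 measures, the sketch below is a minimal, single-threaded version of the STREAM "triad" kernel (a[i] = b[i] + scalar * c[i]). The array size and single-pass timing are arbitrary simplifications; the official STREAM benchmark prescribes its own array sizing, repetition, and validation rules, and the full HPCC suite runs it across many processors.

    /* Minimal sketch of a STREAM-style triad kernel; illustrative only. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/time.h>

    #define N 20000000L  /* 20 M doubles per array; chosen to exceed typical cache sizes */

    int main(void)
    {
        double *a = malloc(N * sizeof *a);
        double *b = malloc(N * sizeof *b);
        double *c = malloc(N * sizeof *c);
        double scalar = 3.0;
        if (!a || !b || !c) { fprintf(stderr, "allocation failed\n"); return 1; }

        for (long i = 0; i < N; i++) { b[i] = 1.0; c[i] = 2.0; }

        struct timeval t0, t1;
        gettimeofday(&t0, NULL);
        for (long i = 0; i < N; i++)
            a[i] = b[i] + scalar * c[i];
        gettimeofday(&t1, NULL);

        double secs  = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
        double bytes = 3.0 * N * sizeof(double);   /* two loads and one store per element */
        printf("triad bandwidth: %.2f GB/s (check: a[0] = %g)\n", bytes / secs / 1e9, a[0]);

        free(a); free(b); free(c);
        return 0;
    }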
DARPA HPCS
• Partitioned Global Address Space (PGAS) programming paradigm (illustrated below)
  • intended to support scaling to 1000s of processors
  • Co-Array Fortran
  • Unified Parallel C
  • Cray's Chapel
  • IBM's X10
  • Sun's Fortress
• DARPA HPCS productivity activities
  • HPC-specific programming environments?
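In the PGAS model each process owns one partition of a single global address space, and any process can read or write a remote partition directly instead of exchanging matched messages. The HPCS languages listed above express this at the language level; as a rough analogy only, the C sketch below emulates the same put-into-a-remote-partition pattern with MPI-2 one-sided operations. It is not an example of any actual PGAS language.

    /* Analogy for the PGAS put/get pattern using MPI-2 one-sided operations. */
    #include <mpi.h>
    #include <stdio.h>

    #define LOCAL_N 4   /* elements owned by each process */

    int main(int argc, char **argv)
    {
        int rank, size;
        double partition[LOCAL_N];  /* this process's slice of the "global" array */
        MPI_Win win;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        for (int i = 0; i < LOCAL_N; i++)
            partition[i] = rank;    /* initialize the local partition */

        /* Expose the local partition so other processes can access it directly. */
        MPI_Win_create(partition, LOCAL_N * sizeof(double), sizeof(double),
                       MPI_INFO_NULL, MPI_COMM_WORLD, &win);

        /* Each process writes its rank into element 0 of the next process's
         * partition, without the target issuing a matching receive. */
        double val = (double)rank;
        int target = (rank + 1) % size;

        MPI_Win_fence(0, win);
        MPI_Put(&val, 1, MPI_DOUBLE, target, 0, 1, MPI_DOUBLE, win);
        MPI_Win_fence(0, win);

        printf("rank %d: partition[0] now holds %g\n", rank, partition[0]);

        MPI_Win_free(&win);
        MPI_Finalize();
        return 0;
    }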
OCI Investment Highlights
• Leadership-Class High-Performance Computing System Acquisition ($50M)
• Data- and Collaboration-Intensive Software Services ($25.7M)
  • conduct applied research and development
  • perform scalability/reliability tests to explore tool viability
  • develop, harden and maintain software tools and services
  • provide software interoperability
HPC Acquisition – Track 1
• Increased funding will support the first phase of a petascale system acquisition
  • over four years NSF anticipates investing $200M
  • the acquisition is critical to NSF's multi-year plan to deploy and support a world-class HPC environment
  • collaborating with sister agencies with a stake in HPC: DARPA, HPCMOD, DOE/OS, DOE/NNSA, NIH
CI – Summary
"The Tide that Raises All Boats"
• CI impacts and enables the broad spectrum of science and engineering activities
• NSF CI deployment/acquisition activities must be driven by the needs of the science, engineering and education communities
• CI is more than just "big iron"
• Many opportunities to work with other federal agencies that develop, acquire and/or use various CI resources
• Work required in all aspects of CI software:
  • application and middleware for petascale systems
  • systems software
• CI has been, and will continue to be, an effective mechanism for broadening participation
In the end…
It’s all about the SCIENCE