National Science Foundation
Blue Ribbon Panel on
Cyberinfrastructure
Introduction, Context &
Charge
Dan Atkins, Chair
University of Michigan
April 19, 2002
National Science Foundation
Panel Members
• Daniel E. Atkins, Chair, Univ. of Michigan, EECS and SI
• Kelvin K. Droegemeier, Center for Analysis and Prediction of
Storms, University of Oklahoma
• Stuart I. Feldman, IBM Research
• Hector Garcia-Molina, CS Dept., Stanford University
• Michael Klein, Center for Molecular Modeling, University of
Pennsylvania
• Paul Messina, Cal Tech
• David G. Messerschmitt, UC-Berkeley, EECS & SIMS
• Jeremiah P. Ostriker, Princeton University
• Margaret H. Wright, Computer Science Department, Courant
Institute of Mathematical Sciences, New York University
National Science Foundation
Meeting Agenda: April 19, 2002,
NSF, 1-4 pm
• 1. Review of status of the panel's activities and goals for
this meeting.
• 2. Reports from the authoring sub-committees.
• 3. Review and discussion of the working draft of the
report.
• 4. Discussion of primary recommendations.
• 5. Stewardship and additional use of the material
gathered by the Panel.
• 6. Summary of additional activities to create final version
of report.
• 7. Matters arising.
National Science Foundation
Historical Schematic
[Figure: timeline of NSF support for advanced scientific computing.
Recoverable labels: CSE research elsewhere in NSF; CISE Directorate
(support for an array of small, medium, and large CISE basic research
projects); provision of advanced scientific computing; Lax ->
Curtis/Bardon Reports; Computational Science initiative and expanded
equipment program; 5 Supercomputer Centers and NSFnet (1984); 1993
BRP: "Desktop to Teraflop"; Hayes Report (1995); PACI: NCSA & NPACI;
Terascale Computing Initiatives; OUR REPORT.]
National Science Foundation
Charge
With respect to meeting the needs of the scientific and
engineering research community:
• A) Evaluate the current PACI programs.
• B) Recommend new areas of emphasis
for the CISE Directorate.
• C) Recommend an implementation plan
to enact recommended changes.
National Science Foundation
Process
• Web survey
• Hearings
• Reviewing prior reports
• Random input
• Knowledge and expertise of the
Panel members.
National Science Foundation
Epigraph
• Cyberinfrastructure is the sine qua non
for true progress in much of the
mathematical and physical sciences –
And progress in CI is often driven by
real-world problems.
– Robert Eisenstein, AD for MPS,
11/30/01
National Science Foundation
Revolutionizing Science and Engineering through
Cyberinfrastructure:
Table of Contents
• 1. The Vision
• 2. Background and Charge
• 3. Challenges and Opportunities for the Scientific
Research Community
• 4. The New Cyberinfrastructure: What Changed in
Computing
• 5. The Landscape of Related Activities
• 6. Partnerships for Advanced Computational
Infrastructure: Past and Future Roles
• 7. Achieving the Vision
• 8. Scope and Budget Estimates
National Science Foundation
Draft Report Available in pdf at
worktools.si.umich.edu/workspaces/datkins/001.nsf
Please send comments
by May 1, 2002 to
[email protected]
National Science Foundation
Revolutionizing Science and Engineering through
Cyberinfrastructure:
Table of Contents
• 1. The Vision (presenter 2, Feldman)
• 2. Background and Charge (presenter 1, Atkins)
• 3. Challenges and Opportunities for the Scientific Research
Community (presenter 3, Droegemeier)
• 4. The New Cyberinfrastructure: What Changed in Computing
(presenter 2, Feldman)
• 5. The Landscape of Related Activities (presenter 2, Feldman)
• 6. Partnerships for Advanced Computational Infrastructure: Past
and Future Roles (presenter 6, Wright)
• 7. Achieving the Vision (presenter 4, Messerschmitt)
• 8. Scope and Budget Estimates (presenter 5, Messina)
• Summary and Discussion (Atkins)
National Science Foundation
Blue Ribbon Panel on
Cyberinfrastructure
Vision
Stuart I. Feldman
IBM
April 19, 2002
National Science Foundation
Recommendations
• New INITIATIVE to revolutionize science and engineering research at NSF and
worldwide to capitalize on new computing and communications opportunities
– 21st Century Cyberinfrastructure includes supercomputing, but also massive
storage, networking, software, collaboration, visualization, and human resources
– Current centers (NCSA, SDSC, PSC) are a key resource for the INITIATIVE
– Budget estimate: incremental $650M/year (continuing)
• An INITIATIVE OFFICE with a highly placed, credible leader empowered to
– Initiate competitive, discipline-driven path-breaking applications within NSF of
cyberinfrastructure which contribute to the shared goals of the INITIATIVE
– Coordinate policy and allocations across fields and projects. Participants
across NSF directorates, Federal agencies, and international e-science
– Develop high quality middleware and other software that is essential and
special to scientific research
– Manage individual computational, storage, and networking resources at least
100x larger than individual projects or universities can provide.
National Science Foundation
Science and Engineering Research
Depends on Computing and
Communications
• Online fast publication (and archives too)
• New collections accessible
• Raw data and digital libraries
• Collaboration (Collaboratories, Access Grid,
etc.)
• In silico science
National Science Foundation
Furthering the Revolution
• Saving raw data
• Cross-disciplinary collections
• Richer publications
• Grander simulations (cells and organisms;
entire earth system)
• Breadth and depth of collaborations, routinely
international
National Science Foundation
Thresholds and Opportunities
• Internet and Web use almost universal
– Activity would stop without e-mail and WWW
• Expectations rising with generations and for all
disciplines
• Supercomputers and terabytes in the lab
• Simulation required to do new science
• Standardized formats, software
National Science Foundation
Risks and Costs
• Inconsistent formats across fields and sites
• Data loss
• Field boundaries
• Duplicative moderate quality software
• Falling behind on computing technologies
National Science Foundation
Proposals for the INITIATIVE
• Large incremental budget
• Drive applications that revolutionize the way that research
is done
– Fund competitive discipline-driven projects
– With cyberinfrastructure contribution and standards
and participation by computing experts
• Supply shared resources
– Supercomputers and data farms that provide
100-1000x what can be found locally
– New shared middleware, content standards, basic
applications
– New research (emphasizing computation, social
science,
– New education and outreach
• Central organization with authority
National Science Foundation
The New Cyberinfrastructure
National Science Foundation
Hardware Trends
• Processor speeds and memory increasing with
Moore’s Law
• Cluster sizes – now 1000s, soon even larger
– Largest sites at 10TF, moving toward PF
• Disk capacity increasing with areal density
(60%-100%/year)
– Terabytes typical, petabytes coming
• Wide area networking moving to Gb/s
• Large and high-resolution displays
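As a rough illustration of the disk trend above, the time needed to go from terabyte-scale to petabyte-scale storage can be estimated with compound growth. The sketch below is in Python and assumes, purely for illustration, that capacity grows at a constant annual rate taken from the quoted 60%-100% range and that a petabyte is 1000x a terabyte. The function name years_to_grow is hypothetical and does not come from the slides.

```python
import math


def years_to_grow(factor: float, annual_growth: float) -> float:
    """Years needed to grow by `factor` at a constant compound annual rate.

    annual_growth is a fraction, e.g. 0.60 for 60% per year.
    Solves (1 + annual_growth) ** years == factor for years.
    """
    return math.log(factor) / math.log(1.0 + annual_growth)


# Assumption: terabyte -> petabyte is a 1000x increase in capacity.
TB_TO_PB = 1000.0

# The two ends of the growth range quoted on the slide.
slow_years = years_to_grow(TB_TO_PB, 0.60)   # 60% per year
fast_years = years_to_grow(TB_TO_PB, 1.00)   # 100% per year (doubling)

print(f"At 60%/year:  about {slow_years:.1f} years")
print(f"At 100%/year: about {fast_years:.1f} years")
```

Under these assumptions the step from "terabytes typical" to "petabytes typical" takes on the order of a decade or more, which is one way to read "petabytes coming".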
National Science Foundation
Software
• Information networking – applications,
messages, self-describing content, not just bit
streams
– The Grids
• Content management – metadata, searches,
persistence
• Collaboration
• Middleware
National Science Foundation
Ecology of Scientific Computing
• Computing industry
– Commercial requirements drive basic
hardware and software
– Important additional needs for scientific
computing
• Computing Research
• Other sciences
• Other federal agencies
• Non-US activities
National Science Foundation
Blue Ribbon Panel on Cyber Infrastructure
Science & Engineering
Community Needs and
Challenges
Kelvin K. Droegemeier
University of Oklahoma
April 19, 2002
National Science Foundation
Goals
• Engage the broadest elements of the
science and engineering communities as
a means for critically assessing needs
and challenges
– Scientific
– Technological
– Sociological
• Identify barriers and opportunities
National Science Foundation
The Communities
• Domestic and International
• Academia
• Private Industry
• Government Agencies
• Laboratories
• State, Regional, and National Centers
National Science Foundation
Methodology
• Community-wide web survey
– Widely publicized
– >700 responses
– Quantitative comparisons with the Hayes Report
• Oral public testimony (3 sessions)
– 62 participants selected from: research scientists and
engineers; computer and computational scientists;
center directors; agency and corporate leaders;
system administrators; educators; students and young
scientists; technicians and consultants
– Emphasis given to traditionally underrepresented
groups and the physically challenged
– Written transcripts and A/V materials assembled
• Existing reports and planning documents
• Ad hoc communications
• Personal experiences and expertise
National Science Foundation
Analysis
• Results from all 5 methodologies have
been synthesized
• Remarkable consistency among
individual responses and within and
among disciplines
• No prioritization of findings: all summary
issues are viewed as critically important
• Categorization
– Philosophy and Process
– Current Resources
– Future Infrastructure
– Emerging Paradigms and Activities
National Science Foundation
Philosophy and Process
• Cyber infrastructure lies at the heart of
revolutionary science and engineering
• NSF should take the lead in charting a
national course for cyber infrastructure
• NSF should consider human capital and
software as co-equals with traditional
physical infrastructure
• Cyber infrastructure requires continuity,
consistency, and sufficient funding; NSF
should consider the consequences of
periodic full re-competition of CI centers
National Science Foundation
Philosophy and Process
• NSF needs to
– Provide a framework, motivation, and
clear direction for building and
sustaining linkages between academia
and industry
– Give attention to the sociological,
economic, and cultural issues
associated with cyber infrastructure
– Continue supporting open source
software strategies
National Science Foundation
Current Resources
• The entry barrier into high
performance computing continues
to be high
• Effective use of parallel computers
is becoming increasingly complex
• Greater investments are needed in
– Software development
– Training and support
National Science Foundation
Current Resources
• The PACI centers have successfully
– brought high performance computing to
the masses;
– broadened the spectrum of users; and
– responded to dramatic changes in the
user base, technology, and applications
• However, the PACI centers remain a largely
batch-oriented environment and are not
configured or funded to deliver significant
resources in novel ways (dedicated, on-demand)
to large numbers of users
National Science Foundation
Current Resources
• The NRAC allocation process is no longer
effective
– Double jeopardy
– Yearly resource allocations not
congruent with multi-year agency
grants
– Proposal development process is
time-consuming
– Reviewer base insufficiently broad
– Need flexibility to accommodate future
resources (e.g., data repositories)
National Science Foundation
Current Resources
• The PACI centers have been highly
successful in developing visionary,
innovative technologies and prototype
tools
• However, insufficient funding and the
lack of selective investment have
hampered transition to full deployment
National Science Foundation
Future Infrastructure
• The “last mile problem” continues and is
especially serious for HBCUs, Tribal Colleges
and Universities, and Hispanic institutions
• Research-group and departmental-scale
facilities (100 to 1000x less powerful than
national centers) are becoming increasingly
important; thus, national centers need to be a
factor of 100 to 1000x more capable
• High speed networks with high quality of service
continue to be foundational to research and
education at all levels
• On-demand (not pre-scheduled) and
instantaneous access is becoming increasingly
important (computers, databases, networks)
National Science Foundation
Future Infrastructure
• Comprehensive environments are needed for
linking models from multiple disciplines and for
synthesizing results in interoperable
frameworks
• The Grid represents an important opportunity
for the future and should receive high priority
for support
• Inexpensive and reliable tools are needed to
support distance collaborations
• Higher levels of security are needed
National Science Foundation
Emerging Paradigms and
Activities
• Cyber infrastructure is becoming the essential
lynchpin for research at the boundaries among
disciplines and should be driven by user needs
• The need for a new information technology
professional is emerging
– Expertise in one or more disciplines plus
computer science
– They will develop, maintain, and integrate
complex hardware and software systems
– They are an important bridge to users
– Educational institutions must develop
strategies for creating this computational
science workforce
National Science Foundation
Emerging Paradigms and
Activities
• Scientific and engineering applications
are becoming more multi-scale (both
space and time) and compute-intensive;
thus, the need for high-end resources
continues to grow. However, cyber
infrastructure research needs to span the
spectrum from small grants to large
centers
National Science Foundation
Emerging Paradigms and
Activities
• Significant need exists for access to
long-term, distributed, stable data and
metadata repositories and digital
libraries
• Legacy data likewise are important and
must be digitized and preserved
National Science Foundation
Knowledge Frontiers
• Several new projects provide a glimpse
of the future
National Science Foundation
Blue Ribbon Panel on
Cyberinfrastructure
Organization
David G. Messerschmitt
University of California at Berkeley
April 19, 2002
National Science Foundation
Layered structure of the INITIATIVE
Conduct of science and engineering research
Applications of information technology to science
and engineering research
Cyberinfrastructure supporting applications
Core technologies incorporated into
cyberinfrastructure
National Science Foundation
Some roles of cyberinfrastructure
• Processing, storage, connectivity
– Performance, sharing, integration, etc
• Make it easy to develop and deploy new
applications
– Tools, services, application commonality
• Interoperability enables future collaboration
across disciplines
• Best practices, assistance, expertise
• Greatest need is software and experienced
people
National Science Foundation
Classes of activities
Layers:
• Applications of information technology to science
and engineering research
• Cyberinfrastructure supporting applications
• Core technologies incorporated into
cyberinfrastructure
Activities in each layer:
• Research in technologies, systems, and
applications
• Development or acquisition
• Operations in support of end users
National Science Foundation
Defining applications
• Only domain science and engineering
researchers can create a vision and implement
the methodology and process changes
• Information technologists need to be deeply
involved
– What technology can be, not what it is
– Conduct research to advance the supporting
technologies and systems
– Applications inform research
• Shared responsibility
National Science Foundation
Mapping onto disciplines
All science (natural and social) and engineering disciplines
Applications (discipline specific)
Applications (multi-disciplinary)
Technological (CISE) and social systems (CISE, SBE)
Core information technologies (CISE, E)
National Science Foundation
Who delivers
• Research in technologies, systems, and applications:
long-term and applied researchers (applications,
systems, core technologies)
• Development or acquisition: commercial suppliers,
development centers, community development,
integrators
• Operations in support of end users: end-user staff
support, operational centers, service providers
National Science Foundation
Evaluation and assessment
• Research in technologies, systems, and applications.
Ideas: outcomes
• Development or acquisition.
Plans: impact and use
• Operations in support of end users.
Users: impact and satisfaction
National Science Foundation
Responsibility for applications
All science (natural and social) and
engineering disciplines
Other
Directorates
Applications (discipline specific)
Close coordination and collaboration
(matrix organization)
Applications (multi-disciplinary)
CISE
National Science Foundation
Responsibility for cyberinfrastructure
All science (natural and social) and
engineering disciplines
Other
Directorates
Applications (discipline specific)
Applications (multi-disciplinary)
CISE
Close coordination and collaboration
(matrix organization)
Technological systems
CISE
Social systems
CISE and SBE
National Science Foundation
OFFICE for the INITIATIVE
• Headed by a leader with experience, credibility,
commitment, persuasiveness, accountability
• Complex matrix organization spanning all
Directorates needs central direction
• Vision and coordination
• Manage INITIATIVE budget (competitive and
community input)
• Outreach to agencies, international
National Science Foundation
Blue Ribbon Panel on
Cyberinfrastructure
Scope and Budget
Paul Messina
California Institute of Technology
April 19, 2002
National Science Foundation
To achieve its goals, the INITIATIVE
should include funding for
software and people
• Long-term research in IT and CI
• Applied research in IT and CI, with deep
involvement by applications projects
• Developing new applications enabled by IT
and CI
• Enhancing existing applications to take
advantage of the new facilities and
capabilities
• Transforming research software into robust
products
National Science Foundation
To achieve its goals, the INITIATIVE
should include funding for data
• Creating and operating data repositories in
many disciplines
– taking existing data collections and making them
conveniently accessible
• Establishing discipline-specific coordination
centers to guide and coordinate software
and data format choices for the repositories
• Establishing STCs for addressing common
issues that arise in creation and use of data
collections, especially across disciplines
National Science Foundation
To achieve its goals, the INITIATIVE should
include funding for
physical infrastructure and its operation
• Acquiring and operating high-end
computers, visualization facilities, data
archives, and networks of much greater
power and in substantially greater
quantity
– in particular, multiple computers that
are among the world’s most powerful
• Establishing production data libraries
National Science Foundation
Basis for budget estimates
• Our estimates are based on
– current and previous NSF activities
– testimonies
– other agencies’ programs in related areas
– activities in other countries
National Science Foundation
Preliminary Budget Overview
(Incremental)
Item                                                          Funding level (in millions)
Research in IT and its applications and social context        20
Applications of IT in science and engineering                 100
Cyberinfrastructure supporting applications
– High-end general-purpose centers                            280
– Networks                                                    50
– Data repositories                                           120
– Coord center for data repositories (discipline specific)    20
– STCs for data collections                                   20
– Electronic Service Centers                                  TBD
– Digital Libraries                                           TBD
Core technologies incorporated into cyberinfrastructure
– System software and tools research and development          40
INITIATIVE Total                                              650
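The line items above can be checked against the stated total with a short sum. This is only an arithmetic sketch in Python: the pairing of amounts with items follows the reading of the slide under which the figures add up to 650, the two TBD entries are left out of the sum, and the variable names are illustrative rather than taken from the report.

```python
# Incremental funding per line item, in millions of dollars per year.
# None marks items listed as TBD on the slide; they are excluded from the sum.
budget_items = {
    "Research in IT and its applications and social context": 20,
    "Applications of IT in science and engineering": 100,
    "High-end general-purpose centers": 280,
    "Networks": 50,
    "Data repositories": 120,
    "Coord center for data repositories (discipline specific)": 20,
    "STCs for data collections": 20,
    "Electronic Service Centers": None,   # TBD
    "Digital Libraries": None,            # TBD
    "System software and tools research and development": 40,
}

# Sum only the items that have a stated amount.
initiative_total = sum(v for v in budget_items.values() if v is not None)

# Items still awaiting an estimate.
tbd_items = [name for name, v in budget_items.items() if v is None]

print("INITIATIVE total (millions):", initiative_total)
print("Not yet estimated:", ", ".join(tbd_items))
```

The sum reproduces the $650M/year figure, which suggests that any eventual amounts for the two TBD items would come on top of the stated total.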
National Science Foundation
Is this enough to support a revolution?
• Not by itself
• However, there are activities in CISE, in
other parts of NSF, and in the world at
large that will complement the funding
we recommend for this INITIATIVE
National Science Foundation
Ongoing NSF CISE-funded activities that
would be folded into the INITIATIVE
Activity                                  Funding relevant to INITIATIVE (FY2002 level)
ACIR                                      $85M
ANIR                                      $70M
ITR (principally the large projects)      $60M - $120M (out of $180M)
Terascale MRE                             $35M
Total                                     $250M - $310M
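The total range follows from adding the low and high ends of each activity. A minimal Python sketch of that arithmetic, with each activity stored as a (low, high) pair in millions of dollars; the names used here are illustrative only.

```python
# FY2002 funding relevant to the INITIATIVE, in millions of dollars,
# as (low, high) pairs. Only ITR is given as a range on the slide.
ongoing = {
    "ACIR": (85, 85),
    "ANIR": (70, 70),
    "ITR (principally the large projects)": (60, 120),
    "Terascale MRE": (35, 35),
}

# Add the low ends and the high ends separately to get the total range.
total_low = sum(low for low, _ in ongoing.values())
total_high = sum(high for _, high in ongoing.values())

print(f"Total: ${total_low}M - ${total_high}M")
```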
National Science Foundation
There are other NSF activities that
would contribute to and benefit from
the INITIATIVE
• NCAR
• Network for Earthquake Engineering
Simulation (NEES)
• National Ecological Observatory Network
(NEON)
• and others
National Science Foundation
Related activities supported by other
governmental entities
• NASA IPG
• NIH BIRN
• DOE Science Grid
• DOE SciDAC
• DOE/NNSA ASCI
• UK e-Science
• EU Grid projects (9)
• All of the above (and others) support Research,
Development, and Deployment activities that
will bolster the NSF INITIATIVE
National Science Foundation
And the private sector is also making
investments
• Most high-end computer manufacturers have
announced substantial efforts in grid software
– and are participating in Global Grid Forum
• Twelve companies announced support of
Globus last November
• End-user companies in aerospace,
pharmaceuticals are using or investigating grid
approaches
National Science Foundation
Open issues
• Is the funding level high enough for the system
software and tools R&D?
– Taking into consideration the number of
people who could and would engage in those
activities
• Is the funding level high enough for the
development of production-quality software?
– With same consideration, but note that work
not necessarily done in universities
• Funding level for production digital libraries
National Science Foundation
Blue Ribbon Panel on
Cyberinfrastructure
PACIs: Past and Future Roles
Margaret H. Wright
New York University
April 19, 2002
National Science Foundation
The PAST
• NSF Supercomputer Centers (1986-87)
• Multiple reports (Branscomb, Brooks-Sutherland, Hayes) -> PACI program (1997)
• Two PACI partnerships (NCSA, NPACI)
National Science Foundation
The PRESENT
Multiple functions within PACI program
• Provision of high-end resources (cycles,
networking, data, …)
• Discipline-specific codes and infrastructure
• Generic tools and infrastructure for users of
high-end computing
• Education, outreach, and training
National Science Foundation
Part A of our charge:
Assessment of PACI program
• Our interpretation: the potential roles for the
PACIs and PSC in a GREATLY expanded context
• Annual evaluations of PACIs: positive overall
• Repeated concerns: effectiveness of enabling
and application technology projects in serving
the science, engineering, and computer science
communities who use high-end computing
National Science Foundation
Rationale for the Future
• Insatiable demand for highest-end cycles,
networking, data (quantity, speed)
• Need for sustained work on industrial-strength
discipline-specific codes and infrastructure,
generic software tools and infrastructure
– Effort at least one order of magnitude greater
than high-quality prototypes
National Science Foundation
Within the INITIATIVE
• Disaggregation of PACI functions
• Augmented centralized high-end resources
• Enabling/application infrastructure projects
peer-reviewed
• Expanded, peer-reviewed education, outreach,
and training
National Science Foundation
Future of PACI within the INITIATIVE
• Two-year extension of current PACI program requested
• Until 2007, PACIs and PSC should receive stable funding
to provide high-end resources and associated operations
• 2004: INITIATIVE funding begins
– Important to retain skilled PACI staff and successful
collaborations
– PACIs can compete for all aspects of the larger
INITIATIVE funding
– Separate peer-reviewed enabling and application
infrastructure projects
National Science Foundation
Blue Ribbon Panel on
Cyberinfrastructure
Summary recommendations
April 19, 2002
National Science Foundation
Recommendations
• New INITIATIVE to revolutionize science and engineering research at
NSF and worldwide to capitalize on new computing and communications
opportunities
– 21st Century Cyberinfrastructure includes supercomputing, but also
massive storage, networking, software, collaboration, visualization,
and human resources
– Current centers (NCSA, SDSC, PSC) are a key resource
– Budget estimate: incremental $650M/year (continuing)
• INITIATIVE OFFICE with a highly placed, credible leader empowered to
– Initiate competitive, discipline-driven path-breaking applications
within NSF of cyberinfrastructure which contribute to the shared
goals of the INITIATIVE
– Coordinate policy and allocations across fields and projects.
Participants across NSF directorates, Federal agencies, and
international e-science
– Develop high quality middleware and other software that is essential
and special to scientific research
– Manage individual computational, storage, and networking
resources at least 100x larger than individual projects or universities
can provide.