Transcript Document

Building a Community Through GEON
Dr. Fran Berman
Director, San Diego Supercomputer Center
Professor and High Performance Computing Endowed Chair, UCSD
GEON Advisory Committee
SAN DIEGO SUPERCOMPUTER CENTER
Fran Berman
Redefining Computer
• Today’s “computer” is a coordinated set of hardware, software, and services providing an “end-to-end” resource.
• Cyberinfrastructure captures how the S&E community has redefined “computer”
[Figure: The “computer” as an integrated set of resources: wireless sensors, field computers, field instruments, networks, data storage, and visualization]
Cyberinfrastructure
Cyberinfrastructure is the coordinated aggregate of software, hardware and other technologies, as well as human expertise, required to support current and future discoveries in science and engineering.
“Thanks to Cyberinfrastructure and information systems, today’s scientific tool kit includes distributed systems of hardware, software, databases and expertise that can be accessed in person or remotely.”
Arden Bement, NSF Director, February 2005
National Science Foundation’s Cyberinfrastructure
NSF Blue Ribbon Panel (Atkins) Report provided compelling and comprehensive vision of an integrated Cyberinfrastructure
Bootstrapping as an Enabling Paradigm for Cyberinfrastructure
[Diagram: Application goals motivate new infrastructure capabilities, which enable new application goals, which in turn motivate new infrastructure capabilities]
Community Projects
Focused Efforts in Developing Cyberinfrastructure
TeraGrid: National Grid infrastructure with Science Gateways
NEES: Earthquake Engineering Cyberinfrastructure
PRAGMA: Pacific Rim Grid Middleware Consortium
BIRN: Biomedical Informatics Research Network
NVO: Data Cyberinfrastructure for Astronomy
GEON: Geosciences Grid infrastructure
Implementing CI: 5 Basic Principles
1. Science and engineering research and education must be the drivers
  • Community needs and requirements must directly drive the development, deployment, and use of CI
2. Useful and usable Cyberinfrastructure requires “bootstrapping”
  • Targeted “project-driven” tools/technologies drive the development of common infrastructure, which enables new project tools/technologies
3. Independent evaluation is important
  • The “customers” should evaluate the “products”
4. Cyberinfrastructure should be treated as infrastructure
  • Usefulness, usability, reliability, and responsiveness to customer needs are critical
5. A functional organizational framework is key for success.
Measuring Success
• Number of users and other usage metrics
• Broad research impact enabled by the use of Cyberinfrastructure
  • measured by publications and the number of distinct research disciplines, conferences, journals spanned by users
• Deep research impact enabled by the use of Cyberinfrastructure
  • measured by community awards and recognition, and landmark publications with a very large number of citations
• Educational impact enabled by the use of Cyberinfrastructure
  • measured by the breadth and depth of courses, training efforts, and other educational vehicles using CI
• User satisfaction with Cyberinfrastructure tools and technologies
  • measured by independent user surveys, feedback from user advisory committees, projects during site reviews, etc.
• Broadening use of Cyberinfrastructure resources, tools, and technologies
  • measured by the number of new CI users who utilize resources, tools, and technologies more than an (introductory) threshold number of times
• Software coordination
  • measured by the number or percentage of times that software packages are used together, etc.
• Tech Transfer to gauge the evolution of CI’s technological infrastructure
  • measured by the evolutionary path from the academic community to the commercial sector, etc.
• Workforce evolution to gauge the development of CI’s human infrastructure
  • measured by the number of individuals involved in CI-related professions, and other workforce metrics
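The “broadening use” metric (the number of new CI users who use resources, tools, and technologies more than an introductory threshold number of times) could be computed roughly as follows. This is only a sketch under stated assumptions: the function name, the `(user, resource)` usage-log format, and the sample data are hypothetical and do not come from the talk.

```python
from collections import Counter
from typing import Hashable, Iterable, Set, Tuple


def count_engaged_new_users(
    usage_events: Iterable[Tuple[Hashable, str]],
    new_users: Set[Hashable],
    threshold: int,
) -> int:
    """Count new users with strictly more than `threshold` recorded uses."""
    # Tally one use per log entry, ignoring users who are not new.
    uses_per_user = Counter(
        user for user, _resource in usage_events if user in new_users
    )
    return sum(1 for uses in uses_per_user.values() if uses > threshold)


# Hypothetical usage log: (user id, resource used).
events = [
    ("ana", "portal"), ("ana", "portal"), ("ana", "cluster"),
    ("ben", "portal"),
    ("cy", "portal"), ("cy", "cluster"), ("cy", "database"),
]

# "ana" and "ben" are new; only "ana" exceeds the threshold of 2 uses.
print(count_engaged_new_users(events, new_users={"ana", "ben"}, threshold=2))  # -> 1
```

The threshold separates one-time visitors from users who have actually adopted the tools; its value is a policy choice for each community.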
Building Cyberinfrastructure Communities
• Community building is a social enabler in the same way that technology is a discovery enabler: both allow researchers to do more than they can accomplish on their own
• Community success is greatly enabled by a focus on well-articulated goals
  • High energy physicists want to search for new particles such as the Higgs boson
  • Astrophysicists want to develop models of the origin of the universe
  • Preservationists want to ensure long-term sustainability of valued digital assets
  • Computer science theorists want to prove or disprove P=NP
The Gretzky Rule: “Skate to where the puck will be”
Key Questions for Cyberinfrastructure Communities
• What should the key contributions be in 10 years?
• What are the research goals?
• What technologies and tools are needed to get there?
• Who should be involved?
Sustaining Community Efforts -- 10 Year Issues
• Data preservation
  • What data needs to be preserved over the long-term? How will the community support and sustain key collections?
• Tool maintenance and evolution
  • What tools and technologies are critical?
  • What is the plan for deploying, maintaining, evolving and retargeting these tools over the long-term?
• Community evolution
  • What will keep the community together?
  • How will the next generation of participants be engaged?
The GEON Project
NSF-funded IT Research Project, 2002-2007, $11.6M
PI Institutions
• Arizona State University
• Bryn Mawr College
• Penn State University
• Rice University
• San Diego State University
• SDSC/UCSD
• University of Arizona
• University of Idaho
• University of Missouri, Columbia
• University of Texas at El Paso
• University of Utah
• Virginia Tech
• UNAVCO
• DLESE
Partners
• ESRI
• Cal-(IT)2
• Chronos
• CUAHSI-HIS
• Geological Survey of Canada
• Georeference Online
• HP
• IBM
• Kansas Geological Survey
• LLNL
• NASA Goddard, Earth System Division
• SCEC
• U.S. Geological Survey (USGS)
• Purdue University
• EarthScope
• IRIS
• VSTO
Some Current GEON Innovations
• Making technology easier
  • Extension of ROCKS to a distributed environment, i.e., the ability to "bootstrap" Linux clusters using reference software images from a remote server.
  • Training of Geo PIs to create ROCKS "rolls", enabling them to contribute their own GEON-compliant software to the rest of the GEON network.
  • Reference portals which can be customized by GEON PIs
  • "Smart job routers", enhanced performance prediction capability and on-demand functionality for GEON jobs
  • Innovative knowledge-based integration of semantically different GIS maps.
• Enhancing Data Management and Capabilities
  • Integration shopping carts for on-the-fly data integration
  • Technologies for augmented reality in the field, e.g. the ability to wear goggles and overlay database information on top of field observational data
  • Development of workflow systems for ingesting and serving large "point-cloud" data sets, e.g. LiDAR or hyperspectral imagery.
  • 2TB allocation for database space from SDSC
Building a Community Within GEON
GEON AHM 2003
Project Communications and Outreach
• Website provides an active site for project information (new design enhancing functionality)
• Weekly project meetings at SDSC are webcast and archived
• Weekly project reports (“blogs”) from C. Baru (archived on website)
• One-page monthly newsletter
Recent Project Meetings, Talks, and Workshops
• 2004 Cyberinfrastructure Summer Institute
• PI meeting in Idaho, 2004
• GSA in Denver (talks, posters, booth, Pardee Session)
• AGU in San Francisco (talks, posters, sessions, booth)
• Hydrology ontology meeting at SDSC
• Workshop on sample IDs at SDSC
• Workshop on Visualization at SDSC/Cal-IT2 Synthesis Center
• Workshop on Geo-ontology at SDSC
• European Geophysical Union
Building a Broader GEON community
Cyberinfrastructure Leverage and Synthesis
• GEON and BIRN developing common grid software stack
• GEON and SEEK developing common semantic integration services
• GEON is an application driver for OptIPuter
• Will field a common GIS and Viz/Synthesis center
• GEON in active discussions with EarthScope
• GEON participating in National Center for Hydrology Synthesis Computational Hydrology and Informatics Working Group
• USGS is a major partner of GEON
• GEON is working with PRAGMA on education projects
• Collaboration between SIO, WHOI, LDEO and GEON
• GEON working with advisory committee of Long-Term Ecological Research Network
• GEON working with advisory committees of 3 different CLEANER planning grants
• GEON participating in renewal proposal of Chronos
• GEON is a partner of CUAHSI (hydrologic information systems)
• GEON working with DOE Earth System Grid (ES Grid)
• GEON and LEAD coordinating efforts on education and outreach
Cyberinfrastructure at Scale – Community Building Challenges and Opportunities
Ensuring a “safe” Cyberinfrastructure for at-scale Communities
• In 2003, the Slammer computer worm exploited a weakness in SQL Server software to launch a “denial of service” attack which
  • Shut down over 13,000 Bank of America ATMs
  • Caused difficulties in Continental Airlines’ electronic reservation and ticketing systems, causing cancellation of some regional flights
  • Caused failure of Korea Telecom Freetel and SK Telecom service, stranding millions of South Korean Internet users.
• When the worm hit, operations centers were seeing between 200,000 and 300,000 attacks per hour
At-Scale Community Social Dynamics
• Cyberinfrastructure technologies are providing new venues for communication, interaction, collaboration, and competition
• What is the impact of Cyberinfrastructure on community development and evolution?
• How can Cyberinfrastructure be structured to facilitate productive interactions?
Community Resource Allocation at Scale
Cyberinfrastructure Economics
• How to allocate resources so that
  • Aggregate user behavior does not destabilize the system
  • Individuals can optimize for performance
Community Organizations
• What organizational frameworks best promote efficient and integrated Cyberinfrastructure?
• What are useful ways to resolve conflicts?
• How should decisions be made?
• How do we promote integration and coordination?
[Figure: The North American Power Grid]
Translating Tools to Useful Infrastructure
Data Integration in the Geosciences
What are the distribution and U/Pb zircon ages of A-type plutons in VA? How do they relate to host rock structures?
[Figure: Data integration via complex “multiple-worlds” mediation across GeoChemical, GeoPhysical, and GeoChronologic data, a Geologic Map, and a Foliation Map]
Integrating the “Data Stack”
• Applications: Medical informatics, Biosciences, Ecoinformatics,…
• Visualization
• Data Mining, Simulation Modeling, Analysis, Data Fusion
• Knowledge-Based Integration, Advanced Query Processing
• Grid Storage, Filesystems, Database Systems
• High speed networking, Networked Storage (SAN), Storage hardware
• sensornets, instruments
Key questions:
• How do we combine data, knowledge and information management with simulation and modeling?
• How do we represent data, information and knowledge to the user?
• How do we detect trends and relationships in data?
• How do we obtain usable information from data?
• How do we collect, access and organize data?
• How do we configure computer architectures to optimally support data-oriented computing?
Reliability
• Infrastructure must be there when you need it.
• How can communities ensure that
  • Data
  • Tools
  • Networks
  • Software
  and other resources are in good working order, and continue to enable new discovery?
Planning Ahead for Sustainable Cyberinfrastructure-enabled Communities
• From the beginning, Cyberinfrastructure must be designed with its beneficiaries in mind.
• Attention must be paid to
  • Social dynamics
  • Organization
  • Social impact, etc.
  as well as technical issues
• Long-term and strategic planning is critical
[Figure: Borromean Rings represent 3 key components of Cyberinfrastructure: Social Science, Computer Science and Engineering, and Domain and Application Science (Dan Atkins)]
Thank You
www.sdsc.edu