Transcript Document
Building a Community Through GEON
Dr. Fran Berman
Director, San Diego Supercomputer Center
Professor and High Performance Computing Endowed Chair, UCSD
GEON Advisory Committee
SAN DIEGO SUPERCOMPUTER CENTER Fran Berman

Redefining Computer
• Today’s “computer” is a coordinated set of hardware, software, and services providing an “end-to-end” resource.
• Cyberinfrastructure captures how the S&E community has redefined “computer”
[Figure: The “computer” as an integrated set of resources: wireless sensors, field computers, field instruments, networks, data storage, and visualization]

Cyberinfrastructure
Cyberinfrastructure is the coordinated aggregate of software, hardware and other technologies, as well as human expertise, required to support current and future discoveries in science and engineering.
“Thanks to Cyberinfrastructure and information systems, today’s scientific tool kit includes distributed systems of hardware, software, databases and expertise that can be accessed in person or remotely.”
Arden Bement, NSF Director, February 2005

National Science Foundation’s Cyberinfrastructure
• The NSF Blue Ribbon Panel (Atkins) Report provided a compelling and comprehensive vision of an integrated Cyberinfrastructure
• Bootstrapping as an Enabling Paradigm for Cyberinfrastructure
[Diagram: application goals motivate new infrastructure capabilities, which enable new application goals, which in turn motivate new infrastructure capabilities]

Community Projects
Focused Efforts in Developing Cyberinfrastructure
• TeraGrid: National Grid infrastructure with Science Gateways
• NEES: Earthquake Engineering Cyberinfrastructure
• PRAGMA: Pacific Rim Grid Middleware Consortium
• BIRN: Biomedical Informatics Research Network
• NVO: Data Cyberinfrastructure for Astronomy
• GEON: Geosciences Grid infrastructure

Implementing CI: 5 Basic Principles
1. Science and engineering research and education must be the drivers
  • Community needs and requirements must directly drive the development, deployment, and use of CI
2. Useful and usable Cyberinfrastructure requires “bootstrapping”
  • Targeted “project-driven” tools/technologies drive the development of common infrastructure, which enables new project tools/technologies
3. Independent evaluation is important
  • The “customers” should evaluate the “products”
4. Cyberinfrastructure should be treated as infrastructure
  • Usefulness, usability, reliability, and responsiveness to customer needs are critical
5. A functional organizational framework is key for success.

Measuring Success
• Number of users and other usage metrics
• Broad research impact enabled by the use of Cyberinfrastructure
  • measured by publications and the number of distinct research disciplines, conferences, journals spanned by users
• Deep research impact enabled by the use of Cyberinfrastructure
  • measured by community awards and recognition, and landmark publications with a very large number of citations
• Educational impact enabled by the use of Cyberinfrastructure
  • measured by the breadth and depth of courses, training efforts, and other educational vehicles using CI
• Broadening use of Cyberinfrastructure resources, tools, and technologies
  • measured by the number of new CI users who utilize resources, tools, and technologies more than an (introductory) threshold number of times
• User satisfaction with Cyberinfrastructure tools and technologies
  • measured by independent user surveys, feedback from user advisory committees, projects during site reviews, etc.
• Software coordination
  • measured by the number or percentage of times that software packages are used together, etc.
• Tech Transfer to gauge the evolution of CI’s technological infrastructure
  • measured by the evolutionary path from the academic community to the commercial sector, etc.
• Workforce evolution to gauge the development of CI’s human infrastructure
  • measured by the number of individuals involved in CI-related professions, and other workforce metrics

Building Cyberinfrastructure Communities
• Community building is a social enabler in the same way that technology is a discovery enabler – both allow researchers to do more than they can accomplish on their own
• Community success is greatly enabled by a focus on well-articulated goals
  • High energy physicists want to search for new particles such as the Higgs boson
  • Astrophysicists want to develop models of the origin of the universe
  • Preservationists want to ensure long-term sustainability of valued digital assets
  • Computer science theorists want to prove or disprove P=NP

Key Questions for Cyberinfrastructure Communities
The Gretzky Rule: “Skate to where the puck will be”
• What should the key contributions be in 10 years?
• What are the research goals?
• What technologies and tools are needed to get there?
• Who should be involved?

Sustaining Community Efforts -- 10 Year Issues
• Data preservation
  • What data needs to be preserved over the long-term?
  • How will the community support and sustain key collections?
• Tool maintenance and evolution
  • What tools and technologies are critical?
  • What is the plan for deploying, maintaining, evolving and retargeting these tools over the long-term?
• Community evolution
  • What will keep the community together?
  • How will the next generation of participants be engaged?

The GEON Project
NSF-funded IT Research Project, 2002-2007, $11.6M

PI Institutions
• Arizona State University
• Bryn Mawr College
• Penn State University
• Rice University
• San Diego State University
• SDSC/UCSD
• University of Arizona
• University of Idaho
• University of Missouri, Columbia
• University of Texas at El Paso
• University of Utah
• Virginia Tech
• UNAVCO
• DLESE

Partners
• ESRI
• Cal-(IT)2
• Chronos
• CUAHSI-HIS
• Geological Survey of Canada
• Georeference Online
• HP
• IBM
• Kansas Geological Survey
• LLNL
• NASA Goddard, Earth System Division
• SCEC
• U.S. Geological Survey (USGS)
• Purdue University
• EarthScope
• IRIS
• VSTO

Some Current GEON Innovations
• Making technology easier
  • Extension of ROCKS to a distributed environment, i.e. the ability to "bootstrap" Linux clusters using reference software images from a remote server.
  • Training of Geo PIs to create ROCKS "rolls", enabling them to contribute their own GEON-compliant software to the rest of the GEON network.
  • Reference portals which can be customized by GEON PIs
  • "Smart job routers", enhanced performance prediction capability and on-demand functionality for GEON jobs
  • Innovative knowledge-based integration of semantically different GIS maps.
• Enhancing Data Management and Capabilities
  • Integration shopping carts for on-the-fly data integration
  • Technologies for augmented reality in the field, e.g. the ability to wear goggles and overlay database information on top of field observational data
  • Development of workflow systems for ingesting and serving large "point-cloud" data sets, e.g. LiDAR or hyperspectral imagery.
  • 2TB allocation for database space from SDSC

Building a Community Within GEON
GEON AHM 2003

Project Communications and Outreach
• Website provides an active site for project information (new design enhancing functionality)
• Weekly project meetings at SDSC are webcast and archived
• Weekly project reports (“blogs”) from C. Baru (archived on website)
• 1 page monthly newsletter

Recent Project Meetings, Talks, and Workshops
• 2004 Cyberinfrastructure Summer Institute
• PI meeting in Idaho, 2004
• GSA in Denver (talks, posters, booth, Pardee Session)
• AGU in San Francisco (talks, posters, sessions, booth)
• Hydrology ontology meeting at SDSC
• Workshop on sample IDs at SDSC
• Workshop on Visualization at SDSC/Cal-IT2 Synthesis Center
• Workshop on Geo-ontology at SDSC
• European Geophysical Union

Building a Broader GEON community

Cyberinfrastructure Leverage and Synthesis
• GEON and BIRN developing common grid software stack
• GEON and SEEK developing common semantic integration services
• GEON is an application driver for OptIPuter
  • Will field a common GIS and Viz/Synthesis center
• GEON in active discussions with EarthScope
• GEON participating in National Center for Hydrology Synthesis Computational Hydrology and Informatics Working Group
• USGS is a major partner of GEON
• GEON is working with PRAGMA on education projects
• Collaboration between SIO, WHOI, LDEO and GEON
• GEON working with advisory committee of Long-Term Ecological Research Network
• GEON working with advisory committees of 3 different CLEANER planning grants
• GEON participating in renewal proposal of Chronos
• GEON is a partner of CUAHSI (hydrologic information systems)
• GEON working with DOE Earth System Grid (ES Grid)
• GEON and LEAD coordinating efforts on education and outreach

Cyberinfrastructure at Scale – Community Building Challenges and Opportunities

Ensuring a “safe” Cyberinfrastructure for at-scale Communities
• In 2003, the Slammer computer virus exploited a weakness in SQL Server software to launch a “denial of service” attack which
  • Shut down over 13,000 Bank of America ATMs
  • Caused difficulties in Continental Airlines’ electronic reservation and ticketing systems, causing cancellation of some regional flights
  • Caused failure of Korea Telecom Freetel and SK Telecom service, stranding millions of South Korean Internet users.
• When the virus hit, operations centers were seeing between 200,000 and 300,000 attacks per hour

At-Scale Community Social Dynamics
• Cyberinfrastructure technologies are providing new venues for communication, interaction, collaboration, and competition
• What is the impact of Cyberinfrastructure on community development and evolution?
• How can Cyberinfrastructure be structured to facilitate productive interactions?

Community Resource Allocation at Scale
Cyberinfrastructure Economics
• How to allocate resources so that
  • Aggregate user behavior does not destabilize the system
  • Individuals can optimize for performance

Community Organizations
• What organizational frameworks best promote efficient and integrated Cyberinfrastructure?
• What are useful ways to resolve conflicts?
• How should decisions be made?
• How do we promote integration and coordination?
[Image: The North American Power Grid]

Translating Tools to Useful Infrastructure
What is the distribution and U/Pb zircon ages of A-type plutons in VA? How does it relate to host rock structures?
Data Integration in the Geosciences
[Figure: Data integration across a Geologic Map, GeoChemical, GeoPhysical, and GeoChronologic data, and a Foliation Map, via complex “multiple-worlds” mediation]

Integrating the “Data Stack”
Layers:
• Applications: Medical informatics, Biosciences, Ecoinformatics,…
• Visualization
• Data Mining, Simulation Modeling, Analysis, Data Fusion
• Knowledge-Based Integration, Advanced Query Processing
• Grid Storage, Filesystems, Database Systems
• High speed networking, Networked Storage (SAN), sensornets, instruments
• Storage hardware
Questions:
• How do we represent data, information and knowledge to the user?
• How do we detect trends and relationships in data?
• How do we obtain usable information from data?
• How do we collect, access and organize data?
• How do we combine data, knowledge and information management with simulation and modeling?
• How do we configure computer architectures to optimally support data-oriented computing?

Reliability
• Infrastructure must be there when you need it.
• How can communities ensure that
  • Data
  • Tools
  • Networks
  • Software
  and other resources are in good working order, and continue to enable new discovery?

Planning Ahead for Sustainable Cyberinfrastructure-enabled Communities
• From the beginning, Cyberinfrastructure must be designed with its beneficiaries in mind.
• Attention must be paid to
  • Social dynamics
  • Organization
  • Social impact, etc.
  as well as technical issues
• Long-term and strategic planning is critical
[Figure: Borromean Rings represent 3 key components of Cyberinfrastructure: Social Science, Computer Science and Engineering, Domain and Application Science. Credit: Dan Atkins]

Thank You
www.sdsc.edu