Geo-Data Informatics: Exploring the Life Cycle, Citation and Integration of Geo-Data Dr. Timothy Killeen Assistant Director for Geosciences March 2, 2011 With thanks to: Cliff Jacobs.
Geo-Data Informatics: Exploring the Life Cycle, Citation and Integration of Geo-Data
Dr. Timothy Killeen, Assistant Director for Geosciences
March 2, 2011
With thanks to: Cliff Jacobs and Eva Zanzerkia

Talk Outline
• The Challenges
• Responding to the Challenge
• GEO roadmap
• Partnerships
• Workshop Context

Administration Priorities
“We need to out-innovate, out-educate, and outbuild the rest of the world.”
President Barack Obama, State of the Union Address, January 25, 2011

Priorities
• Presidential Priorities
  – Protecting our nation from the serious economic and strategic risks associated with our reliance on foreign oil and the destabilizing effects of a changing climate
  – Advancing energy and climate security via promoting economic recovery efforts, accelerating job creation, and driving clean energy manufacturing
• Priority Guidance for NSF (from NSB): The National Science Foundation (NSF) should continue to increase emphasis on innovation in sustainable energy technologies and education as a top priority.
From presentation by Shere Abbott, Associate Director for Environment, 1 February 2010

The Challenge: Science and Society is Transformed by Data
• Modern science
• Multi-disciplinary Collaborations for Complexity
• Data- and compute-intensive
• Integrative, multiscale
• Individuals, groups, teams, communities
• “Sea of Data”
• “Age of Observation”
• Distributed, central repositories, sensor-driven, diverse, etc.

Research Vessel Sikuliaq
Era of Observation: Arctic Sea Ice
Era of Observation: Oceans
Era of Observation: National Ecological Observatory Network
Era of Observation: Water
Era of Observation: Satellites

NCAR-Wyoming Supercomputer Center Opening, June 2012: >1 Petaflop, 150 petabyte.
LEED Gold

Era of Simulation

Data Challenges
[Figure. Axis labels: Bytes per day (Giga Bytes, Tera Bytes, Peta Bytes, Exa Bytes); 2012, 2016, 2020. Data labels: Genomics, LHC, LSST, Blue Waters, Climate/Environment, Square Kilometer Array. Panel labels: Volume of data, Distribution of data, Interoperability of Data.]

New thrust outlined in the FY 2012 President’s Budget Request
NSF RESPONDS

FY12 NSF Budget Request has twin Foci on: Sustainability and CyberInfrastructure
• NSF is well positioned to contribute to the Administration’s priorities through basic and semi-applied research
• Budget thrusts into Sustainability and Cyberinfrastructure are interconnected and can be accelerated through technological innovation in Geoinformatics
• Geoscientists must play a leading role

Science, Engineering, and Education for Sustainability (SEES)
• Goal: Generate discoveries and build capacity to achieve an environmentally and economically sustainable future
• FY 2012 priorities:
  – Advance a clean energy future
  – Nurture the emerging SEES workforce
  – Expand research, education, and knowledge dissemination
  – Engage with global partners
• Environment, energy, and economy nexus
• Increase of $338 million over FY 2010 enacted level (GEO increase $87.2M)
[Diagram labels: SEES, Environment, Energy, Economy]

SEES – Geosciences Foci
• Sustainable Energy Pathways
  – characterize and understand existing energy systems and their limitations (e.g. wind, geothermal, hydro)
  – understand risks and stressors associated with new and emerging energy sources (e.g.
    tidal, clean coal, carbon sequestration)
• Sustainability Research Networks
  – interdisciplinary research and education partnerships involving government, academe, and the private sector
  – address fundamental issues of use in improving policy and practices with regard to energy, the environment, and human well-being

Cyberinfrastructure Framework for 21st Century Science and Engineering (CIF21)
• Comprehensive and integrated cyberinfrastructure to transform research, innovation and education
• Focus on computational and data-intensive science to address complex problems
• Four major components
  – Data-enabled science
  – New computational infrastructure
  – Community research networks
  – Access and connections to cyberinfrastructure facilities
• Increase of $117 million over FY 2010 enacted level (GEO increase $16M)

Broad Principles to Lead CIF21
• Builds and sustains national infrastructure for Science, Engineering and Education
• Leverages common methods, approaches, and applications – focus on interoperability
• Catalyzes other CI investments across NSF
  – Provides focus and is a vehicle for coordinating efforts and programs
  – Is a “force multiplier” across NSF
• Shared governance; embedded into every NSF directorate and office

Thrust Areas for CIF21 in FY12
• Community Research Networks
• Data-Enabled Science
• Education: integral and embedded
• New Computational Resources
• Access and Connections to CI Resources

A vision for the future
GEO ROADMAP FOR CIF21

GEO Will Build on a Substantial Investment in CI
• NSF Budget (FY 2010): $6,926.5 M
• Geosciences (GEO) Budget: $889.64 M
• GEO 2010 investments in CI: ~$103 M

New investment: NCAR/Wyoming Super Computer Center
FY 2012
With Partners in Wyoming:
• Center Construction: ~$70M
• Computer Systems: ~$30M

NCAR-WY Supercomputer Center
DATA-INTENSIVE COMPUTING
The new computing facility, capable of more than 1 PetaFLOP, will be available in June 2012 and will be designed for Data-Intensive Computing.
FOCUS ON SUSTAINABILITY
Maximum energy efficiency, LEED Gold certification, and achievement of the smallest possible carbon footprint are all goals of the NWSC project.

Overall Vision: Building on the Internet Paradigm: An “Earth Cube”
Internet for interoperability
Interworkability for collaboration
• The Internet provided a knowledge system that transformed the modality of science
• CIF21 investments must provide a framework of integrated and interactive services that support understanding and prediction of the Earth system as a whole

Elements of This Framework
• Creates infrastructure of integrated and interactive services
  – transcend fields and accelerate discovery of a complex, multi-scale Earth System
• Creates an interoperable digital access infrastructure framework
  – Provides a network that is open, extensible and sustainable
  – Includes Observations, Simulations, Collaborations, and Sharing of information
• Facilitates data transfer from the field into data systems and applications
• Integrates research and education
  – Training paradigms and new modes of learning and training to establish a GEO-savvy workforce and broader participation in understanding a sustainable Earth system

CIF21 – Geosciences Foci
GEO planned investment is $16 million in 2012
• New Computational Infrastructure
  – New and enhanced computational platforms, tools, and data centers to analyze, manipulate, visualize, and share large and complex data sets
• Data Enabled Science
  – Geoinformatics: a framework for open and easy access to all geoscience data
  – Hardware, software, and human capital infrastructure to increase the interoperability and interworkability of geosciences (and other domain) data sets
• Connection to Facilities
  – Infrastructure for sharing of observational data
  – Technology to retrieve data from the field
• Networks
  – Sustained educational and training programs to create a computationally savvy workforce and serve multi-disciplinary science

Multiple Modes of Support Are Necessary to Create and Sustain CIF21 Infrastructure
• “Modes of support” that are essential to build CIF21 infrastructure and to engage in CIF21 activities:
  – Focused grants to individual PIs or small groups
  – Focused programs that are community driven
  – Small centers
  – Large national centers
  – Cyber-enhanced field programs
  – Cyber-enhanced observing facilities and MREFC projects
  – NSF-wide initiatives
  – Education, outreach, and training activities (EOT)
Leadership Team Presentation

The Spiral Development of CIF21 Infrastructure
[Diagram labels: Connected Facilities, New Computational Infrastructure, Networks, Data-Enabled Science]

The Vision of GEODATA can only be achieved through National and International partnerships
PARTNERSHIPS

Sustaining Partnerships
• Geosciences researchers and educators depend on data supported by US Gov. agencies as well as other nations
• Productivity of researchers benefits from cooperation and collaborations among academics, government agencies, private industry, and international research enterprises

New 10-Year International Effort

Planned to substantially advance discussions and directions of data life cycle, data integration and data citation
WORKSHOP CONTEXT (GEO-DATA INFORMATICS: EXPLORING THE LIFE CYCLE, CITATION AND INTEGRATION OF GEO-DATA)

Workshop Foci within NSF Context
• Addressing the full lifecycle of data
  – NSF encourages scientists to consider the full lifecycle of all data of interest to the geosciences and to identify best practices and common solutions.
• A pathway to establish Centers and Networks of Excellence (real or virtual)
  – Communities of practice can help advance interoperability, data sharing, and trans-disciplinary research.
• Strengthening partnerships
  – Federal agencies, international organizations, non-governmental organizations, and industry are essential for the long-term success of geosciences.

A Challenge to the Workshop:
Give us at least three ways to answer this hypothetical question a year from now:
“You told us that this new cyberinfrastructure investment would transform both the practice of science and engineering itself and lead to significant advances in knowledge and understanding – advances that would not have happened otherwise – can you give us some concrete examples of this?”

Where discoveries begin