Geo-Data Informatics: Exploring the Life Cycle, Citation and Integration of Geo-Data Dr. Timothy Killeen Assistant Director for Geosciences March 2, 2011 With thanks to: Cliff Jacobs.


Geo-Data Informatics: Exploring
the Life Cycle, Citation and
Integration of Geo-Data
Dr. Timothy Killeen
Assistant Director for Geosciences
March 2, 2011
With thanks to: Cliff Jacobs and Eva Zanzerkia
Talk Outline
• The Challenges
• Responding to the Challenge
• GEO roadmap
• Partnerships
• Workshop Context
Administration Priorities
“We need to out-innovate, out-educate, and out-build the rest of the world.”
President Barack Obama
State of the Union Address
January 25, 2011
Priorities
• Presidential Priorities
– Protecting our nation from the serious economic
and strategic risks associated with our reliance
on foreign oil and the destabilizing effects of a
changing climate
– Advancing energy and climate security via
promoting economic recovery efforts,
accelerating job creation, and driving clean energy
manufacturing
• Priority Guidance for NSF (from NSB):
The National Science Foundation (NSF)
should continue to increase emphasis on
innovation in sustainable energy
technologies and education as a top
priority.
From presentation by Shere Abbott, Associate Director for Environment, 1 February 2010
The Challenge:
Science and Society Are Transformed by Data
• Modern science
– Data- and compute-intensive
– Integrative, multiscale
• Multi-disciplinary Collaborations for Complexity
– Individuals, groups, teams, communities
• “Sea of Data”
– “Age of Observation”
– Distributed, central repositories, sensor-driven, diverse, etc.
Research Vessel Sikuliaq
Era of Observation: Arctic Sea Ice
Era of Observation: Oceans
Era of Observation: National Ecological Observatory Network
Era of Observation: Water
Era of Observation: Satellites
NCAR-Wyoming Supercomputer Center
Opening, June 2012
>1 Petaflop, 150 petabyte. LEED Gold
Era of Simulation
Data Challenges
[Figure: chart with the labels “Volume of data (bytes per day)”, “Distribution of data”, and “Interoperability of Data”; scale marks at Giga, Tera, Peta, and Exa bytes; years 2012, 2016, and 2020; plotted sources include Square Kilometer Array, Climate/Environment, Genomics, Blue Waters, LHC, and LSST.]
New thrust outlined in the FY 2012 President’s Budget Request
NSF RESPONDS
FY12 NSF Budget Request has twin Foci on:
Sustainability and CyberInfrastructure
• NSF is well positioned to contribute to the
Administration’s priorities through basic and
semi-applied research
• Budget thrusts into Sustainability and
Cyberinfrastructure are interconnected and
can be accelerated through technological
innovation in Geoinformatics
• Geoscientists must play a leading role
Science, Engineering, and Education
for Sustainability (SEES)
• Goal: Generate discoveries and build capacity to achieve an environmentally and economically sustainable future
• FY 2012 priorities:
– Advance a clean energy future
– Nurture the emerging SEES workforce
– Expand research, education, and knowledge dissemination
– Engage with global partners
• Environment, energy, and economy nexus
• Increase of $338 million over FY 2010 enacted level (GEO increase $87.2M)
[Diagram labels: SEES, Economy, Environment, Energy]
SEES – Geosciences Foci
• Sustainable Energy Pathways
– characterize and understand existing energy systems
and their limitations (e.g. wind, geothermal, hydro)
– understand risks and stressors associated with new and
emerging energy sources (e.g. tidal, clean coal, carbon
sequestration)
• Sustainability Research Networks
– interdisciplinary research and education partnerships
involving government, academe, and the private sector
– address fundamental issues of use in improving policy
and practices with regard to energy, the environment,
and human well-being
Cyberinfrastructure Framework for 21st
Century Science and Engineering (CIF21)
• Comprehensive and integrated
cyberinfrastructure to transform research,
innovation and education
• Focus on computational and data-intensive
science to address complex problems
• Four major components
– Data-enabled science
– New computational infrastructure
– Community research networks
– Access and connections to cyberinfrastructure facilities
• Increase of $117 million over FY 2010 enacted level
(GEO increase $16M)
Broad Principles to Lead CIF21
• Builds and sustains national infrastructure for
Science, Engineering and Education
• Leverages common methods, approaches, and
applications – focus on interoperability
• Catalyzes other CI investments across NSF
– Provides focus and is a vehicle for coordinating
efforts and programs
– Is a “force multiplier” across NSF
• Shared governance; embedded into every NSF
directorate and office
Thrust Areas for CIF21 in FY12
• Community Research Networks
• Data-Enabled Science
• New Computational Resources
• Access and Connections to CI Resources
Education: integral and embedded
A vision for the future
GEO ROADMAP FOR CIF21
GEO Will Build on a Substantial Investment in CI
• NSF Budget (FY 2010): $6,926.5 M
• Geosciences (GEO) Budget: $889.64 M
• GEO 2010 investments in CI: ~$103 M
New investment: NCAR/Wyoming Super Computer Center
FY 2012 With Partners in Wyoming:
– Center Construction: ~$70M
– Computer Systems: ~$30M
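The budget figures on this slide can be put in proportion with a little arithmetic. The sketch below is illustrative only: it takes the three FY 2010 numbers from the slide, treats the approximate CI figure as exact, and assumes that the CI investment is counted within the GEO budget and the GEO budget within the NSF total. The variable and function names are hypothetical.

```python
# Figures copied from the slide, in millions of US dollars (FY 2010).
NSF_BUDGET_M = 6926.5
GEO_BUDGET_M = 889.64
GEO_CI_M = 103.0  # the slide gives this only approximately (~$103 M)


def share_percent(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


# Assumption: each figure is a subset of the one above it.
geo_share_of_nsf = share_percent(GEO_BUDGET_M, NSF_BUDGET_M)
ci_share_of_geo = share_percent(GEO_CI_M, GEO_BUDGET_M)

print(f"GEO share of NSF budget: {geo_share_of_nsf:.1f}%")
print(f"CI share of GEO budget: {ci_share_of_geo:.1f}%")
```

Under those assumptions, GEO is a little under 13% of the NSF budget, and CI is between 11% and 12% of GEO spending.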
NCAR-WY Supercomputer Center
DATA-INTENSIVE COMPUTING
The new computing facility, capable of
more than 1 PetaFLOP, will be available
in June 2012 and will be designed for
Data-Intensive Computing
FOCUS ON SUSTAINABILITY
Maximum energy efficiency,
LEED Gold certification, and
achievement of the smallest
possible carbon footprint are all
goals of the NWSC project.
Overall Vision:
Building on the Internet Paradigm: An “Earth Cube”
Internet for
interoperability
Interworkability
for collaboration
• The Internet provided a knowledge system that transformed the modality of science
• CIF21 investments must provide a framework of integrated and interactive services
that support understanding and prediction of the Earth system as a whole
Elements of This Framework
• Creates infrastructure of integrated and interactive services
– transcend fields and accelerate discovery of a complex, multi-scale Earth
System
• Creates an interoperable digital access infrastructure
framework
– Provides a network that is open, extensible and sustainable
– Includes Observations, Simulations, Collaborations, and Sharing of
information
• Facilitates data transfer from the field into data systems and
applications
• Integrates research and education
– Training paradigms and new modes of learning and training to establish a
GEO-savvy workforce and broader participation in understanding a
sustainable Earth system
CIF21 – Geosciences Foci
• GEO planned investment is $16 million in 2012
• New Computational Infrastructure
– New and enhanced computational platforms, tools, data centers to analyze,
manipulate, visualize, and share large and complex data sets
• Data Enabled Science
– Geoinformatics – a framework for open and easy access to all geoscience data
– Hardware, software, and human capital infrastructure to increase the
interoperability and interworkability of geosciences (and other domain) data sets.
• Connection to Facilities
– Infrastructure for sharing of observational data
– Technology to retrieve data from the field
• Networks
– Sustained educational and training programs to create a computationally savvy
workforce and serve multi-disciplinary science.
Multiple Modes of Support Are Necessary to
Create and Sustain CIF21 Infrastructure
• “Modes of support” that are essential to build CIF21
infrastructure and to engage in CIF21 activities.
– Focused grants to individual PIs or small groups
– Focused programs that are community driven
– Small centers
– Large national centers
– Cyber-enhanced field programs
– Cyber-enhanced observing facilities and MREFC projects
– NSF-wide initiatives
– Education, outreach, and training activities (EOT)
Leadership Team Presentation
The Spiral Development of CIF21 Infrastructure
[Diagram labels: Connected Facilities, New Computational Infrastructure, Networks, Data-Enabled Science]
The Vision of GEODATA can only be achieved through National
and International partnerships
PARTNERSHIPS
Sustaining Partnerships
• Geosciences researchers and educators depend
on data supported by US Gov. agencies as well as
other nations
• Productivity of researchers benefits from
cooperation and collaborations among academics,
government agencies, private industry, and
international research enterprises
New 10-Year International Effort Planned
to substantially advance discussions and directions of data life
cycle, data integration and data citation
WORKSHOP CONTEXT
(GEO-DATA INFORMATICS: EXPLORING THE LIFE CYCLE, CITATION
AND INTEGRATION OF GEO-DATA)
Workshop Foci within NSF Context
• Addressing the full lifecycle of data
– NSF encourages scientists to consider the full lifecycle of all data
of interest to the geosciences and to identify best practices and
common solutions.
• A pathway to establish Centers and Networks of
Excellence (real or virtual)
– Communities of practice can help advance interoperability, data-sharing, and trans-disciplinary research.
• Strengthening partnerships
– Federal agencies, international organizations, non-governmental
organizations, and industry are essential for the long-term
success of geosciences.
A Challenge to the Workshop:
Give us at least three ways to answer this hypothetical
question a year from now:
“You told us that this new cyberinfrastructure investment
would transform both the practice of science and
engineering itself and lead to significant advances in
knowledge and understanding – advances that would not
have happened otherwise – can you give us some
concrete examples of this?”
Where discoveries begin