Why is Digital Curation Important for Workforce and Economic Development? Alan Blatecky Office of Cyberinfrastructure Symposium on Digital Curation in the Era of Big.


Why is Digital Curation Important for Workforce
and Economic Development?
Alan Blatecky
Office of Cyberinfrastructure
Symposium on Digital Curation in the Era of Big Data;
Career Opportunities and Education Requirements
NRC
July 19, 2012
Framing the Challenge:
Science and Society Transformed by Data

Modern science
- Data- and compute-intensive
- Integrative, multiscale

Multi-disciplinary Collaborations for Complexity
- Individuals, groups, teams, communities

Sea of Data
- Age of Observation
- Distributed, central repositories, sensor-driven, diverse, etc.
Data as a transforming agent

Enormous amounts of data are being generated by
modern experiments, sensors, observations and
social networks

The development of new analysis tools, including automatic
extraction of new knowledge, continues to accelerate

Infusion of data-intensive computation into science,
engineering and education is revolutionizing
research

Research in science and education is an essential
pathway to prosperity and competitiveness and
thrives in an environment of shared data
Scientific Data Challenges
[Figure: projects plotted by data volume/growth (bytes per day, on a scale from gigabytes through terabytes and petabytes to exabytes) against distribution and data access, for 2012 and 2020. Labeled entries: Square Kilometer Array; Climate, Environment; Genomics; TeraGrid, Blue Waters; LHC; LSST; many smaller datasets…]
NSF Data strategy




Establish a national data infrastructure to support
science, engineering and education
Ensure that this infrastructure stays at the most
advanced state of sophistication and is sustainable
Expand the development of the next generation of
compute- and data-intensive workforce
Develop a suite of policies to support the full data
life cycle (data access, curation, object identifiers, etc.)
Data Web Forum
The DWF will facilitate the exchange and interoperability
of data across disciplines and national boundaries by
producing high-quality, relevant technical documents that
influence the way people store, use, and manage data

Linking top-down governance model with bottom-up IETF model
to catalyze this community-based activity
- Top-down focus on policy, permission . . .
- Bottom-up focus on operations, services . . .


Timeliness an important factor
Ability to respond quickly essential
DWF Principles
- Balanced representation of stakeholder communities
- Community-based; not a government organization, a
regulatory body or a commercial organization
- Products are free and open source
- Meetings are public, progress through consensus and
practice
- Focus on harmonization across standards, policies,
technologies, tools, and other data infrastructure
elements
Proposed Timeline: 2012




Initial Government Agency funding in process
Awards to Non-Government Organizations in Aug/Sep
Teams flesh out organization and structure – Sep/Dec
Initial working groups identified and charged – Sep/Dec
- Groups already sharing data across global boundaries
- Identify candidates for early deliverables, best practices
- Secure time-commitments to undertake proposed activities