How would you give guidance or prioritize how to address gaps in the lifecycle of data acquisition, curation and preservation? Are there new programs or community opportunities?
The Fir Group
Kerstin Lehnert, John Graybeal, Dmitri Mozzherin, Vivian Hutchison, Giri Palanisamy, Eric Wolf,
Ron Weaver, Jan Peters, Walt Snyder, Mary Marlino, Cheryl Morris, Benjamin D Branch, Steve Tessler,
Lisa Raymond, Jeanine Aquino, Scott Jensen, Percy Donaghay, Dave Folker, Sze-Ling Celine Chan
Data Lifecycle
Acquisition – curation – preservation
Data lifecycle starts with PLANNING
Consider ‘use and re-use’ as part of the data lifecycle
Data Acquisition
Two different phases:
‘Data Creation’: When the data are generated: field, lab, computation, …
‘Data Gathering’: When the data reach the data system
What is the definition of the "data system"? Many data sources have a long path to the data system.
Difference between large science programs and small investigator-based projects
Legacy data vs. new data
Historical data submitted to the data archive years later: need to develop/submit metadata after the fact
Data Acquisition Gaps
Standards for acquisition that make ingestion more efficient
Incentives to submit data to archive
Metadata that ensure proper use and reuse
Infrastructure, tools
Need to define metadata standards, etc., that apply across all disciplines vs. domain-specific metadata
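
The split between cross-discipline and domain-specific metadata can be sketched as a record with a small required core plus an open extension. This is only an illustration: the class name `CoreMetadata`, the choice of core fields, and the `domain` dictionary are hypothetical and do not come from any standard the group named.

```python
from dataclasses import dataclass, field


@dataclass
class CoreMetadata:
    """Hypothetical core record; the field set is an assumption, not a standard."""
    title: str
    creator: str
    date_created: str  # ISO 8601 date string, e.g. "2011-05-01"
    description: str = ""
    # Domain-specific metadata lives in an open extension, outside the shared core.
    domain: dict = field(default_factory=dict)

    def missing_core_fields(self):
        """Return the names of required core fields that are empty."""
        required = ("title", "creator", "date_created")
        return [name for name in required if not getattr(self, name).strip()]


# Illustrative values only.
record = CoreMetadata(
    title="Stream temperature survey",
    creator="",
    date_created="2011-05-01",
    domain={"sensor_depth_m": 1.5},
)
print(record.missing_core_fields())
```

A check like `missing_core_fields` could run at ingestion time, so that gaps in the shared core are caught when the data reach the data system rather than years later.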
Data Curation Gaps
Lack of ability to discover the data being curated
Best Practices
Standards, e.g. uniform metadata
Funding for metadata collection and coordination
Communication process to gather requirements
Infrastructure
Ability to document provenance
Data Preservation Gaps
Funding for data preservation
Access to the ‘original’ (raw?) data
Access to software/algorithms used to process the data, i.e. metadata to reconstruct the data
Metadata that help users use and understand the data
Ability to reuse data
harvest information from reuse/repurposing in other contexts
Add value to data during analysis and cycle that back to the archive for others to benefit
Repositories for data
Guidance Needed
Plan
Partner with data management
Initial metadata in acquisition plan
Tools to assist with metadata entry and data management plans, provided by funding agencies (or funded by them for development of guidance)
How to know what not to keep (since we can't keep everything)
Possible Steps
* Best data practice(s) award
* NSF program managers instruct all review panels to evaluate the data management (DM) plan of every proposal, in such a way that reviewers realistically review DM resources
* Every data set has to have a DOI.
* Perform a community survey to determine what the data lifecycle looks like in different disciplines
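
As a rough illustration of the "every data set has to have a DOI" step, an archive could flag data sets whose identifier is missing or does not look like a DOI. The sketch below assumes a commonly used pattern for modern DOIs (prefix `10.`, a 4 to 9 digit registrant code, a slash, then a suffix). It checks syntax only and does not verify that the DOI resolves. The function names and the sample catalog are hypothetical.

```python
import re

# Assumption: a commonly used pattern for modern DOIs; not an exhaustive check.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/[-._;()/:A-Z0-9]+$", re.IGNORECASE)


def looks_like_doi(identifier: str) -> bool:
    """Return True if the string is syntactically shaped like a DOI."""
    return bool(DOI_PATTERN.match(identifier.strip()))


def datasets_without_doi(datasets: dict) -> list:
    """Return names of data sets whose DOI is missing or malformed."""
    return [
        name
        for name, doi in datasets.items()
        if not doi or not looks_like_doi(doi)
    ]


# Illustrative catalog; the names and identifiers are made up.
catalog = {
    "survey_a": "10.1234/example.dataset-1",
    "survey_b": None,
    "survey_c": "doi pending",
}
print(datasets_without_doi(catalog))
```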
To The Cloud…
http://etherpad.ooici.org/geodata-fir