ODE Opportunities for Data Exchange


Josefine Nordling
CSC – IT Center for Science
LIBER 41st Annual Conference
27th of June 2012
Content Outline
• Introduction
• Stakeholder groups
• Objectives
• Phases of data re-use
• Work phases
• Key findings
• Data pyramids
• Final words
Background
• An FP7 project proposed by APA
• 9 partners: European Organization for Nuclear
Research (CERN, coordinator), Alliance for
Permanent Access (APA), Helmholtz Association
(HA), UK Science and Technology Facilities Council
(STFC), British Library (BL), Association of
European Research Libraries (LIBER), German
National Library (DNB), International Association
of Scientific, Technical and Medical Publishers
(STM) & IT Center for Science (CSC)
• Started 01/11/2010, ends 30/11/2012 (PM 1-25)
Stakeholder Groups
• 5 stakeholder groups:
  - Libraries
  - Data Centres
  - Policy Makers & Funders
  - Publishers
  - Data Producers/Owners
How stakeholders interact
[Diagram: Research Institutes, Researcher, Publishers, Libraries and Data Centres]
General objectives
• Best practices in data sharing, re-use,
preservation and citing
• Emerging best practices & lessons learned, but
also “success stories”, “near misses” &
“honourable failures”
• Challenges, drivers, barriers & enablers
Concrete objectives
• Evidence gathering enabling/providing:
  - Key players to compare visions and explore
    shared opportunities
  - Different perspectives on data re-use
  - Improved understanding of best practices
    within RDM – more coherent national policies
    and wider implementation of e-Infrastructure
  - Information available for Horizon 2020
A vocabulary for data re-use
[Diagram labels: Research Strategy; Preservation Business Case; Project Funding; Preservation Planning; Data collection/simulation; Prearchive phase; Data Analysis; Scientific Publication; Data Preservation; Social & Economic Impact; Creation; Discover data; Access data]
Talking, listening, engaging, influencing
• Communication with relevant stakeholder
groups – visibility for ODE
• Forum for all target audiences – policy
discussions & comparing visions
• Collaborations between projects – input and
feedback
• PR materials
Data sharing today
• Develop a broad understanding of the overall
issues to be addressed by ODE
• Identifying “success stories”, “near misses” and
“honourable failures” by conducting 21
interviews, covering:
  - Attitudes within different scientific communities
    at national and international level
  - Researchers’ access to e-Infrastructures
• Ten tales of drivers and barriers in data sharing
Data enters scholarly communication
• The impact of data sharing, re-use and
preservation on scholarly communication
• Publishers’ role: stricter editorial policies,
enhancing articles, guidelines etc.
• Integration of datasets and publications –
libraries & data centres
• Informal interviews (researchers, authors,
editors, readers, data centres and libraries) &
surveys of libraries (110 responses)
Drivers and barriers: questions and
answers
• Inform stakeholders of drivers and barriers to
data sharing
• Extension of data sharing beyond the
Member States
• Researchers’ benefits from data re-use – mapping
the stakeholders willing to enable this
• Revision of statements through consultation with
experts (workshops, interviews, structured
methods)
• Identify a set of key findings
The future of e-Infrastructures for data
sharing
• “To demonstrate the value of information
gathered and distil the results from the two
conferences and the various areas
investigated in previous work packages in
order to ensure that each of the project’s
target audiences can make informed decisions
about the future of e-Infrastructures for data
sharing and preservation.”
The future of e-Infrastructures for data
sharing (continued)
• Categorisation of key findings – support e-Infrastructure, describe possibilities and impact
of data sharing, re-use and preservation
• The roles of data in the future
• Publications on the findings tailored to each
stakeholder group – gathering together previous
results
• Still ahead: preparation of a thematic publication
and a final report
Challenges
• Delivery of information on benefits of data
• More training needed for researchers within RDM
• More cross-cutting international discussions are
needed
• The costs of data availability and re-use must be covered,
also after a project’s end
• Confidential and sensitive data requires specific
access controls
• The data deluge in itself
Drivers
• Increased impact if data is used and cited by other
researchers
• Publishers are developing collaborations with
researchers and data centres
• Data regeneration is far more expensive than data
preservation
• Many publishers support data hosting and data linking
services
• Re-use of data in meta-studies to find hidden trends
• Authors are increasingly using publishers’ data services
Barriers
• Researchers’ hesitation to publish and share their data
• Patenting issues
• Lack of investment in libraries to support
development within RDM
• Publishing supplementary data alongside articles
is expensive
• National reluctance to invest in global data
infrastructures
• Federal, national and institutional restrictions due to
strategic interests
Enablers
• Citation and recognition frameworks
• Clear instructions on data citation
• Easy processes for submission of data – lowering
the barriers for researchers
• Joint functions with scholarly communication
• Working closely with researchers with
encouraging motives
• Engaging in establishing uniform data citation
standards
Enablers (continued)
• Expert knowledge for setting ground rules for
data re-use
• Acting based on requirements of the research
community
• Preservation of data to ensure continued
access to linked data
• Support for cross-links between publications and
datasets
The Pyramid’s likely short-term reality:
Pyramid levels, top to bottom:
• Publications with Data
• Processed & Represent. Data
• Data Archives
• Data on Disks and in Drawers
Annotations:
(1) Top of the pyramid is stable but small
(2) Risk that supplements to articles turn into data-dumping places
(3) Too many disciplines lack a community-endorsed data archive
(4) Estimates are that at least 75 % of research data is never made openly available
The Ideal Pyramid
Pyramid levels, top to bottom:
• Data in Publications
• Article Supps
• Data Archives
• Data on Disks and in Drawers
Annotations:
(1) More integration of text and data, viewers and seamless links to interactive datasets
(2) Only if data cannot be integrated in article, and only relevant extra explanations
(3) Seamless (bidirectional) links between publications and data, interactive viewers within the articles
(4) More Data Journals that describe datasets, data management plans and data methods
Lastly
• Slowly moving in the right direction towards the
“best ways” of engaging in RDM
• Emerging awareness throughout the community
• Data centres, libraries and publishers are keen on
developing their services
• More and more collaborations are taking place
• Next step: convincing the researchers of the
benefits of publishing, sharing and re-using data
• http://www.ode-project.eu/
Thank You!
Josefine Nordling
Project Coordinator, CSC
[email protected]