Dealing with Data: Roles, Rights, Responsibilities & Relationships in the European Context Dr Liz Lyon, Director, UKOLN Associate Director, UK Digital Curation Centre ECRI2007, Hamburg,

Dealing with Data: Roles, Rights,
Responsibilities & Relationships
in the European Context
Dr Liz Lyon, Director, UKOLN
Associate Director, UK Digital Curation Centre
ECRI2007, Hamburg, June 2007.
UKOLN is supported by:
This work is licensed under a Creative Commons Licence
Attribution-ShareAlike 2.0
www.ukoln.ac.uk
A centre of expertise in digital information management
Overview
• Outcomes of a recent UK JISC-funded study carried out by UKOLN, University of Bath
– UK Institutions (repositories) and data centres
– Roles, rights, responsibilities, relationships
– High-level data-flow models
– Recommendations
• Positioned in the European context
– 8 perspectives from Strategy to Practice
– Examples of best practice
– Actions?
Strategy & Co-ordination
• Synthesis
– UK: funder support for data curation is variable
– Gaps in UK infrastructure support
– High level and strategic: build on ESFRI Roadmap
– Operational level and practical: data services & data centres
– Within and between institutions: in Member States
– Within and between disciplines: globally
• Actions?
– Datasets Mapping & Gap Analysis
– Data Curation & Preservation Strategy for Europe
– Data Audit Framework for institutions
– Data Networking Forum for data centre staff
Policy & Planning
• Synthesis
– UK: limited formal links between programme planning and
support infrastructure but examples of good practice
– Formal data policies are essential
– Web 2.0 influence: data sharing using social software
– Better joint planning for data management
• Actions?
– Funders should openly publish, implement and enforce a
Data Management, Preservation and Sharing Policy
– Research projects should submit a Data Management Plan
for peer-review
– Universities should implement an Institutional Data
Management, Preservation and Sharing Policy
January 2007
Data Management and Sharing Plan required “if creating or developing a resource for the research community as the primary goal” or “involve the generation of a significant quantity of data that could potentially be shared for added benefit”
NERC (Natural Environment Research Council) has:
• 7 designated data centres
• Published policy (under review)
• Data Management Co-ordinator
• Developing DataGrid
General Data Selection Criteria
• Usability
– Quality of data
– Usable data format
– Conditions of Use
– Reputable Author
– Documentation
• Usefulness
– Data quality
– Uniqueness of data
– Potential Strategic Use
– Usefulness of parameters
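The selection criteria above could be applied as a simple checklist. The following is a minimal sketch, not NERC's actual procedure: the criterion names are copied from the slide, while the function `unmet_criteria` and the idea of recording criteria as met or unmet are assumptions made for illustration.

```python
from __future__ import annotations

# Criterion names are taken from the slide; treating them as a
# met/unmet checklist is an assumption, not a documented NERC process.
USABILITY = (
    "Quality of data",
    "Usable data format",
    "Conditions of Use",
    "Reputable Author",
    "Documentation",
)
USEFULNESS = (
    "Data quality",
    "Uniqueness of data",
    "Potential Strategic Use",
    "Usefulness of parameters",
)


def unmet_criteria(met: set[str]) -> dict[str, list[str]]:
    """Return, per group, the criteria a dataset has not yet met (hypothetical helper)."""
    return {
        "Usability": [c for c in USABILITY if c not in met],
        "Usefulness": [c for c in USEFULNESS if c not in met],
    }


if __name__ == "__main__":
    # Example: a dataset assessed as good quality and documented, nothing else yet.
    gaps = unmet_criteria({"Quality of data", "Documentation"})
    for group, missing in gaps.items():
        print(group, "still to assess or meet:", ", ".join(missing))
```

A real selection process would probably weigh criteria rather than tick them off; the checklist form is used here only because the slide presents the criteria as flat lists.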
Practice
• Synthesis
– Data capture automatically at source from instruments,
in the lab, in the field
– Not much data in Institutional Repositories (IR)…. yet?
– Integrated architectures linking IRs and datacentres
– Models for sharing data?
– Barriers: lack of awareness, resistance to change
– Level of re-use of data?
• Actions?
– Data capture as part of end-to-end research workflow
– Evaluate re-purposing of datasets: identify the
significant properties which facilitate re-use
– Develop Disciplinary Case Studies
Technical Integration and Interoperability
• Synthesis
– Data are highly complex and diverse
– Data discovery to delivery
– Standards, standards, standards, standards….
– Value of generic data models, metadata application profiles?
• Actions?
– Identifiers and data citation best practice
– Version control of datasets
– Annotation models and standards best practice
– Bi-directional interdisciplinary linking between data objects and derived resources
– Existing projects?
Microarray data to inform gene expression
• Consensus on community standards (MIAME)
• Data pipelines at source via Laboratory Information Management Systems (LIMS)
• User tools (MIAMExpress) & value-added services
• Annotation of data using the Gene Ontology
• Submission & deposit is embedded in community culture: requirement for publication
• Training programme, eLearning materials coming
This level of data curation is expensive!!
• EMBL-Bank: DNA sequences
• Reactome
• Array-Express: Microarray Expression Data
• UniProt: Protein Sequences
• EnsEMBL: Genome Annotation
• IntAct: Protein Interactions
• EMSD: Macromolecular Structure Data
Source: Graham Cameron, EBI
[Diagram. Core biomolecular resources, shown together with:]
• Large resources in related disciplines: medical data resources, biodiversity data resources, chemical data resources
• Specialist biomolecular data resource examples: BRENDA, IMGT, Pasteur DBs
• Model organism resource examples: SGD, Flybase, MGD, Mouse Atlas, Eumorphia/Phenotypes, Mutants
Source: Graham Cameron, EBI
Domain Data Deposit Model
[Diagram. Actors: Funder, Scientist, Domain Data Centre, Publisher, User. Activities: Create, Deposit, Curate, Preserve, Collaborate, Share (blogs, wikis), Link, Discover, Re-use. Supporting elements: Policy & Advocacy, Community standards, Standards, Training.]
© Liz Lyon (UKOLN, University of Bath) 2007
Institutions: eCrystals Federation (eBank Project)
[Diagram. Components: Data creation & capture in “Smart lab”; Data analysis; Laboratory repository; Institutional data repositories; Subject Repository; Aggregator services; Presentation services / portals; Publishers (peer-review journals, conference proceedings, etc); Institution Library & Information Services. Flows: Deposit; Validation; Publication; Search, harvest; Data discovery, linking, citation; Curation; Preservation.]
Federation Data Deposit Model
[Diagram. Actors: Funder, Scientist, IR, Federation, Aggregator?, Data Centre, Publisher, User. Activities: Create, Deposit, Harvest, Curate, Preserve, Collaborate, Share (blogs, wikis), Link, Discover, Re-use. Supporting elements: Policy, Advocacy, Training, Standards.]
Legal and Ethical Issues
• Synthesis
– IPR is a barrier to data sharing e.g. geospatial data, performing arts
– We need a better understanding of the issues
• Actions?
– Provide enhanced advice about data and IPR in Member States
– Develop model licences with other organisations
Sustainability
• Synthesis
– Are current economic models for preservation & data sharing
infrastructure a) appropriate? b) adequate? c) sustainable?
– Should inform research prioritisation and investment
• Actions?
– Cost-benefit study
– Construct new economic models
Advocacy
• Synthesis
– Programmes need to reach across sectors
– Harmonisation and consistent messages
– Researcher has some curatorial responsibility
• Actions?
– Identify co-ordinating body and target at specific disciplines
UK Digital Curation Centre: http://www.dcc.ac.uk/
Training and Skills
• Synthesis
– Leverage library & archive experience, EU projects DPE and PLANETS
– Data curators and “native data scientists”
• Actions?
– Co-ordination: pan-European level
– Review career development of data scientists
– Assess value of data handling and curation in the curriculum
Scientist: creation and use of data
• Rights
– Of first use.
– To be acknowledged.
– To expect IPR to be honoured.
– To receive data training and advice.
• Responsibilities
– Manage data for life of project.
– Meet standards for good practice.
– Comply with funder / institutional data policies and respect IPR of others.
– Work up data for use by others.
• Relationships
– With institution as employee.
– With subject community.
– With data centre.
– With funder of work.
Image: Baroness Susan Greenfield, UK
Institution: curation of and access to data
• Rights
– To be offered a copy of data.
• Responsibilities
– Set internal data management policy.
– Manage data in the short term.
– Meet standards for good practice.
– Provide training and advice to support scientists.
– Promote the repository service.
• Relationships
– With scientist as employer.
– With data centre through expert staff.
Image: http://www.flickr.com/photos/nrparmar/383549700/in/pool-bath-uni/
Data centre: curation of and access to data
• Rights
– To be offered a copy of data.
– To select data of long-term value.
• Responsibilities
– Manage data for the long-term.
– Meet standards for good practice.
– Provide training for deposit.
– Promote the repository service.
– Protect rights of data contributors.
– Provide tools for re-use of data.
• Relationships
– With scientist as “client”.
– With user communities.
– With institution through expert staff.
– With funder of service.
User: use of 3rd party data
• Rights
– To re-use data (non-exclusive licence).
– To access quality metadata to inform usability.
• Responsibilities
– Abide by licence conditions.
– Acknowledge data creators / curators.
– Manage derived data effectively.
• Relationships
– With data centre as supplier.
– With institution as supplier.
Image: GridPP computing facilities at Imperial College, London
Funder: set/react to public policy drivers
• Rights
– To implement data policies.
– To require those they fund to meet policy obligations.
• Responsibilities
– Consider wider public-policy perspective & stakeholder needs.
– Participate in strategy co-ordination.
– Develop policies with stakeholders.
– Participate in policy co-ordination, joint planning & fund service delivery.
– Monitor and enforce data policies.
– Resource post-project long-term data management.
– Act as advocate for data curation & fund expert advisory service(s).
– Support workforce capacity development of data curators.
• Relationships
– With scientist as funder.
– With institution.
– With data centre as funder.
– With other funders.
– With other stakeholders as policy-maker and funder of services.
Publisher: maintain integrity of the scientific record
• Rights
– To expect data are available to support publication.
– To request pre-publication data deposit in long-term repository.
• Responsibilities
– Engage stakeholders in development of publication standards.
– Link to data to support publication standards.
– Monitor & enforce publication standards.
• Relationships
– With scientist as creator, author and reader.
– With data centres and institutions as suppliers.
Dealing with Data Report
will be published shortly at
www.ukoln.ac.uk