Dealing with Data: Roles, Rights, Responsibilities & Relationships in the European Context
Dr Liz Lyon, Director, UKOLN; Associate Director, UK Digital Curation Centre
ECRI2007, Hamburg, June 2007
This work is licensed under a Creative Commons Licence Attribution-ShareAlike 2.0
www.ukoln.ac.uk: A centre of expertise in digital information management

Overview
• Outcomes of a recent UK JISC-funded study carried out by UKOLN, University of Bath
– UK institutions (repositories) and data centres
– Roles, rights, responsibilities, relationships
– High-level data-flow models
– Recommendations
• Positioned in the European context
– 8 perspectives from Strategy to Practice
– Examples of best practice
– Actions?

Strategy & Co-ordination
• Synthesis
– UK: funder support for data curation is variable
– Gaps in UK infrastructure support
– High level and strategic: build on ESFRI Roadmap
– Operational level and practical: data services & data centres
– Within and between institutions: in Member States
– Within and between disciplines: globally
• Actions?
– Datasets Mapping & Gap Analysis
– Data Curation & Preservation Strategy for Europe
– Data Audit Framework for institutions
– Data Networking Forum for data centre staff

Policy & Planning
• Synthesis
– UK: limited formal links between programme planning and support infrastructure, but examples of good practice
– Formal data policies are essential
– Web 2.0 influence: data sharing using social software
– Better joint planning for data management
• Actions?
– Funders should openly publish, implement and enforce a Data Management, Preservation and Sharing Policy
– Research projects should submit a Data Management Plan for peer review
– Universities should implement an Institutional Data Management, Preservation and Sharing Policy

January 2007
Data Management and Sharing Plan required “if creating or developing a resource for the research community as the primary goal” or “involve the generation of a significant quantity of data that could potentially be shared for added benefit”

Natural Environment Research Council
NERC has:
• 7 designated data centres
• Published policy (under review)
• Data Management Co-ordinator
• Developing DataGrid

General Data Selection Criteria (Natural Environment Research Council)
• Usability
– Quality of data
– Usable data format
– Conditions of Use
– Reputable Author
– Documentation
• Usefulness
– Data quality
– Uniqueness of data
– Potential Strategic Use
– Usefulness of parameters

Practice
• Synthesis
– Data capture automatically at source from instruments, in the lab, in the field
– Not much data in Institutional Repositories (IR)... yet?
– Integrated architectures linking IRs and data centres
– Models for sharing data?
– Barriers: lack of awareness, resistance to change
– Level of re-use of data?
• Actions?
– Data capture as part of end-to-end research workflow
– Evaluate re-purposing of datasets: identify the significant properties which facilitate re-use
– Develop Disciplinary Case Studies

Technical Integration and Interoperability
• Synthesis
– Data are highly complex and diverse
– Data discovery to delivery
– Standards, standards, standards, standards...
– Value of generic data models, metadata application profiles?
• Actions?
– Identifiers and data citation best practice
– Version control of datasets
– Annotation models and standards best practice
– Bi-directional interdisciplinary linking between data objects and derived resources
– Existing projects?

Microarray data to inform gene expression
• Consensus on community standards: MIAME
• Data pipelines at source via Laboratory Information Management Systems (LIMS)
• User tools (MIAMExpress) & value-added services
• Annotation of data using the Gene Ontology
• Submission & deposit is embedded in community culture: requirement for publication
• Training programme, eLearning materials coming
This level of data curation is expensive!!

• EMBL-Bank: DNA sequences
• Array-Express: microarray expression data
• UniProt: protein sequences
• EnsEMBL: genome annotation
• IntAct: protein interactions
• EMSD: macromolecular structure data
• Reactome
Source: Graham Cameron, EBI

[Figure: Core biomolecular resources and related resources. Large resources in related disciplines: medical data resources, biodiversity data resources, chemical data resources. Specialist biomolecular data resource examples: BRENDA, IMGT, Pasteur DBs. Model organism resource examples: SGD, Flybase, MGD, Mouse Atlas, Eumorphia/Phenotypes, Mutants. Source: Graham Cameron, EBI]

[Figure: Domain Data Deposit Model. Actors: Funder, Scientist, Domain Data Centre, Publisher, User. Labels: Policy & Advocacy; Community standards; Standards; Training; Advocacy; Blogs, wikis; Create; Deposit; Collaborate; Share; Curate; Preserve; Link; Discover; Re-use. © Liz Lyon (UKOLN, University of Bath) 2007. This work is licensed under a Creative Commons License Attribution-ShareAlike 2.0]

[Figure: Institutions: eCrystals Federation (eBank Project). Components: data creation & capture in “Smart lab”; data analysis; laboratory repository; institutional data repositories; subject repository; aggregator services; presentation services / portals; publishers (peer-review journals, conference proceedings, etc.); institution Library & Information Services. Flows: deposit; validation; search, harvest; publication; data discovery, linking, citation; curation; preservation]

[Figure: Federation Data Deposit Model. Actors: Funder, Scientist, IR, IR Federation, Aggregator, Data Centre (?), Publisher, User. Labels: Policy; Advocacy; Training; Standards; Blogs, wikis; Create; Deposit; Harvest; Collaborate; Share; Curate; Preserve; Link; Discover; Re-use. © Liz Lyon (UKOLN, University of Bath) 2007. This work is licensed under a Creative Commons License Attribution-ShareAlike 2.0]

Legal and Ethical Issues
• Synthesis
– IPR is a barrier to data sharing, e.g. geospatial data, performing arts
– We need a better understanding of the issues
• Actions?
– Provide enhanced advice about data and IPR in Member States
– Develop model licences with other organisations

Sustainability
• Synthesis
– Are current economic models for preservation & data sharing infrastructure a) appropriate? b) adequate? c) sustainable?
– Should inform research prioritisation and investment
• Actions?
– Cost-benefit study
– Construct new economic models

Advocacy
• Synthesis
– Programmes need to reach across sectors
– Harmonisation and consistent messages
– Researcher has some curatorial responsibility
• Actions?
– Identify co-ordinating body and target at specific disciplines
UK Digital Curation Centre: http://www.dcc.ac.uk/

Training and Skills
• Synthesis
– Leverage library & archive experience, EU projects DPE and PLANETS
– Data curators and “native data scientists”
• Actions?
– Co-ordination: pan-European level
– Review career development of data scientists
– Assess value of data handling and curation in the curriculum

Scientist: creation and use of data
• Rights
– Of first use.
– To be acknowledged.
– To expect IPR to be honoured.
– To receive data training and advice.
• Responsibilities
– Manage data for life of project.
– Meet standards for good practice.
– Comply with funder / institutional data policies and respect IPR of others.
– Work up data for use by others.
• Relationships
– With institution as employee.
– With subject community.
– With data centre.
– With funder of work.
[Image: Baroness Susan Greenfield, UK]

Institution: curation of and access to data
• Rights
– To be offered a copy of data.
• Responsibilities
– Set internal data management policy.
– Manage data in the short term.
– Meet standards for good practice.
– Provide training and advice to support scientists.
– Promote the repository service.
• Relationships
– With scientist as employer.
– With data centre through expert staff.
[Image: http://www.flickr.com/photos/nrparmar/383549700/in/pool-bath-uni/]

Data centre: curation of and access to data
• Rights
– To be offered a copy of data.
– To select data of long-term value.
• Responsibilities
– Manage data for the long term.
– Meet standards for good practice.
– Provide training for deposit.
– Promote the repository service.
– Protect rights of data contributors.
– Provide tools for re-use of data.
• Relationships
– With scientist as “client”.
– With user communities.
– With institution through expert staff.
– With funder of service.

User: use of 3rd party data
• Rights
– To re-use data (non-exclusive licence).
– To access quality metadata to inform usability.
• Responsibilities
– Abide by licence conditions.
– Acknowledge data creators / curators.
– Manage derived data effectively.
• Relationships
– With data centre as supplier.
– With institution as supplier.
[Image: GridPP computing facilities at Imperial College, London]

Funder: set/react to public policy drivers
• Rights
– To implement data policies.
– To require those they fund to meet policy obligations.
• Responsibilities
– Consider wider public-policy perspective & stakeholder needs.
– Participate in strategy co-ordination.
– Develop policies with stakeholders.
– Participate in policy co-ordination, joint planning & fund service delivery.
– Monitor and enforce data policies.
– Resource post-project long-term data management.
– Act as advocate for data curation & fund expert advisory service(s).
– Support workforce capacity development of data curators.
• Relationships
– With scientist as funder.
– With institution.
– With data centre as funder.
– With other funders.
– With other stakeholders as policy-maker and funder of services.

Publisher: maintain integrity of the scientific record
• Rights
– To expect data are available to support publication.
– To request pre-publication data deposit in long-term repository.
• Responsibilities
– Engage stakeholders in development of publication standards.
– Link to data to support publication standards.
– Monitor & enforce publication standards.
• Relationships
– With scientist as creator, author and reader.
– With data centres and institutions as suppliers.

Dealing with Data Report will be published shortly at www.ukoln.ac.uk