Changing Roles, Responsibilities and Relationships

Dr Liz Lyon,

Director, UKOLN
Associate Director, UK Digital Curation Centre
Opening the research data lifecycle, JISC Conference 2007
This work is licensed under a Creative Commons Licence Attribution-ShareAlike 2.0

UKOLN is supported by: www.ukoln.ac.uk

A centre of expertise in digital information management

Preliminary findings from a JISC study

• Terms of Reference for UKOLN: to define how institutions (collectively and individually) and scientific data centres can together effectively achieve
– Preservation
– Access – Managed and open
– Re-use – Data citation, data mining and re-interpretation
• October 2006 – March 2007

N.B. Work in progress!

Some of the data stakeholders?

Funders

• Interviews: 4 Research Councils + 1 charity
• Support for data curation is (still) patchy
• Mixed approaches: proactive to passive
• Gaps in infrastructure support for data outputs
• Limited formal links between programme planning and support infrastructure
• Some Data management and sharing policies
• Some use of Data Management Plans
• Wellcome Trust – Policy + Q&A January 2007

January 2007

Data Management and Sharing Plan required “if creating or developing a resource for the research community as the primary goal” or “involve the generation of a significant quantity of data that could potentially be shared for added benefit”

Funders 2

• Limited advocacy work
• Funding models for infrastructure support vary
• Funding models for research programmes vary
• Some productive partnerships e.g. MRC and Wellcome Trust, CCLRC and Wellcome
• Some examples of good practice

Hierarchy of drivers (for data sharing)

Acknowledgement: Mark Thorley, NERC

• Level 0: deliver project.
• Level 1: meet ‘good scientific practice’.
• Level 2: support own science.
• Level 3: employer’s requirements.
• Level 4: funder’s requirements.
• Level 5: public policy requirements.

NERC has:
– 7 designated data centres
– Data Management Co-ordinator
– DataGrid

NATURAL ENVIRONMENT RESEARCH COUNCIL

MRC developing a data support plan (Acknowledgement: Alan Sudlow)

Data centres & Data services

• Interviews with 5 data services
• Deep levels of expertise and subject knowledge
• Exemplars of good practice: standards, policies, manuals, robust curation / preservation practice
• Limited sharing of expertise between centres
• Some effective partnerships:
– AHDS Stormont Papers with Queens Belfast
– BADC with CLADDIER Project
• Wide range of community awareness
• Use of licences but IPR issues: performing arts,
• Technical issues: complexity of data sets, version control, identifiers, application profiles

Data centres & Data services 2

• Exemplar of good practice
– European Bio-informatics Institute
– Microarray data to inform gene expression
– Consensus on community standards MIAME
– Data pipelines at source via Laboratory Information Management Systems LIMS
– User tools MIAMExpress & value-added services
– Annotation of data using the Gene Ontology
– Submission & deposit is embedded in community culture: requirement for publication
– Training programme, eLearning materials coming
– This level of data curation is expensive!!

[Diagram: EBI data resources]
– EMBL-Bank: DNA sequences
– Reactome
– Array-Express: Microarray Expression Data
– UniProt: Protein Sequences
– EnsEMBL: Genome Annotation
– IntAct: Protein Interactions
– EMSD: Macromolecular Structure Data
Source: Graham Cameron, EBI

[Diagram: core biomolecular resources and the resources around them]
– Core biomolecular resources
– Specialist biomolecular data resource examples: BRENDA, IMGT, Pasteur DBs
– Model organism resource examples: SGD, Flybase, MGD, Mouse Atlas, Eumorphia / Phenotypes, Mutants
– Large resources in related disciplines: Medical data resources, Biodiversity data resources, Chemical data resources
Source: Graham Cameron, EBI

General Data Selection Criteria

• Usability
– Quality of data
– Usable data format
– Conditions of Use
– Reputable Author
– Documentation
• Usefulness
– Data quality
– Uniqueness of data
– Potential Strategic Use
– Usefulness of parameters

Institutions & Data Repositories

• Not much data…. or duplication …… (yet?)
• Departmental audits of research data practice at University of Southampton to inform developing institutional data & curation policy
• Barriers to data sharing:
– IPR and geospatial data
– Lack of awareness amongst researchers
– Cultural roots and resistance to change
• Exemplars of good practice: eBank Project

eCrystals ‘Global Federation’ Model

[Diagram: eCrystals federation model]
Components: data creation & capture in “Smart lab”; laboratory repository; institutional data repositories; subject repository; aggregator services; presentation services / portals; publishers (peer review journals, conference proceedings, etc); institution Library & Information Services.
Flow labels: deposit; validation; search, harvest; data discovery, linking, citation; data analysis; publication; curation; preservation.

Roles, Rights & Responsibilities

• ‘Scientist’: Creation and use of data.
• ‘Data centre’: Curation of and access to data.
• ‘User’: Use of 3rd party data.
• ‘Funder’: Set / react to public policy drivers.
• ‘Publisher’: Maintain integrity of the scientific record.

Acknowledgement: Mark Thorley, NERC

Closing thoughts

• Co-ordination and join up
– High level and strategic: Funders
– Operational level and practical: JISC data services & research council data centres
• Funding
– Are current economic models for preservation & data sharing infrastructure a) appropriate? b) adequate? c) sustainable?
– Should inform prioritisation and investment

Closing thoughts 2

• Good Practice requirements
– Data management and sharing Policies
– Data Management Plans (peer-reviewed)
– Institutional data curation policies & planning
• Technical interoperability and integration
– Data are diverse and complex
– JISC IIE vision of discovery across repositories
– Contextual linking offers opportunity for data centres and institutional repositories to realise synergies and work more closely together

Closing thoughts 3

• Advocacy
– Programmes to reach across sectors
– Harmonisation and consistent messages
– Tailored & targeted to disciplines
– Researcher has some curatorial responsibility
• Training
– Lack of skills
– eLearning opportunity
– Data scientists?
– Recognition and career development

“Native” data scientists are coming….

“Dealing with the Data Deluge”

• JISC Repositories Programme
• Supporting Institutions in the Digital Age
• Digital Repositories Conference
• 5-6 June 2007
• University of Manchester
• Research Data Strand