Changing Roles, Responsibilities and Relationships
Dr Liz Lyon,
Director, UKOLN
Associate Director, UK Digital Curation Centre
Opening the research data lifecycle, JISC Conference 2007
This work is licensed under a Creative Commons Licence Attribution-ShareAlike 2.0
UKOLN is supported by: www.ukoln.ac.uk
A centre of expertise in digital information management
Preliminary findings from a JISC study
• Terms of Reference for UKOLN: to define how institutions (collectively and individually) and scientific data centres can together effectively achieve:
  – Preservation
  – Access: managed and open
  – Re-use: data citation, data mining and re-interpretation
• October 2006 – March 2007
• N.B. Work in progress!
Some of the data stakeholders?
Funders
• Interviews: 4 Research Councils + 1 charity
• Support for data curation is (still) patchy
• Mixed approaches: proactive to passive
• Gaps in infrastructure support for data outputs
• Limited formal links between programme planning and support infrastructure
• Some Data management and sharing policies
• Some use of Data Management Plans
• Wellcome Trust: Policy + Q&A, January 2007
  – Data Management and Sharing Plan required “if creating or developing a resource for the research community as the primary goal” or “involve the generation of a significant quantity of data that could potentially be shared for added benefit”
Funders 2
• Limited advocacy work
• Funding models for infrastructure support vary
• Funding models for research programmes vary
• Some productive partnerships, e.g. MRC and Wellcome Trust, CCLRC and Wellcome
• Some examples of good practice
Hierarchy of drivers (for data sharing)
Acknowledgement: Mark Thorley, NERC
• Level 0: deliver project.
• Level 1: meet ‘good scientific practice’.
• Level 2: support own science.
• Level 3: employer’s requirements.
• Level 4: funder’s requirements.
• Level 5: public policy requirements.
NERC has: 7 designated data centres; Data Management Co-ordinator; DataGrid
NATURAL ENVIRONMENT RESEARCH COUNCIL
MRC developing a data support plan
Acknowledgement: Alan Sudlow
Data centres & Data services
• Interviews with 5 data services
• Deep levels of expertise and subject knowledge
• Exemplars of good practice: standards, policies, manuals, robust curation / preservation practice
• Limited sharing of expertise between centres
• Some effective partnerships:
  – AHDS Stormont Papers with Queens Belfast
  – BADC with CLADDIER Project
• Wide range of community awareness
• Use of licences but IPR issues: performing arts
• Technical issues: complexity of data sets, version control, identifiers, application profiles
Data centres & Data services 2
• Exemplar of good practice
  – European Bioinformatics Institute
  – Microarray data to inform gene expression
  – Consensus on community standards: MIAME
  – Data pipelines at source via Laboratory Information Management Systems (LIMS)
  – User tools (MIAMExpress) & value-added services
  – Annotation of data using the Gene Ontology
  – Submission & deposit is embedded in community culture: requirement for publication
  – Training programme, eLearning materials coming
  – This level of data curation is expensive!!
EBI data resources:
• EMBL-Bank: DNA sequences
• Reactome
• Array-Express: Microarray Expression Data
• UniProt: Protein Sequences
• EnsEMBL: Genome Annotation
• IntAct: Protein Interactions
• EMSD: Macromolecular Structure Data
Source: Graham Cameron, EBI
• Core biomolecular resources
• Specialist biomolecular data resource examples: BRENDA, IMGT, Pasteur DBs
• Model organism resource examples: SGD, Flybase, Eumorphia / Phenotypes, MGD, Mutants, Mouse Atlas
• Large resources in related disciplines: Medical data resources, Biodiversity data resources, Chemical data resources
Source: Graham Cameron, EBI
General Data Selection Criteria
• Usability
  – Quality of data
  – Usable data format
  – Conditions of Use
  – Reputable Author
  – Documentation
• Usefulness
  – Data quality
  – Uniqueness of data
  – Potential Strategic Use
  – Usefulness of parameters
Institutions & Data Repositories
• Not much data… or duplication… (yet?)
• Departmental audits of research data practice at University of Southampton to inform developing institutional data & curation policy
• Barriers to data sharing:
  – IPR and geospatial data
  – Lack of awareness amongst researchers
  – Cultural roots and resistance to change
• Exemplars of good practice: eBank Project
eCrystals ‘Global Federation’ Model
Diagram components: Data creation & capture in “Smart lab”; Laboratory repository; Institutional data repositories; Subject Repository; Aggregator services; Presentation services / portals; Publishers: peer review journals, conference proceedings, etc; Institution Library & Information Services
Diagram flows and functions: Deposit; Validation; Search, harvest; Data discovery, linking, citation; Data analysis; Publication; Curation; Preservation
Roles, Rights & Responsibilities
• ‘Scientist’: Creation and use of data.
• ‘Data centre’: Curation of and access to data.
• ‘User’: Use of 3rd party data.
• ‘Funder’: Set / react to public policy drivers.
• ‘Publisher’: Maintain integrity of the scientific record.
Acknowledgement: Mark Thorley, NERC
Closing thoughts
• Co-ordination and join up
  – High level and strategic: Funders
  – Operational level and practical: JISC data services & research council data centres
• Funding
  – Are current economic models for preservation & data sharing infrastructure a) appropriate? b) adequate? c) sustainable?
  – Should inform prioritisation and investment
Closing thoughts 2
• Good Practice requirements
  – Data management and sharing Policies
  – Data Management Plans (peer-reviewed)
  – Institutional data curation policies & planning
• Technical interoperability and integration
  – Data are diverse and complex
  – JISC IIE vision of discovery across repositories
  – Contextual linking offers opportunity for data centres and institutional repositories to realise synergies and work more closely together
Closing thoughts 3
• Advocacy
  – Programmes to reach across sectors
  – Harmonisation and consistent messages
  – Tailored & targeted to disciplines
  – Researcher has some curatorial responsibility
• Training
  – Lack of skills
  – eLearning opportunity
  – Data scientists?
  – Recognition and career development
“Native” data scientists are coming…
“Dealing with the Data Deluge”
• JISC Repositories Programme
• Supporting Institutions in the Digital Age
• Digital Repositories Conference
• 5-6 June 2007
• University of Manchester
• Research Data Strand