UCL LIBRARY SERVICES
Science 2.0: Research Data Management
Dr Paul Ayris
Director of UCL Library Services and UCL Copyright Officer
Chief Executive, UCL Press
President of LIBER (Association of European Research Libraries)
Chair, LERU Chief Information Officer Community (League of European
Research Universities)
e-mail: [email protected]
Contents
 The importance of Research Data
 LERU Research Data Roadmap
 Roles and Opportunities
 Text and Data Mining
 Next Steps
Bibliography
 Science as an Open Enterprise (2012)
 http://royalsociety.org/policy/projects/science-public-enterprise/report/
 Susan Reilly, Opportunities for Data Exchange: optimising
the conditions for data sharing (2012). LERU Doctoral
Summer School, 9th July, 2012
 http://www.ub.edu/lerudss2012/en/material.html
 Opportunities for Data Exchange project website (2012)
 http://www.alliancepermanentaccess.org/index.php/community/current-projects/ode/
 UCL Research Data Management Policy (2013)
 http://www.ucl.ac.uk/isd/staff/research_services/research-data/
 The Perfect Swell: defining the ideal conditions for the
growth of text and data mining in Europe. Report from a
workshop on Friday, September 27th 2013, organised by
LIBER Europe and held at the British Library (2013)
 http://www.libereurope.eu/sites/default/files/TDM%20Workshop%20Report%5B1%5D_0.pdf
 LERU Roadmap for Research Data (2014)
 http://www.leru.org/index.php/public/news/press-release-leru-roadmap-for-research-data/
 A Scientist’s Take on the new Elsevier TDM Policy (2014)
 http://www.libereurope.eu/blog/a-scientists-take-on-the-new-elsevier-tdm-policy
See Science as an Open Enterprise
http://royalsociety.org/policy/projects/science-public-enterprise/report/
Technological change
 Modern computers permit massive datasets to be
assembled and explored in ways that reveal inherent but
unsuspected relationships. This data-led science is a
promising new source of knowledge (p. 7)
 The emergence of linked data technologies creates new
information through deeper integration of data across
different datasets with the potential to greatly enhance
automated approaches to data analysis (p. 7)
Map of Interlinked Data
W3C (2012). Available at:
http://www.w3.org/wiki/SweoIG/TaskForces/CommunityProjects/LinkingOpenData
Open Data
 Open data is the idea that certain
data should be freely available to
everyone to use and republish as
they wish, without restrictions from
copyright, patents or other
mechanisms of control
Auer, S. R.; Bizer, C.; Kobilarov, G.; Lehmann, J.;
Cyganiak, R.; Ives, Z. (2007). "DBpedia: A Nucleus
for a Web of Open Data". The Semantic Web.
Lecture Notes in Computer Science 4825. p. 722.
doi:10.1007/978-3-540-76298-0_52. ISBN 978-3-540-76297-3.
http://en.wikipedia.org/wiki/File:DNA_orbit_animated.gif
Human Genome Project
 Aim: To determine the
sequence of chemical
base pairs which make
up human DNA, and to
identify and map the
total genes of the
human genome
See http://en.wikipedia.org/wiki/DNA
Benefits – felt from molecular medicine to human evolution
 Better understanding of disease
 Design of medications and prediction of their effects
 Commercial development of genomics research
Richard III
LERU Roadmap for Research Data
 Overseen by Research Data
Working Group
Pablo Achard (University of Geneva)
Paul Ayris (UCL, University College London)
Serge Fdida (UPMC, Paris)
Stefan Gradmann (University of Leuven)
Wolfram Horstmann (University of Oxford)
Ignasi Labastida (University of Barcelona)
Liz Lyon (University of Bath)
Katrien Maes (LERU)
Susan Reilly (LIBER)
Anja Smit (University of Utrecht)
LERU Roadmap for Research Data
1. Policy and Leadership
2. Advocacy
3. Selection and Collection,
Curation, Description,
Citation, Legal Issues
4. Research Data Infrastructure
5. Costs
6. Roles, Responsibilities and
Skills
7. Recommendations to
different stakeholder groups
CERN, Geneva
See http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/150.pdf
Key Messages
 Each LERU university
needs a Research Data
Management Strategy
 Researchers should have
Research Data
Management Plans
 LERU universities need to
bring stakeholders together
 Benefits of ‘open data’ for
sharing and re-use should
be advocated and explored
 New role of Data Scientist
is emerging
King’s Cross, London
Policy Development
 Case Study on Policy
development from UCL
 Drivers
 External funders
 Need to inform researchers
 Raise awareness of issues
facing UCL researchers
 Identifies roles and
responsibilities
 Data to be made open in
the most open manner
appropriate
 Researchers should have
Data Management Plans
 LERU slams lack of data
policies – Research Europe
Open Data
 Open Data allows research data to be shared and re-used
 Avoids costly duplication of research activity
 Provides greater transparency in research activity
 Potential to speed discovery of solutions to societal Grand
Challenges, such as health care & environmental science
 Can all research data be open?
 Certain categories probably cannot
 National security
 Data protection
 Commercial Funder requirements
http://en.wikipedia.org/wiki/File:Open_Data_stickers.jpg
Data management
 Which of these
layers of
research data
need to be
 curated for a
fixed term?
 preserved for
the long term?
 thrown away?
 LERU Roadmap
identifies this as
an area for
future study
The ODE Data Publication Pyramid at
http://www.alliancepermanentaccess.org/wp-content/uploads/downloads/2011/11/ODEReportOnIntegrationOfDataAndPublications-1_1.pdf
Collaboration – a way forward
 LERU Rectors see this as an
area for study
 Collaboration between Dutch
institutions
 Focus is on research data
which lies behind
publications
 Each university and faculty
has its own Dataverse
installation
 Support services offered by
libraries in Dutch universities
Utrecht, Tilburg, Erasmus
University Rotterdam, Maastricht,
Groningen, 3TU Datacentrum and
Netherlands Institute of Ecology
Collaboration – a UCL Case Study
 UCL Research Data
Service
 Will curate outputs of ‘Big
Science’ funded by
projects
 Centrally funded by UCL
 Some cost recovery
 Add a preservation
service
 Advocacy for research
data management
Plaster Relief by John Flaxman, Flaxman Gallery, UCL
 UCL Library Services
 Will curate the outputs of
‘Small Science’
 Funded via the Library
 No cost recovery planned
 Oversee UCL policy
development
 Advocacy for research
data management
7 areas of opportunity
 Availability
 Findability
 Interpretability
 Reusability
 Citability
 Curation
 Preservation
http://www.alliancepermanentaccess.org/index.php/community/current-projects/ode/
Text and Data Mining – What is it?
European discussions on TDM
 Licences4Europe
 LIBER and research
stakeholder
organisations withdrew
from process
 LIBER’s TDM
Workshop in
September 2013
 Commission now
holding a copyright
consultation – until 5
March 2014
LIBER wants an exception in the
European Copyright and Database
Directives
The view from a researcher
Dr Peter Murray-Rust (Cambridge)
 Elsevier is the sole author and controller of the policy –
there has been no Open discussion or agreement with
scholarly bodies
 Libraries have to – individually – sign agreements with
Elsevier. (Libraries have universally and unilaterally given
away all these rights over the last decade and support
publishers to forbid machine access to content)
 Researchers have to register as a developer (I think) and
ask permission of Elsevier for every project they wish to do
And …
 Researchers can only mine text. Images are specifically
prohibited. This is useless for me – as I and colleagues are
mining chemical structure diagrams
 There is no indication of how current the material will be. I
shall be mining the literature an hour after it appears. Will
the API provide that?
 The amount that can be republished is often useless (“200
characters”). I want to build corpora (impossible);
vocabularies (essential to record precise words –
impossible); chemical names (often > 200 characters so
impossible). Figure captions (impossible)
And …
 The researchers must commit to a CC-NC licence. This
effectively kills downstream use (I shall use CC0). It also
trains them into thinking CC-NC is a “good thing”. It isn’t
 If a researcher has a LEGITIMATE collection of papers
that they wish to mine (say on their hard disk) they are
forbidden. They have to go to each publisher (if this awful
protocol is promoted elsewhere) and find the API and mine
the individual papers. Absurd
Next Steps in London
 Policy creation
 UCL has a Research Data Management policy at
http://www.ucl.ac.uk/isd/staff/research_services/research-data/
Do you?
 Advocacy
 Meetings with Research Committees at all 10 UCL Faculties to
raise awareness; Communications Strategy to follow
 Training
 UCL Library Services establishing a training programme for library
staff to be provided by the Library School, University of Sheffield
Finally
 Breakout Groups
 Discussion
Questions for Breakout Groups
 What are the main
points your Research
Data Management
Policy should make?
 What are the drivers to
help engage with
researchers?