Making research findings visible – the future of the scientific paper Matthew Cockerill Publisher, BioMed Central.
"There is nothing more amusing than
watching business interests work
themselves up into a righteous frenzy over
a threat to their monopoly profits from a
new technology or some upstart with a
different business model. Invariably, the
monopolists… try to present themselves as
champions of the consumer, or defenders of
a level playing field, as if they hadn't
become ridiculously rich by sticking it to
consumers and enjoying years in which the
playing field was tilted to their advantage."
Steven Pearlstein in the Washington Post, July 19 2006
Status of open access publishing
Momentum for transition to OA
We are seeing action (not just words) from
funding agencies and governments
– Wellcome and several UK research councils now require OA
deposit as a condition of grants
– Federal Research Public Access Act may do the same in US
OA journals continue to grow rapidly
Impressive impact factors demonstrate OA
and quality are absolutely compatible
Move to OA basically unstoppable
Growth of OA
[Chart: rolling 28-day count of submissions to BioMed Central journals, Jul-00 to Jul-06; vertical axis: submissions, 0 to 1400]
Impact factors
Genome Biology – IF 9.71
BMC Bioinformatics – IF 4.96
BMC Genomics – IF 4.09
Genome Biology is:
10th of 124 in GENETICS & HEREDITY
4th of 139 in BIOTECHNOLOGY & APPLIED MICROBIOLOGY
What does this mean for the
future of the scientific article?
Why did we start BioMed Central
as an open access publisher?
Limited access to research articles makes
further research needlessly inefficient
Barriers to access obstruct interdisciplinary
cross-fertilization
It is in the interest of researchers for their
research to be read and cited as widely as
possible
Traditional scientific publishing is not an
effective market, and so high serials prices
mean a poor deal for the scientific community
The main reason we started
BioMed Central
Publications and data are a continuum
Publications include data
Publications are data
To make sense of data and publications
delivered by post-genomic science, we need
– The best possible tools
– The widest possible collection of raw material
Open access stimulates the creation of tools
by providing access to the raw material
The future of the scientific
article
Computers will be at least as
important as human readers
Text mining
Open access facilitates text mining
BioMed Central XML corpus of full
text articles is freely downloadable
The more semantics that are
captured in the XML, the richer the
possibilities for mining
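To illustrate why semantics captured in the XML make mining easier, here is a minimal sketch. It assumes a hypothetical article fragment in which gene mentions are tagged inline; the element names, the `id` attribute and the `EX:` identifiers are invented for illustration and are not BioMed Central's actual schema.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical article fragment: element names, attributes and identifiers
# are invented for this sketch, not taken from any real publisher schema.
ARTICLE_XML = """
<article>
  <title>Example article</title>
  <body>
    <p>Expression of <gene id="EX:G1">geneA</gene> rose, while
       <gene id="EX:G2">geneB</gene> was unchanged.</p>
    <p>Knockdown of <gene id="EX:G1">geneA</gene> reversed the effect.</p>
  </body>
</article>
"""


def count_gene_mentions(xml_text):
    """Count how often each tagged gene identifier occurs in one article."""
    root = ET.fromstring(xml_text.strip())
    return Counter(el.get("id") for el in root.iter("gene"))


def plain_text(xml_text):
    """Fall back to untagged text, as free-text mining would have to."""
    root = ET.fromstring(xml_text.strip())
    return " ".join("".join(root.itertext()).split())


mentions = count_gene_mentions(ARTICLE_XML)
text = plain_text(ARTICLE_XML)
```

With the tags present, counting mentions is a lookup rather than a guess; `plain_text` shows what a miner is left with when the markup carries no semantics.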
Existing examples of automated
sifting of published research
Postgenomic
CiteULike
This is just
bibliographic information –
but it's a start
Semantic enrichment
Ensure that the rest of the knowledge
represented in scientific articles is
structured to be computer-readable
Ideally capture semantics
unambiguously at time of publication
Mining of free text is a stopgap/fall-back
It is not just articles that need semantic
enrichment, but data sets too
Appropriate standards are now emerging
RDF
Useful common technical standard
for expressing semantics
Subject-predicate-object triples
BioMed Central already exposes
bibliographic RDF for all articles
Tools like the PiggyBank can
capture RDF and then store it in
triple-stores (local or networked)
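As a rough sketch of the subject-predicate-object idea, the following builds a tiny in-memory triple store and queries it by pattern. The article URI, title and author names are hypothetical; the predicate URIs use the Dublin Core element namespace, which is an assumption about vocabulary and not a statement of what BioMed Central's RDF actually contains.

```python
# Dublin Core element namespace (assumed vocabulary for this sketch).
DC = "http://purl.org/dc/elements/1.1/"
# Hypothetical article URI, for illustration only.
ARTICLE = "http://example.org/articles/123"

# Each fact is one (subject, predicate, object) triple.
triples = {
    (ARTICLE, DC + "title", "An example article"),
    (ARTICLE, DC + "creator", "A. Author"),
    (ARTICLE, DC + "creator", "B. Author"),
}


def match(store, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Who are the creators of the article?
creators = [obj for _, _, obj in match(triples, s=ARTICLE, p=DC + "creator")]
```

A real triple-store adds persistence, indexing and a query language, but the data model is the same: every statement is one triple, and a query is a triple with gaps in it.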
Semantic Laundry List
Scientific stuff
– Genes
– Proteins
– Anatomy
– Taxonomy
– Small molecules/drugs
– Macromolecules
– Diseases
– Experimental methodologies
– Experimental data types
General stuff
– People, Places, Organizations, Relationships
NCBO
Example of enriched research
Neurocommons.org
A ScienceCommons project
Working with open access articles
from BioMed Central and PLoS
Attempting to define best
practices/gold standard for
semantic enrichment of articles
Text mining and enhanced
authoring tools both have role
The role of wikis
The challenge: Ontologies, to be useful,
must stay up-to-date and receive
ongoing maintenance and curation
Scope of problem is enormous - every
entity and relationship of relevance to
science
Wikis provide a promising approach, perhaps the only viable approach
e.g. AuthorIDs
Projects at BioMed Central to
capture structured info
Case reports
Clinical trials
Biological processes
Chemical structures
Taxonomic descriptions
Publishing research articles in a more structured form
allows the results to be treated as a database
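A minimal sketch of what "treated as a database" could mean in practice: once case reports are captured as structured fields, finding related reports becomes a query. The table layout, field names and records below are all hypothetical placeholders, not data or a schema from BioMed Central.

```python
import sqlite3

# In-memory database; the schema is a hypothetical example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE case_reports ("
    "id INTEGER PRIMARY KEY, condition TEXT, intervention TEXT, outcome TEXT)"
)

# Placeholder records, invented for illustration.
conn.executemany(
    "INSERT INTO case_reports (condition, intervention, outcome) VALUES (?, ?, ?)",
    [
        ("condition A", "drug X", "improved"),
        ("condition B", "drug X", "no change"),
        ("condition A", "drug Y", "no change"),
    ],
)


def related_reports(connection, condition):
    """List the reports that share a condition with the one being written."""
    rows = connection.execute(
        "SELECT id, intervention, outcome FROM case_reports "
        "WHERE condition = ? ORDER BY id",
        (condition,),
    )
    return rows.fetchall()


related = related_reports(conn, "condition A")
```

The same question asked of unstructured article text would need text mining; here it is a single parameterised query.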
Structured authoring
Publicon – an experiment in
structured authoring
Benefits of structure
Live maths in articles
Problem – adding structure is a
hassle
Incentivize authors
Ideally, create structured
authoring tools that remove work
rather than add it (e.g. EndNote)
If you do create extra work for
authors, find a way to provide the
author with an immediate return
on investment
Reduce work - smart authoring
e.g. auto suggest
Standard way to disambiguate contacts
Why not chemicals, genes, species too?
– Unambiguously capture semantics
– Increase accuracy, save time, encourage uptake
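One way such auto-suggest could work, sketched under assumptions: the author types a prefix, the tool offers matching entries from a controlled vocabulary, and the chosen entry carries an unambiguous identifier into the manuscript. The vocabulary and the `EX:` identifiers below are placeholders, not entries from a real registry.

```python
from bisect import bisect_left

# Placeholder vocabulary of (name, identifier) pairs, kept sorted by name.
# The identifiers are invented for this sketch.
VOCAB = sorted([
    ("drosophila melanogaster", "EX:0003"),
    ("homo neanderthalensis", "EX:0004"),
    ("homo sapiens", "EX:0001"),
    ("mus musculus", "EX:0002"),
])


def suggest(prefix, limit=5):
    """Return up to `limit` (name, identifier) pairs whose name starts with prefix."""
    prefix = prefix.strip().lower()
    if not prefix:
        return []
    # Binary search for the first entry that could match the prefix.
    start = bisect_left(VOCAB, (prefix,))
    results = []
    for name, ident in VOCAB[start:]:
        if not name.startswith(prefix):
            break  # sorted order means no later entry can match
        results.append((name, ident))
        if len(results) == limit:
            break
    return results


homo_matches = suggest("Homo")
```

Picking from the list is faster than typing the full name, and the stored identifier removes the ambiguity that a free-text name would leave behind.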
Return on investment
Automatic update of meta-analysis
based on clinical trial data
Automatic list of closely-related
case reports from database
Automatic deposit of taxonomic
information in registry (Zoobank)
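The automatic meta-analysis update can be sketched as follows, assuming each structured trial report exposes an effect estimate and its standard error. The pooling method shown is standard fixed-effect inverse-variance weighting, chosen here for simplicity; the slides do not say which method would be used, and the numbers are made up.

```python
import math


def pooled_estimate(trials):
    """Fixed-effect inverse-variance pooling of (effect, standard_error) pairs."""
    weights = [1.0 / se ** 2 for _, se in trials]
    total = sum(weights)
    effect = sum(w * e for w, (e, _) in zip(weights, trials)) / total
    return effect, math.sqrt(1.0 / total)


# Made-up trial results: (effect estimate, standard error).
trials = [(1.0, 1.0), (3.0, 1.0)]
before = pooled_estimate(trials)

# A newly published structured trial arrives; the pooled result is recomputed.
trials.append((2.0, 0.5))
after = pooled_estimate(trials)
```

In this toy case the pooled effect stays the same while its standard error shrinks, which is the kind of update an author or reader could receive without anyone re-extracting numbers by hand.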
Q&A