One Scientist’s Wish List for STM
Publishers
Philip E. Bourne
University of California San Diego
[email protected]
http://www.sdsc.edu/pb
(see presentations and publications)
My Perspective …
• Background in both IT and science (chemistry,
computational biology)
• My lab distributes, for free, data equivalent to ¼ of the
Library of Congress every month
• I am a supporter of open access (provided there is a
business model) and editor in chief of PLoS
Computational Biology
• I am Co-founder of SciVee Inc.
• I am becoming increasingly interested in scholarly
communication
I Readily Acknowledge Each Discipline is Different
Your Reaction to My
Viewpoint
Motivators: What is Wrong with
Me?
Well My Lab Anyway
We Cannot Possibly Read a Fraction of the
Papers We Should
Drivers of Change
Renear & Palmer 2009 Science 325:828-832
Hence We Are Scanning More, Reading Less
Renear & Palmer 2009 Science 325:828-832
The Truth About the Scientific
eLaboratory
• I have ?? mail folders!
• The intellectual memory of my laboratory is in those folders
• This is an unhealthy hub and spoke mentality
The Truth About the Scientific
eLaboratory
• I generate way more negative than
positive data, but where is it?
• Content management is a mess
– Slides, posters …
– Data, lab notebooks …
– Collaborations, journal clubs …
• Software is open but where is it?
• Farewell is for the data too
Computational Biology Resources Lack
Persistence and Usability. PLoS Comp. Biol.
4(7): e1000136
The Not so Hidden Truth About
Science
• Scientists place more emphasis on writing and
less on reading
• We are H factor obsessed, but interested in
other metrics
• We are driven by (in order):
– Grants
– Papers
– Teaching
– Community service
Enough About Me – What About
You?
Data and the Publication Are Disjoint
• PubMed contains 18,792,257 entries
• ~100,000 papers indexed per month
• In Feb 2009:
– 67,406,898 interactive searches were done
– 92,216,786 entries were viewed
• 1078 databases reported in NAR 2008
• MetaBase (http://biodatabase.org) reports 2,651 entries
edited 12,587 times
Biosciences Data as of April 14, 2009
Publishing Limitations
• A paper is an artifact of a previous era
• It is not the logical end product of eScience,
hence:
– Work is omitted
– Article vs supplement is a mess
– Visualization may be limited
– Interaction and enquiry are non-existent
– Rich media can help, but are rarely used
The Traditional PDF is an
Inferior Way to Convey the
Science
The Traditional PDF is not
the Natural End Product of
the Research Enterprise
A paper, when complete, is thrown over a high wall to a publisher
and essentially forgotten. Perhaps it is time to climb the wall?
uzar.wordpress.com
The Game is Afoot
It is being driven from the
top down and the bottom up
Database & Literature Integration
www.rcsb.org/pdb/explore/literature.do?structureId=1TIM
Context
BMC Bioinformatics Accepted
Semantic Tagging of Database Content
http://www.pdb.org
PLoS Comp. Biol. 6(2) e1000673
Interactive PDFs, etc.
Article of the Future
Post-publication of Video and Paper
www.scivee.tv
Pubcast – Video Integrated with the
Full Text of the Paper
Pubcasts - A Unique Technology
Pubcasts - A Blend of Video, text, tables,
figures, PowerPoints, comments, ratings…
ALL SYNCHRONIZED FOR RAPID LEARNING
Don’t understand what you are reading? Click and have the author
pop up and explain it!
See the scientists and the experiments behind the research papers
and textbooks
Mashups – www.scivee.tv
Postercasts
More on Semantic Tagging
Semantic Tagging
http://biolit.ucsd.edu
ICTP Trieste, December 10, 2007
http://biolit.ucsd.edu
This is Literature Post-processing
Better to Get the Authors Involved
• Authors are the absolute experts on the
content
• More effective distribution of labor
• Add metadata before the article enters the
publishing process
BMC Bioinformatics 2010 11:103
Word 2007 Add-in for Authors
• Allows authors to add metadata as they write, before they
submit the manuscript
• Authors are assisted by automated term recognition
– OBO ontologies
– Database IDs
• Metadata are embedded directly into the manuscript
document via XML tags, OOXML format
– Open
– Machine-readable
• Open source, Microsoft Public License
http://www.codeplex.com/ucsdbiolit
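To make the add-in idea concrete, here is a minimal sketch in Python of automated term recognition followed by XML tagging. It is not the add-in's code: the `TERMS` dictionary, the `tag_terms` function, and the `<term>` element with its `source` and `id` attributes are all assumptions made for illustration, and a real tool would load its vocabulary from OBO ontology files and database ID lists rather than a hard-coded table.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical vocabulary: lower-cased surface form -> (source, identifier).
TERMS = {
    "apoptosis": ("GO", "GO:0006915"),
    "triosephosphate isomerase": ("PDB", "1TIM"),
}


def tag_terms(text, terms=TERMS):
    """Wrap each recognized term in a <term> element carrying its identifier."""
    # Try longer terms first so multi-word terms win over shorter overlaps.
    alternatives = "|".join(
        re.escape(t) for t in sorted(terms, key=len, reverse=True)
    )
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)

    out, pos = [], 0
    for match in pattern.finditer(text):
        source, ident = terms[match.group(1).lower()]
        # Escape the untouched text between matches so the result stays valid XML.
        out.append(escape(text[pos:match.start()]))
        out.append(
            "<term source=%s id=%s>%s</term>"
            % (quoteattr(source), quoteattr(ident), escape(match.group(1)))
        )
        pos = match.end()
    out.append(escape(text[pos:]))
    return "".join(out)


if __name__ == "__main__":
    print(tag_terms("Apoptosis was measured in each sample."))
```

Because the tags travel inside the manuscript, anything downstream (publisher, indexer, database) can read the identifiers without re-running term recognition on the finished article.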
Automatic Knowledge Discovery for Those
with No Time to Read
Cardiac Disease
Literature
Immunology Literature
Shared Function
The Knowledge and Data Cycle
0. Full text of PLoS papers stored in a database
1. A link brings up figures from the paper
2. Clicking the paper figure retrieves data from the PDB, which is
analyzed
3. A composite view of journal and database content results
4. The composite view has links to pertinent blocks of literature
text and back to the PDB
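The linking step of this cycle can be sketched as follows, assuming the full text is already available as a list of paragraphs. The function name `link_blocks`, the `PDB_MENTION` pattern, and the shape of the returned dictionary are illustrative assumptions; the URL pattern is the literature page shown earlier in these slides and may not resolve today.

```python
import re

# Classic PDB IDs are four characters: a digit followed by three alphanumerics.
# Requiring a "PDB" prefix keeps bare years such as "2009" from matching.
PDB_MENTION = re.compile(
    r"\bPDB(?:\s+(?:ID|entry|code))?[:\s]+([1-9][A-Za-z0-9]{3})\b",
    re.IGNORECASE,
)

# URL pattern taken from the slide on database and literature integration.
LITERATURE_URL = (
    "http://www.rcsb.org/pdb/explore/literature.do?structureId={pdb_id}"
)


def link_blocks(paragraphs):
    """Map each mentioned PDB ID to a database link and the paragraphs citing it."""
    view = {}
    for index, paragraph in enumerate(paragraphs):
        for match in PDB_MENTION.finditer(paragraph):
            pdb_id = match.group(1).upper()
            entry = view.setdefault(
                pdb_id,
                {"url": LITERATURE_URL.format(pdb_id=pdb_id), "blocks": []},
            )
            # Record each paragraph once, even if it mentions the ID repeatedly.
            if index not in entry["blocks"]:
                entry["blocks"].append(index)
    return view


if __name__ == "__main__":
    text = [
        "The structure (PDB ID 1TIM) shows a barrel.",
        "See PDB: 1tim and PDB entry 4HHB.",
    ]
    for pdb_id, entry in sorted(link_blocks(text).items()):
        print(pdb_id, entry["blocks"], entry["url"])
```

The dictionary works in both directions: from a database entry to the blocks of text that discuss it, and from a block of text back to the PDB.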
Here is What I
Want
1. User clicks on a thumbnail
2. Metadata and a web services call provide a renderable image that
can be annotated
3. Selecting a feature provides a database/literature mashup
4. That leads to new papers
PLoS Comp. Biol. 1(3) e34
Let Us Summarize Where We Are
• Scientists (aka authors, consumers) have
problems at home (aka lab.)
• Publishers have problems at home (changing
business models, demands etc.)
• Change is afoot, both top down and bottom
up
Let’s Catch Our Breath
So What Do I Think We Should Do To
Solve My Problems and Your
Problems?
What Should We Do?
Consider Today’s Academic Workflow
(Workflow diagram; labels: Research, [Grants], Feds, Reviews, Curation,
Publishers, Journal Article, Poster Session, Conference Paper,
Community Service/Data, Societies, Blogs)
Consider Tomorrow’s Academic Workflow
(Workflow diagram; labels: Ideas, Data, Hypotheses, Research, [Grants],
Feds, Reviews, Curation, Publishers, Journal Article, Poster Session,
Conference Paper, Community Service/Data, Societies, Blogs)
“We have an interaction with the publisher that
does not begin when the scientific process ends,
but begins at the beginning of the scientific
process itself.”
PLoS Comp Biol End of May
Maybe The Line is Somewhere Else?
(Diagram; labels: Scientist, Laboratory, Idea, Experiment, Data,
Conclusions, Publish, Publisher)
Maybe The Line is Somewhere Else?
(Diagram; labels: Laboratory, Scientist, Institution, Idea, Experiment,
Data, Lab Notebook, Conclusions, Publish, Publisher)
Problems with Publishing Workflows
• Workflows are not linear
• Workflow : paper is not 1:1
• Confidentiality
• Peer review
• Infrastructure
• Community acceptance
• Reward system
Solutions to Publishing Workflows?
• New organizations (university as publisher?)
• Appropriate reward system
• Shared governance
– author, institution, publisher
• Crowd sourcing the electronic printing press
Crowd Sourcing the Electronic Printing Press
(aka Workshop: Beyond the PDF)
• Proposal to the US National Science
Foundation:
• Aims:
– Define user requirements
– Establish a specification document
– Open source the development effort
– Have a commitment from a publisher to publish a
research object using the system
– Act as an exemplar for what can be done
Logistics
• UC San Diego
• Sometime in fall/winter 2010
• Under the auspices of W3C
• FoRC will have a follow-on meeting
Those Interested Thus Far
Virginia Barbour
Colin Batchelor
Richard K. Belew
Tanya Beradini
Geoffrey Bilder
Peter Binfield
Theodora Bloom
Katy Borner
Philip Bourne
Jean-Claude Bradley
Patrick Brown
Todd Carpenter
Richard Cave
Tim Clark
Matthew Cockerill
Matt Day
Lee Dirks
Jonathan Eisen
Michael Eisen
Lynn Fink
Marc Friedman
Pascale Gaudet
Paul Ginsparg
Carole Goble
Alexander Griekspoor
Timo Hannay
Eduard Hovy
Peter Jerram
Heather Joseph
Julia Lane
Barend Mons
Peter Murray-Rust
Catherine Nancarrow
Cameron Neylon
David Patterson
Mark Patterson
Tracy Pelon
Dan Pollock
[email protected]
Brian Schottlaender
Borya Shakhnovich
David Shotton
Elliot Siegel
[email protected]
[email protected]
Herbert Van de Sompel
Question: What if Everyone Had An
Electronic Printing Press?
• Peer review might change?
• Bibliometrics might change?
• Business models will likely change?
• What happens to the database/literature divide?
• Societies might do more self publishing?
• We might have improved the dissemination of
science, but will we have improved the
comprehension?
General References
• What Do I Want from the Publisher of the
Future. PLoS Comp Biol (in press)
http://www.sdsc.edu/pb
• The Fourth Paradigm: Data-Intensive Scientific
Discovery
http://research.microsoft.com/en-us/collaboration/fourthparadigm/
References to Exemplars
• Semantic Biochemical Journal - 2010: Using Utopia
• Article of the Future, Cell, 2009:
• Prospect, Royal Society of Chemistry, 2009:
• Adventures in Semantic Publishing, Oxford U, 2009:
• The Structured Digital Abstract, Seringhaus/Gerstein, 2008
• CWA Nanopublications – 2010
Acknowledgements
• BioLit Team
– Lynn Fink
– Parker Williams
– Marco Martinez
– Rahul Chandran
– Greg Quinn
• Microsoft Scholarly Communications
– Pablo Fernicola
– Lee Dirks
– Savas Parastatidis
– Alex Wade
– Tony Hey
• wwPDB team
• SciVee Team
– Apryl Bailey
– Tim Beck
– Leo Chalupa
– Lynn Fink
– Marc Friedman (CEO)
– Ken Liu
– Alex Ramos
– Willy Suwanto
http://biolit.ucsd.edu
http://www.pdb.org
http://www.codeplex.com/ucsdbiolit
http://www.scivee.tv
[email protected]
Questions?