NeuroLex.org: A comprehensive semantic wiki for neuroscience
Stephen D. Larson, Fahim Imam, Maryann E. Martone
Department of Neuroscience, University of California, San Diego, CA, USA
COLLABORATIVE ONTOLOGY MANAGEMENT SYSTEM
Bridging the domain knowledge of a scientific community and the knowledge engineering skills of the ontology community
is still an imperfect practice. Within the field of neuroscience, we have tried to close this gap by presenting an ontology
through the medium of a wiki in which each page corresponds to a class. By opening it to the World Wide Web, we have
made the process of maintaining a ~20,000-concept neuroscience ontology (NIFSTD) more collaborative.
INTRODUCTION
The neuroscience community needs its basic domain concepts organized into a coherent framework. Ontologies provide
an important medium for reconciling knowledge into a portable, machine-readable form. For many years we have
been building community ontologies for neuroscience, first through the Biomedical Informatics Research Network and
now through the Neuroscience Information Framework projects (http://neuinfo.org). These projects produced a large
modular ontology, the NIF Standard Ontology (NIFSTD; Bug et al., 2008), built by importing existing ontologies where
possible. One of the largest roadblocks we encountered was the lack of tools that let domain experts view, edit, and
contribute their knowledge to NIFSTD. Existing editing tools were difficult to use or required expert knowledge to
employ. By combining NIFSTD with several open-source semantic wiki technologies, we have created NeuroLex.org, a
semantic wiki for neuroscience.
TECHNOLOGIES USED:
Example of a category page (above) with inferred relationships
Example of a category page (above) for “Neuron” with subcategories
RELATED TOOLS & SYSTEMS
The current incarnation of NeuroLex draws significantly on interactions with other groups and on experience with the
tools and systems listed below. Early conversations with the BiomedGT group were particularly helpful.
• Biomedical Grid Terminology (BiomedGT): an open, collaboratively developed terminology for translational research. This wiki is being developed by the National Cancer Institute Center for Bioinformatics and the Mayo Clinic Division of Biomedical Informatics, with contributions from Apelon, Inc., Northrop Grumman, and Dionne-Associates Inc.
• LexWiki Distributed Terminology Development Platform: a collaborative content development platform based on Semantic MediaWiki; the system most similar to NeuroLex. Mayo Clinic Informatics group.
• BioPortal: a Web-based application for accessing and sharing biomedical ontologies. The National Center for Biomedical Ontology.
• OwlSight: an OWL ontology browser that runs in any web browser. OwlSight is the client component and uses Pellet as its OWL reasoner. Clark & Parsia.
Dynamic, auto-generated table displaying subcategories of “Neuron” that are related to “Glutamate”
via the “has Neurotransmitter” property on NeuroLex.org
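The kind of selection behind such a table can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the wiki's actual query mechanism: the category names, the in-memory `categories` dictionary, and the choice to include indirect subcategories are all hypothetical.

```python
# Hypothetical miniature of wiki content: each category page records its
# parent category and its property annotations. Names are illustrative only.
categories = {
    "Neuron": {"parent": None, "properties": {}},
    "Example cell A": {"parent": "Neuron",
                       "properties": {"has Neurotransmitter": ["Glutamate"]}},
    "Example cell B": {"parent": "Neuron",
                       "properties": {"has Neurotransmitter": ["GABA"]}},
    "Example cell C": {"parent": "Example cell A",
                       "properties": {"has Neurotransmitter": ["Glutamate"]}},
    "Unrelated example": {"parent": None,
                          "properties": {"has Neurotransmitter": ["Glutamate"]}},
}


def is_subcategory_of(name, ancestor):
    """Walk up the parent chain; assumes indirect subcategories also count."""
    parent = categories[name]["parent"]
    while parent is not None:
        if parent == ancestor:
            return True
        parent = categories[parent]["parent"]
    return False


def query(ancestor, prop, value):
    """Return subcategories of `ancestor` whose `prop` includes `value`."""
    return sorted(
        name
        for name, page in categories.items()
        if is_subcategory_of(name, ancestor)
        and value in page["properties"].get(prop, [])
    )


rows = query("Neuron", "has Neurotransmitter", "Glutamate")
for row in rows:
    print(f"{row} | has Neurotransmitter | Glutamate")
```

Because the rows are computed from the annotations each time, the table changes as soon as a contributor edits a property on any category page.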
Dynamic, auto-generated tree representation of the “part of” relationships
available on NeuroLex.org (above).
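A tree of this kind can be derived directly from pairwise “part of” assertions. The Python sketch below shows one way to do it; the `part_of` pairs are illustrative examples chosen for this sketch rather than content taken from NeuroLex, and the function names are hypothetical.

```python
from collections import defaultdict

# Illustrative (part, whole) assertions; not taken from NeuroLex itself.
part_of = [
    ("Forebrain", "Brain"),
    ("Telencephalon", "Forebrain"),
    ("Diencephalon", "Forebrain"),
    ("Cerebral cortex", "Telencephalon"),
    ("Thalamus", "Diencephalon"),
]


def build_children(pairs):
    """Index the assertions by whole, so each whole maps to its parts."""
    children = defaultdict(list)
    for part, whole in pairs:
        children[whole].append(part)
    return children


def find_roots(pairs):
    """Roots are wholes that are never themselves listed as a part."""
    parts = {part for part, _ in pairs}
    wholes = {whole for _, whole in pairs}
    return sorted(wholes - parts)


def render(node, children, depth=0):
    """Return indented lines for `node` and everything that is part of it."""
    lines = ["  " * depth + node]
    for child in sorted(children.get(node, [])):
        lines.extend(render(child, children, depth + 1))
    return lines


children = build_children(part_of)
tree_lines = [line for root in find_roots(part_of) for line in render(root, children)]
print("\n".join(tree_lines))
```

The sketch assumes the assertions contain no cycles; a production version would need to guard against them.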
• An OWL ontology browser. University of Manchester, Stanford University, and other UK institutions.
• Web-based version of the Protégé ontology editor. Stanford University.
• A version of the Protégé ontology editor desktop application that allows multiple users to edit the same OWL file. Stanford University.
ONLINE EDITING / STATISTICS
Below: at-a-glance usage statistics for the NeuroLex website
Total edits since launch (July 2008): ~112,000
Total views since launch (July 2008): 2,077,482
Class (Category) pages: 9,443
Average edits per weekday (human): ~25
Active registered users making edits (includes contributing neuroscientists, curators, administrators): 11
Average hits per weekday: ~250
Left: Example of the form displayed when editing a category page
NIF STANDARD ONTOLOGY
Neuroscience Information Framework Standard Ontology (NIFSTD)
• A collection of OWL modules covering distinct domains of biomedical reality
• A consistent, unified representation of biomedical domains typically used to describe neuroscience data (e.g. anatomy, cell types, techniques)
• Reuses existing community ontologies that contain the required biomedical domains
• Digital resource (tools, database) being created throughout the neuroscience community
• Comprises ~20,000 classes
• Built using the BFO and the OBO Relations Ontology
• Built with OBO Foundry best practices
• Constructed using Protégé and other custom tools
• Provides semantic content for the NIF search engine (http://www.neuinfo.org/nif/nifgwt.html)
Figure: The semantic domains covered in the NIFSTD v0.5 OWL ontology (Bug et al., 2008)
ONTOLOGY CONSTRUCTION: TOP-DOWN VS BOTTOM-UP
Top-down ontology construction
• A select few authors have write privileges
• Maximizes consistency of terms with each other
• Making changes requires approval and re-publishing
• Works best when the domain to be organized has: small corpus, formal categories, stable entities, restricted entities, clear edges
• Works best with participants who are: expert catalogers, coordinated users, expert users
Bottom-up ontology construction
• Multiple participants can edit the ontology instantly
• Semantics are limited to what is convenient for the domain
• Not a replacement for top-down construction, but sometimes necessary to increase flexibility and accessibility for non-ontologist domain experts
• Necessary when the domain has: large corpus, no formal categories, unstable entities, unrestricted entities, no clear edges
• Necessary when participants are: uncoordinated users, amateur users, naïve catalogers
Taken in part from Clay Shirky, “Ontology is overrated: categories, links and tags” (http://www.shirky.com/writings/ontology_overrated.html)
THE NIF STANDARD / NEUROLEX WORKFLOW
Figure: An ontology engineer’s workflow to transition knowledge between NeuroLex and the NIFSTD OWL files
1. Add/Edit NeuroLex Terms/Categories
2. Bulk Upload Request of Terms
3. Identify Valid Contributions
4. Update NIFSTD (Testing)
5. Testing in OntoQuest
6. Testing in BioPortal
7. Keep Persistent Link to Older Versions
8. Update Project Wiki Release Notes
9. Update NIFSTD (Production)
10. Update NeuroLex
11. Update OntoQuest
12. Update BioPortal
13. Update TextPresso Bucket
Diagram labels: NeuroLex, NeuroLex Wiki, Categories, Classes, NIFSTD, NIF-STD OWL Files (pointing to PURLs), OntoQuest, BioPortal, TextPresso, Feedback, RELEASE
At-a-glance guide to the differences between NeuroLex and NIFSTD
NeuroLex
• A Semantic MediaWiki-based website containing the content of the NIFSTD plus additional community contributions
• Content is fluid and can be updated at any time
• Structure of class hierarchy based on OBO Foundry principles, but may deviate in some cases
• Defines relationships between categories as simple properties
NIFSTD
• Collection of cohesive, unified modular ontologies deployed in OWL 1
• Versions are centrally controlled by an ontology engineer / curator
• Ontology structured following OBO Foundry principles
• Defines relationships between classes as OWL restrictions
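To make the last distinction concrete, the Python sketch below renders a simple wiki-style property annotation as an OWL restriction in Turtle syntax. It rests on several assumptions that the poster does not confirm: the existential (`owl:someValuesFrom`) form, the `ex:` prefix, the underscore naming convention, and the example category name are all hypothetical, and the actual NeuroLex-to-NIFSTD conversion may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WikiProperty:
    """A simple property annotation on a wiki category page."""
    category: str  # category page carrying the annotation
    prop: str      # property name as written on the wiki
    value: str     # target category


def local_name(label: str) -> str:
    """Turn a label into a Turtle local name (assumed convention: underscores)."""
    return "_".join(label.split())


def to_owl_restriction(p: WikiProperty, prefix: str = "ex") -> str:
    """Render the annotation as a subclass-of-restriction axiom in Turtle.

    Assumes an existential restriction; other restriction types are possible.
    """
    return (
        f"{prefix}:{local_name(p.category)} rdfs:subClassOf [\n"
        f"    a owl:Restriction ;\n"
        f"    owl:onProperty {prefix}:{local_name(p.prop)} ;\n"
        f"    owl:someValuesFrom {prefix}:{local_name(p.value)}\n"
        f"] ."
    )


# Hypothetical annotation; the category name is made up for illustration.
annotation = WikiProperty("Example glutamatergic neuron", "has Neurotransmitter", "Glutamate")
turtle = to_owl_restriction(annotation)
print(turtle)
```

The wiki side stores one flat triple per annotation, while the OWL side wraps the same information in an anonymous restriction class, which is part of why moving content between the two requires an ontology engineer.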
PROS & CONS
PROS
• Easy to edit/add individual classes
• Content is easily available on the web by navigating to a URL
• Easy to create summary tables and hierarchies of groups of classes
• Recent changes list tracks all edits made globally and per class
• Wikis are increasingly recognized as knowledge systems, e.g. the Society for Neuroscience recently encouraged its members to update Wikipedia & NeuroLex
CONS
• Difficult to do bulk edits or uploads of classes; we are working around this with additional code
• Transforming the entire class hierarchy into RDF/OWL requires manual processing
• All classes are held in the same namespace, which breaks modularity
• Stabilizing a version of the vocabulary requires manual processing (see workflow)
• No silver bullet; it is still difficult to get contributions from neuroscientists without a lot of education and hand-holding
REFERENCES
Bug WJ, Ascoli GA, Grethe JS, Gupta A, Fennema-Notestine C, Laird AR, Larson SD, Rubin D, Shepherd GM, Turner JA, Martone ME (2008) The NIFSTD and BIRNLex Vocabularies: Building Comprehensive Ontologies for Neuroscience. Neuroinformatics, 31 October 2008.
Shirky C. “Ontology is overrated: categories, links and tags” (http://www.shirky.com/writings/ontology_overrated.html)