NeuroLex.org: A comprehensive semantic wiki for neuroscience

Stephen D. Larson, Fahim Imam, Maryann E. Martone
Department of Neuroscience, University of California, San Diego, CA, USA

INTRODUCTION

Bridging the domain knowledge of a scientific community and the knowledge engineering skills of the ontology community is still an imperfect practice. Within neuroscience, we have tried to close this gap by presenting an ontology through a wiki in which each page corresponds to a class. By opening it to the World Wide Web, we have made the process of maintaining a ~20,000-concept neuroscience ontology (NIFSTD) more collaborative.

The neuroscience community needs its basic domain concepts organized into a coherent framework. Ontologies provide an important medium for reconciling knowledge into a portable, machine-readable form. For many years we have been building community ontologies for neuroscience, first through the Biomedical Informatics Research Network and now through the Neuroscience Information Framework projects (http://neuinfo.org). These projects produced a large modular ontology, the NIF Standard Ontology (NIFSTD; Bug et al., 2008), built by importing existing ontologies where possible. One of the largest roadblocks we encountered was the lack of tools that let domain experts view, edit, and contribute their knowledge to NIFSTD. Existing editing tools were difficult to use or required expert knowledge to employ. By combining NIFSTD with several open source technologies related to semantic wikis, we have created NeuroLex.org, a semantic wiki for neuroscience.

COLLABORATIVE ONTOLOGY MANAGEMENT SYSTEM

TECHNOLOGIES USED:

[Figure: Example of a category page with inferred relationships]
[Figure: Example of a category page for “Neuron” with subcategories]
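The category pages described above present each ontology class as a wiki page and list its subcategories, including inferred ones. The following is a minimal Python sketch of that idea, not the NeuroLex implementation: it assumes each page records a single parent category and derives the indirect subcategories by walking the hierarchy. The sample category names, the `PARENT_OF` mapping, and the function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical sample data (not taken from NeuroLex): child category -> parent category.
PARENT_OF = {
    "Neuron": "Cell",
    "Pyramidal cell": "Neuron",
    "Hippocampal CA1 pyramidal cell": "Pyramidal cell",
    "Purkinje cell": "Neuron",
}


def build_children(parent_of):
    """Invert the child -> parent mapping into parent -> set of direct children."""
    children = defaultdict(set)
    for child, parent in parent_of.items():
        children[parent].add(child)
    return children


def inferred_subcategories(category, parent_of):
    """Return all direct and indirect subcategories of a category, sorted by name."""
    children = build_children(parent_of)
    found = set()
    stack = [category]
    while stack:
        current = stack.pop()
        for child in children.get(current, ()):
            if child not in found:
                found.add(child)
                stack.append(child)
    return sorted(found)


if __name__ == "__main__":
    for name in inferred_subcategories("Neuron", PARENT_OF):
        print(name)
```

Here "Hippocampal CA1 pyramidal cell" is reported under "Neuron" even though its page only names "Pyramidal cell" as its parent, which is the kind of inferred relationship a category page can display.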
RELATED TOOLS & SYSTEMS

The current incarnation of NeuroLex draws significantly on interactions with other groups and on experience with the tools and systems listed below. Early conversations with the BiomedGT group were particularly helpful.

• BiomedGT: The Biomedical Grid Terminology (BiomedGT) is an open, collaboratively developed terminology for translational research. This wiki is being developed by the National Cancer Institute Center for Bioinformatics and the Mayo Clinic Division of Biomedical Informatics, with contributions from Apelon, Inc., Northrop Grumman and Dionne-Associates Inc.
• LexWiki Distributed Terminology Development Platform: A collaborative content development platform based on Semantic MediaWiki. Most similar to NeuroLex. Mayo Clinic Informatics group.
• BioPortal: A Web-based application for accessing and sharing biomedical ontologies. The National Center for Biomedical Ontology.
• OwlSight: An OWL ontology browser that runs in any web browser. OwlSight is the client component and uses Pellet as its OWL reasoner. Clark & Parsia.
• An OWL ontology browser. University of Manchester, Stanford University, and other UK institutions.
• A Web-based version of the Protégé ontology editor. Stanford University.
• A version of the Protégé ontology editor desktop application that allows multiple users to edit the same OWL file. Stanford University.

[Figure: Dynamic, auto-generated table displaying subcategories of “Neuron” that are related to “Glutamate” via the “has Neurotransmitter” property on NeuroLex.org]
[Figure: Dynamic, auto-generated tree representation of the “part of” relationships available on NeuroLex.org]

ONLINE EDITING / STATISTICS

[Figure: Example of the form displayed when editing a category page]

At-a-glance statistics for usage of the NeuroLex website:

Statistic | Value
Total edits since launch | ~112,000 (July 2008)
Total views since launch | 2,077,482 (July 2008)
Class (Category) pages | 9,443
Average edits per weekday (human) | ~25
Active registered users making edits (includes contributing neuroscientists, curators, administrators) | 11
Average hits per weekday | ~250

NIF STANDARD ONTOLOGY

Neuroscience Information Framework Standard Ontology (NIFSTD)

• A collection of OWL modules covering distinct domains of biomedical reality
• A consistent, unified representation of biomedical domains typically used to describe neuroscience data (e.g. anatomy, cell types, techniques)
• Reuses existing community ontologies that contain the required biomedical domains
• Digital resource (tools, database) being created throughout the neuroscience community.
• Comprises ~20,000 classes.
• Built using the BFO, OBO Relations Ontology
• Built with OBO foundry best practices
• Constructed using Protégé and other custom tools
• Provides semantic content for the NIF search engine (http://www.neuinfo.org/nif/nifgwt.html)

[Figure: The semantic domains covered in the NIFSTD v0.5 OWL ontology (Bug et al., 2008)]

THE NIF STANDARD / NEUROLEX WORKFLOW

At-a-glance guide to the differences between NeuroLex and NIFSTD:

NeuroLex | NIFSTD
A Semantic MediaWiki based website containing the content of the NIFSTD plus additional community contributions | Collection of cohesive, unified modular ontologies deployed in OWL 1
Categories | Classes
Content is fluid and can be updated at any time | Versions are centrally controlled by an ontology engineer / curator
Structure of class hierarchy based on OBO foundry principles, but may deviate in some cases | Ontology structured following OBO foundry principles
Defines relationships between categories as simple properties | Defines relationships between classes as OWL restrictions

An ontology engineer’s workflow to transition knowledge between the NeuroLex and the NIFSTD OWL files:

1. Add/Edit NeuroLex Terms/Categories
2. Bulk Upload Request of Terms
3. Identify Valid Contributions
4. Update NIFSTD (Testing)
5. Testing in OntoQuest
6. Testing in BioPortal
7. Keep Persistent Link to Older Versions
8. Update Project Wiki Release Notes
9. Update NIFSTD (Production)
10. Update NeuroLex
11. Update OntoQuest
12. Update BioPortal
13. Update TextPresso Bucket

[Figure: Workflow diagram. Node labels: NeuroLex, NeuroLex Feedback, NIF-STD OWL Files, NIF-STD OWL Files (Pointing to PURLs), OntoQuest, BioPortal, TextPresso, RELEASE]

ONTOLOGY CONSTRUCTION: TOP-DOWN VS BOTTOM-UP

Top-down ontology construction
• A select few authors have write privileges
• Maximizes consistency of terms with each other
• Making changes requires approval and re-publishing
• Works best when domain to be organized has: small corpus, formal categories, stable entities, restricted entities, clear edges.
• Works best with participants who are: expert catalogers, coordinated users, expert users.

Bottom-up ontology construction
• Multiple participants can edit the ontology instantly.
• Semantics are limited to what is convenient for the domain.
• Not a replacement for top-down construction, but sometimes necessary to increase flexibility and accessibility for non-ontologist domain experts
• Necessary when domain has: large corpus, no formal categories, unstable entities, unrestricted entities, no clear edges
• Necessary when participants are: uncoordinated users, amateur users, naïve catalogers

Taken in part from Clay Shirky, “Ontology is overrated: categories, links and tags” (http://www.shirky.com/writings/ontology_overrated.html)

PROS & CONS

PROS | CONS
Easy to edit/add individual classes | Difficult to do bulk edits or uploads of classes; working around this with additional code
Content is easily available on the web by navigating to a URL | Transforming entire class hierarchy into RDF/OWL requires manual processing
Easy to create summary tables and hierarchies of groups of classes | All classes are held in the same namespace, which breaks modularity
Recent changes list tracks all edits made globally and per class | Stabilizing a version of the vocabulary requires manual processing (see workflow)
Wikis more recognized as knowledge systems, e.g. Society for Neuroscience recently encouraging its members to update Wikipedia & NeuroLex | No silver bullet; still difficult to get contributions from neuroscientists without a lot of education and hand holding

REFERENCES

Bug WJ, Ascoli GA, Grethe JS, Gupta A, Fennema-Notestine C, Laird AR, Larson SD, Rubin D, Shepherd GM, Turner JA, Martone ME (2008) The NIFSTD and BIRNLex Vocabularies: Building Comprehensive Ontologies for Neuroscience. Neuroinformatics, 31 October 2008.
Shirky C. “Ontology is overrated: categories, links and tags” (http://www.shirky.com/writings/ontology_overrated.html)
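The dynamic table mentioned earlier lists the subcategories of “Neuron” that are related to “Glutamate” via the “has Neurotransmitter” property. On NeuroLex.org that table is generated by the wiki itself; the Python sketch below does not use the wiki's query syntax and only illustrates the selection logic, assuming each page stores one parent category and a dictionary of simple properties. The `PAGES` sample data and the function names are hypothetical, and the cell types shown are illustrative rather than copied from NeuroLex.

```python
# Hypothetical page records: one parent category plus simple property values.
PAGES = {
    "Neuron": {"parent": None, "properties": {}},
    "Pyramidal cell": {
        "parent": "Neuron",
        "properties": {"has Neurotransmitter": ["Glutamate"]},
    },
    "Cerebellar granule cell": {
        "parent": "Neuron",
        "properties": {"has Neurotransmitter": ["Glutamate"]},
    },
    "Purkinje cell": {
        "parent": "Neuron",
        "properties": {"has Neurotransmitter": ["GABA"]},
    },
    "Astrocyte": {"parent": "Glial cell", "properties": {}},
}


def is_subcategory(name, root, pages):
    """True if `root` appears anywhere on the parent chain of `name`."""
    parent = pages[name]["parent"]
    while parent is not None:
        if parent == root:
            return True
        # Parents without a page of their own end the chain.
        parent = pages.get(parent, {}).get("parent")
    return False


def subcategories_with_property(root, prop, value, pages):
    """List subcategories of `root` whose property `prop` includes `value`."""
    return sorted(
        name
        for name, page in pages.items()
        if is_subcategory(name, root, pages)
        and value in page["properties"].get(prop, [])
    )


if __name__ == "__main__":
    rows = subcategories_with_property(
        "Neuron", "has Neurotransmitter", "Glutamate", PAGES
    )
    for row in rows:
        print(row)
```

Because relationships between categories are stored as simple properties, a filter like this needs only a membership test on the property values; the corresponding NIFSTD representation as OWL restrictions would require a reasoner or an OWL-aware query instead.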