The use of concept maps and automatic terminology extraction during the development of a domain ontology. Lessons learnt.

Alexander Garcia C
CGIAR/CIAT
National Bioinformatics Network, South Africa

“The world was so recent that many things lacked names, and in order to indicate them it was necessary to point”

“What's in a name? That which we call a rose by any other name would smell as sweet”

Outline
• Background information
• Our scenario
• Modifications
• Terminology extraction
• Story telling within ontology development
• Brief discussion and future work

A semantic web scenario
1. Distributed information processing with ontologies: within the Semantic Web (SW) scenario, ontologies are developed by geographically distributed domain experts willing to collaborate, whereas knowledge engineering (KE) deals with centrally developed ontologies.
2. Domain expert-centric design: within the SW scenario, domain experts guide the effort while the knowledge engineer assists them. There is a clear and dynamic separation between the domain of knowledge and the operational domain. In contrast, traditional KE approaches relegate the expert to the role of an informant to the knowledge engineer.
3. Ontologies are in constant evolution in the SW scenario, whereas in KE scenarios ontologies are simply developed and deployed.
4. Additionally, within the SW scenario, the knowledge engineer should provide fine-grained guidance to the domain experts.

Methodologies compared: Enterprise Methodology, TOVE Methodology, Unified Methodology, Methontology, Diligent, GM
Criteria: Description of stages, Terminology extraction, Generality, Ontology evaluation, Distributed / decentralized, Usability, Supporting software
Table entries, in the order they appear in the transcript: High-level description of stages. | High-level description of stages. | Stages are described for the chemical ontology. | High level description. | N/A | Detail is provided for those ontologies developed with this methodology | N/A | N/A | N/A | N/A | High level description as well as detailed information for each step | N/A | Not domain specific | Not domain specific | Not domain specific | Not domain specific | Not domain specific | Competency questions | Competency questions and formal axioms | No evaluation method is provided | An informal evaluation method is used for the Chemical ontology. | No | No | No | No | The community evaluates the ontology; agreement process. | Yes | N/A | Chemical ontology | N/A | N/A | WebODE | N/A | N/A | N/A | Business and foundational ontologies | N/A | Not domain specific | No evaluation method is provided | Yes | Protégé, CMap tools

CMs and the Ontology development process

Concept maps

Our scenario
• Following Neches, “an ontology defines the basic terms and relations comprising the vocabulary of a topic area, as well as the rules for combining terms and relations to define extensions to the vocabulary”
• The main goal for the GMS ontology is to describe the breeding history of germplasm
• The GMS is part of a bigger picture, LIMS
• Our domain experts:
– Confusion between database schemata and ontology. Domain experts were not fully aware of the difference between the conceptual and the relational model
– Domain experts were at the same time users, designers, developers, and policy makers of a particular kind of GMS. Their view of the process was too broad, while their view of the software was too narrow

Our competency questions
• Does the germplasm belong to an out-breeding, in-breeding or vegetatively reproduced species?
• Is the germplasm homozygous or heterozygous?
• Is the germplasm homogeneous or heterogeneous?
• What type of cultivar (fixed lines, hybrid, clone, etc.) is formed?
• How has the germplasm been stored?
• Where did this germplasm come from (e.g. how did I get it)?
• What are its parents, grandparents, ancestors, descendants, and relatives?
• What probability do they have of having genes in common?
• What proportion of genes is expected to come from a list of ancestors?
• What parents do they have in common?
• Given an allele of a gene, from which ancestor did it come?

Modifications to the GM process

CMs, what for?
• Knowledge elicitation
• Supporting an argumentative structure
• Annotation/documentation of the evolution of the ontology
• Instances, classes, process

Terminology extraction and CMs
• We used Text2Onto (KAON) as our terminology extraction tool
• In parallel to our terminology extraction exercises, our domain experts were building informal ontology models
• By extracting terminology we could gather instances and candidate classes without yet distinguishing between them
• By combining terminology extraction and CMs we could “classify words” and at the same time gather a narrative that helped us to validate the ontology

Reorganizing our conceptual structure
• This task resembled the card-sorting technique in many ways, but it was also a participatory story-telling exercise. Once the re-shuffling was complete and the narratives had been analyzed, our baseline ontology (i.e. one containing only the seminal elements of an ontology) was ready, along with a set of instances.

Narratives and ontology evaluation
• To help the knowledge engineer harmonize the concept maps gathered, domain experts were asked to tell a unified story that brought the different concept maps together.

Brief discussion
• Decentralized thinking
• The “man” in the middle
• Empower domain experts
• Separation between operational domain and domain of knowledge
• Card sorting may not be so applicable

Future work
• How can Cmaps support a collaborative environment?
• Better support terminology extraction
• Better understanding of the ontology life-cycle

Acknowledgments
• Mark Ragan, Ben Goodman, Christian Tempich, Angela Norena… And all of my domain experts for their patience, support and understanding.
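
Illustrative note on the competency questions
Several of the competency questions listed earlier (parents and ancestors, parents in common, expected proportion of genes from a list of ancestors) can be read as traversals over pedigree records. The sketch below is only an illustration of that reading, not the GMS ontology or any part of the tooling described in these slides. The pedigree data, the identifiers, and the function names are all hypothetical, and the contribution calculation assumes that each recorded parent contributes one half of the genes, with no selection.

```python
from fractions import Fraction

# Hypothetical pedigree: each germplasm ID maps to its two recorded parents.
# Founders have no entry. All identifiers are invented for illustration.
PEDIGREE = {
    "C": ("A", "B"),
    "D": ("A", "B"),
    "E": ("C", "F"),
    "G": ("E", "D"),
}


def parents(germplasm):
    """Return the recorded parents, or an empty tuple for a founder."""
    return PEDIGREE.get(germplasm, ())


def ancestors(germplasm):
    """Collect every ancestor reachable through recorded parent links."""
    found = set()
    stack = list(parents(germplasm))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(parents(current))
    return found


def common_parents(first, second):
    """Return the parents that two germplasm entries share."""
    return set(parents(first)) & set(parents(second))


def expected_contribution(germplasm, ancestor):
    """Expected proportion of genes that germplasm derives from ancestor.

    Assumption: each parent contributes one half, so the result is the
    sum over all pedigree paths of (1/2) ** path_length.
    """
    if germplasm == ancestor:
        return Fraction(1)
    return sum(
        (Fraction(1, 2) * expected_contribution(p, ancestor)
         for p in parents(germplasm)),
        Fraction(0),
    )


if __name__ == "__main__":
    print(sorted(ancestors("G")))
    print(sorted(common_parents("C", "D")))
    for name in ("A", "F"):
        print(name, expected_contribution("G", name))
```

Exact fractions are used so that contributions along different pedigree paths add up without rounding error. The remaining questions (for example, tracing a given allele to an ancestor) would need genotype data that this sketch does not model.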