The use of concept maps and
automatic terminology extraction
during the development of a
domain ontology. Lessons learnt.
Alexander Garcia C
CGIAR/CIAT
National Bioinformatics Network,
South Africa
“The world was so
recent that many
things lacked names,
and in order to
indicate them it was
necessary to point”
“What's in a name? That
which we call a rose
by any other name would
smell as sweet”
Outline
• Background information
• Our scenario
• Modifications
• Terminology extraction
• Story telling within ontology development
• Brief discussion and future work
A semantic web scenario
1. Distributed information processing with ontologies: within the Semantic Web
(SW) scenario, ontologies are developed by geographically distributed domain
experts willing to collaborate, whereas knowledge engineering (KE) deals with
centrally developed ontologies.
2. Domain expert-centric design: within the SW scenario, domain experts
guide the effort while the knowledge engineer assists them. There is a clear
and dynamic separation between the domain of knowledge and the
operational domain. In contrast, traditional KE approaches relegate the expert
to the role of an informant to the knowledge engineer.
3. Ontologies are in constant evolution in SW, whereas in KE scenarios
ontologies are simply developed and deployed.
4. Additionally, within the SW scenario, fine-grained guidance should be
provided by the knowledge engineer to the domain experts.
Enterprise Methodology
TOVE Methodology
Unified Methodology
Methontology
Diligent
GM
Description of stages
High-level description of stages.
High-level description of stages.
Stages are described for the chemical ontology.
High-level description.
Terminology extraction
Generality
N/A
Detail is provided for those ontologies developed with this methodology
N/A
N/A
N/A
N/A
High-level description as well as detailed information for each step
N/A
Not domain specific
Not domain specific
Not domain specific
Not domain specific
Not domain specific
Ontology evaluation
Competency questions
Competency questions and formal axioms
No evaluation method is provided
An informal evaluation method is used for the Chemical ontology.
Distributed / decentralized
No
No
No
No
The community evaluates the ontology; agreement process.
Yes
N/A
Chemical ontology
N/A
N/A
WebODE
N/A
Usability
N/A
Supporting software
N/A
Business and foundational ontologies
N/A
Not domain specific
No evaluation method is provided
Yes
Protégé, CMap tools
CMs and the Ontology development
process
Concept maps
Our scenario
• Following Neches, “an ontology defines the basic terms and
relations comprising the vocabulary of a topic area, as well as the
rules for combining terms and relations to define extensions to the
vocabulary”
• The main goal for the GMS ontology is to describe the breeding
history of germplasm
• The GMS is part of a bigger picture, LIMS
• Our domain experts:
– Confusion between database schemata and ontology: domain experts were
not fully aware of the difference between the conceptual and the relational
model
– Domain experts were at the same time users, designers, developers, and
policy makers of a particular kind of GMS. Their view of the process was
too broad, while their view of the software was too narrow
Our competency questions
• Does the germplasm belong to an out-breeding, in-breeding or
vegetatively reproduced species?
• Is the germplasm homozygous or heterozygous?
• Is the germplasm homogeneous or heterogeneous?
• What type of cultivar (fixed lines, hybrid, clone, etc.) is formed?
• How has the germplasm been stored?
• Where did this germplasm come from (e.g. how did I get it)?
• What are its parents, grandparents, ancestors, descendants, and
relatives?
• What probability do they have of having genes in common?
• What proportion of genes is expected to come from a list of
ancestors?
• What parents do they have in common?
• Given an allele of a gene, from which ancestor did it come?
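Several of the genealogy questions above (ancestors, common parents, expected proportion of genes from an ancestor) can be read as queries over a pedigree. The following is only a minimal Python sketch of that reading, not the GMS ontology itself. The pedigree, the germplasm names and the function names are hypothetical, and the contribution calculation assumes that each recorded parent contributes half of the genes, a simplification that ignores selection and vegetatively reproduced material.

```python
from fractions import Fraction

# Hypothetical pedigree: each germplasm maps to its recorded parents.
# Founders have no recorded parents. All names are invented for illustration.
PEDIGREE = {
    "A": (), "B": (), "C": (),
    "AB": ("A", "B"),
    "BC": ("B", "C"),
    "X": ("AB", "BC"),
}

def ancestors(germplasm):
    """Return the set of all recorded ancestors of a germplasm."""
    found = set()
    stack = list(PEDIGREE[germplasm])
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(PEDIGREE[current])
    return found

def common_parents(first, second):
    """Return the parents that two germplasm entries share."""
    return set(PEDIGREE[first]) & set(PEDIGREE[second])

def expected_contribution(germplasm, ancestor):
    """Expected proportion of genes coming from an ancestor.

    Assumption: every parent contributes exactly one half.
    """
    if germplasm == ancestor:
        return Fraction(1)
    return sum(
        (Fraction(1, 2) * expected_contribution(parent, ancestor)
         for parent in PEDIGREE[germplasm]),
        Fraction(0),
    )
```

For example, `ancestors("X")` collects both parents and all three founders, while `expected_contribution("X", "B")` adds up the two paths that lead from X back to B.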
Modifications to the GM process
CMs, what for?
• Knowledge elicitation
• Supporting an argumentative structure
• Annotation/documentation of the
evolution of the ontology
• Instances, classes, process
Terminology extraction and
CMs
• We used Text2Onto (KAON) as our terminology
extraction tool
• In parallel to our terminology extraction exercises,
our domain experts were building informal ontology
models
• By extracting terminology we could gather both instances
and possible classes, without distinguishing between them
• By combining terminology extraction and CMs we
could “classify words” and at the same time gather a
narrative that helped us to validate the ontology
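To give a feel for what a terminology extraction pass produces, here is a minimal frequency-based sketch in Python. It is not how Text2Onto works; it only counts single words and two-word phrases after dropping a small, hand-picked stopword list. The function name, the stopword list and the `min_count` threshold are assumptions made for this illustration.

```python
import re
from collections import Counter

# A tiny, hand-picked stopword list (an assumption for this sketch).
STOPWORDS = {"the", "of", "a", "an", "is", "and", "to", "in",
             "from", "was", "by", "with"}

def candidate_terms(text, min_count=2):
    """Return candidate terms (words and two-word phrases) with their counts."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    # Single words that are not stopwords.
    for token in tokens:
        if token not in STOPWORDS:
            counts[token] += 1
    # Adjacent word pairs in which neither word is a stopword.
    for first, second in zip(tokens, tokens[1:]):
        if first not in STOPWORDS and second not in STOPWORDS:
            counts[f"{first} {second}"] += 1
    return {term: n for term, n in counts.items() if n >= min_count}

# Invented sample text, not taken from the project corpus.
SAMPLE = ("The breeding line was derived from a landrace. "
          "The landrace is stored in the gene bank. "
          "The breeding line is a fixed line.")
TERMS = candidate_terms(SAMPLE)
```

The output is a flat list of candidates: nothing in it says whether "landrace" is a class or an instance, which is the kind of decision that was left to the domain experts and their concept maps.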
Reorganizing our conceptual structure
• This task resembled in many ways the card-sorting technique, but also a participatory story-telling exercise.
Once the reshuffling was complete and the narratives analyzed, our baseline ontology (i.e. one containing
only the seminal elements of an ontology) was ready, along with a set of instances.
Narratives and ontology evaluation
• To assist the knowledge engineer in harmonizing the concept maps gathered,
domain experts were required to tell a unified story that brought those different
concept maps together.
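One way to picture the harmonization step is to treat each concept map as a set of propositions of the form (concept, linking phrase, concept) and to look for what two maps share. The sketch below does this in Python under that assumption. Both maps, their contents and the function names are invented for illustration and do not reproduce the experts' actual maps or the procedure that was followed.

```python
# Each concept map is a set of propositions: (concept, linking phrase, concept).
# Both maps are invented examples.
MAP_BREEDER = {
    ("cultivar", "is derived from", "cross"),
    ("cross", "has", "female parent"),
    ("cross", "has", "male parent"),
}
MAP_CURATOR = {
    ("cultivar", "is stored in", "gene bank"),
    ("cultivar", "is derived from", "cross"),
    ("accession", "is stored in", "gene bank"),
}

def concepts(cmap):
    """Return every concept that appears in a concept map."""
    return {c for source, _, target in cmap for c in (source, target)}

def harmonize(first, second):
    """Merge two maps and report where they already overlap.

    Returns (merged propositions, shared concepts, propositions in both maps).
    """
    merged = first | second
    shared = concepts(first) & concepts(second)
    agreed = first & second
    return merged, shared, agreed

MERGED, SHARED, AGREED = harmonize(MAP_BREEDER, MAP_CURATOR)
```

The shared concepts are natural starting points for the unified story; propositions found in only one map are the ones the narrative has to connect or reconcile.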
Brief discussion
• Decentralized thinking
• The “man” in the middle
• Empower domain experts
• Separation between operational domain
and domain of knowledge
• Card sorting may not be so applicable
Future work
• How can Cmaps support a collaborative
environment?
• Better support for terminology extraction
• Better understanding of the ontology
life-cycle
Acknowledgments
• Mark Ragan, Ben Goodman, Christian
Tempich, Angela Norena…
And all of my domain experts for their
patience, support and understanding.