Visual Contextualisation of Digital Content
Introduction
VICODI (www.vicodi.org) is a collaborative RTD project carried out by seven
partners from six European countries under the 5FP IST programme of the European
Union. The main aim of this project is to develop a novel visual contextualisation
environment for digital content on the Internet. The VICODI system will demonstrate
the benefits of semantics by improving searching and navigation in historical
databases.
The development of a visualisation and contextualisation environment for digital
content addresses the management of searching and retrieval as well as the
management of information presentation in two ways: first, by creating an open
knowledge space that can be enhanced by its users and, second, by providing an
innovative interface that employs Scalable Vector Graphics (SVG) for the
presentation of information. In order to achieve this, the creation of a history
ontology was required.
Fig. 1: VICODI portal enhanced with a visual contextualisation system devoted to
European history
Fig. 4: The expert annotation tool interface.
Fig. 2: The six basic concepts (“flavours”) of our shallow concept hierarchy.
SVG enabled GUI and Knowledge Portal
VICODI has created a web portal with a graphical contextualisation interface, which uses
Scalable Vector Graphics (SVG) to visualise digital content with the help of historical
(period) maps. The portal's web application connects and integrates all the various system
components. This interface is also the basis of a new knowledge portal entitled
eurohistory.net (see Fig. 1).
History ontology building
The main purpose of the history ontology for the VICODI project is to help machine
algorithms in the automatic contextualisation task by storing relevant historical
knowledge in machine processable form. In order to achieve this goal an ontology with
a well-defined formal semantics is needed. The task of devising an ontology of history
is very daunting. On the one hand, it is always challenging to build an ontology
covering a broad and very complex area of knowledge. On the other hand, history has
several unique features which are problematic from an ontological point of view.
Problems
• The complexity of history is immense and requires an almost unlimited number of
instances and property relations. To complicate matters, historians do not focus only on
“what” questions but also on “when”, “where”, “who”, “how” and most importantly on
“why” questions.
• Historical time is uncertain and often debated. It includes many unknown dates,
imprecise intervals (“ca.”, “approximately from ... to ...”, etc.), and overlapping time
(historical periods and events extending into each other without clear start and end
dates). Moreover, many ontology relations are time-dependent.
• Historical sources: there are no comprehensive and large-scale thesauri of history.
During the evaluation of related works in the area we realised that existing approaches to
historical ontologies were not suitable for our purposes. Some (like Hassett or the
UNESCO thesaurus) used non-formal, "intuitive" taxonomies that mixed semantically
different hierarchical relationships ("is-a", "part-of", "member-of"), which made them
unsuitable for machine processing. Others (like the Getty location names) covered only a
tiny area of history, which was too limited for our goals. Finally, the CIDOC CRM ontology
standard (note A) has a formal conceptual hierarchy. However, it is too complex and
inflexible for our domain experts to fill with the necessary domain knowledge
(instances), which it does not presently contain.
Our solutions
• Complexity: We use a shallow concept hierarchy starting from only six basic concepts
(called flavours), which are meaningful for domain/history experts: person, artefact,
group, event, abstract notion and location. The hierarchy below these concepts is
shallow (2-3 levels) and stops at an abstraction level that is already meaningful for
historians but still general enough to make the place of new instances in the ontology
easy to find, which speeds up the population of the ontology with new historical
knowledge. The complexity of history is represented by connecting instances of these
flavours by various property relations.
• Historical time: To deal with the complexity of time we use time intervals and an
event-centric ontology. This means that instances with a time-dependent relation are
connected through an event whose existence time represents the validity of that
connection. For the VICODI prototype the intervals are precisely defined, although a
novel fuzzy temporal model has now been devised (reference) and its use is being explored
for future follow-on projects.
• Historical sources: To get round the lack of general repositories we decided to build our
own ontology of history based upon our empirical deductive analysis of a 2000-document
corpus.
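The shallow hierarchy and the event-centric treatment of time can be illustrated with a minimal Python sketch. Only the six flavour names come from the text; the class names, the `related` helper and the sample data are assumptions made for illustration and do not reflect the KAON data model or the project's actual ontology content.

```python
from dataclasses import dataclass

# The six flavours named in the text; everything else is illustrative.
FLAVOURS = {"person", "artefact", "group", "event", "abstract notion", "location"}


@dataclass(frozen=True)
class Instance:
    name: str
    concept: str  # concept in the shallow hierarchy (hypothetical, e.g. "ruler")
    flavour: str  # one of the six basic concepts

    def __post_init__(self):
        if self.flavour not in FLAVOURS:
            raise ValueError(f"unknown flavour: {self.flavour}")


@dataclass(frozen=True)
class Event:
    """An event whose existence time bounds the validity of a connection."""
    name: str
    start: int           # first year of the existence interval (inclusive)
    end: int             # last year of the existence interval (inclusive)
    participants: tuple  # (role, Instance) pairs

    def holds_in(self, year):
        return self.start <= year <= self.end


def related(events, a, b, year):
    """Names of the events that connect instances a and b in the given year."""
    return [
        e.name
        for e in events
        if e.holds_in(year) and {a, b} <= {inst for _, inst in e.participants}
    ]


# Hypothetical sample data with a precisely defined interval.
napoleon = Instance("Napoleon I", "ruler", "person")
france = Instance("France", "country", "location")
reign = Event("Reign of Napoleon I", 1804, 1814,
              (("ruler", napoleon), ("realm", france)))

print(related([reign], napoleon, france, 1810))
print(related([reign], napoleon, france, 1820))
```

The connection between the two instances is not stored as a direct property; it exists only through the event, so asking whether it holds in a given year reduces to an interval test on the event's existence time.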
The second authoring tool is the expert annotation tool
(see Fig. 4), which is used to insert and adjust the
context settings of the VICODI historical resources.
This tool also allows every resource document
to be described by ontology instances. The
contextualisation engine will provide a starting
set of relevant ontology instances, which the
expert historical editors can correct and extend
with new instances if required.
The portal contains a number of innovative elements stimulating user interaction with its
contents. Users can paste their history-related texts into a special contextualisation box
and have this information automatically processed and classified on the basis of the
Location, Time and Category dimensions of LATCH (Wurman et al, 2000). Moreover, textual
information is visualised on a map of Europe from the corresponding historical period.
European history terms (listed in the VICODI ontology) are automatically highlighted and
their contextual relevance is marked. Ontology search and browsing may be carried out
either by Yahoo-type browsing or by Location (SVG maps), Time (decades from 1000 to
2000 AD) and/or Subject (historical topics). The portal also provides web-based tools for
uploading new historical content and authoring its context.
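As a rough illustration of what the contextualisation box does with pasted text, the sketch below highlights known terms and sorts them into Location, Time and Category facets. The term table, the `[[...]]` highlight markup and the function name are hypothetical; the real portal draws its terms from the VICODI ontology and uses the context engine rather than plain string matching.

```python
import re

# Hypothetical mini-ontology: term -> facet under which it is classified.
ONTOLOGY_TERMS = {
    "Riga": "location",
    "Livonia": "location",
    "Hanseatic League": "category",
}
# Four-digit years inside the portal's 1000-2000 browsing range.
YEAR = re.compile(r"\b(1\d{3}|2000)\b")


def contextualise(text):
    """Return the text with known terms marked, plus the facets found."""
    facets = {"location": [], "time": [], "category": []}
    highlighted = text
    # Longest terms first so multi-word terms are matched before shorter ones.
    for term in sorted(ONTOLOGY_TERMS, key=len, reverse=True):
        pattern = re.compile(r"\b" + re.escape(term) + r"\b")
        if pattern.search(highlighted):
            facets[ONTOLOGY_TERMS[term]].append(term)
            highlighted = pattern.sub(lambda m: f"[[{m.group(0)}]]", highlighted)
    # Years are reduced to decades, the granularity used for Time browsing.
    for year in YEAR.findall(text):
        decade = int(year) // 10 * 10
        if decade not in facets["time"]:
            facets["time"].append(decade)
    return highlighted, facets


marked, found = contextualise("Riga joined the Hanseatic League in 1282.")
print(marked)
print(found)
```

This sketch marks every occurrence equally; it does not attempt the relevance marking described above, which depends on the correlation scores produced by the context engine.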
Management System of Knowledge Space
The Management System of Knowledge Space (MSKS) component is the core of the VICODI
architecture. It provides for the continuous storage and management of both the ontology and
the contextualised historical documents (repository). The MSKS is based on KAON for the
ontology storage and management.
Context Engine, Transformation Engine and Multilinguality
The context engine uses text categorisation to build correlation scores between documents and
the notions in the VICODI ontology. This allows the system to enhance the documents'
visualisation and linkage, giving users a faster and more intuitive understanding of a
document's position among the notions represented in the ontology. The transformation engine
processes the relevant contextual information from the context engine and outputs it either by
transforming it into SVG instances (dynamic maps) or by generating hyperlinked
(contextualised) documents for display in the user interface. The transformation engine will
support both graphics-based visualisation and text-based presentation of the contexts.
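The two engines can be pictured as a small pipeline: score a document against each notion, then render the scores graphically. The sketch below is an assumption-laden stand-in. It uses bag-of-words cosine similarity in place of the project's text categorisation method, which the text does not specify, and the notion profiles, coordinates and function names are hypothetical.

```python
import math
import re
from collections import Counter
from xml.etree import ElementTree as ET


def tokens(text):
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def score_document(text, notion_profiles):
    """Correlation score between one document and every notion profile."""
    doc = Counter(tokens(text))
    return {name: cosine(doc, Counter(tokens(profile)))
            for name, profile in notion_profiles.items()}


def to_svg(scores, positions, size=200):
    """Draw one circle per scored notion; the radius grows with the score."""
    svg = ET.Element("svg", xmlns="http://www.w3.org/2000/svg",
                     width=str(size), height=str(size))
    for notion, score in scores.items():
        if notion in positions and score > 0:
            x, y = positions[notion]
            circle = ET.SubElement(svg, "circle", cx=str(x), cy=str(y),
                                   r=str(round(5 + 20 * score, 1)))
            ET.SubElement(circle, "title").text = notion
    return ET.tostring(svg, encoding="unicode")


# Hypothetical notion profiles and map coordinates.
profiles = {"Riga": "riga city port livonia daugava",
            "Paris": "paris city france seine"}
positions = {"Riga": (150, 40), "Paris": (60, 120)}

scores = score_document("The port of Riga lies on the Daugava.", profiles)
svg = to_svg(scores, positions)
print(scores)
print(svg)
```

Keeping scoring and rendering as separate functions mirrors the split described above: the same scores could equally feed a routine that generates hyperlinked text instead of SVG.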
Another novel aspect of VICODI is the multilinguality offered by Systran (note B). The KAON
framework provides, for each language, a lexical layer on top of a language-independent
ontology core. Using that feature, it is possible to translate the language-dependent part of
the ontology without disturbing its logical structure. Historical information will be
translated via an automatic translation tool into English, German, French and Latvian.
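The separation between a language-independent core and per-language lexical layers can be sketched with plain dictionaries. The identifiers, labels and helper functions below are hypothetical and are not KAON's API; the point is only that adding or changing labels never touches the core.

```python
# Language-independent core: identifiers and their relations only.
CORE = {
    "concepts": {"c:location", "c:person"},
    "instances": {"i:riga": "c:location"},  # instance id -> concept id
}

# Lexical layer: one label table per language, keyed by core identifier.
LEXICON = {
    "en": {"c:location": "location", "c:person": "person", "i:riga": "Riga"},
    "de": {"c:location": "Ort", "c:person": "Person", "i:riga": "Riga"},
    "lv": {"c:location": "vieta", "i:riga": "Rīga"},
}


def known_ids():
    return CORE["concepts"] | set(CORE["instances"])


def add_labels(lang, labels):
    """Add translations for existing core entities; the core is not modified."""
    unknown = set(labels) - known_ids()
    if unknown:
        raise KeyError(f"not in ontology core: {sorted(unknown)}")
    LEXICON.setdefault(lang, {}).update(labels)


def label(entity_id, lang, fallback="en"):
    """Label in the requested language, falling back to English, then the id."""
    found = LEXICON.get(lang, {}).get(entity_id)
    return found or LEXICON[fallback].get(entity_id, entity_id)


add_labels("fr", {"c:location": "lieu", "i:riga": "Riga"})
print(label("i:riga", "lv"))
print(label("c:person", "lv"))
print(label("c:location", "fr"))
```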
Conclusions
VICODI will provide an example of how a visualisation and contextualisation environment for
humanities digital content can be built. Some of the most important results are:
• The creation of a usable and extensible European history ontology.
• Complexity within the history ontology can most easily be represented by a shallow concept
hierarchy and property relations.
• The capability of KAON to provide programmatic access to the ontology makes it possible to
mass upload instances with the aid of textual glossaries or Excel sheets.
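The mass upload mentioned in the last point can be pictured as a short script that reads a glossary exported as CSV and adds each row as an instance. This is a sketch in plain Python, not KAON's programmatic interface; the column names, the nested-dictionary ontology and the sample rows are assumptions.

```python
import csv
import io

FLAVOURS = {"person", "artefact", "group", "event", "abstract notion", "location"}


def load_glossary(rows, ontology):
    """Add (name, concept, flavour) rows to the ontology; return rejected rows."""
    rejected = []
    for row in rows:
        name = row["name"].strip()
        concept = row["concept"].strip()
        flavour = row["flavour"].strip().lower()
        if not name or not concept or flavour not in FLAVOURS:
            rejected.append(row)  # left for manual correction by the expert
            continue
        ontology.setdefault(flavour, {}).setdefault(concept, set()).add(name)
    return rejected


# Hypothetical glossary, e.g. exported from a spreadsheet.
GLOSSARY = """name,concept,flavour
Riga,city,location
Hanseatic League,trade association,group
Magna Carta,charter,artifact
"""

ontology = {}
rejected = load_glossary(csv.DictReader(io.StringIO(GLOSSARY)), ontology)
print(ontology)
print([row["name"] for row in rejected])
```

The third row is rejected because its flavour is spelled differently from the six basic concepts, which shows why a fixed, small set of top-level concepts makes bulk input easy to validate.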
Authoring tools
VICODI has two authoring tools. The first is a
revised ontology editor based on KAON, which
extends the W3C RDFS standard and was developed
by FZI, one of the project partners (Maedche et al,
2002). The tool supports scalable editing of
ontologies and addresses some usability issues
related to ontology management. The editor
provides several windows for representing the
ontology and tools for editing and adding
concepts, instances and property relations
(see Fig. 3).
Fig. 3: The user interface of the ontology editor.
Thanks to KAON’s capability to provide programmatic access to the ontology it is also
possible to add a huge number of instances and concepts to the ontology by processing
textual glossaries or Excel sheets. This - together with our simple and intuitive concept
hierarchy - significantly sped up the ontology populating process, as the history domain
experts could use their preferred software tools while codifying their knowledge. The
manual functions of the ontology editor are therefore only needed to carry out some
advanced operations, like relocating existing concepts and instances, adding new
connections and visualising the existing ontology structure.
References
Maedche, A. et al, 2002. “A Conceptual Modeling Approach for Semantics-Driven
Enterprise Applications”, Proceedings of the First International Conference on
Ontologies, Databases and Application of Semantics (ODBASE-2002), Springer, LNAI.
Motik, B. et al, 2003. “A Fuzzy Model for Representing Uncertain, Subjective and
Vague Temporal Knowledge in Ontologies”, Proceedings of the Second International
Conference on Ontologies, Databases and Application of Semantics (ODBASE-2003),
Springer, LNAI.
Wurman, R.S. et al, 2000. Information Anxiety 2, Que, Indianapolis.
A. http://cidoc.ics.forth.gr/official_release_cidoc.html
B. SYSTRAN, White Papers, MT Summit 2001, 2002, 2003,
http://www.systransoft.com/Technology/WhitePapers.html