Integrating Structure and Semantics into Audio-visual Documents Raphaël Troncy Tuesday 21st of October, 2003 2nd International Semantic Web Conference (ISWC2003)


Description of the AV content
• A three-step process:
– identification of the content creator and the content provider: Dublin Core metadata, VRA Core categories …
– structural decomposition into video segments corresponding to the logical structure of the program: time-code, spatial coordinates
– semantic description of these segments: controlled vocabulary, thesaurus, free-text annotation
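The three steps above can be pictured as one record per program. The following is a minimal Python sketch, not taken from the talk: the class names, field names and sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Step 2 (structure) and step 3 (semantics) for one video segment."""
    start_s: int          # time-code of the segment start, in seconds
    duration_s: int       # length of the segment, in seconds
    genre: str            # term from a controlled vocabulary
    theme: str            # term from a thesaurus
    annotation: str = ""  # free-text annotation


@dataclass
class Description:
    """Step 1 (identification) plus the list of described segments."""
    creator: str
    provider: str
    segments: list = field(default_factory=list)

    def by_theme(self, theme):
        """Return the segments annotated with the given theme."""
        return [s for s in self.segments if s.theme == theme]


# Placeholder values, not taken from a real archive record.
doc = Description(creator="creator-name", provider="provider-name")
doc.segments.append(Segment(0, 120, "StudioSequence", "Multisports"))
doc.segments.append(Segment(120, 45, "Interview", "Cycling", "interview of a rider"))

cycling = doc.by_theme("Cycling")
print([(s.genre, s.start_s, s.duration_s) for s in cycling])
```

A flat record like this supports lookups by theme, but it says nothing about how segments nest or what the terms mean.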
10/21/2003
Raphaël Troncy - ISWC'2003
1
Description of the AV content
• Segmentation: describe the logical structure
– locate and date some events on the timeline (time t)
• Description: describe the semantics of the content
– type each segment with an AV genre (e.g. report)
– type each segment with a general theme (e.g. athletics)
– describe the scene (who, when, where, what, …), e.g. "Michael Johnson smashed the 200 m world record, running the distance in 19.32 s at the Olympic Games in Atlanta"
Example
13 [Indoor Set: 6th part]
at 18:43:56:00 - 00:09:06:00. – Eurosport
In studio, the second part of the interview, from Nice, of Sandy CASAR
by Jean René GODART about the Paris-Nice cycling race and a few
sports news with pictures commented by Alexandre BOYON and
Laurent PUYAT.
Q: Find all AV sequences of type interview with Sandy Casar and concerning the Paris-Nice cycling race
– noisy answer: there are other sports news items in the sequence
– incomplete answer: the interview was broadcast in two parts and began in a previous sequence
Q (generalized): Find all AV sequences of type dialog sequence with a rider and concerning any cycling race with several stages
– the query cannot be extended!
Problems
• Weak use of the logical structures
• Descriptions are not made for reasoning
→ make the AV descriptions accessible to automated processes
• Requirements:
– express models that constrain the logical structure
• identify an interview inside a report of a sports magazine
– represent the meaning contained in this structure
• a cartoon is a fiction with no real characters
– describe semantically the content of each sequence
• the Prologue is always an individual time trial numbered stage 0
→ Which languages are the most suitable to perform all these tasks?
"Pure" documentary approaches
• General bibliographic description languages (DC, VRA)
• MPEG-7: the new multimedia description language?
– three components: Descriptors (D), Description Schemes (DS) and the Description Definition Language (DDL)
– structure: segment = abstract unit defined by temporal localization or masks
– semantics: entity–attribute–relation model + thesaurus for structuring the knowledge (Classification Schemes)
– tools: Videto (ZGDV), Vizard (EU-IST Project), MovieTool (© Ricoh)
• Extensions
– XML Schema: adds structure without semantics
• TV Anytime, Mdéfi [Tran Thuong, 2003]
– Classification Schemes: very poor expressivity
• COALA [Fatemi, 2003]
KR approaches: OWL+RDF
• Definition of concepts and relations
StudioProgram ≡ (and HomogeneousProgram (all hasPart StudioSequence))
• Definition of axioms
HomogeneousProgram ∩ HeterogeneousProgram = ∅
→ Problem: the structure of the document (i.e. the context) is lost!
→ let us merge the two approaches!
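The concept definition and the disjointness axiom above can be read operationally. The following minimal Python sketch is not part of the original slides: the dictionary layout and the helper names (`is_studio_program`, `violates_disjointness`) are assumptions made for illustration. The check is also closed-world, since it assumes the listed parts are all the parts of a program, which a description-logic reasoner would not assume.

```python
# Toy program descriptions: asserted classes plus the class of each part.
# This layout is hypothetical; it is not how an OWL reasoner stores data.
programs = {
    "p1": {"classes": {"HomogeneousProgram"},
           "parts": ["StudioSequence", "StudioSequence"]},
    "p2": {"classes": {"HomogeneousProgram"},
           "parts": ["StudioSequence", "Report"]},
    "p3": {"classes": {"HomogeneousProgram", "HeterogeneousProgram"},
           "parts": ["Report"]},
}


def is_studio_program(program):
    """StudioProgram: a HomogeneousProgram all of whose parts are StudioSequences."""
    return ("HomogeneousProgram" in program["classes"]
            and all(part == "StudioSequence" for part in program["parts"]))


def violates_disjointness(program):
    """Axiom: HomogeneousProgram and HeterogeneousProgram share no instance."""
    return {"HomogeneousProgram", "HeterogeneousProgram"} <= program["classes"]


studio = sorted(name for name, p in programs.items() if is_studio_program(p))
inconsistent = sorted(name for name, p in programs.items() if violates_disjointness(p))
print(studio)        # programs recognised as StudioProgram
print(inconsistent)  # programs that break the disjointness axiom
```

Nothing in these checks refers to where a sequence sits inside the document, which is the loss of context the slide points out.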
General architecture
[Figure: architecture diagram with an MPEG-7 / XML Schema side and an OWL / RDF side. Labels: AV Ontology, Transformation, Document schemes, valid Document instances, documentalists, statements base, users, query, Domain-specific Ontology.]
The Audio-visual Ontology
• Methodology of construction [Bachimont et al., EKAW’02]:
– Conceptualization: differential principles
– Formalization: formal definitions, axioms
– Operationalization: export into a KR language
• AV domain:
– Production objects (program, sequence, AV genre), Properties (theme), Persons, Technical Processes (shooting, recording, post-production), Signal descriptors (audio, video), etc.
• Tools:
– Conceptualization: DOE [Bachimont et al., EKAW’02]
– Formalization: OilEd [Bechhofer, KI’01]
– Languages: DAML+OIL … OWL
• DOE and the ontologies are available at:
http://opales.ina.fr/public/ontologies/
The Audio-visual Ontology
Generate XML Schema types
Some concepts (program, sequence) extend the MPEG-7 Segment type, hence the descriptions are MPEG-7 valid
OWL → XML Schema
• Class → Complex type
• Sub-class → Extension
• Restriction on properties → Element of the content model
• Union of classes → Choice in the content model
XSLT ?
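The correspondence above (class to complex type, sub-class to extension, property restriction to content-model element, union to choice) can be sketched in a few lines. The slide suggests XSLT for this step; the Python version below is only an illustration under stated assumptions. The input dictionaries, the `Type` suffix, the `mpeg7:SegmentType` base name and the function name `class_to_complex_type` are all hypothetical and do not reproduce the ontology or the transformation used in the talk.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)


def xs(tag):
    """Qualified name of an XML Schema element, in ElementTree notation."""
    return "{%s}%s" % (XS, tag)


# Hypothetical, simplified class descriptions (not the ontology from the talk).
# Each restriction maps a property to the classes allowed as its fillers.
classes = [
    {"name": "Sequence", "parent": "mpeg7:SegmentType", "restrictions": {}},
    {"name": "Report", "parent": "Sequence",
     "restrictions": {"hasPart": ["Interview", "FilmClip"]}},
]


def class_to_complex_type(cls):
    """Map one class description to an xs:complexType element."""
    ctype = ET.Element(xs("complexType"), {"name": cls["name"] + "Type"})
    content = ET.SubElement(ctype, xs("complexContent"))
    # Sub-class -> extension of the parent type.
    parent = cls["parent"]
    base = parent if ":" in parent else parent + "Type"
    ext = ET.SubElement(content, xs("extension"), {"base": base})
    if cls["restrictions"]:
        seq = ET.SubElement(ext, xs("sequence"))
        for fillers in cls["restrictions"].values():
            # One filler -> plain element; union of classes -> choice.
            holder = seq if len(fillers) == 1 else ET.SubElement(seq, xs("choice"))
            for filler in fillers:
                ET.SubElement(holder, xs("element"),
                              {"name": filler, "type": filler + "Type"})
    return ctype


schema = ET.Element(xs("schema"))
# Assumed declaration of the mpeg7 prefix used in the base attribute.
schema.set("xmlns:mpeg7", "urn:mpeg:mpeg7:schema:2001")
for cls in classes:
    schema.append(class_to_complex_type(cls))

print(ET.tostring(schema, encoding="unicode"))
```

Because `SequenceType` extends the MPEG-7 segment type, any instance of `ReportType` stays valid against MPEG-7, which is the point made at the top of the slide.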
Build description schemes for the documents
• Let us watch some sports magazines
– construction of a simple schema based on StudioSequence, Report and Interview
– a Report contains some FilmClips of Broadcast Live Sports
• The schema provides the description skeleton for several sports magazines:
– Téléfoot (soccer)
– VéloClub (cycling)
– 3 Partout (multisports)
SegmenTool
[French project CHAPERON]
Instantiate a document content model
<ina:Report id="aa23c647c-6517-4aee-8bce-870ae52a01af">
  ...
  <mp7:TemporalDecomposition>
    <ina:Interview id="adb23ab65-f8e7-4b2a-8c98-807197da600a">
      <mp7:Semantic>...</mp7:Semantic>
      <mp7:MediaTime>
        <mp7:MediaTimePoint>T00:24:19</mp7:MediaTimePoint>
        <mp7:MediaDuration>PT00H00M07S</mp7:MediaDuration>
      </mp7:MediaTime>
      <ina:Themes value="Cycling"/>
    </ina:Interview>
  </mp7:TemporalDecomposition>
  ...
</ina:Report>
→ KB (RDF triples):
(Interview, hasStartTime, 24m19s)
(Interview, hasDuration, 7s)
(Interview, hasThemes, Cycling)
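The step from a document instance to RDF triples can be sketched as follows. The talk uses an XSLT transformation for this; the Python version below is only an illustration. The namespace URI bound to `ina`, the shortened `id` values and the reduced sample document are assumptions, and the property names are taken from the labels on the slide.

```python
import re
import xml.etree.ElementTree as ET

# Namespace URIs are assumptions for this sketch; the slide shows only prefixes.
NS = {
    "mp7": "urn:mpeg:mpeg7:schema:2001",
    "ina": "http://example.org/ina#",  # hypothetical
}

# Reduced sample instance, with the elided parts of the slide left out.
SAMPLE = """
<ina:Report xmlns:ina="http://example.org/ina#"
            xmlns:mp7="urn:mpeg:mpeg7:schema:2001" id="report1">
  <mp7:TemporalDecomposition>
    <ina:Interview id="interview1">
      <mp7:MediaTime>
        <mp7:MediaTimePoint>T00:24:19</mp7:MediaTimePoint>
        <mp7:MediaDuration>PT00H00M07S</mp7:MediaDuration>
      </mp7:MediaTime>
      <ina:Themes value="Cycling"/>
    </ina:Interview>
  </mp7:TemporalDecomposition>
</ina:Report>
"""


def time_point_to_seconds(text):
    """Convert a 'Thh:mm:ss' time point into seconds."""
    hours, minutes, seconds = (int(x) for x in text.lstrip("T").split(":"))
    return hours * 3600 + minutes * 60 + seconds


def duration_to_seconds(text):
    """Convert a 'PTnnHnnMnnS' duration into seconds."""
    match = re.fullmatch(r"PT(\d+)H(\d+)M(\d+)S", text)
    if match is None:
        raise ValueError("unsupported duration: %r" % text)
    hours, minutes, seconds = (int(x) for x in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def extract_triples(xml_text):
    """Return (subject, predicate, object) tuples for each ina:Interview."""
    root = ET.fromstring(xml_text)
    triples = []
    for interview in root.iter("{%s}Interview" % NS["ina"]):
        subject = interview.get("id")
        triples.append((subject, "rdf:type", "Interview"))
        start = interview.findtext("mp7:MediaTime/mp7:MediaTimePoint", namespaces=NS)
        if start is not None:
            triples.append((subject, "hasStartTime", time_point_to_seconds(start)))
        duration = interview.findtext("mp7:MediaTime/mp7:MediaDuration", namespaces=NS)
        if duration is not None:
            triples.append((subject, "hasDuration", duration_to_seconds(duration)))
        themes = interview.find("ina:Themes", NS)
        if themes is not None:
            triples.append((subject, "hasThemes", themes.get("value")))
    return triples


triples = extract_triples(SAMPLE)
for triple in triples:
    print(triple)
```

Times are stored in seconds here so that they can be compared numerically; the slide shows them as 24m19s and 7s.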
The Cycling Ontology
Knowledge base population
Cycling domain: base of facts (text) + document descriptions
<rdf:Description rdf:about="http://../Stade2-17_03_2002.xml#ina:Interview[@id=interview4]">
.....
</rdf:Description>
<rdf about="{URI}/SportsMagazine/Report3/Interview4">
<!-- formal statements from a base of facts -->
</rdf>
[Figure: RDF graph. A Rider (hasName: Sandy Casar) has an overallResults node with position 2 and cyclingRace pointing to a SeveralStagesRace (hasName: Paris-Nice).]
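Once statements about the document and facts about the cycling domain sit in the same base, one conjunctive query can span both. The sketch below is a minimal in-memory illustration in Python, not the system used in the talk. The identifiers (`interview4`, `rider1`, `result1`, `race1`) and the `hasParticipant` property are hypothetical; the other property names follow the labels on the slide.

```python
# (subject, predicate, object) triples; all identifiers are illustrative.
triples = {
    # Statements derived from the document description.
    ("interview4", "rdf:type", "Interview"),
    ("interview4", "hasParticipant", "rider1"),  # hypothetical property
    # Statements taken from the cycling base of facts.
    ("rider1", "rdf:type", "Rider"),
    ("rider1", "hasName", "Sandy Casar"),
    ("rider1", "overallResults", "result1"),
    ("result1", "position", "2"),
    ("result1", "cyclingRace", "race1"),
    ("race1", "rdf:type", "SeveralStagesRace"),
    ("race1", "hasName", "Paris-Nice"),
}


def match(patterns, facts, binding=None):
    """Yield variable bindings that satisfy every pattern (terms starting with '?')."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for fact in facts:
        candidate = dict(binding)
        for term, value in zip(first, fact):
            if term.startswith("?"):
                # Bind the variable, or reject if it is bound to another value.
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield from match(rest, facts, candidate)


# Interviews with a rider who has a result in any race with several stages.
query = [
    ("?seq", "rdf:type", "Interview"),
    ("?seq", "hasParticipant", "?rider"),
    ("?rider", "rdf:type", "Rider"),
    ("?rider", "overallResults", "?result"),
    ("?result", "cyclingRace", "?race"),
    ("?race", "rdf:type", "SeveralStagesRace"),
]

answers = sorted({b["?seq"] for b in match(query, triples)})
print(answers)
```

The query never names Sandy Casar or Paris-Nice, yet it reaches the interview through the domain facts.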
Implementation of the KB
• Sesame: architecture for the storage of RDF triples [Broekstra, 2002]
– Supports different query languages: RQL, RDQL and SeRQL
– Implements the RDFS semantics (RDF-MT engine)
• BOR: reasoner for the DAML+OIL language [Simov & Jordanov, 2002]
• SeBOR: integration of the two systems, done in the On-To-Knowledge EU-IST Project
– Enhanced inference services are provided
– Close to what an OWL DL reasoner will perform
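To give an idea of what "implementing the RDFS semantics" adds to plain triple storage, the sketch below applies two RDFS entailment rules (transitivity of `rdfs:subClassOf` and propagation of `rdf:type` along it) until nothing new is derived. It is a plain Python illustration, not the Sesame or BOR API, and the class names `DialogSequence` and `Sequence` are assumptions made for the example.

```python
def rdfs_closure(triples):
    """Saturate a set of triples under rules rdfs9 and rdfs11."""
    inferred = set(triples)
    changed = True
    while changed:
        new = set()
        for s, p, o in inferred:
            if p != "rdfs:subClassOf":
                continue
            for s2, p2, o2 in inferred:
                # rdfs11: subClassOf is transitive.
                if p2 == "rdfs:subClassOf" and s2 == o:
                    new.add((s, "rdfs:subClassOf", o2))
                # rdfs9: instances of a class are instances of its super-classes.
                if p2 == "rdf:type" and o2 == s:
                    new.add((s2, "rdf:type", o))
        changed = not new <= inferred
        inferred |= new
    return inferred


facts = {
    ("Interview", "rdfs:subClassOf", "DialogSequence"),  # hypothetical hierarchy
    ("DialogSequence", "rdfs:subClassOf", "Sequence"),
    ("interview4", "rdf:type", "Interview"),
}

closure = rdfs_closure(facts)
dialogs = sorted(s for s, p, o in closure if p == "rdf:type" and o == "DialogSequence")
print(dialogs)
```

With this closure, a query for dialog sequences also returns sequences that were only described as interviews. Restrictions such as "all parts are StudioSequences" are beyond RDFS and need a description-logic reasoner such as BOR.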
Sesame+BOR interface
Demo
Conclusion
• General architecture for reasoning on descriptions of video documents:
– Modeling of 2 ontologies (methodology + DOE)
– Formalization of these ontologies (OilEd, OWL)
– Creation of document schemes (extended MPEG-7)
– Creation of instances of these schemas: the structure of the descriptions (SegmenTool + XSLT transformation for creating a base of RDF triples)
– Creation of a Knowledge Base of events related to cycling races and use of an adapted reasoner (Sesame + BOR, ©AIdministrator-NL & ©OntoText-BG)
Future work
• Development integration
– provide a simple interface for querying both the structure and the content of the video
– watch the AV sequences corresponding to the RDF triples returned by SeBOR
• Mid-term objectives
– scalability: test the system on a large base of videos annotated by real users
– use the future OWL reasoners
• Long-term objectives
– use this architecture with another domain (other than cycling)
– will it be enough to build another ontology? What else will have to be adapted?