Storing and Querying Fuzzy Knowledge in the Semantic Web
National Technical University of Athens, Greece
School of Electrical and Computer Engineering
Department of Computer Science
Image, Video and Multimedia Laboratory
Storing and Querying Fuzzy
Knowledge in the Semantic Web
N. Simou, G. Stoilos, V. Tzouvaras,
G. Stamou, S. Kollias
4th International Workshop on Uncertainty Reasoning for the Semantic Web
Sunday 26th October, 2008 Karlsruhe, Germany
Motivation
Ontologies and the OWL language play a significant role in the
Semantic Web
Optimized reasoners (FaCT, Pellet)
Various tools for storing and querying OWL ontologies
Crisp DLs lack the ability to represent uncertain information
Fuzzy DLs
Fuzzy Reasoners (FiRE, fuzzyDL)
No work on persistent storage and querying for expressive
fuzzy DLs ([Straccia2007] and [Pan2007] are based on fuzzy
DL-Lite)
Contribution
It presents a novel framework for persistent storage and
querying of expressive fuzzy knowledge bases
It integrates Fuzzy Reasoner FiRE with the RDF Triple
Store Sesame
It provides experimental evaluation of the proposed
architecture using a real-world industrial use case
scenario
Outline
Queries in crisp DL
RDF Stores
The fuzzy DL f-SHIN
Fuzzy Extensions to Queries
Fuzzy OWL Syntax in Triples
Sesame integration with FiRE
Evaluation
Queries in DLs
Conjunctive Queries
q(X) ← ∃Y . conj(X, Y)
e.g. x <- Man(x) ^ hasChild(x,y) ^ Man(y)
Query answering algorithms for crisp DLs
Are highly complex
A practically scalable system is not available
Various tools support queries for crisp DLs
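A minimal sketch of how the example query above could be answered over a handful of stored crisp assertions, by naive enumeration of variable bindings. The individuals, the sample assertions and the helper name `answer_query` are illustrative assumptions. Complete DL query answering must also take implicit knowledge into account, which this sketch does not attempt.

```python
from itertools import product

# Hypothetical crisp assertions (not taken from the slides).
concepts = {"Man": {"john", "nick"}, "Woman": {"mary"}}
roles = {"hasChild": {("john", "nick"), ("john", "mary")}}
individuals = {"john", "nick", "mary"}

def answer_query():
    """Answer x <- Man(x) ^ hasChild(x,y) ^ Man(y); y is existentially quantified."""
    answers = set()
    for x, y in product(individuals, repeat=2):
        if (x in concepts["Man"]
                and (x, y) in roles["hasChild"]
                and y in concepts["Man"]):
            answers.add(x)  # only the distinguished variable x is returned
    return answers

print(answer_query())  # {'john'}
```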
RDF Stores
Data storage systems for storing and querying ontologies
Sesame, Jena, Kowari
Ontologies are stored in various triple formats
RDF/XML
N-Triples
Turtle
Support of Query languages
SPARQL
SeRQL (Sesame)
Plugins for incomplete OWL DL querying and reasoning
(Sesame-OWLim)
Fuzzy SHIN - Syntax
A fuzzy extension of DL SHIN
f-SHIN concepts are formed in the same way as in SHIN
C,D ::=⊤ | ⊥ | ¬C | C ⊓ D | C ⊔ D |
∃R.C | ∀R.C | ≥nR | ≤nR
R, P ::= R⁻ | Trans(R) | P ⊑ R
Assertions are extended to fit uncertainty
〈 nick : Tall ≥ 0.7 〉
〈 (nick, theo) : isFriend ≥ 0.6 〉
Fuzzy SHIN - Inference Services
Entailment
“Does axiom Ψ logically follow from the ontology T?”
Satisfiability
“Can the concept C have any instances with degree of
participation ⋈ n in models of ontology T?”
Subsumption
“Is the concept D more general than the concept C in models of
the ontology T?”
Greatest Lower Bound (GLB)
“What is the greatest degree n that our ontology entails an
individual a to participate in a concept C?”
Fuzzy Extensions to Queries
Conjunctive threshold Queries (CTQs) [Pan2007]
q(X) ← ∃Y . ⋀_{i=1..n} (atom_i(X, Y) ≥ k_i)
E.g. x <- Tall(x)>0.6 ^ hasFriend(x,y)>0.7
^ Short(y) > 0.8
General Fuzzy Conjunctive Queries (GFCQs) [Pan2007]
q(X) ← ∃Y . ⋀_{i=1..n} (atom_i(X, Y) : k_i)
E.g. x <- Tall(x):0.6 ^ hasFriend(x,y):0.7
^ Short(y):0.8
Supported only in Fuzzy DL-Lite
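As a rough illustration of the CTQ example above, the sketch below checks each atom's stored degree against its threshold. The assertions and the name `ctq_answers` are hypothetical, and treating a stored lower bound as the exact membership degree is a simplifying assumption.

```python
from itertools import product

# Hypothetical fuzzy assertions: stored lower bounds on membership degrees.
concept_degrees = {
    ("Tall", "nick"): 0.7,
    ("Tall", "theo"): 0.3,
    ("Short", "theo"): 0.9,
}
role_degrees = {("hasFriend", "nick", "theo"): 0.8}
individuals = ["nick", "theo"]

def ctq_answers():
    """x <- Tall(x) > 0.6 ^ hasFriend(x,y) > 0.7 ^ Short(y) > 0.8"""
    result = set()
    for x, y in product(individuals, repeat=2):
        # A missing assertion is treated as degree 0.
        if (concept_degrees.get(("Tall", x), 0.0) > 0.6
                and role_degrees.get(("hasFriend", x, y), 0.0) > 0.7
                and concept_degrees.get(("Short", y), 0.0) > 0.8):
            result.add(x)
    return result

print(ctq_answers())  # {'nick'}
```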
Fuzzy OWL Syntax in Triples
Reification
Weak and ill-defined model
Limited support by RDF tools
Datatypes
Concrete features such as datatypes are not appropriate for
the representation of abstract information such as fuzzy
assertions
The proposed syntax
Is simple and clear
Is based on the use of blank nodes and properties
Fuzzy OWL Syntax in Triples
Fuzzy concept assertion
paul frdf:membership _:paulmembTall .
_:paulmembTall rdf:type Tall .
_:paulmembTall frdf:degree "n"^^xsd:float .
_:paulmembTall frdf:ineqType ">=" .
Fuzzy role assertion
paul frdf:paulFriendOffrank frank .
frdf:paulFriendOffrank rdf:type FriendOf .
frdf:paulFriendOffrank frdf:degree "n"^^xsd:float .
frdf:paulFriendOffrank frdf:ineqType ">=" .
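The concept assertion pattern above can be generated mechanically. The following is a minimal sketch, not FiRE's actual exporter: the function name `concept_assertion_triples` is hypothetical, the blank node label follows the `_:paulmembTall` naming shown on the slide, and the degree is written as a standard typed literal, which is an assumption about the intended serialization.

```python
def concept_assertion_triples(individual, concept, degree, ineq=">="):
    """Return the four triples encoding <individual : concept ineq degree>."""
    node = f"_:{individual}memb{concept}"  # blank node naming as on the slide
    return [
        f"{individual} frdf:membership {node} .",
        f"{node} rdf:type {concept} .",
        f'{node} frdf:degree "{degree}"^^xsd:float .',
        f'{node} frdf:ineqType "{ineq}" .',
    ]

for triple in concept_assertion_triples("paul", "Tall", 0.7):
    print(triple)
```

Each fuzzy concept assertion therefore costs four triples. This gives a feel for why the number of stored triples is much larger than the number of assertions.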
Fuzzy Reasoning Engine FiRE
It is a Java-based implementation available at
www.image.ece.ntua.gr/~nsimou/FiRE/
Can be used through a user-friendly interface or as an API
Currently supports f-SHIN
The reasoning algorithm uses the fuzzy tableau [Stoilos2007]
Its syntax is based on the Knowledge Representation System
Specification (KRSS), appropriately extended to fit uncertainty
E.g.
(instance eve Model >= 0.7)
(related peter eve has-friend >= 0.8)
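A small sketch of how assertions in this fuzzy KRSS style could be read into tuples. It is not FiRE's parser: the function name `parse_assertion` is hypothetical, and reading an assertion without a degree, such as `(instance peter Man)`, as `>= 1.0` is an assumption.

```python
def parse_assertion(line):
    """Parse '(instance a C >= n)' or '(related a b R >= n)' into a tuple."""
    tokens = line.strip().strip("()").split()
    kind = tokens[0]
    # instance: individual + concept; related: two individuals + role
    n_args = {"instance": 2, "related": 3}[kind]
    args = tokens[1:1 + n_args]
    rest = tokens[1 + n_args:]
    if rest:
        ineq, degree = rest[0], float(rest[1])
    else:
        ineq, degree = ">=", 1.0  # assumption: crisp assertions hold to degree 1
    return (kind, *args, ineq, degree)

print(parse_assertion("(instance eve Model >= 0.7)"))
print(parse_assertion("(related peter eve has-friend >= 0.8)"))
```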
Sesame integration with FiRE
RDF-Store Sesame is used as a back end for storing and
querying.
FiRE is used as a front end permitting a user to
Write or edit a fuzzy knowledge base (Fuzzy KRSS Format)
Ask the GLB of all the individuals of the KB in all the concepts
(primitive and defined) of the KB
Export the explicit and implicit knowledge to a Sesame
repository using the proposed Fuzzy OWL syntax
Import a fuzzy knowledge base from a Sesame repository
Perform CTQs and GFCQs
Querying-I
Conjunctive threshold Queries (CTQs)
FiRE Syntax
x,y <- Man(x) ^ Tall(x) > 0.6 ^ has-friend(x,y) > 0.7
^ Woman(y) ^ GoodLooking(y) >= 0.8
The query is converted to a SPARQL query based on the Fuzzy
OWL syntax in triples
The query is evaluated by Sesame
The results are visualized by FiRE
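The conversion step can be pictured as emitting one graph pattern per query atom. The sketch below handles concept atoms only and is modelled on the SPARQL query shown at the end of these slides. The function names are hypothetical, the `ns5` prefix and the `CONCEPTS#` fragment are copied from that query, prefix declarations are omitted, and the inequality-type filter is left out for brevity.

```python
def concept_atom_pattern(var, concept, threshold, index):
    """Graph pattern lines for one concept atom, e.g. Tall(x) >= 0.6."""
    node = f"?Node{index}"
    return [
        f"?{var} ns5:membership {node} .",
        f"{node} rdf:type ?Concept{index} .",
        f"{node} ns5:degree ?Degree{index} .",
        f'FILTER regex(str(?Concept{index}), "CONCEPTS#{concept}")',
        f'FILTER (?Degree{index} >= "{threshold}"^^xsd:float)',
    ]

def ctq_to_sparql(answer_var, atoms):
    """Build a SELECT query from (variable, concept, threshold) atoms."""
    lines = []
    for index, (var, concept, threshold) in enumerate(atoms, start=1):
        lines.extend(concept_atom_pattern(var, concept, threshold, index))
    body = "\n".join("  " + line for line in lines)
    return f"SELECT ?{answer_var} WHERE {{\n{body}\n}}"

query = ctq_to_sparql("x", [("x", "Tall", 0.6), ("x", "Man", 1.0)])
print(query)
```

Each additional atom adds a fixed number of triple patterns and filters, so the generated query grows linearly with the number of atoms.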
Querying-II
General Fuzzy Conjunctive Queries (GFCQs)
FiRE Syntax
x,y <- Man(x)^ Tall(x) : 0.6 ^ has-friend(x,y) : 0.7
^ Woman(y) ^ GoodLooking(y) : 0.8
A SPARQL query is constructed so that
The membership degrees of every role or concept used in the query atoms
are retrieved for the individuals that satisfy all the atoms
The results are processed by FiRE according to the query weights,
permitting
Fuzzy threshold queries using fuzzy implication
Fuzzy aggregation queries using fuzzy aggregation functions
Fuzzy weighted queries using weighted t-norms
The results are visualized by FiRE
Fuzzy Query Examples
(instance peter Man)
(instance peter Thin >= 0.6 )
(instance peter Clever >= 0.8 )
(instance peter Tall >= 0.7 )
(instance eve Model >= 0.7)
(related peter eve has-friend >= 0.7)
x,y <- Tall(x) > 0.2 ^ Clever(x) > 0.3 ^ has-friend(x,y) > 0.4
^ Model(y) > 0.6
x: peter
y: eve
x,y <- Tall(x) : 0.2 ^ Clever(x) : 0.3 ^ has-friend(x,y) : 0.4
^ Model(y) : 0.6
Using fuzzy aggregation queries, i.e.
((k1 · d1) + (k2 · d2) + ... + (k4 · d4)) / (k1 + k2 + ... + k4)
<x,y> = <peter, eve> : 0.72
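The aggregation result can be checked by hand. The sketch below computes the weighted average of the membership degrees, using the query weights 0.2, 0.3, 0.4, 0.6 and the degrees asserted for peter and eve above (Tall 0.7, Clever 0.8, has-friend 0.7, Model 0.7). The function name `aggregate` is illustrative, and the asserted lower bounds are taken as the degrees.

```python
def aggregate(weights, degrees):
    """Weighted average: sum(k_i * d_i) / sum(k_i)."""
    return sum(k * d for k, d in zip(weights, degrees)) / sum(weights)

weights = [0.2, 0.3, 0.4, 0.6]   # query weights k1..k4
degrees = [0.7, 0.8, 0.7, 0.7]   # Tall(peter), Clever(peter), has-friend(peter,eve), Model(eve)

# (0.14 + 0.24 + 0.28 + 0.42) / 1.5 = 1.08 / 1.5
print(round(aggregate(weights, degrees), 2))  # 0.72
```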
Use case
A production company had a database of 2140 models used
for casting purposes
Rich information was stored for each model...
i.e. age, height, body type, fitness type, tooth condition…
Inaccessible to the producers because
The information was fuzzy
The information was not semantically organized
Retrieval of models based on threshold criteria was inaccurate
The combination of information about models that would form
profession-like characteristics (like Teacher, Mafia, Scientist )
was extremely difficult
The Fuzzy Knowledge base
The set of Concepts consisted of the features described in the
database
Age was fuzzified giving concepts Baby, Kid, Teen, 20s, 30s, 40s, 60s and Old
Height was fuzzified depending on the model’s gender giving concepts
Very_Short, Short, Normal_Height, Tall, Very_Tall
The set of Roles consisted of some special characteristics
i.e. has-hairLength, has-hairColor…
The set of individuals consisted of the models
An expressive terminology was defined with 33 concepts that
referred to the professions of interest
e.g. Scientist≡Male⊓Serious ⊓ (40s ⊔ 50s) ⊓ ∃has-eyeCondition.Glasses
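To see how such a definition combines degrees, the sketch below evaluates the Scientist definition for one individual, with min for ⊓, max for ⊔, and the existential restriction as the maximum over role fillers of min(role degree, filler degree). All degrees and the name `scientist_degree` are hypothetical. FiRE derives these values through tableau reasoning (the GLB service), not by direct evaluation as done here.

```python
def scientist_degree(concepts, eye_conditions):
    """Degree of Male ⊓ Serious ⊓ (40s ⊔ 50s) ⊓ ∃has-eyeCondition.Glasses.

    concepts: concept name -> degree for the individual
    eye_conditions: (role degree, Glasses degree of the filler) pairs
    """
    exists_glasses = max((min(r, g) for r, g in eye_conditions), default=0.0)
    return min(
        concepts.get("Male", 0.0),
        concepts.get("Serious", 0.0),
        max(concepts.get("40s", 0.0), concepts.get("50s", 0.0)),
        exists_glasses,
    )

# Hypothetical model: clearly male, fairly serious, more 50s than 40s, wears glasses.
model = {"Male": 1.0, "Serious": 0.7, "40s": 0.3, "50s": 0.8}
print(scientist_degree(model, [(1.0, 0.9)]))  # 0.7
```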
Results
Explicit knowledge
2140 individuals
82 Concepts 20 Roles
29469 assertions (Fuzzy KRSS)
Using GLB for all individuals in all the concepts of the KB
2430 implicit assertions were extracted (Fuzzy KRSS)
Average time was 1112 milliseconds per individual
Upload time to Sesame repository varied
from 200 millisecs in an empty repository (0-10.000 triples)
to 700 millisecs in repository (over 500.000 triples)
Total 529.926 triples (Fuzzy OWL Triples)
Results
Queries
Q1: x <- Scientist(x) >= 0.6
Q2: x <- Father(x) ^ Teacher(x) >= 0.8 ^ NormalHeight(x) >= 0.5
Q3: x <- Legs(x) ^ Eyes(x) >= 0.8 ^ 20s(x) >= 0.5 ^ hashairLength(x,y) ^ Long(y) >= 0.7
Q4: x <- Scientist(x) : 0.6
Q5: x <- Father(x) ^ Teacher(x) : 0.8 ^ NormalHeight(x) : 0.5
Q6: x <- Legs(x) ^ Eyes(x) : 0.8 ^ 20s(x) : 0.5 ^ hashairLength(x,y) ^ Long(y) : 0.7
Query   Native                       Memory
        100.000  250.000  500.000    100.000  250.000  500.000
Q1      1042     2461     3335       894      2368     3332
Q2      1068     2694     3932       994      2524     3732
Q3      1352     2876     4021       1002     2650     3809
Q4      2562     4173     3935       3042     4543     6027
Q5      4318     6694     8935       4341     7896     9306
Q6      5423     6998     9230       5420     6879     9974
Conclusions
Limitations
Not a complete query answering system
Queries are issued against assertions stored in an RDF repository
Queries on Sesame Repositories are not scalable
Dependence on size of the repository
Dependence on the number of query atoms
…However
Incompleteness is minimized by GLB
Query answering for crisp DLs is still an open problem
Query algorithms are highly complex
No practically scalable system is known
References
[Pan2007] J.Z. Pan, G. Stamou, G. Stoilos, and E. Thomas.
Expressive querying over fuzzy DL-Lite ontologies. In
Proceedings of the International Workshop on Description Logics (DL
2007), 2007.
[Stoilos2007] G. Stoilos, G. Stamou, V. Tzouvaras, J.Z. Pan,
and I. Horrocks. Reasoning with very expressive fuzzy
description logics. Journal of Artificial Intelligence Research,
30(5):273-320, 2007.
[Straccia2007] U. Straccia and G. Visco. DLMedia: an
ontology mediated multimedia information retrieval system.
In Proceedings of the International Workshop on Description Logics
(DL 2007), 2007.
Questions - Acknowledgements
Thank you!
This work is supported by the FP6 Network of Excellence EU project XMedia (FP6-026978) and K-space (IST-2005-027026).
Fuzzy SHIN - Knowledge base
A fuzzy knowledge base is a triple
Σ= (T ,R, A)
where:
T is a finite set of fuzzy inclusion axioms: A ⊑ C or fuzzy
equivalence axioms : A ≡ C, called a fuzzy TBox
R is a finite set of fuzzy transitive role axioms: Trans(R) or
fuzzy role inclusion axioms P ⊑ R, called a fuzzy RBox
A is a finite set of fuzzy assertions: 〈 a : C ⋈ n 〉 or 〈 (a, b) : R
⋈ n 〉, where ⋈ ∈ {≥,>,<, ≤}, called a fuzzy ABox.
Fuzzy SHIN - Semantics
A fuzzy interpretation is a pair I = (Δ^I, ·^I), where Δ^I is the domain
of interpretation and ·^I is the interpretation function, which maps
An individual name α ∈ I to an element α^I ∈ Δ^I
A concept name A to a membership function A^I : Δ^I → [0,1]
A role name R to a membership function R^I : Δ^I × Δ^I → [0,1]
Fuzzy set theoretic operations are used to give semantics to
complex concepts
Lukasiewicz fuzzy negation: c(a) = 1 - a
Gödel fuzzy intersection (t-norm): t(a,b) = min(a,b)
Gödel fuzzy union (t-conorm): u(a,b) = max(a,b)
Kleene-Dienes fuzzy implication: ℑ(a,b) = max(1 - a, b)
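The four operations translate directly into code. The sketch below uses illustrative function names and degrees, and shows how a complex concept such as Tall ⊓ ¬Thin would be evaluated for a single individual.

```python
def negation(a):
    """c(a) = 1 - a"""
    return 1 - a

def intersection(a, b):
    """t(a,b) = min(a,b)"""
    return min(a, b)

def union(a, b):
    """u(a,b) = max(a,b)"""
    return max(a, b)

def implication(a, b):
    """ℑ(a,b) = max(1 - a, b)"""
    return max(1 - a, b)

# Hypothetical degrees for one individual.
tall, thin = 0.5, 0.25
print(intersection(tall, negation(thin)))  # Tall ⊓ ¬Thin -> 0.5
print(union(tall, thin))                   # Tall ⊔ Thin  -> 0.5
print(implication(tall, thin))             # max(0.5, 0.25) -> 0.5
```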
A SPARQL Query
SELECT ?x WHERE {
?x ns5:membership ?Node1 .
?Node1 rdf:type ?Concept1 .
?Node1 ns5:ineqType ?IneqType1 .
?Node1 ns5:degree ?Degree1 .
FILTER regex (str(?Concept1), "CONCEPTS#Tall")
FILTER regex (?IneqType1, ">")
FILTER (?Degree1 >= "0.8"^^xsd:float)
?BlankRole2 ns5:ineqType ?IneqType2 .
?BlankRole2 ns5:degree ?Degree2 .
?BlankRole2 rdf:type ?Role2 .
?x ?BlankRole2 ?y .
FILTER regex (str(?Role2), "ROLES#has-friend")
FILTER regex (?IneqType2, ">")
FILTER (?Degree2 >= "1.0"^^xsd:float)
...
}