Storing and Querying Fuzzy Knowledge in the Semantic Web


National Technical University of Athens, Greece
School of Electrical and Computer Engineering
Department of Computer Science
Image, Video and Multimedia Laboratory
N. Simou, G. Stoilos, V. Tzouvaras,
G. Stamou, S. Kollias
4th International Workshop on Uncertainty Reasoning for the Semantic Web
Sunday 26th October, 2008 Karlsruhe, Germany
Motivation
 Ontologies and the OWL language play a significant role in the
Semantic Web
 Optimized reasoners (FaCT, Pellet)
 Various tools for storing and querying OWL ontologies
 Crisp DLs lack the ability to represent uncertain information
 Fuzzy DLs
 Fuzzy Reasoners (FiRE, fuzzyDL)
 No work on persistent storage and querying for expressive
fuzzy DLs ([Straccia2007] and [Pan2007] are based on fuzzy
DL-Lite)
Contribution
 It presents a novel framework for persistent storage and
querying of expressive fuzzy knowledge bases
 It integrates Fuzzy Reasoner FiRE with the RDF Triple
Store Sesame
 It provides experimental evaluation of the proposed
architecture using a real-world industrial use case
scenario
Outline
 Queries in crisp DL
 RDF Stores
 The fuzzy DL f-SHIN
 Fuzzy Extensions to Queries
 Fuzzy OWL Syntax in Triples
 Sesame integration with FiRE
 Evaluation
Queries in DLs
 Conjunctive Queries
$q(X) \leftarrow \exists Y.\ conj(X, Y)$
 e.g. x <- Man(x) ^ hasChild(x,y) ^ Man(y)
 Query answering algorithms for crisp DLs
 Are highly complex
 A practically scalable system is not available
 Various tools support queries for crisp DLs
RDF Stores
 Data storage systems for storing and querying ontologies
 Sesame, Jena, Kowari
 Ontologies are stored in various triple formats
 RDF/XML
 N-Triples
 Turtle
 Support of Query languages
 SPARQL
 SeRQL (Sesame)
 Plugins for incomplete OWL DL querying and reasoning
(Sesame-OWLim)
Fuzzy SHIN - Syntax
 A fuzzy extension of DL SHIN
 f-SHIN concepts are formed in the same way as in SHIN
C,D ::=⊤ | ⊥ | ¬C | C ⊓ D | C ⊔ D |
∃R.C | ∀R.C | ≥nR | ≤nR
R,P::= R - | Trans(R) | P⊑R
 Assertions are extended to fit uncertainty
〈 nick : Tall ≥ 0.7 〉
〈 (nick, theo) : isFriend ≥ 0.6 〉
Fuzzy SHIN - Inference Services
 Entailment
 “Does axiom Ψ logically follow from the ontology T?”
 Satisfiability
 “Can the concept C have any instances with degree of
participation ⋈ n in models of ontology T?”
 Subsumption
 “Is the concept D more general than the concept C in models of
the ontology T?”
 Greatest Lower Bound (GLB)
 “What is the greatest degree n to which our ontology entails that
an individual a participates in a concept C?”
Fuzzy Extensions to Queries
 Conjunctive threshold Queries (CTQs) [Pan2007]
$q(X) \leftarrow \exists Y.\ \bigwedge_{i=1}^{n} (atom_i(X, Y) \geq k_i)$
 E.g. x <- Tall(x)>0.6 ^ hasFriend(x,y)>0.7
^ Short(y) > 0.8
 General Fuzzy Conjunctive Queries (GFCQs) [Pan2007]
$q(X) \leftarrow \exists Y.\ \bigwedge_{i=1}^{n} (atom_i(X, Y) : k_i)$
 E.g. x <- Tall(x):0.6 ^ hasFriend(x,y):0.7
^ Short(y):0.8
 Supported only in Fuzzy DL-Lite
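A threshold query of the kind shown above can be read as a filter over stored membership degrees. The following is a minimal sketch of that reading, not the algorithm of [Pan2007]: the in-memory dictionaries, the individuals and the degrees are all hypothetical, and a missing assertion is assumed to have degree 0.

```python
from itertools import product

# Hypothetical fuzzy ABox: each entry is a stored lower bound (degree >= n).
concept_degrees = {
    ("Tall", "nick"): 0.7,
    ("Short", "theo"): 0.9,
    ("Short", "nick"): 0.1,
}
role_degrees = {
    ("hasFriend", "nick", "theo"): 0.8,
}
individuals = ["nick", "theo"]


def answer_ctq():
    """Evaluate x <- Tall(x) > 0.6 ^ hasFriend(x,y) > 0.7 ^ Short(y) > 0.8."""
    answers = set()
    for x, y in product(individuals, repeat=2):
        tall = concept_degrees.get(("Tall", x), 0.0)
        friend = role_degrees.get(("hasFriend", x, y), 0.0)
        short = concept_degrees.get(("Short", y), 0.0)
        # y is existentially quantified, so a single witness suffices for x.
        if tall > 0.6 and friend > 0.7 and short > 0.8:
            answers.add(x)
    return answers


print(answer_ctq())
```

Only nick qualifies here, because theo has no stored Tall degree and no outgoing hasFriend assertion.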
Fuzzy OWL Syntax in Triples
 Reification
 Weak and ill-defined model
 Limited support by RDF tools
 Datatypes
 Concrete features such as datatypes are not appropriate for
representing abstract information such as fuzzy
assertions
 The proposed syntax
 Is simple and clear
 Is based on the use of blank nodes and properties
Fuzzy OWL Syntax in Triples
 Fuzzy concept assertion
paul frdf:membership _:paulmembTall .
_:paulmembTall rdf:type Tall .
_:paulmembTall frdf:degree "n"^^xsd:float .
_:paulmembTall frdf:ineqType ">=" .
 Fuzzy role assertion
paul frdf:paulFriendOffrank frank .
frdf:paulFriendOffrank rdf:type FriendOf .
frdf:paulFriendOffrank frdf:degree "n"^^xsd:float .
frdf:paulFriendOffrank frdf:ineqType ">=" .
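The concept-assertion pattern above could be generated mechanically from an individual, a concept and a degree. The sketch below is illustrative only: the function name is hypothetical, names are emitted without namespaces exactly as on the slide, and this is not FiRE's actual export code.

```python
def concept_assertion_triples(individual, concept, degree, ineq=">="):
    """Return the four triples for one fuzzy concept assertion.

    The blank node label follows the slide's naming scheme
    (individual + "memb" + concept); everything else is an assumption.
    """
    node = f"_:{individual}memb{concept}"
    return [
        f"{individual} frdf:membership {node} .",
        f"{node} rdf:type {concept} .",
        f'{node} frdf:degree "{degree}"^^xsd:float .',
        f'{node} frdf:ineqType "{ineq}" .',
    ]


triples = concept_assertion_triples("paul", "Tall", 0.7)
print("\n".join(triples))
```

Keeping the degree and the inequality type on the blank node means one individual can carry several fuzzy memberships without reification.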
Fuzzy Reasoning Engine FiRE
 It is a Java-based implementation available at
www.image.ece.ntua.gr/~nsimou/FiRE/
 Can be used through a user-friendly interface or as an API
 Currently supports f-SHIN
 The reasoning algorithm uses the fuzzy tableau [Stoilos2007]
 Its syntax is based on the Knowledge Representation System
Specification (KRSS), appropriately extended to fit uncertainty
 E.g.
(instance eve Model >= 0.7)
(related peter eve has-friend >= 0.8)
Sesame integration with FiRE
 RDF-Store Sesame is used as a back end for storing and
querying.
 FiRE is used as a front end permitting a user to
 Write or edit a fuzzy knowledge base (Fuzzy KRSS Format)
 Compute the GLB of every individual of the KB in every concept
(primitive and defined) of the KB
 Export the explicit and implicit knowledge to a Sesame
repository using the proposed Fuzzy OWL syntax
 Import a fuzzy knowledge base from a Sesame repository
 Perform CTQs and GFCQs
Querying-I
 Conjunctive threshold Queries (CTQs)
 FiRE Syntax
x,y <- Man(x) ^ Tall(x)> 0.6 ^ hasfriend(x,y)
> 0.7 ^ Woman(y) ^ GoodLooking(y) >= 0.8
 The query is converted to a SPARQL query based on the Fuzzy
OWL syntax in triples
 The query is evaluated by Sesame
 The results are visualized by FiRE
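The conversion step might look roughly like the sketch below, which handles concept atoms only. The function names, the `:Tall` style concept names and the use of the `frdf:` prefix are assumptions for illustration; the comparison against the stored degree is a simplification that ignores the stored inequality type, and FiRE's real translation is not shown in the source.

```python
def concept_atom_pattern(index, variable, concept, threshold):
    """Build the graph pattern for one concept atom with a threshold."""
    node = f"?Node{index}"
    degree = f"?Degree{index}"
    return "\n".join([
        f"  ?{variable} frdf:membership {node} .",
        f"  {node} rdf:type {concept} .",
        f"  {node} frdf:degree {degree} .",
        f'  FILTER ({degree} >= "{threshold}"^^xsd:float)',
    ])


def ctq_to_sparql(answer_vars, concept_atoms):
    """concept_atoms: list of (concept, variable, threshold) tuples."""
    head = " ".join(f"?{v}" for v in answer_vars)
    body = "\n".join(
        concept_atom_pattern(i, var, concept, k)
        for i, (concept, var, k) in enumerate(concept_atoms, start=1)
    )
    return f"SELECT {head} WHERE {{\n{body}\n}}"


query = ctq_to_sparql(["x"], [(":Tall", "x", 0.6), (":GoodLooking", "x", 0.8)])
print(query)
```

Each atom gets its own numbered node and degree variable, so atoms over the same individual do not interfere with one another.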
Querying-II
 General Fuzzy Conjunctive Queries (GFCQs)
 FiRE Syntax
x,y <- Man(x)^ Tall(x) : 0.6 ^ has-friend(x,y) : 0.7
^ Woman(y) ^ GoodLooking(y) : 0.8
 A SPARQL query is constructed in a way that
 The membership degrees of every role or concept used in the query atoms
are retrieved for the individuals that satisfy all the atoms
 FiRE then processes the results according to the query weights,
which permits
 Fuzzy threshold queries using fuzzy implication
 Fuzzy aggregation queries using fuzzy aggregation functions
 Fuzzy weighted queries using weighted t-norms
 The results are visualized by FiRE
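One plausible reading of a fuzzy threshold query is sketched below: each retrieved degree d_i is compared with its weight k_i through the Kleene-Dienes implication max(1 - k_i, d_i), and the atoms are combined with min. The choice of min as the conjunction, the function names and the sample numbers are assumptions; the source does not spell out the exact combination FiRE uses.

```python
def kleene_dienes(a, b):
    """Kleene-Dienes fuzzy implication."""
    return max(1.0 - a, b)


def fuzzy_threshold_degree(weighted_atoms):
    """weighted_atoms: list of (weight k_i, retrieved degree d_i) pairs."""
    return min(kleene_dienes(k, d) for k, d in weighted_atoms)


# Hypothetical weights and retrieved degrees for a three-atom query.
atoms = [(0.6, 0.9), (0.7, 0.5), (0.8, 0.8)]
result = fuzzy_threshold_degree(atoms)
print(result)
```

Unlike a crisp threshold, an atom whose degree falls below its weight lowers the score of the answer instead of discarding it.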
Fuzzy Query Examples
(instance peter Man)
(instance peter Thin >= 0.6 )
(instance peter Clever >= 0.8 )
(instance peter Tall >= 0.7 )
(instance eve Model >= 0.7)
(related peter eve has-friend >= 0.7)

x,y <- Tall(x) > 0.2 ^ Clever(x) > 0.3 ^ has-friend(x,y) > 0.4
^ Model(y) > 0.6
 x: peter, y: eve
x,y <- Tall(x) : 0.2 ^ Clever(x) : 0.3 ^ has-friend(x,y) : 0.4
^ Model(y) : 0.6
 Using fuzzy aggregation queries, i.e.
$\frac{k_1 d_1 + k_2 d_2 + \dots + k_4 d_4}{k_1 + k_2 + \dots + k_4}$
 <x,y> = <peter, eve> : 0.72
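The 0.72 above can be checked by hand as a weighted mean of the asserted degrees (Tall 0.7, Clever 0.8, has-friend 0.7, Model 0.7) with the query weights 0.2, 0.3, 0.4 and 0.6. The short sketch below redoes that arithmetic; the function name is hypothetical and the asserted lower bounds are treated as the retrieved degrees.

```python
def weighted_aggregation(weighted_atoms):
    """Weighted mean of degrees; weighted_atoms is a list of (k_i, d_i)."""
    total_weight = sum(k for k, _ in weighted_atoms)
    return sum(k * d for k, d in weighted_atoms) / total_weight


# (weight, degree) for Tall(peter), Clever(peter),
# has-friend(peter, eve) and Model(eve).
atoms = [(0.2, 0.7), (0.3, 0.8), (0.4, 0.7), (0.6, 0.7)]
score = weighted_aggregation(atoms)
print(round(score, 2))
```

The numerator is 0.14 + 0.24 + 0.28 + 0.42 = 1.08 and the weights sum to 1.5, which gives 0.72.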
Use case
 A production company had a database of 2140 models used
for casting purposes
 Rich information was stored for each model...
 i.e. age, height, body type, fitness type, tooth condition…
 Inaccessible to the producers because
 The information was fuzzy
 The information was not semantically organized
 Retrieval of models based on threshold criteria was inaccurate
 The combination of information about models that would form
profession-like characteristics (like Teacher, Mafia, Scientist )
was extremely difficult
The Fuzzy Knowledge base
 The set of Concepts consisted of the features described in the
database
 Age was fuzzified giving concepts Baby, Kid, Teen, 20s, 30s, 40s, 50s, 60s and Old
 Height was fuzzified depending on the model’s gender giving concepts
Very_Short, Short, Normal_Height, Tall, Very_Tall
 The set of Roles consisted of some special characteristics
 i.e. has-hairLength, has-hairColor…
 The set of individuals consisted of the models
 An expressive terminology was defined with 33 concepts that
referred to the professions of interest
 e.g. Scientist≡Male⊓Serious ⊓ (40s ⊔ 50s) ⊓ ∃has-eyeCondition.Glasses
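To see how such a definition yields a degree, the sketch below evaluates the Scientist example for one model using min for ⊓, max for ⊔, and the maximum over fillers of min(role degree, concept degree) for the existential restriction. The sample degrees are hypothetical, and treating stored values as exact degrees is a simplification of what the reasoner does with bounds.

```python
def scientist_degree(concepts, eye_conditions):
    """Degree of Male AND Serious AND (40s OR 50s) AND EXISTS has-eyeCondition.Glasses.

    concepts: degrees of atomic concepts for one model.
    eye_conditions: list of (role degree, Glasses degree of the filler).
    """
    exists_glasses = max(
        (min(r, g) for r, g in eye_conditions), default=0.0
    )
    age = max(concepts.get("40s", 0.0), concepts.get("50s", 0.0))
    return min(
        concepts.get("Male", 0.0),
        concepts.get("Serious", 0.0),
        age,
        exists_glasses,
    )


# Hypothetical model: clearly male, fairly serious, mostly in his 50s.
model = {"Male": 1.0, "Serious": 0.8, "40s": 0.3, "50s": 0.7}
degree = scientist_degree(model, [(1.0, 0.9)])
print(degree)
```

The weakest conjunct decides the result, so here the age component (0.7) bounds the Scientist degree.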
Results
 Explicit knowledge
 2140 individuals
 82 concepts, 20 roles
 29469 assertions (Fuzzy KRSS)
 Using GLB for all individuals in all the concepts of the KB
 2430 implicit assertions were extracted (Fuzzy KRSS)
 Average time was 1112 milliseconds per individual
 Upload time to the Sesame repository varied
 from 200 milliseconds in an empty repository (0-10.000 triples)
 to 700 milliseconds in a repository of over 500.000 triples
 Total 529.926 triples (Fuzzy OWL Triples)
Results
Column headers give the repository type (Native, Memory) and its size in triples.
Query | Native 100.000 | Native 250.000 | Native 500.000 | Memory 100.000 | Memory 250.000 | Memory 500.000
x <- Scientist(x) >= 0.6 | 1042 | 2461 | 3335 | 894 | 2368 | 3332
x <- Father(x) ^ Teacher(x) >= 0.8 ^ NormalHeight(x) >= 0.5 | 1068 | 2694 | 3932 | 994 | 2524 | 3732
x <- Legs(x) ^ Eyes(x) >= 0.8 ^ 20s(x) >= 0.5 ^ hashairLength(x,y) ^ Long(y) >= 0.7 | 1352 | 2876 | 4021 | 1002 | 2650 | 3809
x <- Scientist(x):0.6 | 2562 | 4173 | 3935 | 3042 | 4543 | 6027
x <- Father(x) ^ Teacher(x):0.8 ^ NormalHeight(x):0.5 | 4318 | 6694 | 8935 | 4341 | 7896 | 9306
x <- Legs(x) ^ Eyes(x):0.8 ^ 20s(x):0.5 ^ hashairLength(x,y) ^ Long(y):0.7 | 5423 | 6998 | 9230 | 5420 | 6879 | 9974
Conclusions
 Limitations
 Not a complete query answering system
 Queries are issued against assertions stored in an RDF repository
 Queries on Sesame Repositories are not scalable
 Dependence on size of the repository
 Dependence on the number of query atoms
 …However
 Incompleteness is minimized by GLB
 Query answering for crisp DLs is still an open problem
 Query algorithms are highly complex
 No practically scalable system is known
References
 [Pan2007] J.Z. Pan, G. Stamou, G. Stoilos, and E. Thomas.
Expressive querying over fuzzy DL-Lite ontologies. In
Proceedings of the International Workshop on Description Logics (DL
2007), 2007.
 [Stoilos2007] G. Stoilos, G. Stamou, V. Tzouvaras, J.Z. Pan,
and I. Horrocks. Reasoning with very expressive fuzzy
description logics. Journal of Artificial Intelligence Research,
30(5):273-320, 2007.
 [Straccia2007] U. Straccia and G. Visco. DLMedia: an
ontology mediated multimedia information retrieval system.
In Proceedings of the International Workshop on Description Logics
(DL 2007), 2007.
Questions - Acknowledgements
Thank you!
This work is supported by the FP6 Network of Excellence EU project XMedia (FP6-026978) and K-space (IST-2005-027026).
Fuzzy SHIN - Knowledge base
 A fuzzy knowledge base is a triple
Σ= (T ,R, A)
where:
 T is a finite set of fuzzy inclusion axioms: A ⊑ C or fuzzy
equivalence axioms : A ≡ C, called a fuzzy TBox
 R is a finite set of fuzzy transitive role axioms: Trans(R) or
fuzzy role inclusion axioms P ⊑ R, called a fuzzy RBox
 A is a finite set of fuzzy assertions: 〈 a : C ⋈ n 〉 or 〈 (a, b) : R
⋈ n 〉, where ⋈ ∈ {≥,>,<, ≤}, called a fuzzy ABox.
Fuzzy SHIN - Semantics
 A fuzzy interpretation is a pair I = (Δ^I, ·^I), where Δ^I is the domain
of interpretation and ·^I is the interpretation function, which maps
 An individual name α ∈ I to an element α^I ∈ Δ^I
 A concept name A to a membership function A^I : Δ^I → [0,1]
 A role name R to a membership function R^I : Δ^I × Δ^I → [0,1]
 Fuzzy set theoretic operations are used to give semantics to
complex concepts
 Łukasiewicz fuzzy negation: c(a) = 1 - a
 Gödel fuzzy intersection (t-norm): t(a,b) = min(a,b)
 Gödel fuzzy union (t-conorm): u(a,b) = max(a,b)
 Kleene-Dienes fuzzy implication: ℑ(a,b) = max(1-a, b)
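The four operations listed above translate directly into one-line functions. The sketch below uses hypothetical function names and adds a small check that negation turns an intersection into a union of negations, which holds for this combination of operators.

```python
def negation(a):
    """c(a) = 1 - a"""
    return 1.0 - a


def intersection(a, b):
    """t(a, b) = min(a, b)"""
    return min(a, b)


def union(a, b):
    """u(a, b) = max(a, b)"""
    return max(a, b)


def implication(a, b):
    """J(a, b) = max(1 - a, b)"""
    return max(1.0 - a, b)


a, b = 0.25, 0.5
# De Morgan style check: not (a and b) equals (not a) or (not b).
print(negation(intersection(a, b)) == union(negation(a), negation(b)))
```

All four functions map [0,1] inputs back into [0,1], so they can be nested freely when evaluating complex concepts.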
A SPARQL Query
# Prefix declarations (ns5, rdf, xsd) are omitted.
SELECT ?x WHERE {
  # Concept atom: membership of ?x in Tall
  ?x ns5:membership ?Node1 .
  ?Node1 rdf:type ?Concept1 .
  ?Node1 ns5:ineqType ?IneqType1 .
  ?Node1 ns5:degree ?Degree1 .
  FILTER regex(str(?Concept1), "CONCEPTS#Tall")
  FILTER regex(str(?IneqType1), ">")
  FILTER (?Degree1 >= "0.8"^^xsd:float)
  # Role atom: ?x related to ?y through has-friend
  ?BlankRole2 ns5:ineqType ?IneqType2 .
  ?BlankRole2 ns5:degree ?Degree2 .
  ?BlankRole2 rdf:type ?Role2 .
  ?x ?BlankRole2 ?y .
  FILTER regex(str(?Role2), "ROLES#has-friend")
  FILTER regex(str(?IneqType2), ">")
  FILTER (?Degree2 >= "1.0"^^xsd:float)
  ...
}