#### Transcript Storing and Querying Fuzzy Knowledge in the Semantic Web

Storing and Querying Fuzzy Knowledge in the Semantic Web

N. Simou, G. Stoilos, V. Tzouvaras, G. Stamou, S. Kollias

National Technical University of Athens, Greece, School of Electrical and Computer Engineering, Department of Computer Science, Image, Video and Multimedia Laboratory

4th International Workshop on Uncertainty Reasoning for the Semantic Web, Sunday 26th October 2008, Karlsruhe, Germany

##### Motivation

- Ontologies and the OWL language play a significant role in the Semantic Web
- Optimized reasoners (FaCT, Pellet)
- Various tools for storing and querying OWL ontologies
- Crisp DLs lack the ability to represent uncertain information
- Fuzzy DLs
- Fuzzy reasoners (FiRE, fuzzyDL)
- No work on persistent storage and querying for expressive fuzzy DLs ([Straccia2007] and [Pan2007] are based on fuzzy DL-Lite)

##### Contribution

- Presents a novel framework for persistent storage and querying of expressive fuzzy knowledge bases
- Integrates the fuzzy reasoner FiRE with the RDF triple store Sesame
- Provides an experimental evaluation of the proposed architecture using a real-world industrial use case scenario

##### Outline

- Queries in crisp DLs
- RDF stores
- The fuzzy DL f-SHIN
- Fuzzy extensions to queries
- Fuzzy OWL syntax in triples
- Sesame integration with FiRE
- Evaluation

##### Queries in DLs

- Conjunctive queries: $q(X) \leftarrow \exists Y.\ \mathrm{conj}(X, Y)$, e.g.
  x <- Man(x) ^ hasChild(x,y) ^ Man(y)
- Query answering algorithms for crisp DLs
  - Are highly complex
  - A practically scalable system is not available
- Various tools support queries for crisp DLs

##### RDF Stores

- Data storage systems for storing and querying ontologies: Sesame, Jena, Kowari
- Ontologies are stored in various triple formats: RDF/XML, N-Triples, Turtle
- Support for query languages: SPARQL, SeRQL (Sesame)
- Plugins for incomplete OWL DL querying and reasoning (Sesame-OWLim)

##### Fuzzy SHIN - Syntax

- A fuzzy extension of the DL SHIN
- f-SHIN concepts are formed in the same way as in SHIN
  - C, D ::= ⊤ | ⊥ | ¬C | C ⊓ D | C ⊔ D | ∃R.C | ∀R.C | ≥nR | ≤nR
  - R, P ::= R⁻ | Trans(R) | P ⊑ R
- Assertions are extended to fit uncertainty
  - 〈nick : Tall ≥ 0.7〉
  - 〈(nick, theo) : isFriend ≥ 0.6〉

##### Fuzzy SHIN - Inference Services

- Entailment: "Does axiom Ψ logically follow from the ontology T?"
- Satisfiability: "Can the concept C have any instances with degree of participation ⋈ n in models of ontology T?"
- Subsumption: "Is the concept D more general than the concept C in models of the ontology T?"
- Greatest Lower Bound (GLB): "What is the greatest degree n to which our ontology entails that an individual a participates in a concept C?"

##### Fuzzy Extensions to Queries

- Conjunctive threshold queries (CTQs) [Pan2007]: $q(X) \leftarrow \exists Y.\ \bigwedge_{i=1}^{n} (\mathrm{atom}_i(X, Y) \geq k_i)$
  - E.g. x <- Tall(x) > 0.6 ^ hasFriend(x,y) > 0.7 ^ Short(y) > 0.8
- General fuzzy conjunctive queries (GFCQs) [Pan2007]: $q(X) \leftarrow \exists Y.\ \bigwedge_{i=1}^{n} (\mathrm{atom}_i(X, Y) : k_i)$
  - E.g. x <- Tall(x):0.6 ^ hasFriend(x,y):0.7 ^ Short(y):0.8
- Supported only in fuzzy DL-Lite

##### Fuzzy OWL Syntax in Triples

- Reification
  - Weak and ill-defined model
  - Limited support by RDF tools
- Datatypes
  - Concrete features like datatypes are not appropriate for representing abstract information like fuzzy assertions
- The proposed syntax
  - Is simple and clear
  - Is based on the use of blank nodes and properties

##### Fuzzy OWL Syntax in Triples

Fuzzy concept assertion:

    paul frdf:membership _:paulmembTall .
    _:paulmembTall rdf:type Tall .
    _:paulmembTall frdf:degree "n"^^xsd:float .
    _:paulmembTall frdf:ineqType ">=" .

Fuzzy role assertion:

    paul frdf:paulFriendOffrank frank .
    frdf:paulFriendOffrank rdf:type FriendOf .
    frdf:paulFriendOffrank frdf:degree "n"^^xsd:float .
    frdf:paulFriendOffrank frdf:ineqType ">=" .

##### Fuzzy Reasoning Engine FiRE

- A Java-based implementation, available at www.image.ece.ntua.gr/~nsimou/FiRE/
- Can be used through a user-friendly interface or as an API
- Currently supports f-SHIN
- The reasoning algorithm uses the fuzzy tableau [Stoilos2007]
- Its syntax is based on the Knowledge Representation System Specification (KRSS), extended to fit uncertainty
  - E.g. (instance eve Model >= 0.7) and (related peter eve has-friend >= 0.8)

##### Sesame integration with FiRE

- The RDF store Sesame is used as a back end for storing and querying.
- FiRE is used as a front end that lets a user
  - Write or edit a fuzzy knowledge base (Fuzzy KRSS format)
  - Ask for the GLB of all the individuals of the KB in all the concepts (primitive and defined) of the KB
  - Export the explicit and implicit knowledge to a Sesame repository using the proposed Fuzzy OWL syntax
  - Import a fuzzy knowledge base from a Sesame repository
  - Perform CTQs and GFCQs

##### Querying-I

- Conjunctive threshold queries (CTQs), FiRE syntax:
  - x,y <- Man(x) ^ Tall(x) > 0.6 ^ hasfriend(x,y) > 0.7 ^ Woman(y) ^ GoodLooking(y) >= 0.8
- The query is converted to a SPARQL query based on the Fuzzy OWL syntax in triples
- The query is evaluated by Sesame
- The results are visualized by FiRE

##### Querying-II

- General fuzzy conjunctive queries (GFCQs), FiRE syntax:
  - x,y <- Man(x) ^ Tall(x) : 0.6 ^ has-friend(x,y) : 0.7 ^ Woman(y) ^ GoodLooking(y) : 0.8
- A SPARQL query is constructed so that the membership degrees of every role or concept used in the query atoms are retrieved for the individuals that satisfy all the atoms
- The results are processed by FiRE according to the query weights, permitting
  - Fuzzy threshold queries using fuzzy implication
  - Fuzzy aggregation queries using fuzzy aggregation functions
  - Fuzzy weighted queries using weighted t-norms
- The results are visualized by FiRE

##### Fuzzy Query Examples

Knowledge base assertions:

    (instance peter Man)
    (instance peter Thin >= 0.6)
    (instance peter Clever >= 0.8)
    (instance peter Tall >= 0.7)
    (instance eve Model >= 0.7)
    (related peter eve has-friend >= 0.7)

- CTQ: x,y <- Tall(x) > 0.2 ^ Clever(x) > 0.3 ^ has-friend(x,y) > 0.4 ^ Model(y) > 0.6
  - Answer: x: peter, y: eve
- GFCQ: x,y <- Tall(x) : 0.2 ^ Clever(x) : 0.3 ^ has-friend(x,y) : 0.4 ^ Model(y) : 0.6
  - Using a fuzzy aggregation query, i.e. $\dfrac{k_1 d_1 + k_2 d_2 + \dots + k_4 d_4}{k_1 + k_2 + \dots + k_4}$
  - Answer: <x,y> = <peter, eve> : 0.72

##### Use case

- A production company had a database of 2140 models used for casting purposes
- Rich information was stored for each model, i.e. age, height, body type, fitness type, tooth condition…
- The information was inaccessible to the producers because
  - The information was fuzzy
  - The information was not semantically organized
- Retrieval of models based on threshold criteria was inaccurate
- Combining information about models to form profession-like characteristics (like Teacher, Mafia, Scientist) was extremely difficult

##### The Fuzzy Knowledge base

- The set of concepts consisted of the features described in the database
  - Age was fuzzified, giving the concepts Baby, Kid, Teen, 20s, 30s, 40s, 60s and Old
  - Height was fuzzified depending on the model's gender, giving the concepts Very_Short, Short, Normal_Height, Tall, Very_Tall
- The set of roles consisted of some special characteristics, i.e. has-hairLength, has-hairColor…
- The set of individuals consisted of the models
- An expressive terminology was defined with 33 concepts that referred to the professions of interest, e.g.
  Scientist ≡ Male ⊓ Serious ⊓ (40s ⊔ 50s) ⊓ ∃has-eyeCondition.Glasses

##### Results

- Explicit knowledge
  - 2140 individuals
  - 82 concepts
  - 20 roles
  - 29469 assertions (Fuzzy KRSS)
- Using GLB for all individuals in all the concepts of the KB
  - 2430 implicit assertions were extracted (Fuzzy KRSS)
  - Average time was 1112 milliseconds per individual
- Upload time to the Sesame repository varied from 200 milliseconds for an empty repository (0 to 10,000 triples) to 700 milliseconds for a repository of over 500,000 triples
- Total: 529,926 triples (Fuzzy OWL triples)

##### Results

| Query | Native 100,000 | Native 250,000 | Native 500,000 | Memory 100,000 | Memory 250,000 | Memory 500,000 |
|---|---|---|---|---|---|---|
| x <- Scientist(x) >= 0.6 | 1042 | 2461 | 3335 | 894 | 2368 | 3332 |
| x <- Father(x) ^ Teacher(x) >= 0.8 ^ NormalHeight(x) >= 0.5 | 1068 | 2694 | 3932 | 994 | 2524 | 3732 |
| x <- Legs(x) ^ Eyes(x) >= 0.8 ^ 20s(x) >= 0.5 ^ hashairLength(x,y) ^ Long(y) >= 0.7 | 1352 | 2876 | 4021 | 1002 | 2650 | 3809 |
| x <- Scientist(x):0.6 | 2562 | 4173 | 3935 | 3042 | 4543 | 6027 |
| x <- Father(x) ^ Teacher(x):0.8 ^ NormalHeight(x):0.5 | 4318 | 6694 | 8935 | 4341 | 7896 | 9306 |
| x <- Legs(x) ^ Eyes(x):0.8 ^ 20s(x):0.5 ^ hashairLength(x,y) ^ Long(y):0.7 | 5423 | 6998 | 9230 | 5420 | 6879 | 9974 |

##### Conclusions

- Limitations
  - Not a complete query answering system: queries are issued against the assertions stored in an RDF repository
  - Queries on Sesame repositories are not scalable: they depend on the size of the repository and on the number of query atoms
- …However
  - Incompleteness is minimized by GLB
  - Query answering for crisp DLs is still an open problem: query algorithms are highly complex and no practically scalable system is known

##### References

- [Pan2007] J.Z. Pan, G. Stamou, G. Stoilos, and E. Thomas. Expressive querying over fuzzy DL-Lite ontologies. In Proceedings of the International Workshop on Description Logics (DL 2007), 2007.
- [Stoilos2007] G. Stoilos, G. Stamou, V. Tzouvaras, J.Z. Pan, and I. Horrocks. Reasoning with very expressive fuzzy description logics. Journal of Artificial Intelligence Research, 30(5):273-320, 2007.
- [Straccia2007] U. Straccia and G. Visco.
  DLMedia: an ontology mediated multimedia information retrieval system. In Proceedings of the International Workshop on Description Logics (DL 2007), 2007.

##### Questions - Acknowledgements

Thank you! This work is supported by the FP6 Network of Excellence EU project XMedia (FP6-026978) and K-space (IST-2005-027026).

##### Fuzzy SHIN - Knowledge base

A fuzzy knowledge base is a triple Σ = (T, R, A) where:

- T is a finite set of fuzzy inclusion axioms A ⊑ C or fuzzy equivalence axioms A ≡ C, called a fuzzy TBox
- R is a finite set of fuzzy transitive role axioms Trans(R) or fuzzy role inclusion axioms P ⊑ R, called a fuzzy RBox
- A is a finite set of fuzzy assertions 〈a : C ⋈ n〉 or 〈(a, b) : R ⋈ n〉, where ⋈ ∈ {≥, >, <, ≤}, called a fuzzy ABox

##### Fuzzy SHIN - Semantics

A fuzzy interpretation is a pair $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$, where $\Delta^{\mathcal{I}}$ is the domain of interpretation and $\cdot^{\mathcal{I}}$ is the interpretation function, which maps

- An individual name α ∈ I to an element $\alpha^{\mathcal{I}} \in \Delta^{\mathcal{I}}$
- A concept name A to a membership function $A^{\mathcal{I}} : \Delta^{\mathcal{I}} \to [0,1]$
- A role name R to a membership function $R^{\mathcal{I}} : \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}} \to [0,1]$

Fuzzy set theoretic operations are used to give semantics to complex concepts:

- Łukasiewicz fuzzy negation: c(a) = 1 - a
- Gödel (minimum) fuzzy intersection: t(a,b) = min(a,b)
- Gödel (maximum) fuzzy union: u(a,b) = max(a,b)
- Kleene-Dienes fuzzy implication: ℑ(a,b) = max(1 - a, b)

##### A SPARQL Query

    SELECT ?x WHERE {
      ?x ns5:membership ?Node1 .
      ?Node1 rdf:type ?Concept1 .
      ?Node1 ns5:ineqType ?IneqType1 .
      ?Node1 ns5:degree ?Degree1 .
      FILTER regex(str(?Concept1), "CONCEPTS#Tall")
      FILTER regex(?IneqType1, ">")
      FILTER (?Degree1 >= "0.8"^^xsd:float)
      ?BlankRole2 ns5:ineqType ?IneqType2 .
      ?BlankRole2 ns5:degree ?Degree2 .
      ?BlankRole2 rdf:type ?Role2 .
      ?x ?BlankRole2 ?y .
      FILTER regex(str(?Role2), "ROLES#has-friend")
      FILTER regex(?IneqType2, ">")
      FILTER (?Degree2 >= "1.0"^^xsd:float)
      ...
    }
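
The conversion of a conjunctive threshold query into SPARQL, described on the Querying-I slide, can be sketched roughly as follows. This is a minimal illustration that assumes the triple layout and the `ns5`, `CONCEPTS#` and `ROLES#` names seen in the SPARQL slide; the `Atom` class and the functions `atom_patterns` and `build_ctq_sparql` are hypothetical names introduced here and are not part of FiRE's API. The slides do not give the namespace IRIs, so the `ns5`, `rdf` and `xsd` prefixes would still have to be declared before the generated query could be run.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Atom:
    """One query atom, e.g. Tall(x) >= 0.8 or has-friend(x, y) >= 1.0."""
    name: str                  # concept or role name
    subject: str               # variable name without '?', e.g. "x"
    threshold: float           # threshold k_i of the atom
    obj: Optional[str] = None  # second variable, only for role atoms


def atom_patterns(atom: Atom, i: int) -> List[str]:
    """Return the graph patterns and filters for the i-th atom."""
    if atom.obj is None:
        # Concept atom: the individual points to a membership node.
        node, kind = f"?Node{i}", f"?Concept{i}"
        lines = [f"?{atom.subject} ns5:membership {node} ."]
        marker = f"CONCEPTS#{atom.name}"
    else:
        # Role atom: the property linking the individuals carries the degree.
        node, kind = f"?BlankRole{i}", f"?Role{i}"
        lines = [f"?{atom.subject} {node} ?{atom.obj} ."]
        marker = f"ROLES#{atom.name}"
    lines += [
        f"{node} rdf:type {kind} .",
        f"{node} ns5:ineqType ?IneqType{i} .",
        f"{node} ns5:degree ?Degree{i} .",
        f'FILTER regex(str({kind}), "{marker}")',
        f'FILTER regex(?IneqType{i}, ">")',
        f'FILTER (?Degree{i} >= "{atom.threshold}"^^xsd:float)',
    ]
    return lines


def build_ctq_sparql(answer_vars: List[str], atoms: List[Atom]) -> str:
    """Assemble a SELECT query from the patterns of all atoms."""
    select = " ".join(f"?{v}" for v in answer_vars)
    body: List[str] = []
    for i, atom in enumerate(atoms, start=1):
        body.extend(atom_patterns(atom, i))
    inner = "\n".join("  " + line for line in body)
    return f"SELECT {select} WHERE {{\n{inner}\n}}"


if __name__ == "__main__":
    query = build_ctq_sparql(
        ["x"],
        [Atom("Tall", "x", 0.8), Atom("has-friend", "x", 1.0, obj="y")],
    )
    print(query)
```

Matching concept and role names with `regex` follows the slide, but a regular expression such as `CONCEPTS#Tall` would also match any longer name that contains it, so an exact IRI comparison may be preferable when the full namespace is known.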
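
The four fuzzy set operations from the semantics slide and the weighted aggregation from the Fuzzy Query Examples slide are simple enough to check by hand. The sketch below restates them in Python with hypothetical function names; it only reproduces the formulas and the peter/eve example from the slides and makes no claim about how FiRE implements them internally.

```python
from typing import Sequence


def negation(a: float) -> float:
    """Fuzzy negation from the slide: c(a) = 1 - a."""
    return 1.0 - a


def intersection(a: float, b: float) -> float:
    """Fuzzy intersection from the slide: t(a, b) = min(a, b)."""
    return min(a, b)


def union(a: float, b: float) -> float:
    """Fuzzy union from the slide: u(a, b) = max(a, b)."""
    return max(a, b)


def implication(a: float, b: float) -> float:
    """Fuzzy implication from the slide: max(1 - a, b)."""
    return max(1.0 - a, b)


def weighted_average(weights: Sequence[float], degrees: Sequence[float]) -> float:
    """Aggregation (k1*d1 + ... + kn*dn) / (k1 + ... + kn)."""
    if len(weights) != len(degrees):
        raise ValueError("weights and degrees must have the same length")
    return sum(k * d for k, d in zip(weights, degrees)) / sum(weights)


if __name__ == "__main__":
    # GFCQ from the example slide:
    # Tall(x):0.2 ^ Clever(x):0.3 ^ has-friend(x,y):0.4 ^ Model(y):0.6
    weights = [0.2, 0.3, 0.4, 0.6]
    # Stored degrees for peter and eve:
    # Tall 0.7, Clever 0.8, has-friend 0.7, Model 0.7
    degrees = [0.7, 0.8, 0.7, 0.7]
    print(round(weighted_average(weights, degrees), 2))  # 0.72
```

The weighted sum is 0.14 + 0.24 + 0.28 + 0.42 = 1.08 and the weights add up to 1.5, which gives the 0.72 reported for <peter, eve>.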