Leveraging Web Services Discovery
with Customizable Hybrid Matching
Natallia Kokash,
Willem-Jan van den Heuvel
(University of Tilburg, the Netherlands),
Vincenzo D'Andrea
(ICSOC 2007)
Introduction
• Web Service Discovery
– Web Service Interface Description
– Web Service Matching
• Hybrid Matching
– Classification
– Hybrid Algorithms
– Experiments
– Involved Data
• Related Work
Web Service Discovery and Selection
[Figure: discovery and selection diagram. Labels: Service Provider, Service Description, Service Registry (UDDI, ebXML), Find, Service Requestor, Service-oriented application, Service Interface Descriptions, Service Behavior Descriptions, Service Quality, Preferences, User experience]
Web Services Discovery
• Matching – comparing the functionality required
by a user against the specifications of existing services
– Generic (heuristics, domain-independent ontologies)
– Community (domain-specific ontologies)
– Personal (preferences, specific functions and
patterns for comparing requests and existing
services)
• Selection – choosing a service with the best
quality among those able to satisfy a user goal
Service Interface Description
• (Syntactic)
– Identity – unique identity of the interface
– Input/Output – the meaning of input and output parameters
– Faults – specify the abstract message format for any error messages
that may be output as the result of the operation
– Types – declare data types used in the interface (XML Schema)
– Documentation – natural language service description and usage
guide
• (Semantic)
– Preconditions – a set of semantic statements that are required to be
true before an operation can be successfully invoked
– Effects – a set of semantic statements that must be true after an
operation completes execution
– Restrictions – a set of assumptions about the environment that must
be true
– Quality of Service – non-functional parameters such as response
time, execution cost, capacity, etc.
Web Service Description Language (WSDL)
GoogleSearch.wsdl
<message name="doGoogleSearchResponse">
  <part name="return" type="GoogleSearchResult"/>
</message>
...
<complexType name="GoogleSearchResult">
  <all>
    <element name="searchComments" type="string"/>
    <element name="estimatedTotalResultsCount" type="int"/>
    <element name="resultElements" type="ResultElementArray"/>
    ...
  </all>
</complexType>
WolframSearch.wsdl
<message name="WolframSearchResponse">
  <part element="WolframSearchReturn"/>
</message>
...
<element name="WolframSearchReturn">
  <complexType>
    <sequence>
      <element name="Result" type="WolframSearchResult"/>
    </sequence>
  </complexType>
</element>
...
<complexType name="WolframSearchResult">
  <sequence>
    <element name="TotalMatches" type="int"/>
    <element name="Comment" type="string"/>
    <element name="Matches" type="WolframSearchMatchArray"/>
    ...
  </sequence>
</complexType>
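The two excerpts describe similar results under different names (estimatedTotalResultsCount vs. TotalMatches, searchComments vs. Comment), which is what a matcher has to cope with. As a minimal sketch of the first step, the snippet below pulls (name, type) pairs out of a complexType using Python's standard xml.etree.ElementTree. The fragment is a namespace-free copy of the GoogleSearch.wsdl excerpt with the elided parts left out, and the function name element_signatures is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace-free copy of the GoogleSearchResult type shown in the excerpt;
# the parts elided with "..." on the slide are simply omitted here.
WSDL_FRAGMENT = """
<complexType name="GoogleSearchResult">
  <all>
    <element name="searchComments" type="string"/>
    <element name="estimatedTotalResultsCount" type="int"/>
    <element name="resultElements" type="ResultElementArray"/>
  </all>
</complexType>
"""


def element_signatures(fragment):
    """Return the (name, type) pair of every <element> in the fragment."""
    root = ET.fromstring(fragment.strip())
    return [(el.get("name"), el.get("type")) for el in root.iter("element")]


signatures = element_signatures(WSDL_FRAGMENT)
for name, type_name in signatures:
    print(name, type_name)
```

A real WSDL file uses XML namespaces, so the tag lookup would need the qualified element name; that detail is left out of this sketch.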
Semantic Web Service Description
• Managing End-To-End OpeRations (METEOR-S)
• Semantic Web Services Framework (SWSF)
• Web Service Modelling Ontology (WSMO)
• Web Ontology Language for Services (OWL-S)
• Semantic Web Rule Language (SWRL)
• Web Service Description Language – Semantics (WSDL-S)
– The semantics of operations are directly added to WSDL files
– Easy to deploy and use
– Does not support full features of OWL-S process ontology
• Universal Service-Semantics Description Language (USDL)
– Based on Web Ontology Language (OWL)
– Employs WordNet as a common basis for understanding the
meaning of services
– Side effects: create, update, delete, find, affects
• …
WSDL-S (Semantics)
StockQuotes.wsdls
…
xmlns:Ontology4="http://lsdis.cs.uga.edu/projects/meteor-s/wsdls/ontologies/LSDIS_Finance.owl"
xmlns:wssem="http://www.ibm.com/xmlns/WebServices/WSSemantics">
…
<message name="GetQuotesSoapIn">
<part name="QuoteTicker" type="string"
wssem:modelReference="Ontology4#Stock.stockSymbol"/>
</message>
…
<operation name="GetStockQuotes"
wssem:modelReference="Ontology4#DelayedStockQuote">
<input name="GetQuotes" message="GetQuotesSoapIn" />
…
</operation>
…
Web Ontology Language (OWL)
LSDIS_Finance.owl
…
<owl:ObjectProperty rdf:ID="stockSymbol">
<rdfs:range rdf:resource="#SymbolicString"/>
<rdfs:domain rdf:resource="#Stock"/>
</owl:ObjectProperty>
…
<owl:ObjectProperty rdf:ID="stock">
<rdfs:range>
<owl:Class>
<owl:unionOf rdf:parseType="Collection">
<owl:Class rdf:about="#Stock"/>
<owl:Class rdf:about="#Security"/>
</owl:unionOf>
</owl:Class>
</rdfs:range>
<rdfs:domain rdf:resource="#SecuritiesTransaction"/>
</owl:ObjectProperty>
Semantic Web Services…
• Vast semantic descriptions are required
• Fits only easily formalized domains
• High complexity of matching
• Do not describe what services do
• Do not provide context identification
• Do not describe objects used by the service but not provided by the client
• Logic-based approaches are difficult to use
• Do not reflect real world scenarios sufficiently
• Users must guess which ontology is used to write requests
• …
Open Directory Project (human-edited directory of the Web, constructed
and maintained by a vast community of volunteer editors) vs.
(Google) search engine that tries to find a document that can satisfy the
information need (regardless of its format).
WS Matching Algorithms
[Figure: matching algorithms diagram. Labels: Matching, Registry, Parsing, Tagging, Indexing, Evaluation, VSM, Hybrid, Semantic, Other, Ontology, Metadata, Query]
Kokash, N.: "A Comparison of Web Service Interface Similarity
Measures", Proceedings of STAIRS, 2006, pp. 220--231.
http://dit.unitn.it/~kokash/sources
Structural Matching
[Figure: a requested service is compared with a provided service (Similarity?). Each web service is described by Name, Description and Operations (A, B on one side; X, Y, Z on the other). A requested operation is compared with a provided operation (Similarity?). Each operation is described by Name, Description, Input and Output; operations a, b, c and x, y are connected by edges with weights w_ij.]
Maximum weight bipartite matching: Kuhn’s Hungarian method
Define overall similarity score: Matching average, Dice, Simpson coefficients…
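A minimal sketch of the two steps above, under stated assumptions. The slide names Kuhn's Hungarian method; for brevity this example swaps in an exhaustive search over assignments, which returns the same maximum-weight matching but is only practical for a handful of operations. The weights w_ij are made-up values, and the two normalizations follow the usual Dice and Simpson (overlap) coefficients with the total matching weight standing in for the intersection size; the exact definitions used in the talk are not given on the slide.

```python
from itertools import permutations


def best_matching(weights):
    """Return (total_weight, pairs) of a maximum-weight bipartite matching.

    weights[i][j] is the similarity w_ij between requested item i and
    provided item j. Exhaustive search: fine for tiny, non-empty matrices,
    and only a stand-in for the Hungarian method named on the slide.
    """
    rows, cols = len(weights), len(weights[0])
    if rows > cols:
        # Work on the transposed matrix so that rows <= cols, then flip
        # each pair back to (requested, provided) order.
        transposed = [list(column) for column in zip(*weights)]
        total, pairs = best_matching(transposed)
        return total, [(i, j) for j, i in pairs]
    best_total, best_pairs = -1.0, []
    for assignment in permutations(range(cols), rows):
        total = sum(weights[i][j] for i, j in enumerate(assignment))
        if total > best_total:
            best_total, best_pairs = total, list(enumerate(assignment))
    return best_total, best_pairs


def dice_score(total, n_requested, n_provided):
    """Dice-style normalization: 2 * match / (|A| + |B|)."""
    return 2 * total / (n_requested + n_provided)


def simpson_score(total, n_requested, n_provided):
    """Simpson-style normalization: match / min(|A|, |B|)."""
    return total / min(n_requested, n_provided)


# Hypothetical weights: requested operations a, b, c vs. provided x, y.
w = [[0.9, 0.1],
     [0.4, 0.8],
     [0.3, 0.2]]
total, pairs = best_matching(w)
print(sorted(pairs), dice_score(total, 3, 2), simpson_score(total, 3, 2))
```

The Simpson variant does not penalize the unmatched operation c, while the Dice variant does, which is why the choice of overall score matters when interfaces differ in size.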
Lexical Matching
Service: Name, Operations, Description
Operation: Name, Input message, Output message, Description
Message: Name, Parts, Description
Part: Name, Type, Description
Type: Name, Elements, Description
Element: Name, Description
Tokenization
– Example: “tns:GetDNSInfoByWebAddressResponse” → {tns,
get, dns, info, by, web, address, response}.
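The tokenization example above can be reproduced with a short regular expression. This is a sketch of one possible splitting rule (namespace prefix, camel case, acronyms, digits), not necessarily the rule used by the authors; the function name tokenize is an assumption.

```python
import re

# An acronym followed by a capitalized word ("DNS" in "DNSInfo"),
# a capitalized or lower-case word, a trailing acronym, or a digit run.
_TOKEN = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+")


def tokenize(identifier):
    """Split a WSDL identifier into lower-case word tokens."""
    return [token.lower() for token in _TOKEN.findall(identifier)]


print(tokenize("tns:GetDNSInfoByWebAddressResponse"))
```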
1. Vector-Space Model (VSM) (tf-idf)
2. VSM + synonyms from WordNet
3. Semantic
– Seco, N., Veale, T., Hayes, J.: “An intrinsic information content metric
for semantic similarity in WordNet”, ECAI, 2004, pp. 1089-1090
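For the first option in the list above, a minimal tf-idf and cosine-similarity sketch using only the standard library. The weighting variant (relative term frequency times log(N/df)), the choice to include the query in the corpus when computing idf, and the three token lists are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """documents: list of token lists. Returns one {term: weight} dict each."""
    n = len(documents)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in documents for term in set(doc))
    vectors = []
    for doc in documents:
        tf = Counter(doc)
        vectors.append({term: (count / len(doc)) * math.log(n / df[term])
                        for term, count in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0


# Hypothetical tokenized descriptions; the first one plays the query.
docs = [["get", "exchange", "rate"],
        ["get", "currency", "exchange"],
        ["get", "weather", "forecast"]]
vectors = tfidf_vectors(docs)
print(cosine(vectors[0], vectors[1]), cosine(vectors[0], vectors[2]))
```

Because "get" occurs in every description its idf is zero, so the weather service scores 0 against the query even though it shares that token.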
Experimental Results
• Collection 1
– Criterion: one semantically equivalent operation
– 40 web services, 5 groups
• Collection 2
– Criterion: related domains
– 371 web services, 68 groups
• VSM is better than Semantic (WordNet-based)
• WordNet is too general
– Lexical difference between queries and existing services
• The Longman defining vocabulary
– makes it easier to create logically precise definitions
– 2 200 words (~4 000 senses)
– Corresponds to 10 000 WordNet synsets
• WordNet provides a limited set of relations (hyponyms,
synonyms…)
• Matching confidence is often very low
Matching results: service rating list
Web service (.wsdl file)        Similarity score   Relevance
HuZip.wsdl                      1.0                +
CurrencyExchangeService.wsdl    0.121              -
Services.wsdl                   0.114              +
GetCurrencyExchange.wsdl        0.083
FastWeather.wsdl                0.080              -
…
ZipDistance.wsdl                0.036              +
[Figure labels: Query, Matching confidence]
Hybrid Algorithms
[Figure: classification of hybridization. Branches: Algorithms, Data. Labels: Mixed, Switching, Cascade, Augmentation, Combination]
Notation
• q – query
• x – web service (operation, message, part, data
type, element)
• γ – similarity threshold
• sim_A(q; x) – similarity between a query q and a
web service x, computed by the algorithm A
• X_A(q; γ) = {x | sim_A(q; x) > γ} denotes the set of
services (operations, …) found by the algorithm A
• sim_{Sy}(q; x) – syntactic similarity
• sim_{Se}(q; x) – semantic similarity
Hybrid Algorithms: Examples
• Mixed
X_{H1}(q; γ) = {x | sim_{H1}(q; x) > γ}, where
sim_{H1}(q; x) = F(sim_{Sy}(q; x), sim_{Se}(q; x)),
F(a, b) ∈ {min(a, b), max(a, b),
w1·a + w2·b | w1 + w2 = 1; 0 ≤ w1, w2 ≤ 1}
• Cascade
X_{H2}(q; γ) = {x | sim_{H2}(q; x) > γ}, where
sim_{H2}(q; x) = sim_{Sy}(q; x), x ∈ X_{Se}(q; γ)
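The two definitions above translate almost directly into code. The sketch below assumes the syntactic and semantic scores for a query have already been computed and are given as dictionaries keyed by service; the service names, scores and threshold are made-up values, and the function names are hypothetical.

```python
def mixed(sim_sy, sim_se, combine, gamma):
    """Mixed hybrid: keep services whose combined score F(a, b) exceeds gamma."""
    scores = {x: combine(sim_sy[x], sim_se[x]) for x in sim_sy}
    return {x: s for x, s in scores.items() if s > gamma}


def weighted(w1):
    """Return F(a, b) = w1*a + w2*b with w2 = 1 - w1 and 0 <= w1 <= 1."""
    w2 = 1.0 - w1
    return lambda a, b: w1 * a + w2 * b


def cascade(sim_sy, sim_se, gamma):
    """Cascade hybrid: filter by semantic score, then rank by syntactic score."""
    candidates = {x for x, s in sim_se.items() if s > gamma}
    return {x: sim_sy[x] for x in candidates if sim_sy[x] > gamma}


# Hypothetical scores for three services and one query.
sim_sy = {"S1": 0.8, "S2": 0.4, "S3": 0.1}
sim_se = {"S1": 0.3, "S2": 0.6, "S3": 0.7}
gamma = 0.35

print(mixed(sim_sy, sim_se, min, gamma))            # strictest combination
print(mixed(sim_sy, sim_se, max, gamma))            # most permissive
print(mixed(sim_sy, sim_se, weighted(0.8), gamma))  # 0.8 syntactic, 0.2 semantic
print(cascade(sim_sy, sim_se, gamma))
```

With min, a service must pass both measures; with max, passing either one is enough; the cascade uses the semantic measure only as a filter and reports the syntactic score.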
Experimental Results
1. Min(sim_{Sy}(q; x), sim_{Se}(q; x))
2. Max(sim_{Sy}(q; x), sim_{Se}(q; x))
3. 0.8 sim_{Sy}(q; x) + 0.2 sim_{Se}(q; x)
4. 0.6 sim_{Sy}(q; x) + 0.4 sim_{Se}(q; x)
Hybrid data
• Service knowledge
– Features of existing services (service documentation,
specification, interface description, ontology-based
semantic extension, service reputation, monitored
information)
• Client knowledge
– Client's profile: area of expertise, location, history of
searches and previously used web services
• Functional knowledge
– Knowledge required by the matching algorithm to map
between the client needs and the services that might
satisfy those needs
Client knowledge
24 / 40 services exist
How many of them provide feasible results?
[Figure: IC-Service usage diagram. Components: Web Service, Proxy (Axis), Application, Community, Registry, IC-Service. Interactions: <register>, <develop>, <invoke>, <respond>, <query>, <recommend>, <feedback>, <report>]
Birukou, A., Blanzieri, E., D'Andrea, V., Giorgini, P., Kokash, N., Modena,
A.: "IC-Service: A Service-Oriented Approach to the Development of
Recommendation Systems", The ACM Symposium on Applied
Computing, Special Track on Web Technologies (WT), March 2007.
System for Implicit Culture Support
[Figure: architecture of the System for Implicit Culture Support. Inductive Module: produces a theory about common user behavior (from observations and a domain theory). Composer (Cultural Action Finder, Scene Producer): recommends actions (scenes). Observer: stores information about actions in the DB of observations. Data exchanged: agents, objects, actions, scenes.]
http://dit.unitn.it/~implicit
Observation of web service invocations
• Actors:
– Applications (application name, user name, location)
– Users (user name, location)
• Objects:
– Operation (operation name, web service name, category)
– Inputs/Outputs (parameter name, parameter value)
– Requests (operation names, input/output parameters, category)
• Actions:
– Bind (timestamp, web service),
– Invoke (timestamp, operation, input),
– Get response (timestamp, operation, output, response time),
– Raise exception (timestamp, operation, exception type, input),
– Provide feedback (report about contract violations, domain-specific
QoS parameters),
– Submit query (request, preferences)
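One way to hold such observations in code is a single record type with the action kind and its attributes. This is only an illustrative data model, not the IC-Service schema: the class and field names are assumptions, and for simplicity every record carries a timestamp even though the slide lists none for feedback and queries.

```python
from dataclasses import dataclass, field
from enum import Enum


class Action(Enum):
    BIND = "bind"
    INVOKE = "invoke"
    GET_RESPONSE = "get response"
    RAISE_EXCEPTION = "raise exception"
    PROVIDE_FEEDBACK = "provide feedback"
    SUBMIT_QUERY = "submit query"


@dataclass
class Observation:
    actor: str        # application or user that performed the action
    action: Action
    timestamp: float
    details: dict = field(default_factory=dict)  # operation, input, output, ...


def by_action(observations, action):
    """Select the observations of one action kind."""
    return [o for o in observations if o.action is action]


# Hypothetical log of one invocation and its response.
log = [
    Observation("app1", Action.INVOKE, 1.0,
                {"operation": "getExchangeRate"}),
    Observation("app1", Action.GET_RESPONSE, 1.2,
                {"operation": "getExchangeRate", "response_time": 0.2}),
]
print(len(by_action(log, Action.GET_RESPONSE)))
```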
Matching + User experience
Rank   Relevant   Matching algorithm   IC-Service: invoke-response   IC-Service: feedback 0-1   Final ranking
1      S2         S1  0.8              S3  0.8                       S2  1                      S2  0.75
2      S3         S2  0.75             S2  0.5                       S3  0.5                    S3  0.6
3      S4         S3  0.5              S1  0.3                       S4  0.5                    S1  0.36
4      S1         S4  0.25             S4  0.2                       S1  0                      S4  0.31
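The slide does not state how the final ranking is computed. Reading the figure as three per-service scores (matching algorithm, invoke-response, feedback 0-1) plus a final score, the final values agree with the unweighted mean of the three scores, truncated to two decimals. The check below is a sketch under that reading; the variable and function names are assumptions.

```python
# Scores read off the slide, keyed by service.
matching = {"S1": 0.8, "S2": 0.75, "S3": 0.5, "S4": 0.25}
invoke_response = {"S3": 0.8, "S2": 0.5, "S1": 0.3, "S4": 0.2}
feedback = {"S2": 1.0, "S3": 0.5, "S4": 0.5, "S1": 0.0}
reported_final = {"S2": 0.75, "S3": 0.6, "S1": 0.36, "S4": 0.31}


def combine(*rankings):
    """Unweighted mean of the per-service scores (assumed combination rule)."""
    services = rankings[0].keys()
    return {s: sum(r[s] for r in rankings) / len(rankings) for s in services}


final = combine(matching, invoke_response, feedback)
ranking = sorted(final, key=final.get, reverse=True)
for service in ranking:
    print(service, round(final[service], 4), reported_final[service])
```

Under this reading S1, which the matching algorithm alone ranks first, drops to third place once user experience is taken into account.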
Functional knowledge
• Query → knowledge-based reasoning →
response
– Vocabularies
• currency exchange → [currency → country] →
getExchangeRate(country1, country2)
– Composition
• currency exchange →
getCountryByCurrency(currency) +
getExchangeRate(country1, country2)
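The composition case can be illustrated with two stubs chained together. Everything here is hypothetical: the Python function names mirror the operations named above, the currency-to-country table is a two-entry illustration, and the exchange-rate stub returns a description of the call rather than a real rate.

```python
# Illustrative vocabulary only: a real one would come from a knowledge base.
CURRENCY_TO_COUNTRY = {"yen": "Japan", "zloty": "Poland"}


def get_country_by_currency(currency):
    """Stub for getCountryByCurrency(currency)."""
    return CURRENCY_TO_COUNTRY[currency]


def get_exchange_rate(country1, country2):
    """Stub for getExchangeRate(country1, country2); no real rate is looked up."""
    return f"rate({country1} -> {country2})"


def currency_exchange(currency1, currency2):
    """Answer a currency-exchange request by composing the two operations."""
    return get_exchange_rate(get_country_by_currency(currency1),
                             get_country_by_currency(currency2))


print(currency_exchange("yen", "zloty"))
```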
Matching: Related Work
1. [Sajjanhar’04] Sajjanhar, A., Hou, J., Zhang, Y.: “Algorithm for Web Services
Matching”, APWeb, 2004, pp. 665–670.
2. [Bruno’05] Bruno, M., Canfora, G. et al.: “An Approach to support Web
Service Classification and Annotation”, IEEE International Conference on
e-Technology, e-Commerce and e-Service, 2005.
3. [Corella’06] Corella, M.A., Castells, P.: “Semi-automatic semantic-based web
service classification”, International Conference on Knowledge-Based
Intelligent Information and Engineering Systems, 2006.
4. [Dong’04] Dong, X.L. et al.: “Similarity Search for Web Services”, VLDB,
2004.
5. [Platzer’05] Platzer, C., Dustdar, S.: “A vector space search engine for Web
services”, Proceedings of IEEE European Conference on Web Services
(ECOWS), 2005.
6. [Stroulia’05] Stroulia, E., Wang, Y.: “Structural and Semantic Matching for
Assessing Web Service Similarity”, International Journal of Cooperative
Information Systems, Vol. 14, No. 4, 2005, pp. 407-437.
7. [Wu’05] Wu, J., Wu, Z.: “Similarity-based Web Service Matchmaking”, IEEE
International Conference on Services Computing, 2005, pp. 287-294.
8. [Zhuang’05] Zhuang, Z., Mitra, P., Jaiswal, A.: “Corpus-based Web Services
Matchmaking”, AAAI, 2005.
9. [Verma’05] Verma, K., Sivashanmugam, K., et al.: “METEOR-S WSDI: A scalable
P2P infrastructure of registries for semantic publication and discovery of
web services”, Journal of Information Technology and Management, Special
Issue on Universal Global Integration, Vol. 6, No. 1, 2005, pp. 17-39.
Hybrid algorithms: Related work
• Syeda-Mahmood, T., Shah, G., et al.: “Searching service
repositories by combining semantic and ontological
matching”, International Conference on Web Services, 2005,
pp. 13-20.
“(1) The domain-independent relationships are derived using
an English thesaurus… (2) The domain-specific ontological
similarity is derived by inferencing the semantic annotations
associated with web service descriptions.
…better relevancy results can be obtained for service matches
from a large repository, than could be obtained using any
one cue alone.”
• Klusch, M., Fries, B., Sycara, K.: “Automated Semantic Web
Service Discovery with OWLS-MX”, AAMAS, 2006.
“…under certain constraints logic based only approaches to
OWLS service I/O matching can be significantly
outperformed by hybrid ones.”
Hybrid matching: Related work
• Rocha, C. et al.: “A Hybrid Approach for Searching in the
Semantic Web”, International World Wide Web Conference,
2004, pp. 374-383.
• Castells, P., Fernandez, M., Vallet, D.: “An Adaptation of the
Vector-Space Model for Ontology-Based Information
Retrieval”, IEEE Transactions on Knowledge and Data
Engineering, 2007, to appear.