Slide 1 - Welcome to Yuh-Jzer Joung's Web host!
Introduction to
Dist. Info. Systems Lab. – Group B
(DISLab-B@NTUIM)
分散式資訊系統實驗室 (B)
2002.3.27.
DISLab-B
NTUIM
Outline
Introduction to DISLab-B
Selected Topic 1: Web Services Brokering
Selected Topic 2: Some Improvements for CAN
Selected Topic 3: A Sequence-Based Text Retrieval System
Profile
Location: 教學館 (Teaching Building), Room 112-B
Homepage: http://DISLab.im.ntu.edu.tw
Telephone number: +886 2 2363-0231 Ext. 4131
People
Supervisor
蔡益坤
Dr. Tsay
Graduate Students
蔡澤銘
饒訓豪 王佳竣 陳柏安 陳郁方 葉雲文 陳健龍
蔡明憲 陳柏宏 陳佳蘊 劉智雄 許均揚 陳柏均
Undergraduate Students
黃鈞澤
喻至瑋 李勝驊 黃如頤
Research Areas
Distributed Algorithms / Systems
Web Services / Semantic Web
Security
Text Retrieval
Formal Verification
Research Areas (cont.)
Distributed Algorithms / Systems
Exclusion and synchronization
Object location and routing
Tapestry (Plaxton), CAN, P-Grid
RFID-based systems
Research Areas (cont.)
RFID-based systems
Source: Technology Guide, http://www.autoidcenter.org
Research Areas (cont.)
Web Services / Semantic Web
Service discovery, composition, and execution
Semantic annotation
Research Areas (cont.)
Security
Mobile agent protection
Web service security
Smart card-aware applications
Research Areas (cont.)
Text Retrieval
Chinese-English text indexing and retrieval
Publications management
Research Areas (cont.)
Formal Verification
Temporal logic
Infinite-state systems
Intelligent Web Services Brokering
Chia-Chun Wang
Research Objectives
Development of a Web services brokering framework
Design of service matching algorithms
Incorporation of negotiation mechanisms into Web services
Implementation of a prototype system
Automation of a business scenario: planning a trip
Relevant Standards
Service Description Languages: WSDL (interface), DAML-S (semantics), …
Service Registries: UDDI, ebXML, …
Service Composition Description Languages: BPEL4WS, DAML-S, …
Ontology Definition Languages: RDF, DAML+OIL, DAML-S, …
Web Services Brokering
Collection and storage of service descriptions and provider/service profiles
Comprehension of the customer’s needs
Service composition
Service discovery by matching service advertisements against requests
Service execution
Service/Provider Profiles
A service/provider profile (in DAML) contains information about a service and its provider that is not included in its service description (in WSDL).
Two service/provider profiles can be directly compared if they are derived from the same ontology.
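The comparability rule above can be sketched in a few lines of Python. The `Profile` class, the helper functions, and the example property values are illustrative assumptions, not part of the DISLab-B system; a real profile would be a DAML document rather than a dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """A hypothetical service/provider profile: an ontology URI plus property values."""
    ontology: str
    properties: dict = field(default_factory=dict)


def comparable(a: Profile, b: Profile) -> bool:
    """Two profiles are directly comparable only if they share an ontology."""
    return a.ontology == b.ontology


def matching_properties(a: Profile, b: Profile) -> dict:
    """Return the properties of a on which two comparable profiles agree."""
    if not comparable(a, b):
        raise ValueError("profiles are derived from different ontologies")
    return {k: v for k, v in a.properties.items() if b.properties.get(k) == v}


HOTEL = "http://dislab.im.ntu.edu.tw/ontology/hotel.daml"
request = Profile(HOTEL, {"hotelCity": "Tokyo", "hotelClass": 4})
advert = Profile(HOTEL, {"hotelCity": "Tokyo", "hotelClass": 3})
print(matching_properties(request, advert))
```

Profiles from different ontologies are rejected rather than compared field by field, since equal property names need not carry equal meanings across ontologies.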
A Brokering System Architecture
Tourism-Related Ontologies
Hotels
Transportation
Geographic Information
Trips
The Hotel Ontology
The Transportation Ontology
The Geography Ontology
The Trip Ontology
The Customer’s Needs: Planning a Trip
Check if hotels, flights, and trains are available.
Choose a schedule that has convenient but inexpensive hotels, better weather conditions, more events, etc.

Day | Weekday | Travel (Morning/Afternoon/Evening) | Accommodation | Services
1 | FRI | Taipei -> Osaka, class=B | Osaka, class>=3 | airplane, hotel
2 | SAT | | Osaka, class>=3 | hotel
3 | SUN | Osaka -> Sapporo, class=A | Sleeping Car, class<=A | reasoning for trains overnight
4 | MON | | Sapporo, class>=3 | hotel
5 | TUE | | Sapporo, class>=3 | hotel
6 | WED | Sapporo -> Hakodate, class=B | Hakodate, class>=3 | train, hotel
7 | THU | Hakodate -> Tokyo, class=B | Tokyo, class>=3, downtown<=0.5 mile | train, hotel
8 | FRI | | Tokyo, class>=3, downtown<=0.5 mile | hotel
9 | SAT | | Tokyo, class>=3, downtown<=0.5 mile | hotel
10 | SUN | Tokyo -> Taipei, class=B | | bus, airplane

Cities: Taipei (台北), Osaka (大阪), Sapporo (札幌), Hakodate (函館), Tokyo (東京)

Day | Weekday | Schedule A | Schedule B | Schedule C
1 | FRI | 2002/7/5 | 2002/7/12 | 2002/9/6
2 | SAT | 2002/7/6 | 2002/7/13 | 2002/9/7
3 | SUN | 2002/7/7 | 2002/7/14 | 2002/9/8
4 | MON | 2002/7/8 | 2002/7/15 | 2002/9/9
5 | TUE | 2002/7/9 | 2002/7/16 | 2002/9/10
6 | WED | 2002/7/10 | 2002/7/17 | 2002/9/11
7 | THU | 2002/7/11 | 2002/7/18 | 2002/9/12
8 | FRI | 2002/7/12 | 2002/7/19 | 2002/9/13
9 | SAT | 2002/7/13 | 2002/7/20 | 2002/9/14
10 | SUN | 2002/7/14 | 2002/7/21 | 2002/9/15
The Service Matchmaker
Input
General requirements on the provider/service in DAML
The requested service in WSDL
Output
Ranked URLs to WSDL files
Database
Provider/service profiles in DAML
Service descriptions in WSDL
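A minimal sketch of the matchmaker's input/output contract follows. The slides do not state the matching algorithm, so the scoring rule here (a hard filter on requested operations, then the fraction of profile requirements met) is an assumption, as are the function name `rank_services`, the dictionary layout, and the `example.org` URLs.

```python
def rank_services(requirements, requested_ops, database):
    """Rank advertised services against a request.

    requirements  -- dict of required profile properties (stands in for the DAML part)
    requested_ops -- set of operation names (stands in for the WSDL part)
    database      -- list of dicts with keys 'wsdl_url', 'profile', 'operations'
    """
    ranked = []
    for entry in database:
        # Hard filter: the service must offer every requested operation.
        if not requested_ops <= entry["operations"]:
            continue
        # Soft score: fraction of profile requirements the advertisement meets.
        met = sum(1 for k, v in requirements.items() if entry["profile"].get(k) == v)
        score = met / len(requirements) if requirements else 1.0
        ranked.append((score, entry["wsdl_url"]))
    # Highest score first; ties broken by URL for a deterministic order.
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for _, url in ranked]


database = [
    {"wsdl_url": "http://example.org/hotel-a.wsdl",
     "profile": {"hotelCity": "Tokyo", "hotelClass": 4},
     "operations": {"Reserve", "Cancel"}},
    {"wsdl_url": "http://example.org/hotel-b.wsdl",
     "profile": {"hotelCity": "Tokyo", "hotelClass": 3},
     "operations": {"Reserve", "Cancel", "GeneralInformation"}},
    {"wsdl_url": "http://example.org/hotel-c.wsdl",
     "profile": {"hotelCity": "Osaka", "hotelClass": 4},
     "operations": {"GeneralInformation"}},
]
result = rank_services({"hotelCity": "Tokyo", "hotelClass": 4}, {"Reserve", "Cancel"}, database)
print(result)
```

The third service is dropped because it lacks the requested operations; the other two are ordered by how many profile requirements they satisfy.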
A Hotel Requirement: WSDL Part
<wsdl:message name="ReserveIn">
  <wsdl:part name="CustomerName" type="s:string" />
  <wsdl:part name="StartDate" type="s:string" />
  <wsdl:part name="EndDate" type="s:string" />
  <wsdl:part name="NumRooms" type="s:integer" />
  <wsdl:part name="RoomType" type="s:string" />
  <wsdl:part name="NumPersons" type="s:integer" />
</wsdl:message>
<wsdl:message name="ReserveOut">
  <wsdl:part name="ReservationNum" type="s:string" />
  <wsdl:part name="Comment" type="s:string" />
</wsdl:message>
<wsdl:message name="CancelIn">
  <wsdl:part name="CustomerName" type="s:string" />
  <wsdl:part name="ReservationNum" type="s:string" />
</wsdl:message>
A Hotel Requirement: WSDL Part (cont.)
<wsdl:message name="CancelOut">
  <wsdl:part name="Result" type="s:string" />
  <wsdl:part name="Comment" type="s:string" />
</wsdl:message>
<wsdl:message name="GeneralInformationIn">
</wsdl:message>
<wsdl:message name="GeneralInformationOut">
  <wsdl:part name="GeneralInformation" type="s:string" />
</wsdl:message>
A Hotel Requirement: WSDL Part (cont.)
<wsdl:portType name="ReserveWS">
  <wsdl:operation name="Reserve">
    <wsdl:input message="intf:ReserveIn"/>
    <wsdl:output message="intf:ReserveOut"/>
  </wsdl:operation>
</wsdl:portType>
<wsdl:portType name="CancelWS">
  <wsdl:operation name="Cancel">
    <wsdl:input message="intf:CancelIn"/>
    <wsdl:output message="intf:CancelOut"/>
  </wsdl:operation>
</wsdl:portType>
<wsdl:portType name="GeneralInformationWS">
  <wsdl:operation name="GeneralInformation">
    <wsdl:input message="intf:GeneralInformationIn"/>
    <wsdl:output message="intf:GeneralInformationOut"/>
  </wsdl:operation>
</wsdl:portType>
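To show how a matchmaker might read the WSDL part of a request, here is a small sketch that extracts operations from portType elements with Python's standard `xml.etree.ElementTree`. The enclosing `wsdl:definitions` element and the `urn:example:hotel` namespace are assumptions added to make the fragment well-formed; they do not appear on the slides.

```python
import xml.etree.ElementTree as ET

WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

SAMPLE = """
<wsdl:definitions xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
                  xmlns:intf="urn:example:hotel">
  <wsdl:portType name="ReserveWS">
    <wsdl:operation name="Reserve">
      <wsdl:input message="intf:ReserveIn"/>
      <wsdl:output message="intf:ReserveOut"/>
    </wsdl:operation>
  </wsdl:portType>
  <wsdl:portType name="CancelWS">
    <wsdl:operation name="Cancel">
      <wsdl:input message="intf:CancelIn"/>
      <wsdl:output message="intf:CancelOut"/>
    </wsdl:operation>
  </wsdl:portType>
</wsdl:definitions>
"""


def list_operations(wsdl_text):
    """Map each portType name to [(operation, input message, output message)]."""
    root = ET.fromstring(wsdl_text.strip())
    result = {}
    for port_type in root.findall("wsdl:portType", WSDL_NS):
        ops = []
        for op in port_type.findall("wsdl:operation", WSDL_NS):
            inp = op.find("wsdl:input", WSDL_NS)
            out = op.find("wsdl:output", WSDL_NS)
            ops.append((op.get("name"), inp.get("message"), out.get("message")))
        result[port_type.get("name")] = ops
    return result


operations = list_operations(SAMPLE)
print(operations)
```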
A Hotel Requirement: DAML Part
<rdf:Description rdf:about="#myHotelRequirement">
  <rdf:type>
    <daml:Class rdf:about="http://dislab.im.ntu.edu.tw/ontology/hotel.daml#hotel"/>
  </rdf:type>
  <ns0:hotelClass><xsd:integer xsd:value="4"/></ns0:hotelClass>
  <ns0:hotelCity><xsd:string xsd:value="Tokyo"/></ns0:hotelCity>
  <ns0:hotelCountry><xsd:string xsd:value="Japan"/></ns0:hotelCountry>
  <ns0:hotelRoom rdf:resource="#myRoomRequirement"/>
</rdf:Description>
<rdf:Description rdf:about="#myRoomRequirement">
  <rdf:type>
    <daml:Class rdf:about="http://dislab.im.ntu.edu.tw/ontology/hotel.daml#room"/>
  </rdf:type>
</rdf:Description>
A Prototype Brokering System
Outline
Problem definition, models, approaches, and analyses
Preliminaries
Our scheme for object routing and location
Problem definition, model, approaches, and analyses
Basic model: an overlay network G = (V, E), weighted and often assumed to be metric, shares a set of objects (resources) O, with |V| = n and |O| = m
A node u requests an object o that may be held by any other node v; that is, u routes to or locates o
Considerations: scalability, availability, load balance, locality, mobility, fault tolerance, …
Problem definition, models, analyses, and approaches (cont’d)
Two approaches: Plaxton’s scheme, Tapestry, Pastry, CAN, and Chord vs. Freenet, Gnutella, Morpheus, and Napster
The former guarantee “availability” while the latter do not
The latter support “search queries” more readily than the former
Complexity measures
Routing path length
Stretch: the ratio of “the cost of the routing path the routing algorithm takes” to “the cost of the shortest routing path”
Routing table space
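The stretch measure defined above can be computed directly on a small weighted graph. This is a sketch under assumptions: the four-node graph, its edge weights, and the routed path are made up for illustration, and Dijkstra's algorithm is used only to obtain the shortest-path cost.

```python
import heapq


def shortest_cost(graph, src, dst):
    """Dijkstra's algorithm on a dict-of-dicts weighted graph."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    raise ValueError("destination unreachable")


def path_cost(graph, path):
    """Sum of edge weights along a path given as a list of nodes."""
    return sum(graph[u][v] for u, v in zip(path, path[1:]))


def stretch(graph, routed_path):
    """Cost of the path actually taken divided by the shortest-path cost."""
    return path_cost(graph, routed_path) / shortest_cost(graph, routed_path[0], routed_path[-1])


graph = {
    "u": {"a": 1, "b": 4},
    "a": {"u": 1, "v": 1},
    "b": {"u": 4, "v": 2},
    "v": {"a": 1, "b": 2},
}
print(stretch(graph, ["u", "b", "v"]))  # routed cost 6, shortest cost 2
```

A stretch of 1 means the overlay route is as cheap as the best possible route; larger values quantify the detour.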
Preliminaries
Hypercubes
[Figure: a 3-dimensional hypercube with nodes 000, 001, 010, 011, 100, 101, 110, 111]
Routing path length: O(log n)
Routing table space: O(log n)
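Both bounds follow from bit-fixing routing, sketched below: each node keeps one neighbor per dimension (log n entries), and a route flips at most one bit per dimension (at most log n hops). The function names and the choice to fix bits from least to most significant are illustrative assumptions.

```python
def neighbors(node, dims):
    """A node's routing table: one neighbor per dimension (dims = log2 n entries)."""
    return [node ^ (1 << bit) for bit in range(dims)]


def hypercube_route(src, dst, dims):
    """Route by flipping each differing bit, from least to most significant."""
    path = [src]
    cur = src
    for bit in range(dims):
        mask = 1 << bit
        if (cur ^ dst) & mask:
            cur ^= mask  # move to the neighbor across this dimension
            path.append(cur)
    return path


path = hypercube_route(0b000, 0b111, 3)
print([format(p, "03b") for p in path])
```

The number of hops equals the Hamming distance between source and destination, which is at most the number of dimensions.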
Preliminaries (cont’d)
Plaxton’s scheme (Tapestry, Pastry, Chord)
Mapping object IDs to node IDs
Unmapped digits need not be fixed, which makes it possible to bound the stretch by O(1)
[Figure: request for object 11111, with nodes labeled XXX, 1XX, 11X, 111]
Relating Plaxton’s scheme (Tapestry, Pastry, Chord) to hypercubes
Hypercubes cannot take edge costs into consideration
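Digit-by-digit resolution of an ID can be sketched as follows. This is a simplification under stated assumptions: IDs are matched by leading digits, every node can see the whole node list, and ties are broken by ID. A real Plaxton-style scheme keeps per-level routing tables and chooses among the candidates by network distance, which is where the freedom in the unmapped digits pays off.

```python
def shared_prefix_len(a, b):
    """Number of leading digits on which two ID strings agree."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def prefix_route(start, target, nodes):
    """At each hop, move to a node matching more leading digits of target."""
    path = [start]
    cur = start
    while cur != target:
        level = shared_prefix_len(cur, target)
        better = [n for n in nodes if shared_prefix_len(n, target) > level]
        if not better:
            break  # no node matches more digits; cur is the closest node
        # Take the smallest improvement to mimic one-digit-per-hop routing.
        cur = min(better, key=lambda n: (shared_prefix_len(n, target), n))
        path.append(cur)
    return path


nodes = ["000", "011", "100", "101", "110", "111"]
route = prefix_route("000", "111", nodes)
print(route)
```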
Preliminaries (cont’d)
CAN (Content-Addressable Network)
[Figure: a CAN coordinate space partitioned into zones labeled A through Q]
Relating CAN to Plaxton’s scheme (Tapestry, Pastry, Chord)
min_d {O(d n^{1/d})} = O(log n) when d = log n, which means arriving at the destination in each dimension in a constant number of hops rather than in n^{1/d} hops
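The effect of the dimension count on the d n^{1/d} bound is easy to check numerically. The sketch below ignores constant factors and takes logarithms base 2, both of which are assumptions for illustration. With d = log2 n, the per-dimension term n^{1/d} equals 2, so the product is 2 log2 n.

```python
def can_hops(n, d):
    """Order-of-magnitude hop count d * n**(1/d) for n nodes in d dimensions."""
    return d * n ** (1.0 / d)


n = 2 ** 20
for d in (2, 4, 10, 20):
    print(d, round(can_hops(n, d), 1))
```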
Our scheme for object routing and location
CAN is in some ways a more general model than Plaxton-like schemes; it achieves
An average routing path length of O(d n^{1/d}) hops
O(d) space, where d is the number of dimensions
Yet, in each dimension, the routing behaves like a “linear search” (hence the n^{1/d} term), which can be improved
Idea: with some bounded extra information, a distributed search structure can be built for each dimension, which enables more efficient routing
Our scheme for object routing and location (cont’d) -- Idea
Illustration
[Figure: illustration of the idea on the CAN zones A through Q]
Our scheme for object routing and location (cont’d)
Since a tree structure does not give every node an equal probability of being responsible for routing requests, it becomes necessary to balance the load across nodes
This balancing is possible because
Each node participates in d different binary trees at the same time
Each node has a height determined by its position in each tree
Average routing path length: O(d log n^{1/d}) = O(log n) hops
Routing table space: O(d)
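The identity d log n^{1/d} = log n can be checked numerically against the linear per-dimension cost of plain CAN. The sketch ignores constant factors and uses base-2 logarithms; both function names are hypothetical.

```python
import math


def linear_hops(n, d):
    """Plain CAN: about n**(1/d) hops in each of d dimensions."""
    return d * n ** (1.0 / d)


def tree_hops(n, d):
    """Tree search per dimension: d * log2(n**(1/d)), which simplifies to log2(n)."""
    return d * math.log2(n ** (1.0 / d))


n = 4096
for d in (2, 3, 4):
    print(d, round(linear_hops(n, d), 1), round(tree_hops(n, d), 1))
```

The tree-based count stays at log2 n regardless of d, while the linear count depends strongly on the number of dimensions.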
Our scheme for object routing and location (cont’d)
Stretch remains an issue; CAN tackles it through topologically sensitive construction of the overlay network and other optimizations
Yet CAN does not provide a formal proof that bounds the stretch; instead, it gives simulation results
We are still working on the stretch issue
Other interesting related issues include
Search ability
Handling mutable objects
Security
Sequence-Based Text Retrieval
What is the Sequence-Based Retrieval Method?
The method treats both the user query and the target document as sequences
It uses sequence similarity metrics to judge the relevance between sequences
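The slides do not say which similarity metric is used, so the sketch below picks the longest common subsequence (LCS) purely as an assumed example of a sequence similarity metric. Normalizing by the query length is likewise an illustrative choice.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            # Extend a match diagonally, or carry over the best neighbor.
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def similarity(query, document):
    """LCS length normalized by query length, giving a score in [0, 1]."""
    return lcs_length(query, document) / len(query) if query else 0.0


for doc in ("aba", "axxba", "axxb"):
    print(doc, round(similarity("aba", doc), 2))
```

Unlike a set-based match, the score depends on the order of terms, so a document missing the final `a` of the query scores lower than one that contains all three terms in order.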
Comparison with other methods
Boolean model:
Quantity and position of matched terms are not considered
Query string: aba
In this model, axxb and aba have the same importance
Vector model:
Position of matched terms is not considered
Query string: aba
In this model, axxba and xxaba have the same importance
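Both observations can be reproduced with a few lines of Python, treating each character as a term as the slide examples do. The function names are hypothetical, and the Boolean model is assumed to be a conjunctive (AND) match.

```python
from collections import Counter


def boolean_match(query, doc):
    """Boolean AND model: every query term must occur somewhere in the document."""
    return set(query) <= set(doc)


def term_vector(doc):
    """Vector model: term frequencies only; positions are discarded."""
    return Counter(doc)


query = "aba"
print(boolean_match(query, "axxb"), boolean_match(query, "aba"))
print(term_vector("axxba") == term_vector("xxaba"))
```

The Boolean model cannot tell the two documents apart because both contain `a` and `b`; the vector model cannot because both documents have identical term counts.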
SIR System
SIR is a text retrieval system that implements the sequence-based method
Information Retrieval
Information:
Doc 1, pos 1,7,51
Doc 2, pos 3
Doc 3, pos 11,19
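Entries of the form "Doc, pos" suggest a positional inverted index. The sketch below builds one under that assumption; the function name, the sample documents, and the convention that positions start at 1 are illustrative and not taken from the SIR implementation.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map each term to {doc_id: [positions]}, with positions starting at 1."""
    index = defaultdict(dict)
    for doc_id, terms in docs.items():
        for pos, term in enumerate(terms, start=1):
            index[term].setdefault(doc_id, []).append(pos)
    return index


docs = {
    1: "to be or not to be".split(),
    2: "not a word".split(),
}
index = build_positional_index(docs)
print(index["to"])
print(index["not"])
```

Keeping positions, not just document IDs, is what lets a sequence-based method reason about the order and spacing of matched terms.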
Candidate document selection
In the SIR system, any document that contains at least one word of the query string is treated as a candidate document
This approach is inefficient: a huge number of irrelevant documents will be treated as relevant ones
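The any-word rule amounts to taking the union of the postings of all query terms. The sketch below assumes a hypothetical index of the form term -> {doc_id: [positions]}; the terms and document IDs are made up to show how a single common word inflates the candidate set.

```python
def candidate_documents(query_terms, index):
    """Union of the postings of every query term (a match on ANY term)."""
    candidates = set()
    for term in query_terms:
        candidates |= set(index.get(term, {}))
    return candidates


# A tiny hypothetical positional index: term -> {doc_id: [positions]}.
index = {
    "web": {1: [1, 7], 2: [3]},
    "service": {2: [4]},
    "the": {1: [2], 2: [1], 3: [5], 4: [9]},
}
print(sorted(candidate_documents(["web", "service"], index)))
print(sorted(candidate_documents(["the", "web"], index)))
```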
Multi-language support
The SIR system handles Chinese and English
However, the sequence-based approach is applicable to other languages
Basic differences between English and Chinese
In English text retrieval
Word stemming
Stop words
A word is an appropriate unit for text retrieval
In Chinese text retrieval
Text does not have clear word boundaries
A character is an appropriate unit for text retrieval
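A mixed-language tokenizer following these two unit choices might look like the sketch below. It is an assumption-laden illustration: only the basic CJK Unified Ideographs block is treated as Chinese, English words are lowercased, and stemming and stop-word removal are left out.

```python
import re

# Each Chinese character (basic CJK block) is its own unit; runs of Latin
# letters and digits form one word. Punctuation and other scripts are skipped.
TOKEN = re.compile(r"[\u4e00-\u9fff]|[A-Za-z0-9]+")


def tokenize(text):
    """Split text into English words and single Chinese characters."""
    return [t.lower() for t in TOKEN.findall(text)]


tokens = tokenize("Web Services 分散式系統")
print(tokens)
```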
DEMO