Slide 1 - Welcome to Yuh-Jzer Joung's Web host!


Introduction to the Distributed Information Systems Laboratory, Group B
(DISLab-B@NTUIM)
2002.3.27
DISLab-B
NTUIM
Outline
Introduction to DISLab-B
Selected Topic 1: Web Services Brokering
Selected Topic 2: Some Improvements for CAN
Selected Topic 3: A Sequence-Based Text Retrieval System

Profile
Location
  Teaching Building (教學館), Room 112-B
Homepage
  http://DISLab.im.ntu.edu.tw
Telephone number
  +886 2 2363-0231 Ext. 4131

People
Supervisor
  蔡益坤 (Dr. Tsay)
Graduate Students
  蔡澤銘 饒訓豪 王佳竣 陳柏安 陳郁方 葉雲文 陳健龍
  蔡明憲 陳柏宏 陳佳蘊 劉智雄 許均揚 陳柏均
Undergraduate Students
  黃鈞澤 喻至瑋 李勝驊 黃如頤

Research Areas
Distributed Algorithms / Systems
Web Services / Semantic Web
Security
Text Retrieval
Formal Verification

Research Areas (cont.)
Distributed Algorithms / Systems
  Exclusion and synchronization
  Object location and routing
    Tapestry (Plaxton), CAN, P-Grid
  RFID-based systems

Research Areas (cont.)
RFID-based systems
  (Figure) Source: Technology Guide, http://www.autoidcenter.org

Research Areas (cont.)
Web Services / Semantic Web
  Service discovery, composition, and execution
  Semantic annotation

Research Areas (cont.)
Security
  Mobile agent protection
  Web service security
  Smart card-aware applications

Research Areas (cont.)
Text Retrieval
  Chinese-English text indexing and retrieval
  Publications management

Research Areas (cont.)
Formal Verification
  Temporal logic
  Infinite-state systems

Intelligent Web Services Brokering
Chia-Chun Wang

Research Objectives
Development of a Web services brokering framework
Design of service matching algorithms
Incorporation of negotiation mechanisms into Web services
Implementation of a prototype system
Automation of a business scenario: planning a trip

Relevant Standards
Service Description Languages:
  WSDL (interface), DAML-S (semantics), …
Service Registries:
  UDDI, ebXML, …
Service Composition Description Languages:
  BPEL4WS, DAML-S, …
Ontology Definition Languages:
  RDF, DAML+OIL, DAML-S, …

Web Services Brokering
Collection and storage of
  Service descriptions
  Provider/service profiles
Comprehension of the customer’s needs
Service composition
Service discovery by matching service advertisements against requests
Service execution

Service/Provider Profiles
A service/provider profile (in DAML) contains information about a service and its provider that is not included in its service description (in WSDL).
Two service/provider profiles can be directly compared if they are derived from the same ontology.

A Brokering System Architecture
Tourism-Related Ontologies
Hotels
Transportation
Geographic Information
Trips

The Hotel Ontology
The Transportation Ontology
The Geography Ontology
The Trip Ontology
The Customer’s Needs: Planning a Trip
Check if hotels, flights, and trains are available.
Choose a schedule that has convenient but inexpensive hotels, better weather conditions, more events, etc.

Day | Weekday | Morning / Afternoon / Evening | Accommodation | Services
1 | FRI | Taipei -> Osaka, class=B | Osaka, class>=3 | airplane, hotel
2 | SAT | | Osaka, class>=3 | hotel
3 | SUN | Osaka -> Sapporo, class=A | Sleeping car, class<=A | reasoning for overnight trains
4 | MON | | Sapporo, class>=3 | hotel
5 | TUE | | Sapporo, class>=3 | hotel
6 | WED | Sapporo -> Hakodate, class=B | Hakodate, class>=3 | train, hotel
7 | THU | Hakodate -> Tokyo, class=B | Tokyo, class>=3, downtown<=0.5 mile | train, hotel
8 | FRI | | Tokyo, class>=3, downtown<=0.5 mile | hotel
9 | SAT | | Tokyo, class>=3, downtown<=0.5 mile | hotel
10 | SUN | Tokyo -> Taipei, class=B | | bus, airplane

Day | Weekday | Schedule A | Schedule B | Schedule C
1 | FRI | 2002/7/5 | 2002/7/12 | 2002/9/6
2 | SAT | 2002/7/6 | 2002/7/13 | 2002/9/7
3 | SUN | 2002/7/7 | 2002/7/14 | 2002/9/8
4 | MON | 2002/7/8 | 2002/7/15 | 2002/9/9
5 | TUE | 2002/7/9 | 2002/7/16 | 2002/9/10
6 | WED | 2002/7/10 | 2002/7/17 | 2002/9/11
7 | THU | 2002/7/11 | 2002/7/18 | 2002/9/12
8 | FRI | 2002/7/12 | 2002/7/19 | 2002/9/13
9 | SAT | 2002/7/13 | 2002/7/20 | 2002/9/14
10 | SUN | 2002/7/14 | 2002/7/21 | 2002/9/15

(Map: Taipei, Osaka, Sapporo, Hakodate, Tokyo)

The Service Matchmaker
Input
  General requirements on the provider/service in DAML
  The requested service in WSDL
Output
  Ranked URLs to WSDL files
Database
  Provider/service profiles in DAML
  Service descriptions in WSDL

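The matchmaker's input, output, and database can be pictured with a minimal Python sketch. The slides do not specify the matching algorithm, so everything here is an assumption for illustration: profiles are flattened to plain property/value dictionaries instead of DAML, the score is simply the fraction of required properties matched exactly, and the names `Profile`, `score`, `rank_services`, and the example URLs are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    """A provider/service profile flattened to property -> value (assumption)."""
    wsdl_url: str
    properties: dict


def score(requirement: dict, profile: Profile) -> float:
    """Fraction of required properties that the profile matches exactly."""
    if not requirement:
        return 1.0
    hits = sum(1 for key, wanted in requirement.items()
               if profile.properties.get(key) == wanted)
    return hits / len(requirement)


def rank_services(requirement: dict, database: list) -> list:
    """Return WSDL URLs of profiles with a non-zero score, best match first."""
    scored = [(score(requirement, p), p.wsdl_url) for p in database]
    scored = [item for item in scored if item[0] > 0]
    # Sort by descending score; ties are broken by URL so the order is stable.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [url for _, url in scored]


# Hypothetical database entries; the URLs are placeholders.
DATABASE = [
    Profile("http://example.org/hotel-a.wsdl",
            {"hotelCity": "Tokyo", "hotelCountry": "Japan", "hotelClass": 4}),
    Profile("http://example.org/hotel-b.wsdl",
            {"hotelCity": "Osaka", "hotelCountry": "Japan", "hotelClass": 4}),
    Profile("http://example.org/hotel-c.wsdl",
            {"hotelCity": "Paris", "hotelCountry": "France", "hotelClass": 2}),
]

REQUIREMENT = {"hotelCity": "Tokyo", "hotelCountry": "Japan", "hotelClass": 4}
ranked = rank_services(REQUIREMENT, DATABASE)
```

A real matchmaker would presumably reason over the ontology (for example, treating `class>=3` as a range constraint) rather than test for equality; the sketch only shows the shape of "requirement in, ranked WSDL URLs out".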
A Hotel Requirement: WSDL Part
<wsdl:message name="ReserveIn">
  <wsdl:part name="CustomerName" element="s:string" />
  <wsdl:part name="StartDate" element="s:string" />
  <wsdl:part name="EndDate" element="s:string" />
  <wsdl:part name="NumRooms" element="s:integer" />
  <wsdl:part name="RoomType" element="s:string" />
  <wsdl:part name="NumPersons" element="s:integer" />
</wsdl:message>
<wsdl:message name="ReserveOut">
  <wsdl:part name="RservationNum" element="s:string" />
  <wsdl:part name="Comment" element="s:string" />
</wsdl:message>
<wsdl:message name="CancelIn">
  <wsdl:part name="CustomerName" element="s:string" />
  <wsdl:part name="RservationNum" element="s:string" />
</wsdl:message>

A Hotel Requirement: WSDL Part (cont.)
<wsdl:message name="CancelOut">
  <wsdl:part name="Result" element="s:string" />
  <wsdl:part name="Comment" element="s:string" />
</wsdl:message>
<wsdl:message name="GeneralInformationIn">
</wsdl:message>
<wsdl:message name="GeneralInformationOut">
  <wsdl:part name="GeneralInfomation" element="s:string" />
</wsdl:message>

A Hotel Requirement: WSDL Part (cont.)
<wsdl:portType name="ReserveWS">
  <wsdl:operation name="Reserve">
    <wsdl:input message="intf:ReserveIn"/>
    <wsdl:output message="intf:ReserveOut"/>
  </wsdl:operation>
</wsdl:portType>
<wsdl:portType name="CancelWS">
  <wsdl:operation name="Cancel">
    <wsdl:input message="intf:CancelIn"/>
    <wsdl:output message="intf:CancelOut"/>
  </wsdl:operation>
</wsdl:portType>
<wsdl:portType name="GeneralInformationWS">
  <wsdl:operation name="GeneralInformation">
    <wsdl:input message="intf:GeneralInformationIn"/>
    <wsdl:output message="intf:GeneralInformationOut"/>
  </wsdl:operation>
</wsdl:portType>

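To compare a requested service with advertised ones, a matchmaker first has to read the operations out of the WSDL. The following Python sketch shows one possible way to do that with the standard library's `xml.etree.ElementTree`. It is not the prototype's code: the `operations` helper is hypothetical, the fragment is wrapped in a `wsdl:definitions` root so that it parses on its own, and the `intf` namespace URI is a placeholder.

```python
import xml.etree.ElementTree as ET

# Namespace URI of WSDL 1.1.
WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# Two of the port types from the slide, wrapped in a root element.
# The "intf" namespace URI is a placeholder chosen for this sketch.
FRAGMENT = """
<wsdl:definitions xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
                  xmlns:intf="http://example.org/hotel">
  <wsdl:portType name="ReserveWS">
    <wsdl:operation name="Reserve">
      <wsdl:input message="intf:ReserveIn"/>
      <wsdl:output message="intf:ReserveOut"/>
    </wsdl:operation>
  </wsdl:portType>
  <wsdl:portType name="CancelWS">
    <wsdl:operation name="Cancel">
      <wsdl:input message="intf:CancelIn"/>
      <wsdl:output message="intf:CancelOut"/>
    </wsdl:operation>
  </wsdl:portType>
</wsdl:definitions>
"""


def operations(wsdl_text: str) -> dict:
    """Map each operation name to its (input message, output message) pair."""
    root = ET.fromstring(wsdl_text.strip())
    result = {}
    for port_type in root.findall("wsdl:portType", WSDL_NS):
        for op in port_type.findall("wsdl:operation", WSDL_NS):
            inp = op.find("wsdl:input", WSDL_NS)
            out = op.find("wsdl:output", WSDL_NS)
            result[op.get("name")] = (inp.get("message"), out.get("message"))
    return result


ops = operations(FRAGMENT)
```

The resulting dictionary gives a matcher a simple structure to compare against the operations of each stored service description.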
A Hotel Requirement: DAML Part
<rdf:Description rdf:about="#myHotelRequirement">
  <rdf:type>
    <daml:Class
      rdf:about="http://dislab.im.ntu.edu.tw/ontology/hotel.daml#hotel"/>
  </rdf:type>
  <ns0:hotelClass><xsd:integer xsd:value="4"/></ns0:hotelClass>
  <ns0:hotelCity><xsd:string xsd:value="Tokyo"/></ns0:hotelCity>
  <ns0:hotelCountry><xsd:string xsd:value="Japan"/></ns0:hotelCountry>
  <ns0:hotelRoom rdf:resource="#myRoomRequirement"/>
</rdf:Description>
<rdf:Description rdf:about="#myRoomRequirement">
  <rdf:type>
    <daml:Class
      rdf:about="http://dislab.im.ntu.edu.tw/ontology/hotel.daml#room"/>
  </rdf:type>
</rdf:Description>

A Prototype Brokering System
Outline
Problem definition, models, approaches, and analyses
Preliminaries
Our scheme for object routing and location

Problem definition, models, approaches, and analyses
Basic model: an overlay network G = (V, E), which is weighted and often assumed to be metric, shares a set of objects (resources) O, with |V| = n and |O| = m
A node u requests an object o that may be held by any other node v; i.e., u routes to or locates o
Considerations: scalability, availability, load balance, locality, mobility, fault tolerance, …

Problem definition, models, approaches, and analyses (cont’d)
Two approaches: Plaxton’s, Tapestry, Pastry, CAN, Chord vs. Freenet, Gnutella, Morpheus, Napster
  The former guarantee “availability” while the latter do not
  The latter are better suited to “search queries” than the former
Complexity measures
  Routing path length
  Stretch: the ratio of “the cost of the routing path the routing algorithm takes” to “the cost of the shortest routing path”
  Routing table space

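The stretch defined above can be computed directly on a small example. The Python sketch below is only an illustration: the four-node weighted graph, the routed path, and the helper names are made up for this purpose, and Dijkstra's algorithm stands in for "the cost of the shortest routing path".

```python
import heapq


def shortest_cost(graph: dict, source: str, target: str) -> float:
    """Cost of the cheapest path, by Dijkstra's algorithm on a dict-of-dicts graph."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    raise ValueError("target unreachable")


def path_cost(graph: dict, path: list) -> float:
    """Sum of edge weights along an explicit path."""
    return sum(graph[a][b] for a, b in zip(path, path[1:]))


def stretch(graph: dict, routed_path: list) -> float:
    """Cost of the path actually taken divided by the shortest-path cost."""
    best = shortest_cost(graph, routed_path[0], routed_path[-1])
    return path_cost(graph, routed_path) / best


# Hypothetical weighted overlay with symmetric edge costs.
GRAPH = {
    "u": {"a": 1, "b": 4},
    "a": {"u": 1, "v": 1},
    "b": {"u": 4, "v": 2},
    "v": {"a": 1, "b": 2},
}

routed = ["u", "b", "v"]          # the path a routing algorithm happened to take
result = stretch(GRAPH, routed)   # (4 + 2) / (1 + 1)
```

A stretch of 1 means the routing algorithm found a shortest path; larger values measure how much cost the overlay routing adds.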
Preliminaries
Hypercubes
  (Figure: a 3-dimensional hypercube with nodes 000 through 111)
  Routing path length: O(log n)
  Routing table space: O(log n)

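Both bounds follow from how routing works on a hypercube with n = 2^k nodes: each node has k = log n neighbors (one per bit), and a message reaches its destination by correcting one differing bit per hop. A minimal Python sketch of this bit-fixing routing follows. The function name and the most-significant-bit-first order are choices made for this illustration.

```python
def hypercube_route(source: int, target: int, dims: int) -> list:
    """Route by flipping the differing bits, most significant bit first."""
    size = 1 << dims
    if not (0 <= source < size and 0 <= target < size):
        raise ValueError("node labels must fit in the given number of dimensions")
    path = [source]
    current = source
    for bit in reversed(range(dims)):
        mask = 1 << bit
        if (current ^ target) & mask:
            current ^= mask  # move to the neighbor across this dimension
            path.append(current)
    return path


path = hypercube_route(0b000, 0b111, 3)
labels = [format(node, "03b") for node in path]
```

The number of hops equals the number of bits in which source and target differ, so it never exceeds the number of dimensions, log n.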
Preliminaries (cont’d)
Plaxton’s (Tapestry, Pastry, Chord)
  Mapping object ID to node ID
  Unmapped digits need not be fixed, which makes it possible to bound the stretch (O(1))
  (Figure: a request for object 11111 routed through nodes matching XXX, 1XX, 11X, 111)
Relating Plaxton’s (Tapestry, Pastry, Chord) to hypercubes
  Hypercubes cannot take edge costs into consideration

Preliminaries (cont’d)
CAN (Content-Addressable Network)
  (Figure: a CAN example with nodes labeled A through Q)
Relating CAN to Plaxton’s (Tapestry, Pastry, Chord)
  $\min\{O(d\,n^{1/d})\} = O(\log n)$ when $d = \log n$, which means arriving at the destination in each dimension in a constant number of hops rather than in $n^{1/d}$ hops

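The claim that the bound is minimized at d = log n can be checked with a short calculation. The derivation below is a sketch that ignores constant factors and treats the path length as exactly $f(d) = d\,n^{1/d}$ with d a real-valued variable; both simplifications are assumptions made here.

```latex
\begin{align*}
f(d) &= d\,n^{1/d}, &
\ln f(d) &= \ln d + \frac{\ln n}{d}, \\
\frac{f'(d)}{f(d)} &= \frac{1}{d} - \frac{\ln n}{d^{2}} = 0
  &&\Longleftrightarrow\quad d = \ln n, \\
f(\ln n) &= \ln n \cdot n^{1/\ln n} = e \ln n = O(\log n).
\end{align*}
```

With base-2 logarithms the same picture appears: for $d = \log_2 n$ we get $n^{1/d} = 2$, so each dimension is crossed in a constant number of hops and the total is $2 \log_2 n$.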
Our scheme for object routing and location
CAN is in some ways a more general model than Plaxton-like schemes; it achieves
  An average routing path length of $O(d\,n^{1/d})$ hops
  $O(d)$ space, where d is the number of dimensions
Yet, in each dimension, the routing behaves like a “linear search” (due to $n^{1/d}$), which can be improved
Idea: with some bounded extra information, a distributed search structure can be built for each dimension, which enables more efficient routing

Our scheme for object routing and location (cont’d): Idea Illustration
(Figure: illustration of the idea on the CAN example with nodes A through Q)

Our scheme for object routing and location (cont’d)
Since a tree structure does not give each node an equal probability of being responsible for routing requests, it becomes necessary to balance each node’s load
This balancing is made possible because
  Each node participates in d different binary trees at the same time
  Each node has a height, given by its position, in each tree
An average routing path length: $O(d \log n^{1/d}) = O(\log n)$ hops
Routing table space: $O(d)$

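The path-length bound follows from one line of algebra, assuming that nodes are spread evenly so that each of the d dimensions holds about $n^{1/d}$ positions and that the per-dimension binary tree is searched in a logarithmic number of hops:

```latex
\begin{align*}
d \cdot \log\!\left(n^{1/d}\right) \;=\; d \cdot \frac{1}{d}\,\log n \;=\; \log n .
\end{align*}
```

Compared with the $d\,n^{1/d}$ hops of plain CAN, the total no longer depends on d, while the routing table space remains $O(d)$.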
Our scheme for object routing and location (cont’d)
Stretch is an issue, which CAN tackles through topologically sensitive construction of the overlay network and other optimizations
Yet CAN does not provide a formal proof that bounds the stretch; instead, it gives simulation results
We are still working on this stretch issue
Other interesting related issues include
  Search ability
  Handling mutable objects
  Security

Sequence-Based Text Retrieval

What Is the Sequence-Based Retrieval Method?
Basically, the method treats both the user query and the target document as sequences
It uses sequence similarity metrics to judge the relevance between sequences

Comparison with Other Methods
Boolean model:
  The quantity and position of matched terms are not considered
  Query string: aba
  In this model, axxb and aba have the same importance
Vector model:
  The position of matched terms is not considered
  Query string: aba
  In this model, axxba and xxaba have the same importance

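The two examples above can be reproduced in a few lines of Python, with each character treated as a term. The slides do not name the sequence similarity metric that SIR uses, so the longest common substring length below is only a stand-in chosen for this sketch; the function names are likewise hypothetical.

```python
from collections import Counter


def boolean_match(query: str, doc: str) -> bool:
    """Boolean (AND) model: every distinct query term occurs in the document."""
    return set(query) <= set(doc)


def vector_score(query: str, doc: str) -> int:
    """Vector model: dot product of term counts; positions are ignored."""
    q, d = Counter(query), Counter(doc)
    return sum(q[t] * d[t] for t in q)


def longest_common_substring(a: str, b: str) -> int:
    """Length of the longest contiguous run shared by a and b (stand-in metric)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            if x == y:
                cur[j] = prev[j - 1] + 1  # extend the run ending at the previous pair
                best = max(best, cur[j])
        prev = cur
    return best


QUERY = "aba"
boolean_pair = (boolean_match(QUERY, "axxb"), boolean_match(QUERY, "aba"))
vector_pair = (vector_score(QUERY, "axxba"), vector_score(QUERY, "xxaba"))
sequence_scores = {doc: longest_common_substring(QUERY, doc)
                   for doc in ("axxb", "aba", "axxba", "xxaba")}
```

The Boolean model cannot separate axxb from aba, and the vector model cannot separate axxba from xxaba, whereas a sequence-based score ranks the documents that contain the query as a contiguous run higher.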
SIR System
SIR is a text retrieval system that implements the sequence-based method

Information Retrieval
Information:
  Doc 1, pos 1, 7, 51
  Doc 2, pos 3
  Doc 3, pos 11, 19

Candidate document selection
In the SIR system, we treat documents that contain any one word of the query string as candidate documents
This approach is inefficient: a huge number of irrelevant documents will be considered as relevant ones

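Candidate selection of this kind is a union over the postings of the query words. The Python sketch below assumes a positional index that maps each word to the documents and positions where it occurs; the data layout, the function names, and the three sample documents are illustrative assumptions, not SIR's actual implementation.

```python
from collections import defaultdict


def build_index(docs: dict) -> dict:
    """Map each word to {doc_id: [1-based word positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, word in enumerate(text.split(), start=1):
            index[word].setdefault(doc_id, []).append(pos)
    return dict(index)


def candidates(index: dict, query: str) -> set:
    """Documents that contain at least one word of the query."""
    found = set()
    for word in query.split():
        found.update(index.get(word, {}))
    return found


# Hypothetical document collection.
DOCS = {
    1: "web services brokering",
    2: "text retrieval system",
    3: "distributed systems",
}

INDEX = build_index(DOCS)
result = candidates(INDEX, "text retrieval services")
```

Document 1 becomes a candidate on the strength of a single shared word, which is exactly how irrelevant documents enter the candidate set.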
Multi-language support
The SIR system handles Chinese and English
However, the sequence-based approach is applicable to other languages

Basic differences between English and Chinese
In English text retrieval
  Word stemming
  Stop words
  A word is an appropriate unit for text retrieval
In Chinese text retrieval
  There are no clear word boundaries
  A character is an appropriate unit for text retrieval

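The difference in retrieval units can be made concrete with two small tokenizers in Python. This is a sketch under stated assumptions: the stop-word list is a tiny made-up sample, stemming is left out, and the function names are hypothetical rather than taken from SIR.

```python
# A tiny illustrative stop-word list; a real system would use a fuller one.
STOP_WORDS = frozenset({"a", "an", "the", "is", "of"})


def english_units(text: str) -> list:
    """Lowercased whitespace-separated words with stop words removed."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def chinese_units(text: str) -> list:
    """Individual characters, since word boundaries are not marked."""
    return [ch for ch in text if not ch.isspace()]


english = english_units("The retrieval of text")
chinese = chinese_units("文件檢索")  # "document retrieval"
```

Either list of units can then be treated as the sequence that the sequence-based method compares against the query.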
DEMO