YAPPERS: A Peer-to-Peer Lookup
Service over Arbitrary Topology
Qixiang Sun
Prasanna Ganesan
Hector Garcia-Molina
Stanford University
Outline
• Background and Motivation
• High-level overview of YAPPERS
• Brief evaluation
Problem
Where is X?
Problem (2)
1. Search
2. Node join/leave
3. Register/remove content
Background
• Gnutella-style
• join
– anywhere in the overlay
• register
– do nothing
• search
– flood the overlay
Background (2)
• Distributed hash table (DHT), e.g., Chord, CAN, ...
• join
– a unique location in the overlay
• register
– place pointer at a unique node
• search
– route towards the unique node
Background (3)
• Gnutella-style
+ Simple
+ Local control
+ Robust
+ Arbitrary topology
– Inefficient
– Disturbs many nodes
• DHT
+ Efficient search
– Restricted overlay
– Difficulty with dynamism
Motivation
• Best of both worlds
– Gnutella’s local interactions
– DHT-like efficiency
• Respect application-defined topology
– Social network
– Ad hoc wireless network
– Physical-network proximity
Partition Nodes
Given any overlay, first partition nodes into
buckets (colors) based on hash of IP
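A minimal sketch of the partitioning step, assuming SHA-1 as the hash and a hypothetical helper name `color_of`; the slides do not say which hash function is used, only that a node's color is derived from the hash of its IP address.

```python
import hashlib

def color_of(ip: str, num_colors: int) -> int:
    """Map an IP address string to a bucket (color) index in [0, num_colors).

    SHA-1 is an assumption made for illustration; any hash that spreads
    addresses evenly over the buckets would serve the same purpose.
    """
    digest = hashlib.sha1(ip.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_colors

# Example addresses are made up for illustration.
nodes = ["10.0.0.1", "10.0.0.2", "192.168.1.7"]
colors = {ip: color_of(ip, 4) for ip in nodes}
```

Because the color depends only on the node's own address, a node can compute it without coordinating with anyone, and every other node computes the same answer.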
Partition Nodes (2)
Around each node, there is at least one
node of each color
May require backup color assignments
Register Content
Partition content space into buckets (colors)
and register pointer at “nearby” nodes.
Nodes around Z form a small hash table!
– register red content locally
– register yellow content at a yellow node
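The registration rule can be sketched as follows. This is an illustration under stated assumptions: the names `within_hops` and `register` are hypothetical, the content's color is passed in directly (in practice it would come from hashing the content key), and the choice of the closest matching node is an assumption, since the slide only says the pointer goes to a “nearby” node of that color.

```python
from collections import deque

def within_hops(graph, start, h):
    """Return nodes within h hops of start, in breadth-first order."""
    dist = {start: 0}
    order = [start]
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if dist[u] == h:
            continue  # do not expand past the h-hop boundary
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                order.append(v)
                queue.append(v)
    return order

def register(graph, node_color, index, origin, key, color, h=2):
    """Store a pointer (key -> origin) at the closest node of `color`.

    Returns the node that holds the pointer, or None if no node of that
    color exists within h hops (the backup-assignment case).
    """
    for node in within_hops(graph, origin, h):
        if node_color[node] == color:
            index.setdefault(node, {}).setdefault(key, set()).add(origin)
            return node
    return None

# Toy neighborhood: Z is red, with a yellow and a blue neighbor.
graph = {"Z": ["P", "Q"], "P": ["Z"], "Q": ["Z"]}
node_color = {"Z": "red", "P": "yellow", "Q": "blue"}
index = {}
red_holder = register(graph, node_color, index, "Z", "a.mp3", "red")
yellow_holder = register(graph, node_color, index, "Z", "b.mp3", "yellow")
green_holder = register(graph, node_color, index, "Z", "c.mp3", "green")
```

In this toy example the red item stays at Z, the yellow item goes to the yellow neighbor P, and the green item finds no holder, which is the situation where a backup color assignment would be needed.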
Searching Content
Start at a “nearby” colored node, search
other nodes of the same color.
Searching Content (2)
Build a smaller overlay for each color and use a
Gnutella-style flood within it
Fan-out = degree of nodes in the smaller overlay
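A rough sketch of flooding within one color's overlay, assuming the per-color overlay is already available as an adjacency map. The names `flood_color`, `color_overlay`, and `index` are hypothetical, and how the per-color links are discovered is not covered here.

```python
from collections import deque

def flood_color(color_overlay, index, start, key):
    """Flood a query over links that join nodes of a single color.

    Returns (hits, contacted): the pointers found for `key` and the set of
    same-color nodes that received the query.
    """
    contacted = {start}
    hits = set()
    queue = deque([start])
    while queue:
        u = queue.popleft()
        hits |= index.get(u, {}).get(key, set())
        for v in color_overlay.get(u, []):
            if v not in contacted:
                contacted.add(v)
                queue.append(v)
    return hits, contacted

# Three yellow nodes; Y1 has fan-out 2 in the yellow overlay.
color_overlay = {"Y1": ["Y2", "Y3"], "Y2": ["Y1"], "Y3": ["Y1"]}
index = {"Y3": {"b.mp3": {"Z"}}}
hits, contacted = flood_color(color_overlay, index, "Y1", "b.mp3")
fan_out = len(color_overlay["Y1"])
```

Only nodes of the query's color are contacted, which is how the scheme avoids disturbing the rest of the network.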
Recap
• Hybrid approach
– Around each node, act like a hash table
– Flood the relevant nodes in the entire network
• What do we gain?
– Respect original overlay
– Efficient search for popular data
– Avoid disturbing nodes unnecessarily
Brief Evaluation
• Using a 24,702-node Gnutella snapshot
as the underlying overlay
• We study
– Number of nodes contacted per query when
searching the entire network
– Trade-off in using our hybrid approach when
flooding the entire network
Nodes Searched per Query
[Figure: fraction of nodes searched (y-axis, 0 to 0.3) vs. number of buckets (colors) (x-axis, 0 to 50). Annotation: limited by the number of nodes “nearby”.]
Trade-off
• Fan-out = degree of each colored node when
flooding “nearby” nodes of the same color
Average fan-out
– Vanilla: 835
– Heuristics: 82
• Good in searching nearby nodes quickly.
• Bad in searching the entire network
Conclusion
Does YAPPERS work?
– YES
• Respects the original overlay
• Searches efficiently in small area
• Disturbs fewer nodes than Gnutella
• Handles node arrival/departure better than DHT
– NO
• Large fan-out (vanilla flooding won’t work)
For More Information
• A short position paper advocating locally-organized P2P systems
http://dbpubs.stanford.edu/pub/2002-60
• Other P2P work at Stanford
http://www-db.stanford.edu/peers
Recap
• node join
– anywhere in the overlay
• register content
– at nearby node(s) of the appropriate color
• search
– start at a nearby node of the search color
and then flood nodes of the same color.
What Do We Gain?
• Respect original overlay
• Efficient search for popular data
• Avoid disturbing nodes unnecessarily
• Better handling of dynamic node arrival
and departure
Design Issues
• How to build a small hash table around
each node, i.e., assign colors?
• How to connect nodes of the same color?
Small-scale Hash Table
Small = all nodes within h hops (e.g., h=2)
– Consistent across overlapping hash tables
– Stable when nodes enter/leave
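As an illustration of what “all nodes within h hops” means, here is a small breadth-first sketch. The function name `neighborhood` and the toy path graph are assumptions made for the example.

```python
from collections import deque

def neighborhood(graph, start, h):
    """Return {node: hop distance} for every node within h hops of start."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if dist[u] == h:
            continue  # nodes at the boundary are included but not expanded
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# Path graph A - B - C - D - E.
graph = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
table_a = neighborhood(graph, "A", 2)
table_c = neighborhood(graph, "C", 2)
```

With h=2, the tables around A and C overlap at A, B, and C, which is why color assignments need to be consistent across overlapping hash tables.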
Small-scale Hash Table (2)
• Fixed number of buckets (colors)
• Determine bucket (color) based on the
hash value of node IP addresses
– Multiple nodes of the same color
– No nodes of a color
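The two cases above can be detected with a simple count over one neighborhood. This sketch only reports them; how a missing color is then covered is not specified on this slide, so no reassignment policy is shown. The name `color_census` is hypothetical.

```python
from collections import Counter

def color_census(neighborhood_colors, num_colors):
    """Report which colors are missing from, or duplicated in, a neighborhood.

    `neighborhood_colors` is the list of color indices of the nodes in one
    small hash table. Returns (missing, duplicated) as lists of color indices.
    """
    counts = Counter(neighborhood_colors)
    missing = [c for c in range(num_colors) if counts[c] == 0]
    duplicated = [c for c in range(num_colors) if counts[c] > 1]
    return missing, duplicated

# Four nodes, four colors: color 0 appears twice, color 1 not at all.
missing, duplicated = color_census([0, 0, 2, 3], 4)
```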
Searching the Overlay
Find another node of the same color in a
“nearby” hash table
[Figure: nodes A, B, and C; labels: “Frontier Node”, “All nodes within h hops”.]
Need to track all nodes within 2h+1 hops
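One way to see where 2h+1 comes from, assuming a frontier node is a node exactly h+1 hops away (just outside the h-hop hash table): the frontier node's own h-hop hash table then extends at most (h+1) + h = 2h+1 hops from the starting node. The sketch below checks this on a toy path graph; the function names and the graph are illustrative assumptions.

```python
from collections import deque

def hop_distances(graph, start, limit):
    """Return {node: hop distance} for nodes within `limit` hops of start."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if dist[u] == limit:
            continue
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def frontier(graph, start, h):
    """Nodes exactly h+1 hops from start (assumed definition of frontier)."""
    dist = hop_distances(graph, start, h + 1)
    return sorted(n for n, d in dist.items() if d == h + 1)

# Path graph n0 - n1 - ... - n6.
names = [f"n{i}" for i in range(7)]
graph = {n: [] for n in names}
for a, b in zip(names, names[1:]):
    graph[a].append(b)
    graph[b].append(a)

h = 2
front = frontier(graph, "n0", h)
reach = hop_distances(graph, front[0], h)       # the frontier node's table
from_start = hop_distances(graph, "n0", len(names))
farthest = max(from_start[n] for n in reach)    # hops from n0
```

Here the frontier of n0 is n3, and the farthest node in n3's hash table is n5, which is 5 = 2h+1 hops from n0.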
Searching the Overlay (2)
For a color C and each frontier node v,
1. determine which nodes v might
contact to search for color C
2. contact these nodes
Theorem: Regardless of starting node, one
can search all nodes of all colors.
Buckets per Node
• Using 32 buckets (colors) per hash table
[Figure: histogram of fraction of nodes (%) (y-axis, 0 to 30) vs. number of buckets (colors) per node (x-axis, 1 to 15). AVG = 3.7; 3.7 / 32 = 11.5%.]
Overloading a Node
• A node may have many colors even if it
has a large neighborhood.