SKIP GRAPHS (continued) James Aspnes Gauri Shah

SKIP GRAPHS
(continued)
Some slides adapted from the original slides by James Aspnes and Gauri Shah.
So far...
Decentralization.
Locality properties.
O(log n) neighbors per node.
O(log n) search, insert, and delete time.
Independent of system size.

Coming up...
• Load balancing.
• Tolerance to faults.
• Random faults.
• Adversarial faults.
• Self-stabilization.
Load balancing
Interested in average load on a node u.
i.e. how many searches from source
s to destination t use node u?
Theorem: Let V = {nodes v: u <= v <= t} with |V| = d+1, so that dist(u, t) = d.
Then the probability that a search from s to t passes through u is < 2/(d+1).
Skip list restriction
[Figure: the skip list formed from the lists of s at levels 0, 1, and 2, with nodes s, u, and t marked.]
Node u is on the search path from s to t only if it is in
the skip list formed from the lists of s at each level.
Tallest nodes
[Figure: two cases for a search from s to t: one where u is not on the path, and one where u is on the path.]
Node u is on the search path from s to t only if it is
in T = the set of k tallest nodes in the path [u..t].
$$\Pr[u \in T] = \sum_{k=1}^{d+1} \Pr[|T| = k] \cdot \frac{k}{d+1} = \frac{E[|T|]}{d+1}.$$
Heights independent of position, so distances are symmetric.
Load on node u
Start with n nodes. Each node goes to next set with prob. 1/2.
We want expected size of T = last non-empty set.
We show that: E[|T|] < 2.
Asymptotically: $E[|T|] = 1/\ln 2 \pm 2 \times 10^{-5} \approx 1.4427\ldots$ [Trie analysis]
Average load on a node is inversely proportional
to the distance from the destination.
We also show that the distribution of average load
declines exponentially beyond this point.
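The halving process described on this slide can be checked numerically. The sketch below is only an illustration, not the authors' analysis: the function names, the trial count, and the seed are assumptions made for the example. It starts with n nodes, moves each node to the next set with probability 1/2, and records the size of the last non-empty set T.

```python
import random

def last_nonempty_set_size(n, rng):
    """Size of the last non-empty set when every node in the current set
    independently moves on to the next set with probability 1/2."""
    alive = n
    while True:
        survivors = sum(1 for _ in range(alive) if rng.random() < 0.5)
        if survivors == 0:
            return alive  # the next set is empty, so this one is T
        alive = survivors

def estimate_expected_T(n, trials=2000, seed=1):
    """Monte Carlo estimate of E[|T|] (trials and seed are illustrative)."""
    rng = random.Random(seed)
    total = sum(last_nonempty_set_size(n, rng) for _ in range(trials))
    return total / trials

# The estimate should stay below the bound E[|T|] < 2 stated above.
print(estimate_expected_T(256))
```

Dividing the estimate by d+1 gives an empirical counterpart of the bound on the probability that a search passes through u.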
Experimental result
[Figure: load on node (y-axis, 0.0 to 1.1) versus node location (x-axis, 76400 to 76650), comparing expected load with actual load for destination = 76542.]
Fault tolerance
How do node failures affect skip graph performance?
Random failures: Randomly chosen nodes fail.
Experimental results.
Adversarial failures: Adversary carefully chooses
nodes that fail.
Bound on expansion ratio.
Random faults
[Figure: size of the largest connected component as a fraction of live nodes (y-axis, 0.00 to 1.20) versus probability of node failure (x-axis, 0.00 to 0.95), for 131072 nodes.]
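An experiment of this kind can be sketched in a few lines. This is a rough sketch, not the setup used for the slides. It assumes the usual skip graph construction, which this part of the deck does not restate: level 0 is one sorted list, and each list at level i splits into two lists at level i+1 according to a random bit per node. The names `build_skip_graph` and `largest_component_fraction` are hypothetical.

```python
import random

def build_skip_graph(n, rng):
    """Adjacency sets for a skip graph on keys 0..n-1.
    Assumption: each list splits into two sublists by one random bit per node."""
    adj = {x: set() for x in range(n)}
    groups = [list(range(n))]  # level 0: a single sorted list
    while groups:
        next_groups = []
        for g in groups:
            for a, b in zip(g, g[1:]):  # link consecutive nodes in this list
                adj[a].add(b)
                adj[b].add(a)
            if len(g) > 1:  # split by the next random bit, preserving key order
                zeros, ones = [], []
                for x in g:
                    (ones if rng.random() < 0.5 else zeros).append(x)
                next_groups += [zeros, ones]
        groups = [g for g in next_groups if g]
    return adj

def largest_component_fraction(adj, p_fail, rng):
    """Fail each node independently with probability p_fail, then return the
    size of the largest connected component as a fraction of live nodes."""
    live = {x for x in adj if rng.random() >= p_fail}
    if not live:
        return 0.0
    seen, best = set(), 0
    for start in live:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # depth-first search restricted to live nodes
            x = stack.pop()
            size += 1
            for y in adj[x]:
                if y in live and y not in seen:
                    seen.add(y)
                    stack.append(y)
        best = max(best, size)
    return best / len(live)

rng = random.Random(7)
graph = build_skip_graph(1024, rng)
for p in (0.1, 0.3, 0.5, 0.7):
    print(p, largest_component_fraction(graph, p, rng))
```

The graph here is far smaller than the 131072 nodes in the plot, so the numbers it prints are only indicative.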
Searches with random failures
[Figure: fraction of failed searches (y-axis, 0.00 to 0.25) versus probability of node failure (x-axis, 0.0 to 0.6), for 131072 nodes and 10000 messages.]
Adversarial faults
[Figure: a node set A and its boundary dA.]
dA = nodes adjacent to A but not in A.
Expansion ratio = min |dA|/|A|, taken over all sets A with 1 <= |A| <= n/2.
Theorem: A skip graph with n nodes has
expansion ratio = Ω(1/log n).
f failures can isolate only O(f • log n) nodes.
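For a very small graph the expansion ratio can be computed directly from the definition by enumerating every set A. The sketch below is a brute-force illustration that is only feasible for a handful of nodes. The four-node example graph is hypothetical: its level 0 list is 1-2-3-4, and its level 1 lists are assumed to be {1, 3} and {2, 4}.

```python
from itertools import combinations

def boundary(adj, A):
    """dA: nodes adjacent to some node of A but not themselves in A."""
    return {y for x in A for y in adj[x]} - set(A)

def expansion_ratio(adj):
    """min |dA| / |A| over all node sets A with 1 <= |A| <= n/2 (brute force)."""
    nodes = sorted(adj)
    best = float("inf")
    for size in range(1, len(nodes) // 2 + 1):
        for A in combinations(nodes, size):
            best = min(best, len(boundary(adj, A)) / size)
    return best

# Hypothetical 4-node skip graph: level 0 links 1-2, 2-3, 3-4;
# level 1 links 1-3 and 2-4.
example = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {2, 3}}
print(expansion_ratio(example))  # 1.0
```

In this example every two-node set A has both remaining nodes on its boundary, so the minimum is reached at |A| = 2.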
Need for repair mechanism
[Figure: a skip graph on nodes A, G, J, M, R, and W, drawn at levels 0, 1, and 2.]
Node failures can leave skip graph in inconsistent state.
Ideal skip graph
Let $xR_i$ ($xL_i$) be the right (left) neighbor of x at level i.
If $xL_i$, $xR_i$ exist:
$xL_i < x < xR_i$.
$xL_iR_i = xR_iL_i = x$.
Invariant (for some $k \ge 1$, i.e. k hops at level i-1):
$xL_i = xL_{i-1}^{k}$.
$xR_i = xR_{i-1}^{k}$.
[Figure: successor constraints. At level i, x (..00..) has right neighbor $xR_i$ (..00..); at level i-1, x reaches it through $xR_{i-1}^{1}$ (..01..) and $xR_{i-1}^{2}$ (..00..).]
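The conditions on this slide can be written as a small consistency check. The sketch below assumes a hypothetical representation in which `right[i][x]` and `left[i][x]` hold the right and left neighbor of x at level i, with absent keys meaning no neighbor. It is an illustration of the invariants, not the repair mechanism itself.

```python
def walk(links, x, target):
    """True if target is reached from x by following one or more links."""
    seen = set()
    x = links.get(x)
    while x is not None and x not in seen:
        if x == target:
            return True
        seen.add(x)
        x = links.get(x)
    return False

def is_ideal(left, right):
    """Check ordering, link symmetry, and the level i / level i-1 invariant."""
    for i in range(len(right)):
        for x, r in right[i].items():
            # x < xR_i and xR_i L_i = x
            if not (x < r and left[i].get(r) == x):
                return False
            # xR_i = xR_{i-1}^k for some k >= 1
            if i > 0 and not walk(right[i - 1], x, r):
                return False
        for x, l in left[i].items():
            # xL_i < x and xL_i R_i = x
            if not (l < x and right[i].get(l) == x):
                return False
            # xL_i = xL_{i-1}^k for some k >= 1
            if i > 0 and not walk(left[i - 1], x, l):
                return False
    return True

# Hypothetical example: level 0 is 1-2-3-4, level 1 has lists 1-3 and 2-4.
right = [{1: 2, 2: 3, 3: 4}, {1: 3, 2: 4}]
left = [{2: 1, 3: 2, 4: 3}, {3: 1, 4: 2}]
print(is_ideal(left, right))  # True

# Node 3 dropped from level 0 only: its level 1 link now violates the invariant.
broken_right = [{1: 2, 2: 4}, {1: 3, 2: 4}]
broken_left = [{2: 1, 4: 2}, {3: 1, 4: 2}]
print(is_ideal(broken_left, broken_right))  # False
```

The second case is a neighbor at level i that is not present at level i-1, which is the kind of inconsistency a repair mechanism has to detect.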
Basic repair
If a node detects a missing neighbor, it tries
to patch the link using other levels.
[Figure: numbered nodes 1 through 6 drawn at several levels, showing a missing link being patched through other levels.]
Also relink at other lower levels.
Successor constraints may be violated by node
arrivals or failures.
Constraint violation
Neighbor at level i not present at level (i-1).
[Figure: node x and its neighbors at level i and level i-1, labeled ..00.. and ..01.., before and after a zipper operation.]
Self-stabilization
[Figure: nodes A through J at level i exchanging zipperOp messages: zOp(A), zOp(B), zOp(D), zOp(E), zOp(F), zOp(I).]
Eventually want each connected component of the skip
graph to reorganize itself into an ideal skip graph.
Conclusions
Similarities with DHTs
• Decentralization.
• O(log n) space at each node.
• O(log n) search time.
• Load balancing properties.
• Tolerant of random faults.
Differences
Property                          DHTs          Skip Graphs
Insert/Delete time                O(log^2 n)    O(log n)
Locality                          No            Yes
Repair mechanism                  ?             Partial
Tolerance of adversarial faults   ?             Yes
Keyspace size                     Reqd.         Not reqd.
Open Problems
• Design efficient repair mechanism.
• Incorporate geographical proximity.
• Study multi-dimensional skip graphs.
• Evaluate performance in practice.