Adaptive Replication to Achieve O(1) Lookup Time in DHT

Beehive: Achieving O(1) Lookup Performance in P2P Overlays for Zipf-like Query Distributions
Venugopalan Ramasubramanian (Rama) and Emin Gün Sirer
Cornell University
introduction
caching is widely used to improve latency and to decrease overhead
passive caching
caches distributed throughout the network store objects that are encountered
not well-suited for a large class of applications
problems with passive caching
no performance guarantees
heavy-tail effect: a large percentage of queries go to unpopular objects
ad-hoc heuristics for cache management
introduces coherency problems: difficult to locate all copies, weak consistency model
overview of beehive
general replication framework for structured DHTs
retains decentralization, self-organization, and resilience properties
high performance: O(1) average lookup time
scalable: minimizes the number of replicas and reduces storage, bandwidth, and network load
adaptive: promptly responds to changes in popularity (flash crowds)
prefix-matching DHTs
[figure: lookup for object 0121 routed 2012 → 0021 → 0112 → 0122, matching one more prefix digit per hop]
log_b(N) hops: several RTTs on the Internet
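
For illustration (not from the talk), here is a minimal sketch of prefix-digit routing over the node IDs in the figure; the node set and the next-hop selection rule are simplified stand-ins for a Pastry-style routing table:

```python
# Minimal sketch of prefix-matching routing (base b = 4, 4-digit IDs),
# illustrating why a lookup takes about log_b(N) hops: each hop fixes
# at least one more digit of the key.

def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits two IDs have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def route(start: str, key: str, nodes: list[str]) -> list[str]:
    """Route toward the key, one matched prefix digit per hop."""
    path, current = [start], start
    while True:
        match = shared_prefix_len(current, key)
        # In Pastry, row `match` of the routing table holds nodes that
        # share exactly `match + 1` digits with the key; we emulate that.
        candidates = [n for n in nodes if shared_prefix_len(n, key) == match + 1]
        if not candidates:
            return path  # current node is the home node for this key
        current = candidates[0]
        path.append(current)

nodes = ["2012", "0021", "0112", "0122"]
print(route("2012", "0121", nodes))  # ['2012', '0021', '0112', '0122']
```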
key intuition
tunable latency
[figure: the same lookup for object 0121; replicating the object at nodes 2012, 0021, and 0112 cuts hops off the path]
adjust the number of objects replicated at each level
fundamental space-time tradeoff
analytical model
optimization problem
minimize: total number of replicas, s.t. average lookup performance ≤ C
configurable target lookup performance C: continuous range, sub one-hop
minimizing the number of replicas decreases storage and bandwidth overhead
analytical model
zipf-like query distributions with parameter α
number of queries to the r-th most popular object ∝ 1/r^α
fraction of queries for the m most popular objects ≈ (m^(1−α) − 1) / (M^(1−α) − 1), where M is the total number of objects
level of replication
an object replicated at level i is stored on all nodes that share i prefix-digits with it (N/b^i nodes), giving i-hop lookup latency
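
To make the Zipf formula concrete, a small worked example (not in the original slides); the parameters α = 0.91 and M = 40960 are taken from the evaluation later in the talk:

```python
# Fraction of queries captured by replicating the m most popular of M
# objects under a Zipf-like distribution with parameter alpha.

def zipf_fraction(m: int, M: int, alpha: float) -> float:
    return (m ** (1 - alpha) - 1) / (M ** (1 - alpha) - 1)

# With alpha = 0.91 and M = 40960 (the MIT DNS trace used later),
# replicating the top 1% of objects captures under half the queries:
print(round(zipf_fraction(410, 40960, 0.91), 2))  # -> 0.45
```

This heavy tail is why passively caching only the hottest objects cannot reach an O(1) average: a large share of queries still falls on the unpopular tail.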
optimization problem
minimize (storage/bandwidth):
x_0 + x_1/b + x_2/b^2 + … + x_(K−1)/b^(K−1)
such that (average lookup time is at most C hops):
K − (x_0^(1−α) + x_1^(1−α) + x_2^(1−α) + … + x_(K−1)^(1−α)) ≤ C
and
x_0 ≤ x_1 ≤ x_2 ≤ … ≤ x_(K−1) ≤ 1
b: base
K: log_b(N)
x_i: fraction of objects replicated at level i or lower
optimal closed-form solution
x*_i = [ d^i (K′ − C) / (1 + d + … + d^(K′−1)) ]^(1/(1−α)), for 0 ≤ i ≤ K′ − 1
x*_i = 1, for K′ ≤ i ≤ K
where d = b^((1−α)/α)
K′ (typically 2 or 3) is the number of partially replicated levels, determined by requiring x*_(K′−1) ≤ 1:
d^(K′−1) (K′ − C) / (1 + d + … + d^(K′−1)) ≤ 1
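
The closed form above is straightforward to compute; the following sketch (my code, not the authors') solves for the x*_i and K′ given α, b, C, and K, and verifies the latency constraint:

```python
# Sketch of the closed-form solution: fraction x_i of objects to
# replicate at each level i, for Zipf parameter alpha, DHT base b,
# target average lookup C (in hops), and K = log_b(N) levels.

def replication_fractions(alpha: float, b: int, C: float, K: int) -> list[float]:
    d = b ** ((1 - alpha) / alpha)
    # K' = number of partially replicated levels: the largest value for
    # which the closed form keeps x*_{K'-1} <= 1 (typically 2 or 3).
    Kp = K
    while Kp > 1:
        denom = sum(d ** j for j in range(Kp))  # 1 + d + ... + d^(K'-1)
        if d ** (Kp - 1) * (Kp - C) / denom <= 1:
            break
        Kp -= 1
    denom = sum(d ** j for j in range(Kp))
    return [
        (d ** i * (Kp - C) / denom) ** (1 / (1 - alpha)) if i < Kp else 1.0
        for i in range(K)
    ]

# Example: alpha = 0.91, base b = 4, C = 1 hop, N = 1024 nodes (K = 5).
xs = replication_fractions(0.91, 4, 1.0, 5)
# Check the constraint: average lookup = K - sum_i x_i^(1-alpha) = C.
print([round(x, 4) for x in xs], 5 - sum(x ** 0.09 for x in xs))
```

With these example parameters only about 0.4% of objects are replicated everywhere (level 0), yet the average lookup comes out to exactly the one-hop target, which is the space-time tradeoff the model exploits.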
latency-overhead tradeoff [figure]
beehive: system overview
estimation: popularity of objects and the zipf parameter, via local measurement and limited aggregation
replication: apply the analytical model independently at each node; push new replicas to nodes at most one hop away (see the sketch below)
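
A sketch of how a node might map estimated popularity ranks to replication levels using the fractions from the analytical model; the function and the hard-coded fractions are illustrative, not Beehive's actual code:

```python
# Map a popularity rank (0 = most popular of M objects) to a replication
# level, given cumulative fractions xs[i] = fraction of objects that
# should be replicated at level i or lower.

def replication_level(rank: int, M: int, xs: list[float]) -> int:
    for i, x in enumerate(xs):
        if rank < x * M:
            return i
    return len(xs)  # beyond all quotas: kept only at the home node

# Fractions from the closed-form sketch above (alpha=0.91, b=4, C=1, K=5):
xs = [0.0037, 0.0168, 0.0770, 0.3530, 1.0]
print(replication_level(0, 40960, xs))      # hottest object -> level 0
print(replication_level(5000, 40960, xs))   # mid-popularity -> level 3
print(replication_level(40000, 40960, xs))  # unpopular tail -> level 4
```

Because every node runs the same model over its own popularity estimates, replication decisions need no global coordination.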
beehive replication protocol
[figure: replication of object 0121, whose home node is E (level L3, entry "0 1 2 *"). Nodes B, E, and I hold level-L2 replicas (entry "0 1 *"); nodes A through I hold level-L1 replicas (entry "0 *"). Each node receives its replica from a node one hop closer to the home node.]
mutable objects
leverage the underlying structure of the DHT
the replication level indicates the locations of all the replicas
proactive propagation to all nodes from the home node
the home node sends to its one-hop neighbors with i matching prefix-digits
level i nodes send to level i+1 nodes
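
A schematic, self-contained sketch of this cascading push; the explicit wiring below is a stand-in for the fan-out that Beehive gets for free from each node's routing table:

```python
# Schematic sketch of proactive update propagation. Because replicas
# live exactly on the nodes reachable level by level from the home
# node, an update reaches every copy without tracking replica lists.

class Node:
    def __init__(self, name: str):
        self.name = name
        self.next_level: list["Node"] = []  # replica holders one level out
        self.version = 0

    def push_update(self, version: int) -> None:
        """Store the new version, then forward it one level outward."""
        self.version = version
        for node in self.next_level:
            node.push_update(version)

# Tiny example mirroring the figure: home node E fans out to the
# level-2 holders B and I, which fan out to level-1 holders.
home, b, i = Node("E"), Node("B"), Node("I")
home.next_level = [b, i]
b.next_level = [Node("A"), Node("C")]
i.next_level = [Node("G"), Node("H")]
home.push_update(version=1)
```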
implementation and evaluation
implemented using Pastry as the underlying DHT
evaluation using a real DNS workload:
MIT DNS trace (zipf parameter 0.91)
1024 nodes, 40960 objects
compared with passive caching on Pastry
main properties evaluated:
lookup performance
storage and bandwidth overhead
adaptation to changes in query distribution
evaluation: lookup performance
passive caching is not very effective because of the heavy-tail query distribution and mutable objects; beehive converges to the target of 1 hop
evaluation: overhead
[figures: bandwidth and storage overhead over time]

average number of replicas per node:
Pastry      40
PC-Pastry   420
Beehive     380
evaluation: flash crowds
[figure: lookup performance during a flash crowd]
evaluation: zipf parameter change
[figure: lookup performance as the zipf parameter changes]
Cooperative Domain Name System (CoDoNS)
replacement for legacy DNS
secure authentication through DNSSEC
incremental deployment path:
completely transparent to clients
uses legacy DNS to populate resource records on demand
deployed on PlanetLab
advantages of CoDoNS
higher performance than legacy DNS: median latency of 7 ms for CoDoNS (PlanetLab) vs. 39 ms for legacy DNS
resilience against denial-of-service attacks
self-configuration after host and network failures
fast update propagation
conclusions
model-driven proactive caching: O(1) lookup performance with an optimal number of replicas
beehive: a general replication framework
works on structured overlays with uniform fan-out
high performance, resilience, improved availability
well-suited for latency-sensitive applications
www.cs.cornell.edu/people/egs/beehive
evaluation: zipf parameter change [figure]
evaluation: instantaneous bandwidth overhead [figure]
lookup performance: target 0.5 hops [figure]
lookup performance: PlanetLab [figure]
typical values of zipf parameter
MIT DNS trace: α = 0.91
Web traces:

trace   α
Dec     0.83
UPisa   0.84
FuNet   0.84
UCB     0.83
Quest   0.88
NLANR   0.90
comparative overview of structured DHTs

DHT                                                                        lookup performance
CAN                                                                        O(dN^(1/d))
Chord, Kademlia, Pastry, Tapestry, Viceroy                                 O(log N)
de Bruijn graphs (Koorde)                                                  O(log N / log log N)
Kelips, Salad, [Gupta, Liskov, Rodriguez], [Mizrak, Cheng, Kumar, Savage]  O(1)
O(1) structured DHTs

DHT                              lookup performance (hops)   routing state
Salad                            d                           O(dN^(1/d))
[Mizrak, Cheng, Kumar, Savage]   2                           √N
Kelips                           1                           √N (√N-fold replication)
[Gupta, Liskov, Rodriguez]       1                           N
security issues in beehive
underlying DHT: corruption in routing tables [Castro, Druschel, Ganesh, Rowstron, Wallach]
beehive: misrepresentation of popularity; remove outliers
application: corruption of data; certificates (e.g. DNSSEC)
Beehive DNS: Lookup Performance

                 CoDoNS    Legacy DNS
median           6.56 ms   38.8 ms
90th percentile  281 ms    337 ms