Adaptive Replication to Achieve O(1) Lookup Time in DHT
Beehive: Achieving O(1) Lookup
Performance in P2P Overlays for Zipf-like Query Distributions
Venugopalan Ramasubramanian (Rama)
and
Emin Gün Sirer
Cornell University
introduction
caching is widely-used to improve
latency and to decrease overhead
passive caching
caches distributed throughout the network
store objects that are encountered
not well-suited for a large class of applications
problems with passive caching
no performance guarantees
heavy-tail effect
large percentage of queries to unpopular
objects
ad-hoc heuristics for cache management
introduces coherency problems
difficult to locate all copies
weak consistency model
overview of beehive
general replication framework for structured
DHTs
decentralization, self-organization, resilience
properties
high performance: O(1) average lookup time
scalable: minimize number of replicas and reduce
storage, bandwidth, and network load
adaptive: promptly respond to changes in
popularity – flash crowds
prefix-matching DHTs
[Figure: lookup for object 0121 routed 2012 → 0021 → 0112 → 0122, matching one more prefix digit at each hop]
logb(N) hops
several RTTs on the Internet
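The prefix-matching lookup described above can be sketched in a few lines. This is my own minimal illustration (not code from the talk), assuming the node IDs shown in the slide's figure (2012, 0021, 0112, 0122) and greedy per-digit routing:

```python
# Sketch of prefix-matching routing as in Pastry/Tapestry-style DHTs.
# IDs are strings of base-b digits; each hop fixes at least one more
# prefix digit, so a lookup takes at most log_b(N) hops.

def matched_prefix_len(node_id: str, key: str) -> int:
    """Number of leading digits node_id shares with the target key."""
    n = 0
    for a, c in zip(node_id, key):
        if a != c:
            break
        n += 1
    return n

def route(nodes, key):
    """Greedy prefix routing starting from nodes[0]: repeatedly jump to
    any node that matches strictly more prefix digits of the key."""
    current = nodes[0]
    hops = [current]
    while matched_prefix_len(current, key) < len(key):
        level = matched_prefix_len(current, key)
        nxt = next((n for n in nodes if matched_prefix_len(n, key) > level), None)
        if nxt is None:          # no closer node: current is the home node
            break
        current = nxt
        hops.append(current)
    return hops

# Lookup for object 0121 starting from node 2012 (IDs from the figure):
print(route(["2012", "0021", "0112", "0122"], "0121"))
# → ['2012', '0021', '0112', '0122']
```

Each hop here improves the matched prefix by exactly one digit, which is why replicating an object at a lower level (next slide) removes hops from the tail of this path.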
key intuition
tunable latency:
adjust number of objects replicated at each level
[Figure: same lookup example (nodes 0021, 0112, 0122, 2012); replicating an object at lower levels removes hops from the lookup path]
fundamental space-time tradeoff
analytical model
optimization problem
minimize: total number of replicas, s.t.
average lookup performance ≤ C hops
configurable target lookup performance:
continuous range, sub-one-hop
minimizing number of replicas decreases
storage and bandwidth overhead
analytical model
zipf-like query distributions with parameter α
number of queries to the rth most popular object ∝ 1/r^α
fraction of queries for the m most popular objects ≈ (m^(1-α) - 1) / (M^(1-α) - 1)
level of replication i: object replicated on all nodes that share i prefix-digits with it
→ i-hop lookup latency, replicated on N/b^i nodes
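The fraction-of-queries formula above makes the heavy tail concrete. This sketch is my own (not from the talk); it assumes α ≠ 1 (the closed form divides by zero otherwise) and plugs in the trace parameters reported later in the evaluation (α = 0.91, 40960 objects):

```python
# Fraction of queries captured by the m most popular of M objects under a
# Zipf-like distribution with parameter alpha (closed form from the slide).
# Assumes alpha != 1.

def zipf_fraction(m: int, M: int, alpha: float) -> float:
    return (m ** (1 - alpha) - 1) / (M ** (1 - alpha) - 1)

# With alpha = 0.91 (MIT DNS trace) and M = 40960 objects, the top 10% of
# objects capture only about 70% of queries -- the remaining ~30% hit the
# long tail, which is why passive caching struggles:
print(zipf_fraction(4096, 40960, 0.91))
```

The flatter the distribution (smaller α), the worse this gets, which is exactly the regime Beehive's model targets.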
optimization problem
minimize (storage/bandwidth):
x0 + x1/b + x2/b^2 + … + xK-1/b^(K-1)
such that (average lookup time ≤ C hops):
K - (x0^(1-α) + x1^(1-α) + x2^(1-α) + … + xK-1^(1-α)) ≤ C
and
0 ≤ x0 ≤ x1 ≤ x2 ≤ … ≤ xK-1 ≤ 1
b: base
K: logb(N)
xi: fraction of objects replicated at level i
optimal closed-form solution
x*i = [ d^i (K' - C) / (1 + d + … + d^(K'-1)) ]^(1/(1-α)) ,  0 ≤ i ≤ K'-1
x*i = 1 ,  K' ≤ i ≤ K
where d = b^((1-α)/α)
K' is determined by setting x*K'-1 ≤ 1 (typically K' is 2 or 3):
d^(K'-1) (K' - C) / (1 + d + … + d^(K'-1)) ≤ 1
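The closed-form solution can be evaluated directly. The sketch below is my own transcription of the formula above, assuming α < 1 and C < K; the parameters in the example (b = 32, α = 0.9, C = 0.5 hops, K = 2) are illustrative, not values from the talk:

```python
# Optimal replication fractions x*_i from the Beehive closed form.
# b: base of the DHT, alpha: Zipf parameter (< 1), C: target average
# lookup in hops (< K), K = log_b(N) levels.

def beehive_levels(b: int, alpha: float, C: float, K: int):
    d = b ** ((1 - alpha) / alpha)
    # Choose K': the largest level count for which the closed-form
    # fraction x*_{K'-1} stays <= 1.
    K_prime, denom = 1, 1.0
    for k in range(1, K + 1):
        dsum = sum(d ** j for j in range(k))
        if d ** (k - 1) * (k - C) / dsum <= 1:
            K_prime, denom = k, dsum
    x = [(d ** i * (K_prime - C) / denom) ** (1 / (1 - alpha))
         for i in range(K_prime)]
    x += [1.0] * (K - K_prime)   # levels K' .. K-1: no extra replication
    return x

# A 1024-node, base-32 overlay (K = 2) targeting half-a-hop lookups:
print(beehive_levels(32, 0.9, 0.5, 2))
```

By construction the result satisfies the constraint exactly: K minus the sum of x_i^(1-α) equals the target C, and the fractions are nondecreasing in i.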
latency-overhead tradeoff
beehive: system overview
estimation
popularity of objects, zipf parameter
local measurement, limited aggregation
replication
apply analytical model independently at each node
push new replicas to nodes at most one hop away
L1
beehive replication protocol
[Figure: replication of object 0121 across levels L1-L3. E is the home node (prefix 012*, level L3). Level-2 replicas (prefix 01*) are pushed to B and I; level-1 replicas (prefix 0*) reach A, B, C, D, E, F, G, H, and I.]
mutable objects
leverage the underlying structure of the DHT
replication level indicates the locations of all the
replicas
proactive propagation to all nodes from the
home node
home node sends to one-hop neighbors with i
matching prefix-digits
level i nodes send to level i+1 nodes
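Because the replication level pins down exactly which nodes hold copies, the set of replicas to update is computable from the object ID alone. This is a toy sketch of that property (my own, using hypothetical 2-digit base-4 IDs, not Beehive's wire protocol):

```python
import itertools

def replica_holders(object_id: str, level: int, node_ids):
    """All nodes that must hold a replica of an object at this level:
    every node whose ID shares the object's first `level` digits."""
    prefix = object_id[:level]
    return {n for n in node_ids if n.startswith(prefix)}

# Toy network: all 2-digit base-4 IDs (N = 16 nodes).
nodes = {"".join(p) for p in itertools.product("0123", repeat=2)}

# An object with ID "01" replicated at level 1 must reach every node whose
# ID starts with "0" -- exactly N / b**1 = 4 nodes:
holders = replica_holders("01", 1, nodes)
print(sorted(holders))   # ['00', '01', '02', '03']
```

This determinism is what gives Beehive its strong consistency story: unlike passive caching, the home node never has to hunt for stray copies when an object changes.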
implementation and evaluation
implemented using Pastry as the underlying DHT
evaluation using a real DNS workload
MIT DNS trace (zipf parameter α = 0.91)
1024 nodes, 40960 objects
compared with passive caching on pastry
main properties evaluated
lookup performance
storage and bandwidth overhead
adaptation to changes in query distribution
evaluation: lookup performance
passive caching is not very effective because of the heavy-tail query distribution and mutable objects
beehive converges to the target of 1 hop
evaluation: overhead
[Figures: bandwidth and storage overhead]
average number of replicas per node:
Pastry      40
PC-Pastry   420
Beehive     380
evaluation: flash crowds
lookup performance
evaluation: zipf parameter change
Cooperative Domain Name System
(CoDoNS)
replacement for legacy DNS
secure authentication through DNSSEC
incremental deployment path
completely transparent to clients
uses legacy DNS to populate resource
records on demand
deployed on PlanetLab
advantages of CoDoNS
higher performance than legacy DNS
median latency of 7 ms for CoDoNS (on PlanetLab), 39 ms for legacy DNS
resilience against denial of service
attacks
self configuration after host and network
failures
fast update propagation
conclusions
model-driven proactive caching
O(1) lookup performance with optimal replicas
beehive: a general replication framework
structured overlays with uniform fan-out
high performance, resilience, improved availability
well-suited for latency sensitive applications
www.cs.cornell.edu/people/egs/beehive
evaluation: zipf parameter change
evaluation: instantaneous
bandwidth overhead
lookup performance: target 0.5 hops
lookup performance: PlanetLab
typical values of zipf parameter
MIT DNS trace: α = 0.91
Web traces:
trace   Dec    UPisa   FuNet   UCB    Quest   NLANR
α       0.83   0.84    0.84    0.83   0.88    0.90
comparative overview of structured DHTs

DHT                                           lookup performance
CAN                                           O(d N^(1/d))
Chord, Kademlia, Pastry, Tapestry, Viceroy    O(log N)
de Bruijn graphs (Koorde)                     O(log N / log log N)
Kelips, Salad, [Gupta, Liskov, Rodrigues],    O(1)
[Mizrak, Cheng, Kumar, Savage]
O(1) structured DHTs
DHT                              lookup performance   routing state
Salad                            d hops               O(d N^(1/d))
[Mizrak, Cheng, Kumar, Savage]   2 hops               O(√N)
Kelips                           1 hop                O(√N), with √N replication
[Gupta, Liskov, Rodrigues]       1 hop                O(N)
security issues in beehive
underlying DHT
corruption in routing tables
[Castro, Druschel, Ganesh, Rowstron, Wallach]
beehive
misrepresentation of popularity
remove outliers
application
corruption of data
certificates (e.g., DNSSEC)
Beehive DNS: Lookup Performance
                  CoDoNS     Legacy DNS
median            6.56 ms    38.8 ms
90th percentile   281 ms     337 ms