Distributed Hash Tables

David Tam
Patrick Pang
Presentation Outline
• What is DHT (Distributed Hash Table)?
• Why DHTs?
• Applications
• How lookup works?
• Alternatives to DHTs
• Performance – Routing
• Performance – Load Balancing
• Security – Routing Attack
• Security – Inconsistent Behaviour
• Comparison to Other Facilities
• Current Research Projects
• Conclusion
What is DHT?
[Figure: a distributed application calls put(key, data) and get(key) on the distributed hash table, which is spread across many nodes and returns the data.]
A DHT provides the information lookup service for P2P applications.
• Nodes uniformly distributed across key space
• Nodes form an overlay network
• Nodes maintain list of neighbours in routing table
• Decoupled from physical network topology
(Figure adopted from Frans Kaashoek)
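The put/get interface on this slide can be sketched in a few lines. This is a minimal single-process illustration, not a real DHT: the class name `ToyDHT`, the 32-bit identifier space, the use of SHA-1, and the rule that a key lives on the first node at or after its hash (the convention Chord uses) are all assumptions made for the example.

```python
import hashlib
from bisect import bisect_left


class ToyDHT:
    """Hypothetical stand-in for a DHT: every 'node' is just a local dict."""

    def __init__(self, node_names, bits=32):
        self.ring = 2 ** bits
        # Hashing the node name spreads node ids across the key space.
        self.stores = {self._hash(name): {} for name in node_names}
        self.ids = sorted(self.stores)

    def _hash(self, text):
        digest = hashlib.sha1(text.encode("utf-8")).digest()
        return int.from_bytes(digest, "big") % self.ring

    def _owner(self, key):
        # Assumed placement rule: first node id at or after hash(key), wrapping.
        i = bisect_left(self.ids, self._hash(key))
        return self.ids[i % len(self.ids)]

    def put(self, key, data):
        self.stores[self._owner(key)][key] = data

    def get(self, key):
        return self.stores[self._owner(key)].get(key)


dht = ToyDHT(["node-a", "node-b", "node-c", "node-d"])
dht.put("report.pdf", b"contents")
print(dht.get("report.pdf"))
```

The application only sees put and get; which node stores the data is decided by the hash, which is what decouples the table from the physical network topology.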
Why DHTs?
Why Middleware?
• Simplifies the development of large-scale distributed applications
• Better security and robustness
• Simple API
Why Do We Need DHTs?
• Simplifies the development of large-scale distributed applications
• Better security and robustness
• Simple API
• Exploits P2P resources
Applications
• Anything that requires a hash table
• Databases, file systems, storage, archival
• Web serving, caching
• Content distribution
• Query & indexing
• Naming systems
• Communication primitives
• Chat services
• Application-layer multicasting
• Event notification services
• Publish/subscribe systems?
How lookup works?
Example: Chord [Stoica et al.]
Finger Table for Node 2
start  interval  succ.
3      [3,4)     5
4      [4,6)     5
6      [6,10)    7
10     [10,2)    10
[Figure: Chord ring with identifiers 0 to 15]
How lookup works?
Example: Chord
Finger Table for Node 10
start  interval  succ.
11     [11,12)   12
12     [12,14)   12
14     [14,2)    14
2      [2,10)    2
[Figure: Chord ring with identifiers 0 to 15]
How lookup works?
Example: Chord
Finger Table for Node 14
start  interval  succ.
15     [15,0)    15
0      [0,2)     1
2      [2,6)     2
6      [6,14)    7
[Figure: Chord ring with identifiers 0 to 15]
How lookup works?
Example: Chord
Now Node 2 can retrieve information for key 0 from Node 1.
[Figure: Chord ring with identifiers 0 to 15]
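The walk-through on these slides (Node 2 asks Node 10, Node 10 refers to Node 14, Node 14 points at Node 1) can be reproduced with a short sketch. It follows the slides' simplified rule of routing via the finger whose interval contains the key, which is not the full Chord protocol. The node set below is an assumption: it is the smallest set of identifiers consistent with the finger tables shown, and all function names are made up for illustration.

```python
from bisect import bisect_left

M = 4                      # identifier bits, so the ring has 2**M = 16 positions
RING = 2 ** M
# Assumed node set, chosen to match the successor columns on the slides.
NODES = sorted([1, 2, 5, 7, 10, 12, 14, 15])


def successor(ident):
    """First node at or clockwise after ident."""
    i = bisect_left(NODES, ident % RING)
    return NODES[i % len(NODES)]


def finger_table(n):
    """Rows of (start, interval_end, succ); each interval is [start, interval_end)."""
    rows = []
    for k in range(M):
        start = (n + 2 ** k) % RING
        end = (n + 2 ** (k + 1)) % RING
        rows.append((start, end, successor(start)))
    return rows


def lookup(origin, key):
    """Return (owner, path), where path lists the nodes consulted in order."""
    node, path = origin, [origin]
    while key != node:
        # Pick the finger whose half-open interval contains the key.
        start, end, succ = next(
            row for row in finger_table(node)
            if (key - row[0]) % RING < (row[1] - row[0]) % RING
        )
        # If the key lies between start and succ, succ is the first node
        # at or after the key, so succ owns it.
        if (key - start) % RING <= (succ - start) % RING:
            return succ, path
        node = succ            # otherwise succ precedes the key: ask it next
        path.append(node)
    return node, path


print(finger_table(2))
print(lookup(2, 0))
```

With the assumed node set, `lookup(2, 0)` consults Nodes 2, 10 and 14 and names Node 1 as the owner of key 0, matching the slides.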
Alternatives to DHTs
• Distributed file system
• Centralized lookup
• P2P flooding queries
[Figures: centralized lookup, with clients querying a server backed by a DB; P2P flooding queries, spreading from a start node across nodes N1 to N10 over the Internet until the target is reached]
(Figures adopted from Frans Kaashoek)
Performance -- Lookup
Purpose -- to locate a target node
• At each step, try to get closer to the target node
• Ask a closer neighbour
• Performance & scalability tied directly to lookup algorithm
2 Aspects to Performance
• Path latency
• Lookup path length (# hops)
2 Aspects to Scalability
• size of routing table – O(log N)
• lookup path length – O(log N)
3 Techniques
• proximity lookup
• proximity neighbour selection
• geographic layout
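The O(log N) path-length claim can be checked empirically with a small simulation. This is a sketch under assumptions: it uses Chord-style fingers (node + 2^k) as in the lookup example, a 10-bit identifier space, and randomly placed nodes with a fixed seed. It counts hops only and says nothing about path latency.

```python
import random
from bisect import bisect_left

M = 10                      # assumed identifier bits; the ring has 2**M ids
RING = 2 ** M


def make_ring(n, seed=0):
    """Pick n distinct random node ids; the seed is fixed for repeatability."""
    return sorted(random.Random(seed).sample(range(RING), n))


def successor(nodes, ident):
    """First node at or clockwise after ident."""
    i = bisect_left(nodes, ident % RING)
    return nodes[i % len(nodes)]


def hops(nodes, origin, key):
    """Count forwarding steps from origin until the key's owner is known."""
    node, count = origin, 0
    while node != key:
        dist = (key - node) % RING          # clockwise distance left to cover
        k = dist.bit_length() - 1           # finger whose interval holds the key
        start = (node + 2 ** k) % RING
        succ = successor(nodes, start)
        if (key - start) % RING <= (succ - start) % RING:
            break                           # succ owns the key; lookup is done
        node, count = succ, count + 1       # each forward at least halves dist
    return count


for n in (16, 64, 256):
    nodes = make_ring(n)
    rng = random.Random(1)
    samples = [hops(nodes, rng.choice(nodes), rng.randrange(RING))
               for _ in range(2000)]
    print(n, max(samples), sum(samples) / len(samples))
```

Because every forward at least halves the remaining clockwise distance, the hop count can never exceed M, whatever the node placement.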
Performance -- Load Balancing
Issues
• Hot-spots
• Content
• Lookup
• Heterogeneous nodes & paths
• System flux
Solution
• Replication is the key
• Also good for fault-tolerance
• Cache lookup answers backwards along path
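Caching lookup answers backwards along the path can be sketched as follows. The path and owner are taken from the Chord example (Node 2 via Nodes 10 and 14, key 0 on Node 1); the function names and the in-memory `caches` structure are hypothetical, and the sketch ignores expiry and invalidation, which a system in flux would need.

```python
from collections import defaultdict

# Per-node cache of lookup answers: node id -> {key: owner node id}.
caches = defaultdict(dict)


def cache_along_path(path, key, owner):
    """Walk the lookup path backwards and remember the answer at every node."""
    for node in reversed(path):
        caches[node][key] = owner


def cached_owner(node, key):
    """Return the cached owner for key at this node, or None on a miss."""
    return caches[node].get(key)


# Path and owner from the Chord example: 2 -> 10 -> 14, key 0 lives on Node 1.
cache_along_path([2, 10, 14], key=0, owner=1)
print(cached_owner(10, 0))   # a later lookup passing through Node 10 can stop here
print(cached_owner(7, 0))    # Node 7 was not on the path, so this is a miss
```

Later lookups for a popular key are answered by whichever cached node they reach first, which spreads the lookup hot-spot over the nodes near the owner.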
Security – Incorrect Lookup (1)
• When asked for the “next hop”, give a wrong answer
Finger Table for Node 2
start  interval  succ.
3      [3,4)     5
4      [4,6)     5
6      [6,10)    7
10     [10,2)    10
Node 2 to Node 10: Please tell me how to reach key 0 ….
[Figure: Chord ring with identifiers 0 to 15]
Security – Incorrect Lookup (2)
• When asked for the “next hop”, give a wrong answer
Finger Table for Node 10
start  interval  succ.
11     [11,12)   12
12     [12,14)   12
14     [14,2)    14
2      [2,10)    2
Node 2 to Node 10: Please tell me how to reach key 0 ….
Node 10 answers: ask Node 14
[Figure: Chord ring with identifiers 0 to 15]
Security – Incorrect Lookup (3)
• When asked for the “next hop”, give a wrong answer
Finger Table for Node 14
start  interval  succ.
15     [15,0)    15
0      [0,2)     1
2      [2,6)     2
6      [6,14)    7
Node 2 to Node 14: Please tell me how to reach key 0 ….
Node 14 answers: ask Node 10
[Figure: Chord ring with identifiers 0 to 15]
Security – Incorrect Lookup (4)
Solution [Sit and Morris]:
• “Define verifiable system invariant”
• “Allow the querier to observe lookup progress”
Our idea for how this can be implemented:
• Concretely, use an integral, monotonically decreasing quantity to implement the idea of “progress”.
• The concept of a “monotonically decreasing quantity” has been used in program construction to guarantee total correctness. [Parnas]
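One way to make "progress" concrete is for the querier to track the clockwise distance from each referred node to the key and reject any referral that does not shrink it. The sketch below assumes an iterative lookup in which the querier contacts every hop itself, and uses the 16-identifier ring of the example; the function names are hypothetical. A node that lies while still making some progress would not be caught by this check alone.

```python
RING = 16   # identifier space of the example ring


def remaining(node, key):
    """Clockwise distance from node to key: the quantity that must shrink."""
    return (key - node) % RING


def first_bad_hop(hops, key):
    """Index of the first hop that fails to get strictly closer, else None."""
    for i in range(1, len(hops)):
        if remaining(hops[i], key) >= remaining(hops[i - 1], key):
            return i
    return None


print(first_bad_hop([2, 10, 14], key=0))       # honest referrals
print(first_bad_hop([2, 10, 14, 10], key=0))   # Node 14 refers back to Node 10
```

The hop list holds only the nodes that were asked for a referral; the final owner is excluded, because it sits at or just past the key. Since the distance is a non-negative integer that strictly decreases, an accepted lookup must terminate, which is the same argument used for total correctness of loops.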
Security – Inconsistent Behaviour
• Inconsistent Behaviour, i.e., lie intelligibly
• Sybil attack [Kaashoek]
Solution 1: public key solution
Solution 2: Byzantine Protocol
Byzantine Generals Problem: How to find out the traitors among the Generals? [Lamport]
Security – Inconsistent Behaviour
Byzantine Generals Problem [Lamport]
[Figure: the Commander sends “attack” to Lieutenant 1 and “attack” to Lieutenant 2; Lieutenant 2 tells Lieutenant 1 “he said ‘retreat’”.]
Security – Inconsistent Behaviour
Byzantine Generals Problem [Lamport]
[Figure: the Commander sends “attack” to Lieutenant 1 and “retreat” to Lieutenant 2; Lieutenant 2 tells Lieutenant 1 “he said ‘retreat’”.]
Comparison to Other Facilities
Facility              Abstraction  Ease of Use/Programming  Scalability  Load-Balance
DHT                   high         high                     high         yes
Centralized Lookup    medium       medium                   low          no
P2P flooding queries  medium       high                     low          no
Distributed FS        low          medium                   medium       no
Facility              Fault-Tolerance  Self-Org  Admin
DHT                   high             yes       low
Centralized Lookup    low              no        medium
P2P flooding queries  depends          yes       low
Distributed FS        no               high      medium
Research Projects
Iris – security & fault-tolerance – US Gov’t
Chord – circular key space
Pastry – circular key space
Tapestry – hypercube space
CAN – n-dimensional key space
Kelips – n-dimensional key space
DDS -- middleware platform for internet service construction
-- cluster-based
-- incremental scalability
Summary
• Good middleware platform
• Exploits P2P networks
• An exciting new research area
References
• Lamport, Leslie et al. The Byzantine Generals Problem
• Sit, Emil; Morris, Robert. Security Considerations for Peer-to-Peer Distributed Hash Tables
• Kaashoek, Frans. Distributed Hash Tables – Building large-scale, robust distributed applications
• Stoica, Ion et al. Chord: A scalable peer-to-peer lookup service for Internet applications
• Parnas, D. L. Connecting Theory to Practice: Software Engineering Programme