Transcript of Talk Slides
Online Balancing of
Range-Partitioned Data
with Applications to P2P Systems
Prasanna Ganesan
Mayank Bawa
Hector Garcia-Molina
Stanford University
1
Motivation
Parallel databases use range partitioning
[Figure: a key range 0–100 partitioned across nodes at boundaries 20, 35, 60, 80]
Advantages:
– Inter-query parallelism
– Data locality
– Low-cost range queries
– High throughput
2
The Problem
How to achieve load balance?
– Partition boundaries have to change over time
– Cost: Data Movement
Goal: Guarantee load balance at low cost
– Assumption: load balance is beneficial!
Contribution
– Online balancing: a self-tuning system
– Slows down updates by a small constant factor
3
Roadmap
Model and Definitions
Load Balancing Operations
The Algorithms
Extension to P2P Setting
Experimental Results
4
Model and Definitions (1)
Nodes maintain range partition (on a key)
– Load of a node = # tuples in its partition
– Load imbalance σ = largest load / smallest load
Arbitrary sequence of tuple inserts and deletes
– Queries not relevant
– Automatically directed to relevant node
5
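As a concrete illustration of these definitions, here is a small Python sketch; the partition boundaries and keys are made-up example data, not from the talk:

```python
# Hypothetical sketch: per-node load and imbalance ratio for a range partition.
import bisect

def loads(boundaries, keys):
    """Count tuples per node; node i owns [boundaries[i], boundaries[i+1])."""
    counts = [0] * (len(boundaries) - 1)
    for k in keys:
        i = bisect.bisect_right(boundaries, k) - 1
        counts[i] += 1
    return counts

def imbalance(counts):
    """sigma = largest load / smallest load."""
    return max(counts) / min(counts)

keys = [1, 7, 12, 15, 22, 30, 41, 55, 70, 85, 90, 99]
counts = loads([0, 20, 35, 60, 80, 100], keys)
print(counts, imbalance(counts))  # [4, 2, 2, 1, 3] 4.0
```

With these example keys the first node holds twice as many tuples as the lightest one would like, giving σ = 4.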
Model and Definitions (2)
After each insert/delete:
– Potentially fix “imbalance” by modifying partitioning
– Cost = # tuples moved
Assume no inserts/deletes during balancing
– Non-critical simplification
Goal: σ < constant always
– Constant amortized cost per insert/delete
– Implication: Faster queries, slower updates
6
Load Balancing Operations (1)
NbrAdjust: Transfer data between “neighbors’’
[Figure: nodes A [0,50) and B [50,100) become A [0,35) and B [35,100) after the boundary moves to 35]
7
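A minimal sketch of what NbrAdjust does, assuming each node's data is a sorted key list and the split point is chosen to equalize the pair; the example data is made up:

```python
# Hypothetical sketch of NbrAdjust: rebalance two adjacent nodes by moving
# the boundary between them. Cost = number of tuples that cross the boundary.

def nbr_adjust(left, right):
    """Split the pair's combined data evenly; returns new lists and cost."""
    merged = left + right          # already sorted: all of left's keys < right's
    half = len(merged) // 2
    moved = abs(len(left) - half)  # tuples crossing the new boundary
    return merged[:half], merged[half:], moved

a = [0, 5, 12, 20, 28, 33, 41, 48]   # node A, e.g. range [0, 50)
b = [52, 77]                         # node B, e.g. range [50, 100)
a2, b2, moved = nbr_adjust(a, b)
print(a2, b2, moved)  # the new boundary falls between keys 28 and 33
```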
Is NbrAdjust good enough?
Can be highly inefficient
– Ω(n) amortized cost per insert/delete (n = # nodes)
[Figure: nodes A–F in a line; repeated NbrAdjust must ripple data across all of them]
8
Load Balancing Operations (2)
Reorder: hand over data to a neighbor and split the load of some other node
[Figure: F hands its range [50,60) to neighbor E, whose range grows from [40,50) to [40,60); F then splits A's range [0,10), leaving A with [0,5) and F with [5,10). B, C, D keep [10,20), [20,30), [30,40)]
9
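A load-only sketch of Reorder, under the simplifying assumption that we track tuple counts rather than key ranges; indices and loads are made up, and in the real operation the emptied node physically becomes the split node's new neighbor:

```python
# Hypothetical load-only sketch of Reorder. Node "light" empties itself into
# its right neighbor, then re-splits node "heavy"'s load in half.
# Cost = tuples moved. Positions are not tracked here: in the real system the
# emptied node would relocate to sit beside the heavy node.

def reorder(loads, light, heavy):
    loads = list(loads)
    moved = loads[light]
    loads[light + 1] += loads[light]        # hand all data to the right neighbor
    upper = loads[heavy] - loads[heavy] // 2
    loads[heavy] //= 2                      # heavy keeps the lower half
    loads[light] = upper                    # the emptied node takes the upper half
    moved += upper
    return loads, moved

print(reorder([2, 3, 3, 3, 8, 3], light=0, heavy=4))  # ([4, 5, 3, 3, 4, 3], 6)
```

Only the light node's data and half of the heavy node's data move, which is what keeps the cost low.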
Roadmap
Model and Definitions
Load Balancing Operations
The Algorithms
Experimental Results
Extension to P2P Setting
10
The Doubling Algorithm
Geometrically divide loads into levels
– Level i ⇔ load in (2^i, 2^(i+1)]
– Will try balancing on level change
Two invariants
– Neighbors tightly balanced (max 1 level apart)
– All nodes within 3 levels
Guarantees σ ≤ 8
[Figure: load scale marked 1, 2, 4, 8 with levels 0, 1, 2]
11
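One way to read the level scale and the two invariants in code. The interpretation of "within 3 levels" as a span of at most three consecutive levels (max level - min level <= 2) is an assumption, but it is the reading that matches the σ ≤ 8 bound:

```python
# Hypothetical sketch of the Doubling level scale and its invariants.
# A node with load L sits at level i when 2**i < L <= 2**(i+1).

def level(load):
    """Level i such that 2**i < load <= 2**(i + 1)."""
    return (load - 1).bit_length() - 1

def invariants_hold(loads):
    """Neighbors at most 1 level apart; all nodes span at most 3 levels."""
    lv = [level(x) for x in loads]
    neighbors_ok = all(abs(a - b) <= 1 for a, b in zip(lv, lv[1:]))
    global_ok = max(lv) - min(lv) <= 2   # span of 3 consecutive levels
    return neighbors_ok and global_ok

# If the invariants hold: min load > 2**i and max load <= 2**(i+3),
# so max/min < 2**3 = 8.
print(invariants_hold([3, 4, 6, 5]), invariants_hold([2, 16, 2]))
```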
The Doubling Algorithm (2)
[Figure: nodes A–F; animation of a balancing step]
12
The Doubling Algorithm: Case 2
Search for a blue node
– If none, do nothing!
[Figure: nodes A–F]
15
The Doubling Algorithm: Case 2
Search for a blue node
– If none, do nothing!
[Figure: after the reorder, the node order is A, B, E, C, D, F]
16
The Doubling Algorithm (3)
Similar operations when load goes down a level
– Try balancing with neighbor
– Otherwise, find a red node and reorder yourself
Costs and Guarantees
– σ ≤ 8
– Constant amortized cost per insert/delete
17
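A toy simulation of the neighbor-balancing part of this policy (loads only; the Reorder step and the global invariant are omitted, so this is deliberately not the full algorithm):

```python
# Hypothetical sketch: restoring the neighbor invariant with NbrAdjust only.
# Tracks loads (tuple counts) rather than actual tuples.

def level(load):
    """Level i such that 2**i < load <= 2**(i + 1)."""
    return (load - 1).bit_length() - 1

def balance(loads):
    """Repeat NbrAdjust until every adjacent pair is within one level."""
    changed = True
    while changed:
        changed = False
        for i in range(len(loads) - 1):
            if abs(level(loads[i]) - level(loads[i + 1])) > 1:
                total = loads[i] + loads[i + 1]
                # split the pair's data as evenly as possible
                loads[i], loads[i + 1] = total - total // 2, total // 2
                changed = True
    return loads

print(balance([1, 1, 1, 16]))  # [3, 3, 5, 8]
```

Each split strictly reduces the sum of squared loads, so the loop terminates; the resulting max/min ratio in this example is well under 8.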
From Doubling to Fibbing
Change thresholds to Fibonacci numbers
– σ ≤ φ³ ≈ 4.2
– Can also use other geometric sequences
– Costs are still constant
F_{i+2} = F_{i+1} + F_i
18
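A sketch of the Fibonacci threshold scale; the starting values 1, 2 are an assumption, and the paper may index the sequence differently:

```python
# Hypothetical sketch of the Fibbing level scale: replace powers of two with
# Fibonacci thresholds F_{i+2} = F_{i+1} + F_i. Adjacent thresholds differ by
# a factor approaching the golden ratio phi ~= 1.618, which is what tightens
# the imbalance bound toward phi**3 ~= 4.2.

def fib_thresholds(n):
    """First n Fibonacci thresholds starting 1, 2, 3, 5, 8, ..."""
    f = [1, 2]
    while len(f) < n:
        f.append(f[-1] + f[-2])
    return f[:n]

def fib_level(load, thresholds):
    """Level i such that thresholds[i] < load <= thresholds[i+1]."""
    for i in range(len(thresholds) - 1):
        if thresholds[i] < load <= thresholds[i + 1]:
            return i
    raise ValueError("load outside threshold range")

t = fib_thresholds(10)
print(t)                 # [1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
print(fib_level(30, t))  # 6, since 30 is in (21, 34]
```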
More Generalizations
Improve σ to (1+ε) for any ε > 0 [BG04]
– Generalize neighbors to c-neighbors
– Still constant cost O(1/ε)
Dealing with concurrent inserts/deletes
– Allow multiple balancing actions in parallel
– The paper argues this is safe
19
Application to P2P Systems
Goal: Construct a P2P system supporting efficient range queries
– Provide asymptotic performance à la DHTs
What is a P2P system? A parallel DB with
– Nodes joining and leaving at will
– No centralized components
– Limited communication primitives
Enhance load-balancing algorithms to
– Allow dynamic node joins/leaves
– Decentralize implementation
20
Experiments
Goal: Study cost of balancing for different workloads
– Compare to periodic re-balancing algorithms (Paper)
– Trade-off between cost and imbalance ratio (Paper)
Results presented on Fibbing Algorithm (n=256)
Three-phase Workload
– (1) Inserts (2) Alternating inserts and deletes (3) Deletes
Workload 1: Zipf
– Random draws from Zipf-like distribution
Workload 2: HotSpot
– Think key=timestamp
Workload 3: ShearStress
– Insert at most-loaded, delete from least-loaded
21
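Rough sketches of the three workload generators; the distribution parameters here are made up, and the paper's exact settings may differ:

```python
# Hypothetical sketches of the three workloads from the slide.
import random

def zipf_key(n_keys, s=1.0, rng=random):
    """Draw a key from a Zipf-like distribution over [0, n_keys)."""
    weights = [1.0 / (r + 1) ** s for r in range(n_keys)]
    return rng.choices(range(n_keys), weights=weights)[0]

def hotspot_key(clock):
    """Monotonically increasing keys, as if key = insertion timestamp."""
    return clock  # caller increments clock per insert

def shear_stress(loads):
    """Adversarial step: insert at the most-loaded node, delete from the
    least-loaded one."""
    loads[loads.index(max(loads))] += 1
    loads[loads.index(min(loads))] -= 1
    return loads

print(shear_stress([3, 1, 2]))  # [4, 0, 2]
```

ShearStress is the worst case by construction: every step widens the gap that the balancer must close.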
Load Imbalance (Zipf)
[Plot: load imbalance (y-axis 0–4.5) vs. time (×1000, up to 3000) across growing, steady, and shrinking phases]
22
Load Imbalance (ShearStress)
[Plot: load imbalance (y-axis 0–4.5) vs. time (×1000, up to 3000) across growing, steady, and shrinking phases]
23
Cost of Load Balancing
[Plot: cumulative balancing cost (×1000, y-axis 0–6000) vs. time (×1000, up to 3000) across growing, steady, and shrinking phases]
24
Related Work
Karger & Ruhl [SPAA 04]
– Dynamic model, weaker guarantees
Load balancing in DBs
– Partitioning static relations, e.g., [GD92, RZML02, SMR00]
– Migrating fragments across disks, e.g., [SWZ93]
– Intra-node data structures, e.g., [LKOTM00]
Scalable distributed data structures (SDDS), e.g., Litwin et al.
25
Conclusions
Indeed possible to maintain well-balanced range partitions
– Range partitions competitive with hashing
Generalize to more complex load functions
– Allow tuples to have dynamic weights
– Change load definition in algorithms!*
– Range partitioning is powerful
Enables P2P system supporting range queries
– Generalizes DHTs with same asymptotic guarantees
*Lots of caveats apply. Load needs to be evenly divisible. No guarantees offered on costs. This offer not valid with any other offers. Etc., etc., etc.
26