Transcript: Talk Slides
Slide 1: Online Balancing of Range-Partitioned Data with Applications to P2P Systems
Prasanna Ganesan, Mayank Bawa, Hector Garcia-Molina (Stanford University)

Slide 2: Motivation
Parallel databases use range partitioning.
[Figure: a key range from 0 to 100 partitioned across nodes at boundaries 20, 35, 60, 80]
Advantages:
- Inter-query parallelism (data locality)
- Low-cost range queries
- High throughput

Slide 3: The Problem
How do we achieve load balance?
- Partition boundaries have to change over time
- Cost: data movement
Goal: guarantee load balance at low cost.
- Assumption: load balance is beneficial!
Contribution:
- Online balancing: a self-tuning system
- Slows down updates by only a small constant factor

Slide 4: Roadmap
- Model and Definitions
- Load Balancing Operations
- The Algorithms
- Extension to P2P Setting
- Experimental Results

Slide 5: Model and Definitions (1)
Nodes maintain a range partition (on a key).
- Load of a node = number of tuples in its partition
- Load imbalance σ = largest load / smallest load
Arbitrary sequence of tuple inserts and deletes.
- Queries are not relevant: they are automatically directed to the relevant node

Slide 6: Model and Definitions (2)
After each insert/delete:
- Potentially fix an "imbalance" by modifying the partitioning
- Cost = number of tuples moved
Assume no inserts/deletes occur during balancing.
- A non-critical simplification
Goal: keep σ below a constant at all times.
- Constant amortized cost per insert/delete
- Implication: faster queries, slower updates

Slide 7: Load Balancing Operations (1)
NbrAdjust: transfer data between "neighbors".
[Figure: nodes A and B holding [0,50) and [50,100); after the transfer, A holds [0,35) and B holds [35,100)]

Slide 8: Is NbrAdjust good enough?
It can be highly inefficient:
- Ω(n) amortized cost per insert/delete (n = number of nodes)
[Figure: a worst-case chain of six nodes, A through F]

Slide 9: Load Balancing Operations (2)
Reorder: hand your data over to a neighbor, then split the load of some other node.
[Figure: six nodes A-F with ranges [0,10), [10,20), ..., [50,60); a lightly loaded node merges its range into its neighbor's (yielding e.g. [40,60)) and re-enters to split a heavy node's range, e.g. [0,10) into [0,5) and [5,10)]

Slide 10: Roadmap (revisited)
- Model and Definitions
- Load Balancing Operations
- The Algorithms
- Experimental Results
- Extension to P2P Setting

Slide 11: The Doubling Algorithm
Geometrically divide loads into levels.
- Level i holds loads in (2^i, 2^(i+1)]
- Try balancing whenever a node's load changes level
Two invariants:
- Neighbors are tightly balanced: at most 1 level apart
- All nodes are within 3 levels of one another
Guarantees σ ≤ 8.
[Figure: a load scale marking levels 0, 1, 2 at thresholds 1, 2, 4, 8, with level i spanning (2^i, 2^(i+1)]]

Slides 12-14: The Doubling Algorithm (2)
[Figures: an animation over six nodes A-F in which a node's load crosses a level boundary and is averaged with a lighter neighbor via NbrAdjust]

Slides 15-16: The Doubling Algorithm: Case 2
Search for a blue (lightly loaded) node.
- If none exists, do nothing!
[Figures: six nodes A-F; the blue node E hands its range to its neighbor and reorders itself next to the heavy node, yielding the order A, B, E, C, D, F]

Slide 17: The Doubling Algorithm (3)
Similar operations apply when a node's load goes down a level:
- Try balancing with a neighbor
- Otherwise, find a red (heavily loaded) node and reorder yourself
Costs and guarantees:
- σ ≤ 8
- Constant amortized cost per insert/delete
(A code sketch of this procedure follows Slide 19 below.)

Slide 18: From Doubling to Fibbing
Change the thresholds to Fibonacci numbers, F_{i+2} = F_{i+1} + F_i:
- σ ≤ φ^3 ≈ 4.2
- Other geometric sequences can also be used
- Costs are still constant
(See the threshold sketch after Slide 19 below.)

Slide 19: More Generalizations
Improve σ to (1+ε) for any ε > 0 [BG04]:
- Generalize neighbors to c-neighbors
- Still constant cost, O(1/ε)
Dealing with concurrent inserts/deletes:
- Allow multiple balancing actions in parallel
- The paper claims this is OK
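To make the mechanics of Slides 11-17 concrete, here is a minimal Python sketch of the insert side of the Doubling Algorithm. It assumes a centralized view of every node's load and tracks only load counts, not actual tuples or key ranges; the names (Node, level, rebalance_after_insert) and the exact threshold tests are one reading of the slides' invariants, not the paper's own code.

```python
import math

class Node:
    def __init__(self, name, load):
        self.name = name
        self.load = load              # number of tuples in this node's range

def level(load):
    """Level i holds loads in (2^i, 2^(i+1)]; e.g. level 2 is (4, 8]."""
    return 0 if load <= 2 else math.ceil(math.log2(load)) - 1

def rebalance_after_insert(nodes, i):
    """Called when node i's load has just moved up a level.
    Simplified sketch: cascading rebalances are omitted."""
    node = nodes[i]
    # Case 1 (NbrAdjust): if the lighter neighbor sits at least one level
    # below, average the two loads by moving the shared range boundary.
    nbrs = [j for j in (i - 1, i + 1) if 0 <= j < len(nodes)]
    if nbrs:
        j = min(nbrs, key=lambda k: nodes[k].load)
        if level(nodes[j].load) <= level(node.load) - 1:
            total = node.load + nodes[j].load
            nodes[j].load, node.load = total // 2, total - total // 2
            return "NbrAdjust"
    # Case 2 (Reorder): otherwise search for a globally light node (a
    # "blue" node in the slides).  It hands its range to its own neighbor,
    # then re-enters next to the heavy node and takes half its load.
    k = min(range(len(nodes)), key=lambda m: nodes[m].load)
    if level(nodes[k].load) <= level(node.load) - 2:
        donee = k - 1 if k > 0 else k + 1
        nodes[donee].load += nodes[k].load    # may itself trigger balancing
        light = nodes.pop(k)
        i = nodes.index(node)                 # index may shift after pop
        light.load, node.load = node.load // 2, node.load - node.load // 2
        nodes.insert(i + 1, light)
        return "Reorder"
    return "nothing"  # no blue node exists, so imbalance is already bounded
```

For example, with loads A=40, B=3, C=4, calling rebalance_after_insert(nodes, 0) takes the NbrAdjust branch and leaves A and B with 22 and 21 tuples; if instead every neighbor were within a level but a distant node sat two or more levels lower, the Reorder branch would fire, mirroring the E-node move in Slides 15-16.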
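And a short sketch of the Fibbing variant from Slide 18: the only change is the level function. With thresholds drawn from F_{i+2} = F_{i+1} + F_i, successive boundaries grow by the golden ratio φ ≈ 1.618 instead of 2, which is where the tighter bound σ ≤ φ^3 ≈ 4.2 comes from. Function names are again illustrative.

```python
def fib_thresholds():
    """Yields the Fibonacci thresholds 1, 2, 3, 5, 8, 13, ..."""
    a, b = 1, 2
    while True:
        yield a
        a, b = b, a + b

def fib_level(load):
    """Level i holds loads in (F_i, F_(i+1)]; a drop-in swap for level()."""
    for i, t in enumerate(fib_thresholds()):
        if load <= t:
            return max(0, i - 1)
```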
Slide 20: Application to P2P Systems
Goal: construct a P2P system supporting efficient range queries.
- Provide asymptotic performance a la DHTs
What is a P2P system? A parallel DB with:
- Nodes joining and leaving at will
- No centralized components
- Limited communication primitives
Enhance the load-balancing algorithms to:
- Allow dynamic node joins/leaves
- Decentralize the implementation

Slide 21: Experiments
Goal: study the cost of balancing under different workloads.
- Comparison to periodic re-balancing algorithms (in the paper)
- Trade-off between cost and imbalance ratio (in the paper)
Results presented are for the Fibbing Algorithm (n = 256).
Three-phase workload: (1) inserts, (2) alternating inserts and deletes, (3) deletes.
- Workload 1, Zipf: random draws from a Zipf-like distribution
- Workload 2, HotSpot: think key = timestamp
- Workload 3, ShearStress: insert at the most-loaded node, delete from the least-loaded

Slide 22: Load Imbalance (Zipf)
[Plot: load imbalance (0 to 4.5) vs. time (x1000, 0 to 3000) across the growing, steady, and shrinking phases]

Slide 23: Load Imbalance (ShearStress)
[Plot: load imbalance (0 to 4.5) vs. time (x1000, 0 to 3000) across the growing, steady, and shrinking phases]

Slide 24: Cost of Load Balancing
[Plot: cumulative cost (x1000, 0 to 6000) vs. time (x1000, 0 to 3000) across the growing, steady, and shrinking phases]

Slide 25: Related Work
- Karger & Ruhl [SPAA 04]: dynamic model, weaker guarantees
- Load balancing in DBs:
  - Partitioning static relations, e.g., [GD92, RZML02, SMR00]
  - Migrating fragments across disks, e.g., [SWZ93]
  - Intra-node data structures, e.g., [LKOTM00]
- Litwin et al.: SDDS

Slide 26: Conclusions
It is indeed possible to maintain well-balanced range partitions.
- Range partitions are competitive with hashing
The approach generalizes to more complex load functions:
- Allow tuples to have dynamic weights
- Just change the load definition in the algorithms!* (a one-line sketch follows below)
- Range partitioning is powerful
Enables a P2P system supporting range queries.
- Generalizes DHTs with the same asymptotic guarantees

*Lots of caveats apply. Load must be evenly divisible. No guarantees offered on costs. This offer not valid with any other offers. Etc., etc., etc.
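Slide 26's weighted-load generalization amounts to swapping the load definition and leaving the algorithms untouched. A minimal sketch, assuming a hypothetical per-tuple weight function (the access_count field in the comment is invented for illustration):

```python
from typing import Callable, Iterable

def load(tuples: Iterable, weight: Callable = lambda t: 1) -> float:
    """Load of a partition as a sum of per-tuple weights.
    The default weight of 1 per tuple recovers the original definition
    (load = number of tuples); any other weight function is a drop-in
    change, subject to the slide's caveats about divisibility."""
    return sum(weight(t) for t in tuples)

# Hypothetical usage: weight tuples by access frequency instead of count.
# hot_load = load(partition, weight=lambda t: t.access_count)
```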