Transcript Slide 1
MIDDLEWARE SYSTEMS RESEARCH GROUP, MSRG.ORG
Resource Allocation Algorithms for Publish/Subscribe Systems
Hans-Arno Jacobsen, June 23, 2011
Joint work with Alex King Yeung Cheung
http://padres.msrg.org

Slide 2
Green Resource Allocation Algorithms for Publish/Subscribe Systems
http://padres.msrg.org

Slide 3: Publish/Subscribe in Practice
(Distributed and brokered publish/subscribe)
• GooPS
▫ Google’s internal pub/sub messaging middleware to integrate applications across data centers
▫ Hundreds of brokers with tens of thousands of pub/sub clients
• Yahoo Message Broker
▫ Yahoo’s pub/sub middleware
▫ Used, for example, in the PNUTS key/value store (cf. VLDB’08)
• SuperMontage
▫ Tibco’s pub/sub distribution network for NASDAQ’s quote and order processing
• GDSN (Global Data Synchronization Network)
▫ A global pub/sub network that allows retailers and suppliers (e.g., Walmart, Target, Metro) to exchange timely and accurate supply chain data
ICDCS 2011

Slide 4: Problem
[Diagram labels: Input, Output, Brokers, Publishers (P), Subscribers (S), "Overload!"]
Deployment strategy that uses the least number of brokers?

Slide 5: Challenges
• Brokers have limited and heterogeneous resource capacities
– Computational
– I/O or bandwidth
– Memory and storage
• Publishers publish at different message rates
• Subscribers have unique interests that sink zero or more publications from zero or more publishers

Slide 6: Challenges When Scaling Up
[Diagram of publishers (P), annotated with the questions below]
• How to connect the publishers if subscribers sink traffic from >2 publishers?
• How to connect the brokers to minimize traffic while avoiding overload?
• How to allocate subscribers to brokers?
This is an NP-complete problem!

Slide 7: Additional Requirements
• Minimize
– Amount of processing
– Amount of messages forwarded
• Work effectively under any workload distribution (defined or undefined)
• Readily adaptable to any pub/sub system by being language independent
– Content-based (XPath, regex, ranged, SQL, composite subscriptions, etc.)
– Topic-based pub/sub

Slide 8: Summary of Our Approach
(A customizable framework)
• Phase 1: Subscription profiling (& publisher)
– Record publications delivered to each subscription
• Phase 2: Subscription to broker allocation
– Allocate subscriptions to brokers depending on the load induced by each subscription
• Phase 3: Broker overlay construction
– Construct and configure broker overlay
• Apply publisher re-allocation (GRAPE, cf. ICDCS’2010)

Slide 9: Phase 1: Subscription Profiling
[Diagram: publications delivered to a subscription, with message IDs such as B34-M213, B34-M215, B34-M216, B34-M217, B34-M220, B34-M222, B34-M225, recorded in a bit vector]
• Profile of each subscription per advertisement maintained at the subscriber’s first broker
• Message ID of first index (B34-M213 in the diagram) marks the start of the bit vector
• Fixed vector size; shift left if next publication is out of bit vector range
• Cardinality of bit vector approximates bandwidth requirement of subscription
• Used to compute “closeness” between any two subscriptions in the allocation phase based on clustering algorithm.
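
The profile described on this slide can be sketched roughly as follows. This is only an illustration under stated assumptions: message IDs are reduced to increasing integer sequence numbers per advertisement (the slide's IDs look like B34-M213; the integer form is an assumption), and the class name `SubscriptionProfile`, its methods, and `closeness_intersect` are hypothetical names, not taken from PADRES.

```python
class SubscriptionProfile:
    """Fixed-size bit vector of publications delivered to one subscription."""

    def __init__(self, size=32):
        self.size = size
        self.first_seq = None      # sequence number stored at index 0
        self.bits = [0] * size

    def record(self, seq):
        """Mark publication `seq` as delivered, shifting left if needed."""
        if self.first_seq is None:
            self.first_seq = seq
        offset = seq - self.first_seq
        if offset < 0:
            return                 # older than the window start: ignored here
        if offset >= self.size:
            # Out of range: shift left so that `seq` lands on the last index.
            shift = offset - self.size + 1
            self.bits = (self.bits[shift:] + [0] * shift)[-self.size:]
            self.first_seq += shift
            offset = self.size - 1
        self.bits[offset] = 1

    def cardinality(self):
        """Number of set bits; approximates the bandwidth requirement."""
        return sum(self.bits)

    def delivered(self):
        """Sequence numbers currently recorded in the window."""
        return {self.first_seq + i for i, b in enumerate(self.bits) if b}


def closeness_intersect(a, b):
    """Closeness as the number of publications both profiles received."""
    return len(a.delivered() & b.delivered())


# Example with a deliberately small window of 8 entries.
p = SubscriptionProfile(size=8)
for seq in (213, 215, 216, 222):   # 222 forces a left shift by 2
    p.record(seq)

q = SubscriptionProfile(size=8)
for seq in (216, 217, 222):
    q.record(seq)

print(p.cardinality())             # 3 (213 was shifted out)
print(closeness_intersect(p, q))   # 2 (216 and 222 are shared)
```

Comparing sets of sequence numbers, rather than raw vectors, keeps the sketch correct when two profiles start at different message IDs.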
E.g., closeness = |si ∩ sj|

Slide 10: Phase 2: Subscription Allocation Algorithms
• MANUAL & AUTOMATIC as baseline
– Tree with fanout of 2; random placement of clients (manual)
– Random allocation (automatic)
• Fastest Broker First (FBF)
– Assign subscriptions randomly to the next most powerful broker
• Bin Packing
– Like FBF, but assigns the next highest traffic subscription
• PAIRWISE-N, PAIRWISE-K (Riabov et al. ICDCS’02)
– Pairwise subscription clustering where the number of clusters is specified beforehand
• CRAM (Clustering with Resource Awareness and Minimization)
– Dynamically determines the number of clusters
– Utilizes a novel one-to-many clustering scheme
– Evaluated with 4 different subscription closeness metrics, with one derived from Banavar et al. ICDCS '99

Slide 11: Allocation with Bin Packing
[Diagram: subscribers (S)]

Slide 12: Allocation Result (Bin Packing)
[Diagram: subscribers (S)]

Slide 13: Allocation with CRAM (Basic version)
1. Find and cluster a pair of subscriptions having the next highest non-zero “closeness”
2. Run the BIN PACKING algorithm with the new pairing
3. Allocation fails if:
– More brokers are allocated than without this pairing
– Not all subscriptions can be allocated to brokers
4. On failure, undo and remember the incompatible pairing
5. Repeat loop until no more pairings can be found
• Initially, BIN PACKING is run to determine the initial allocation
• Pairings found are combined and re-inserted into the subscription pool
• Final subscription clustering is the last successful allocation

Slide 14: Summary of Optimizations
• Grouping of subscriptions with equal profiles
– Apply CRAM on groups
– In our experiments, reductions of up to 61%
• Limit closeness computations among groups
– Exploit covering relationships among subscriptions
– Disregard groups with small closeness
– In our experiments, roughly a 20x improvement
• One-to-many clustering
– Cluster groups of subscriptions & covered subs

Slide 15: Closeness Metrics
Ideally, find subscriptions sharing highest overlap in traffic, while introducing least amount of non-overlapping traffic.
• Intersect: |si ∩ sj|
– Good for highest overlap
• XOR: 1 / |si XOR sj|
– Good for least non-overlapping traffic (if |si XOR sj| is 0, closeness is defined as MAXVAL)
– XOR is derived from Banavar et al. ICDCS '99
• IOS (intersection over sum): |si ∩ sj|^2 / (|si| + |sj|)
• IOU (intersection over union): |si ∩ sj|^2 / |si ∪ sj|
– IOS and IOU are good for both conditions, yield 0 for empty relationships, and favour clustering higher traffic subs

Slide 16: Traditional One-to-One Clustering
Closeness, C = |si ∩ sj|^2 / (|si| + |sj|)
[Diagram: bit vector of S1 (regions S1a, S1b, S1c) and bit vector of S2 (regions S2a to S2h)]
• C = 8^2 / (36 + 24) = 1.07
• C = 4^2 / (36 + 4) = 0.4
• C = 1^2 / (24 + 1) = 0.04

Slide 17: New One-to-Many Clustering
Closeness, C = |si ∩ sj|^2 / (|si| + |sj|)
[Diagram: bit vector of S1 (regions S1a, S1b, S1c) and bit vector of S2 (regions S2a to S2h)]
• C = 8^2 / (36 + 24) = 1.07
• C = 12^2 / (36 + 12) = 3 (shown over the one-to-one value C = 4^2 / (36 + 4) = 0.4)
• C = 8^2 / (24 + 8) = 2 (shown over the one-to-one value C = 1^2 / (24 + 1) = 0.04)

Slide 18: Phase 3: Broker Overlay Construction
[Diagram: subscribers (S)]

Slide 19: Bin Packing’s Final Overlay
[Diagram: publishers (P) labelled "GRAPE", and subscribers (S)]

Slide 20: Greedy Relocation Algorithm for Publishers of Events (GRAPE)
• Distributed algorithm that dynamically relocates publishers to minimize
– Broker message rates, and/or
– Delivery delay
• Similar three-phased design:
1. Profile load of subscriptions matching each publisher
2. Determine the placement strategy that minimizes the specified metric
3. Transparently migrate the publisher
• Cf. GRAPE paper from ICDCS 2010

Slide 21: Evaluation
• Implemented on the PADRES open source content-based publish/subscribe system
• Evaluated on a cluster testbed using 80 brokers
• Evaluated on SciNet using 1000 brokers
• Comparison against two related approaches (Riabov et al. ICDCS’02, Banavar et al. ICDCS’99)
• Homogeneous and heterogeneous scenarios
• Workload saturates the initial deployment (MANUAL)

Slide 22: Output Utilization Ratio
[Chart]
• Resource aware algorithms make full use of allocated resources

Slide 23: Broker Message Rate
[Chart]
• Allocating fewer brokers does not help
• Clustering significantly reduces message rate
• CRAM reduced message rate by up to 92%

Slide 24: Number of Allocated Brokers
[Chart]
• Uses all resources
• Reduces number of allocated brokers by up to 91%

Slide 25: Computation Time
[Chart]
• 91% improvement at only 30% higher computation time

Slide 26: Impact of Publisher Relocation & Subscription Clustering
[Chart]
• 50% reduction in broker message rate

Slide 27: Broker Message Rates Using Various Closeness Metrics
[Chart]
• XOR closeness metric cannot identify empty relations

Slide 28: Conclusions
• CRAM combines the benefits of
▫ Subscription clustering from PAIRWISE-N/K
▫ Resource awareness from Bin Packing
by simultaneously reducing both
▫ Broker message rate (up to 92%)
▫ Number of allocated brokers (up to 91%)
to meet green IT objectives!
• By using bit vectors, CRAM is
▫ Language independent (XPath, regex, topics)
▫ Effective for any workload distribution

Slide 29: Q&A

Slide 31: Future Work
• React dynamically by growing and shrinking the network in incremental steps
• Improve runtime of the CRAM algorithm by parallelization or reducing its computational complexity
• Model workload with more sophisticated methods, such as stochastic processes, to improve accuracy of load estimation
• Address fault resiliency

Slide 32: Related Works - Clustering
• Riabov et al. (ICDCS’02)
▫ The number of clusters K is pre-specified
▫ Each cluster is a multicast address, thus there is no upper limit on its size
▫ Event space is divided into grids
▫ Supports only ranged subscriptions
▫ Their pairwise clustering considers each subscription individually
• Gryphon (ICDCS'99)
▫ Supports only equal and * subscriptions
▫ Each cluster is stored in memory, the upper bound limit is not a major concern
• SUB-2-SUB (IPTPS'06)
▫ Supports only ranged subscriptions
▫ Each cluster is a p2p network, thus there is no upper limit on the cluster size

Slide 33: Related Works – Broker Overlay Construction, Publisher and Subscriber Placement Algorithms
• Baldoni et al. (The Computer Journal)
• Jaeger et al. (SAC'07)
• Migliavacca et al. (DEBS’07)
– Reconfigure broker overlay to reduce delivery delay and broker processing load
• Cheung et al. (Middleware’06, ICDCS’10)
– Load balancing by relocating subscriber clients
– Reduce delivery delay and broker processing load by relocating publisher clients

Slide 34: Hop Count Using Various Closeness Metrics
[Chart]

Slide 35: Computation Time vs. Bit Vector Size
[Chart]

Slide 36: Allocated Brokers vs. Bit Vector Size
[Chart]

Slide 37: Average Hop Count
[Chart]

Slide 38: Computation Time Using Various Closeness Metrics
[Chart]
• 108% higher computation time using Gryphon-derived closeness metric (XOR)

Slide 39: Delivery Delay
[Chart]
• Overload with Pairwise-K
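
As a supplement to the "Allocation with CRAM (Basic version)" slide, the loop listed there can be sketched as below. This is an illustration under simplifying assumptions, not the authors' implementation: profiles are Python integers used as bit vectors, a cluster's load is the cardinality of the OR of its members' vectors, each broker has one scalar capacity, and bin packing is first-fit over brokers sorted from most to least powerful. All function names are hypothetical.

```python
from itertools import combinations


def popcount(v):
    return bin(v).count("1")


def intersect(a, b):
    """Intersect metric: |si ∩ sj|."""
    return popcount(a & b)


def ios(a, b):
    """Intersection over sum: |si ∩ sj|^2 / (|si| + |sj|)."""
    total = popcount(a) + popcount(b)
    return popcount(a & b) ** 2 / total if total else 0.0


def bin_pack(clusters, capacities):
    """Place highest-traffic clusters first on the most powerful brokers.

    Returns one list of member sets per broker used, or None on failure.
    """
    free = sorted(capacities, reverse=True)
    placed = [[] for _ in free]
    by_load = sorted(clusters, key=lambda c: popcount(c[1]), reverse=True)
    for members, bits in by_load:
        load = popcount(bits)
        for i in range(len(free)):
            if free[i] >= load:
                free[i] -= load
                placed[i].append(members)
                break
        else:
            return None            # some cluster fits on no broker
    return [p for p in placed if p]


def cram(profiles, capacities, closeness=intersect):
    """Basic CRAM loop: pair, re-run bin packing, undo on failure."""
    pool = [(frozenset([name]), bits) for name, bits in profiles.items()]
    best = bin_pack(pool, capacities)          # initial allocation
    if best is None:
        raise ValueError("subscriptions do not fit on the given brokers")
    banned = set()                             # remembered incompatible pairs
    while True:
        scored = [(closeness(a[1], b[1]), a, b)
                  for a, b in combinations(pool, 2)
                  if frozenset((a[0], b[0])) not in banned]
        scored = [s for s in scored if s[0] > 0]
        if not scored:
            return best                        # last successful allocation
        _, a, b = max(scored, key=lambda s: s[0])
        merged = (a[0] | b[0], a[1] | b[1])
        trial_pool = [c for c in pool if c is not a and c is not b] + [merged]
        trial = bin_pack(trial_pool, capacities)
        if trial is None or len(trial) > len(best):
            banned.add(frozenset((a[0], b[0])))    # undo and remember
        else:
            pool, best = trial_pool, trial         # keep the pairing


result = cram({"s1": 0b1111, "s2": 0b1111, "s3": 0b0011, "s4": 0b110000},
              capacities=[4, 4, 4, 4])
print(len(result))   # 2 brokers, down from 3 with bin packing alone
```

In this toy input, bin packing alone needs three brokers, while clustering s1, s2, and s3 (whose traffic overlaps) brings it down to two. Passing `closeness=ios` swaps in the intersection-over-sum metric from the Closeness Metrics slide.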