Transcript Slide 1

MIDDLEWARE SYSTEMS
RESEARCH GROUP
MSRG.ORG
Resource Allocation Algorithms for
Publish/Subscribe Systems
Hans-Arno Jacobsen
June 23, 2011
Joint work with Alex King Yeung Cheung
http://padres.msrg.org
Green Resource Allocation Algorithms
for Publish/Subscribe Systems
Publish/Subscribe in Practice
(Distributed and brokered publish/subscribe)
• GooPS
▫ Google’s internal pub/sub messaging middleware to integrate
applications across data centers
▫ Hundreds of brokers with tens of thousands of pub/sub clients
• Yahoo Message Broker
▫ Yahoo’s pub/sub middleware
▫ Used for example in PNUTS key/value-store (cf. VLDB’08)
• SuperMontage
▫ Tibco’s pub/sub distribution network for NASDAQ’s quote and
order-processing
• GDSN (Global Data Synchronization Network)
▫ A global pub/sub network that allows retailers and suppliers (e.g.,
Walmart, Target, Metro) to exchange timely and accurate
supply chain data
ICDCS 2011
Problem
Input: brokers, publishers (P), and subscribers (S)
Output: deployment strategy that uses the least number of brokers?
[Diagram: publishers and subscribers connected to brokers, with an "Overload!" callout]
Challenges
• Brokers have limited and heterogeneous
resource capacities
– Computational
– I/O or bandwidth
– Memory and storage
• Publishers publish at different message rates
• Subscribers have unique interests that sink
zero or more publications from zero or more
publishers
Challenges When Scaling Up
• How to connect the publishers if subscribers sink traffic from >2 publishers?
• How to connect the brokers to minimize traffic while avoiding overload?
• How to allocate subscribers to brokers?
This is an NP-complete problem!
[Diagram: publishers (P) and subscribers (S)]
Additional Requirements
• Minimize
– Amount of processing
– Amount of messages forwarded
• Work effectively under any workload distribution
(defined or undefined)
• Readily adaptable to any pub/sub system by being
language independent
– Content-based (XPath, regex, ranged, SQL, composite
subscriptions, etc.)
– Topic-based pub/sub
Summary of Our Approach
(A customizable framework)
• Phase 1: Subscription (& publisher) profiling
– Record publications delivered to each subscription
• Phase 2: Subscription to broker allocation
– Allocate subscriptions to brokers depending on
the load induced by each subscription
• Phase 3: Broker overlay construction
– Construct and configure broker overlay
• Apply publisher re-allocation (GRAPE, cf.
ICDCS’2010)
Phase 1: Subscription Profiling
• Profile of each subscription per advertisement maintained at the subscriber's first broker
• Each profile records the message ID of the first index (start of the bit vector) and a bit vector of the publications delivered to the subscription
• Fixed vector size; shift left if next publication is out of bit vector range
• Cardinality of bit vector approximates bandwidth requirement of subscription
• Used to compute "closeness" between any two subscriptions in the allocation phase based on clustering algorithm, e.g., closeness = |si ∩ sj|
[Figure: bit vector starting at message ID B34-M213; publications delivered to the subscription: B34-M213, B34-M215, B34-M216, B34-M217, B34-M220, B34-M222, B34-M225, B34-M226]
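The profile described on this slide can be sketched in Python as follows. This is a minimal illustration, not the PADRES implementation: the class name `SubscriptionProfile`, the helper `closeness`, the default window of 16 bits, and the use of plain integer sequence numbers (213 standing in for B34-M213) are all assumptions made for the example.

```python
class SubscriptionProfile:
    """Fixed-size bit vector of publications delivered to one subscription."""

    def __init__(self, size=16):
        self.size = size          # fixed vector length (assumed value)
        self.first_id = None      # sequence number mapped to bit index 0
        self.bits = [0] * size

    def record(self, seq):
        """Mark publication number `seq` as delivered."""
        if self.first_id is None:
            self.first_id = seq
        offset = seq - self.first_id
        if offset < 0:
            return                # older than the window; ignored in this sketch
        if offset >= self.size:
            # Shift left so that `seq` lands in the last slot of the vector.
            shift = offset - self.size + 1
            kept = self.bits[shift:]
            self.bits = kept + [0] * (self.size - len(kept))
            self.first_id += shift
            offset = self.size - 1
        self.bits[offset] = 1

    def cardinality(self):
        """Number of set bits; approximates the bandwidth requirement."""
        return sum(self.bits)

    def delivered(self):
        """Sequence numbers currently marked in the window."""
        return {self.first_id + i for i, bit in enumerate(self.bits) if bit}


def closeness(a, b):
    """Intersect closeness |si ∩ sj| between two profiles."""
    return len(a.delivered() & b.delivered())
```

Recording the eight message numbers from the figure (213, 215, 216, 217, 220, 222, 225, 226) into a 16-bit profile gives a cardinality of 8. Comparing delivered sequence numbers rather than raw bit positions keeps the closeness correct when two profiles start at different message IDs.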
Phase 2: Subscription Allocation Algorithms
• MANUAL & AUTOMATIC as baseline
– Tree with fanout of 2; random placement of clients (manual)
– Random allocation (automatic)
• Fastest Broker First (FBF)
– Assign subscriptions randomly to the next most powerful broker
• Bin Packing
– Like FBF, but assigns the next highest traffic subscription
• PAIRWISE-N, PAIRWISE-K (Riabov et al. ICDCS’02)
– Pairwise subscription clustering where the number of clusters is
specified beforehand
• CRAM (Clustering with Resource Awareness and Minimization)
– Dynamically determines the number of clusters
– Utilizes a novel one-to-many clustering scheme
– Evaluated with 4 different subscription closeness metrics, with one
derived from Banavar et al. ICDCS '99
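A minimal Python sketch of the two resource-aware baselines above (FBF and Bin Packing), under stated assumptions: each subscription has a single scalar load, each broker a single scalar capacity, and "next most powerful broker" is interpreted as first fit over brokers sorted by capacity. The function names and the dictionary layout are hypothetical and chosen only for illustration.

```python
import random


def allocate(subscriptions, capacities, order):
    """Fill brokers from most to least powerful (first fit).

    subscriptions: dict name -> load (assumed unit, e.g. msgs/s)
    capacities:    dict broker -> capacity in the same unit
    order:         subscription names in the order they are assigned
    Returns dict broker -> list of names, or None if a subscription fits nowhere.
    """
    brokers = sorted(capacities, key=capacities.get, reverse=True)
    remaining = dict(capacities)
    allocation = {}
    for name in order:
        load = subscriptions[name]
        for broker in brokers:
            if load <= remaining[broker]:
                remaining[broker] -= load
                allocation.setdefault(broker, []).append(name)
                break
        else:
            return None  # overload: no broker can host this subscription
    return allocation


def fastest_broker_first(subscriptions, capacities, seed=None):
    """FBF: subscriptions are picked in random order."""
    order = list(subscriptions)
    random.Random(seed).shuffle(order)
    return allocate(subscriptions, capacities, order)


def bin_packing(subscriptions, capacities):
    """Bin Packing: like FBF, but picks the highest-traffic subscription next."""
    order = sorted(subscriptions, key=subscriptions.get, reverse=True)
    return allocate(subscriptions, capacities, order)
```

Only brokers that received at least one subscription appear in the result, so `len(result)` is the number of allocated brokers in this sketch.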
Allocation with Bin Packing
Allocation Result (Bin Packing)
Allocation with CRAM
(Basic version)
1. Find and cluster a pair of subscriptions having next highest
non-zero “closeness”
2. Run BIN PACKING algorithm with new pairing
3. Allocation fails, if:
– More brokers are allocated than without this pairing
– Not all subscriptions can be allocated to brokers
4. On failure, undo and remember incompatible pairing
5. Repeat loop until no more pairings can be found
• Initially BIN PACKING is run to determine initial allocation
• Pairings found are combined and re-inserted in sub pool
• Final subscription clustering is last successful allocation
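The loop above can be sketched in Python roughly as follows. This is an illustration under assumptions, not the authors' implementation: profiles are modelled as sets of message IDs, closeness is the Intersect metric, the load of a cluster is taken to be the size of the union of its members' profiles, and brokers are described by a flat list of capacities. The names `bin_pack` and `cram` are hypothetical.

```python
from itertools import combinations


def bin_pack(loads, capacities):
    """First-fit decreasing; returns the number of brokers used, or None."""
    remaining = sorted(capacities, reverse=True)
    used = set()
    for load in sorted(loads, reverse=True):
        for i, free in enumerate(remaining):
            if load <= free:
                remaining[i] -= load
                used.add(i)
                break
        else:
            return None  # not all subscriptions can be allocated
    return len(used)


def cram(profiles, capacities):
    """profiles: dict name -> set of message IDs.

    Returns (clusters, brokers_used), where clusters maps a frozenset of
    subscription names to the merged profile of that cluster.
    """
    clusters = {frozenset([name]): set(ids) for name, ids in profiles.items()}
    # Initially, bin packing is run to determine the initial allocation.
    best = bin_pack([len(p) for p in clusters.values()], capacities)
    if best is None:
        raise ValueError("initial allocation fails")
    incompatible = set()
    while True:
        # Step 1: pair with the next highest non-zero closeness.
        pair, top = None, 0
        for a, b in combinations(clusters, 2):
            c = len(clusters[a] & clusters[b])
            if c > top and frozenset([a, b]) not in incompatible:
                pair, top = (a, b), c
        if pair is None:
            break  # Step 5: no more pairings can be found
        a, b = pair
        # Step 2: combine the pair, re-insert it, and re-run bin packing.
        trial = {k: v for k, v in clusters.items() if k not in (a, b)}
        trial[a | b] = clusters[a] | clusters[b]
        used = bin_pack([len(p) for p in trial.values()], capacities)
        # Steps 3 and 4: on failure, undo and remember the pairing.
        if used is None or used > best:
            incompatible.add(frozenset([a, b]))
        else:
            clusters, best = trial, used
    return clusters, best
```

The loop terminates because every iteration either merges two clusters or marks one more pairing as incompatible. The last successful allocation is the one returned.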
Summary of Optimizations
• Grouping of subscriptions with equal profiles
– Apply CRAM on groups
– In our experiments, reductions of up to 61%
• Limit closeness computations among groups
– Exploit covering relationships among subscriptions
– Disregard groups with small closeness
– In our experiments, a 20x improvement, roughly
• One-to-many clustering
– Cluster groups of subscriptions & covered subs
Closeness Metrics
• Intersect: |si ∩ sj|
– Good for highest overlap
• XOR: |si XOR sj|^-1
– Good for least non-overlapping traffic
– If |si XOR sj| is 0, closeness is defined as MAXVAL
– Derived from Banavar et al. ICDCS '99
• IOS (intersection over sum): |si ∩ sj|^2 / (|si| + |sj|)
• IOU (intersection over union): |si ∩ sj|^2 / |si ∪ sj|
– IOS and IOU are good for both conditions, yield 0 for empty relationships, and favour clustering higher traffic subs
Ideally, find subscriptions sharing highest overlap in traffic, while
introducing least amount of non-overlapping traffic.
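The four closeness metrics on this slide can be written down directly. The sketch below is an illustration under assumptions: profiles are modelled as Python sets of message IDs rather than bit vectors, `MAXVAL` is given an arbitrary stand-in value, and the function names are hypothetical.

```python
import sys

MAXVAL = sys.float_info.max  # stand-in; the slide only names the constant


def intersect(si, sj):
    """|si ∩ sj|: rewards the highest overlap."""
    return len(si & sj)


def xor(si, sj):
    """1 / |si XOR sj|: rewards the least non-overlapping traffic."""
    diff = len(si ^ sj)
    return MAXVAL if diff == 0 else 1 / diff


def ios(si, sj):
    """Intersection over sum: |si ∩ sj|^2 / (|si| + |sj|)."""
    total = len(si) + len(sj)
    return len(si & sj) ** 2 / total if total else 0.0


def iou(si, sj):
    """Intersection over union: |si ∩ sj|^2 / |si ∪ sj|."""
    union = len(si | sj)
    return len(si & sj) ** 2 / union if union else 0.0
```

For two disjoint profiles such as `{1}` and `{2}`, `ios` and `iou` return 0 while `xor` returns 0.5, which illustrates why XOR alone does not flag empty relationships.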
Traditional One-to-One Clustering
Closeness, C = |si ∩ sj|^2 / (|si| + |sj|)
[Figure: bit vector of S1 (labels S1a to S1c) and bit vector of S2 (labels S2a to S2h)]
C = 8^2/(36+24) = 1.07
C = 4^2/(36+4) = 0.4
C = 1^2/(24+1) = 0.04
New One-to-Many Clustering
Closeness, C = |si ∩ sj|^2 / (|si| + |sj|)
[Figure: bit vector of S1 (labels S1a to S1c) and bit vector of S2 (labels S2a to S2h)]
C = 8^2/(36+24) = 1.07
C = 12^2/(36+12) = 3 (one-to-one: 4^2/(36+4) = 0.4)
C = 8^2/(24+8) = 2 (one-to-one: 1^2/(24+1) = 0.04)
Phase 3: Broker Overlay Construction
Bin Packing’s Final Overlay
[Diagram: final overlay with publishers (P) placed using GRAPE and subscribers (S) attached to brokers]
Greedy Relocation Algorithm
for Publishers of Events (GRAPE)
• Distributed algorithm that dynamically relocates
publishers to minimize
– Broker message rates, and/or
– Delivery delay
• Similar three-phase design:
1. Profile load of subscriptions matching each publisher
2. Determine the placement strategy that minimizes the specified metric
3. Transparently migrate the publisher
• Cf. GRAPE paper from ICDCS 2010
Evaluation
• Implemented on the PADRES open source
content-based publish/subscribe system
• Evaluated on a cluster testbed using 80 brokers
• Evaluated on SciNet using 1000 brokers
• Comparison against two related approaches
(Riabov et al. ICDCS’02, Banavar et al. ICDCS’99)
• Homogeneous and heterogeneous scenarios
• Workload saturates the initial deployment
(MANUAL)
Output Utilization Ratio
Resource aware algorithms make
full use of allocated resources
Broker Message Rate
Allocating fewer brokers does not help
Clustering significantly reduces message rate
CRAM reduced message rate by up to 92%
Number of Allocated Brokers
Uses all resources
Reduces number of allocated brokers by up to 91%
Computation Time
91% improvement
at only 30% higher
computation time
Impact of Publisher Relocation &
Subscription Clustering
50%
reduction in
broker
message rate
Broker Message Rates Using
Various Closeness Metrics
XOR closeness metric cannot identify empty relations
Conclusions
• CRAM combines the benefits of
▫ Subscription clustering from PAIRWISE-N/K
▫ Resource awareness from Bin Packing
by simultaneously reducing both
▫ Broker message rate (up to 92%)
▫ Number of allocated brokers (up to 91%)
to meet green IT objectives!
• By using bit vectors, CRAM is
▫ Language independent (XPath, regex, topics)
▫ Effective for any workload distribution
Q&A
Future Work
• React dynamically by growing and shrinking
the network in incremental steps
• Improve runtime of the CRAM algorithm by
parallelization or reducing its computational
complexity
• Model workload with more sophisticated
methods, such as stochastic processes, to
improve accuracy of load estimation
• Address fault resiliency
Related Works - Clustering
• Riabov et al. (ICDCS’02)
▫ The number of clusters K is pre-specified
▫ Each cluster is a multicast address, thus there is no upper limit on its
size
▫ Event space is divided into grids
▫ Supports only ranged subscriptions
▫ Their pairwise clustering considers each subscription individually
• Gryphon (ICDCS'99)
▫ Supports only equal and * subscriptions
▫ Each cluster is stored in memory, the upper bound limit is not a major
concern
• SUB-2-SUB (IPTPS'06)
▫ Supports only ranged subscriptions
▫ Each cluster is a p2p network, thus there is no upper limit on the
cluster size
Related Works – Broker Overlay Construction,
Publisher and Subscriber Placement Algorithms
• Baldoni et al. (The Computer Journal)
• Jaeger et al. (SAC'07)
• Migliavacca et al. (DEBS’07)
– Reconfigure broker overlay to reduce delivery
delay and broker processing load
• Cheung et al. (Middleware’06, ICDCS’10)
– Load balancing by relocating subscriber clients
– Reduce delivery delay and broker processing load
by relocating publisher clients
Hop Count Using
Various Closeness Metrics
Computation Time vs. Bit Vector Size
Allocated Brokers vs. Bit Vector Size
Average Hop Count
Computation Time Using
Various Closeness Metrics
108% higher computation time using Gryphon-derived closeness metric (XOR).
Delivery Delay
Overload
with
Pairwise-K