In-Network Query Processing
Sam Madden CS294-1 9/30/03
Outline
• TinyDB
  – And demo!
• Aggregate Queries
• ACQP
• Break
• Adaptive Operator Placement
• …
Programming Sensor Nets Is Hard
• Months of lifetime required from small batteries
  – 3-5 days naively; can’t recharge often
  – Must interleave sleep with processing
• Lossy, low-bandwidth, short-range communication
  – Nodes coming and going
  – Multi-hop
• Remote, zero-administration deployments
• Highly distributed environment
• Limited development tools
  – Embedded; LEDs for debugging!
⇒ Need a high-level abstraction
A Solution: Declarative Queries
• Users specify the data they want
  – Simple, SQL-like queries
  – Using predicates, not specific addresses
  – Our system: TinyDB
• Challenge is to provide:
  – Expressive & easy-to-use interface
  – High-level operators
    • “Transparent optimizations” that many programmers would miss
    • Sensor-net-specific techniques
  – Power-efficient execution framework
TinyDB Demo
TinyDB Architecture

[Figure: queries (e.g., SELECT AVG(temp) WHERE light > 400) enter a query processor running over a multihop network; results (e.g., T:1, AVG:225 / T:2, AVG:250) flow back. Operators shown: Agg avg(temp) over a filter light > 400, fed by Samples and Tables.]

• Schema: “catalog” of commands & attributes, e.g.:
  – Name: temp
  – Time to sample: 50 µs
  – Cost to sample: 90 µJ
  – Calibration table
  – Units: Deg. F; Error: ±5 Deg. F
  – got(‘temp’) → getTempFunc(…)
• ~10,000 lines of embedded C code (3x larger than the 2nd-largest TinyOS program)
Declarative Queries for Sensor Networks

• Example 1: “Find the sensors in bright nests.”

  SELECT nodeid, nestNo, light
  FROM sensors
  WHERE light > 400
  EPOCH DURATION 1s

  Epoch | Nodeid | nestNo | Light
  ------|--------|--------|------
  0     | 1      | 17     | 455
  0     | 2      | 25     | 389
  1     | 1      | 17     | 422
  1     | 2      | 25     | 405
Aggregation Queries
• Example 2:

  SELECT AVG(sound)
  FROM sensors
  EPOCH DURATION 10s

• Example 3: “Count the number of occupied nests in each loud region of the island.”

  SELECT region, CNT(occupied), AVG(sound)
  FROM sensors
  GROUP BY region
  HAVING AVG(sound) > 200
  EPOCH DURATION 10s

  Epoch | region | CNT(…) | AVG(…)
  ------|--------|--------|-------
  0     | North  | 3      | 360
  0     | South  | 3      | 520
  1     | North  | 3      | 370
  1     | South  | 3      | 520
  (Regions w/ AVG(sound) > 200)
Benefits of Declarative Queries
• Specification of “whole-network” behavior
• Simple, safe
• Complex behavior via multiple queries, app logic
• Optimizable
  – Exploit (non-obvious) interactions, e.g.:
    • ACQP operator ordering
    • Adaptive join operator placement
    • Lifetime selection
    • Topology selection
• Versus other approaches, e.g., Diffusion
  – Black-box ‘filter’ operators
  – Intanagonwiwat, “Directed Diffusion”, MobiCom 2000
Outline
• TinyDB
  – And demo!
• Aggregate Queries
• ACQP
• Break
• Adaptive Operator Placement
• …
Tiny Aggregation (TAG)
• Not in today’s reading
• In-network processing of aggregates
  – Common data analysis operation
    • Aka gather operation or reduction in parallel programming
  – Communication-reducing
    • Operator-dependent benefit
  – Exploit query semantics to improve efficiency!
Madden, Franklin, Hellerstein, Hong. Tiny AGgregation (TAG), OSDI 2002 .
Query Propagation Via Tree-Based Routing

• Tree-based routing, used in:
  – Query delivery
  – Data collection
• Topology selection is important; e.g.:
  – Krishnamachari, DEBS 2002; Intanagonwiwat, ICDCS 2002; Heidemann, SOSP 2001
  – LEACH/SPIN, Heinzelman et al., MobiCom 99
  – SIGMOD 2003
• Continuous process
  – Mitigates failures

[Figure: a query Q (SELECT …) floods down a routing tree rooted at the basestation; results R:{…} flow back up through nodes A-F.]
Basic Aggregation
• In each epoch:
  – Each node samples local sensors once
  – Generates partial state record (PSR)
    • local readings
    • readings from children
  – Outputs PSR during assigned comm. interval
    • Communication scheduling for power reduction
• At end of epoch, PSR for whole network output at root
• New result on each successive epoch
• Extras:
  – Predicate-based partitioning via GROUP BY

[Figure: 5-node routing tree, nodes 1-5.]
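The per-epoch PSR flow above can be sketched as follows -- a toy model, not TinyDB code; the tree shape is hypothetical:

```python
# Sketch (not TinyDB code): one epoch of in-network COUNT over a routing tree.
# Each node combines its local PSR with its children's PSRs and forwards the
# merged PSR toward the root, which outputs the network-wide result.

def count_epoch(tree, node):
    """Return the COUNT PSR for the subtree rooted at `node`.

    `tree` maps a node id to its list of children; the PSR for COUNT
    is just a running count (one local reading per node per epoch).
    """
    psr = 1  # f_init: this node's local contribution
    for child in tree.get(node, []):
        psr += count_epoch(tree, child)  # f_merge: fold in each child's PSR
    return psr

# A 5-node tree like the illustration's: node 1 is the root.
tree = {1: [2, 3], 3: [4, 5]}
print(count_epoch(tree, 1))  # -> 5
```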
Illustration: Aggregation

SELECT COUNT(*) FROM sensors

[Animation, five frames: over successive communication intervals of one epoch, each of the five sensors transmits its partial COUNT in its assigned slot; partial counts accumulate up the routing tree until the root outputs the network-wide total of 5. The accompanying grid plots sensor # against interval #, showing the PSR values exchanged in each frame.]
Aggregation Framework
• As in extensible databases, TAG supports any aggregation function conforming to:

  Agg_n = {f_init, f_merge, f_evaluate}

  – f_init{a_0} → partial state record (PSR)
  – f_merge{PSR_1, PSR_2} → merged PSR
  – f_evaluate{PSR} → aggregate value

• Partial State Record (PSR) example: AVERAGE
  – AVG_init{v} → <v, 1>
  – AVG_merge{<S_1, C_1>, <S_2, C_2>} → <S_1 + S_2, C_1 + C_2>
  – AVG_evaluate{<S, C>} → S/C

• Restriction: merge must be associative and commutative
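The {f_init, f_merge, f_evaluate} triple above is easy to mirror in code -- a minimal sketch of the interface, with AVG keeping the slide's <sum, count> PSR:

```python
# Sketch of TAG's extensible-aggregate interface: an aggregate is a triple
# (f_init, f_merge, f_evaluate) over partial state records (PSRs).
# AVG keeps <sum, count> as its PSR, exactly as on the slide.

avg = {
    "init": lambda v: (v, 1),                          # AVG_init{v} -> <v, 1>
    "merge": lambda a, b: (a[0] + b[0], a[1] + b[1]),  # <S1+S2, C1+C2>
    "evaluate": lambda a: a[0] / a[1],                 # S / C
}

def aggregate(agg, readings):
    """Fold readings into one PSR with f_merge, then evaluate at the root."""
    psr = agg["init"](readings[0])
    for v in readings[1:]:
        psr = agg["merge"](psr, agg["init"](v))
    return agg["evaluate"](psr)

print(aggregate(avg, [200, 250, 300]))  # -> 250.0
```

Because merge is associative and commutative, the fold can happen in any order across the routing tree and still produce the same answer.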
Types of Aggregates
• SQL supports MIN, MAX, SUM, COUNT, AVERAGE
• Any function over a set can be computed via TAG
• In-network benefit for many operations
  – E.g. standard deviation, top/bottom N, spatial union/intersection, histograms, etc.
  – Benefit depends on compactness of the PSR
Taxonomy of Aggregates
• TAG insight: classify aggregates according to various functional properties
  – Yields a general set of optimizations that can automatically be applied
• Properties:
  – Partial state
  – Monotonicity
  – Exemplary vs. summary
  – Duplicate sensitivity
• Drives an API!
Partial State
• Growth of PSR vs. number of aggregated values (n)
  – Algebraic: |PSR| = 1 (e.g. MIN)
  – Distributive: |PSR| = c (e.g. AVG)
  – Holistic: |PSR| = n (e.g. MEDIAN)
  – Unique: |PSR| = d (e.g. COUNT DISTINCT), where d = # of distinct values
  – Content-sensitive: |PSR| < n (e.g. HISTOGRAM)
  (“Data Cube”, Gray et al.)

  Property: Partial State | Examples: MEDIAN (unbounded), MAX (1 record) | Affects: effectiveness of TAG
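The PSR-growth contrast above is concrete in code -- a toy sketch (not from the paper) showing why MAX merges stay constant-size while MEDIAN merges must keep everything:

```python
# Sketch contrasting PSR growth: MAX keeps |PSR| = 1 no matter how many
# readings it merges, while MEDIAN is holistic -- its PSR must carry the
# full multiset of readings up the tree.

def max_merge(a, b):
    return max(a, b)      # PSR stays a single record

def median_merge(a, b):
    return sorted(a + b)  # PSR is the entire multiset so far

psr_max, psr_med = 0, []
for v in [5, 1, 4, 2, 3]:
    psr_max = max_merge(psr_max, v)
    psr_med = median_merge(psr_med, [v])

print(psr_max)       # -> 5 (one record, regardless of n)
print(len(psr_med))  # -> 5 (one entry per reading)
```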
Benefit of In-Network Processing
Simulation results: 2500 nodes, 50x50 grid, depth ≈ 10, neighbors ≈ 20, values uniform over [0,100].

[Chart: total bytes transmitted vs. aggregation function, comparing in-network execution against EXTERNAL (ship all readings to the root); y-axis 0-100,000 bytes. Functions shown: MEDIAN (holistic), DISTINCT (unique), AVERAGE (distributive), MAX (algebraic).]

• Aggregate- and depth-dependent benefit!
Outline
• TinyDB
  – And demo!
• Aggregate Queries
• ACQP
• Break
• Adaptive Operator Placement
• …
Acquisitional Query Processing (ACQP)
• Traditional DBMS: processes data already in the system
• Acquisitional DBMS: generates the data in the system!
• An acquisitional query processor controls:
  – when,
  – where, and
  – with what frequency data is collected
• Versus traditional systems, where data is provided a priori
ACQP: What’s Different?
• Basic acquisitional processing
  – Continuous queries, with rates or lifetimes
  – Events for asynchronous triggering
• Avoiding acquisition through optimization
  – Sampling as a query operator
• Choosing where to sample via co-acquisition
  – Index-like data structures
• Acquiring data from the network
  – Prioritization, summary, and rate control
Lifetime Queries
• Lifetime vs. sample rate:

  SELECT … EPOCH DURATION 10 s
  SELECT … LIFETIME 30 days

• Extra: allow a MAX SAMPLE PERIOD
  – Discard some samples
  – Sampling is cheaper than transmitting
(Single Node) Lifetime Prediction

SELECT nodeid, light LIFETIME 24 weeks

[Chart: voltage vs. time, measured vs. expected. Lifetime goal = 24 weeks (4032 hours at one sample per 15 s). A linear fit to the measured voltage (R² = 0.8455) tracks the expected curve; the node stops operating once voltage falls below V = 350.]
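The lifetime clause implies a translation from a lifetime goal to a sample rate. A hypothetical sketch of that arithmetic follows -- the energy constants are illustrative, not TinyDB's actual catalog values:

```python
# Hypothetical sketch of lifetime-to-sample-period translation: given an
# energy budget and per-epoch cost, choose the sample period that makes the
# battery last exactly the requested lifetime. All constants are made up
# for illustration.

def sample_period(lifetime_s, battery_j, sleep_w, epoch_cost_j):
    """Seconds per epoch so that baseline sleep draw plus per-epoch energy
    spends battery_j over lifetime_s."""
    active_budget_j = battery_j - sleep_w * lifetime_s  # energy left for epochs
    epochs = active_budget_j / epoch_cost_j             # epochs we can afford
    return lifetime_s / epochs                          # seconds per epoch

# 24 weeks of lifetime, a ~15 kJ battery, 100 uW sleep draw,
# 0.5 J per epoch (sampling + transmission).
period = sample_period(24 * 7 * 86400, 15000, 100e-6, 0.5)
print(round(period))  # seconds between samples
```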
Operator Ordering: Interleave Sampling + Selection

SELECT light, mag
FROM sensors
WHERE pred1(mag) AND pred2(light)
EPOCH DURATION 1s

• Traditional DBMS: sample both mag and light, then apply σ(pred1) and σ(pred2)
• ACQP: sample the cheap attribute (light) first, apply pred2, and sample the costly attribute (mag) only if pred2 passes
  – Correct ordering (unless pred1 is very selective and pred2 is not)
• At 1 sample/sec, total power savings could be as much as 3.5 mW -- comparable to the processor!
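The cheap-attribute-first ordering can be sketched as a tiny short-circuiting evaluator -- not TinyDB's optimizer, and the per-sample costs are made-up catalog-style numbers:

```python
# Sketch (not TinyDB's optimizer) of acquisitional predicate ordering:
# evaluate predicates over cheap-to-sample attributes first, and sample an
# expensive attribute only if the earlier predicates pass.

SAMPLE_COST_UJ = {"light": 90, "mag": 250}  # hypothetical per-sample costs

def run_epoch(preds, sample):
    """preds: list of (attribute, predicate-fn); sample: attr -> reading.
    Orders predicates by sampling cost and short-circuits on failure."""
    spent = 0
    for attr, pred in sorted(preds, key=lambda p: SAMPLE_COST_UJ[p[0]]):
        spent += SAMPLE_COST_UJ[attr]  # pay to acquire this attribute
        if not pred(sample(attr)):
            return False, spent        # costlier attrs were never sampled
    return True, spent

readings = {"light": 100, "mag": 900}
ok, cost = run_epoch(
    [("mag", lambda v: v > 1000), ("light", lambda v: v > 400)],
    readings.get,
)
print(ok, cost)  # light fails first, so mag is never sampled
```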
Exemplary Aggregate Pushdown
SELECT WINMAX(light, 8s, 8s)
FROM sensors
WHERE mag > x
EPOCH DURATION 1s

• Traditional DBMS: sample mag and light, filter σ(mag > x), then compute WINMAX
• ACQP: sample light first; if light does not exceed the current window MAX, skip sampling mag entirely
  – Mag sampling is the most expensive operation!
• Novel, general pushdown technique
Event-Based Processing

• Epochs are synchronous
• Might want to issue queries in response to asynchronous events
  – Avoid unnecessary “polling”

  CREATE TABLE birds (uint16 cnt)
    SIZE 1 CIRCULAR

  ON EVENT bird-enter(…)
  SELECT b.cnt + 1
  FROM birds AS b
  OUTPUT INTO b
  ONCE

• In-network storage
  – Placement subject to optimization
Attribute-Driven Network Construction

• Goal: co-acquisition -- sensors that sample together route together
• Observation: queries are often over a limited area
  – Or some other subset of the network
  – E.g. regions with light value in [10,20]
• Idea: build network topology such that like-valued nodes route through each other
  – For range queries
  – Relatively static attributes (e.g. location)
• Maintenance issues
Tracking Co-Acquisition Via Semantic Routing Trees

• Idea: send range queries only to participating nodes
  – Parents maintain ranges of descendants
  – Precomputed intervals, reported by children as they join

  SELECT …
  WHERE a > 5 AND a < 12

[Figure: root node 4 with children 1 (a:[1,10]), 2 (a:[7,15]), and 3 (a:[20,40]); node 3 is excluded from query broadcast and result collection!]
Parent Selection for SRTs

• Idea: a node picks the parent whose ancestors’ interval most overlaps its descendants’ interval

[Figure: joining node 4 with interval [3,6] chooses among candidate parents 1 [1,10], 2 [7,15], and 3 [20,40] under root 0.]

  [3,6] ∩ [1,10]  = [3,6]
  [3,6] ∩ [7,15]  = ∅
  [3,6] ∩ [20,40] = ∅

• Other selection policies in paper!
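The interval-overlap rule above is small enough to write out -- a sketch using the slide's figure values, with intervals treated as closed ranges:

```python
# Sketch of SRT parent selection as on the slide: a joining node picks the
# candidate parent whose interval most overlaps its own descendants' interval.

def overlap(a, b):
    """Length of the intersection of two closed intervals (0 if disjoint)."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return max(0, hi - lo)

def pick_parent(my_interval, candidates):
    """candidates: {parent_id: interval}. Return the best-overlapping parent."""
    return max(candidates, key=lambda p: overlap(my_interval, candidates[p]))

# Node 4 covers [3,6]; candidate parents as in the figure.
parents = {1: (1, 10), 2: (7, 15), 3: (20, 40)}
print(pick_parent((3, 6), parents))  # -> 1, since [3,6] ∩ [1,10] = [3,6]
```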
Simulation Result

[Chart: active nodes (0-450) vs. query size as % of value range (0, 0.05, 0.1, 0.2, 0.5, 1), comparing Best Case (Expected), Closest Parent, and All Nodes. Uniform value distribution, 20x20 grid, ideal connectivity to (8) neighbors.]
Outline
• TinyDB
  – And demo!
• Aggregate Queries
• ACQP
• Break
• Adaptive Operator Placement
• …
Adaptive & Decentralized Operator Placement

• IPSN 2003 paper
• Main idea:
  – Place operators near data sources
    • Greater operator rate ⇒ closer placement
  – For each operator:
    • Explore candidate neighbors
    • Migrate to lower-cost placements
    • Via extra messages
• Proper placement depends on path lengths and relative rates (Rate A vs. Rate B)!
“Adaptivity” in Databases
• Adaptivity: changing query plans on the fly
  – Typically at the physical level
    • Where the plan runs
    • Ordering of operators
    • Instantiations of operators, e.g. hash join vs. merge join
  – Non-traditional
    • Conventionally, complete plans are built prior to execution
    • Using cost estimates (collected from history)
  – Important in volatile or long-running environments
    • Where a priori estimates are unlikely to be good
    • E.g., sensor networks
Adaptivity for Operator Placement

• Adaptivity comes at a cost
  – Extra work on each operator, each tuple
  – In a DBMS, processing per tuple is small
    • 100’s of instructions per operator
    • Unless you have to hit the disk!
• Costs in this case?
  – Extra communication hurts
    • Finding candidate placements (exploration)
      – Cost advertisements from local node
      – New costs from candidates
    • Moving state (migration)
      – Joins, windowed aggregates
Do Benefits Justify Costs?
• Not evaluated!
  – 3x reduction in messages vs. external
  – Excluding exploration & migration costs
• Seems somewhat implausible, especially given added complexity
  – Hard to make migration protocol work
    • Depends on ability to reliably quiesce child ops.
• What else could you do?
  – Static placement
Summary
• Declarative QP
  – Simplify data collection in sensornets
  – In-network processing, query optimization for performance
• Acquisitional QP
  – Focus on costs associated with sampling data
  – New challenge of sensornets and other streaming systems?
• Adaptive join placement
  – In-network optimization
  – Some benefit, but practicality unclear
  – Operator pushdown still a good idea
Open Problems
• Many; a few:
  – In-network storage and operator placement
  – Dealing with heterogeneity
  – Dealing with loss
• Need real implementations of many of these ideas
• See me! ([email protected])
Questions / Discussion
Making TinyDB REALLY Work

• Berkeley Botanical Garden: first “real deployment”
• Requirements:
  – At least 1 month unattended operation
  – Support for calibrated environmental sensors
  – Multi-hop routing
• What we started with:
  – Limited power management, no time synchronization
  – Motes crashed hard occasionally
  – Limited, relatively untested multihop routing
Power Consumption in Sensornets

• Waking current ~12 mA
  – Fairly evenly spread between sensors, processor, radio
• Sleeping current 20-100 µA
• Power consumption dominated by sensing, reception:
  – 1 s power-up on Mica2Dot sensor board
  – Most mote apps use “always on” radio
    • Completely unstructured communication
    • Bad for battery life
Why Not Use TDMA?
• CSMA is very flexible: easy for new nodes to join
  – Reasonably scalable (relative to Bluetooth)
• CSMA implemented, available
  – We wanted to build something that worked
Power Management Approach
• Coarse-grained communication scheduling

[Figure: motes 1-5 share a 2-4 s waking period at the start of each epoch (10s-100s of seconds) and sleep (“zzz”) for the remainder.]
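The scheduling scheme's power payoff is simple arithmetic: average current draw is set by the duty cycle (waking period / epoch duration). A quick sketch, reusing the 12 mA / ~100 µA figures from the power-consumption slide:

```python
# Back-of-the-envelope model of coarse-grained communication scheduling:
# average current = duty_cycle * waking current + (1 - duty_cycle) * sleep
# current. Waking/sleep figures are from the earlier slide; the rest is
# illustrative arithmetic.

def avg_current_ma(waking_s, epoch_s, awake_ma=12.0, sleep_ma=0.1):
    duty = waking_s / epoch_s
    return duty * awake_ma + (1 - duty) * sleep_ma

# The garden deployment: 3 s waking period per 30 s epoch -> 10% duty cycle.
print(round(avg_current_ma(3, 30), 2))  # average mA draw
```

At a 10% duty cycle the average draw falls roughly an order of magnitude below the always-on 12 mA, which is what makes month-scale lifetimes plausible.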
Benefits / Drawbacks
• Benefits
  – Can still use CSMA within the waking period
    • No reservation required: new nodes can join easily!
  – Waking period duration is easily tunable
    • Depending on network size
• Drawbacks
  – Longer waking time vs. TDMA?
    • Could stagger slots based on tree depth
  – No “guaranteed” slot reservation
    • Nothing is guaranteed anyway
Challenges
• Time synchronization
• Fault recovery
• Joining, starting, and stopping
• Parameter tuning
  – Waking period: hardcode at 4 s?
Time Synchronization
• All messages include a 5-byte timestamp indicating system time in ms
  – Synchronize (e.g. set local system time to timestamp) with:
    • Any message from parent
    • Any new query message (even if not from parent)
      – Punt on multiple queries
  – Timestamps written just after preamble is xmitted
• All nodes agree that the waking period begins when (system time % epoch duration == 0)
  – And lasts for WAKING_PERIOD ms
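The wake/sleep decision implied above can be sketched directly -- a toy model, with the epoch and waking-period values borrowed from the garden deployment:

```python
# Sketch of the wake/sleep rule from the slide: after adopting a parent's
# timestamp as local time, a node is awake iff its offset within the current
# epoch falls inside the waking period.

EPOCH_MS = 30_000         # epoch duration (30 s, as in the garden deployment)
WAKING_PERIOD_MS = 3_000  # waking period (3 s)

def is_awake(system_time_ms):
    """Waking period starts when system_time % EPOCH_MS == 0."""
    return system_time_ms % EPOCH_MS < WAKING_PERIOD_MS

def sync(local_time_ms, parent_timestamp_ms):
    """On any message from the parent, adopt its timestamp as local time."""
    return parent_timestamp_ms

t = sync(12_345, 60_500)    # parent says we're 500 ms into an epoch
print(is_awake(t))          # -> True  (500 < 3000)
print(is_awake(t + 5_000))  # -> False (5500 ms into the epoch)
```

Because every node applies the same modulo rule to a (roughly) shared clock, waking periods line up across the network without any explicit slot reservation.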
Wrinkles

• If a node hasn’t heard from its parent for k epochs
  – Switch to “always on” mode for 1 epoch
• If the waking period ends
  – Punt on this epoch
• Query dissemination / deletion via always-on basestation and viral propagation
  – Data messages as advertisements for running queries
  – Explicit requests for missing queries
  – Explicit “dead query” messages for stopped queries
    • Don’t request the last dead query
Results
• 30 nodes in the Garden
  – Lasted 21 days at a 10% duty cycle
    • Sample period = 30 s, waking period = 3 s
  – Next deployment:
    • 1-2% duty cycle
    • Fixes to clock to reduce baseline power
    • Should last at least 60 days
• Time sync test: 2,500 readings
  – 5 readings “out of synch”
  – 15 s epoch duration, 3 s waking period
Garden Data

[Figure: sensor placement on a 36 m tree -- node 111 at 33 m; 110 at 32 m; 109, 108, 107 at 30 m; 106, 105, 104 at 20 m; 103, 102, 101 at 10 m.]

[Chart: humidity vs. time for nodes 101, 104, 109, 110, 111, roughly 35-95% RH.]

[Chart: temperature vs. time, roughly 8-33 °C, from 7/7/03 9:40 through 7/9/03 10:03.]
Data Quality

[Chart: sensor id vs. number of results (0-12,000), summarized by parent sensor. Per-sensor yields range from about 3% to 100% of expected results, with most nodes delivering roughly 35-75%.]