
In-Network Query Processing

Sam Madden CS294-1 9/30/03

Outline

• TinyDB
  – And demo!
• Aggregate Queries
• ACQP
• Break
• Adaptive Operator Placement
• …


Programming Sensor Nets Is Hard

• Months of lifetime required from small batteries
  – 3-5 days naively; can't recharge often
  – Must interleave sleep with processing
• Lossy, low-bandwidth, short-range communication
  » Nodes coming and going
  » Multi-hop
• Remote, zero-administration deployments
• Highly distributed environment
• Limited development tools
  » Embedded; LEDs for debugging!
⇒ Need a high-level abstraction

A Solution: Declarative Queries

• Users specify the data they want
  – Simple, SQL-like queries
  – Using predicates, not specific addresses
  – Our system: TinyDB
• Challenge is to provide:
  – An expressive & easy-to-use interface
  – High-level operators
  – "Transparent optimizations" that many programmers would miss
    » Sensor-net-specific techniques
  – A power-efficient execution framework

TinyDB Demo

TinyDB Architecture

SELECT AVG(temp) WHERE light > 400

[Architecture diagram: queries enter and results return through the multihop network. The on-mote query processor contains an aggregation operator (avg(temp), emitting tuples such as <T:1, AVG:225>, <T:2, AVG:250>), a selection (light > 400), tables, and samples acquired via the schema: got('temp') invokes getTempFunc(…).]

Schema: a "catalog" of commands & attributes, e.g.:
  Name: temp
  Time to sample: 50 µs
  Cost to sample: 90 µJ
  Calibration table: 3
  Units: deg. F
  Error: ± 5 deg. F

TinyDB is ~10,000 lines of embedded C code (3x larger than the 2nd-largest TinyOS program).

Declarative Queries for Sensor Networks

"Find the sensors in bright nests."

Example 1:

  SELECT nodeid, nestNo, light
  FROM sensors
  WHERE light > 400
  EPOCH DURATION 1s

Results:

  Epoch  Nodeid  nestNo  Light
  0      1       17      455
  0      2       25      389
  1      1       17      422
  1      2       25      405

Aggregation Queries

Example 2:

  SELECT AVG(sound)
  FROM sensors
  EPOCH DURATION 10s

"Count the number of occupied nests in each loud region of the island."

Example 3:

  SELECT region, CNT(occupied), AVG(sound)
  FROM sensors
  GROUP BY region
  HAVING AVG(sound) > 200
  EPOCH DURATION 10s

Results (regions w/ AVG(sound) > 200):

  Epoch  region  CNT(…)  AVG(…)
  0      North   3       360
  0      South   3       520
  1      North   3       370
  1      South   3       520

Benefits of Declarative Queries

• Specification of "whole-network" behavior
• Simple, safe
• Complex behavior via multiple queries and app logic
• Optimizable
  – Exploit (non-obvious) interactions, e.g.: ACQP operator ordering, adaptive join operator placement, lifetime selection, topology selection
• Versus other approaches, e.g., Diffusion
  – Black-box "filter" operators
  – Intanagonwiwat, "Directed Diffusion", MobiCom 2000

Outline

• TinyDB
  – And demo!
• Aggregate Queries
• ACQP
• Break
• Adaptive Operator Placement
• …

Tiny Aggregation (TAG)

• Not in today's reading
• In-network processing of aggregates
  – A common data analysis operation
    » Aka a gather operation, or a reduction in parallel programming
  – Communication-reducing
    » Operator-dependent benefit
  – Exploits query semantics to improve efficiency!

Madden, Franklin, Hellerstein, Hong. Tiny AGgregation (TAG). OSDI 2002.

Query Propagation Via Tree-Based Routing

• Tree-based routing
  – Used in:
    » Query delivery
    » Data collection
  – Topology selection is important; e.g.:
    » Krishnamachari, DEBS 2002; Intanagonwiwat, ICDCS 2002; Heidemann, SOSP 2001
    » LEACH/SPIN: Heinzelman et al., MOBICOM 99
    » SIGMOD 2003
  – A continuous process
    » Mitigates failures

[Diagram: the root broadcasts Q: SELECT … down the tree of nodes A-F; each node rebroadcasts Q to its children, and results R:{…} flow back up the tree edges.]

Basic Aggregation

• In each epoch:
  – Each node samples its local sensors once
  – Generates a partial state record (PSR)
    » Local readings
    » Readings from children
  – Outputs its PSR during its assigned communication interval
    » Communication scheduling for power reduction
• At the end of the epoch, the PSR for the whole network is output at the root
• A new result is produced on each successive epoch
• Extras:
  – Predicate-based partitioning via GROUP BY

[Diagram: five-node routing tree, each node labeled with its communication slot.]

Illustration: Aggregation

SELECT COUNT(*) FROM sensors

[Animation over five frames: a table of interval number (1-5) vs. sensor # (1-5) tracks each node's partial COUNT as it is transmitted up the routing tree during that node's assigned communication interval; by the last interval the root has merged everything and outputs the total count of 5. A toy code walk-through follows.]
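To make the frames concrete, here is a toy C walk-through of the same COUNT: each node's PSR is a single integer, and a parent adds its children's counts to its own before forwarding in its slot. The tree shape and all names are assumptions for illustration, not TinyDB code.

#include <stdio.h>

/* PSR for COUNT is one integer; merging is addition. */
static int count_subtree(int node, const int children[][2], const int n_children[]) {
    int psr = 1;  /* this node's local reading contributes 1 */
    for (int i = 0; i < n_children[node]; i++)
        psr += count_subtree(children[node][i], children, n_children);
    return psr;   /* transmitted upward during this node's comm. interval */
}

int main(void) {
    /* Assumed tree: node 1 is the root; 2 and 3 under 1; 4 and 5 under 3. */
    const int children[6][2] = { {0,0}, {2,3}, {0,0}, {4,5}, {0,0}, {0,0} };
    const int n_children[6]  = {   0,     2,     0,     2,     0,     0   };
    printf("COUNT(*) = %d\n", count_subtree(1, children, n_children)); /* 5 */
    return 0;
}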

Aggregation Framework

• As in extensible databases, TAG supports any aggregation function conforming to:

  Agg_n = {f_init, f_merge, f_evaluate}

  f_init{a_0}            → <a_0>           (a partial state record, PSR)
  f_merge{<a_1>, <a_2>}  → <a_12>
  f_evaluate{<a_1>}      → aggregate value

Example: AVERAGE

  AVG_init{v}                      → <v, 1>
  AVG_merge{<S_1,C_1>, <S_2,C_2>}  → <S_1+S_2, C_1+C_2>
  AVG_evaluate{<S, C>}             → S/C

Restriction: merge must be associative and commutative.
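A minimal C sketch of this three-function interface, instantiated for AVERAGE. Names such as psr_t, avg_init, and avg_merge are illustrative, not TinyDB's actual API.

#include <stdio.h>

typedef struct { long sum; long count; } psr_t;  /* partial state record */

psr_t  avg_init(long v)            { psr_t p = { v, 1 }; return p; }
psr_t  avg_merge(psr_t a, psr_t b) { psr_t p = { a.sum + b.sum, a.count + b.count }; return p; }
double avg_evaluate(psr_t p)       { return (double)p.sum / (double)p.count; }

int main(void) {
    /* One child merges two readings, then the root merges in its own. */
    psr_t child = avg_merge(avg_init(225), avg_init(275));  /* <500, 2> */
    psr_t root  = avg_merge(child, avg_init(250));          /* <750, 3> */
    printf("AVG = %.1f\n", avg_evaluate(root));             /* 250.0    */
    return 0;
}

Because merge is associative and commutative, PSRs may be combined in whatever order and grouping the routing tree happens to deliver them.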

Types of Aggregates

• SQL supports MIN, MAX, SUM, COUNT, AVERAGE
• Any function over a set can be computed via TAG
• In-network benefit for many operations
  – E.g., standard deviation, top/bottom N, spatial union/intersection, histograms, etc.
  – Benefit depends on the compactness of the PSR

Taxonomy of Aggregates

• TAG insight: classify aggregates according to various functional properties
  – Yields a general set of optimizations that can be applied automatically
• Properties:
  – Partial state
  – Monotonicity
  – Exemplary vs. summary
  – Duplicate sensitivity
• Drives an API!

Partial State

• Growth of PSR vs. number of aggregated values (n):
  – Distributive:      |PSR| = 1  (e.g., MIN)
  – Algebraic:         |PSR| = c  (e.g., AVG)
  – Holistic:          |PSR| = n  (e.g., MEDIAN)
  – Unique:            |PSR| = d  (e.g., COUNT DISTINCT), where d = # of distinct values
  – Content-sensitive: |PSR| < n  (e.g., HISTOGRAM)
  (Taxonomy after "Data Cube", Gray et al.)

Property: partial state. Examples: MEDIAN: unbounded PSR; MAX: 1 record. Affects: the effectiveness of TAG.

Benefit of In-Network Processing

Simulation results: 2500 nodes, 50x50 grid, depth ≈ 10, ≈ 20 neighbors, sensor values uniformly distributed over [0,100].

[Chart: total bytes transmitted vs. aggregation function, comparing EXTERNAL (no in-network processing) against in-network MEDIAN (holistic), COUNT DISTINCT (unique), AVERAGE (algebraic), and MAX (distributive).]

• The benefit is aggregate- and depth-dependent!

Outline

• TinyDB
  – And demo!
• Aggregate Queries
• ACQP
• Break
• Adaptive Operator Placement
• …

Acquisitional Query Processing (ACQP)

• Traditional DBMS: processes data already in the system
• Acquisitional DBMS: generates the data in the system!

An acquisitional query processor controls when, where, and with what frequency data is collected, versus traditional systems where data is provided a priori.

ACQP: What’s Different?

• Basic acquisitional processing
  – Continuous queries, with rates or lifetimes
  – Events for asynchronous triggering
• Avoiding acquisition through optimization
  – Sampling as a query operator
• Choosing where to sample via co-acquisition
  – Index-like data structures
• Acquiring data from the network
  – Prioritization, summary, and rate control

Lifetime Queries

• Lifetime vs. sample rate:

  SELECT … EPOCH DURATION 10 s
  SELECT … LIFETIME 30 days

• Extra: allow a MAX SAMPLE PERIOD
  – Discard some samples
  – Sampling is cheaper than transmitting

(Single-Node) Lifetime Prediction

SELECT nodeid, light LIFETIME 24 weeks

[Chart: voltage vs. time, measured vs. expected, for a lifetime goal of 24 weeks (4032 hours at 15 s per sample). A linear fit to the measured voltage (R^2 = 0.8455) tracks the expected curve; below V = 350 there is insufficient voltage to operate. A back-of-envelope sketch of the lifetime-to-rate computation follows.]
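The LIFETIME clause implies a simple translation into a sample period: spread the node's energy budget over the requested lifetime. A back-of-envelope C sketch; every constant here is an assumption for illustration, not a measured value from the deployment.

#include <stdio.h>

int main(void) {
    double lifetime_h     = 24.0 * 7.0 * 24.0; /* LIFETIME 24 weeks = 4032 h */
    double battery_mAh    = 2000.0;            /* assumed battery capacity   */
    double sleep_mA       = 0.05;              /* assumed sleep current      */
    double per_sample_mAh = 0.002;             /* assumed sample + xmit cost */

    /* Energy left for sampling after the always-present sleep baseline. */
    double budget_mAh = battery_mAh - sleep_mA * lifetime_h;
    double n_samples  = budget_mAh / per_sample_mAh;
    double period_s   = lifetime_h * 3600.0 / n_samples;

    printf("sample period = %.1f s\n", period_s); /* ~16 s with these numbers */
    return 0;
}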

Operator Ordering: Interleave Sampling + Selection

SELECT light, mag
FROM sensors
WHERE pred1(mag) AND pred2(light)
EPOCH DURATION 1s

• Traditional DBMS: both attributes are already present; σ(pred1) and σ(pred2) are just filters over them
• ACQP: sampling is itself a costly operator to be ordered
  – Costly plan: sample mag, then sample light, then apply both predicates
  – Cheap plan: sample light, apply σ(pred2), and sample mag only for tuples that pass
• The cheap ordering is correct unless pred1 is very selective and pred2 is not
• At 1 sample/sec, total power savings could be as much as 3.5 mW, comparable to the processor!

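A minimal sketch of the cheap ACQP ordering. The sensor hooks and predicate thresholds are hypothetical; the point is that the expensive magnetometer is sampled only when the cheap light predicate passes.

#include <stdbool.h>

extern int sample_light(void);  /* cheap sensor     */
extern int sample_mag(void);    /* expensive sensor */

static bool pred1(int mag)   { return mag > 100; }   /* illustrative */
static bool pred2(int light) { return light > 400; } /* illustrative */

/* Returns true iff the tuple passes both predicates; short-circuits
   acquisition so mag is never sampled when pred2 already fails. */
bool acquire_and_filter(int *light_out, int *mag_out) {
    int light = sample_light();
    if (!pred2(light)) return false;   /* mag not sampled this epoch */
    int mag = sample_mag();
    if (!pred1(mag)) return false;
    *light_out = light;
    *mag_out   = mag;
    return true;
}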
Exemplary Aggregate Pushdown

SELECT WINMAX(light, 8s, 8s)
FROM sensors
WHERE mag > x
EPOCH DURATION 1s

• Traditional DBMS: WINMAX sits above σ(mag > x), so both attributes are sampled every epoch
• ACQP: push the exemplary aggregate below the selection
  – Apply σ(mag > x) only after σ(light > MAX), i.e., only when the new light reading could change the window maximum
  – Mag sampling is the most expensive operation!
• A novel, general pushdown technique
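A hedged sketch of the pushdown. Sensor hooks are hypothetical and the periodic 8 s window reset is omitted for brevity; the point is that mag is sampled only when the new light reading could become the window maximum.

#include <limits.h>

extern int  sample_light(void);  /* cheap     */
extern int  sample_mag(void);    /* expensive */
extern void emit(int light);

static int win_max = INT_MIN;    /* current window maximum */

void per_epoch(int x) {          /* x: threshold from WHERE mag > x */
    int light = sample_light();
    if (light <= win_max) return;    /* cannot change WINMAX: skip mag */
    if (sample_mag() > x) {          /* predicate checked only now     */
        win_max = light;
        emit(light);
    }
}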

Event-Based Processing

• Epochs are synchronous
• Might want to issue queries in response to asynchronous events
  – Avoids unnecessary "polling"

  CREATE TABLE birds (uint16 cnt)
    SIZE 1 CIRCULAR

  ON EVENT bird-enter(…)
    SELECT b.cnt + 1
    FROM birds AS b
    OUTPUT INTO b
    ONCE

• In-network storage
• Placement subject to optimization

Attribute-Driven Network Construction

• Goal: co-acquisition, i.e., sensors that sample together route together
• Observation: queries are often over a limited area
  – Or some other subset of the network, e.g., regions with light value in [10,20]
• Idea: build the network topology so that like-valued nodes route through each other
  – For range queries
  – Over relatively static attributes (e.g., location)
• Maintenance issues

Tracking Co-Acquisition Via Semantic Routing Trees (SRTs)

• Idea: send range queries only to participating nodes
  – Parents maintain the value ranges of their descendants

  SELECT …
  WHERE a > 5 AND a < 12

[Diagram: root node 4 with children 1 (a: [1,10]), 2 (a: [7,15]), and 3 (a: [20,40]). Node 3's range does not overlap the query, so its subtree is excluded from both query broadcast and result collection!]

• Intervals are precomputed, and reported by children as they join

Parent Selection for SRTs

• Idea: a node picks the parent whose ancestors' interval most overlaps its descendants' interval
  – Example: node 4, with interval [3,6], chooses among nodes 1 [1,10], 2 [7,15], and 3 [20,40] under root 0:

    [3,6] ∩ [1,10]  = [3,6]
    [3,6] ∩ [7,15]  = ∅
    [3,6] ∩ [20,40] = ∅

    so node 4 picks node 1 (a toy sketch of both SRT decisions follows)
• Other selection policies in the paper!
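A toy C sketch of both SRT decisions: overlap tests drive query forwarding, and overlap size drives parent selection. The intervals are the ones from these slides; the code itself is illustrative, not TinyDB's implementation. Closed integer intervals are assumed, so WHERE a > 5 AND a < 12 becomes [6,11].

#include <stdio.h>

typedef struct { int lo, hi; } interval_t;

/* Do two closed intervals intersect? */
static int overlaps(interval_t a, interval_t b) {
    return (a.lo > b.lo ? a.lo : b.lo) <= (a.hi < b.hi ? a.hi : b.hi);
}

/* Length of the intersection; -1 marks the empty interval. */
static int overlap_len(interval_t a, interval_t b) {
    int lo = a.lo > b.lo ? a.lo : b.lo;
    int hi = a.hi < b.hi ? a.hi : b.hi;
    return hi >= lo ? hi - lo : -1;
}

int main(void) {
    interval_t parents[3] = { {1,10}, {7,15}, {20,40} }; /* nodes 1, 2, 3 */

    /* Parent selection: node 4's descendants cover [3,6]. */
    interval_t mine = { 3, 6 };
    int best = 0;
    for (int i = 1; i < 3; i++)
        if (overlap_len(mine, parents[i]) > overlap_len(mine, parents[best]))
            best = i;
    printf("node 4 picks parent %d\n", best + 1);        /* parent 1 */

    /* Query dissemination: WHERE a > 5 AND a < 12. */
    interval_t query = { 6, 11 };
    for (int i = 0; i < 3; i++)
        printf("child %d %s\n", i + 1,
               overlaps(parents[i], query) ? "participates" : "is excluded");
    return 0;
}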

Simulation Result

[Chart: active nodes (0-450) vs. query size as a fraction of the value range (0, 0.05, 0.1, 0.2, 0.5, 1), comparing Best Case (Expected), Closest Parent, and All Nodes. Uniform value distribution, 20x20 grid, ideal connectivity to 8 neighbors.]

Outline

• TinyDB
  – And demo!
• Aggregate Queries
• ACQP
• Break
• Adaptive Operator Placement
• …


Adaptive & Decentralized Operator Placement

• IPSN 2003 paper

Main Idea

• Place operators near data sources
  – The greater an input's rate, the closer the operator should sit to it
• For each operator:
  – Explore candidate neighbors
  – Migrate to lower-cost placements
  – Via extra messages
• Proper placement depends on path lengths and the relative rates of the inputs (Rate A vs. Rate B); see the cost sketch below.
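A hedged sketch of the cost model the slide implies, with made-up rates and hop counts: each candidate placement is scored by rate times hop count, summed over the input streams and the output path, and the operator migrates toward the cheapest candidate.

#include <stdio.h>

int main(void) {
    double rate_a = 10.0, rate_b = 2.0;  /* tuples/sec from sources A and B */
    double out_rate = 1.0;               /* assumed join output rate        */
    /* Candidate placements: {hops from A, hops from B, hops to the sink}. */
    int hops[3][3] = { {1, 4, 3}, {3, 2, 3}, {4, 4, 0} };

    int best = 0; double best_cost = 1e18;
    for (int i = 0; i < 3; i++) {
        double cost = rate_a * hops[i][0] + rate_b * hops[i][1]
                    + out_rate * hops[i][2];
        printf("candidate %d: %.1f msgs/sec\n", i, cost);
        if (cost < best_cost) { best_cost = cost; best = i; }
    }
    /* High-rate input A pulls the operator toward itself: candidate 0 wins. */
    printf("migrate toward candidate %d\n", best);
    return 0;
}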

“Adaptivity” in Databases

• Adaptivity: changing query plans on the fly
  – Typically at the physical level:
    » Where the plan runs
    » Ordering of operators
    » Instantiations of operators, e.g., hash join vs. merge join
  – Non-traditional
    » Conventionally, complete plans are built prior to execution, using cost estimates collected from history
  – Important in volatile or long-running environments
    » Where a priori estimates are unlikely to be good, e.g., sensor networks

Adaptivity for Operator Placement

• Adaptivity comes at a cost
  – Extra work on each operator, for each tuple
  – In a DBMS, processing per tuple is small
    » 100s of instructions per operator, unless you have to hit the disk!
• Costs in this case?
  – Extra communication hurts
    » Finding candidate placements (exploration): cost advertisements from the local node, new costs from candidates
    » Moving state (migration): joins, windowed aggregates

Do Benefits Justify Costs?

• Not evaluated!
  – 3x reduction in messages vs. external processing
  – But excluding exploration & migration costs
• Seems somewhat implausible, especially given the added complexity
  – Hard to make the migration protocol work
    » Depends on the ability to reliably quiesce child operators
• What else could you do?
  – Static placement

Summary

• Declarative QP
  – Simplifies data collection in sensornets
  – In-network processing and query optimization for performance
• Acquisitional QP
  – Focuses on the costs associated with sampling data
  – A new challenge of sensornets and other streaming systems?
• Adaptive join placement
  – In-network optimization
  – Some benefit, but practicality unclear
  – Operator pushdown is still a good idea

Open Problems

• Many; a few:
  – In-network storage and operator placement
  – Dealing with heterogeneity
  – Dealing with loss
• Need real implementations of many of these ideas
• See me! ([email protected])

Questions / Discussion

Making TinyDB REALLY Work

• Berkeley Botanical Garden: first "real" deployment
• Requirements:
  – At least 1 month of unattended operation
  – Support for calibrated environmental sensors
  – Multi-hop routing
• What we started with:
  – Limited power management, no time synchronization
  – Motes crashed hard occasionally
  – Limited, relatively untested multihop routing

Power Consumption in Sensornets

• Waking current ~12 mA
  – Fairly evenly spread between sensors, processor, and radio
• Sleeping current 20-100 µA
• Power consumption is dominated by sensing and reception:
  – 1 s power-up on the Mica2Dot sensor board
  – Most mote apps use an "always on" radio
    » Completely unstructured communication
    » Bad for battery life

Why Not Use TDMA?

• CSMA is very flexible: easy for new nodes to join
  – Reasonably scalable (relative to Bluetooth)
• CSMA is implemented and available
  – We wanted to build something that worked

Power Management Approach

• Coarse-grained communication scheduling

[Diagram: each epoch (10s to 100s of seconds) begins with a 2-4 s waking period in which all motes (IDs 1-5) communicate; for the rest of the epoch they sleep ("zzz"). A sketch of the resulting event loop follows.]
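A minimal sketch of that schedule as a mote-side loop. The platform hooks (system_time_ms, deep_sleep_ms, idle_wait_ms, radio on/off) and constants are hypothetical, not TinyDB's implementation.

#include <stdint.h>

#define EPOCH_MS  30000u   /* e.g., a 30 s sample period         */
#define WAKING_MS  3000u   /* 3 s waking period = 10% duty cycle */

extern uint32_t system_time_ms(void);
extern void deep_sleep_ms(uint32_t ms);  /* radio & sensors off */
extern void idle_wait_ms(uint32_t ms);   /* radio stays on      */
extern void radio_on(void);
extern void radio_off(void);
extern void sample_and_send(void);

void epoch_loop(void) {
    for (;;) {
        uint32_t t = system_time_ms() % EPOCH_MS;
        if (t >= WAKING_MS) {                 /* outside waking period:  */
            deep_sleep_ms(EPOCH_MS - t);      /* sleep to epoch boundary */
            continue;
        }
        radio_on();                           /* CSMA inside the period  */
        sample_and_send();
        uint32_t u = system_time_ms() % EPOCH_MS;
        if (u < WAKING_MS)
            idle_wait_ms(WAKING_MS - u);      /* keep listening          */
        radio_off();
    }
}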

Benefits / Drawbacks

• Benefits
  – Can still use CSMA within the waking period
    » No reservation required: new nodes can join easily!
  – Waking period duration is easily tunable, depending on network size
• Drawbacks
  – Longer waking time vs. TDMA?
    » Could stagger slots based on tree depth
  – No "guaranteed" slot reservation
    » Nothing is guaranteed anyway

Challenges

• Time synchronization
• Fault recovery
• Joining, starting, and stopping
• Parameter tuning
  – Waking period: hardcode at 4 s?

Time Synchronization

• All messages include a 5-byte timestamp indicating system time in ms
  – Synchronize (i.e., set local system time to the timestamp) on:
    » Any message from the parent
    » Any new query message (even if not from the parent)
  – Punt on multiple queries
  – Timestamps are written just after the preamble is transmitted
• All nodes agree that the waking period begins when (system time % epoch duration == 0)
  – And lasts for WAKING_PERIOD ms
(A sketch of the synchronization rule follows.)
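A sketch of that rule: adopt the sender's timestamp from any message sent by the parent, or from any new query message. The message layout and setter are hypothetical stand-ins for the real 5-byte wire format.

#include <stdint.h>

extern void set_system_time_ms(uint64_t t);

typedef struct {
    uint16_t sender_id;
    uint8_t  is_new_query;  /* new query announcement?              */
    uint64_t timestamp_ms;  /* 5-byte field in the real wire format */
} msg_t;

void on_receive(const msg_t *m, uint16_t my_parent_id) {
    /* Sync on any parent message, or on any new query message. */
    if (m->sender_id == my_parent_id || m->is_new_query)
        set_system_time_ms(m->timestamp_ms);
    /* The waking period then starts whenever system time % epoch dur == 0. */
}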

Wrinkles

• If a node hasn't heard from its parent for k epochs
  – Switch to "always on" mode for 1 epoch
• If the waking period ends
  – Punt on this epoch
• Query dissemination / deletion via an always-on basestation and viral propagation
  – Data messages serve as advertisements for running queries
  – Explicit requests for missing queries
  – Explicit "dead query" messages for stopped queries
    » Don't request the last dead query

Results

• 30 nodes in the Garden
  – Lasted 21 days at a 10% duty cycle (sample period = 30 s, waking period = 3 s)
  – Next deployment: 1-2% duty cycle, plus clock fixes to reduce baseline power; should last at least 60 days
• Time sync test: 2,500 readings
  – 5 readings "out of sync"
  – 15 s epoch duration, 3 s waking period

Garden Data

Sensor placement by height (tree top ≈ 36 m): 33 m: node 111; 32 m: 110; 30 m: 109, 108, 107; 20 m: 106, 105, 104; 10 m: 103, 102, 101.

[Charts: humidity vs. time and temperature vs. time for nodes 101, 104, 109, 110, and 111, from 7/7/03 9:40 through 7/9/03 10:03.]

Data Quality

[Chart: sensor id vs. number of results delivered, summarized by parent sensor. Per-node delivery ranged from 3% to 100% of expected results, with most nodes between 35% and 75%.]