Data Stream Sharing

Lehrstuhl Informatik III: Datenbanksysteme
Richard Kuntschke and Alfons Kemper
Fakultät für Informatik
Technische Universität München
Germany
15/07/2015
Overview

Introduction and Motivation

Subscription Language

Data Stream Sharing

Conclusion and Outlook
Preliminaries

The StreamGlobe Data Stream Management System (DSMS)
(VLDB 2005)

Grid-based P2P network
- Super-Peers
- Thin-Peers
- Speaker-Peer

XML Data Streams

(W)XQuery Subscriptions

Query processing with FluX (VLDB 2004)
Goals and Challenges

Optimize incrementally registered subscriptions

Find suitable input data in the network

Reduce network traffic and peer load
Optimization Techniques

Data Stream Sharing
- In-network query processing
- Multi-subscription optimization

Treat queries and data streams symmetrically

Cost-based optimizer
Motivation
[Figure: the example network (super-peer backbone SP0 to SP7, peers P0 to P4, the photons stream, and Queries 1 to 4) shown twice, once without data stream sharing and once with data stream sharing.]
Example Data Stream
photons
  photon*
    coord
      cel
        ra
        dec
      det
        dx
        dy
    phc
    en
    det_time
Example Data
[Figure: images of the Vela Supernova Remnant and the RXJ0852.0-4622 Supernova Remnant.]
WXQuery (Windowed XQuery)
<photons>
  for $p in stream("photons")/photons/photon
  where $p/coord/cel/ra >= 120.0
    and $p/coord/cel/ra <= 138.0
    and $p/coord/cel/dec >= -49.0
    and $p/coord/cel/dec <= -40.0
  return
    <vela>
      {$p/coord/cel/ra} {$p/coord/cel/dec}
      {$p/phc} {$p/en} {$p/det_time}
    </vela>
</photons>
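To make the semantics of this subscription concrete, the following minimal Python sketch applies the same selection and projection to an in-memory list of photon records. The flat dictionary layout and the function name `vela_photons` are assumptions made for illustration only; they are not part of StreamGlobe, which evaluates WXQuery over XML data streams.

```python
from typing import Dict, Iterable, Iterator

# Field names mirror the element names used in the query. The flat dict
# layout is an illustrative assumption, not the StreamGlobe data model.
Photon = Dict[str, float]


def vela_photons(stream: Iterable[Photon]) -> Iterator[Photon]:
    """Keep photons inside the queried region and project five fields."""
    for p in stream:
        # Selection: the four range predicates of the where clause.
        if 120.0 <= p["ra"] <= 138.0 and -49.0 <= p["dec"] <= -40.0:
            # Projection: only the elements listed in the return clause.
            yield {k: p[k] for k in ("ra", "dec", "phc", "en", "det_time")}


sample = [
    {"ra": 130.0, "dec": -45.0, "phc": 1.0, "en": 1.5,
     "det_time": 3.0, "dx": 7.0, "dy": 9.0},
    {"ra": 100.0, "dec": -45.0, "phc": 2.0, "en": 0.9,
     "det_time": 4.0, "dx": 1.0, "dy": 2.0},
]
result = list(vela_photons(sample))
```

Only the first sample record passes the selection, and its detector coordinates `dx` and `dy` are dropped by the projection.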
WXQuery (Windowed XQuery)
<photons>
  for $w in stream("photons")/photons/photon
      [coord/cel/ra >= 120.0 and
       coord/cel/ra <= 138.0 and
       coord/cel/dec >= -49.0 and
       coord/cel/dec <= -40.0]
      |/photon/det_time diff 20 step 10|
  let $a := avg($w/photon/en)
  return
    <avg_en> {$a} </avg_en>
</photons>
Data Windows
|/photon/det_time diff 20 step 10|
[Figure: windows defined by this expression along the det_time axis, ticks 0 to 40.]
|count 4 step 2|
[Figure: windows defined by this expression along the det_time axis.]
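The two window kinds on this slide can be illustrated with a small Python sketch. It assumes that a `diff d step s` window covers the half-open interval [k*s, k*s + d) of det_time values, starting at 0, and that a `count n step m` window holds n consecutive items and advances by m items. The boundary handling and the helper names `time_windows` and `count_windows` are assumptions for illustration, not the WXQuery definition.

```python
from typing import List, Sequence


def count_windows(items: Sequence[int], size: int, step: int) -> List[List[int]]:
    """Windows of `size` consecutive items, advancing by `step` items."""
    return [list(items[i:i + size])
            for i in range(0, len(items) - size + 1, step)]


def time_windows(times: Sequence[int], diff: int, step: int,
                 horizon: int) -> List[List[int]]:
    """Windows [start, start + diff) over det_time, advancing by `step`.

    Only windows that end at or before `horizon` are produced.
    """
    windows = []
    start = 0
    while start + diff <= horizon:
        windows.append([t for t in times if start <= t < start + diff])
        start += step
    return windows


det_times = [1, 5, 12, 18, 25, 33]
by_time = time_windows(det_times, diff=20, step=10, horizon=40)
by_count = count_windows(det_times, size=4, step=2)
```

With these sample values, `by_time` holds three overlapping windows covering [0, 20), [10, 30), and [20, 40), while `by_count` holds two windows of four items each that overlap in two items.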
Query 1
<photons>
  for $p in stream("photons")/photons/photon
  where $p/coord/cel/ra >= 120.0
    and $p/coord/cel/ra <= 138.0
    and $p/coord/cel/dec >= -49.0
    and $p/coord/cel/dec <= -40.0
  return
    <vela>
      {$p/coord/cel/ra} {$p/coord/cel/dec}
      {$p/phc} {$p/en} {$p/det_time}
    </vela>
</photons>
Abstract Properties of Query 1
Properties (Query 1)
Stream: photons
Operator: σ
Condition: [Figure: graph over the nodes ra, dec, and 0 with edge labels 138.0, -120.0, -40.0, and 49.0]
Operator: π
Condition: {ra●, dec●, phc●, en●, det_time●}
Query 2
<photons>
  for $w in stream("photons")/photons/photon
      [coord/cel/ra >= 120.0 and
       coord/cel/ra <= 138.0 and
       coord/cel/dec >= -49.0 and
       coord/cel/dec <= -40.0]
      |/photon/det_time diff 20 step 10|
  let $a := avg($w/photon/en)
  return
    <avg_en> {$a} </avg_en>
</photons>
Abstract Properties of Query 2
Properties (Query 3)
DataStream: photons
Operator: σ
Condition: [Figure: graph over the nodes ra, dec, and 0 with edge labels 138.0, -120.0, -40.0, and 49.0]
Operator: avg●
Condition: en
Condition: |/photon/det_time diff 20 step 10|
Data Stream Discovery and Cost Model

Data Stream Discovery
- Start at origin of referenced stream
- Search forward (BFS or DFS) in the network graph
- Pruning
Cost Model
- Parameters
  - Network traffic
  - Computational load on peers
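The discovery steps above can be sketched as a breadth-first search in Python. The slide does not state the pruning criterion, so this sketch assumes one: a branch is abandoned as soon as the stream variant at a peer can no longer serve the new subscription. The function name `discover`, the `successors` map, the `usable` predicate, and the edges between peers are all hypothetical.

```python
from collections import deque
from typing import Callable, Dict, List


def discover(origin: str,
             successors: Dict[str, List[str]],
             usable: Callable[[str], bool]) -> List[str]:
    """Return peers, in BFS order, whose stream variant is usable.

    The search starts at the origin of the referenced stream and follows
    the directions in which the stream is forwarded. Branches behind an
    unusable variant are pruned (assumed pruning rule).
    """
    found: List[str] = []
    seen = {origin}
    queue = deque([origin])
    while queue:
        peer = queue.popleft()
        if not usable(peer):
            continue  # prune: do not look further down this branch
        found.append(peer)
        for nxt in successors.get(peer, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return found


# Hypothetical forwarding edges; they do not reproduce the slide's figure.
successors = {"SP4": ["SP5", "SP6"], "SP5": ["SP1"], "SP6": ["SP7"]}
candidates = discover("SP4", successors, usable=lambda peer: peer != "SP6")
```

Here the variant at `SP6` is declared unusable, so `SP7` behind it is never visited. A cost model would then rank the remaining candidates by network traffic and peer load.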
Data Stream Discovery Example
[Figure: the example network with the photons stream, super-peer backbone SP0 to SP7, peers P0 to P4, and Queries 1 to 4, illustrating data stream discovery.]
Window-based Aggregation
[Figure: numbered windows of Query 3 (1 to 10) and Query 4 (1 to 2) along the det_time axis.]
Query 3: |/photon/det_time diff 20 step 10|
Query 4: |/photon/det_time diff 60 step 40|
• 60 mod 20 = 0        • 60 div 20 = 3
• 40 mod 10 = 0        • 40 div 10 = 4
• 20 mod 10 = 0        • 20 div 10 = 2
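The divisibility conditions above suggest how Query 4 windows could be assembled from Query 3 windows. The following Python sketch is one possible reading: each size-60 window is covered by 3 disjoint size-20 windows, which are every 2nd Query 3 window, and consecutive Query 4 windows start 4 Query 3 windows apart. Since an average cannot be merged directly, the sketch assumes that the shared stream carries (sum, count) pairs. The function names, that partial-aggregate format, and the half-open window boundaries are assumptions, not details taken from the slides.

```python
from typing import List, Optional, Tuple

Event = Tuple[int, float]  # (det_time, en)


def window_partials(events: List[Event], diff: int, step: int,
                    n_windows: int) -> List[Tuple[float, int]]:
    """(sum, count) of en for the windows [k*step, k*step + diff)."""
    out = []
    for k in range(n_windows):
        lo, hi = k * step, k * step + diff
        vals = [en for t, en in events if lo <= t < hi]
        out.append((sum(vals), len(vals)))
    return out


def combine(partials: List[Tuple[float, int]], diff_src: int, step_src: int,
            diff_dst: int, step_dst: int) -> List[Optional[float]]:
    """Averages of the coarser windows, built from finer-window partials."""
    if diff_dst % diff_src or step_dst % step_src or diff_src % step_src:
        raise ValueError("window definitions are not compatible")
    per_window = diff_dst // diff_src  # 60 div 20 = 3
    advance = step_dst // step_src     # 40 div 10 = 4
    stride = diff_src // step_src      # 20 div 10 = 2
    averages: List[Optional[float]] = []
    first = 0
    while first + (per_window - 1) * stride < len(partials):
        picked = [partials[first + j * stride] for j in range(per_window)]
        total = sum(s for s, _ in picked)
        count = sum(c for _, c in picked)
        averages.append(total / count if count else None)
        first += advance
    return averages


def direct_avg(events: List[Event], lo: int, hi: int) -> float:
    """Average of en over [lo, hi), computed without sharing."""
    vals = [en for t, en in events if lo <= t < hi]
    return sum(vals) / len(vals)


events = [(t, float(t % 7)) for t in range(100)]
q3_partials = window_partials(events, diff=20, step=10, n_windows=10)
q4_averages = combine(q3_partials, 20, 10, 60, 40)
```

With ten Query 3 windows, the sketch yields two Query 4 results, for [0, 60) and [40, 100), and each agrees with the average computed directly from the raw events.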
Performance Evaluation – Preliminaries



4 x 4 Grid Topology
- 16 Super-Peers
2 Data Streams
- Real astrophysical data
- photons data streams
100 Queries
- Randomly generated
- Query templates for selection/projection/aggregation queries
- Constant values for selection predicates and data window definitions randomly chosen from a predefined set
Performance Evaluation – Peer Load
[Figure: average CPU load (%) per super-peer (axis labeled SP0 to SP14, scale 0 to 12) for Data Shipping, Query Shipping, and Stream Sharing.]
Performance Evaluation – Network Traffic
[Figure: network traffic (MBit) per super-peer (axis labeled SP0 to SP14, scale 0 to 200) for Data Shipping, Query Shipping, and Stream Sharing.]
Conclusion

What has been presented:
- Subscription language
- Properties approach
- Cost model
- Algorithms for data stream sharing
Data Stream Sharing takes three steps:
- Properties construction
- Identification of shareable streams through properties matching
- Plan generation, installation, and execution
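The second step, properties matching, can be illustrated with a simplified Python sketch. It assumes that properties consist of the origin stream name, one closed value range per selected element, and the set of projected elements, and that an existing stream can serve a new subscription if it is no more restrictive than the subscription and still carries every element the subscription needs. The `Properties` class and the `can_serve` function are hypothetical; the properties structure presented in the talk is richer (for example, it also covers aggregation and data windows).

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple


@dataclass
class Properties:
    stream: str                              # origin data stream
    ranges: Dict[str, Tuple[float, float]]   # selection: element -> (low, high)
    elements: Set[str]                       # projection: elements kept


def can_serve(available: Properties, wanted: Properties) -> bool:
    """True if `available` contains everything `wanted` needs (assumed rule)."""
    if available.stream != wanted.stream:
        return False
    # Each restriction already applied must be implied by the new query.
    for name, (lo, hi) in available.ranges.items():
        if name not in wanted.ranges:
            return False
        w_lo, w_hi = wanted.ranges[name]
        if w_lo < lo or w_hi > hi:
            return False
    # The new query must still find its output and predicate elements.
    needed = wanted.elements | set(wanted.ranges)
    return needed <= available.elements


region = {"ra": (120.0, 138.0), "dec": (-49.0, -40.0)}
query1 = Properties("photons", dict(region),
                    {"ra", "dec", "phc", "en", "det_time"})
# Input required by the windowed average before aggregation.
avg_input = Properties("photons", dict(region), {"en", "det_time"})
wider = Properties("photons", {"ra": (110.0, 138.0), "dec": (-49.0, -40.0)},
                   {"en"})

shareable = can_serve(query1, avg_input)
not_shareable = can_serve(query1, wider)
```

In this example the result stream of Query 1 can feed the windowed average, because both use the same region and Query 1 keeps `en` and `det_time`. It cannot feed a subscription that asks for a wider `ra` range.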
Outlook

Advanced Data Stream Sharing
- Improved properties structure
- Support for nested queries
- Data stream widening
- Dynamic optimizer
Scalability
- Hierarchical network organization
- Fully distributed network organization