Data on Air: Organization and Access
T. Imielinski, S. Viswanathan, and B.R. Badrinath
Presented by Qinhai Xia

Motivation
Power conservation:
• Processor: AT&T Hobbit chip. Active mode: 250 mW; doze mode: 50 µW
• Broadcasting: Quotrex system over FM channel

Parameters of Concern
[Figure: a client listening to the channel during intervals T1, T2, T3, ..., Tn]
• Tuning time: T1 + T2 + T3 + … + Tn
• Latency: Tn - T1

Some Definitions
[Figure: a bcast made up of buckets I0 I1 I2 (index segment) followed by buckets D3 D4 D5 D6 D7 D8 D9 (data segment)]
Each bucket carries:
• Bucket ID
• Bcast pointer
• Index pointer
• Bucket type

Latency OPT
[Figure: Previous bcast | File | Next bcast]
Best latency: no index overhead.
Latency = Data/2 + C
Tuning time = Data/2 + C

Tuning OPT
[Figure: Previous bcast | Index | File | Next bcast]
Best tuning time:
Latency = (Data + Index)/2 + (Data + Index)/2 + C = Data + Index + C
Tuning time = k + C
k: number of levels in the index tree

(1, m) Indexing
[Figure: Previous bcast | Index 1 | Data 1 | Index 2 | Data 2 | Index 3 | ... | Index m | Data m | Next bcast]
[Figure: access timeline. Labels: Tune in, Next Index, Continuous Pointer. Client states: Active, Doze, Active, Doze, Active (retrieving)]

(1, m) Indexing, Continued
Analysis:
Latency = (Index + Data/m)/2 + (m*Index + Data)/2 + C
        = ((m + 1)*Index + (1/m + 1)*Data)/2 + C
Tuning time = 1 + k + C
k: number of levels in the index tree

Distributed Indexing
• Nonreplicated Distribution: different index segments are disjoint.
• Entire Path Replication: the path from the root to an index bucket B is replicated.
• Partial Path Replication (Distributed Indexing): between two index buckets B and B', it is enough to replicate just the path from their least common ancestor.

Comparison
(Bracketed groups are index buckets; the numbers after each group are the data buckets that follow it.)

[R a1 b1 c1 c2 c3] 0-8    [b2 c4 c5 c6] 9-17    [b3 c7 c8 c9] 18-26
Nonreplicated Distribution

[R a1 b1 c1 c2 c3] 0-8    [R a1 b2 c4 c5 c6] 9-17    [R a1 b3 c7 c8 c9] 18-26
Full Path Replication

[R a1 b1 c1 c2 c3] 0-8      [a1 b2 c4 c5 c6] 9-17      [a1 b3 c7 c8 c9] 18-26
[R a2 b4 c10 c11 c12] 27-35 [a2 b5 c13 c14 c15] 36-44  [a2 b6 c16 c17 c18] 45-53
Partial Path Replication
(Distributed Indexing)

Distributed Indexing Analysis
• r: level of the index tree
• Level[r]: number of nodes on the r-th level of the index tree
• Index[r]: size of the top r levels of the index tree
• Index_r: additional index overhead, Index_r = Level[r+1] - 1
Latency = ((Index - Index[r])/Level[r+1] + Data/Level[r+1] + Data + Index + Index_r)/2 + C
Tuning time = 2 + k + C

(1, m) vs. Distributed Indexing
• Distributed indexing usually has a much lower latency than (1, m).
• Distributed indexing costs only one more bucket of tuning time than (1, m).

Nonclustered Indexing
[Figure: records (a, b, c) broadcast in meta segments, grouped by the value of b within each meta segment; each group is preceded by an index segment (IS)]
Meta segment 1:
  b = 1: (a2 b1 c1) (a2 b1 c3) (a4 b1 c2)
  b = 2: (a3 b2 c1) (a1 b2 c2)
  b = 3: (a1 b3 c1) (a4 b3 c2) (a2 b3 c1)
Meta segment 2:
  b = 1: (a1 b1 c2) (a4 b1 c3)
  b = 3: (a3 b3 c3) (a1 b3 c3) (a4 b3 c2)
Figure annotations:
• Pointer to the next segment
• Offset to DS (K < P < L)
• Pointer to the next IS for value b

Nonclustered Indexing Protocol
[Figure: Previous bcast | Meta segment: Index b=1, Index b=2, Index b=3 | Meta segment: Index b=1, ...]
[Figure: access timeline. Labels: Tune in, Next Index, Continuous Pointer. Client states: Active, Doze, Active, Doze, Active (retrieving)]

Nonclustered Indexing Analysis
Latency = (1/2) * ((Index - Index[r])/Level[r+1] + Data/(M * Level[r+1]) + M*(Index + Index_r) + Data)
Tuning time: [formula garbled in the source; surviving terms: 1, (k + 1), C, M]

Real Example
Quotrex system
• Information: 160,000 bytes
• Bucket length: 128 bytes
• Bandwidth: 10 kbps

Algorithm                  Latency (s)   Tuning time,      Tuning time,     Energy
                                         active mode (s)   doze mode (s)    consumption (J)
Latency_OPT                62.5          62.5              0                15.625
Tuning_OPT                 130.3         0.4               129.9            0.106
(1, m), m = 5              90.9          0.5               90.4             0.13
Distributed Indexing       68.9          0.6               68.3             0.153
Non Cluster Latency_OPT    125           125               0                31.25
Non Cluster Tuning_OPT     187.5         2.2               185.3            0.559
Non Cluster Indexing       132.4         2.4               130              0.607
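
The energy column of the Real Example table can be checked from the active and doze times in the same rows. The sketch below is illustrative only. It assumes an active-mode draw of 250 mW, which is the value that reproduces the table's 15.625 J for 62.5 s of active time, and a doze-mode draw of 50 µW. The names `ACTIVE_W`, `DOZE_W`, `ROWS`, `energy_joules`, and `energy_table` are hypothetical and do not come from the slides.

```python
# Assumed power draw in watts. ACTIVE_W is inferred from the table
# (15.625 J / 62.5 s = 0.25 W); DOZE_W is the doze-mode figure of 50 microwatts.
ACTIVE_W = 0.25
DOZE_W = 50e-6

# (seconds in active mode, seconds in doze mode), copied from the table rows.
ROWS = {
    "Latency_OPT": (62.5, 0.0),
    "Tuning_OPT": (0.4, 129.9),
    "(1, m), m = 5": (0.5, 90.4),
    "Distributed Indexing": (0.6, 68.3),
    "Non Cluster Latency_OPT": (125.0, 0.0),
    "Non Cluster Tuning_OPT": (2.2, 185.3),
    "Non Cluster Indexing": (2.4, 130.0),
}


def energy_joules(active_s, doze_s):
    """Energy in joules for one access: time in each mode times its power."""
    return active_s * ACTIVE_W + doze_s * DOZE_W


def energy_table():
    """Return {algorithm name: energy in joules} for every row in ROWS."""
    return {name: energy_joules(a, d) for name, (a, d) in ROWS.items()}


if __name__ == "__main__":
    for name, joules in energy_table().items():
        print(f"{name:26s} {joules:8.3f} J")
```

Under these assumptions the computed values agree with the table to within rounding. Nearly all of the energy comes from active-mode time, which is why the indexed schemes consume so much less than the Latency_OPT schemes despite their higher latency.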