iSEE: Efficient k-Nearest-Neighbor Monitoring over Moving Objects Wei Wu, Kian-Lee Tan


iSEE: Efficient k-Nearest-Neighbor
Monitoring over Moving Objects
[SSDBM 2007]
Wei Wu, Kian-Lee Tan
National University of Singapore
Problem Settings
• Given a query point q, continuously report the k nearest objects of q
• Objects and query points move in an unpredictable fashion
• Objects and query points can be indexed in main memory
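To make the problem statement concrete, here is a minimal brute-force reference, not the iSEE method itself. It recomputes the k nearest objects from scratch after every movement, which is the cost the monitoring algorithms aim to avoid. The function name `knn_brute_force` and the object ids are hypothetical.

```python
import heapq
import math

def knn_brute_force(q, objects, k):
    """Return the ids of the k objects closest to q, nearest first.

    q is an (x, y) tuple; objects maps an object id to its (x, y) position.
    """
    return heapq.nsmallest(k, objects, key=lambda oid: math.dist(q, objects[oid]))

# One monitoring step: a position changes, then the answer is recomputed.
objects = {"a": (1.0, 1.0), "b": (4.0, 4.0), "c": (2.0, 0.0)}
q = (0.0, 0.0)
first = knn_brute_force(q, objects, 2)
objects["b"] = (0.5, 0.0)  # object b moves next to the query
second = knn_brute_force(q, objects, 2)
```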
Related Work
• Divide the space into conceptual rectangles
• Initialize the heap with the first-level rectangles and cq
• De-heap the top entry e repeatedly until mindist(e,q) > best_dist:
  • If e is a rectangle, en-heap all cells in e
  • If e is a cell, check the objects in it
Conceptual Partitioning: An Efficient Method for Continuous Nearest Neighbor Monitoring
Kyriakos Mouratidis, Marios Hadjieleftheriou, Dimitris Papadias
[SIGMOD 2005]
Motivations
• CPM visits unnecessary cells during an update
• If some nearest neighbor moves away from the query, the new answer cannot lie in any cell c for which maxdist(c,q) < best_dist, where best_dist is the distance of the kth nearest neighbor from q
• All the shaded cells are visited
• Ideally, the update should start from the cells that intersect the circle
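The pruning rule above can be sketched as follows, assuming square grid cells addressed by integer indices (i, j) with side length `size`. The helper names `mindist`, `maxdist` and `intersects_circle` are illustrative and not taken from the paper.

```python
import math

def mindist(cell, q, size=1.0):
    """Smallest distance from point q to the square cell (i, j)."""
    (i, j), (qx, qy) = cell, q
    dx = max(i * size - qx, 0.0, qx - (i + 1) * size)
    dy = max(j * size - qy, 0.0, qy - (j + 1) * size)
    return math.hypot(dx, dy)

def maxdist(cell, q, size=1.0):
    """Largest distance from q to any point of the cell (its farthest corner)."""
    (i, j), (qx, qy) = cell, q
    dx = max(abs(qx - i * size), abs(qx - (i + 1) * size))
    dy = max(abs(qy - j * size), abs(qy - (j + 1) * size))
    return math.hypot(dx, dy)

def intersects_circle(cell, q, best_dist, size=1.0):
    """True if the circle of radius best_dist around q passes through the cell."""
    return mindist(cell, q, size) <= best_dist <= maxdist(cell, q, size)

# Cells lying fully inside the circle (maxdist < best_dist) are skipped.
q = (0.5, 0.5)
row = [(0, 0), (1, 0), (2, 0), (3, 0)]
on_circle = [c for c in row if intersects_circle(c, q, 2.0)]
```

In this example only cell (2, 0) is cut by the circle of radius 2.0: cells (0, 0) and (1, 0) lie entirely inside it, and (3, 0) lies entirely outside.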
Motivations
• The heap is large because rectangles do not approximate circles very well
• All the cells in any rectangle Rec are inserted into the heap once r becomes greater than mindist(Rec,q)
• All cells in the four rectangles are en-heaped. Ideally, only the cells that intersect the circle should be en-heaped
How to alleviate these problems?
CircularTrip
VOB
Visit Order Build (VOB)
• Each group LiGj has either 4 or 8 cells
• The cells in any group LiGj have similar min-dist from q
• The min-dist of a group LiGj from q is smaller than the min-dist of L(i+1)Gj
• The min-dist of a group LiGj from q is smaller than the min-dist of LiG(j+1)
• The min-dist of LiGj is the minimum of the min-distances from q to the cells in LiGj
Initial Computation
• Initially, en-heap cq and group L1G1, each with its min-dist
• For each de-heaped entry:
  • If it is a cell c:
    • Look in c for potential NNs
    • Store c in visit_list
  • If it is a group LiGj:
    • En-heap all cells in LiGj with their min-distances
    • En-heap the next-level group L(i+1)Gj with its min-dist
    • If i = j, en-heap the next group of the same level, LiG(j+1)
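The de-heap loop above can be sketched as follows. The exact geometry of the groups LiGj comes from the slide figures and is not reproduced here, so the group structure is hidden behind two hypothetical callables, `cells_of(group)` and `successors(group)`. The rule for which groups follow a given group, including the i = j case, would live inside `successors`. The toy `ring_grouping` (square rings around cq) is only a stand-in to make the sketch runnable and is not the VOB grouping.

```python
import heapq
import math
from itertools import count

def mindist(cell, q, size):
    (i, j), (qx, qy) = cell, q
    dx = max(i * size - qx, 0.0, qx - (i + 1) * size)
    dy = max(j * size - qy, 0.0, qy - (j + 1) * size)
    return math.hypot(dx, dy)

def initial_knn(q, k, grid, size, first_group, cells_of, successors):
    """Best-first search over cells and groups; returns (result ids, visit_list)."""
    tie = count()  # tie-breaker so the heap never compares payloads
    heap = []

    def push_cell(c):
        heapq.heappush(heap, (mindist(c, q, size), next(tie), "cell", c))

    def push_group(g):
        cells = cells_of(g)
        if cells:  # group min-dist = minimum over the min-dists of its cells
            key = min(mindist(c, q, size) for c in cells)
            heapq.heappush(heap, (key, next(tie), "group", g))

    push_cell((int(q[0] // size), int(q[1] // size)))  # cq
    push_group(first_group)
    best, visit_list = [], []
    while heap:
        best_dist = best[-1][0] if len(best) == k else math.inf
        if heap[0][0] > best_dist:
            break  # nothing left in the heap can improve the answer
        _, _, kind, item = heapq.heappop(heap)
        if kind == "cell":
            found = [(math.dist(q, pos), oid) for oid, pos in grid.get(item, [])]
            best = sorted(best + found)[:k]
            visit_list.append(item)
        else:
            for c in cells_of(item):
                push_cell(c)
            for g in successors(item):
                push_group(g)
    return [oid for _, oid in best], visit_list

def ring_grouping(cq, n):
    """Toy stand-in: group i is the square ring at Chebyshev distance i from cq."""
    cx, cy = cq
    def cells_of(i):
        return [(x, y)
                for x in range(cx - i, cx + i + 1)
                for y in range(cy - i, cy + i + 1)
                if max(abs(x - cx), abs(y - cy)) == i and 0 <= x < n and 0 <= y < n]
    def successors(i):
        return [i + 1] if i + 1 < n else []
    return cells_of, successors

grid = {(2, 2): [("a", (2.6, 2.5))], (0, 0): [("b", (0.5, 0.5))],
        (3, 2): [("c", (3.5, 2.5))], (2, 4): [("d", (2.5, 4.2))]}
cells_of, successors = ring_grouping((2, 2), 5)
result, visited = initial_knn((2.5, 2.5), 2, grid, 1.0, 1, cells_of, successors)
```

The entries left in `heap` when the loop stops correspond to the search heap H that the slides keep for later updates. This sketch discards them for brevity.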
Data Structure
• Each cell in the grid stores an object list and an influence list
• best_NN stores the nearest neighbors among the visited cells
• The search heap H contains the cells and groups that were en-heaped but not yet de-heaped (enables quick updates)
• visit_list stores the min- and max-distances of all the cells that were de-heaped
Update Handling
• If an object x moves inside the circle:
  • Include x in the result and delete the current kth NN
  • (Update influence lists) Go backward through visit_list (descending order), deleting q from the influence list of every cell c for which min-dist(c,q) > new best_dist
• If a result object x moves outside the circle:
  • Delete x from the result set
  • Start from the beginning of visit_list and skip the cells for which max-dist(c,q) < best_dist
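Both update cases reduce to a scan over visit_list, sketched below. The sketch assumes visit_list holds `(cell, min_dist, max_dist)` tuples in de-heap order (ascending min-dist) and that influence lists are a dict mapping each cell to a set of query ids. Both layouts and the function names are illustrative assumptions.

```python
def cells_to_rescan(visit_list, best_dist):
    """A result object left the circle: scan from the start of visit_list,
    skipping cells that lie entirely inside the old circle."""
    return [cell for cell, _, max_d in visit_list if max_d >= best_dist]

def shrink_influence(visit_list, influence, query_id, new_best_dist):
    """An object entered the circle: walk visit_list backward and remove the
    query from cells that the smaller circle no longer reaches."""
    cleared = []
    for cell, min_d, _ in reversed(visit_list):
        if min_d <= new_best_dist:
            break  # ascending min-dist order: every earlier cell is still reached
        influence[cell].discard(query_id)
        cleared.append(cell)
    return cleared

visit_list = [((0, 0), 0.0, 0.7), ((1, 0), 0.5, 1.6), ((2, 0), 1.5, 2.5)]
influence = {cell: {"q1"} for cell, _, _ in visit_list}
rescan = cells_to_rescan(visit_list, 1.0)
cleared = shrink_influence(visit_list, influence, "q1", 1.0)
```

Because visit_list is ordered by min-dist, the backward walk can stop at the first cell that is still within new best_dist instead of touching every visited cell.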
Experiments
Experiments
Memory Comparison with CPM
• The data structures of iSEE are the same as those of CPM, but
  • CPM has a larger search heap
• The visit_list of both CPM and iSEE contains the same number of cells, but
  • iSEE also stores max-dist for each cell
• Let r be the distance of the kth NN from q
• CPM memory:
  • search heap: $2(4r^2 - \pi r^2) \approx 1.72r^2$
  • visit_list: $2\pi r^2$
  • Total: $8r^2$
• iSEE memory:
  • visit_list: $3\pi r^2$
  • Total: $3\pi r^2$ + search heap
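The arithmetic behind these totals can be checked with a few lines. The values below are the coefficients of r² from the slide. Treating r as measured in cell widths, so that an area counts cells, is an assumption of this sketch.

```python
import math

# Coefficients of r^2, taken from the slide's formulas.
cpm_heap = 2 * (4 - math.pi)       # 2(4r^2 - pi r^2), about 1.72
cpm_visit = 2 * math.pi            # 2 pi r^2, about 6.28
cpm_total = cpm_heap + cpm_visit   # the pi terms cancel, leaving 8

isee_visit = 3 * math.pi           # 3 pi r^2, about 9.42 (plus the search heap)
```

The CPM total is exactly 8r² because the πr² terms of the heap and the visit_list cancel. The iSEE visit_list alone is about 9.42r², before its search heap is added.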
iSEE vs CircularTrip
• CircularTrip doesn't need any visit_list or search heap
• When a query moves, iSEE computes from scratch whereas CircularTrip uses previous information
• VOB is almost as expensive as CircularTrip: two consecutive CircularTrips may visit some cells (at most 27%) twice, whereas VOB computes max-dist for each cell. Let C be the number of cells that need to be visited during computation; CircularTrip computes distances for at most 1.27C cells, whereas VOB computes distances for 2C cells.
• The influence-list update of iSEE is faster than that of CircularTrip because iSEE stores visit_list (but a lazy update approach can be used in both algorithms)
• CircularTrip uses less memory (50%-85% of iSEE), and the running times of both algorithms are estimated to be similar
CircularTrip is more flexible
• ArcTrip returns all the cells that intersect a circle and lie in a specified angle range <θ1,θ2>
• ArcTrip can be used to continuously monitor constrained nearest neighbor queries optimally (visiting a minimal set of cells)
• Extending iSEE to constrained NN queries makes it inefficient
• Recall that six constrained NN queries need to be continuously monitored for continuous monitoring of RNN queries
CircularTrip is more flexible
• ArcTrip can be used to monitor constrained NN queries over irregular regions efficiently
• CircularTrip can also be used to efficiently monitor farthest neighbor queries
• Farthest neighbors in constrained regions can also be monitored
• For all these algorithms, CircularTrip preserves its property that it needs no book-keeping information and visits the minimum number of cells