Spatial Data Structures

Snehal Thakkar
Spatial Data Structures
Hanan Samet
Computer Science Department
University of Maryland
Spatial Data Structures
• Introduction
• Spatial Indexing
• Region Data
• Point Data
• Rectangle Data
• Line Data
• Conclusion
Introduction
• Spatial Objects
- Points, lines, regions, rectangles, ...
• Spatial Indexing
- Unlike conventional data, the sort has to be based on the space occupied by the data
• Hierarchical Data Structures
- Based on recursive decomposition, similar to the divide-and-conquer method
Spatial Indexing
• Mapping Spatial Data into Points
- The representative point may be in the same, a higher, or a lower dimension
- Good for storage purposes and for queries such as intersection
- Problematic for queries such as nearest neighbor
• Bucketing Methods
- Grid file, BANG file, LSD trees, buddy trees, ...
- Buckets are based not on the representative point but on the actual space occupied
R-tree
• Based on Minimum Bounding Rectangle
[Figure: R-tree with bounding rectangles R1 through R6; leaf contents shown: a b; d g h; c i; e f]
R-Trees (Continued)
• Organize spatial objects into d-dimensional rectangles.
• Each node in the tree corresponds to the smallest d-dimensional rectangle that encloses its child nodes.
• If an object is spatially contained in several nodes, it is stored in only one of them.
• Tree parameters are adjusted so that a small number of pages is visited during a spatial query.
• All leaf nodes appear at the same level.
• Each entry in a leaf node (e.g. R3, R4, ...) is a pair (R,O), where R is the smallest rectangle containing data object O.
R-trees (Continued)
• Each entry in a non-leaf node (e.g. R1, R2) is a pair (R,P), where R is the smallest rectangle containing all rectangles in the child node that P points to.
• An R-tree of order (m,M) means that each node in the tree, except the root, has between m ≤ ⌈M/2⌉ and M entries. The root has at least two entries unless it is a leaf node.
• The R-tree is not unique: the rectangles depend on how objects are inserted into and deleted from the tree.
• The problem is that finding an object may require searching several rectangles, or in the worst case the whole database.
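The search behaviour described above can be sketched in a few lines. This is a minimal illustration, not a full R-tree: there is no insertion, splitting, or (m,M) balancing, and the Node class, the helper names, and the sample rectangles are all assumptions made for the example. It shows why a query may have to descend into several nodes when their bounding rectangles overlap.

```python
from dataclasses import dataclass

def intersects(a, b):
    """True if rectangles (xmin, ymin, xmax, ymax) share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def enclose(rects):
    """Smallest rectangle containing every rectangle in rects."""
    x0, y0, x1, y1 = zip(*rects)
    return (min(x0), min(y0), max(x1), max(y1))

@dataclass
class Node:
    entries: list        # leaf: [(rect, obj)], non-leaf: [Node]
    leaf: bool
    rect: tuple = None   # minimum bounding rectangle, computed below

    def __post_init__(self):
        if self.leaf:
            rects = [rect for rect, _ in self.entries]
        else:
            rects = [child.rect for child in self.entries]
        self.rect = enclose(rects)

def search(node, query, visited=None):
    """Return objects whose rectangle intersects query; optionally log visited nodes."""
    if visited is not None:
        visited.append(node)
    if node.leaf:
        return [obj for rect, obj in node.entries if intersects(rect, query)]
    hits = []
    for child in node.entries:
        if intersects(child.rect, query):   # several children may qualify
            hits.extend(search(child, query, visited))
    return hits

# Hypothetical data: two leaves whose bounding rectangles overlap.
leaf1 = Node([((0, 0, 4, 4), "a"), ((3, 3, 6, 6), "b")], leaf=True)
leaf2 = Node([((5, 5, 9, 9), "c"), ((8, 0, 10, 2), "d")], leaf=True)
root = Node([leaf1, leaf2], leaf=False)

visited = []
print(search(root, (5, 5, 6, 6), visited), len(visited))
```

The query window falls inside both leaf rectangles, so both leaves are visited even though each contributes only one object.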
R+-Trees
• Decomposition of Space into Disjoint Cells
[Figure: R+-tree with disjoint cells R1 through R6; leaf contents shown: d g h; c h i; a b e i; c f i]
R+-Trees (Continued)
• R+-trees and cell trees decompose space into cells
• R+-trees deal with collections of objects bounded by rectangles
• Cell trees deal with collections of objects bounded by convex polyhedra
• The R+-tree is an extension of the k-d-B-tree
• Bounding rectangles are kept from overlapping
• If an object lies in multiple rectangles, it appears multiple times
R+-Trees (Continued)
• Multiple paths may lead from the root to the same object
• Height of the tree is increased
• Retrieval times are smaller
• When summing over the objects, duplicates must be eliminated
• It is not possible to guarantee that all properties of B-trees are fulfilled without complicated insertion and deletion routines
• It is data-dependent: the R+-tree differs depending on the order in which records are inserted and deleted
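The duplicate-elimination point can be illustrated with a flat set of disjoint cells. This is only a sketch of the idea, not an R+-tree: there is no hierarchy, and the two cells, the function names, and the sample objects are assumptions chosen for the example.

```python
def overlaps(a, b):
    """True if rectangles (xmin, ymin, xmax, ymax) overlap with positive area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

# Hypothetical disjoint cells that tile the space [0,10] x [0,10].
cells = {
    "left": (0, 0, 5, 10),
    "right": (5, 0, 10, 10),
}

def build(objects):
    """Store each object in every cell that its bounding rectangle overlaps."""
    index = {name: [] for name in cells}
    for obj, rect in objects.items():
        for name, cell in cells.items():
            if overlaps(rect, cell):
                index[name].append(obj)
    return index

def window_query(index, objects, window):
    """Collect objects overlapping the window; the set removes duplicates."""
    found = set()
    for name, cell in cells.items():
        if overlaps(cell, window):
            found.update(o for o in index[name] if overlaps(objects[o], window))
    return found

objects = {"a": (1, 1, 3, 3), "b": (4, 4, 7, 6), "c": (8, 8, 9, 9)}
index = build(objects)
print(index)
print(window_query(index, objects, (3, 3, 8, 8)))
```

Object b straddles the boundary at x = 5, so it is stored in both cells. Counting list entries would report it twice, which is why the query gathers results in a set.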
More Spatial Indexing
• Uniform Grid
- Ideal for uniformly distributed data
- More data-independent than R+-trees
- Space is decomposed into blocks of uniform size
- Higher overhead
• Quadtree
- Space is decomposed based on the data points
- Sensitive to the positioning of the objects
- Width of the blocks is restricted to a power of two
- Good for set-theoretic operations, such as composition of data
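A uniform grid is simple enough to sketch directly. The class below is an assumed minimal design for point data (the names UniformGrid, insert, and range_query are illustrative): cell boundaries are fixed by cell_size alone, so the decomposition never depends on the data.

```python
from collections import defaultdict

class UniformGrid:
    """Fixed-size square cells; the decomposition is independent of the data."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)   # (col, row) -> points in that cell

    def _key(self, x, y):
        return (int(x // self.cell_size), int(y // self.cell_size))

    def insert(self, x, y):
        self.cells[self._key(x, y)].append((x, y))

    def range_query(self, xmin, ymin, xmax, ymax):
        """Return the points inside the closed query rectangle."""
        c0, r0 = self._key(xmin, ymin)
        c1, r1 = self._key(xmax, ymax)
        hits = []
        for col in range(c0, c1 + 1):        # only cells touching the window
            for row in range(r0, r1 + 1):
                for (x, y) in self.cells.get((col, row), []):
                    if xmin <= x <= xmax and ymin <= y <= ymax:
                        hits.append((x, y))
        return hits

grid = UniformGrid(cell_size=10)
for p in [(3, 4), (12, 7), (15, 18), (40, 40)]:
    grid.insert(*p)
print(grid.range_query(0, 0, 16, 10))
```

With skewed data most cells stay empty while a few become crowded, which is the situation where a data-driven decomposition such as a quadtree tends to do better.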
Region Data
• Focus on interior representation
• Represented as an image array of pixels
• Runlength Code
- Break the array into 1 × m blocks (a row representation)
• Medial Axis Transformation (MAT)
- Union of maximal square blocks
- Blocks may overlap
- Blocks are specified by center and radius
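As a small illustration of the runlength code, the sketch below encodes each row of a binary image as 1 × m runs of black pixels. The (row, start column, length) triple and the function name are assumptions of this example; other encodings of the same idea exist.

```python
def runlength_encode(image):
    """Encode a binary image as (row, start_col, length) runs of 1-pixels."""
    runs = []
    for r, row in enumerate(image):
        start = None
        for c, pixel in enumerate(row):
            if pixel and start is None:
                start = c                           # a run of 1s begins
            elif not pixel and start is not None:
                runs.append((r, start, c - start))  # the run just ended
                start = None
        if start is not None:                       # run reaches the row end
            runs.append((r, start, len(row) - start))
    return runs

image = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
]
print(runlength_encode(image))
```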
More Region Data
• Region Quadtree
- Is a variant of the Medial Axis Transformation whose blocks are required:
- To be disjoint
- To have standard sizes (squares whose sides are powers of two)
- To be at standard locations
- Based on successive subdivision of the image array into four equal-sized quadrants
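The successive subdivision can be sketched as a short recursive function. The nested-tuple representation ('B', 'W', or a 4-tuple of children), the NW, NE, SW, SE child order, and the convention that row 0 is the northern edge are assumptions of this sketch.

```python
def build_quadtree(image, row=0, col=0, size=None):
    """Return 'B', 'W', or a 4-tuple of subtrees in NW, NE, SW, SE order.

    Assumes a square binary image whose side is a power of two,
    with row 0 taken as the northern edge.
    """
    if size is None:
        size = len(image)
    pixels = {image[r][c]
              for r in range(row, row + size)
              for c in range(col, col + size)}
    if pixels == {1}:
        return "B"          # uniformly black block: leaf
    if pixels == {0}:
        return "W"          # uniformly white block: leaf
    half = size // 2        # mixed block: gray node with four children
    return (
        build_quadtree(image, row, col, half),                # NW
        build_quadtree(image, row, col + half, half),         # NE
        build_quadtree(image, row + half, col, half),         # SW
        build_quadtree(image, row + half, col + half, half),  # SE
    )

image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
]
tree = build_quadtree(image)
print(tree)
```

Scanning every pixel of a block at each level keeps the sketch short; a bottom-up construction that merges four same-colored siblings would avoid the repeated scans.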
Region Quadtree
[Figure: a region, its block decomposition (blocks 1 through 19), and the corresponding region quadtree with gray nodes A through F and children in NW, NE, SW, SE order]
Region Quadtree (Continued)
• Each leaf node is either black or white
• All non-leaf nodes are gray (drawn as circles in the previous example)
• It can also be used for non-binary images
• Resolution of the decomposition may be governed by the data or predetermined
• Can be used for several object representations
Variations of Quadtree
• Point Quadtree
- Quadtree with rectangular quadrants
- Adaptation of the binary search tree to two or more dimensions
- Useful for location-based queries, such as "where is the nearest theatre to this location?"
- Location-based queries are answered by descending the tree until the node for the location is found
- For nearest neighbor queries, the search continues in the neighborhood of the node containing the object
- Feature-based queries are hard because the index is based on spatial occupancy, not on features
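The descent for a location-based query can be sketched as follows. This is a minimal point quadtree with insertion and exact-match lookup only; the class name, the tie-breaking rule (points on a dividing line go to the N or E side), and the sample points are assumptions of this example.

```python
class PointQuadtreeNode:
    """Each node stores one data point, which splits space into four quadrants."""

    def __init__(self, x, y, value=None):
        self.x, self.y, self.value = x, y, value
        self.children = {}   # quadrant name ("NE", "NW", ...) -> child node

    def quadrant(self, x, y):
        """Quadrant of (x, y) relative to this node's point."""
        ns = "N" if y >= self.y else "S"
        ew = "E" if x >= self.x else "W"
        return ns + ew

def insert(root, x, y, value=None):
    """Insert a point and return the (possibly new) root."""
    if root is None:
        return PointQuadtreeNode(x, y, value)
    node = root
    while True:
        if (node.x, node.y) == (x, y):
            node.value = value               # same location: overwrite
            return root
        q = node.quadrant(x, y)
        if q not in node.children:
            node.children[q] = PointQuadtreeNode(x, y, value)
            return root
        node = node.children[q]

def find(root, x, y):
    """Descend from the root until the location is found or a branch ends."""
    node = root
    while node is not None:
        if (node.x, node.y) == (x, y):
            return node
        node = node.children.get(node.quadrant(x, y))
    return None

root = None
for x, y, name in [(50, 50, "A"), (20, 80, "B"), (70, 20, "C"), (60, 10, "D")]:
    root = insert(root, x, y, name)
print(find(root, 60, 10).value)
```

Unlike the region quadtree, the split positions come from the data points themselves, so the shape of the tree depends on insertion order.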
Variations of Quadtree
• Pyramid
- Exponentially tapering stack of arrays, each one quarter the size of the previous one
- Useful for feature-based queries, such as "where does wheat grow in California?"
- Nodes that are not at the maximum level of resolution contain summary information
• Octree
- Three-dimensional analog of the quadtree
- Recursively subdivides space into eight octants
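A pyramid can be sketched as a list of arrays built bottom-up. In this example the summary stored at each coarser cell is the sum of the four cells beneath it, which is an assumption; any aggregate (a count, a maximum, a presence flag) could play the same role. The function name and the sample array are also illustrative.

```python
def build_pyramid(image):
    """Return a list of levels from full resolution down to a 1 x 1 array.

    Each coarser cell holds the sum of the four finer cells it covers.
    Assumes a square array whose side is a power of two.
    """
    levels = [image]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        half = len(prev) // 2
        levels.append([
            [prev[2 * r][2 * c] + prev[2 * r][2 * c + 1]
             + prev[2 * r + 1][2 * c] + prev[2 * r + 1][2 * c + 1]
             for c in range(half)]
            for r in range(half)
        ])
    return levels

# Hypothetical feature map: 1 marks a pixel where the feature is present.
feature = [
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
]
levels = build_pyramid(feature)
for level in levels:
    print(level)
```

A feature-based query can start at the top level and descend only into cells whose summary is nonzero, skipping regions where the feature is absent.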
More Variations of Quadtree
• Locational Code Based Quadtree
- Treats the image as a collection of leaf nodes, each encoded by a pair of numbers
- The first is a base 4 number: the sequence of directional codes that locates the leaf along the path from the root
- The second is the depth at which the node is found (or, equivalently, its size)
• DF-expression
- Represents the image as a traversal of the nodes of its quadtree
- Very compact storage: each node type can be encoded with two bits
- Not easy to use when random access to nodes is required
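Both pointerless representations can be derived from the same tree. The sketch below assumes a quadtree given as nested tuples ('B', 'W', or a 4-tuple of children in NW, NE, SW, SE order), assigns the digits 0 to 3 to those directions, keeps only black leaves in the locational code list, and writes the DF-expression with the symbols '(', 'B', and 'W'. All of these conventions are assumptions made for illustration.

```python
def locational_codes(tree, path=""):
    """List (base-4 path, depth) for every black leaf of the quadtree."""
    if tree == "B":
        return [(path, len(path))]
    if tree == "W":
        return []
    codes = []
    for digit, child in enumerate(tree):     # 0=NW, 1=NE, 2=SW, 3=SE
        codes.extend(locational_codes(child, path + str(digit)))
    return codes

def df_expression(tree):
    """Preorder traversal: '(' marks a gray node, 'B' and 'W' mark leaves."""
    if isinstance(tree, str):
        return tree
    return "(" + "".join(df_expression(child) for child in tree)

tree = ("W", "B", ("W", "B", "B", "B"), "W")
print(locational_codes(tree))
print(df_expression(tree))
```

The DF-expression uses three symbols, so two bits per symbol suffice, but finding a particular node means scanning from the start of the string. Sorted locational codes, in contrast, can be searched directly.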
Searching with Quadtree
• Useful for performing set operations
• Intersection returns a black node only where both quadtrees have black nodes
• The operation works on three quadtrees: the two inputs and the result
• The worst-case cost is proportional to the sum of the number of nodes in the two input quadtrees
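Intersection can be sketched as a simultaneous traversal of the two input trees that builds the result tree. The nested-tuple representation ('B', 'W', or a 4-tuple of children in a fixed quadrant order) is an assumption of this example, and both inputs are assumed to cover the same square.

```python
def intersect(a, b):
    """Intersection of two region quadtrees given as nested tuples."""
    if a == "W" or b == "W":
        return "W"           # white in either input: white in the result
    if a == "B":
        return b             # black acts as the identity for intersection
    if b == "B":
        return a
    # Both nodes are gray: recurse on corresponding quadrants.
    children = tuple(intersect(x, y) for x, y in zip(a, b))
    if all(c == "W" for c in children):
        return "W"           # merge four white children into one leaf
    return children

a = ("B", "W", ("B", "B", "W", "W"), "B")
b = ("B", "B", ("W", "B", "B", "W"), "W")
print(intersect(a, b))
```

Each recursive call consumes a node from at least one input and the traversal stops descending as soon as either side is a leaf, so the work is bounded by the combined size of the two inputs.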
Algorithms with Quadtree
• Most algorithms are preorder traversals
• Execution time is a linear function of the number of nodes
• Quadtree Complexity Theorem
- The number of nodes in the quadtree representation is O(p + q) for a 2^q × 2^q image with perimeter p measured in pixel widths
- It also holds in higher dimensions
Point Data
• PR Quadtree
- Regular decomposition of space into quadrants
- Organized the same way as the region quadtree
- Leaf nodes are either empty or contain a data point and its coordinates
- A quadrant contains at most one data point
- Shape of the tree is independent of the order in which points are inserted
- If points are close together, the decomposition can be deep
- Depth can be limited by using quadrants (buckets) with capacity c
- Good for searching within a specified distance of a given record
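The properties listed above can be checked with a small sketch. This is a minimal PR quadtree with capacity one; the class name, the half-open block convention, the depth and shape helpers, and the sample points are assumptions of this example.

```python
class PRQuadtree:
    """Square block [x, x+size) x [y, y+size) holding at most one point."""

    def __init__(self, x, y, size):
        self.x, self.y, self.size = x, y, size
        self.point = None        # data point if this is a non-empty leaf
        self.children = None     # four sub-blocks once the block is split

    def _child_for(self, px, py):
        half = self.size / 2
        col = 1 if px >= self.x + half else 0
        row = 1 if py >= self.y + half else 0
        return self.children[2 * row + col]

    def insert(self, px, py):
        if self.children is not None:
            self._child_for(px, py).insert(px, py)
        elif self.point is None:
            self.point = (px, py)
        elif self.point != (px, py):
            # Second point in this block: split at the center, push both down.
            old, self.point = self.point, None
            half = self.size / 2
            self.children = [
                PRQuadtree(self.x + dx * half, self.y + dy * half, half)
                for dy in (0, 1) for dx in (0, 1)
            ]
            self._child_for(*old).insert(*old)
            self._child_for(px, py).insert(px, py)

    def depth(self):
        if self.children is None:
            return 0
        return 1 + max(child.depth() for child in self.children)

    def shape(self):
        """Nested structure used to compare two trees."""
        if self.children is None:
            return self.point
        return tuple(child.shape() for child in self.children)

points = [(20, 88), (88, 65), (52, 15), (92, 1)]   # illustrative points
t1, t2 = PRQuadtree(0, 0, 100), PRQuadtree(0, 0, 100)
for p in points:
    t1.insert(*p)
for p in reversed(points):
    t2.insert(*p)
print(t1.shape() == t2.shape(), t1.depth())

close = PRQuadtree(0, 0, 100)
close.insert(1, 1)
close.insert(2, 2)
print(close.depth())
```

The two insertion orders give the same tree because splits always happen at block centers, while the two nearby points force repeated subdivision before they land in separate blocks.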
PR Quadtree (Continued)
[Figure: PR quadtree over the square with corners (0,0), (100,0), (0,100), (100,100); other labeled coordinates: (50,50), (75,75), (25,25), (75,25), (20,88), (88,65), (52,15), (92,1)]
Rectangle Data
• Used to approximate other objects in an image and in VLSI design rule checking
• If the environment is static, the solution is based on the plane-sweep paradigm
• Any addition to the database forces re-execution of the algorithm on the whole database
Rectangle Data (Continued)
• Grid File Based Approach
- Each rectangle is reduced to a point in a higher-dimensional space
- A rectangle is the Cartesian product of two one-dimensional intervals
- Each interval is represented by its center and extent
- The set of intervals is represented by a grid file
- The grid file uses a two-dimensional array of grid blocks called the grid directory
Rectangle Data (Continued)
• Grid File Based Approach (Continued)
- Each grid directory entry holds the address of a bucket
- A set of linear scales is kept in core (main memory) to locate the grid block in the grid directory
- Guarantees access to a record in two operations
- The first operation accesses the grid block
- The second operation accesses the bucket
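The two-step lookup can be sketched for one set of intervals. Everything concrete here is assumed for illustration: the scale boundaries, the directory contents, the bucket names, and the reading of "extent" as half the interval length. In a real grid file the directory and the buckets live on disk, which is why each step counts as one access; here they are plain Python containers.

```python
from bisect import bisect_right

def rect_to_point(xmin, ymin, xmax, ymax):
    """Represent a rectangle as (cx, dx, cy, dy): center and extent per axis."""
    return ((xmin + xmax) / 2, (xmax - xmin) / 2,
            (ymin + ymax) / 2, (ymax - ymin) / 2)

# Hypothetical linear scales, kept in memory: partition boundaries per axis.
center_scale = [25, 50, 75]   # splits interval centers into 4 slices
extent_scale = [10]           # splits interval extents into 2 slices

# Hypothetical grid directory: directory[i][j] is a bucket address.
# Several grid blocks may share one bucket.
directory = [
    ["b0", "b1"],
    ["b0", "b1"],
    ["b2", "b3"],
    ["b2", "b3"],
]
buckets = {"b0": [], "b1": [], "b2": [], "b3": []}

def locate(center, extent):
    """Step 1: scales select the grid block. Step 2: the block names the bucket."""
    i = bisect_right(center_scale, center)
    j = bisect_right(extent_scale, extent)
    return directory[i][j]

def insert_interval(name, lo, hi):
    """Store an interval as the point (center, extent) in its bucket."""
    center, extent = (lo + hi) / 2, (hi - lo) / 2
    buckets[locate(center, extent)].append((name, center, extent))

insert_interval("A", 10, 20)
insert_interval("B", 40, 80)
print(rect_to_point(0, 0, 10, 4))
print(buckets)
```

A rectangle needs one such structure per axis, since it is the product of an x-interval and a y-interval.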
Rectangle Data (Continued)
• MX-CIF Quadtree
- Based on the quadtree
- Decomposition of space into rectangles
- Each rectangle is associated with the quadtree node corresponding to the smallest block that contains it in its entirety
- Subdivision stops when a node's block contains no rectangles or reaches a predetermined size
- Rectangles can be associated with both terminal and non-terminal nodes
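The association rule can be sketched as a descent that stops as soon as the rectangle no longer fits inside a single quadrant. The function name, the (x, y, size) block triple, the 8 × 8 space, and the sample rectangles are assumptions of this example; the sketch only computes which block owns each rectangle and does not build the tree itself.

```python
def mxcif_block(rect, block=(0, 0, 8), min_size=1):
    """Return the smallest quadtree block (x, y, size) that fully contains rect.

    rect is (xmin, ymin, xmax, ymax); blocks are squares with power-of-two sides.
    """
    x, y, size = block
    if size <= min_size:
        return block                  # predetermined minimum size reached
    half = size // 2
    for cx, cy in ((x, y), (x + half, y), (x, y + half), (x + half, y + half)):
        if (cx <= rect[0] and rect[2] <= cx + half
                and cy <= rect[1] and rect[3] <= cy + half):
            return mxcif_block(rect, (cx, cy, half), min_size)
    return block                      # rect straddles a split line: stays here

# Hypothetical rectangles in an 8 x 8 space.
rects = {"A": (3, 3, 5, 5), "B": (1, 1, 3, 3), "C": (5, 5, 7, 6)}
index = {}
for name, rect in rects.items():
    index.setdefault(mxcif_block(rect), []).append(name)
print(index)
```

Rectangle A crosses the center of the space, so it is attached to the root block even though it is small, which shows how rectangles end up at non-terminal nodes.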
MX-CIF Quadtree
[Figure: MX-CIF quadtree for rectangles A through G; node contents shown: {A,E}, {G}, {B,C,D}, {F}]
Line Data
• PM1 Quadtree
- Based on regular decomposition of space
- Partitioning continues as long as a block contains more than one line segment, unless the line segments are all incident at one vertex that lies in the block
- Vertex-based implementation
- Useful because the space requirements for polyhedral objects are smaller than for a conventional octree
PM1 Quadtree (Continued)
Line Data (Continued)
• PMR Quadtree
- Edge-based variant of the PM quadtree
- Uses a probabilistic splitting rule
- A block contains a variable number of line segments
- Each line segment is inserted into all blocks that it intersects or occupies
- If a block then holds more line segments than permitted, it is divided into four blocks once and only once
- During deletion, the line segment is removed from all blocks that contain it, and the blocks are checked for merging
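The "once and only once" rule is easiest to see in code. The sketch below covers insertion only (no deletion or merging) and makes several assumptions: blocks are closed squares given as (x, y, size), the segment-block test is a Liang-Barsky style clip, and the class name, threshold parameter, and sample segments are illustrative.

```python
def segment_hits_block(seg, block):
    """Clip test: does segment (x1, y1, x2, y2) touch the closed square block?"""
    x1, y1, x2, y2 = seg
    bx, by, size = block
    dx, dy = x2 - x1, y2 - y1
    t0, t1 = 0.0, 1.0
    for p, q in ((-dx, x1 - bx), (dx, bx + size - x1),
                 (-dy, y1 - by), (dy, by + size - y1)):
        if p == 0:
            if q < 0:
                return False          # parallel to this edge and outside it
        else:
            t = q / p
            if p < 0:
                t0 = max(t0, t)       # segment enters the block here
            else:
                t1 = min(t1, t)       # segment leaves the block here
    return t0 <= t1

class PMRNode:
    def __init__(self, block, threshold):
        self.block, self.threshold = block, threshold
        self.segments = []            # used only while the node is a leaf
        self.children = None

    def insert(self, seg):
        if not segment_hits_block(seg, self.block):
            return
        if self.children is not None:
            for child in self.children:
                child.insert(seg)
            return
        self.segments.append(seg)
        if len(self.segments) > self.threshold:
            self._split_once()

    def _split_once(self):
        x, y, size = self.block
        half = size / 2
        self.children = [PMRNode((cx, cy, half), self.threshold)
                         for cx, cy in ((x, y), (x + half, y),
                                        (x, y + half), (x + half, y + half))]
        for seg in self.segments:
            for child in self.children:
                # Placed directly: a child is not split again here,
                # even if it now exceeds the threshold.
                if segment_hits_block(seg, child.block):
                    child.segments.append(seg)
        self.segments = []

root = PMRNode((0, 0, 8), threshold=2)
for seg in [(1, 1, 2, 2), (1, 2, 2, 3), (1, 3, 3, 1)]:
    root.insert(seg)
crowded = root.children[0]
print(len(crowded.segments), crowded.children is None)
```

After the third insertion the root splits, and all three segments land in one child that exceeds the threshold but is left alone. It will split only when a later insertion reaches it, which is why the resulting decomposition depends on insertion order.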
PMR Quadtree
[Figures: three slides illustrating the PMR quadtree; images not preserved]
Conclusion
•Questions ?
•Comments ?
•Email me at [email protected]