Geospatial Indexing and GIS Applications

Download Report

Transcript Geospatial Indexing and GIS Applications

Software Group – Spatial Information Management
Putting the world in your Database:
The Informix Spatial and Geodetic DataBlades
Robert Uleman
Consulting IT Specialist
Worldwide Information Management Sales Support –
Spatiotemporal Technology
© 2004 IBM Corporation
Software Group – Spatial Information Management
Agenda
 Overview
–
GIS: Geographic Information Systems
–
Spatial data in an object-relational DBMS
–
The Spatial DataBlade
–
Competitive differences
 Details
–
Spatial SQL syntax
–
Spatial indexing
–
Geodetic: round-earth spatial
2
© 2003 IBM Corporation
Software Group – Spatial Information Management
Geographic Information System
3
© 2003 IBM Corporation
Software Group – Spatial Information Management
4
© 2003 IBM Corporation
Software Group – Spatial Information Management
Architectural Evolution of GIS Data Management
1st
Generation:
GIS
Application
SQL
2nd Generation:
GIS
Application
GIS
Application
3rd
Generation:
Proprietary
GIS API
RDBMS
GIS
Data
Engine
Spatial
Features
File
System
Attributes
SQL
Proprietary
spatial
structures
GIS
Data
Engine
RDBMS
Open or
proprietary
SQL
Open
“Spatial”
Application
5
Proprietary
GIS API
Proprietary
data format
Spatial business logic
Spatial
features,
indexes in
BLOBs
Spatially
enabled
DBMS
Spatial types
functions
indexes
© 2003 IBM Corporation
Software Group – Spatial Information Management
What’s Special about Spatial?
 Traditionally not supported by relational databases
 Requires new indexing techniques
 Voluminous data
– Individual values can get arbitrarily large:
• Large, convoluted lines and polygons (e.g., a coastline)
• Raster images
– Lots of features
 Maps can effectively represent lots of information
– Much more than spreadsheets or reports
– Queries often retrieve many more rows than “normal”
queries
 Individual operations may be computationally expensive
– WHERE clause predicates
 Transactions generally long, unlike OLTP
– Resembles code revision control in software development
6
© 2003 IBM Corporation
Software Group – Spatial Information Management
Extender/DataBlade Introduction: Component Technology
Spatial
Geodetic
Grid
Your idea goes here
IDS
Server Subsystems
Connectivity
Backup
7
Restore
© 2003 IBM Corporation
Software Group – Spatial Information Management
Extender/DataBlade Elements
Types
Functions
New
Extender/
DataBlade
Casts
Aggregates
Indexes
Tables
Client Code
8
© 2003 IBM Corporation
Software Group – Spatial Information Management
If Integer were not built in …
Domain
– Whole numbers, up to some maximum magnitude
Data types
– Smallint, Integer, Bigint
– Representations: ASCII ([+|-]d..),
binary (2’s complement, byte order)
Functions and operators
– Add(+), Subtract(-), Multiply(*), Abs, Mod, …
– Equal(=), LessThan(<), GreaterThanOrEqual(>=), …
Index support
– B-Tree
9
© 2003 IBM Corporation
Software Group – Spatial Information Management
Integer not built in? Not so far-fetched
 Illustra: Pure object-relational database
– Commercialization of UC Berkeley Postgres project
– Acquired by Informix in 1996, Informix acquired by IBM in 2001
– Postgres continues as open-source PostgreSQL
– No built-in data types; everything is bound at runtime
 Land Information New Zealand: Fraction data type “5/7”
– Avoid roundoff in cumulative subdivision of property
– Legally mandated improvement in area/tax calculation
precision
 Dates
– Birthdate: understands that February 28 is a birthday for
someone born on February 29
– Tradingdate: skips weekends, holidays
10
© 2003 IBM Corporation
Software Group – Spatial Information Management
User-Defined Data Types
Data Type: Descriptor assigned to a column or a variable
name
John T. Smith
compensation
349,876 yen
location
(123 256)
image
jobs_held
Clerk, Administrator, Manager
CREATE TABLE employee (
name
varchar(30),
compensation
salary_t,
location
point,
jobs_held
set(varchar(30)),
picture
image
);
11
© 2003 IBM Corporation
Software Group – Spatial Information Management
Some SQL Queries
 Location-Based Services: List Points of Interest
– SELECT name, description, address
FROM
restaurants
WHERE Overlaps(location, box(getGPS(), 2000, 2000))
AND category = ‘chinese’
AND docContains(menu, ‘Peking duck’);
Name
Dynastie
Description
Chinese restaurant in
glass pyramid
Address
Misburger Str. 81
 List volcanic eruptions in a region of interest
– SELECT
FROM
WHERE
name, year, mag, location
volcano
ST_Within(location,
'polygon((-125 43,-125 46,-120 46,-120 43,-125 43))')
ORDER BY name, year;
name
HOOD MOUNT
HOOD MOUNT
12
year
1854
1859
mag
0
2
location
POINT (-121.69999 45.36000)
POINT (-121.69999 45.36000)
© 2003 IBM Corporation
Software Group – Spatial Information Management
The Informix Spatial DataBlade
 OpenGIS Standard Spatial Types and Functions
– Certified compliant with OpenGIS Simple Features Specification
– Geometric data types
• ST_Point, ST_Linestring, ST_Polygon, ST_Geometry, etc.
– Spatial functions
• ST_Distance, ST_Intersects, ST_Within, etc.
– Standard Data representations
• Well-Known Binary, Well-Known Text, ESRI Shape
 Tailored to ESRI’s ArcSDE 9.x (spatial database gateway)
– Additional functions, support for annotation, SDE format, etc.
– Based on ESRI’s geometry engine (Shape library)
– Consistent results of spatial operations in all software tiers:
database, middle (ArcSDE), client (ArcGIS, ArcIMS)
13
© 2003 IBM Corporation
Software Group – Spatial Information Management
DB2 Spatial Extender, IDS Spatial DataBlade
 Developed, supported and maintained by IBM
– Wrapped around ESRI’s geometry engine (Shape library)
– Spatial index: R-tree
– Spatially aware optimizer
• Recognizes spatial operators and index
• Cost, selectivity provided by R-tree
– Administration tools: Blade Manager
– Utilities: Shape file import, export
 Strategic alliance with ESRI
– Close relationship in engineering, marketing, and sales
– Software (Data, WebSphere), Hardware, Services
14
© 2003 IBM Corporation
Software Group – Spatial Information Management
Spatial/OpenGIS SQL Data Types
ST_Geometry
ST_Curve
ST_Surface
ST_LineString ST_Polygon
“Abstract” Classes
ST_GeomCollection
ST_Point
ST_MultiCurve
ST_MultiSurface
ST_MultiLineString
ST_MultiPolygon
ST_MultiPoint
Instantiable Classes
15
© 2003 IBM Corporation
Software Group – Spatial Information Management
OpenGIS Spatial SQL Functions
ST_Intersects(geometry1,geometry2)
?
16
© 2003 IBM Corporation
Software Group – Spatial Information Management
ArcSDE Architecture for Informix
Client
Application
ArcGIS
Open
SQL
Applications
Family
API
SDE
Client API
ArcExplorer
ArcSDE
Server
TCP/IP
ESQL/C
ODBC
JDBC
JDBC
Caching
ODBC
Compression
Connection pooling
(Projections)
(Long transactions)
Spatial
(Raster support)
OpenGIS
Spatial
Queries
DataBlade
Geodatabase,
business rules,
custom types
17
R-tree index
IDS
© 2003 IBM Corporation
Software Group – Spatial Information Management
ArcSDE Architecture for Others
Oracle
Spatial
Queries
OpenGIS
Spatial
Queries
Spatial
DataBlade
Informix
18
FAT
ArcSDE
Server
ArcSDE
Server
ArcSDE
Server
SQL
Applications
SQL
Applications
SQL
Applications
Oracle
Spatial
NO Spatial
Queries
<nothing>
Index tables
Oracle
SQLServer,
Oracle “binary”
© 2003 IBM Corporation
Software Group – Spatial Information Management
The Informix Spatial Advantage
 Standard, intuitive syntax
– Easy development, maintainable, fewer bugs
– Result of true Object-Relational extensibility
– OpenGIS Simple Features conformance for interoperability
with other compliant systems
 Performance
– Spatial index and functions integrated into server at code
interface level, not based on tables and SQL
 Tight cooperation and integration with ESRI
19
© 2003 IBM Corporation
Software Group – Spatial Information Management
SQL Comparisons: Query
21
Oracle Spatial
DB2 Spatial
SELECT A.Feature_ID
FROM TARGET A
WHERE
sdo_relate(
A.shape,
mdsys.sdo_geometry(
2003,
NULL,
NULL,
mdsys.sdo_elem_info_array(
1,1003,1
),
mdsys.sdo_ordinate_array(
x1,y1, x2,y2, x3,y3, x4,y4, …
)
),
'mask=anyinteract querytype=window‘
) = 'TRUE‘
;
SELECT A.Feature_ID
FROM A
WHERE
ST_Overlaps(
A.shape,
ST_GeomFromText(
‘ST_POLYGON(
x1 y1, x2 y2, x3 y3, x4 y4 …
)’,
5
-- OpenGIS requirement
)
)
;
© 2003 IBM Corporation
Software Group – Spatial Information Management
Spatial queries
 Compute the percentage overlap of imagery that covers part of a
region of interest
– ROI = Ontario Province, with 25 km buffer, but only over Canada
 SELECT
i.id AS image_id,
ST_Area(ST_MultiPolygon
ST_Intersection( i.footprint, p.shape )
::ST_Polygon
) / ST_Area( i.footprint ) * 100 AS "%overlap"
FROM images i, provinces p
WHERE ST_Overlaps(
i.footprint,
ST_Difference(
ST_Buffer( p.shape, 25, 'KILOMETRE' ),
(SELECT shape FROM countries WHERE name = 'USA')
) AND p.name = 'Ontario'
ORDER BY 2 DESC;
22
© 2003 IBM Corporation
Software Group – Spatial Information Management
The B-tree index
“Polygon”
A through Z
A-I
A-B
C-E
F-I
S-Z
J-R
J-L
M-O
P-R
S - Sp
Sq - U
V-Z
B-tree indexes rapidly reduce the number of items to search
through in a selection process and are the industry standard for
alpha-numeric data. But how can spatial data be sorted???
25
© 2003 IBM Corporation
K8
A Simple R-tree
L4
K = Key bounding box
L = Leaf node bounding box
N = Node (internal) bounding box
N2
K1
K9
L1
K10
K2
L5
K4
K11
L2
L3
Index structure
K5
K3
N1
K6
N1
K7
N2
Search
Object
L1
L2
L3
L4
L5
Data space
K1
K2
K3 K3
K4
K5
K6
K7
K8
K9
K10 K11
Software Group – Spatial Information Management
Planar Coordinates
Northing
Easting
5,000,000
500,000
27
© 2003 IBM Corporation
Software Group – Spatial Information Management
Spherical Coordinates
+90
(90° N)
R
latitude
longitude
0
-90
(90° W)
28
© 2003 IBM Corporation
Software Group – Spatial Information Management
+90
Flattening the Earth
?
 Plane Geometry on lat-long
0
– Singularities and scale
distortion at and toward the poles
– Wrap-around
at 180º longitude
?
?
-90
-180
0
?
+180
– Poor location of lines, edges, intersections
 Local/Regional Projections
– Limited valid range
UTM
32
UTM
33
– Map edge-matching problems
– Non-uniform scale
 Indexing: it gets worse!
– Multiple “bounding boxes” or complete loss of selectivity
29
© 2003 IBM Corporation
Software Group – Spatial Information Management
±180°
±180°
0°
Single
bounding
box: high
selectivity,
low
complexity
ROUND
+90
FLAT
↑
Y
0
Single
bounding
strip: low
selectivity
30
-90
-180
Split
bounding
boxes: high
complexity
-90
0
+90
→X
+180
© 2003 IBM Corporation
Software Group – Spatial Information Management
Geodetic DataBlade/Extender
 Latitude-longitude (‘geodetic’) coordinates, ellipsoidal datum
 Uniform accuracy and resolution around the globe (“world to cm”)
 No scale singularities and map edges
 Based on Hipparchus geometry engine by Geodyssey Ltd.
 Integrated time and floating-point dimensions for single-index
searches and true spatio-temporal data management
 Powerful indexes for high performance:
– Voronoi Tessellation – adaptive space partitioning
– R-tree – self-tuning multidimensional index (up to 5
dimensions)
 Unique to IBM
31
© 2003 IBM Corporation
Software Group – Spatial Information Management
Connect the dots...
Flat plane: line segments
connecting vertices are
straight lines
32
Ellipsoid: line segments
connecting vertices are
geodesics
© 2003 IBM Corporation
Software Group – Spatial Information Management
Connect the dots, continued
Add vertices if you want a line segment
to follow a parallel (line of constant latitude)
meridians
parallels
33
© 2003 IBM Corporation
Software Group – Spatial Information Management
Distances
What is the distance from Anchorage to Tokyo?
The shortest path is the shorter of the
two possible geodesic paths:
the thick part of the great circle
34
© 2003 IBM Corporation
Software Group – Spatial Information Management
Polygons that straddle the 180th meridian
split flat-plane representation into 2 or more pieces
MULTIPOLYGON(
((-180 30,-165 30,-165 40,
-180 40,-180 30)),
((180 30,180 40,165 40,
165 30, 180 30))
)
35
POLYGON(
(165 30, -165 30,
-165 40, 165 40)
)
© 2003 IBM Corporation
Software Group – Spatial Information Management
Polygons that enclose a pole
POLYGON(
(180 -60, -180 -60, -180 -90,
180 -90, -180 -60)
)
extra vertex
36
extra edge
extra vertex
POLYGON(
( 0 -60, -120 -60,
120 -60,
0 -60)
)
extra edge
© 2003 IBM Corporation
Software Group – Spatial Information Management
Hemispheres
3
Western hemisphere,
flat-earth representation:
Western hemisphere,
round-earth representation:
POLYGON((0 -90, 0 90,
-180 90, 180 -90, 0 -90))
POLYGON((0 -30, 0 90, 180 -30))
2
2
3
4
1
1
Note that the same three points, specified in
opposite order, define the eastern hemisphere
37
© 2003 IBM Corporation