HC-Mapper progress, 2004-05

Spatial Indexing, Search, and Mapping for
Species level databases
Tony Rees, CSIRO Marine and Atmospheric Research
(CMAR), Hobart, Tasmania, Australia
For: AquaSpecies workshop, Los Baños, May 2006
Aspects covered in this talk:
• Approaches to spatial searching
• Coding required to support spatial searches
• Mapping options for the data
• Examples from the “OBIS” and “CAAB” species-level databases (with
use of c-squares for spatial indexing and mapping)
(1) Typical species-level distribution data – example from OBIS
(Typically patchy / incomplete; we will not worry about this for now)
• What search method(s) to offer? e.g...
   - named region, e.g. country name/EEZ
   - grid square or squares
   - user-defined area (e.g. bounding box, point + radius, polygon...)
• What should be returned in first instance? i.e...
   - build a species list for the search region (maybe filtered by category), or
   - return all the point data (maybe filtered as above)
(2) Search by named region
Possible approaches:
• Put everything in a GIS, search against named region’s
polygon at run time
• Classify every point with its relevant region name/s in advance,
store with the point data
• Classify every species with its relevant (unique) region name/s
in advance, store in a new table (as “species-level metadata”)
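The second and third approaches can be sketched as follows. This is a minimal illustration only: the region outlines, species labels and records are hypothetical placeholders (not real EEZ boundaries or real data), and a simple ray-casting test stands in for whatever GIS would actually do the classification in advance.

```python
from collections import defaultdict

def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed
        # by a horizontal ray running east from the point.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

# Hypothetical, heavily simplified region outlines.
REGIONS = {
    "Region A": [(140.0, -45.0), (150.0, -45.0), (150.0, -38.0), (140.0, -38.0)],
    "Region B": [(150.0, -45.0), (160.0, -45.0), (160.0, -38.0), (150.0, -38.0)],
}

# Hypothetical point records: (species, lon, lat).
RECORDS = [
    ("Species X", 147.5, -43.2),
    ("Species X", 152.0, -40.0),
    ("Species Y", 148.1, -41.0),
]

def classify_point(lon, lat):
    """Return the names of all regions containing the point."""
    return [name for name, poly in REGIONS.items()
            if point_in_polygon(lon, lat, poly)]

# Approach 2: store the region name(s) with every point, in advance.
point_index = [(sp, lon, lat, classify_point(lon, lat))
               for sp, lon, lat in RECORDS]

# Approach 3: one set of unique region names per species
# ("species-level metadata").
species_regions = defaultdict(set)
for sp, _lon, _lat, regions in point_index:
    species_regions[sp].update(regions)

def species_in_region(name):
    """Run-time search: a lookup only, no polygon test needed."""
    return sorted(sp for sp, regs in species_regions.items() if name in regs)

print(species_in_region("Region A"))
print(species_in_region("Region B"))
```

The polygon test runs once per point at load time; the run-time search is then a plain lookup, which is the main attraction of classifying in advance.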
(3) Search by grid square – example from OBIS Australia site
(could also show the squares on the graphic, as per inset)
Search by grid square
Available options:
• Fixed Size Grid Squares
• Variable Size Grid Squares (nested squares)
• Local or global grid?
• Constant dimensions in degrees or km?
Possible approaches:
• Classify every point with its relevant square ID/s in advance,
store with the point data
• Classify every species with its relevant (unique) square ID/s in
advance, store in a new table (as “species-level metadata”)
“Data level index” example – 1 code (c-square) for every data point
“Metadata level index” example – 1 row (multiple squares) per species
ID (NB, could also disaggregate this to a many:many table if preferred)
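As a rough sketch of the two index styles, the following uses an in-memory SQLite database. The table names, column names and records are assumptions made for illustration, not the actual OBIS or CAAB schema; the metadata-level index is shown in its disaggregated (many:many) form. Because each c-square code begins with the codes of its parent squares, a search at a coarser square size reduces to a prefix match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Data-level index: one c-square code stored with every data point.
    CREATE TABLE point_data (
        record_id  INTEGER PRIMARY KEY,
        species_id INTEGER,
        latitude   REAL,
        longitude  REAL,
        csquare    TEXT
    );
    CREATE INDEX idx_point_csquare ON point_data (csquare);

    -- Metadata-level index (disaggregated): one row per species/square pair.
    CREATE TABLE species_square (
        species_id INTEGER,
        csquare    TEXT,
        PRIMARY KEY (species_id, csquare)
    );
""")

# Hypothetical records, pre-coded to 0.5 degree c-squares.
conn.executemany(
    "INSERT INTO point_data VALUES (?, ?, ?, ?, ?)",
    [
        (1, 101, -43.7, 142.3, "3414:132:3"),
        (2, 101, -43.6, 142.4, "3414:132:3"),
        (3, 102, -41.2, 148.6, "3414:218:2"),
        (4, 103, -38.4, 151.2, "3315:381:1"),
    ],
)
# Derive the species-level index from the point data.
conn.execute(
    "INSERT INTO species_square "
    "SELECT DISTINCT species_id, csquare FROM point_data"
)

def species_in_square(square):
    """Species list for a square of any size (prefix match on the code)."""
    rows = conn.execute(
        "SELECT DISTINCT species_id FROM species_square "
        "WHERE csquare LIKE ? ORDER BY species_id",
        (square + "%",),
    )
    return [r[0] for r in rows]

def points_in_square(square):
    """Point records for a square of any size (prefix match on the code)."""
    rows = conn.execute(
        "SELECT record_id FROM point_data "
        "WHERE csquare LIKE ? ORDER BY record_id",
        (square + "%",),
    )
    return [r[0] for r in rows]

print(species_in_square("3414"))     # 10 degree square
print(points_in_square("3414:132"))  # 1 degree square
```

In this toy example the four point records collapse to three species/square rows; on real data the species-level table is typically much smaller than the point table, which is why it suits "build a species list" searches.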
(4) Search by user-defined area
Available approaches:
• Bounding box – enter coordinates, drag a rectangle in a Java
applet, or select from a list
(most common method)
• Point + radius (normally expressed in distance e.g. km, miles)
(less common, but may match some user expectations;
harder to implement)
• User-defined polygon
(hard to implement in web environment, potentially slow)
... all implemented against latitude/longitude values stored with
the data.
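The first two tests can be sketched against stored latitude/longitude values as below. This is an illustrative sketch, not code from any of the systems mentioned: the function names are hypothetical, the bounding box test assumes longitudes in the range -180 to 180 and treats west > east as a box crossing the date line, and the radius test uses the haversine formula on a spherical Earth.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; spherical approximation

def in_bounding_box(lat, lon, south, west, north, east):
    """True if the point lies in the box; west > east means the box
    crosses the date line (e.g. west=175, east=-175)."""
    if not (south <= lat <= north):
        return False
    if west <= east:
        return west <= lon <= east
    return lon >= west or lon <= east

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # min() guards against rounding pushing a fractionally above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))

def within_radius(lat, lon, centre_lat, centre_lon, radius_km):
    """Point + radius search: true if the point is within radius_km."""
    return haversine_km(lat, lon, centre_lat, centre_lon) <= radius_km

# Hypothetical point near Tasmania, tested against a simple box.
print(in_bounding_box(-43.0, 147.0, south=-45, west=140, north=-40, east=150))
# One degree of longitude along the equator, roughly 111 km.
print(round(haversine_km(0, 0, 0, 1), 1))
```

In a database the bounding box test translates directly into range conditions on the latitude and longitude columns, which is part of why it is the most common method; a point + radius search is often run as a bounding box pre-filter followed by the distance test.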
(5) Mapping software
Possible approaches:
• Deploy commercial software in-house (e.g.: ArcIMS)
• Deploy free / open source software in-house (e.g.: MapServer,
c-squares mapper)
• Construct own mapper and deploy
• Send data / squares to remote utility (third party mapper), e.g.:
• BeBIF, CBIF Mappers
• KGS Mapper (Kansas), ACON Mapper (Canada)
• C-squares Mapper (Australia)
• Google Earth (requires client on user’s PC)
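For the Google Earth option, point data can be handed to the client as a KML file. The sketch below is one possible way to build such a file with Python's standard library; the function name and sample record are hypothetical, and the namespace shown is the OGC KML 2.2 one (an assumption about which KML version the target client expects).

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"

def points_to_kml(layer_name, points):
    """Build a KML document string from (label, lat, lon) tuples."""
    ET.register_namespace("", KML_NS)  # serialise without a prefix
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    ET.SubElement(doc, f"{{{KML_NS}}}name").text = layer_name
    for label, lat, lon in points:
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = label
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML coordinates are written longitude first, then latitude.
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")

# Hypothetical record near Hobart.
print(points_to_kml("Example layer", [("record 1", -42.9, 147.3)]))
```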
Choosing Mapping software
Some aspects to consider...
• Features offered vs. anticipated requirements
• Cost (including indirect costs e.g. person time / complexity to deploy, hardware
requirements, ongoing admin / maintenance needs, also ongoing fees if any)
• System Architecture, e.g. local vs. remote hosting, OGC compliant WMS plus client
vs. self-contained system, etc.
• Performance (speed of rendering e.g. 200, 2000, 20000, 200000 points; ultimate limit;
bandwidth constraints if applicable)
• Map quality, range of options available (including projections, map size / quality,
variety of base maps, available scales, control of symbology / legends, etc.)
• Usability (interface design and ease of use, browser / client machine needs)
• Support (where from, what cost, responsiveness, what dedicated resources /
guarantees)
• Reliability (including system release status, possible points of failure, redundancy / risk
management)
• Compatibility with existing / future project, agency, community practices
• Extensibility to cope with present and future needs (how, who can do it, what process /
timelines available, source code available or not, programming language, etc.)
Possible mapper features...
Basic
• Map data points on one or multiple base maps
• Basic zoom and pan
Intermediate
• Plot multiple data sets (e.g. different species, data sources, time periods),
colour coded as necessary
• Show data that cross the date line / poles as uninterrupted views
• More sophisticated / detailed zoom and pan, improved map quality
• Add / remove layers for display
• Render line, polygon data
• Degree of symbology control, labelling, legends, etc.
• “Click on map” functionality to query underlying data
Advanced
• Full range of projections available
• Ingest external base data layers as images via WMS
• Export species data layers as images via WMS
• Calculate data statistics, summaries on-the-fly
• Full symbology and layer transparency control
A few benchmarks ... from OBIS-SEAMAP (2006) report
comparing MapServer, ArcIMS, and Google Earth
Performance:
Development programming:
Some example map creation applications ...
Museum Victoria Species Mapper – Blue Whale (example of
freeware “fly” mapper with local customisation)
BeBIF Point Data Mapper – Hoplostethus atlanticus (via GBIF) –
10,000 records
CBIF Point Data Mapper – Zeus faber (via GBIF)
CMAR C-squares Mapper – Hoplostethus atlanticus (via OBIS) –
566 squares (representing 10,000 records)
C-squares Mapper – Predicted distribution of Xiphias gladius (via
AquaMaps) – 85,000 squares in 5 colour codes (=probability classes)
ACON Mapper – Hoplostethus atlanticus (via OBIS) – includes
statistics, on-the-fly binning, sort by data provider, etc.
Google Earth – Hoplostethus atlanticus (via GBIF)
True web GIS – Blue Whale data points + on-the-fly user-selectable
layers, e.g. SST data (OBIS-SEAMAP site using MapServer)
(7) Example “CAAB” species name search result (NB, each species name is
associated with a stored list of 0.1 degree squares in this database)
Clicking on the map triggers a
spatial query to the underlying
base data table.
URLs mentioned in text:
• AquaMaps: FishBase > tools > AquaMaps (uses c-squares mapper)
• CAAB: http://www.marine.csiro.au/caab/
• C-squares: http://www.marine.csiro.au/csquares/
• FishBase: http://www.fishbase.org/
• GBIF: http://www.gbif.org/ (includes links to BeBIF, CBIF mappers)
• Google Earth: http://earth.google.com/
• Museum Victoria Bioinformatics:
http://www.museum.vic.gov.au/bioinformatics/ > mammals > map searches
• OBIS: http://www.iobis.org (includes links to c-squares, ACON, KGS mappers)
• OBIS Australia: http://www.obis.org.au/
• OBIS-SEAMAP: http://seamap.env.duke.edu/ (MapServer based site)
Additional information on c-squares ...
Overview of the c-squares
hierarchical grid square notation
(refer to http://www.marine.csiro.au/csquares/
for more information)
• C-squares principle
• The world is first divided into 10x10 degree squares (global total: 648)
• example code: 3414
• Each 10x10 degree square is divided into four 5x5 degree squares (total: 2,592)
• example code: 3414:1
• Each 5x5 degree square is divided into 25 1x1 degree squares (total: 64,800)
• example code: 3414:132
• Each 1x1 degree square is divided into four 0.5x0.5 degree squares (total: 259,200)
• example code: 3414:132:3
(etc.). NB: searches can then be run at any higher level of the hierarchy, as required,
since every “child” code begins with the codes of all its nested parent squares.
A simple algorithm will encode lat/lon to a c-squares code, and vice versa.
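The encoding and decoding can be sketched as below. This is an illustrative sketch written from the structure of the example codes above, not the reference implementation: the function names are hypothetical, and the digit rules used (leading digit 1 = NE, 3 = SE, 5 = SW, 7 = NW; intermediate quadrant digits 1 to 4 numbered by low/high latitude digit, then low/high longitude digit) are an assumption that should be checked against the c-squares specification. Decimal arithmetic is used so that values such as 142.3 split into digits exactly.

```python
from decimal import Decimal

# Leading digit keyed by (is_north, is_east); assumed from the c-squares spec.
QUADRANT = {(True, True): 1, (False, True): 3,
            (False, False): 5, (True, False): 7}

# resolution -> (cycles after the 10 degree code, last cycle is quadrant-only)
LEVELS = {"10": (0, False), "5": (1, True), "1": (1, False),
          "0.5": (2, True), "0.1": (2, False)}

def csquare_encode(lat, lon, resolution="0.5"):
    """Encode a lat/lon pair to a c-square code at the given resolution."""
    cycles, quadrant_only_last = LEVELS[resolution]
    # Work on absolute values, nudging the extremes inside the last square.
    a = min(abs(Decimal(str(lat))), Decimal("89.99999"))
    b = min(abs(Decimal(str(lon))), Decimal("179.99999"))
    code = f"{QUADRANT[(lat >= 0, lon >= 0)]}{int(a // 10)}{int(b // 10):02d}"
    rem_lat, rem_lon = a % 10, b % 10
    for i in range(cycles):
        d_lat, d_lon = int(rem_lat), int(rem_lon)
        # Intermediate quadrant: 1..4 from (lat digit >= 5, lon digit >= 5).
        quad = 2 * (d_lat >= 5) + (d_lon >= 5) + 1
        if quadrant_only_last and i == cycles - 1:
            code += f":{quad}"
        else:
            code += f":{quad}{d_lat}{d_lon}"
        rem_lat, rem_lon = (rem_lat - d_lat) * 10, (rem_lon - d_lon) * 10
    return code

def csquare_bounds(code):
    """Decode a c-square code to (south, west, north, east) in degrees."""
    head, *parts = code.split(":")
    q = int(head[0])
    lat_sign = 1 if q in (1, 7) else -1
    lon_sign = 1 if q in (1, 3) else -1
    lat0, lon0 = Decimal(int(head[1]) * 10), Decimal(int(head[2:]) * 10)
    size = Decimal(10)
    for part in parts:
        if len(part) == 3:        # full cycle: quadrant, lat digit, lon digit
            size /= 10
            lat0 += int(part[1]) * size
            lon0 += int(part[2]) * size
        else:                     # quadrant digit only: halve the square
            size /= 2
            lat0 += ((int(part) - 1) // 2) * size
            lon0 += ((int(part) - 1) % 2) * size
    lats = (lat_sign * lat0, lat_sign * (lat0 + size))
    lons = (lon_sign * lon0, lon_sign * (lon0 + size))
    return (float(min(lats)), float(min(lons)),
            float(max(lats)), float(max(lons)))

print(csquare_encode(-43.7, 142.3))   # reproduces the example code above
print(csquare_bounds("3414:132:3"))
```

With these assumptions a point at 43.7 S, 142.3 E encodes to the example codes given above at each level (3414, 3414:1, 3414:132, 3414:132:3).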
• Choice of resolution for encoding
• Half degree squares (50 km nominal resolution) seem to be a good compromise
between spatial resolution and index size for global datasets (0.1x0.1 degrees may
be preferred for regional scale use).
Actual size of half degree squares (compare with the UK).
NB: data encoded at this resolution can also be
queried at one, five or ten degree square sizes.
0.5 degree squares measure
approximately 55 x 35 km at this
latitude.
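The figures above can be checked with a short calculation. The sketch assumes a spherical Earth with about 111.32 km per degree of latitude, and a latitude of 51 degrees N for the southern UK (the slide does not state the latitude used, so that value is an assumption).

```python
import math

KM_PER_DEGREE = 111.32  # approx. length of one degree of latitude

def square_size_km(latitude, size_degrees=0.5):
    """Approximate (north-south, east-west) extent of a grid square in km."""
    north_south = size_degrees * KM_PER_DEGREE
    # East-west extent shrinks with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(latitude))
    return north_south, east_west

ns, ew = square_size_km(51.0)
print(f"{ns:.1f} x {ew:.1f} km")
```

At the equator the same square is close to 56 x 56 km, which is where the 50 km nominal resolution comes from.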