Efficient Linked-List RDF Indexing in Parliament
GeoSPARQL in Parliament
Terra Cognita
Dave Kolas
November 12, 2012
Parliament
In continuous customer use for ~10 years (originally DAML-DB)
Triple Store with SPARQL support
Implemented as a persistence layer for Jena/Sesame
Includes spatial and temporal indexing/processing
Open source! http://parliament.semwebcentral.org/
Design
[Architecture diagram. Components shown: Joseki; Model; IndexingGraph; Parliament Graph; Parliament (C++); Spatial Index Processor; Spatial Index (deegree); Temporal Index Processor; Temporal Index (BDB). Legend: part of Jena framework; Parliament; external storage.]
Parliament’s Indexing Strategy
Applications often require efficient statement insertion
Goal: balance insertion performance, query performance, and space required
Parliament stores triples using two components:
Resource dictionary
Statement table
Additional indices can be added for specific purposes and vocabularies
Spatial Index
Temporal Index
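The two components can be pictured with a minimal Python sketch. It is an illustration only, not Parliament's C++ implementation: the names `Resource`, `Statement`, and `TinyStore`, the field names, and the linked-list layout (suggested by the talk's title) are assumptions made for this example. Each resource remembers the first statement in which it appears as subject, predicate, or object, and each statement points to the next statement sharing that resource, so insertion is a constant-time push onto three lists.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Tuple

NONE = -1  # sentinel: end of a linked list


@dataclass
class Resource:
    """One row of the (hypothetical) resource dictionary."""
    value: str
    first_as_subject: int = NONE
    first_as_predicate: int = NONE
    first_as_object: int = NONE


@dataclass
class Statement:
    """One row of the (hypothetical) statement table."""
    s: int
    p: int
    o: int
    next_same_s: int
    next_same_p: int
    next_same_o: int


class TinyStore:
    def __init__(self) -> None:
        self.ids: Dict[str, int] = {}
        self.resources: List[Resource] = []
        self.statements: List[Statement] = []

    def _intern(self, value: str) -> int:
        # Map each distinct string to a small integer ID.
        if value not in self.ids:
            self.ids[value] = len(self.resources)
            self.resources.append(Resource(value))
        return self.ids[value]

    def add(self, s: str, p: str, o: str) -> None:
        # No duplicate check here; a real store would need one.
        si, pi, oi = self._intern(s), self._intern(p), self._intern(o)
        rs, rp, ro = self.resources[si], self.resources[pi], self.resources[oi]
        idx = len(self.statements)
        # The new statement becomes the head of all three lists.
        self.statements.append(Statement(
            si, pi, oi,
            rs.first_as_subject, rp.first_as_predicate, ro.first_as_object))
        rs.first_as_subject = idx
        rp.first_as_predicate = idx
        ro.first_as_object = idx

    def by_subject(self, s: str) -> Iterator[Tuple[str, str, str]]:
        # Walk the linked list of statements that share this subject.
        si = self.ids.get(s)
        idx = NONE if si is None else self.resources[si].first_as_subject
        while idx != NONE:
            st = self.statements[idx]
            yield (self.resources[st.s].value,
                   self.resources[st.p].value,
                   self.resources[st.o].value)
            idx = st.next_same_s
```

In this sketch a lookup by subject touches only the statements on that subject's list; lookups by predicate or object would walk the other two lists in the same way.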
Parliament’s Spatial Index
First created before GeoSPARQL, using terms derived from GeoRSS
Now supports most of the GeoSPARQL specification
Index is based on the R-tree in the deegree library (deegree.org)
Approach: explicit geometries, no qualitative reasoning
Optimization so far applies to triple patterns, not filter functions
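To show what a bounding-box index contributes, here is a rough Python sketch of the filter step only. It is not the deegree R-tree: a linear scan over a dictionary stands in for the tree, and the names `bbox_of`, `bbox_intersects`, and `candidates` are hypothetical. An R-tree groups boxes hierarchically so that a search can skip whole subtrees instead of scanning every entry; an exact geometry test would then run on the surviving candidates.

```python
from typing import Dict, Iterable, List, Tuple

BBox = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def bbox_of(points: Iterable[Tuple[float, float]]) -> BBox:
    """Smallest axis-aligned box containing all points."""
    pts = list(points)
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return (min(xs), min(ys), max(xs), max(ys))


def bbox_intersects(a: BBox, b: BBox) -> bool:
    """True if the two boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def candidates(index: Dict[str, BBox], query: BBox) -> List[str]:
    """Names whose boxes intersect the query box (linear scan, not a tree)."""
    return sorted(name for name, box in index.items()
                  if bbox_intersects(box, query))
```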
GeoSPARQL Implementation
Parliament supports:
Both GML and WKT literals, and can interchange between them
All three vocabularies for spatial relations (Simple Features, RCC8, and Egenhofer)
Triple-pattern spatial relations
Filter functions for spatial relations and spatial combinations
A large number of coordinate reference systems
RDFS Reasoning
GeoSPARQL Missing Pieces
The following features of GeoSPARQL are not currently implemented in Parliament:
Feature-to-feature spatial relations via query rewriting
Optimization on FILTER functions
Qualitative reasoning
Standard properties for Geometry: dimension, spatialDimension, isEmpty, isSimple, hasSerialization
Function getSRID
Parliament’s Temporal Index
Parallel to spatial index
Terminology taken from OWL-Time (using Allen relations for overlapping intervals, etc.)
Uses Java version of Berkeley DB for persisting index
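For readers unfamiliar with the Allen relations, the following Python sketch classifies the relation between two proper intervals. It is a generic illustration, not Parliament's temporal index code; the function name `allen_relation` and the short relation names are chosen for this example (OWL-Time spells them `intervalBefore`, `intervalMeets`, and so on).

```python
from typing import Tuple

Interval = Tuple[float, float]  # (start, end) with start < end


def allen_relation(a: Interval, b: Interval) -> str:
    """Name the one Allen relation (of thirteen) that holds from a to b."""
    a0, a1 = a
    b0, b1 = b
    if not (a0 < a1 and b0 < b1):
        raise ValueError("intervals must satisfy start < end")
    if a1 < b0:
        return "before"
    if b1 < a0:
        return "after"
    if a1 == b0:
        return "meets"
    if b1 == a0:
        return "metBy"
    if a0 == b0 and a1 == b1:
        return "equals"
    if a0 == b0:
        return "starts" if a1 < b1 else "startedBy"
    if a1 == b1:
        return "finishes" if a0 > b0 else "finishedBy"
    if b0 < a0 and a1 < b1:
        return "during"
    if a0 < b0 and b1 < a1:
        return "contains"
    # Remaining case: partial overlap with all four endpoints distinct.
    return "overlaps" if a0 < b0 else "overlappedBy"
```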
Build Process Improvements
Until very recently, GeoSPARQL support was on a branch, and required building for your desired platform
GeoSPARQL support has been merged into the trunk and prebuilt binaries are now available for Windows, Mac, and Linux
Parliament build structure has been improved again to require fewer dependencies
Examples
Data on geosparql.bbn.com
Data sets:
USGS data in Atlanta, GA
Rails, Rivers
Geonames data
Administrative areas
Points for buildings, such as schools
Example Query 1
Find All Schools within Georgia
SELECT DISTINCT ?school
WHERE {
  GRAPH <http://example.org/data> {
    # get Georgia geometry
    gu:_1705317 geo:hasGeometry ?ga_geo .
    # get schools within Georgia
    ?school a gn:Feature ;
            geo:hasGeometry ?school_geo ;
            gn:featureCode gn:S.SCH .
    ?school_geo geo:sfWithin ?ga_geo .
  }
}
Example Query 2
Find Geonames features within 10 km of the Nixon Grove School
SELECT ?x
WHERE {
  GRAPH <http://www.geonames.org> {
    <http://sws.geonames.org/4212826/> geo:hasGeometry ?geo1 .
    ?geo1 geo:asWKT ?wkt1 .
    BIND (geof:buffer(?wkt1, 10000, units:metre) as ?buff) .
    ?x geo:hasGeometry ?geo2 .
    ?geo2 geo:asWKT ?wkt2 .
    FILTER (geof:sfContains(?buff, ?wkt2))
  }
}
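Queries like the two above can be sent to any endpoint that speaks the SPARQL protocol, which accepts the query text in a `query` parameter on an HTTP GET. The Python sketch below only builds the request URL and sends nothing. The helper name `build_sparql_get_url` is hypothetical, and the endpoint address in the usage note is an assumption, not taken from the slides; substitute the address of your own Parliament installation.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_sparql_get_url(endpoint: str, query: str) -> str:
    """Return a GET URL carrying the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


def query_from_url(url: str) -> str:
    """Recover the query text from such a URL (useful for checking)."""
    return parse_qs(urlsplit(url).query)["query"][0]
```

For example, `build_sparql_get_url("http://localhost:8089/parliament/sparql", query_text)` yields a URL that could be fetched with any HTTP client; the query text would also need the `PREFIX` declarations that the slides omit.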