Implementing Geographical
Information System
Services for SERVOGrid
Marlon Pierce
Community Grids Lab
Indiana University
SERVOGrid Components

Component (“portlet”)-based portals.
• OGCE mentioned by Chris Hill

Web Services for “execution grid” services
• Ant-based job specification
• File transfer
• Distributed session management (“context”).

Geographic Information System (GIS) services
for “data grid” services.
• Web Map Service
• Web Feature Service
• GIS-compatible information services.


Support for streaming, real-time data.
Distributed service management/orchestration
• Using events and data streams.
Guiding Principles

Grids are composed of families of services
• Data, execution, information, …

Use “WS-I+” approach to building service families.
• Build Grids out of Web Service standards conservatively.

WS-Interoperability is the starting point.
• See position paper
http://grids.ucs.indiana.edu/ptliupages/publications/WebServiceGrids.pdf

SOAP and WSDL provide universal messaging
framework and service definition language.
• All services should communicate with the same message
format.
• Message delivery is left as an exercise.

Implementations are interesting.
Pattern Informatics (PI)

PI is a technique developed at University of
California, Davis for analyzing earthquake seismic
records to forecast regions with high future
seismic activity.
• They have correctly forecast the locations of 15 of the last
16 earthquakes with magnitude > 5.0 in California.

See Tiampo, K. F., Rundle, J. B., McGinnis, S. A.,
& Klein, W. Pattern dynamics and forecast
methods in seismically active regions. Pure Appl.
Geophys. 159, 2429-2467 (2002).
• http://citebase.eprints.org/cgibin/fulltext?format=application/pdf&identifier=oai%3AarXiv.org%3Acond-mat%2F0102032

PI is being applied to other regions of the world,
and John Rundle has gotten a lot of press.
• Google “John Rundle UC Davis Pattern Informatics”
Pattern Informatics in a Grid
Environment

PI in a Grid environment:
• Hotspot forecasts are made using publicly available seismic records:
  the Southern California Earthquake Data Center and the
  Advanced National Seismic System (ANSS) catalogs.
• Code location is unimportant; the code can be a service reached through remote
execution.
• Results need to be stored, shared, and modified.
• Grid/Web Services can provide these capabilities.

Problems:
• How do we provide programming interfaces (not just user interfaces)
to the above catalogs?
• How do we connect remote data sources directly to the PI code?
• How do we automate this for the entire planet?

Solutions:
• Use GIS services to provide the input data and plot the output data:
  Web Feature Service for data archives,
  Web Map Service for generating maps.
• Use the HPSearch tool to tie together and manage the distributed data
sources and code.
[Architecture diagram. Labels: Aggregating WMS; Stubs; WSDL; SOAP; HTTP; “REST”; WFS + Seismic Rec. (two instances); WFS + State Bounds; WMS + OnEarth.]
GIS Behind the Scenes


The web features are served up by a Web Feature Service.
Web Map Service aggregates maps
• NASA OnEarth + our own renderings.

We re-implement Open Geospatial Consortium standards
using Web Service Standards.
• SOAP messages, WSDL service definitions.
• This will allow us to separate messages from the HTTP transport layer
in the future.

More WMS Info:
• http://grids.ucs.indiana.edu/ptliupages/publications/acm-gissayar.pdf
• http://grids.ucs.indiana.edu/ptliupages/publications/Geoinformatics05_asayar.pdf

More WFS Info:
• http://grids.ucs.indiana.edu/ptliupages/publications/gwpap243.pdf

More general info, software, demos:
http://www.crisisgrid.org
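
As a rough illustration of carrying an OGC request in a SOAP message, the sketch below builds a WFS GetFeature query (bounding-box filter) and wraps it in a SOAP envelope. The message layout, the feature type name, and the geometry property name "location" are assumptions made for this example; they are not taken from the project's actual WSDL.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
WFS_NS = "http://www.opengis.net/wfs"                  # OGC WFS namespace
OGC_NS = "http://www.opengis.net/ogc"                  # OGC Filter namespace
GML_NS = "http://www.opengis.net/gml"                  # GML namespace


def build_get_feature(type_name, bbox):
    """Return a SOAP envelope (as text) wrapping a WFS GetFeature bounding-box query."""
    min_x, min_y, max_x, max_y = bbox
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    get_feature = ET.SubElement(
        body, f"{{{WFS_NS}}}GetFeature", {"service": "WFS", "version": "1.0.0"}
    )
    query = ET.SubElement(get_feature, f"{{{WFS_NS}}}Query", {"typeName": type_name})
    flt = ET.SubElement(query, f"{{{OGC_NS}}}Filter")
    bbox_el = ET.SubElement(flt, f"{{{OGC_NS}}}BBOX")
    # "location" is a hypothetical geometry property name.
    ET.SubElement(bbox_el, f"{{{OGC_NS}}}PropertyName").text = "location"
    box = ET.SubElement(bbox_el, f"{{{GML_NS}}}Box")
    ET.SubElement(box, f"{{{GML_NS}}}coordinates").text = (
        f"{min_x},{min_y} {max_x},{max_y}"
    )
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical feature type and a box roughly covering California.
message = build_get_feature("seismic_events", (-125.0, 32.0, -114.0, 42.0))
```

Because the request is an ordinary XML payload inside the SOAP body, the same message could in principle be delivered over a transport other than HTTP.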
Tying It All Together: HPSearch

HPSearch is an engine for orchestrating
distributed Web Service interactions
• It uses an event system and supports both file transfers
and data streams.
• “HPSearch” is a legacy name.

HPSearch flows can be scripted with JavaScript
• HPSearch engine binds the flow to a particular set of
remote services and executes the script.

HPSearch engines are Web Services; they can be
distributed and can interoperate for load balancing.
• Boss/Worker model


ProxyWebService: a wrapper class that adds
notification and streaming support to a Web
Service.
More info: http://www.hpsearch.org
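
The idea of scripting a flow and letting an engine bind it to services can be sketched as follows. This is a toy stand-in written in Python (HPSearch itself scripts flows in JavaScript); the class, the step names, and the placeholder step functions are all hypothetical and do not reproduce the HPSearch API.

```python
class FlowEngine:
    """Toy orchestration engine: binds step names to callables and runs them in order."""

    def __init__(self):
        self._services = {}
        self.events = []  # one notification per completed step

    def bind(self, name, service):
        # In a real deployment the callable would wrap a remote Web Service.
        self._services[name] = service

    def run(self, steps, data):
        for name in steps:
            data = self._services[name](data)
            self.events.append(f"{name}:done")
        return data


def accumulate(records):
    # Placeholder for gathering input records.
    return sorted(records)


def analyze(records):
    # Placeholder for the analysis code; NOT the PI algorithm.
    return {"count": len(records), "max": records[-1]}


def convert(summary):
    # Placeholder for converting raw output into a markup result.
    return '<result count="{count}" max="{max}"/>'.format(**summary)


engine = FlowEngine()
engine.bind("accumulate", accumulate)
engine.bind("analyze", analyze)
engine.bind("convert", convert)
result = engine.run(["accumulate", "analyze", "convert"], [5.1, 4.2, 6.0])
```

The flow itself is only the ordered list of step names; which service implements each step is decided when the engine binds it.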
[Diagram: HPSearch data flow. Components: WS-Context; WFS; WMS; Data Filter; PI Code Runner (Danube); HPSearch (Danube); GML (Danube). Other host labels: Tambora, Gridfarm001, TRex. Arrows distinguish virtual data flow from actual data flow.]
• Data can be stored in and retrieved from the third-party
repository (Context Service).
• NaradaBrokering network: used by HPSearch
engines as well as for data transfer.
• WMS submits a script execution request
(URI of script, parameters).
• HPSearch hosts an AXIS service for
remote deployment of scripts.
• PI Code Runner (Danube): accumulate data, run PI code,
create graph, convert RAW -> GML.
• HPSearch controls the Web services.
• Final output is pulled by the WMS.
• HPSearch engines communicate using the NB
messaging infrastructure.
Support for Real Time
Applications
RDAHMM: GPS Time Series Segmentation
Slide Courtesy of Robert Granat, JPL
GPS displacement (3D),
two years long.
Divided automatically
by HMM into 7 classes.
Features:
• Dip due to aquifer
drainage (days 120-250)
• Hector Mine
earthquake (day 626)
• Noisy period at
end of time series


Complex data with subtle signals is difficult for
humans to analyze, leading to gaps in analysis
HMM segmentation provides an automatic way to
focus attention on the most interesting parts of
the time series
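
To make the idea of HMM segmentation concrete, here is a minimal sketch that decodes the most likely state sequence of a one-dimensional Gaussian HMM with the Viterbi algorithm and reports where the state changes. It is a generic textbook decode with fixed, hand-picked parameters (state means, a shared sigma, a single stay probability); it is not RDAHMM's own method, which also has to fit the model to the data.

```python
import math


def viterbi_gaussian(obs, means, sigma, stay_prob):
    """Most likely state sequence for a 1-D Gaussian HMM (needs at least 2 states)."""
    n = len(means)
    log_stay = math.log(stay_prob)
    log_move = math.log((1.0 - stay_prob) / (n - 1))

    def log_emit(x, k):
        # Log density up to a constant shared by all states (same sigma everywhere).
        return -0.5 * ((x - means[k]) / sigma) ** 2

    def log_trans(j, k):
        return log_stay if j == k else log_move

    score = [math.log(1.0 / n) + log_emit(obs[0], k) for k in range(n)]
    back = []  # back[t][k]: best predecessor of state k at step t + 1
    for x in obs[1:]:
        prev, score, ptr = score, [], []
        for k in range(n):
            best = max(range(n), key=lambda j: prev[j] + log_trans(j, k))
            ptr.append(best)
            score.append(prev[best] + log_trans(best, k) + log_emit(x, k))
        back.append(ptr)

    # Trace back from the best final state.
    state = max(range(n), key=lambda k: score[k])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def change_points(path):
    """Indices where the decoded state differs from the previous sample."""
    return [i for i in range(1, len(path)) if path[i] != path[i - 1]]


# Synthetic series with one jump; all values are made up for illustration.
series = [0.1, -0.2, 0.0, 5.1, 4.9, 5.2]
states = viterbi_gaussian(series, means=[0.0, 5.0], sigma=1.0, stay_prob=0.9)
jumps = change_points(states)
```

The change points are the places an analyst would look at first, which is the sense in which segmentation focuses attention on the interesting parts of a series.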
Towards Real-Time RDAHMM



A real-time version of RDAHMM could
potentially be used to detect state change
events in live data from a GPS station.
SCIGN maintains 125+ GPS stations, so
trivially parallel RDAHMM clones can
monitor state changes in the entire
network.
But first we must get the data to
RDAHMM.
NaradaBrokering: Message
Transport for Distributed Services

NB is a distributed messaging software system.
• http://www.naradabrokering.org

The NB system virtualizes transport links between components.
• Supports TCP/IP, parallel TCP/IP, UDP, SSL.

See e.g.
http://grids.ucs.indiana.edu/ptliupages/publications/AllHands2005NB-Paper.pdf
for trans-Atlantic parallel TCP/IP timings.
[Diagram labels: SOPAC GPS Services; NaradaBrokering topics.]
More Information


Contact: [email protected]
GIS Work at CGL: www.crisisgrid.org
• Software, demos, publications
• Several recent manuscript submissions are posted or will
be posted soon.


HPSearch at CGL: www.hpsearch.org
SERVOGrid Web Sites
• Our fine parent project
• http://servo.jpl.nasa.gov/
• http://quakesim.jpl.nasa.gov/
Acknowledgements



Geoffrey Fox, Community Grids Lab
director.
Shrideep Pallickara: NaradaBrokering
design/development lead
Grad Students: Ahmet Sayar, Galip
Aydin, Mehmet Aktas, Harshawardhan
Gadgil
Backup Slides
SERVO Apps and Their Data





GeoFEST: Three-dimensional viscoelastic finite element model for
calculating nodal displacements and tractions. Allows for realistic
fault geometry and characteristics, material properties, and body
forces.
• Relies upon fault models with geometric and material properties.
Virtual California: Program to simulate interactions between vertical
strike-slip faults using an elastic layer over a viscoelastic half-space.
• Relies upon fault and fault friction models.
Pattern Informatics: Calculates regions of enhanced probability for
future seismic activity based on the seismic record of the region
• Uses seismic data archives
RDAHMM: Time series analysis program based on Hidden Markov
Modeling. Produces feature vectors and probabilities for
transitioning from one class to another.
• Used to analyze GPS and seismic catalog archives.
• Can be adapted to detect state change events in real time.
We will focus on the latter two.
Some SERVOGrid
Research Challenges
Problems with Conventional Web
Services

Transport: HTTP Request/Response is a poor
choice for non-trivial data transport.
• Much better to stream out data without knowing
the content-length.

Representation: ASCII XML is inefficient in
obvious and not so obvious ways.
• For example, WS security depends upon
canonicalization to make reproducible message
digests.

Efficiency and performance are not just a high
performance computing problem.
• Needed to support PDAs and other devices
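
The canonicalization point can be illustrated with a small sketch: two serializations of the same XML content differ byte for byte, so their raw digests differ, but their canonical forms hash to the same value. The example uses Python's built-in C14N 2.0 support (Python 3.8+); WS-Security deployments typically specify other canonicalization variants, so treat this only as a demonstration of the principle. The message content is made up.

```python
import hashlib
import xml.etree.ElementTree as ET


def raw_digest(xml_text):
    """SHA-256 of the text exactly as written."""
    return hashlib.sha256(xml_text.encode("utf-8")).hexdigest()


def canonical_digest(xml_text):
    """SHA-256 of the C14N 2.0 canonical form of the document."""
    canonical = ET.canonicalize(xml_text)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same content; attribute order, quoting, and spacing differ.
first = '<msg id="1" topic="gps"><v>1</v></msg>'
second = "<msg topic='gps'   id='1'><v>1</v></msg>"

same_after_c14n = canonical_digest(first) == canonical_digest(second)
same_raw = raw_digest(first) == raw_digest(second)
```

Every signer and verifier has to run this extra normalization pass over the text, which is one of the less obvious costs of a textual representation.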
NaradaBrokering and Web
Services

SOAP 1.2 defines a message routing model across
distributed SOAP nodes.
• Naturally maps to an NB implementation.
• This has just been released from www.naradabrokering.org


NB also has support for WS-Eventing and WS-ReliableMessaging.
More generally, we argue for the use of software
messaging substrates to provide/implement
desirable “quality of service” features
• Transport, routing/addressing, reliability, security, discovery,
etc.
• Specific service capabilities (like “run job”, “move file”,
“query data”) are decoupled from the substrate capabilities.
Efficient XML Representation

The XML Infoset provides an abstract data model.
• SOAP 1.2 is defined using the Infoset.

This separates XML from “angle bracket notation”
restrictions.
• Infoset-compliant binary representations are possible.
• No loss of data, so you can translate between binary and
ascii representations.

Current lab research investigates hand-held
applications.
• See
http://grids.ucs.indiana.edu/ptliupages/publications/OptSOAP_CTS05.pdf

But easily extensible to high performance transport
problems.
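
A minimal sketch of the "no loss of data" claim: the same element tree is written in a length-prefixed binary form and read back, and the restored tree serializes to the same angle-bracket text. The binary layout here is a toy format invented for this example (it covers only tags, attributes, text, and tails); it is not a standard binary XML encoding and not the lab's implementation.

```python
import struct
import xml.etree.ElementTree as ET


def _put(buf, text):
    """Append a 4-byte length followed by the UTF-8 bytes (None is stored as empty)."""
    raw = (text or "").encode("utf-8")
    buf.extend(struct.pack(">I", len(raw)))
    buf.extend(raw)


def _encode_into(elem, buf):
    for field in (elem.tag, elem.text, elem.tail):
        _put(buf, field)
    buf.extend(struct.pack(">I", len(elem.attrib)))
    for key, value in elem.attrib.items():
        _put(buf, key)
        _put(buf, value)
    buf.extend(struct.pack(">I", len(elem)))  # number of child elements
    for child in elem:
        _encode_into(child, buf)


def encode(elem):
    """Serialize an element tree to the toy binary form."""
    buf = bytearray()
    _encode_into(elem, buf)
    return bytes(buf)


def _get(data, pos):
    (size,) = struct.unpack_from(">I", data, pos)
    start = pos + 4
    return data[start:start + size].decode("utf-8"), start + size


def _count(data, pos):
    (n,) = struct.unpack_from(">I", data, pos)
    return n, pos + 4


def decode(data, pos=0):
    """Rebuild an element tree; returns (element, next read position)."""
    tag, pos = _get(data, pos)
    text, pos = _get(data, pos)
    tail, pos = _get(data, pos)
    elem = ET.Element(tag)
    elem.text, elem.tail = text or None, tail or None
    n_attr, pos = _count(data, pos)
    for _ in range(n_attr):
        key, pos = _get(data, pos)
        value, pos = _get(data, pos)
        elem.set(key, value)
    n_child, pos = _count(data, pos)
    for _ in range(n_child):
        child, pos = decode(data, pos)
        elem.append(child)
    return elem, pos


# Hypothetical message; the element and attribute names are made up.
original = ET.fromstring('<pos station="X1"><lat>33.5</lat><lon>-117.2</lon></pos>')
restored, _ = decode(encode(original))
round_trip_ok = ET.tostring(restored) == ET.tostring(original)
```

A practical encoding would also need to handle namespaces, comments, and typed values, and would choose a more compact framing than fixed 4-byte lengths.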
A Big Picture for SERVOGrid
[Layered architecture diagram. Labels in extraction order: Science Grid; Tsunami; EPR/CIP; Earthquake Response WS; Earthquake Forecast WS; Collaboration Grid; Portals; Sensor/Database Grid; GIS Grid; Registry; EPR/CIP Grid; Data Access/Storage; Visualization Grid; Compute Grid; Metadata; Core Grid Services; Security; Notification; Workflow; Messaging; Physical Network.]
Figure 1: Science, Critical Infrastructure Protection (CIP) and Emergency
Preparedness and Response (EPR) Grids built as a Grid of Web Service
(WS) Grids
RDAHMM: SCIGN GPS Network Analysis
Slide Courtesy of Robert Granat, JPL
Now segment all 127
GPS stations
In blue: Number
of stations that change
state on a given day
In red: Seismic activity
Days with many state
changes often do not
correlate with large
earthquakes.



Have found a way to detect regional aseismic signals
This software is being integrated with the Quakesim web portal
Scenarios for use with real time streaming data through the web
portal are currently being investigated
Support for Streaming Data

We use NaradaBrokering messaging software to manage
data streams and filters.
• Open source, Java-based software from the Community Grids
Lab
• Based on topic-based publication/subscription for delivery of
messages from/to multiple endpoints.
• “Message” can be anything, including SOAP and binary data
streams.
• We use this for audio/video collaboration.
• More recently using it to build Web Service messaging
substrates: SOAP 1.2 routing model, WS-Reliability, WS-Eventing.

NB ensures reliable delivery of events in the case of broker
or client failures and prolonged entity disconnects.
• Also supports replay.

Implements high-performance protocols (message transit
time of 1 to 2 ms per hop)
GPS Stations

The current implementation provides real-time access to
GPS messages from the following stations in RYO, ASCII and
GML formats:
[Map of station locations. Legible labels: SC-1 through SC-11 and SSC-A through SSC-D.]
SOPAC GPS Services



As a case study we implemented services to provide real-time access to GPS position messages collected from
several SOPAC networks.
The next step is to couple data assimilation tools (such as
RDAHMM) to real-time streaming GPS data.
Further steps:
• Programming APIs: currently we assume the subscriber speaks
NaradaBrokering Java APIs (either NB’s native API or Java
Message Service). We need to investigate appropriate Web Service
standards and C/C++ bindings.
• SOAP enveloping of the GML message stream.
• A Sensor Collection Service will be implemented to provide
metadata about GPS sensors in SensorML.
Position Messages



SOPAC provides 1-2 Hz real-time
position messages from various GPS
networks in a binary format called
RYO.
Position messages are broadcast
through RTD server ports.
We have implemented one tool that
converts RYO messages into ASCII
text and another that converts ASCII
messages into GML.
Real-Time Access to Position
Messages



We have a Forwarder tool that connects to an
RTD server port to forward RYO messages
to an NB topic.
The RYO to ASCII converter tool subscribes to
this topic to collect binary messages and
converts them to ASCII. Then it publishes
the ASCII messages to another NB topic.
The ASCII to GML converter subscribes to this
topic and publishes GML messages to
another topic.
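
The chain of topics described above can be sketched with an in-memory publish/subscribe broker. Everything here is a stand-in: the Broker class is not the NaradaBrokering API, the topic names are invented, the binary layout is a made-up substitute for RYO (whose real format is not reproduced here), and the output is a simplified GML-like fragment.

```python
import struct
from collections import defaultdict


class Broker:
    """In-memory stand-in for a topic-based publish/subscribe broker."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in list(self._subscribers[topic]):
            callback(message)


# Hypothetical binary layout (NOT the real RYO format):
# 4-byte station id followed by x, y, z as big-endian doubles.
FAKE_RYO = struct.Struct(">4sddd")


def wire_converters(broker):
    """Attach the two converters so each one feeds the next topic."""

    def binary_to_ascii(raw):
        station, x, y, z = FAKE_RYO.unpack(raw)
        broker.publish("gps/ascii", f"{station.decode('ascii')} {x} {y} {z}")

    def ascii_to_gml(line):
        station, x, y, z = line.split()
        gml = (
            f'<Position station="{station}" xmlns:gml="http://www.opengis.net/gml">'
            f"<gml:Point><gml:coordinates>{x},{y},{z}</gml:coordinates></gml:Point>"
            "</Position>"
        )
        broker.publish("gps/gml", gml)

    broker.subscribe("gps/binary", binary_to_ascii)
    broker.subscribe("gps/ascii", ascii_to_gml)


broker = Broker()
wire_converters(broker)
received = []
broker.subscribe("gps/gml", received.append)  # an end consumer of GML messages

# The forwarder's role: push one binary message onto the first topic.
broker.publish("gps/binary", FAKE_RYO.pack(b"ST01", 1.5, 2.5, 3.5))
```

Each converter knows only its input and output topics, so a consumer can subscribe at whichever stage (binary, ASCII, or GML) suits it.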