Implementing Geographical Information System Services for SERVOGrid
Marlon Pierce
Community Grids Lab, Indiana University

SERVOGrid Components

Component (“portlet”)-based portals.
• OGCE mentioned by Chris Hill
Web Services for “execution grid” services
• Ant-based job specification
• File transfer
• Distributed session management (“context”).
Geographic Information System (GIS) services for “data grid” services.
• Web Map Service
• Web Feature Service
• GIS-compatible information services.
Support for streaming, real-time data.
Distributed service management/orchestration
• Using events and data streams.

Guiding Principles

Grids are composed of families of services
• Data, execution, information, …
Use “WS-I+” approach to building service families.
• Build Grids out of Web Service standards conservatively. WS-Interoperability is the starting point.
• See position paper http://grids.ucs.indiana.edu/ptliupages/publications/WebServiceGrids.pdf
SOAP and WSDL provide a universal messaging framework and service definition language.
• All services should communicate with the same message format.
• Message delivery is left as an exercise. Implementations are interesting.

Pattern Informatics (PI)

PI is a technique developed at University of California, Davis for analyzing earthquake seismic records to forecast regions with high future seismic activity.
• They have correctly forecast the locations of 15 of the last 16 earthquakes with magnitude > 5.0 in California.
See Tiampo, K. F., Rundle, J. B., McGinnis, S. A., & Klein, W. Pattern dynamics and forecast methods in seismically active regions. Pure Appl. Geophys. 159, 2429-2467 (2002).
• http://citebase.eprints.org/cgibin/fulltext?format=application/pdf&identifier=oai%3AarXiv.org%3Acond-mat%2F0102032
PI is being applied to other regions of the world, and John Rundle has gotten a lot of press.
• Google “John Rundle UC Davis Pattern Informatics”

Pattern Informatics in a Grid Environment

PI in a Grid environment:
• Hotspot forecasts are made using publicly available seismic records.
  • Southern California Earthquake Data Center
  • Advanced National Seismic System (ANSS) catalogs
• Code location is unimportant, can be a service through remote execution
• Results need to be stored, shared, modified
• Grid/Web Services can provide these capabilities
Problems:
• How do we provide programming interfaces (not just user interfaces) to the above catalogs?
• How do we connect remote data sources directly to the PI code?
• How do we automate this for the entire planet?
Solutions:
• Use GIS services to provide the input data, plot the output data
  • Web Feature Service for data archives
  • Web Map Service for generating maps
• Use HPSearch tool to tie together and manage the distributed data sources and code.

[Figure: Aggregating WMS connected through WSDL stubs and SOAP to "WFS + Seismic Rec." and "WFS + State Bounds", and through HTTP (“REST”) to "WMS + OnEarth".]

GIS Behind the Scenes

The web features are served up by a Web Feature Service.
Web Map Service aggregates maps
• NASA OnEarth + our own renderings.
We re-implement Open Geospatial Consortium standards using Web Service Standards.
• SOAP messages, WSDL service definitions.
• Will allow us to separate messages from HTTP transport layer in future.
More WMS Info:
• http://grids.ucs.indiana.edu/ptliupages/publications/acm-gissayar.pdf
• http://grids.ucs.indiana.edu/ptliupages/publications/Geoinformatics05_asayar.pdf
More WFS Info:
• http://grids.ucs.indiana.edu/ptliupages/publications/gwpap243.pdf
More general info, software, demos: http://www.crisisgrid.org

Tying It All Together: HPSearch

HPSearch is an engine for orchestrating distributed Web Service interactions
• It uses an event system and supports both file transfers and data streams.
• Legacy name
HPSearch flows can be scripted with JavaScript
• HPSearch engine binds the flow to a particular set of remote services and executes the script.
HPSearch engines are Web Services; they can be distributed and can interoperate for load balancing.
• Boss/Worker model
ProxyWebService: a wrapper class that adds notification and streaming support to a Web Service.
More info: http://www.hpsearch.org

[Figure: HPSearch data flow for the PI application, showing virtual and actual data flow. Component labels: WS Context, WFS, WMS, Data Filter, HPSearch, PI Code Runner, GML. Host names in the diagram: Tambora, Gridfarm001, Danube, TRex.]
Diagram annotations:
• Data can be stored in and retrieved from the third-party repository (Context Service).
• NaradaBroker network: used by HPSearch engines as well as for data transfer.
• WMS submits script execution request (URI of script, parameters).
• HPSearch hosts an AXIS service for remote deployment of scripts.
• Processing steps: Accumulate Data, Run PI Code, Create Graph, Convert RAW -> GML.
• HPSearch controls the Web services.
• Final output pulled by the WMS.
• HPSearch engines communicate using NB messaging infrastructure.

Support for Real Time Applications

RDAHMM: GPS Time Series Segmentation

Slide Courtesy of Robert Granat, JPL
GPS displacement (3D), length two years.
Divided automatically by HMM into 7 classes.
Features:
• Dip due to aquifer drainage (days 120-250)
• Hector Mine earthquake (day 626)
• Noisy period at end of time series
Complex data with subtle signals is difficult for humans to analyze, leading to gaps in analysis.
HMM segmentation provides an automatic way to focus attention on the most interesting parts of the time series.

Towards Real-Time RDAHMM

A real-time version of RDAHMM could potentially be used to detect state change events in live data from a GPS station.
SCIGN maintains 125+ GPS stations, so trivially parallel RDAHMM clones can monitor state changes in the entire network.
But first we must get the data to RDAHMM.

NaradaBrokering: Message Transport for Distributed Services

NB is a distributed messaging software system.
• http://www.naradabrokering.org
NB system virtualizes transport links between components.
• Supports TCP/IP, parallel TCP/IP, UDP, SSL.
See e.g. http://grids.ucs.indiana.edu/ptliupages/publications/AllHands2005NB-Paper.pdf for trans-Atlantic parallel TCP/IP timings.

SOPAC GPS Services

[Figure: NaradaBrokering topics]

More Information

Contact: [email protected]
GIS Work at CGL: www.crisisgrid.org
• Software, demos, publications
• Several recent manuscript submissions are/will be posted soon.
HPSearch at CGL: www.hpsearch.org
SERVOGrid Web Sites
• Our fine parent project
• http://servo.jpl.nasa.gov/
• http://quakesim.jpl.nasa.gov/

Acknowledgements

Geoffrey Fox, Community Grids Lab director.
Shrideep Pallickara: NaradaBrokering design/development lead
Grad Students: Ahmet Sayar, Galip Aydin, Mehmet Aktas, Harshawadhan Gadgil

Backup Slides

SERVO Apps and Their Data

GeoFEST: Three-dimensional viscoelastic finite element model for calculating nodal displacements and tractions. Allows for realistic fault geometry and characteristics, material properties, and body forces.
• Relies upon fault models with geometric and material properties.
Virtual California: Program to simulate interactions between vertical strike-slip faults using an elastic layer over a viscoelastic half-space.
• Relies upon fault and fault friction models.
Pattern Informatics: Calculates regions of enhanced probability for future seismic activity based on the seismic record of the region
• Uses seismic data archives
RDAHMM: Time series analysis program based on Hidden Markov Modeling. Produces feature vectors and probabilities for transitioning from one class to another.
• Used to analyze GPS and seismic catalog archives.
• Can be adapted to detect state change events in real time.
We will focus on the latter two.

Some SERVOGrid Research Challenges

Problems with Conventional Web Services

Transport: HTTP Request/Response is a poor choice for non-trivial data transport.
• Much better to stream out data without knowing the content-length.
Representation: ASCII XML is inefficient in obvious and not so obvious ways.
• For example, WS security depends upon canonicalization to make reproducible message digests.
Efficiency and performance are not just a high performance computing problem.
• Needed to support PDAs and other devices

NaradaBrokering and Web Services

SOAP 1.2 defines message routing across distributed SOAP Nodes.
• Naturally maps to an NB implementation.
• This has just been released from www.naradabrokering.org
NB also has support for WS-Eventing and WS-ReliableMessaging.
More generally, we argue for the use of software messaging substrates to provide/implement desirable “quality of service” features
• Transport, routing/addressing, reliability, security, discovery, etc.
• Specific service capabilities (like “run job”, “move file”, “query data”) are decoupled from the substrate capabilities.

Efficient XML Representation

The XML Infoset provides an abstract data model.
• SOAP 1.2 is defined using the Infoset.
This separates XML from “angle bracket notation” restrictions.
• Infoset-compliant binary representations are possible.
• No loss of data, so you can translate between binary and ASCII representations.
Current lab research investigates hand-held applications.
• See http://grids.ucs.indiana.edu/ptliupages/publications/OptSOAP_CTS05.pdf
But easily extensible to high performance transport problems.

More Information

Contact: [email protected]
GIS Work at CGL: www.crisisgrid.org
• Software, demos, publications
• Several recent manuscript submissions are/will be posted soon.
HPSearch at CGL: www.hpsearch.org
SERVOGrid Web Sites
• Our fine parent project
• http://servo.jpl.nasa.gov/
• http://quakesim.jpl.nasa.gov/

A Big Picture for SERVOGrid

[Figure 1: Science, Critical Infrastructure Protection (CIP) and Emergency Preparedness and Response (EPR) Grids built as a Grid of Web Service (WS) Grids. Diagram labels: Science Grid, Tsunami, EPR/CIP, Earthquake Response WS, Earthquake Forecast WS, Collaboration Grid, Portals, Sensor/Database Grid, GIS Grid, Registry, EPR/CIP Grid, Data Access/Storage, Visualization Grid, Compute Grid, Metadata, Core Grid Services, Security, Notification, Workflow, Messaging, Physical Network.]

RDAHMM: SCIGN GPS Network Analysis

Slide Courtesy of Robert Granat, JPL
Now segment all 127 GPS stations
• In blue: Number of stations that change state on a given day
• In red: Seismic activity
Days with many state changes often do not correlate with large earthquakes.
Have found a way to detect regional aseismic signals.
This software is being integrated with the Quakesim web portal.
Scenarios for use with real time streaming data through the web portal are currently being investigated.

Support for Streaming Data

We use NaradaBrokering messaging software to manage data streams and filters.
• Open source, Java-based software from the Community Grids Lab
• Based on topic-based publication/subscription for delivery of messages from/to multiple endpoints.
• “Message” can be anything, including SOAP and binary data streams.
• We use this for audio/video collaboration.
• More recently using it to build Web Service messaging substrates
  • SOAP 1.2 routing model, WS-Reliability, WS-Eventing
NB ensures reliable delivery of events in the case of broker or client failures and prolonged entity disconnects.
• Also supports replay.
Implements high-performance protocols (message transit time of 1 to 2 ms per hop)

GPS Stations

Current implementation provides real-time access to GPS messages from the following stations in RYO, ASCII and GML formats:
[Figure: diagram with node labels SC-1 to SC-11 and SSC-A to SSC-D; the station list is not recoverable from the transcript.]

SOPAC GPS Services

As a case study we implemented services to provide real-time access to GPS position messages collected from several SOPAC networks.
Next step is to couple data assimilation tools (such as RDAHMM) to real-time streaming GPS data.

Next steps
• Programming APIs: currently we assume the subscriber speaks NaradaBrokering Java APIs (either NB’s native API or Java Messaging Service). Need to investigate appropriate Web Service standards and C/C++ bindings.
• SOAP enveloping of the GML message stream.
• A Sensor Collection Service will be implemented to provide metadata about GPS sensors in SensorML.

Position Messages

SOPAC provides 1-2 Hz real-time position messages from various GPS networks in a binary format called RYO.
Position messages are broadcast through RTD server ports.
We have implemented two converter tools: one converts RYO messages into ASCII text, and the other converts ASCII messages into GML.

Real-Time Access to Position Messages

We have a Forwarder tool that connects to an RTD server port and forwards RYO messages to an NB topic.
The RYO to ASCII converter tool subscribes to this topic, collects the binary messages, converts them to ASCII, and publishes the ASCII messages to another NB topic.
The ASCII to GML converter subscribes to that topic and publishes GML messages to a third topic.
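
The chain of topics described above (binary messages in, ASCII out, then GML out) can be illustrated with a small sketch. This is not the NaradaBrokering API and not the real RYO format: the in-memory `Broker` class, the topic names, the 4-byte-id-plus-three-doubles record layout, and the simplified GML fragment are all assumptions made for illustration only.

```python
import struct
from collections import defaultdict


class Broker:
    """In-memory stand-in for a topic-based publish/subscribe broker (hypothetical)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Deliver the message to every callback registered on this topic.
        for callback in list(self._subscribers[topic]):
            callback(message)


# Hypothetical binary layout, NOT the real RYO format:
# 4-byte station id, then latitude, longitude, height as little-endian doubles.
RECORD = struct.Struct("<4sddd")


def binary_to_ascii(broker, in_topic, out_topic):
    """Subscribe to binary records and republish them as ASCII lines."""

    def on_message(raw):
        station, lat, lon, height = RECORD.unpack(raw)
        line = f"{station.decode('ascii')} {lat:.6f} {lon:.6f} {height:.3f}"
        broker.publish(out_topic, line)

    broker.subscribe(in_topic, on_message)


def ascii_to_gml(broker, in_topic, out_topic):
    """Subscribe to ASCII lines and republish them as simplified GML points."""

    def on_message(line):
        station, lat, lon, height = line.split()
        gml = (
            f'<gml:Point xmlns:gml="http://www.opengis.net/gml" gml:id="{station}">'
            f"<gml:pos>{lat} {lon} {height}</gml:pos></gml:Point>"
        )
        broker.publish(out_topic, gml)

    broker.subscribe(in_topic, on_message)


def run_demo():
    broker = Broker()
    binary_to_ascii(broker, "gps/binary", "gps/ascii")
    ascii_to_gml(broker, "gps/ascii", "gps/gml")
    received = []
    broker.subscribe("gps/gml", received.append)
    # Forwarder role: push one raw record onto the first topic.
    broker.publish("gps/binary", RECORD.pack(b"DEMO", 33.5, -117.25, 120.0))
    return received


if __name__ == "__main__":
    for message in run_demo():
        print(message)
```

Each converter only knows its input and output topic, so a new consumer (for example a time series analysis tool) could subscribe to any stage without changes to the upstream tools.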