Integrating Geographical Information Systems and Grid Applications Marlon Pierce ([email protected]) Contributions: Yili Gong, Ahmet Sayar, Galip Aydin, Mehmet Aktas, Harshawardhan Gadgil, Zhigang Qi Community Grids Lab Indiana.

Integrating Geographical
Information Systems and Grid
Applications
Marlon Pierce ([email protected])
Contributions: Yili Gong, Ahmet Sayar, Galip Aydin,
Mehmet Aktas, Harshawardhan Gadgil, Zhigang Qi
Community Grids Lab
Indiana University
Project Funding: NASA AIST
Some Project Organizational Details

We need a project-wide mailing list or two.
I can set these up really quickly at IU.

Code repositories?
I have started using SourceForge and SVN for several projects and have been generally happy.
I think it's good for visibility of the project, and good to show program managers.
SF also has project management tools such as bug trackers.
But the licensing model may not work for JPL.

I’ve also become a recent convert to using Wikis for group-editable web pages.
We should do this or its equivalent.
I have one at www.crisisgrid.org, but it is trivial to make a new one on www.servogrid.org also.
Very low maintenance.
Something New: Using the TeraGrid

The NSF TeraGrid is an administrative federation of supercomputing facilities across the country.
SDSC, NCSA, IU, ANL, PU, ORNL, TACC, PSC.

Four useful TG facts:
Almost any US researcher can apply to get 30,000 hours (somewhat painful web forms to fill out). You can get more hours if you apply.
This researcher can share his allocation with others (1-page form; I used to give John, Gleb, Terry and others accounts).
All TG machines try to have the same software environments.
All come with Globus installed.
Problems with TeraGrid

TeraGrid is still broken up into fiefdoms.
Articles of confederation instead of constitution.

There is no way to do the following query:
“Dear TG, I want to run the following GeoFEST simulation. It will require the following resources. Please submit to the best available machine. Love, Marlon.”

You still have to log in to a specific machine and submit to its specific queuing system.
PBS, LoadLeveler, LSF, etc.
Our Solution: Condor-G

Condor is a famous scheduler/cycle scavenger from U Wisconsin.
To use it, run Condor software on all nodes.
Has a “matchmaker” component that matches a user’s request to available resources. “ClassAds” in Condor-speak.

Condor-G is a bit different.
It is a Condor client interface that can submit Globus jobs.
Globus in turn can hide differences between queuing systems.

You only need Condor-G installed on one machine.
Can be anywhere.

Both Condor and Condor-G have a Web Service interface called Birdbath.
We have built portlets out of these things.
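As a rough illustration of how Condor-G hides the local queuing system, the sketch below renders a grid-universe submit description in which only the Globus jobmanager suffix changes between sites. The host names, the executable name, and the helper `condor_g_submit` are hypothetical; the `universe = grid` and `grid_resource = gt2 ...` lines follow the documented Condor-G submit syntax, but the exact form depends on the Condor and Globus versions installed.

```python
def condor_g_submit(executable, arguments, gatekeeper, jobmanager):
    """Render a Condor-G submit description for a Globus (GT2) resource.

    gatekeeper and jobmanager are illustrative placeholders; the job
    itself is described the same way no matter which queue runs it.
    """
    lines = [
        "universe      = grid",
        f"grid_resource = gt2 {gatekeeper}/jobmanager-{jobmanager}",
        f"executable    = {executable}",
        f"arguments     = {arguments}",
        "output        = job.out",
        "error         = job.err",
        "log           = job.log",
        "queue",
    ]
    return "\n".join(lines)


# Hypothetical gatekeepers: one site fronts PBS, the other LSF.
targets = [
    ("gatekeeper.site-a.example.org", "pbs"),
    ("gatekeeper.site-b.example.org", "lsf"),
]

for host, jobmanager in targets:
    print(condor_g_submit("geofest", "input.dat", host, jobmanager))
    print()
```

The text written by this helper would be saved to a file and handed to `condor_submit` on the single machine where Condor-G is installed.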
[Figure: two deployment diagrams, “Condor Only” and “Condor-G and Globus”. Labels: (Portal) Client, Condor Master, Condor, Condor-G, Globus, PBS, LSF.]
What’s the Problem?

The problem is that the Condor matchmaker only works for Condor.
Condor daemons on various machines report back to the collector at regular intervals.

Condor-G needs an external provider of resource information, since Condor is only installed in one place.

We are solving this problem by using GPIR (a resource monitoring tool installed on the TG) to construct ClassAds and publish them to the matchmaker.

We have prototyped this for GeoFEST, but need to take it to some sort of “production” level.
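The idea of feeding the matchmaker from an external monitor can be sketched as follows. This is a simplified model, not the real ClassAd language or the GPIR schema: the record fields, the attribute names, and the helpers `make_ad` and `match` are all assumptions made for illustration.

```python
def make_ad(record):
    """Turn one monitoring record into a flat machine ad (a plain dict)."""
    return {
        "Name": record["host"],
        "JobManager": record["scheduler"],
        "FreeCpus": record["total_cpus"] - record["busy_cpus"],
        "QueuedJobs": record["queued_jobs"],
    }


def match(job, ads):
    """Return the highest-ranked ad that satisfies the job's requirements."""
    candidates = [ad for ad in ads if job["requirements"](ad)]
    return max(candidates, key=job["rank"], default=None)


# Made-up monitoring records standing in for what a resource monitor reports.
records = [
    {"host": "machine-a", "scheduler": "pbs", "total_cpus": 1024,
     "busy_cpus": 1000, "queued_jobs": 40},
    {"host": "machine-b", "scheduler": "lsf", "total_cpus": 512,
     "busy_cpus": 200, "queued_jobs": 5},
    {"host": "machine-c", "scheduler": "loadleveler", "total_cpus": 256,
     "busy_cpus": 250, "queued_jobs": 0},
]
ads = [make_ad(r) for r in records]

# The job needs at least 16 free CPUs and prefers idle machines with short queues.
job = {
    "requirements": lambda ad: ad["FreeCpus"] >= 16,
    "rank": lambda ad: ad["FreeCpus"] - 10 * ad["QueuedJobs"],
}

best = match(job, ads)
print(best["Name"], best["JobManager"])
```

In a production setup the ads would be refreshed at regular intervals, mirroring how Condor daemons report to the collector.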
Bigger Research Issue: Generalized Matchmaking

Condor matchmaking is only good for running jobs.

More generally you want to do Web Service matchmaking on a Grid.
May be “find me best machine to run GeoFEST”.
May be “find me QuakeTables service with Australian faults”.

Workflow also needs matchmaking, and matchmaking should be decoupled from workflow execution.
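One way to picture matchmaking decoupled from workflow execution is a workflow runner that only states what each step needs and is handed a resolver function to find a matching service. The registry entries, endpoint URLs, and the functions `find_services` and `run_workflow` below are hypothetical; this is a sketch of the separation of concerns, not the project's design.

```python
# Hypothetical service descriptions; a real system would query an
# information service rather than a hard-coded list.
SERVICES = [
    {"type": "QuakeTables", "region": "California",
     "endpoint": "http://example.org/quaketables/ca"},
    {"type": "QuakeTables", "region": "Australia",
     "endpoint": "http://example.org/quaketables/au"},
    {"type": "Execution", "code": "GeoFEST",
     "endpoint": "http://example.org/exec/machine-a"},
]


def find_services(registry, **wanted):
    """Matchmaking: return every description whose attributes equal the request."""
    return [s for s in registry
            if all(s.get(key) == value for key, value in wanted.items())]


def run_workflow(steps, resolve, invoke):
    """Execution: knows nothing about the registry, only calls resolve()."""
    results = []
    for step in steps:
        matches = resolve(**step["needs"])
        if not matches:
            raise LookupError(f"no service matches {step['needs']}")
        results.append(invoke(matches[0]["endpoint"], step["name"]))
    return results


steps = [
    {"name": "fetch-faults",
     "needs": {"type": "QuakeTables", "region": "Australia"}},
    {"name": "run-simulation",
     "needs": {"type": "Execution", "code": "GeoFEST"}},
]

log = run_workflow(
    steps,
    resolve=lambda **wanted: find_services(SERVICES, **wanted),
    invoke=lambda endpoint, name: f"{name} -> {endpoint}",  # stand-in for a service call
)
for line in log:
    print(line)
```

Because the resolver is passed in, the same workflow could be bound to a different matchmaker without changing the execution code.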
Workflow and Matchmaking
[Figure: Workflow and Matchmaking diagram. Labels: Matchmaking, Workflow, User Layer, QuakeTables Service, QuakeTables Australia, QuakeTables California, VC Service, IU’s Big Red, NCSA’s Cobalt.]
QuakeSim/SERVO IT/CS Development Overview

Portlet-based portal components allow different portlets to be exchanged between projects.
Form-based portlets --> Interactive Maps
These are clients to services.

Sensor Grid: Topic-based publish-subscribe systems support operations on streaming data.

Web services allow request/response style access to data and codes.
GIS services (WMS, WFS)
“Execution grid” services for running codes and moving files.
Information services (WS-Context) and Web Service workflow (HPSearch)
Portlets and Portals

Portlets are a standard way for Java web applications to be shared between different portal containers.

A portlet may be a web application, such as a Google map client, that I want to put into a container.
Will inherit login, access control, layout management, etc.

We will show some demos for RDAHMM and ST-Filter later.

We use Java Server Faces for development, so there may be some solvable interoperability issues.

The main point is that portlets allow REASON and QuakeSim to exchange user interface components.
We still need to develop client libraries and Web Services.
Sensor Grid Overview

QuakeTables and Web Feature Service provide access to archival data.
Faults, GPS time series, seismic records.

Our Sensor Grid architecture supports access to real-time data.
Integrated with all 70 stations of CRTN.
Consists of chains of filters communicating on a network through a publish/subscribe broker.
Each filter does a single task and passes the data along.
Filters are also web services, but the communication is currently proprietary.
Could be adapted to use SOAP and the Axis 2 one-way communication model, but this is an academic exercise.
Filters can be applications, like RDAHMM.

Scripps collaborators have a prototype command line client if you want to pipe and grep.
Or you can develop your own stream sink.
SensorGrid Architecture
Real-Time Services for GPS Observations


Real-time data processing is supported by employing filters around a publish/subscribe messaging system.
The filters are small applications extended from a generic Filter class to inherit publish and subscribe capabilities.
[Figure: Input Signal --> Filter --> Output Signal]
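The filter pattern described above can be sketched with a tiny in-memory broker. This is not the NaradaBrokering API or the project's actual Filter class: the `Broker` class, the topic names, the message format, and the two example filters are assumptions chosen only to show how subclasses inherit subscribe and publish behavior and are chained through topics.

```python
from collections import defaultdict


class Broker:
    """Minimal in-memory topic-based publish/subscribe broker (illustrative)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in list(self._subscribers[topic]):
            callback(message)


class Filter:
    """Generic filter: subscribes to one topic, publishes results to another."""

    def __init__(self, broker, in_topic, out_topic):
        self.broker = broker
        self.out_topic = out_topic
        broker.subscribe(in_topic, self._on_message)

    def _on_message(self, message):
        result = self.process(message)
        if result is not None:  # returning None drops the message
            self.broker.publish(self.out_topic, result)

    def process(self, message):
        raise NotImplementedError


class ParseFilter(Filter):
    """Single task: turn a 'station,east,north' text line into a record."""

    def process(self, message):
        station, east, north = message.split(",")
        return {"station": station, "east": float(east), "north": float(north)}


class ThresholdFilter(Filter):
    """Single task: pass on only records with a large east displacement."""

    def process(self, message):
        return message if abs(message["east"]) > 1.0 else None


# Chain: raw --> ParseFilter --> parsed --> ThresholdFilter --> alerts
broker = Broker()
ParseFilter(broker, "raw", "parsed")
ThresholdFilter(broker, "parsed", "alerts")

received = []  # a simple stream sink
broker.subscribe("alerts", received.append)

broker.publish("raw", "STN1,0.2,0.1")
broker.publish("raw", "STN2,2.5,0.3")
print(received)
```

Adding a stage to the chain only requires a new subclass and a topic name; the existing filters are untouched.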
Filter Chains
NaradaBrokering Topics
Real-Time positions on Google maps
Real-Time Station Position Changes
RDAHMM + Real-Time GPS Integration