Concurrent Web Map Cache Server:
A “Web 2.0 Meets SOA” Case Study
Zao Liu, Marlon Pierce, Sunghoon Ko, Geoffrey Fox
Community Grids Laboratory
Indiana University
&
Neil Devadasan
Polis Center
Indiana University Purdue University Indianapolis
GIS Map Servers as a Web 2.0 Case Study
 There are several different products for creating online maps and allowing interaction with Geographical Information System (GIS) databases.
• ESRI, Autodesk, Open Geospatial Consortium
• These follow a classic user-driven request/response style model.
 Google Maps (released in 2005).
• Highly interactive AJAX-style clients replaced stodgy user-driven request/response.
• See http://www.collab-ogce.org/GGF15 Workshop for more information.
• More importantly, anyone could use the JavaScript API to make really sophisticated applications.
Building a Hybrid System
 Google Maps provides a highly interactive user interface and
capabilities (geolocations, directions).
 But GIS services have much more detailed local
information.
• Indiana has orthophotography with much higher zoom levels than
Google maps.
 http://www.indiana.edu/~gisdata/05orthos.html
• Local county servers have many interesting map layers not in
Google
 Parcels/property lines, school district lines
• And these tie into feature services with interesting data like pinpoint
addresses, tax assessments, etc.
 So obviously it makes sense to adopt the Google approach
but enhance it with local data.
• Ultimately we hope to tie this into representations of scientific data
generated on the Grid.
[Figure: architecture of the hybrid system]
 County map servers: Marion County Map Server (ESRI ArcIMS), Hamilton County Map Server (AutoDesk), Cass County Map Server (OGC Web Map Server).
• Must provide an adapter for each map server type.
 Cache Server: requests map tiles at all zoom levels with all layers.
• These are converted to a uniform projection, indexed, and stored.
• Overlapping images are combined.
 Tile Server: fulfills Google map calls with cached tiles that fill the requested bounding box.
 Browser + Google Map API: the browser client fetches image tiles for the bounding box using the Google Map API.
 Google Maps Server
Building a Cache Server
Reverse engineering Map Server
requests.
Federating GIS Servers Around Indiana
 Indiana has 92 counties.
• Approximately 15 have public GIS map servers.
 Examples
• ESRI ArcIMS and ArcMap Server
 Marion, Vanderburgh, Hancock, Kosciusko, Huntington, Tippecanoe
• Autodesk MapGuide
 Hamilton, Hendricks, Monroe, Wayne
• WTH Mapserver™ Web Mapping Application (OGC Minnesota Map Server)
 Fulton, Cass, Daviess, City of Huntingburg
 There are also state-wide GIS servers.
• Orthophotography from Indiana University
• Indiana Geological Survey
 These are not normally interoperable.
Map Requests for ArcIMS and ArcMap
 Map image requests for ESRI ArcIMS and ArcMap are based on ArcXML.
• ESRI sends SOAP-like XML over HTTP.
• Note that the generated image is left on the server. The client has to retrieve it in a separate step.
• The server cleans up images every 10 minutes or so (a configurable parameter).
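The two-step exchange described above can be sketched roughly as follows. This is only an illustration: it assumes an ArcXML GET_IMAGE request carrying an ENVELOPE and an IMAGESIZE, and a response whose OUTPUT element exposes the image location in a url attribute. The function names and the sample response are hypothetical, and a real county server may need additional elements (layer lists, service names).

```javascript
// Step 1 (sketch): build the ArcXML body that asks the server to render an image.
// bbox is { minx, miny, maxx, maxy } in the server's coordinate system.
function buildArcXmlImageRequest(bbox, width, height) {
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<ARCXML version="1.1">',
    '  <REQUEST>',
    '    <GET_IMAGE>',
    '      <PROPERTIES>',
    `        <ENVELOPE minx="${bbox.minx}" miny="${bbox.miny}" maxx="${bbox.maxx}" maxy="${bbox.maxy}" />`,
    `        <IMAGESIZE width="${width}" height="${height}" />`,
    '      </PROPERTIES>',
    '    </GET_IMAGE>',
    '  </REQUEST>',
    '</ARCXML>'
  ].join('\n');
}

// Step 2 (sketch): the image stays on the server, so read its URL out of the
// XML response and fetch it separately. Returns null if no URL is present.
function extractImageUrl(responseXml) {
  const match = /<OUTPUT\b[^>]*\burl="([^"]+)"/.exec(responseXml);
  return match ? match[1] : null;
}

// Hypothetical response, for illustration only.
const sampleResponse =
  '<ARCXML version="1.1"><RESPONSE><IMAGE>' +
  '<OUTPUT url="http://example.org/output/map_123.png" />' +
  '</IMAGE></RESPONSE></ARCXML>';
console.log(extractImageUrl(sampleResponse));
```

Because the server deletes rendered images after a short interval, a harvester along these lines would want to download the image immediately after step 1 rather than queueing URLs for later.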
Map Requests for Other Types of Servers
 Map requests for other types of servers use the HTTP GET method.
Request for AutoDesk MapGuide Server
http://litemap.co.hamilton.in.us:8080/liteview/servlet/MapGuideLiteView?VERSION=1.1.1&REQUEST=Gemap&LAYERS=COUNTY_PLAN.MWF\parcels&SRS=EPSG:4326&BBOX=86.0009765625,40.06125658140474,85.99960327148438,40.062307630891&WIDTH=256&HEIGHT=256&FORMAT=image/png&BGCOLOR=0xFFFFFF&TRANSPARENT=TRUE&WMTVER=1.1.1&STYLES=
Request for WTH Web Map Server
http://thinkopengis.wthengineering.com/cgi-bin/mapserv.exe?map=cass0805.map&VERSION=1.1.1&REQUEST=GetMap&LAYERS=parcels,roads,highways&SRS=EPSG:4326&BBOX=86.4596523336861,40.6980435683496,86.3180175693406,40.7924667445799&WIDTH=600&HEIGHT=400&FORMAT=image/png&BGCOLOR=0xFFFFFF&TRANSPARENT=FALSE&WMTVER=1.1.1
 The trick is to figure out the correct format for the name/value pairs.
 The request formats of MapGuide and WMS are almost the same.
 The map image is returned directly in the HTTP response (GIF, JPG, etc.).
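Assembling the name/value pairs can be sketched as below. The parameter names are the ones visible in the sample URLs above; the function name, option names, and the example.org address are hypothetical, and individual servers may expect extra or differently spelled parameters.

```javascript
// Assemble a WMS 1.1.1-style GetMap URL from name/value pairs (sketch).
function buildGetMapUrl(baseUrl, options) {
  const params = {
    VERSION: '1.1.1',
    REQUEST: 'GetMap',
    LAYERS: options.layers.join(','),
    STYLES: '',
    SRS: 'EPSG:4326',
    // WMS 1.1.1 orders the box as min lon, min lat, max lon, max lat.
    BBOX: [options.minLon, options.minLat, options.maxLon, options.maxLat].join(','),
    WIDTH: options.width,
    HEIGHT: options.height,
    FORMAT: 'image/png',
    TRANSPARENT: options.transparent ? 'TRUE' : 'FALSE'
  };
  const query = Object.keys(params)
    .map((name) => name + '=' + encodeURIComponent(params[name]))
    .join('&');
  // Some servers already carry a query string (e.g. "?map=...") in the base URL.
  const separator = baseUrl.includes('?') ? '&' : '?';
  return baseUrl + separator + query;
}

const exampleUrl = buildGetMapUrl('http://example.org/cgi-bin/mapserv.exe?map=demo.map', {
  layers: ['parcels', 'roads'],
  minLon: -86.5, minLat: 40.6, maxLon: -86.3, maxLat: 40.8,
  width: 256, height: 256, transparent: true
});
console.log(exampleUrl);
```

Values are percent-encoded here, so commas and colons appear as %2C and %3A; servers decode these before parsing the request.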
County Boundaries
To take advantage of highly accurate local data for use statewide, a variety of technical issues must be overcome, such as:
• Projecting the information to a single coordinate system
• Standardizing symbols
• Retrieving individual layers
Caching for Performance
 Without caching, performance is constrained by the performance of the individual county servers.
• Obviously this is not suitable for AJAX-style applications.
 We need to pre-fetch map images and store them as tiles on a cache server.
Building a Tiling Server
Reverse engineering Google’s map
server.
Tiling Strategy
 Google Maps works by delivering map tiles that fill a bounding box.
 Google Map API 2.0 lets you point at your own tile server.
• We use this to serve up our own map data together with Google maps.
 To do this, all tiles should be saved with the same bounding boxes as Google map tiles.
• Tiles must have the same size, projection, and coordinate values as the underlying Google base maps.
 Each tile uses its tile ID and zoom level as its name (e.g., tile36.48.10), so no database is needed to find the tile.
• Our tile naming convention is based on Google’s lat/lon to Mercator projection.
• The name is tileX.Y.Zoom, but we must first convert lat and lon to rectangular coordinates.
• The naming convention is discussed on the next slide. See http://mapki.com/wiki/Lat/Lon_To_Tile
 We use the zoom level as the first-level directory and the layer as the second-level directory to store tiles.
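The naming and directory scheme above can be sketched as follows. The root directory, the ".png" extension, and the function names are assumptions for illustration; the slides specify only the tileX.Y.Zoom name and the zoom/layer directory order.

```javascript
// Build the "tileX.Y.Zoom" name described above.
function tileName(x, y, zoom) {
  return 'tile' + x + '.' + y + '.' + zoom;
}

// Zoom level is the first-level directory, layer the second-level directory.
// The root directory and file extension are hypothetical.
function tilePath(root, layer, x, y, zoom) {
  return [root, String(zoom), layer, tileName(x, y, zoom) + '.png'].join('/');
}

// Recover the coordinates from a tile name; returns null if it is malformed.
function parseTileName(name) {
  const match = /^tile(\d+)\.(\d+)\.(\d+)$/.exec(name);
  if (!match) return null;
  return { x: Number(match[1]), y: Number(match[2]), zoom: Number(match[3]) };
}

console.log(tilePath('/cache', 'parcels', 36, 48, 10));
```

Because the path is computed purely from the request parameters, the server can locate a tile with a single file lookup, which is the reason no database is needed.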
Converting bounding box to Google tile values
Google uses an x,y coordinate system combined with a zoom value to specify the tiles to retrieve from the server. These coordinates are calculated using an algorithm which can be found in GoogleMapki. See: http://www.codeproject.com/useritems/googlemap.asp
 Steps for fetching image tiles from county map servers that match Google map tiles:
• Find the bounding box of Indiana that covers the whole state.
• Convert the latitude and longitude values of the bounding box into Google map tile coordinate values.
• Identify the tile coordinate values in the bounding box.
• Convert each tile ID into latitude and longitude values.
• Use the Indiana Geological Survey’s service to match counties to tiles. Store the result in the database.
• Pre-fetch tiles by sending a request to the county server that each tile belongs to and fetching the tile back.
Example Tile: http://mt0.google.com/mt?n=404&v=w2.37&x=0&y=0&zoom=16.
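The first steps (lat/lon to tile values, then listing the tiles in a bounding box) can be sketched with the widely documented spherical Mercator tile formula. This is an assumption-laden illustration rather than the authors' code: zoom here counts up from 0 (one tile for the whole world), whereas the Google tile URLs of the time numbered zoom in the opposite direction, so tile values may differ from those on the slides. Function names are hypothetical.

```javascript
// Convert latitude/longitude (degrees) to tile x,y at a zoom level where
// zoom 0 is a single world tile and each level doubles the grid.
function latLonToTile(lat, lon, zoom) {
  const n = Math.pow(2, zoom);
  const latRad = (lat * Math.PI) / 180;
  const x = Math.floor(((lon + 180) / 360) * n);
  const y = Math.floor(
    ((1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2) * n
  );
  // Keep results on the grid for inputs on the map's edges.
  const clamp = (v) => Math.min(n - 1, Math.max(0, v));
  return { x: clamp(x), y: clamp(y) };
}

// List every tile touched by a bounding box, given its north-west and
// south-east corners as { lat, lon }.
function tilesInBoundingBox(northWest, southEast, zoom) {
  const a = latLonToTile(northWest.lat, northWest.lon, zoom);
  const b = latLonToTile(southEast.lat, southEast.lon, zoom);
  const tiles = [];
  for (let x = a.x; x <= b.x; x++) {
    for (let y = a.y; y <= b.y; y++) {
      tiles.push({ x, y, zoom });
    }
  }
  return tiles;
}
```

A harvester would call tilesInBoundingBox once per zoom level and then issue one county-server request per returned tile.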
Naming Tiles Example
[Figure: bounding box of Indiana at zoom level 10, with corners A (-88.2, 42.4) and B (-84.6, 37.1), covering Tile 1 (36,47), Tile 2 (36,48), Tile 3 (36,49), Tile 4 (37,47), Tile 5 (37,48), and Tile 6 (37,49).]
 Convert the A and B lat/lon values to Google map tile values at a given zoom level. The value for A is (36,47); the value for B is (37,49).
 Calculate how many tiles there are in this bounding box and identify each tile’s value. In this figure, there are (37-36+1)*(49-47+1) = 6 tiles (our choice).
 For each tile in the bounding box, convert its tile coordinate values into lat/lon values.
 Use each tile’s lat/lon values to construct requests for the IGS boundary services. The IGS services tell which counties are at least partially in this tile. The site for IGS services is: http://igs.indiana.edu/
 Save the tile-county mapping in our database. For a given tile name, we can look up the county.
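The tile count and the tile-to-lat/lon step can be sketched as below. The inverse conversion uses the standard spherical Mercator formula with zoom counted up from 0 (one world tile), which is an assumption and may not match the zoom numbering used on the slides; function names are hypothetical.

```javascript
// Number of tiles in the block spanned by two corner tiles (inclusive).
function countTiles(a, b) {
  return (Math.abs(b.x - a.x) + 1) * (Math.abs(b.y - a.y) + 1);
}

// Convert a tile's coordinates back to the lat/lon box it covers.
function tileToBounds(x, y, zoom) {
  const n = Math.pow(2, zoom);
  const lonAt = (tx) => (tx / n) * 360 - 180;
  const latAt = (ty) =>
    (Math.atan(Math.sinh(Math.PI * (1 - (2 * ty) / n))) * 180) / Math.PI;
  return {
    west: lonAt(x),
    east: lonAt(x + 1),
    north: latAt(y),
    south: latAt(y + 1)
  };
}

// Corner tiles from the example: A is (36,47), B is (37,49).
console.log(countTiles({ x: 36, y: 47 }, { x: 37, y: 49 })); // 6
```

The box returned by tileToBounds is what would be sent to a boundary service to learn which counties intersect the tile.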
Combine Google map with county parcel data
 Map servers typically contain base maps and optional layers.
• Parcel boundaries, roads, and township boundaries are layers.
 We cache each layer separately.
 Layers and base maps are combined dynamically using the Java Advanced Imaging (JAI) libraries.
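The slides combine layers with Java Advanced Imaging. Purely as an illustration of what combining a transparent layer with a base map involves, here is a plain JavaScript sketch of "source over" alpha blending on raw RGBA pixel arrays; it is not the JAI API, and the function name is hypothetical.

```javascript
// Composite an RGBA overlay tile onto an RGBA base tile ("source over").
// Both tiles are flat byte arrays [r, g, b, a, r, g, b, a, ...] of equal length.
function compositeOver(base, overlay) {
  if (base.length !== overlay.length) throw new Error('Tile sizes differ');
  const out = new Uint8ClampedArray(base.length);
  for (let i = 0; i < base.length; i += 4) {
    const srcA = overlay[i + 3] / 255;
    const dstA = base[i + 3] / 255;
    const outA = srcA + dstA * (1 - srcA);
    for (let c = 0; c < 3; c++) {
      // Fully transparent result pixels carry no colour.
      out[i + c] = outA === 0
        ? 0
        : (overlay[i + c] * srcA + base[i + c] * dstA * (1 - srcA)) / outA;
    }
    out[i + 3] = outA * 255;
  }
  return out;
}
```

Where the overlay is fully transparent the base pixel shows through unchanged, which is what lets thin layers such as parcel lines sit on top of aerial photos.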
Matching Projections: EPSG:4326 to Mercator
 The county map overlay from IGS is in the EPSG:4326 projection. It must be converted to Mercator to match Google.
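One way to picture the conversion: in EPSG:4326 image rows are evenly spaced in latitude, while in Mercator they are evenly spaced in the Mercator y value, so each output row has to be sampled from a different source row. The sketch below computes that row lookup with nearest-neighbour sampling; it is an assumption about how such a resampling could be done, not a description of the authors' method, and the function names are hypothetical. Longitude is linear in both projections, so columns need no remapping.

```javascript
// Spherical Mercator y for a latitude in degrees, and its inverse.
function mercatorY(latDeg) {
  const rad = (latDeg * Math.PI) / 180;
  return Math.log(Math.tan(Math.PI / 4 + rad / 2));
}
function inverseMercatorY(y) {
  return ((2 * Math.atan(Math.exp(y)) - Math.PI / 2) * 180) / Math.PI;
}

// For each row of a Mercator output tile, find which row of an EPSG:4326
// source image covering the same latitude range should be copied into it.
function mercatorRowLookup(northLat, southLat, height) {
  const yNorth = mercatorY(northLat);
  const ySouth = mercatorY(southLat);
  const rows = [];
  for (let row = 0; row < height; row++) {
    // Centre of the output row, evenly spaced in Mercator y.
    const y = yNorth + ((row + 0.5) / height) * (ySouth - yNorth);
    const lat = inverseMercatorY(y);
    // Source rows are evenly spaced in latitude.
    const src = Math.floor(((northLat - lat) / (northLat - southLat)) * height);
    rows.push(Math.min(height - 1, Math.max(0, src)));
  }
  return rows;
}
```

At high zoom levels a tile spans so little latitude that the lookup is nearly the identity; the correction matters most for low-zoom tiles covering large areas.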
Combine tiles at County Boundaries
[Figure: adjoining tiles from Marion County and Hancock County]
 County boundary tiles need to be combined into one tile using the Java Advanced Imaging library.
Next Steps
 Caching more regions in Indiana and elsewhere.
• If county uses ESRI or OGC map server, current agent plugins can be used.
• We would like to do California next.
 Use the system to represent outputs of scientific applications.
• Contour plots, vector maps, and other types of layers for displaying results of geophysical applications.
• Dynamic (“real time”) layers to display streaming data from instruments and applications.
 Find a way to keep current with county servers, especially when a county server changes its layer IDs.
• Recent Monroe county example
 The tiling services should support multiple server styles
• URLs for REST/AJAX style clients
• WSDL and SOAP for formal Web Services
• Support OGC and ESRI clients.
 Improve collaborative clients
Observations and Conclusions
 Web 2.0 approaches are very compatible with SOA in general...
• Although the details are important.
 But we have to do a better job making services.
• We are very good at making complicated interfaces to simple services.
• Programmable Web lists 350+ simple public APIs to complicated services.
 Science Gateways will change dramatically.
• We have been weighed down by security issues that Google et al. ignore.
• Portals tend to be dominated by server-side, Enterprise standards, while mashups favor thicker browser clients and looser standards.
 Judge a service by popularity rather than number of pages in the specification.
 And it’s the data...we need to find better ways to use Grids like the
TeraGrid to populate data services.
• Computations will always require expertise.
• Grid software is useful for computing experts.
• But not everyone needs this. We need to think of better ways for archiving and
delivering computational results.
More Information
 [email protected]
 See demo:
• http://156.56.104.164/demo/indianaViewer.html
 Collaborative version:
• http://156.56.104.164/samples/CollabmapUpdate/indianaViewer.html
• Need a) Flash, and b) a friend to also try.
• Still buggy, so you have to log in at the same time.
Comparison of state and county data
 State data
• 10 foot contours (1990)
• Missing local roads
• No parcels
• No point addresses
• Jurisdictional boundaries (2001)
 County data
• 1 foot contours (2006)
• Local roads (2006)
• Parcels (2006)
• Point addresses (2006)
• Jurisdictional boundaries (2006)
Basic Problem: Data Federation
 Integrated GIS systems have obvious benefits but
inevitably systems are developed by various state
and local government agencies.
• Bottom up rather than top down
 This tends to give excellent local information but it
breaks down at the county boundary.
Considerations
 We assume heterogeneity in GIS map and feature servers.
• Must find a way to federate existing services
 We must reconcile ESRI, OGC, Google Map, and other technical
approaches.
• Make a clean distinction between clients and services
• Must try to take advantage of Google, ESRI, etc rather than compete.
 We must have good performance and interactivity.
• Servers must respond quickly--launching queries to 20 different map
servers is very inefficient.
• Clients should have simplicity and interactivity of Google Maps and similar
AJAX style applications.
Backup Slides
Developing issues
 Integrating GIS map servers is not trivial.
• Different county map servers may use different technologies and web services.
• Interoperability of geospatial referencing: different coordinate systems and projections are used by the different county web services.
• Semantic interoperability: different attribute names for layers are used by the different county web services.
 Our solution: create a virtual map server to act as an agent server.
• Translates map requests from a generic format to the format expected by the specific map server.
• Provides a common language and programming interface for constructing clients.
• Projects the information to a single coordinate system.
• Standardizes symbols.
 The agent server by itself will work, but performance is not good.
• Must wait for the slowest server to respond.
• Failure prone: a county server may not respond at all.
• Adds additional overhead for combining images.
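The translation role of the agent server can be sketched as a table of adapters keyed by server type. Everything here is hypothetical (the generic request shape, the adapter names, the server records, and the simplified request bodies); it only illustrates the idea of one generic request being rewritten per server type.

```javascript
// Hypothetical adapters: each turns a generic request
// { layers, bbox: [minx, miny, maxx, maxy], width, height }
// into the HTTP request a particular kind of map server expects.
const adapters = {
  wms: (server, req) => ({
    method: 'GET',
    url: server.url + (server.url.includes('?') ? '&' : '?') +
      'REQUEST=GetMap&LAYERS=' + req.layers.join(',') +
      '&BBOX=' + req.bbox.join(',') +
      '&WIDTH=' + req.width + '&HEIGHT=' + req.height
  }),
  arcims: (server, req) => ({
    method: 'POST',
    url: server.url,
    body: '<ARCXML version="1.1"><REQUEST><GET_IMAGE><PROPERTIES>' +
      `<ENVELOPE minx="${req.bbox[0]}" miny="${req.bbox[1]}" ` +
      `maxx="${req.bbox[2]}" maxy="${req.bbox[3]}" />` +
      `<IMAGESIZE width="${req.width}" height="${req.height}" />` +
      '</PROPERTIES></GET_IMAGE></REQUEST></ARCXML>'
  })
};

// Pick the adapter for a server and translate the generic request.
function translateRequest(server, genericRequest) {
  const adapter = adapters[server.type];
  if (!adapter) throw new Error('No adapter for server type: ' + server.type);
  return adapter(server, genericRequest);
}
```

Supporting a new kind of county server then means adding one entry to the table, while clients keep issuing the same generic request.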
Caching Server
 The agent server runs offline to harvest map images from county map
servers.
• Images are stored as tiles.
• Tiles at county boundaries may be combined for greater storage and
performance efficiency.
 Clients connect to the cache server instead of the agent server.
 The cache server constructs the requested image from pre-fetched tiles.
• Inspired by Google Maps approach
• Enable more interactive clients
 Image construction may be parallelized/multi-threaded for greater
performance.
• Potentially takes advantage of new multi-core server architectures from Sun,
Intel, and AMD.
Two Phase Approach: Caching and Tiling
 Federation through caching:
• WMS and WFS resources are queried and results are stored on the cache servers.
• WMS images are stored as tiles.
 These can be assembled into new images on demand (cf. Google Maps).
 Projections and styling can be reconciled.
 We can store multiple layers this way.
• We build adapters that can work with ESRI and OGC products; tailor to specific counties.
 Tiling:
• Client programs obtain images directly from our tile server.
 That is, don’t go back to the original WMS for every request.
• Similar approaches can be used to mediate WFS requests.
• The tile server can re-cache and tile on demand if tile sections are missing.
 Google Map clients can work with the tiling server.
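Serving from the cache and filling in missing tiles on demand can be sketched as below. The class, its in-memory Map (standing in for the on-disk tile store), and the injected fetch function are all hypothetical; the sketch only shows the lookup-then-fallback logic.

```javascript
// Tile store that serves cached tiles and fetches missing ones on demand.
class TileCache {
  // fetchFromCountyServer(layer, x, y, zoom) returns the tile data or a promise of it.
  constructor(fetchFromCountyServer) {
    this.fetchFromCountyServer = fetchFromCountyServer;
    this.tiles = new Map();   // stands in for the on-disk tile store
    this.pending = new Map(); // in-flight fetches, so each tile is requested once
  }

  async getTile(layer, x, y, zoom) {
    const key = `${zoom}/${layer}/tile${x}.${y}.${zoom}`;
    if (this.tiles.has(key)) return this.tiles.get(key);
    if (!this.pending.has(key)) {
      const request = Promise.resolve(this.fetchFromCountyServer(layer, x, y, zoom))
        .then((tile) => {
          this.tiles.set(key, tile); // cache for later requests
          return tile;
        })
        .finally(() => this.pending.delete(key));
      this.pending.set(key, request);
    }
    return this.pending.get(key);
  }
}
```

Tracking in-flight fetches means that several clients asking for the same missing tile trigger only one request to the slow county server.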
Cache Server architecture
[Figure: cache server architecture. Components shown: Tile-County Mapping Process; IGS boundary-checking service; Database; County Map Fetcher threads (three shown); County Boundary Fetcher thread, which needs to use the JAI library to combine tiles from different counties; County Web Map Services (five shown).]
Building Indiana Map client Using Google Map API
 Google Map API V2 enables us to add a custom map directly on Google Map.
• See http://mapki.com/wiki/Add_Your_Own_Custom_Map
 To use the Google Map API to build the Indiana map client, there are several steps:
• Register to use the Google Map API.
• Create a collection of copyrights.
• Give that copyright collection to all tile layers that we want to add to Google Map.
• Pass those tile layer(s) off to create a map type.
• Once a map type has been created, it can be added to our map instance in our JavaScript code.
Example code for building custom map on Google Map
 Create a GCopyright
var copyright = new GCopyright(1,
    new GLatLngBounds(new GLatLng(-90, -180), new GLatLng(90, 180)),
    0, "@Indiana Geology Survey");
 Add the GCopyright to a GCopyrightCollection
var copyrightCollection = new GCopyrightCollection('Chart');
copyrightCollection.addCopyright(copyright);
 Create a GTileLayer array
var tilelayers = [new GTileLayer(copyrightCollection, 3, 11)];
tilelayers[0].getTileUrl = CustomGetTileUrl;
// a is the tile point (x, y); b is the zoom level passed in by the API
function CustomGetTileUrl(a, b) {
    var z = 17 - b; // zoom value sent to our tile server
    var f = "/maps/?x=" + a.x + "&y=" + a.y + "&zoom=" + z;
    return f;
}
 Create a GMapType
var custommap = new GMapType(tilelayers, new GMercatorProjection(12), "Chart",
    {errorMessage: "No data available"});
 Add the custom map type to the map
map.addMapType(custommap);
 Once the map type has been added to the map container, this layer can be controlled like any other Google map type.
Collaborative Indiana Map
 Flex Data Services from Adobe is the basic technology used to support collaboration among Indiana map clients.
• Enables innovative applications to be delivered in the browser in a reliable and scalable manner.
• Enables data to be pushed automatically to the client application without polling.
• Enables a client application to concurrently share data with other clients or other servers. This model enables new application concepts like “co-browsing” and synchronous collaboration.
 The Flex module adds the following collaboration features to Google Maps:
• Map sharing: Maps are kept in sync (in real time) between users involved in a collaboration session.
• Videoconferencing (Webcam sharing and VOIP): You can share your Webcam and microphone to add video and audio to your collaboration session.
• White-board: Collaborating users can draw on the map. For example, you could draw potential directions, etc. The users’ whiteboards are kept in sync in real time.
• Chat-board: Collaborating users can chat on the chat-board in real time.
• Cursor sharing: When you move your mouse, other users see the movements of your mouse and what you are pointing at.
 We use the Flex API to communicate with Ajax and JavaScript to implement rich web applications.
Flex runtime architecture
 To enable collaboration, clients must first have Flash Player installed.
 For clients that subscribe to the same channel configured in Flex Data Services, events invoked in one client can be broadcast to the other clients to enable collaboration.
 Clients can make direct calls to Java objects as well as subscribe to real-time data feeds, send messages to other clients, and integrate with existing Java Message Service (JMS) messaging systems.
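The channel-based broadcasting described above can be illustrated with a small in-memory publish/subscribe sketch. This is not the Flex Data Services API; the class, method names, and event shape are hypothetical and only show how an event raised by one client reaches the others on the same channel.

```javascript
// In-memory stand-in for a shared messaging channel (not the Flex API).
class Channel {
  constructor() {
    this.subscribers = new Map(); // clientId -> callback
  }
  subscribe(clientId, onMessage) {
    this.subscribers.set(clientId, onMessage);
  }
  unsubscribe(clientId) {
    this.subscribers.delete(clientId);
  }
  // Deliver an event from one client to every other subscriber.
  publish(senderId, event) {
    for (const [clientId, onMessage] of this.subscribers) {
      if (clientId !== senderId) onMessage(event);
    }
  }
}

// Example: when one user pans the map, the other user's map follows.
const mapChannel = new Channel();
const received = [];
mapChannel.subscribe('alice', (e) => received.push('alice got ' + e.type));
mapChannel.subscribe('bob', (e) => received.push('bob got ' + e.type));
mapChannel.publish('alice', { type: 'pan', lat: 39.77, lon: -86.16 });
console.log(received);
```

Skipping the sender avoids echoing an event back to the client that produced it, which would otherwise cause the maps to re-trigger each other.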
Storage for Caching the Entire State
 Currently storing 15 counties at 13 zoom levels for 13 layers. This takes ~250 GB.
 It takes about 2.5-3 TB to store the entire state to zoom level 13 this way.
• There are 48,410,476 tiles for zoom levels 0-13 and 162,561,384 tiles for zoom levels 0-14 (nearly 12 TB).
• There are ~10 layers for each scale.
 Aerial photo layer tiles take 25~30 KB.
 Other layers (parcels, roads) are much smaller: 30~36 KB per tile for all remaining 9 layers.
 So we need almost 60 KB * 48,410,476 tiles to store all map data.
 Layers from Google (Hybrid, Street, Google Satellite) don’t need to be cached.
• This is large but possible.
 We can easily spread our caching server over multiple hosts to store even higher magnification scales.
 Efficient tiling storage can save disk space.
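The zoom level 0-13 estimate can be checked with a short calculation. The tile count and the ~60 KB per tile figure come from the slide; the decimal unit conversion (1 KB = 1000 bytes, 1 TB = 10^12 bytes) and the function name are assumptions.

```javascript
// Estimate storage in terabytes for a number of tiles at a given size per tile.
// Assumes decimal units: 1 KB = 1000 bytes, 1 TB = 10^12 bytes.
function estimateStorageTB(tileCount, kbPerTile) {
  const bytes = tileCount * kbPerTile * 1000;
  return bytes / 1e12;
}

// 48,410,476 tiles at roughly 60 KB each (all layers combined).
const tb = estimateStorageTB(48410476, 60);
console.log(tb.toFixed(2) + ' TB');
```

The result, a little under 3 TB, falls at the upper end of the 2.5-3 TB range quoted above.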
Current Progress
 Supports ESRI and OGC servers.
• 17 counties are now being cached. (Marion and Monroe are fully cached for 13 zoom levels.)
 7 layers have been shown to be easy to cache.
• Aerial photo layer, street, interstate layer, parcel, parcel ID, county boundary, school.
• 3 more layers can easily be shown in the client without caching (Google Map, Google Satellite, Hybrid).
 Querying parcel information across boundaries (Marion-Hancock boundary).
 Supports geocode querying.
 Higher resolution than Google Satellite.
 Google Map-like interaction.
 Performance and reliability.
• The cache server still works even when a county server does not.
• Much faster response to the client.
Tradeoffs of Caching
 Cached images must be stored somewhere.
 More zoom levels require much more disk space.
• For 12 levels, 500-600 GB.
• For 13 levels, 2.5-3 TB.
• For 14 levels, about 12 TB. (It may not be necessary to cache this zoom level for all counties; we can cache it only for places that require it.)
 Difficulty of map re-projection.
 Latency in keeping up to date with county servers.
 Inconsistencies in available layers.