CUAHSI OnLine: Bringing Data and Modeling Services to the


CUAHSI HIS Service-Oriented Architecture
Ilya Zaslavsky, David R. Maidment, David G. Tarboton,
Michael Piasecki, Jon Goodall, David Valentine, Thomas
Whitenack, Jeffery S. Horsburgh, Tim Whiteaker
and the entire CUAHSI HIS Team
CUAHSI HIS: Sharing hydrologic data
http://his.cuahsi.org/
Support: EAR 0622374
CUAHSI Hydrologic Information System
Services-Oriented Architecture
• HydroCatalog: Data Discovery and Integration
• HydroServer: Data Publication
• HydroDesktop: Data Analysis and Synthesis
• Data Services: WaterML, Other OGC Standards
• ODM
• Geo Data
Information Model and Community Support Infrastructure
What is a “service oriented architecture”?
"Things should be made as simple as possible,
but no simpler."
• A design strategy for information
systems that enables loose
coupling among components
• Essential relationships and dependencies
shall be preserved; non-essential ones
can be discarded
• Service == unit of work, performed based on a
contract between service provider and
service consumer
– Hides the internal workings of the service
– Implementation/platform-independent
– Presents a relatively simple interface
– Can be published, discovered and invoked using this interface
• Everything is a service: data, models, visualization, …
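The contract idea above can be sketched in a few lines of Python. This is a minimal illustration, assuming a hypothetical `TimeSeriesService` interface and an in-memory provider; none of these names come from CUAHSI HIS, and the numbers are made up.

```python
from typing import List, Protocol


class TimeSeriesService(Protocol):
    """Hypothetical contract: the only thing a consumer may rely on."""

    def get_values(self, site: str, variable: str) -> List[float]:
        ...


class InMemoryProvider:
    """One possible provider; its storage stays hidden behind the contract."""

    def __init__(self) -> None:
        # Illustrative numbers only, not real observations.
        self._store = {("site-1", "discharge"): [1.0, 2.0, 3.0]}

    def get_values(self, site: str, variable: str) -> List[float]:
        return list(self._store.get((site, variable), []))


def mean_value(service: TimeSeriesService, site: str, variable: str) -> float:
    """Consumer code: depends on the interface, not on the provider class."""
    values = service.get_values(site, variable)
    return sum(values) / len(values) if values else float("nan")


result = mean_value(InMemoryProvider(), "site-1", "discharge")
print(result)
```

Swapping `InMemoryProvider` for a provider backed by a database or a remote call would not change `mean_value`, which is the loose coupling the slide describes.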
What makes an open community-driven hydrologic information system
• Agreeing on standards for information models and
services: WaterML, WaterOneFlow services, OGC
specs
• Making the services easily discoverable, sharing and
indexing a lot of quality data: HISCentral
• Reliable core services: monitoring; logging/reporting;
user support; high availability
• Sharing code: Codeplex, etc.
WaterML as a Web Language
Streamflow data in WaterML language
Discharge of the San Marcos River at
Luling, June 28 - July 18, 2002
First presented as an OGC Discussion Paper in 2007
Adopted by USGS, NCDC, multiple academic groups, internationally
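To give a feel for reading a WaterML document, here is a minimal sketch using only the Python standard library. The XML fragment is a simplified assumption in the style of WaterML 1.1 (a small subset of elements, with an assumed namespace URI), and the site name and numbers are invented for illustration; they are not the San Marcos River data.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI and a simplified element subset; check the actual
# schema of the service you use. The values below are illustrative only.
NS = {"wml": "http://www.cuahsi.org/waterML/1.1/"}
SAMPLE = """\
<timeSeriesResponse xmlns="http://www.cuahsi.org/waterML/1.1/">
  <timeSeries>
    <sourceInfo><siteName>Example site</siteName></sourceInfo>
    <variable><variableName>Discharge</variableName></variable>
    <values>
      <value dateTime="2002-06-28T00:00:00">10.5</value>
      <value dateTime="2002-06-29T00:00:00">12.0</value>
    </values>
  </timeSeries>
</timeSeriesResponse>
"""


def read_series(xml_text):
    """Return the variable name and a list of (timestamp, value) pairs."""
    root = ET.fromstring(xml_text)
    name = root.findtext(".//wml:variableName", namespaces=NS)
    points = [
        (node.get("dateTime"), float(node.text))
        for node in root.iterfind(".//wml:value", NS)
    ]
    return name, points


name, points = read_series(SAMPLE)
print(name, points)
```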
M-WRIIMs System Implementation
Feng-Chia University, Taiwan
– Presented 6/16/2011, HydroDWG
[Diagram: Site n and WRIIMs, requesting and responding in WaterML (Water Markup Language), an OGC® specification; water quality, real-time and historic data; on-site sensor query interface and results]
HIS Central Catalog
• Integrates data services from multiple sources
• Supports concept-based data discovery
[Diagram: CUAHSI data servers and 3rd-party servers (e.g. USGS) expose WaterOneFlow web services (GetSites, GetSiteInfo, GetVariableInfo, GetValues) that return WaterML; Service Registry, Hydrotagger and Harvester; Water Metadata Catalog; Search Services; discovery and access from HydroDesktop]
http://hiscentral.cuahsi.org
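As an illustration of what a GetValues call might look like on the wire, the sketch below builds a SOAP 1.1 request body with the standard library. The service namespace, the parameter names (`location`, `variable`, `startDate`, `endDate`, `authToken`) and the network and variable codes are assumptions for this example; the WSDL of the target WaterOneFlow service is the authority. Nothing is sent over the network.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Assumed service namespace; verify against the WSDL of the target service.
WOF_NS = "http://www.cuahsi.org/his/1.1/ws/"


def build_get_values(location, variable, start_date, end_date, auth_token=""):
    """Build a GetValues request envelope as a string (no network access)."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{WOF_NS}}}GetValues")
    params = {
        "location": location,      # assumed form "Network:SiteCode"
        "variable": variable,      # assumed form "Vocabulary:VariableCode"
        "startDate": start_date,
        "endDate": end_date,
        "authToken": auth_token,
    }
    for name, value in params.items():
        ET.SubElement(call, f"{{{WOF_NS}}}{name}").text = value
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical codes, with the date range reused from the streamflow example.
request_xml = build_get_values(
    "ExampleNetwork:SITE001", "ExampleVocab:VAR01", "2002-06-28", "2002-07-18"
)
print(request_xml)
```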
HIS Central Content
Map integrating NWIS, STORET, & Climatic Sites
Public Services
• 69 public services
• 18,000+ variables
• 1.96+ million sites
• 23.3 million series
Available via HISCentral discovery services
• Referencing 5.2 billion data values
Available via GetValues requests
[Bar chart: year axis 2008, 2009, 2010, 2011; vertical axis 0 to 80; bar labels 69, 56, 39, 28]
[Chart: Growth in GetValues calls for all services reporting to HIS Central, May-June 2011]
Federal Agency Water Data Services at HISCentral
Network Name | Site Count | Value Count (thousands) | Earliest Observation | Notes
NWISDV | 31,800 | 304,000 | 10/18/1847 | WaterML-compliant GetValues service from NWIS, catalog ingested
EPA | 236,000 | 78,000 | 01/11/1900 | SOAP wrapper over WQX services, catalog ingested
NWISUV | 11,800 | 169,000 | 120 days | WaterML-compliant GetValues service, catalog ingested
NCDC ISH | 11,600 | 3,000* | 1/1/2005 | WaterML-compliant GetValues service from NCDC
NCDC ISD | 24,800 | 18,200 | 1/1/1892 | WaterML-compliant GetValues service from NCDC
NWISIID | 376,000 | 86,500 | 9/1/1867 | SOAP wrapper over NWIS web site, catalog ingested
NWISGW | 834,000 | 8,490 | 1/1/1800 | SOAP wrapper over NWIS web site, catalog ingested
RIVERGAGES | 1,300 | 264,000 | 1/1/2000 | WaterML-compliant REST services from the Army Corps of Engineers
* Estimated
Hydrologic Ontology
http://hiscentral.cuahsi.org/startree.aspx
Semantic heterogeneity: water data sources use their own vocabularies, which makes it difficult to discover and interpret data
Examples of variant terms and spellings:
– acre feet / acre-feet
– micrograms per kilogram / micrograms per kilgram
– FTU / NTU
– mho / Siemens
– ppm / mg/kg
– Dissloved oxygen
Solutions: controlled vocabularies, community vocabulary of hydrologic parameters, semantic tagging, and semantic query rewriting
HydroTagger
Each Variable is connected to a corresponding Concept
http://water.sdsc.edu/hiscentral/startree.aspx
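Semantic tagging and query rewriting can be pictured with a small lookup structure. The sketch below is an assumption-laden toy: the source names, variable names and concept labels are hypothetical and are not taken from the CUAHSI ontology or from HydroTagger.

```python
from collections import defaultdict

# Hypothetical tags: (data source, source-specific variable name, shared concept).
TAGS = [
    ("SourceA", "Dissolved oxygen", "Dissolved oxygen"),
    ("SourceB", "Dissloved oxygen", "Dissolved oxygen"),  # misspelled at the source
    ("SourceC", "DO", "Dissolved oxygen"),
    ("SourceA", "Turbidity, NTU", "Turbidity"),
]


def build_index(tags):
    """Map each concept (case-insensitive) to the source variables tagged with it."""
    index = defaultdict(list)
    for source, variable, concept in tags:
        index[concept.lower()].append((source, variable))
    return index


def rewrite_query(concept, index):
    """Rewrite a concept-level query into per-source variable names."""
    return sorted(index.get(concept.lower(), []))


INDEX = build_index(TAGS)
matches = rewrite_query("dissolved oxygen", INDEX)
print(matches)
```

A user searches once by concept, and the rewritten query reaches every source regardless of how that source spells or abbreviates the variable.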
HISCentral Hosting Facility
• Redundant
• Continuously monitored (R-U-On)
• Synchronized databases
• Fail over management
• Monitoring of external servers
• Usage reporting
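From a client's point of view, a redundant setup with fail over might be used as in the sketch below. It is only an illustration: the endpoint URLs are hypothetical, the `fetch` function is injected so that no network access is needed, and the slides do not describe how HISCentral actually implements fail over.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical mirrored endpoints, listed in order of preference.
ENDPOINTS = [
    "http://primary.example.org/catalog",
    "http://mirror.example.org/catalog",
]


def fetch_with_failover(
    endpoints: Iterable[str], fetch: Callable[[str], str]
) -> Tuple[str, str]:
    """Try each endpoint in order and return (endpoint used, payload)."""
    errors = []
    for url in endpoints:
        try:
            return url, fetch(url)
        except OSError as exc:  # network-style failures
            errors.append(f"{url}: {exc}")
    raise ConnectionError("all endpoints failed: " + "; ".join(errors))


def fake_fetch(url: str) -> str:
    """Stand-in for a real HTTP call: the primary is simulated as down."""
    if "primary" in url:
        raise OSError("connection refused")
    return "ok"


used, payload = fetch_with_failover(ENDPOINTS, fake_fetch)
print(used, payload)
```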
R-U-On Service
[Diagram labels: R-U-On net monitors, server monitors, process monitors and service monitors; HIS Central monitor; monitored REST endpoints (waterdata.usgs.gov); monitored websites and HydroServers; hosts hiscentral.cuahsi.org, water.sdsc.edu, kyle.ucsd.edu, River and Disrupter; HIS Central web services and client code; usage logger; mirrored DataStore databases]
CZO Data Publication System
[Diagram labels: local CZO DBs and web sites (spatial, hydrologic, geophysical, geochemical, imagery, spectral…); standard CZO data display formats; standard CZO services; CZO Data Repository and Indexing (CZO Central) with harvester, archive, CZO metadata, CZO data products, shared vocabularies and ontology; CZO web-based data discovery system; CZO desktop applications (CZO Desktop, Matlab, R, Excel, ArcGIS, modeling); external cross-project registries; DataNet]
International Standardization of WaterML
Hydrology Domain Working Group
- working on WaterML 2.0
- organizing Interoperability Experiments focused
on different sub-domains of water
- towards an agreed upon feature model,
observation model, semantics and service stack
Timeline (iterative development)
– WaterML 2 SWG (Mar 2011)
– Groundwater IE: GSC+USGS, Dec 09 – Dec 10
– Surface Water IE: CSIRO+many, Jun 10 – Sep 11
– Forecasting IE: NWS+Deltares?, Sep 11 – Sep 12?
– Water Quality IE
– Water Use IE
June’11
http://external.opengis.org/twiki_public/bin/view/HydrologyDWG/WebHome
New requirements, and the path forward
• Transition to the OGC model, for better interoperability (including
international): what are the new service interfaces, and how do we transition
an operational system?
• Federation of catalogs, since many data providers stand up their own
catalogs and federation scales better: what is the suggested
combination of catalog technologies and interfaces?
• Recognition that we don’t need to search over all services: what are
the better search patterns (e.g. 3-step data access: identify
services, then extract time series metadata, and then request data
content for the time series)?
• Recognition that we can (and need to) rely on common
implementations of mature, modular standard specifications: what is
an appropriate operational governance model for distribution of
roles and responsibilities within such a modular system?
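The 3-step data access pattern mentioned above (identify services, then extract time series metadata, then request data content) can be sketched with in-memory stand-ins. All service names, sites and values below are hypothetical; in a real system each step would be a call to a catalog or data service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Series:
    service: str
    site: str
    variable: str


# Hypothetical stand-ins for a service registry, series catalogs and data services.
CATALOG = {
    "service-a": {"discharge", "temperature"},
    "service-b": {"precipitation"},
}
SERIES = [
    Series("service-a", "site-1", "discharge"),
    Series("service-a", "site-2", "temperature"),
    Series("service-b", "site-9", "precipitation"),
]
DATA = {Series("service-a", "site-1", "discharge"): [3.2, 3.4]}


def identify_services(variable):
    """Step 1: narrow the search to services that offer the variable."""
    return sorted(name for name, offered in CATALOG.items() if variable in offered)


def series_metadata(services, variable):
    """Step 2: list matching time series, asking only the selected services."""
    return [s for s in SERIES if s.service in services and s.variable == variable]


def request_values(series):
    """Step 3: request data content for one chosen series."""
    return DATA.get(series, [])


services = identify_services("discharge")
candidates = series_metadata(services, "discharge")
values = request_values(candidates[0])
print(services, candidates, values)
```

The point of the pattern is that step 2 never touches `service-b`, so the search does not have to run over all registered services.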
The Migration Path (1)
Goal: smooth transition of the operational HIS
• Step 1: Prototyping a new infrastructure and assimilating results of
international validation of new OGC specifications:
A client developed at UT-Austin that implements the Who (data service
providers) – What (variables) - Where (locations) search pattern using
OGC CSW and WFS services. The CSW interface provides federation of
catalog services, while WFS is used to relay time series catalogs
A Kisters WISKI-based client demonstrating access to
WFS (for locations of sampling features) and SOS (for
observational data encoded in WaterML 2.0), developed
as part of Hydrology DWG’s Surface Water IE
The Groundwater (2009-2010) and Surface Water (2010-2011) Interoperability Experiments of the
OGC/WMO Hydrology Domain Working Group have demonstrated serving water data encoded in
WaterML2 using SOS1 and SOS2 services.
The Migration Path (2)
• Step 2: Settle on a time series catalog information model that can be
relayed via common WFS implementations
• Step 3: Create WFS interfaces over observation networks in the HIS
Central catalog, integrated with HIS Central administration interface
An observation network page in HISCentral
administration interface for network #52
(Little Bear River)
http://hiscentral.cuahsi.org/pub_network.aspx?n=52
http://hiscentral.cuahsi.org/wfs/52/cuahsi.wfs?request=getCapabilities
An additional WFS
endpoint for this network
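A per-network WFS endpoint of the form shown above could be addressed as in the following sketch, which only builds request URLs and makes no network call. The path template is copied from the network #52 example; adding the `service=WFS` key follows the usual OGC key-value convention and is an assumption here, since the slide's URL carries only `request=getCapabilities`.

```python
from urllib.parse import urlencode

# Path template taken from the network #52 example on the slide.
BASE = "http://hiscentral.cuahsi.org/wfs/{network_id}/cuahsi.wfs"


def wfs_url(network_id, request="GetCapabilities", **extra):
    """Build a key-value-pair WFS request URL for one observation network."""
    params = {"service": "WFS", "request": request, **extra}
    return BASE.format(network_id=network_id) + "?" + urlencode(params)


capabilities_url = wfs_url(52)
print(capabilities_url)
```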
The Migration Path (3)
• Step 4: Make the networks registry in HISCentral CSW compatible
• Step 5: Establish a distributed system of federated hydrologic
catalogs, using the CSW standard
The Migration Path (4)
• Step 6: Create WaterML2/SOS endpoints, initially for networks already
registered in the HIS Central Metadata Catalog at SDSC:
The Migration Path (5)
• Step 7: Integrate the WaterML2/SOS2 endpoints in
HydroServer software stack
• Step 8: Integrate WFS-based series catalog in
HydroServer software stack
• Step 9: Update HISCentral harvesting routines to rely on
WFS services
• Step 10: Update HydroDesktop client to interact with
CSW and WFS services
– These are to be completed
Conclusions
• HISCentral maintains a large collection
of hydrologic time series from distributed
data sources, both academic
and government
– Supports data discovery queries and
vocabulary queries
– Monitors and validates services
– Regular harvesting of registered services
– Supports a variety of clients
– High-availability setup
• Water data exchange standards are the backbone of HIS SOA:
– The specifications have seen wide adoption
• One of the benefits of SOA: smooth migration to a new set of
standards (OGC)
• Building a community hydrologic information system:
– Sharing data and code; reliable core services; access to large volumes of quality data