Defining what you want, how metadata drives OPeNDAP queries.
OPeNDAP
(The Open-source Project for a
Network Data Access Protocol)
APAC Tutorial
October 12, 2007
[email protected]
APAC, Perth, WA 20071012
Some Definitions
DAP = Data Access Protocol
Model used to describe the data;
Request syntax and semantics; and
Response syntax and semantics.
OPeNDAP
The software;
Numerous reference implementations;
Core/libraries and services.
OPeNDAP Inc.
OPeNDAP is a 501(c)(3) non-profit corporation;
Formed to maintain, evolve and promote the
discipline neutral DAP that was the DODS core
infrastructure.
Some Definitions
Syntax
The computer representation of a data object - the
data types and structures at the computer level; e.g.,
T is a floating point array of 20 by 40 elements.
Semantics
The information about the contents of an object; e.g.,
T is sea surface temperature in degrees Celsius for a
certain region of the Earth.
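The distinction can be sketched in a few lines of Python. This is only an illustration: the `Variable` class and the attribute names `long_name` and `units` are assumptions made for the example, not part of the DAP itself.

```python
from dataclasses import dataclass, field


@dataclass
class Variable:
    """A toy variable: syntax (type and shape) plus semantics (attributes)."""
    name: str
    dtype: str      # syntax: machine-level element type
    shape: tuple    # syntax: array dimensions
    attributes: dict = field(default_factory=dict)  # semantics: what the values mean

    def syntax(self) -> str:
        """Render the machine-level description of the variable."""
        dims = "".join(f"[{n}]" for n in self.shape)
        return f"{self.dtype} {self.name}{dims};"


# T is a floating point array of 20 by 40 elements (syntax) ...
T = Variable(
    name="T",
    dtype="Float32",
    shape=(20, 40),
    # ... holding sea surface temperature in degrees Celsius (semantics).
    attributes={"long_name": "sea surface temperature", "units": "degree_Celsius"},
)

print(T.syntax())
print(T.attributes)
```

A client can handle T as an array knowing only the syntax; it needs the attributes to know what the numbers mean.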
Distributed Oceanographic Data System
(DODS)
Conceived in 1993 at a workshop held at URI.
Objectives were:
– to facilitate access to PI held data as well as data held
in national archives and
– to allow the data user to analyze data using the
application package with which he or she is the most
familiar.
Basic system designed and implemented in 1993-1995 by
Gallagher and Flierl with NASA and NOAA funding.
From 1994 to present it has been extended with NASA,
NOPP, NSF and NOAA funding.
Considerations with regard to the
development of OPeNDAP
Many data providers
Many data formats
Many different client types
Many different semantic representations of
the data
Many different security requirements
Broad Vision
A world in which a single data access protocol
is used for the exchange of data between
network based applications regardless of
discipline.
A layer above TCP/IP providing for syntactic and
semantic consistency not available in existing
protocols such as FTP.
Practical Considerations
The broad vision:
Is syntactically achievable, but
Is not semantically achievable, at least not
in the near term.
OPeNDAP Mission Statement
To maintain, evolve and promote a data
access protocol (DAP) for the syntactically
consistent exchange of data over the network.
The DAP should provide syntactic interoperability
across disciplines and allow for semantic
interoperability within disciplines.
OPeNDAP Vision Statement
To achieve the mission:
OPeNDAP will be Non-profit.
Easier to obtain federal funds.
The DAP is more likely to be adopted.
OPeNDAP software will be open source.
More likely to be adopted.
Need community contributions to software.
OPeNDAP will mix implementation with research.
Implementation - to encourage use.
Research - to keep the protocol current.
OPeNDAP Vision Statement
(cont)
OPeNDAP will rely primarily on federal funding.
Unlikely to obtain private funding for middleware.
Development to be use case driven.
Aligned with Vision/Mission.
Strategic direction will be sought from an Advisory Board
consisting of data system experts and with input from you,
the community of developers.
OPeNDAP will seek partners.
OPeNDAP will utilize community working groups to
develop ‘standards’ related to the DAP, OPeNDAP
OPeNDAP tactics
• The fundamental objective of OPeNDAP and
OPeNDAP Inc. is to facilitate internet access to
scientific data
• This is done by:
– Providing a protocol (DAP) to access data over the internet,
– Hiding the format (and organization) in which the data are
stored from the user, and
– Providing subsetting (and other) capabilities for the data at
the server
• OPeNDAP is based on a multi-tier architecture
• OPeNDAP software is open source
• Working groups formed on specific topics
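As a rough sketch of what server-side subsetting looks like from the client, the Python below builds a DAP2-style request URL with an array index constraint of the form `[start:stride:stop]`. The host `example.org`, the dataset path, and the helper names are hypothetical; only the constraint syntax and the `.dods` response suffix follow DAP2 usage.

```python
def hyperslab(start: int, stop: int, stride: int = 1) -> str:
    """Format one DAP2 array index range; both ends are inclusive."""
    if start < 0 or stop < start or stride < 1:
        raise ValueError("need 0 <= start <= stop and stride >= 1")
    return f"[{start}:{stride}:{stop}]"


def subset_url(dataset_url: str, variable: str, ranges, response: str = "dods") -> str:
    """Build a request for part of one variable.

    `ranges` holds one (start, stop) or (start, stop, stride) tuple per dimension.
    """
    projection = variable + "".join(hyperslab(*r) for r in ranges)
    return f"{dataset_url}.{response}?{projection}"


# Hypothetical dataset: rows 0-9 and every second column from 0 to 19 of T.
url = subset_url("http://example.org/opendap/sst.nc", "T", [(0, 9), (0, 19, 2)])
print(url)
```

The user names a variable and index ranges and never sees the storage format; the server does the subsetting before anything crosses the network.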
OPeNDAP relies on projects
• To guide user-based requirements for
application needs as well as OPeNDAP
core development (use cases)
• To provide maintenance and evolution of
the core software and documentation
• Currently: 7 active projects, covers next
~2-3 years, 2 pending proposals
• ……
Success
What constitutes success of the OPeNDAP mission?
Adoption of the DAP across a broad range of disciplines
with extensive use in several of these.
In order to achieve this the DAP must do the following:
It must be sufficiently flexible, all encompassing, etc.
that it can be used across a broad range of data
types.
Its implementation must be robust, secure, easy to
use, provide for a broad range of services, etc.
The funding stream must be robust.
Active and engaged developer and user community
Risks
It is still the case that
Some other data access protocols are seen as
more attractive regardless of whether or not they
are, or
Other data access protocols are developed
because their community is not aware of
OPeNDAP or of what its capabilities are.
We will compare and contrast some of these
today
To Succeed
OPeNDAP must make sure that:
Its (server and client) capabilities are well
known across a broad range of disciplines.
The data model is inclusive.
The implementation is robust and meets users’
needs.
The DAP coexists with other protocols.
It has a robust funding base.
It has extensive documentation.
Robust Funding?
What is an appropriate level of activity (funding) for
OPeNDAP? What should OPeNDAP be doing?
Core only? + Clients and servers? + Demonstration
projects?
Should OPeNDAP be a small staff (core only) or…?
What priority should be assigned to the elements
currently being developed? Which of these go beyond the
core?
DAP4
Toolkits - Matlab, IDL,…
AIS - metadata consistency
Grid capabilities
THREDDS - Catalogs
OTHERS?
Releases/Support
Periodic releases
User services - OPeNDAP ([email protected])
User support - first line of defense
Manages the opendap-dev discussions
Binaries Generated
There are approximately 80 binaries built on a nightly basis.
They are built for the following platforms/operating systems:
Linux
FC4
FC5
MacOS-X (universal binaries when possible)
Windows XP, win32
Java 1.5 (Tomcat 5.5)
IRIX (in four variants), Solaris, AIX, OSF
Communication
Website (http://www.opendap.org and test.opendap.org)
SVN - Code repository
(http://scm.opendap.org:8090/trac/browser/)
Trac - Task/milestone repository
(http://scm.opendap.org:8090/trac/)
Telecons
Management - Weekly
Developers - Weekly on Monday 11am MT
Twiki -> MediaWiki
Management
Developers
Coming soon http://docs.opendap.org
opendap-dev e-mail list - main mechanism for messages
OPeNDAP Community Working Groups
Modeled after best practices of IETF, W3C, OGC, IEEE,
ISO, and others
Working Groups:
Authentication
Security
Server-side Functions
Virtualization (Aggregation)
Server-side processing
Geospatial Interoperability
Hyrax and *DS (TDS, GDS, FDS, etc.)
Semantics
DAPPER
netCDF C++ client
Response types
Metrics
Asynchronous transactions
DAP4
Relational Database access via DAP
Wiki (http://docs.opendap.org/index.php/Working_Groups)
OPeNDAP Community Working Groups
Terms of Reference (abridged)
http://docs.opendap.org/index.php/Terms_of_Reference
Each OPeNDAP Working Group is established to apply
members' expertise in their focus area to produce specific
deliverables or outcomes
Types: Software and Documentation
Minimum 3 people
6 month time-frame
OPeNDAP System Elements
The OPeNDAP data access protocol is
used by a variety of system elements.
Servers
Processing Servers
Aggregating Servers - OPeNDAP chains
Clients
Ancillary Information Services
Browser Interfaces
Data System Integrators (ODC)
Servers
Servers receive requests and provide
responses via the DAP.
Servers convert the data from the form in
which they are stored to the OPeNDAP data
model.
Servers provide for subsetting of the data
and more.
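A minimal sketch of the subsetting step, assuming the server has already converted the stored data into a nested Python list. The function name is hypothetical; the `(start, stride, stop)` ordering with an inclusive stop mirrors the DAP2 index constraint.

```python
def apply_hyperslab(array, ranges):
    """Subset a nested list with one (start, stride, stop) per dimension.

    The stop index is inclusive, as in a DAP2 constraint, so it is
    converted to Python's exclusive slice end by adding one.
    """
    if not ranges:
        return array
    (start, stride, stop), rest = ranges[0], ranges[1:]
    return [apply_hyperslab(item, rest) for item in array[start:stop + 1:stride]]


# A stored 3 x 4 array; the value encodes its position as 10 * row + column.
stored = [[10 * r + c for c in range(4)] for r in range(3)]

# Equivalent to the constraint [0:2:2][1:1:2]: rows 0 and 2, columns 1 and 2.
subset = apply_hyperslab(stored, [(0, 2, 2), (1, 1, 2)])
print(subset)
```

Only `subset` would be encoded in the response, which is the point of subsetting at the server rather than at the client.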
OPeNDAP Servers
[Figure: OPeNDAP servers, each in front of its own data store. Server labels: ESML, netCDF, HDF4, HDF5, JGOFS, DSP, FITS, JDBC, FreeForm, CDF, CEDAR. Data labels: General, netCDF, HDF5, DSP, SQL Tables, FITS, CDF, Flat Binary, CEDAR.]
OPeNDAP Servers (specialized)
[Figure: specialized OPeNDAP servers, each in front of its own data. Server labels: pyDAP, ESG, FDS, GDS, DAPPER, CODAR, TDS. Data labels: General, netCDF, OPeNDAP, GRIB, BUFR, CODAR.]
Servers
Servers may also provide other services
Directory traversal.
Browser-based form to build URL.
ASCII version of data.
Metadata associated with the data.
Server side functions.
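Several of these services are commonly reached by appending a suffix to the dataset URL. The sketch below uses suffixes widely seen on DAP2 servers; whether a given server offers each one is an assumption, and the host and helper name are hypothetical.

```python
# Suffixes commonly used by DAP2 servers; a given server may not offer all of them.
SERVICE_SUFFIX = {
    "structure": ".dds",   # variable names, types and shapes
    "attributes": ".das",  # metadata attached to the variables
    "data": ".dods",       # binary data response
    "ascii": ".ascii",     # data rendered as text
    "form": ".html",       # browser form for building a constrained URL
    "info": ".info",       # human-readable summary of the dataset
}


def service_url(dataset_url: str, service: str, constraint: str = "") -> str:
    """Return the URL for one service, with an optional constraint expression."""
    url = dataset_url + SERVICE_SUFFIX[service]
    return f"{url}?{constraint}" if constraint else url


base = "http://example.org/opendap/sst.nc"  # hypothetical dataset
print(service_url(base, "attributes"))
print(service_url(base, "ascii", "T[0:1:4][0:1:4]"))
```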
OPeNDAP Aggregation Servers
[Figure: OPeNDAP aggregation servers, each in front of its own data. Server labels: pyDAP, ESG, FDS, GDS, DAPPER, CODAR, TDS, JGOFS. Data labels: General, netCDF, OPeNDAP, GRIB, BUFR, CODAR.]
The Aggregation Server: An Example
[Figure: an Aggregation Server example. Labels: netCDF Data Set (files), DSP Data Set (files), DSP, Aggregation Server, Local, OPeNDAP, HTML, GIF, Matlab Client, Matlab.]
Clients
Clients make requests and receive
responses via the DAP.
Clients convert data from the OPeNDAP
data model to the form required in the client
application.
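As a small illustration of that conversion, the sketch below turns one array declaration, written in the style of a DAP2 dataset descriptor, into a plain Python dictionary that a client application could work with. The parser handles only this simple case and is not a full descriptor parser; the function name and the dictionary layout are assumptions made for the example.

```python
import re

# "Float32 T[lat = 20][lon = 40];" -> type, name, and the bracketed dimensions.
DECL = re.compile(r"^\s*(\w+)\s+(\w+)((?:\[[^\]]*\])*)\s*;\s*$")
# One dimension: an optional "name =" followed by the size.
DIM = re.compile(r"\[\s*(?:(\w+)\s*=\s*)?(\d+)\s*\]")


def parse_declaration(line: str) -> dict:
    """Parse a single simple array declaration into a dictionary."""
    match = DECL.match(line)
    if match is None:
        raise ValueError(f"not a simple array declaration: {line!r}")
    dtype, name, dims = match.groups()
    found = DIM.findall(dims)
    return {
        "name": name,
        "dtype": dtype,
        "shape": tuple(int(size) for _, size in found),
        "dims": tuple(dim_name or None for dim_name, _ in found),
    }


parsed = parse_declaration("Float32 T[lat = 20][lon = 40];")
print(parsed)
```

From here a client would allocate whatever its application expects (a Matlab matrix, an IDL array, a spreadsheet range) with the parsed shape and type.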
OPeNDAP Clients
[Figure: OPeNDAP clients. Labels: netCDF C, Ferret, GrADS, netCDF Java, IDV, VisAD, NCL Client, IDL Client, Matlab Client, ncBrowse, Access, IDL, pyDAP, Excel, ArcGIS, NCL, Matlab.]
Ancillary Information Service
• Current capability: Attributes only
• Client-side only
• Local and remote resources
• Local resource databases
The AIS enables users to augment the metadata for a
data source in a controlled way without requiring
write access to the original data. By using the DAP,
users are also isolated from data format issues.
AIS enhancements
• Remote resource databases
• AIS server
• AIS for variables
These enhancements will greatly expand the usefulness of the AIS:
Remote resource databases and an AIS server will enable third-party
‘AIS sites’, which may be sponsored by project offices, institutions, et cetera.
AIS for variables will enable adding metadata which are stored as ‘data.’
Proposed AIS Server
[Figure: a client linked with DAP software, the AIS Server, the Data Source, and the AIS Resource, joined by arrows numbered 0 to 3 that correspond to the steps below.]
0. The client requests metadata from the AIS server (which appears no different from
any other DAP server).
1. The AIS server gets the metadata from the data source.
2. The AIS server gets the matching AIS resource using the AIS database and
merges it into the metadata.
3. The AIS server returns the resulting metadata object.
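The four steps can be sketched as follows. This is an illustration under stated assumptions, not the proposed server's implementation: the class and function names are hypothetical, metadata is modelled as a dictionary of per-variable attribute dictionaries, and AIS values are assumed to win when both sides define the same attribute.

```python
def merge_attributes(source_attrs: dict, ais_attrs: dict) -> dict:
    """Return a new attribute table: source metadata augmented by the AIS resource.

    Assumption: on a conflict the AIS value replaces the source value.
    Neither input is modified, so the original data source is never written to.
    """
    merged = {var: dict(attrs) for var, attrs in source_attrs.items()}
    for var, extra in ais_attrs.items():
        merged.setdefault(var, {}).update(extra)
    return merged


class AISServer:
    """Hypothetical AIS server: looks like any other metadata source to the client."""

    def __init__(self, fetch_source_metadata, ais_database: dict):
        self._fetch = fetch_source_metadata  # callable: dataset URL -> attributes
        self._database = ais_database        # dataset URL -> AIS resource

    def get_metadata(self, dataset_url: str) -> dict:       # step 0: client request
        source = self._fetch(dataset_url)                    # step 1: ask the data source
        resource = self._database.get(dataset_url, {})       # step 2: find the AIS resource
        return merge_attributes(source, resource)            # steps 2-3: merge and return


# Stand-ins for a real data source and a real AIS database.
def fake_source(url):
    return {"T": {"units": "degC"}}


database = {"http://example.org/sst": {"T": {"long_name": "sea surface temperature"}}}
server = AISServer(fake_source, database)
metadata = server.get_metadata("http://example.org/sst")
print(metadata)
```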
OCAPI
A pure OPeNDAP C API (OCAPI) for
the client-side
Applications:
DAP-aware ‘commands’ for commercial
analysis programs (e.g., IDL)
Scripting tools (e.g., Perl)
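ODC - a Data System Integrator
[Figure: the ODC between data sources and client applications. Source labels: GCMD, OPeNDAP, GFDL, netCDF, URI, HDF, GSFC, Binary, NVODS. Client labels: Matlab, Ferret, VisAD, GrADS, IDV, Access, IDL, Excel, ncBrowse.]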
The Data Access Protocol (DAP)
The DAP has been designed to be as
general as possible without being
constrained to a particular discipline or
world view.
The DAP is a discipline neutral data access
protocol; it is being used in astronomy,
medicine, earth science,…
Data Access Protocol (DAP2) - Current
DAP2 currently a NASA/ESE ‘Standard’
Current server (OPeNDAP 3.x; aka SERVER3)
DAP3
XML responses (implemented)
DAP4 - Late 2007 (?)
DAP4 improvements over DAP3:
Additional datatypes
Swath
Blob - GIF, MPEG,…
Additional functionality
Checksum
Modulo
The additional datatypes will enable the DAP to
be used in a wider variety of circumstances and
are a direct response to users’ requests.
OPeNDAP’s Hyrax (‘Server4’)
• Uses a modular architecture to support
different application-level protocols
– Data access using DAP
– Catalogs using THREDDS
– Browsing using HTML and ASCII
• Modules for data access
– Different file types
– Potential for database and scripting
• Modules for commands
– Commands provide varying operations for different
protocols
Hyrax Architecture
[Figure: OLFS, BES, and Data shown as a chain.]
OPeNDAP Lightweight Front end Server (OLFS)
Receives requests and asks the BES to fill them
Uses Java Servlets
Does not directly ‘touch’ data
Back End Server (BES)
Reads data files, databases, etc., and returns information
May return DAP objects or other data
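The division of labour can be sketched as two small classes. This is a toy model in Python, not Hyrax code: the class names, the idea of registering data modules by file extension, and the request format are all assumptions made for illustration.

```python
import os


class BackEnd:
    """Stands in for the BES: the only component that reads data."""

    def __init__(self):
        self._handlers = {}  # file extension -> data module

    def register(self, extension: str, handler) -> None:
        """Plug in a module that knows how to read one file type."""
        self._handlers[extension] = handler

    def execute(self, command: str, path: str):
        extension = os.path.splitext(path)[1]
        if extension not in self._handlers:
            raise LookupError(f"no data module registered for {extension!r}")
        return self._handlers[extension](command, path)


class FrontEnd:
    """Stands in for the OLFS: interprets the request and never touches data."""

    def __init__(self, back_end: BackEnd):
        self._back_end = back_end

    def handle(self, request_path: str):
        # "/data/sst.nc.dds" -> file "/data/sst.nc", command "dds"
        path, suffix = os.path.splitext(request_path)
        return self._back_end.execute(suffix.lstrip("."), path)


bes = BackEnd()
# A stand-in data module; a real one would open the file and build a response.
bes.register(".nc", lambda command, path: f"{command} response for {path}")
olfs = FrontEnd(bes)
reply = olfs.handle("/data/sst.nc.dds")
print(reply)
```

Keeping file access behind the back end means a new file type needs only a new registered module; the front end is unchanged.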
[Figure: the OPeNDAP Lightweight Front end Server. Labels: Request from client; HTTP; DAP2; SOAP-DAP; Request Formulation**; BES; Response to client; GridFTP; DAP2; THREDDS; Info output; HTML output; ASCII output.]
BES
[Figure: the BES Framework. Labels: BES Commands/XML Documents; PPT*; Initialization/Termination; Network Protocol and Process start/stop activities; Commands**: DAP2, Access, Data, Catalogs; Data Store Interfaces: NetCDF3, HDF4, FreeForm, …]
*PPT is built in (other protocols)
**Some commands are built in
Today’s Overview
• DAP Servers and Services
• DAP Clients and Services