
Recent Work in Progress
John Caron, June 3, 2003
• THREDDS development
– Dynamic Catalogs: DQC, Resolvers
– IDD Data Server
– ADDE Cataloger
• NetCDF development
– NetCDF Markup Language (NcML)
– More efficient Java I/O (NIO)
– NetCDF/DODS/HDF5 Data Models
THREDDS Catalogs
[Diagram labels: HTTP Server; Catalog Generator; Catalog.xml; CatalogRef.xml; Data Server (DODS, ADDE, FTP, HTTP); Datasets; hostname.edu; Client Application.]
Dynamic Catalogs = Services
[Diagram labels: HTTP Tomcat Server; Catalog Generator; Catalog Service (Catalog.xml, CatalogRef.xml); Query Resolver Service (DQC.xml); Resolver Service (URI, URL); Data Server (DODS, ADDE, FTP, HTTP); Datasets; hostname.edu; Client Application.]
Dataset Query Capability (DQC)
• XML document.
• Describes what the user can ask for as a set of
orthogonal “selections”.
• On the client, a “query URL” is formed based on
the user’s choices, and sent to the server.
• The “query resolver” server finds which datasets
satisfy the query and returns a list of real dataset
URLs.
• The DQC describes the queries that the server is
capable of responding to.
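The client-side step above (turning the user's choices into a "query URL") can be sketched roughly as follows. This is a minimal illustration only: the selection names `station` and `product` and the `/radarServer` path are hypothetical, and the real DQC document defines its own selection names and URL layout.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class DqcQuerySketch {
    /** Appends each selection as name=value to the base URL, in insertion order. */
    static String buildQueryUrl(String baseUrl, Map<String, String> selections) {
        StringBuilder url = new StringBuilder(baseUrl);
        char separator = '?';
        for (Map.Entry<String, String> choice : selections.entrySet()) {
            url.append(separator)
               .append(encode(choice.getKey()))
               .append('=')
               .append(encode(choice.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    /** Form-encodes one name or value so it is safe inside a query string. */
    private static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical selections; a real client would read the choices from the DQC.
        Map<String, String> selections = new LinkedHashMap<>();
        selections.put("station", "KDEN");
        selections.put("product", "base reflectivity");
        System.out.println(buildQueryUrl("http://hostname.edu/radarServer", selections));
    }
}
```

Because the selections are described as orthogonal, each one maps to an independent name=value pair, and the order of the pairs should not matter to the server.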
Resolver Services
• Logical Dataset, e.g. “latest ETA model run”
• Dataset with Service type “Resolver”
• On the client, the URI of the logical
dataset is sent to the server
• The server finds what is available and
returns a list of real dataset URLs.
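The resolver idea above can be sketched as a lookup from a logical dataset URI to the real dataset URLs currently available. This is an in-memory illustration under assumed names; the class, its methods, and the example URIs are hypothetical and do not come from THREDDS itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ResolverSketch {
    // Logical dataset URI -> real dataset URLs known to the server right now.
    private final Map<String, List<String>> available = new HashMap<>();

    /** Records that a real dataset URL currently satisfies a logical dataset. */
    void register(String logicalUri, String realUrl) {
        List<String> urls = available.get(logicalUri);
        if (urls == null) {
            urls = new ArrayList<>();
            available.put(logicalUri, urls);
        }
        urls.add(realUrl);
    }

    /** Returns the real URLs for a logical dataset, or an empty list if none match. */
    List<String> resolve(String logicalUri) {
        List<String> urls = available.get(logicalUri);
        if (urls == null) {
            return Collections.<String>emptyList();
        }
        return Collections.unmodifiableList(urls);
    }

    public static void main(String[] args) {
        ResolverSketch resolver = new ResolverSketch();
        // Hypothetical logical name and dataset URL, for illustration only.
        resolver.register("latestEtaRun", "http://hostname.edu/dods/eta/run1");
        System.out.println(resolver.resolve("latestEtaRun"));
    }
}
```

The point of the indirection is that the client keeps a stable logical name while the server decides, at request time, which real datasets it stands for.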
ADDE Cataloger
[Diagram labels: HTTP Tomcat Server; Catalog Service (Catalog.xml, CatalogRef.xml); ADDE Cataloger; Query Resolver Service (DQC.xml); ADDE Data Server; Datasets; hostname.edu; IDD; Client Application.]
Summary IDD Data Server
Make as many of the IDD data feeds as possible
available via THREDDS.
– NCEP model data (catgen) (DODS)
– Level 3 NEXRAD (custom server/DQC) (ADDE)
– SSEC/Unidata Satellite data (ADDE
Cataloger) (ADDE)
– Text Data: Metars, Surface Obs, etc
(DQC/custom server), returns text or XML.
– Profiler Data (custom server/DQC) (ADDE)
NetCDF 3
[Diagram labels: NetCDF File; OpenDAP Dataset; NcML Dataset XML; Local file; OpenDAP protocol; HTTP protocol; NetCDF-3 library; Virtual dataset API; Client Application.]
NetCDF Markup Language
XML representation of netCDF metadata, uses
XML Schema
• Core: existing netCDF data model
• Coordinate System: general and
georeferencing coordinate system
• Dataset: redefine, aggregate, subset
• Luca Cinquini (NCAR/SCD/ESG), John Caron, Ethan
Davis, Bob Drach (LLNL), Stefano Nativi (Florence),
Russ Rew
NcML Coordinate Systems
Convention Parser
• ATDRadar
• AWIPS
• COARDS
• CF
• CSM
• GDV
• NUWG
• WRF
• Zebra
[Diagram labels: NetCDF File; OpenDAP Dataset; Convention Parser; Netcdf Dataset; NcML Dataset XML.]
GeoGrids, GeoTiffs, Geowhiz!
[Diagram labels: NetCDF File; OpenDAP Dataset; Convention Parser; Netcdf Dataset; GeoGrid factory; GeoGrid Dataset; VisAD / IDV; WCS Server; GeoTiff Writer; GeoTiff File; OpenGIS WCS; “Strange land of GIS”.]
NcML Dataset : “virtual view”
[Diagram labels: NetCDF File; OpenDAP Dataset; NcML Dataset XML; Dataset XML Parser; Java-netCDF 2.1; NetCDF Dataset; Client Application.]
NcML Dataset
• Use NcML like CDL, to declare the contents of a
netCDF file.
• Add, delete or rename Variables, Attributes, and
Dimensions
• Subset Variables
• Reorder a Variable’s dimensions
• Aggregate multiple netCDF files, a la DODS
Aggregation Server
• An NcML Dataset is a “virtual view”, or it can be
copied to a local netCDF file.
2: NcML Datasets on a Server
[Diagram labels: Catalog.xml; DODS Agg/Netcdf Server (DODS, ADDE, FTP, HTTP); Dataset XML Parser; NcML Dataset XML; Datasets; hostname.edu; Client Application.]
3: NcML Datasets via Catalogs
[Diagram labels: Catalog.xml; NetCDF File; OpenDAP Dataset; NcML Dataset XML; Catalog/Dataset XML Parser; Java-netCDF v 2.1.1; Client Application.]
NIO
• Rewrite ucar.nc2 I/O layer using java.nio
package (currently using ucar.netcdf)
• Uses memory mapping, bulk I/O transfer
• Prototype has 7x speedup on large files.
• Requires JDK 1.4+
• HTTP access must be rewritten
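The memory-mapping approach named above can be illustrated with the standard `FileChannel.map` call from java.nio. This is only a sketch of the technique, not the ucar.nc2 I/O layer: the class and method names are hypothetical, and for brevity it uses `java.nio.file` helpers that arrived in JDKs later than 1.4. A single mapping is limited to 2 GB, so larger files would need to be mapped in pieces.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedReadSketch {
    /** Maps the whole file read-only and sums its bytes, reading it sequentially. */
    static long sumBytes(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // The OS pages file contents in on demand; no explicit read() calls.
            MappedByteBuffer buffer =
                channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            long sum = 0;
            while (buffer.hasRemaining()) {
                sum += buffer.get() & 0xFF; // treat each byte as unsigned
            }
            return sum;
        }
    }

    /** Writes a tiny temporary file and reads it back through the mapping. */
    static long demo() {
        try {
            Path file = Files.createTempFile("mapped", ".bin");
            file.toFile().deleteOnExit();
            Files.write(file, new byte[] {1, 2, 3, (byte) 250});
            return sumBytes(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Mapping only works for local files, which is consistent with the note that HTTP access needs a separate rewrite.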
NIO vs current Java
                           NIO   Current   old/new
First access
  small (3.9 Mb)           281       671       2.4
  large (240 Mb)          3334     28221       8.5
Average next 5 accesses
  small                     54       290       5.4
  large                   2239     16367       7.3
• Time in millisecs to sequentially read entire file
• Wintel 2GHz, 1 GB main memory
• Java 1.4.2 -client
NIO vs optimized C
                           NIO         C     C/NIO
First access
  small (3.9 Mb)           281       370       1.3
  large (240 Mb)          3334     19348       5.8
Average next 5 accesses
  small                     54        24       .44
  large                   2239       953       .43
• Java 1.4.2 -client vs. VC 6.0 /O2
NetCDF Data Model
[Diagram labels: NetcdfFile; Dimension; Variable; Attribute (shown twice); DataType.]
DataType
• byte
• char
• short
• int
• float
• double
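The classic data model named on this slide can be written down as a handful of plain classes. This is an illustrative sketch only: the nested classes and their fields are hypothetical and are not the ucar.nc2 API, and attaching attributes to both the file and its variables is an assumption based on Attribute appearing twice in the diagram.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ClassicModelSketch {
    // The six data types listed on the slide.
    enum DataType { BYTE, CHAR, SHORT, INT, FLOAT, DOUBLE }

    static class Attribute {
        final String name;
        final Object value;
        Attribute(String name, Object value) { this.name = name; this.value = value; }
    }

    static class Dimension {
        final String name;
        final int length;
        Dimension(String name, int length) { this.name = name; this.length = length; }
    }

    static class Variable {
        final String name;
        final DataType type;
        final List<Dimension> dimensions;
        final List<Attribute> attributes = new ArrayList<>();

        Variable(String name, DataType type, Dimension... dimensions) {
            this.name = name;
            this.type = type;
            this.dimensions = Arrays.asList(dimensions);
        }

        /** Number of elements: the product of the dimension lengths (1 for a scalar). */
        long elementCount() {
            long count = 1;
            for (Dimension d : dimensions) {
                count *= d.length;
            }
            return count;
        }
    }

    static class NetcdfFile {
        final List<Dimension> dimensions = new ArrayList<>();
        final List<Variable> variables = new ArrayList<>();
        final List<Attribute> globalAttributes = new ArrayList<>();
    }

    public static void main(String[] args) {
        Dimension time = new Dimension("time", 4);
        Dimension lat = new Dimension("lat", 3);
        Variable temperature = new Variable("temperature", DataType.FLOAT, time, lat);
        temperature.attributes.add(new Attribute("units", "K"));
        System.out.println(temperature.elementCount());
    }
}
```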
OpenDAP Data Model
[Diagram labels: Dataset; BaseType; Attribute; array; Dimension; structure / sequence.]
BaseType
• primitive (8)
• string
• array
• grid
• structure
• sequence
HDF5 Data Model
[Diagram labels: Groups (file directory structure inside HDF file); Dataset; Datatype; DataSpace; Attribute; Data storage.]
Datatype
• Fixed point
• Floating point
• Date/time
• String
• Bit field
• Opaque
• Compound
• Reference
• Enumeration
• Variable length
• Array
Data storage
• Compact
• External
• Layout
• Indexed
• Striped
Possible Extensions to netCDF
data model
• Add new data types:
– Strings: variable length arrays of bytes, plus an encoding
attribute.
– Structures: collections of any other element types, allow nested
structures.
– Vector: a variable length 1D array of any type.
• Allow reusable structure definition = user defined data
type.
• Allow unnamed, undeclared dimensions = anonymous
dimensions.
• Allow multiple unlimited dimensions (outer dimension
only)
• Compression. Push scale/offset into library, allow
variable bit sizes.
• Explicit support for coordinate variables/axes.
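The scale/offset item in the list above refers to storing values as small integers and recovering them arithmetically. A minimal sketch, assuming the common netCDF attribute convention of `scale_factor` and `add_offset` (unpacked = packed × scale_factor + add_offset); the class and method names are hypothetical, and the sketch assumes the packed value fits in a short.

```java
class ScaleOffsetSketch {
    /** Packs a value into a short; assumes the result fits the short range. */
    static short pack(double value, double scaleFactor, double addOffset) {
        return (short) Math.round((value - addOffset) / scaleFactor);
    }

    /** Recovers the approximate original value from its packed form. */
    static double unpack(short packed, double scaleFactor, double addOffset) {
        return packed * scaleFactor + addOffset;
    }

    public static void main(String[] args) {
        double scaleFactor = 0.5;  // illustrative values, not taken from any real file
        double addOffset = 10.0;
        short packed = pack(12.5, scaleFactor, addOffset);
        System.out.println(packed + " -> " + unpack(packed, scaleFactor, addOffset));
    }
}
```

Today this arithmetic is typically done by the application or a convention layer; pushing it into the library, as the slide proposes, would let clients see unpacked values directly.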
New NetCDF Data Model
[Diagram labels: NetcdfFile; Variable; Dimension; Attribute; Structure; Vector (Length); DataType.]
DataType
• byte
• short
• int
• long
• float
• double
• String
• Structure
• Vector
NetCDF 4
[Diagram labels: NetCDF V.1 and 2 File; HDF5 File; 4.0; OpenDAP Dataset; NcML Dataset XML; Local file or HTTP protocol; OpenDAP protocol; NetCDF 4 library; Virtual dataset API; Client Application.]