Transcript Slide 1
Recent Work in Progress
John Caron, June 3, 2003
• THREDDS development
  – Dynamic Catalogs: DQC, Resolvers
  – IDD Data Server
  – ADDE Cataloger
• NetCDF development
  – NetCDF Markup Language (NcML)
  – More efficient Java I/O (NIO)
  – NetCDF/DODS/HDF5 Data Models

THREDDS Catalogs
[Diagram labels: HTTP Server; Catalog Generator; Catalog.xml; CatalogRef.xml; Data Server (DODS, ADDE, FTP, HTTP); Datasets; Client Application; hostname.edu]

Dynamic Catalogs = Services
[Diagram labels: HTTP Tomcat Server; Catalog Service; Catalog Generator; Catalog.xml; CatalogRef.xml; Query Resolver Service; DQC.xml; Resolver Service; URI; URL; Data Server (DODS, ADDE, FTP, HTTP); Datasets; Client Application; hostname.edu]

Dataset Query Capability (DQC)
• XML document.
• Describes what the user can ask for as a set of orthogonal “selections”.
• On the client, a “query URL” is formed based on the user’s choices, and sent to the server.
• The “query resolver” server finds which datasets satisfy the query and returns a list of real dataset URLs.
• The DQC describes the queries that the server is capable of responding to.

Resolver Services
• Logical Dataset, e.g. “latest ETA model run”
• Dataset with Service type “Resolver”
• On the client, the URI of the logical dataset is sent to the server
• The server finds what is available and returns a list of real dataset URLs.

ADDE Cataloger
[Diagram labels: HTTP Tomcat Server; Catalog Service; Catalog.xml; CatalogRef.xml; ADDE Cataloger; Query Resolver Service; DQC.xml; Client Application; ADDE Data Server; Datasets; hostname.edu]

IDD Summary

IDD Data Server
Get as much of the IDD Data feeds available via THREDDS as possible.
– NCEP model data (catgen) (DODS)
– Level 3 NEXRAD (custom server/DQC) (ADDE)
– SSEC/Unidata Satellite data (ADDE Cataloger) (ADDE)
– Text Data: Metars, Surface Obs, etc (DQC/custom server), returns text or XML.
– Profiler Data (custom server/DQC) (ADDE)

NetCDF 3
[Diagram labels: NetCDF File; OpenDAP Dataset; NcML Dataset XML; Local file; OpenDAP protocol; HTTP protocol; NetCDF-3 library; Virtual dataset API; Client Application]

NetCDF Markup Language
XML representation of netCDF metadata, uses XML Schema
• Core: existing netCDF data model
• Coordinate System: general and georeferencing coordinate system
• Dataset: redefine, aggregate, subset
• Luca Cinquini (NCAR/SCD/ESG), John Caron, Ethan Davis, Bob Drach (LLNL), Stefano Nativi (Florence), Russ Rew

NcML Coordinate Systems
Convention Parser
• ATDRadar
• AWIPS
• COARDS
• CF
• CSM
• GDV
• NUWG
• WRF
• Zebra
[Diagram labels: NetCDF File; OpenDAP Dataset; NcML Dataset XML; Netcdf Dataset]

GeoGrids, GeoTiffs, Geowhiz!
[Diagram labels: NetCDF File; OpenDAP Dataset; Convention Parser; Netcdf Dataset; GeoGrid factory; GeoGrid Dataset; VisAD / IDV; WCS Server; GeoTiff Writer; GeoTiff File; OpenGIS WCS; Strange land of GIS]

NcML Dataset: “virtual view”
[Diagram labels: NetCDF File; OpenDAP Dataset; NcML Dataset XML; Dataset XML Parser; Java-netCDF 2.1; NetCDF Dataset; Client Application]

NcML Dataset
• Use NcML like CDL, to declare the contents of a netCDF file.
• Add, delete or rename Variables, Attributes, and Dimensions
• Subset Variables
• Reorder a Variable’s dimensions
• Aggregate multiple netCDF files, a la DODS Aggregation Server
• NcML Dataset is a “virtual view” or can make copy to a local netCDF file.

2: NcML Datasets on a Server
[Diagram labels: Catalog.xml; DODS Agg/Netcdf Server (DODS, ADDE, FTP, HTTP); NcML Dataset XML; Dataset XML Parser; Datasets; Client Application; hostname.edu]

3: NcML Datasets via Catalogs
[Diagram labels: Catalog.xml; NetCDF File; OpenDAP Dataset; NcML Dataset XML; Catalog/Dataset XML Parser; Java-netCDF v 2.1.1; Client Application]

NIO
• Rewrite ucar.nc2 I/O layer using java.nio package (currently using ucar.netcdf)
• Uses memory mapping, bulk I/O transfer
• Prototype has 7x speedup on large files.
• Requires JDK 1.4+
• HTTP access must be rewritten

NIO vs current Java
                             NIO    Current    old/new
First access
  small (3.9 MB)             281        671        2.4
  large (240 MB)            3334      28221        8.5
Average next 5 accesses
  small                       54        290        5.4
  large                     2239      16367        7.3
• Time in millisecs to sequentially read entire file
• Wintel 2GHz, 1 GB main memory
• Java 1.4.2 -client

NIO vs optimized C
                             NIO          C      C/NIO
First access
  small (3.9 MB)             281        370        1.3
  large (240 MB)            3334      19348        5.8
Average next 5 accesses
  small                       54         24        .44
  large                     2239        953        .43
• Java 1.4.2 -client vs. VC 6.0 /O2

NetCDF Data Model
[Diagram labels: NetcdfFile; Dimension; Variable; Attribute]
DataType
• byte
• char
• short
• int
• float
• double

OpenDAP Data Model
[Diagram labels: Dataset; BaseType; Attribute; array; Dimension; structure / sequence]
BaseType
• primitive (8)
• string
• array
• grid
• structure
• sequence

HDF5 Data Model
Groups: file directory structure inside HDF file.
[Diagram labels: Dataset; Datatype; DataSpace; Attribute; Data storage]
Datatype
• Fixed point
• floating point
• date/time
• string
• bit field
• Opaque
• Compound
• Reference
• Enumeration
• Variable length
• Array
Data storage
• Compact
• External
• Layout
• Indexed
• Striped

Possible Extensions to netCDF data model
• Add new data types:
  – Strings: variable length arrays of bytes, plus an encoding attribute.
  – Structures: collections of any other element types, allow nested structures.
  – Vector: a variable length 1D array of any type.
• Allow reusable structure definition = user defined data type.
• Allow unnamed, undeclared dimensions = anonymous dimensions.
• Allow multiple unlimited dimensions (outer dimension only)
• Compression. Push scale/offset into library, allow variable bit sizes.
• Explicit support for coordinate variables/axes.
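
The "Push scale/offset into library" bullet above refers to storing values as small integers and converting them on read. A minimal Java sketch of that arithmetic follows, assuming the common netCDF packing convention unpacked = packed * scale_factor + add_offset. The class and method names are hypothetical and are not part of ucar.nc2 or any netCDF library; the slides do not say how the library would implement this.

```java
// Hypothetical sketch: packs doubles into 16-bit shorts and back.
// Assumes the convention unpacked = packed * scaleFactor + addOffset.
final class ScaleOffsetSketch {

    // Convert a stored short back to its physical value.
    static double unpack(short packed, double scaleFactor, double addOffset) {
        return packed * scaleFactor + addOffset;
    }

    // Quantize a physical value to the nearest representable short.
    static short pack(double value, double scaleFactor, double addOffset) {
        long quantized = Math.round((value - addOffset) / scaleFactor);
        if (quantized < Short.MIN_VALUE || quantized > Short.MAX_VALUE) {
            throw new IllegalArgumentException("value out of range for 16-bit packing: " + value);
        }
        return (short) quantized;
    }

    public static void main(String[] args) {
        double scaleFactor = 0.5;   // illustrative values, not from the slides
        double addOffset = 10.0;
        short stored = pack(12.5, scaleFactor, addOffset);
        System.out.println(stored + " unpacks to " + unpack(stored, scaleFactor, addOffset));
    }
}
```

Doing this inside the library would let a client read a packed variable as if it were floating point, at the cost of precision limited to one scale step.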
New NetCDF Data Model
[Diagram labels: NetcdfFile; Variable; Dimension; Attribute; Structure; Vector (Length, DataType)]
DataType
• byte
• short
• int
• long
• float
• double
• String
• Structure
• Vector

NetCDF 4
[Diagram labels: NetCDF V.1 and 2 File; HDF5 File; OpenDAP Dataset; NcML Dataset XML; Local file; OpenDAP or 4.0 protocol; HTTP protocol; NetCDF 4 library; Virtual dataset API; Client Application]
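
The "New NetCDF Data Model" slide above adds String, Structure, and Vector to the existing primitive types. One way to picture how those pieces nest is the Java sketch below. Every class here is hypothetical and written only for illustration; none of it is the ucar.nc2 API or the proposed library design, and the station example values are made up.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the extended data model; not a real netCDF API.
final class ExtendedModelSketch {

    // The data types listed on the slide.
    enum DataType { BYTE, SHORT, INT, LONG, FLOAT, DOUBLE, STRING, STRUCTURE, VECTOR }

    // Structure: a named collection of members of any type, nesting allowed.
    static final class Structure {
        final String name;
        final Map<String, Object> members = new LinkedHashMap<String, Object>();
        Structure(String name) { this.name = name; }
        Structure put(String member, Object value) {
            typeOf(value);               // rejects unsupported member types
            members.put(member, value);
            return this;
        }
        Object get(String member) { return members.get(member); }
    }

    // Vector: a variable-length 1D array holding one element type.
    static final class Vector {
        final DataType elementType;
        final List<Object> values = new ArrayList<Object>();
        Vector(DataType elementType) { this.elementType = elementType; }
        Vector add(Object value) {
            if (typeOf(value) != elementType) {
                throw new IllegalArgumentException("expected " + elementType);
            }
            values.add(value);
            return this;
        }
        int length() { return values.size(); }
    }

    // Map a Java value onto the model's DataType.
    static DataType typeOf(Object value) {
        if (value instanceof Byte) return DataType.BYTE;
        if (value instanceof Short) return DataType.SHORT;
        if (value instanceof Integer) return DataType.INT;
        if (value instanceof Long) return DataType.LONG;
        if (value instanceof Float) return DataType.FLOAT;
        if (value instanceof Double) return DataType.DOUBLE;
        if (value instanceof String) return DataType.STRING;
        if (value instanceof Structure) return DataType.STRUCTURE;
        if (value instanceof Vector) return DataType.VECTOR;
        throw new IllegalArgumentException("unsupported type: " + value);
    }

    // Made-up example: a station record with a nested Structure and a Vector.
    static Structure exampleStation() {
        Structure location = new Structure("location").put("lat", 40.0).put("lon", -105.0);
        Vector temps = new Vector(DataType.FLOAT).add(1.5f).add(2.5f).add(3.5f);
        return new Structure("station")
                .put("name", "example")
                .put("location", location)
                .put("temps", temps);
    }

    public static void main(String[] args) {
        Structure station = exampleStation();
        for (Map.Entry<String, Object> e : station.members.entrySet()) {
            System.out.println(e.getKey() + ": " + typeOf(e.getValue()));
        }
    }
}
```

The nested `location` member shows the "allow nested structures" extension, and `temps` shows a Vector whose length is not tied to any declared dimension.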