Summary of HDF-EOS5 Files,
Data Model and File Format
Abe Taaheri, Raytheon IIS
HDF & HDF-EOS Workshop XI
November 2007
General HDF-EOS5 File Structure
• An HDF-EOS5 file is any valid HDF5 file that contains:
– a family of global attributes called coremetadata.X
– optional data objects:
* a family of global attributes called archivemetadata.X
* any number of Swath, Grid, Point, ZA, and Profile data
structures
– another family of global attributes called StructMetadata.X
• The global attributes provide information on the structure of
HDF-EOS5 file or information on the data granule that file
contains.
• Other optional user-added global attributes such as
“PGEVersion”, “OrbitNumber”, etc. are written as HDF5
attributes into a group called “FILE ATTRIBUTES”
Page 2
General HDF-EOS5 File Structure
• coremetadata.X
Used to populate searchable database tables
within the ECS archives. Data users use this
information to locate particular HDF-EOS5
data granules.
• archivemetadata.X
Represents information that, by definition, will
not be searchable. Contains whatever
information the file creator considers useful
to be in the file, but which will not be directly
accessible by ECS databases.
• StructMetadata.X
Describes contents and structure of HDF-EOS
file. e.g. dimensions, compression methods,
geolocation, projection information, etc. that
are associated with the data itself.
Page 3
General HDF-EOS5 File Structure
• An HDF-EOS5 file
– can contain any number of Grid, Point, Swath,
Zonal Average, and Profile data structures
– has no size limits.
* A file containing thousands of objects could slow
program execution.
– can be hybrid, containing plain HDF5 objects for
special purposes.
* Plain HDF5 objects must be accessed through the HDF5
library, not through the HDF-EOS5 extensions.
* A hybrid file requires more knowledge of file contents on
the part of an applications developer or data user.
Page 4
Swath Structure
• Data which is organized by time, or
other track parameter.
• Spacing can be irregular.
• Structure
– Geolocation information stored
explicitly in Geolocation
Field (2-D array)
– Data stored in 2-D or 3-D arrays
– Time stored in 1-D or 2-D array
– Geolocation/science data
connected by structural metadata
Page 5
Swath Structure
• For a typical satellite swath, an
instrument takes a series of scans
perpendicular to the ground track
of the satellite as it moves along
that ground track
• Or a sensor measures
a vertical profile, instead
of scanning across the
ground track
[Figure: swath geometry, labeled "Instrument Path", "Instrument Profiles", and "Along Track"]
Page 6
Swath Structure
• Swath_X groups are created
when swaths are created
• The parent groups of the data and geolocation fields are
created when fields are defined.
• Swath attributes are set as Object
Attributes.
• Attributes for the Data Fields, Profile Fields, or
Geolocation Fields groups are set as
Group Attributes
• Dataset-related attributes set for
each data field or geolocation field
are called Local Attributes. They
may contain attributes such as
fillvalue, units, etc.
• Each Data Field object can have
Attributes and/or Dimension Scales
[Figure: swath group hierarchy. Legend: HDF5 Group, HDF5 Attribute, HDF5 Dataset]
“SWATHS” group
   “Swath_1” ... “Swath_N”   (Object Attribute <SwathName>:<AttrName>)
      Data Fields   (Group Attribute <DataFields>:<AttrName>)
         Data Field.1 ... Data Field.n   (Local Attribute <FieldName>:<AttrName>)
      Profile Fields
         Profile Field.1 ... Profile Field.n
      Geolocation Fields
         Longitude, Latitude, Colatitude, Time
Page 7
Swath Structure
• Geolocation Fields
− Geolocation fields allow the Swath to be accurately tied to particular
points on the Earth’s surface.
− A Swath needs at least a time field (“Time”) or a latitude/longitude field pair
(“Latitude” and “Longitude”). “Colatitude” may be substituted for “Latitude.”
− Fields must be either one- or two-dimensional
− The “Time” field is always in TAI format (International Atomic Time)
Field Name   Data Type            Format
Longitude    float32 or float64   DD*, range [-180.0, 180.0]
Latitude     float32 or float64   DD*, range [-90.0, 90.0]
Colatitude   float32 or float64   DD*, range [0.0, 180.0]
Time         float64              TAI93 [seconds until(-) / since(+) midnight, 1/1/93]
* DD = Decimal Degree
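As a rough illustration of the TAI93 convention in the table above, the sketch below turns a "Time" value into a calendar date by adding it to midnight, 1/1/93. The function name is hypothetical, and the conversion deliberately ignores leap seconds, so the result is only approximately UTC.

```python
from datetime import datetime, timedelta, timezone

# Reference epoch for TAI93: midnight, 1 January 1993.
TAI93_EPOCH = datetime(1993, 1, 1, tzinfo=timezone.utc)


def tai93_to_datetime(seconds):
    """Approximate calendar time for a TAI93 value.

    Negative values are seconds before the epoch, positive values after it.
    Leap seconds are ignored (an assumption of this sketch), so the result
    can differ from true UTC by a few seconds.
    """
    return TAI93_EPOCH + timedelta(seconds=seconds)


print(tai93_to_datetime(0.0).isoformat())
print(tai93_to_datetime(-86400.0).isoformat())
```

An exact conversion would need a leap-second table, which is outside the scope of this sketch.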
Page 8
Swath Structure
• Data Fields
− Fields may have up to 8 dimensions.
− For all multi-dimensional fields in scan- or profile-oriented Swaths, the
dimension representing the “along track” dimension must precede the
dimension representing the scan or profile dimension(s) (in C-order).
( e.g. “Bands, DataTrack, DataXtrack” )
− Compression is selectable at the field level within a Swath. All HDF5-supported
compression methods are available through the HDF-EOS5
library. The compression method is stored within the file. Subsequent
use of the library will un-compress the file. As in HDF5, the data needs
to be chunked before the compression is applied.
− Field names:
* may be up to 64 characters in length.
* Any character can be used with the exception of, ",", ";", " and "/".
* are case sensitive.
* must be unique within a particular Swath structure.
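The field-name rules above can be checked before a field is defined. The following is a minimal sketch, not part of the HDF-EOS5 library: the function name is hypothetical, and because the slide's punctuation is ambiguous, the forbidden set is assumed here to be comma, semicolon, double quote, and slash. Rejecting empty names is also an assumption.

```python
MAX_FIELD_NAME_LENGTH = 64
# Assumed reading of the slide: comma, semicolon, double quote, and slash.
FORBIDDEN_CHARS = set(',;"/')


def is_valid_field_name(name, existing_names=()):
    """Return True if `name` satisfies the field-name rules listed above.

    `existing_names` holds the names already used in the same Swath;
    the comparison is case sensitive, so "Temp" and "temp" may coexist.
    """
    if not name or len(name) > MAX_FIELD_NAME_LENGTH:
        return False
    if any(ch in FORBIDDEN_CHARS for ch in name):
        return False
    return name not in existing_names


print(is_valid_field_name("Temperature"))
print(is_valid_field_name("Band/1"))
print(is_valid_field_name("temp", existing_names=["Temp"]))
```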
Page 9
Compression Codes
Compression Code         Value   Explanation
HDFE_COMP_NONE           0       No Compression
HDFE_COMP_RLE            1       Run Length Encoding Compression (not supported)
HDFE_COMP_NBIT           2       NBIT Compression
HDFE_COMP_SKPHUFF        3       Skipping Huffman (not supported)
HDFE_COMP_DEFLATE        4       gzip Compression
HDFE_COMP_SZIP_CHIP      5       szip Compression, Compression exactly as in hardware
HDFE_COMP_SZIP_K13       6       szip Compression, allowing k split = 13 Compression
HDFE_COMP_SZIP_EC        7       szip Compression, entropy coding method
HDFE_COMP_SZIP_NN        8       szip Compression, nearest neighbor coding method
HDFE_COMP_SZIP_K13orEC   9       szip Compression, allowing k split = 13 Compression, or entropy coding method
For compression, the data storage must be CHUNKED first
Page 10
Compression Codes
Compression Code              Value   Explanation
HDFE_COMP_SZIP_K13orNN        10      szip Compression, allowing k split = 13 Compression, or nearest neighbor coding method
HDFE_COMP_SHUF_DEFLATE        11      shuffling + deflate(gzip) Compression
HDFE_COMP_SHUF_SZIP_CHIP      12      shuffling + Compression exactly as in hardware
HDFE_COMP_SHUF_SZIP_K13       13      shuffling + allowing k split = 13 Compression
HDFE_COMP_SHUF_SZIP_EC        14      shuffling + entropy coding method
HDFE_COMP_SHUF_SZIP_NN        15      shuffling + nearest neighbor coding method
HDFE_COMP_SHUF_SZIP_K13orEC   16      shuffling + allowing k split = 13 Compression, or entropy coding method
HDFE_COMP_SHUF_SZIP_K13orNN   17      shuffling + allowing k split = 13 Compression, or nearest neighbor coding method
For compression, the data storage must be CHUNKED first
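The compression codes from the two slides above can be collected in one lookup. This is only a sketch in Python: the names and values are taken from the slides, while the class and the two helper functions are hypothetical and not part of the HDF-EOS5 library.

```python
from enum import IntEnum


class CompressionCode(IntEnum):
    """Compression codes as listed on the slides."""
    HDFE_COMP_NONE = 0
    HDFE_COMP_RLE = 1
    HDFE_COMP_NBIT = 2
    HDFE_COMP_SKPHUFF = 3
    HDFE_COMP_DEFLATE = 4
    HDFE_COMP_SZIP_CHIP = 5
    HDFE_COMP_SZIP_K13 = 6
    HDFE_COMP_SZIP_EC = 7
    HDFE_COMP_SZIP_NN = 8
    HDFE_COMP_SZIP_K13orEC = 9
    HDFE_COMP_SZIP_K13orNN = 10
    HDFE_COMP_SHUF_DEFLATE = 11
    HDFE_COMP_SHUF_SZIP_CHIP = 12
    HDFE_COMP_SHUF_SZIP_K13 = 13
    HDFE_COMP_SHUF_SZIP_EC = 14
    HDFE_COMP_SHUF_SZIP_NN = 15
    HDFE_COMP_SHUF_SZIP_K13orEC = 16
    HDFE_COMP_SHUF_SZIP_K13orNN = 17


# The slides mark these two codes as "not supported".
UNSUPPORTED = {CompressionCode.HDFE_COMP_RLE, CompressionCode.HDFE_COMP_SKPHUFF}


def uses_shuffle(code):
    """True for the codes that apply shuffling before compressing."""
    return CompressionCode(code).name.startswith("HDFE_COMP_SHUF_")


def is_supported(code):
    """False for the codes the slides mark as not supported."""
    return CompressionCode(code) not in UNSUPPORTED


print(CompressionCode(11).name, uses_shuffle(11), is_supported(1))
```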
Page 11
Swath Structure
• Dimension maps are the glue that holds the
SWATH together. They define the relationship
between data fields and geolocation fields by
defining, one-by-one, the relationship of each
dimension of each geolocation field with the
corresponding dimension in each data field.
[Figure: A “Normal” Dimension Map. Geolocation Dimension 0-9, Data Dimension 0-19. Mapping Offset: 1, Increment: 2]
[Figure: A “Backwards” Dimension Map. Geolocation Dimension 0-19, Data Dimension 0-9. Mapping Offset: -1, Increment: -2]
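The two dimension maps in the figures can be expressed as simple index arithmetic. The sketch below is one reading of the figures, not the library's implementation: it assumes that in a "normal" map each geolocation index i lines up with data index offset + i * increment, and that in a "backwards" map the roles are swapped and the magnitudes of the negative offset and increment are used. Both function names are hypothetical.

```python
def data_index_for_geo(geo_index, offset, increment):
    """'Normal' map: data index that lines up with a geolocation index."""
    if offset < 0 or increment <= 0:
        raise ValueError("a normal map needs offset >= 0 and increment > 0")
    return offset + geo_index * increment


def geo_index_for_data(data_index, offset, increment):
    """'Backwards' map: geolocation index that lines up with a data index."""
    if offset > 0 or increment >= 0:
        raise ValueError("a backwards map needs offset <= 0 and increment < 0")
    return -offset + data_index * -increment


# "Normal" figure: 10 geolocation elements, 20 data elements, offset 1, increment 2.
print([data_index_for_geo(i, 1, 2) for i in range(10)])
# "Backwards" figure: 20 geolocation elements, 10 data elements, offset -1, increment -2.
print([geo_index_for_data(i, -1, -2) for i in range(10)])
```

Under these assumptions both calls produce the odd indices 1 through 19, which matches the spacing drawn in the two figures.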
Page 12
Grid Structure
• Usage - Data which is organized
by regular geographic spacing,
specified by projection parameters.
• Structure
– Any number of 2-D to 8-D data arrays per structure
– Geolocation information contained in projection formula,
coupled by structural metadata.
– Any number of Grid structures per file allowed.
Page 13
Grid Structure
• A grid contains grid corner
locations and a set of
projection equations (or
references to them) along with
their relevant parameters.
• The equations and parameters
can be used to compute the
latitude and longitude for any
point in the grid.
A Data Field in a Mercator-Projected Grid
• Important features of a Grid
data set: the data fields, the
dimensions, and the projection
A Data Field in an Interrupted Goode’s
Homolosine-Projected Grid
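As an illustration of how corner locations and dimensions locate a grid cell, the sketch below computes the center of a cell in projected X-Y coordinates by linear interpolation between the upper-left and lower-right corners. It covers only that first step: converting X-Y to latitude and longitude requires the inverse projection equations, which are not shown. The function name and the example corner values are hypothetical, and the sketch assumes rows run along YDim, columns along XDim, and that the corners mark the outer edges of the grid.

```python
def cell_center_xy(row, col, upper_left, lower_right, xdim, ydim):
    """Projected (x, y) of the center of grid cell (row, col).

    upper_left and lower_right are (x, y) corner coordinates in
    projection units; xdim and ydim are the grid dimensions.
    """
    if not (0 <= row < ydim and 0 <= col < xdim):
        raise IndexError("cell index outside the grid")
    ulx, uly = upper_left
    lrx, lry = lower_right
    cell_width = (lrx - ulx) / xdim
    cell_height = (lry - uly) / ydim  # negative when y decreases downward
    return (ulx + (col + 0.5) * cell_width,
            uly + (row + 0.5) * cell_height)


# Hypothetical 5 x 7 grid spanning x in [0, 100] and y in [70, 0].
print(cell_center_xy(0, 0, (0.0, 70.0), (100.0, 0.0), xdim=5, ydim=7))
print(cell_center_xy(6, 4, (0.0, 70.0), (100.0, 0.0), xdim=5, ydim=7))
```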
Page 14
Grid Structure
Data Field characteristics:
−Fields may have up to 8 dims
− Dim order in field definitions:
- C: “Band, YDim, XDim”
- Fortran: “XDim, YDim, Band”
− Compression is selectable at the
field level within a Grid.
Subsequent use of the library will
un-compress the file. Data needs
to be tiled before the compression
is applied.
− Field names must be unique within a particular Grid structure and are
case sensitive. They may be up to 64 characters in length.
− Any character can be used with the exception of, ",", ";", " and "/".
Page 15
Grid Structure
Dimensions:
• Two predefined dimensions
for Data Fields: “XDim” and
“YDim”.
- defined when the grid is
created
- stored in the structure
metadata.
- relate data fields to each
other and to the geolocation
information
• Fields are two- to eight-dimensional.
Many fields will need no more than three dimensions:
the predefined dimensions “XDim” and “YDim”
and a third dimension for depth, height, or band.
Page 16
Grid Structure
• Projection:
− Is the heart of the Grid structure.
− Provides a convenient way to encode geolocation information as
a set of mathematical equations, capable of transforming Earth
coordinates (lat/long) to X-Y coordinates on a sheet of paper
− General Coordinate Transformation Package (GCTP) library
contains all projection related conversions and calculations.
− Supported projections:
Geographic
Mercator
Transverse Mercator
Cylindrical Equal Area
Hotine Oblique Mercator
Sinusoidal*
Integerized Sinusoidal
Polar Stereographic
Lambert Azimuthal Equal Area
Polyconic
Albers Conical Equal Area
Universal Transverse Mercator
Space Oblique Mercator
Interrupted Goode’s Homolosine
Lambert Conformal Conic
* Sinusoidal is pseudocylindrical
Page 17
HDF-EOS Point Structure
• Data is specified temporally and/or spatially, but with no
particular organization
• Structure
– Tables used to store science
data at a particular
Lat/Long/Height
– Up to eight levels of
data allowed. Structural
metadata specifies
relationship between levels.
Station       Lat     Lon
Chicago       41.49   -87.37
Los Angeles   34.03   -118.14
Washington    38.50   -77.00
Miami         25.45   -80.11

Time   Temp(C)
0800   -3
0900   -2
1000   -1
0800   20
0900   21
1000   22
1100   24
1000   6
1100   8
1200   9
1300   11
1400   12
0600   15
0700   16
Page 18
Point Structure
• Made up of a series of data records taken at [possibly]
irregular time intervals and at scattered geographic locations
• Loosely organized form of geolocated data supported by
HDF-EOS
• Levels are linked by a common field name called LinkField
• Usually shared info is stored in the Parent level,
while data values are stored in the Child level
• The values for the LinkField in the Parent
level must be unique
Lat     Lon       Temp
61.12   -149.48   15.00
45.31   -122.41   17.00
38.50   -77.00    24.00
38.39   -90.15    27.00
30.00   -90.05    22.00
37.45   -122.26   25.00
18.00   -76.45    27.00
43.40   -79.23    30.00
34.03   -118.14   25.00
32.45   -96.48    32.00
33.30   -112.00   30.00
42.15   -71.07    28.00
35.05   -106.40   30.00
34.12   -77.56    28.00
46.32   -87.25    30.00
47.36   -122.20   32.00
39.44   -104.59   31.00
21.25   -78.00    28.00
44.58   -93.15    32.00
41.49   -87.37    28.00
25.45   -80.11    19.00
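The parent/child relationship between levels can be pictured as a join on the LinkField. The sketch below uses plain Python dictionaries rather than the HDF-EOS5 Point interface; the function name, the "Station" LinkField, and the sample records are all illustrative assumptions.

```python
def link_levels(parent_records, child_records, link_field):
    """Attach each child record to its parent via a shared LinkField.

    Raises ValueError if a LinkField value repeats in the parent level,
    since parent values must be unique.
    """
    parents = {}
    for record in parent_records:
        key = record[link_field]
        if key in parents:
            raise ValueError(f"duplicate {link_field} in parent level: {key!r}")
        parents[key] = record
    # Child values override parent values only if field names collide.
    return [{**parents[child[link_field]], **child} for child in child_records]


# Illustrative records: shared info in the parent level, data values in the child level.
stations = [
    {"Station": "Chicago", "Lat": 41.49, "Lon": -87.37},
    {"Station": "Miami", "Lat": 25.45, "Lon": -80.11},
]
observations = [
    {"Station": "Chicago", "Time": "0800", "Temp": -3},
    {"Station": "Chicago", "Time": "0900", "Temp": -2},
    {"Station": "Miami", "Time": "0600", "Temp": 15},
]

for row in link_levels(stations, observations, "Station"):
    print(row)
```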
Page 19
Point Structure
• Point structure groups are
created when the user creates
“Point_1”, …..
• Data and Linkage groups are
created automatically when the
level is defined
• The order in which the levels
are defined determines the (0-based) level index
• The FWDPOINTER Linkage will
not be set (actually, the first one is
set to (-1,-1)) if the records in the
Child level are not monotonic in
LinkField
• A level can contain any
number of fields and records
[Figure: point group hierarchy. Labels: “POINTS” group; “Point_1” ... “Point_n”; Data; Linkage; FWDPOINTER; BCKPOINTER; Level Data; Object Attribute <SwathName>:<AttrName>; Group Attribute <SwathName>:<AttrName>; Local Attribute <SwathName>:<AttrName>. Legend: HDF5 Group]
Page 20
Zonal Average (ZA) Structure
• Generalized array structure
with no geolocation linkage
(basically a swath-like
structure without geolocation).
• The interface is designed to
support data that is not
associated with specific
geolocation information.
• Data can be organized by time
or track parameter
• Data spacing can be irregular
• Structure
– Data stored in
multidimensional arrays
– Time stored in 1-D or 2-D
array
[Figure: ZA group hierarchy. Legend: HDF5 Group]
“ZAS” group
   “Za_1” ... “Za_n”   (Object Attribute <SwathName>:<AttrName>)
      Data Fields   (Group Attribute <DataFields>:<AttrName>)
         Data Field.n   (Local Attribute <FieldName>:<AttrName>)
Page 21
“h5dump” output of a simple
HDF-EOS5 file
HDF5 "Grid.he5" {
GROUP "/" {
   GROUP "HDFEOS" {
      GROUP "ADDITIONAL" {
         GROUP "FILE_ATTRIBUTES" {
         }
      }
      GROUP "GRIDS" {
         GROUP "TMGrid" {
            GROUP "Data Fields" {
               DATASET "Voltage" {
                  DATATYPE H5T_IEEE_F32BE
                  DATASPACE SIMPLE { ( 5, 7 ) / ( 5, 7 ) }
                  DATA {
                  (0,0): -1.11111,-1.11111,-1.11111,-1.11111,-1.11111,
                  (0,5): -1.11111,-1.11111,
                  ………………………………..
                  (4,0): -1.11111,-1.11111,-1.11111,-1.11111,-1.11111,
                  (4,5): -1.11111,-1.11111
                  }
                  ATTRIBUTE "_FillValue" {
                     DATATYPE H5T_IEEE_F32BE
                     DATASPACE SIMPLE { ( 1 ) / ( 1 ) }
                     DATA {
                     (0): -1.11111
                     }
                  }
               }
            }
         }
      }
   }
   GROUP "HDFEOS INFORMATION" {
      ATTRIBUTE "HDFEOSVersion" {
         DATATYPE H5T_STRING {
            STRSIZE 32;
            STRPAD H5T_STR_NULLTERM;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         }
         DATASPACE SCALAR
         DATA {
         (0): "HDFEOS_5.1.10"
         }
      }
      DATASET "StructMetadata.0" {
         DATATYPE H5T_STRING {
            STRSIZE 32000;
            STRPAD H5T_STR_NULLTERM;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         }
         DATASPACE SCALAR
         DATA {
         (0): "GROUP=SwathStructure
              END_GROUP=SwathStructure
              GROUP=GridStructure
                 GROUP=GRID_1
                    GridName="TMGrid"
                    XDim=5
                    YDim=7
                    UpperLeftPointMtrs=(4855670.775390,9458558.924830)
                    LowerRightMtrs=(5201746.439830,-10466077.249420)
                    Projection=HE5_GCTP_TM
                    ProjParams=(0,0,0.999600,0,-75000000,0,5000000, 0,0,0,0,0,0)
                    SphereCode=0
                    GROUP=Dimension
                       OBJECT=Dimension_1
                          DimensionName="Time"
                          Size=10
                       END_OBJECT=Dimension_1
                       OBJECT=Dimension_2
                          DimensionName="Unlim"
                          Size=-1
                       END_OBJECT=Dimension_2
                    END_GROUP=Dimension
                    GROUP=DataField
                       OBJECT=DataField_1
                          DataFieldName="Voltage"
                          DataType=H5T_NATIVE_FLOAT
                          DimList=("XDim","YDim")
                          MaxdimList=("XDim","YDim")
                       END_OBJECT=DataField_1
                    END_GROUP=DataField
                    GROUP=MergedFields
                    END_GROUP=MergedFields
                 END_GROUP=GRID_1
              END_GROUP=GridStructure
              GROUP=PointStructure
              END_GROUP=PointStructure
              GROUP=ZaStructure
              END_GROUP=ZaStructure
              END
              "
         }
      }
   }
}
}
Page 26
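The StructMetadata.0 text shown in the dump is a nested list of GROUP, OBJECT, and key=value lines, so its structure can be recovered with a small stack-based reader. The sketch below handles only the simple subset visible in the dump (one statement per line, no continuation lines); the function name is hypothetical, and the sample string is a shortened excerpt of the dump.

```python
def parse_struct_metadata(text):
    """Parse GROUP/OBJECT/key=value lines into nested dictionaries."""
    root = {}
    stack = [root]
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line == "END":
            break
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if key in ("GROUP", "OBJECT"):
            child = {}
            stack[-1][value] = child  # open a nested block
            stack.append(child)
        elif key in ("END_GROUP", "END_OBJECT"):
            stack.pop()               # close the current block
        else:
            stack[-1][key] = value.strip('"')  # values are kept as strings
    return root


SAMPLE = """GROUP=GridStructure
GROUP=GRID_1
GridName="TMGrid"
XDim=5
YDim=7
Projection=HE5_GCTP_TM
GROUP=DataField
OBJECT=DataField_1
DataFieldName="Voltage"
DimList=("XDim","YDim")
END_OBJECT=DataField_1
END_GROUP=DataField
END_GROUP=GRID_1
END_GROUP=GridStructure
END
"""

grid = parse_struct_metadata(SAMPLE)["GridStructure"]["GRID_1"]
print(grid["GridName"], grid["XDim"], grid["YDim"], grid["Projection"])
print(grid["DataField"]["DataField_1"]["DimList"])
```

Values are left as strings on purpose; interpreting tuples such as DimList or numeric parameters is a separate step that depends on the field.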