Metadata - DataONE

Lesson 7: Metadata
What is Metadata
• Definition of metadata
• Examine information included in a metadata record
• Examples of metadata standards and how to choose
• Illustrate the value of metadata to data users, data providers, and organizations
• Describe the utility of metadata for a variety of scenarios
After completing this lesson, the participant will be able to:
• Define science metadata
• Give examples of metadata that you are likely to encounter
in the ‘real world’ (i.e., outside of a research context)
• Identify and list the types of information typically included
in metadata records for environmental datasets
• Identify 3 reasons metadata is of value to data users, data
developers, and organizations
• List 3 uses for metadata, beyond discovery of data
• Identify and describe factors that may determine which
metadata standards are most appropriate for a given
dataset
[Figure: the data life cycle: Plan, Collect, Assure, Describe, Preserve, Discover, Integrate, Analyze]
Average Temperature of Observation for Each Species

Species | Average Temperature | Temperature Standard Deviation | Number of Observations | Minimum Temperature | Maximum Temperature
Northern Red-legged Frog | 4.4 | --- | 1 | 4.4 | 4.4
Tailed Frog | 7.0 | 3.0 | 3 | 4 | 10
Arizona Toad | 10.0 | --- | 1 | 10 | 10
Strecker's Chorus Frog | 10.5 | 2.0 | 11 | 9 | 16
Oregon Spotted Frog | 11.0 | 15.5 | 2 | 0 | 22
New Jersey Chorus Frog | 11.5 | 4.5 | 17 | 3 | 22
Wood Frog | 12.5 | 5.5 | 897 | 0 | 28.8
Spring Peeper | 13.2 | 5.6 | 569 | -1 | 32
Red-legged Frog | 13.3 | 5.9 | 16 | 4 | 27
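The summary columns can be checked against raw observations when those are available. The sketch below is a minimal illustration, assuming the two Oregon Spotted Frog observations were exactly the reported minimum and maximum (0 and 22), which must hold when there are only two observations. It also suggests why rows with a single observation show "---" for the standard deviation. The table states no units, so none are assumed here.

```python
import statistics

# Assumption: with only two observations, they must equal the reported
# minimum (0) and maximum (22) for the Oregon Spotted Frog row.
oregon_spotted_frog = [0, 22]

mean_temp = statistics.mean(oregon_spotted_frog)
sd_temp = statistics.stdev(oregon_spotted_frog)  # sample standard deviation (n - 1)

print(f"mean = {mean_temp:.1f}, sd = {sd_temp:.2f}, n = {len(oregon_spotted_frog)}")

# A single observation has no sample standard deviation, which is
# consistent with the "---" entries for rows where n = 1.
try:
    statistics.stdev([4.4])
    single_obs_sd = "defined"
except statistics.StatisticsError:
    single_obs_sd = "---"

print(f"sd for one observation: {single_obs_sd}")
```

The mean matches the table's 11.0, and the standard deviation comes out near 15.56, which the table appears to report as 15.5. Without metadata, a reader cannot tell which formula, rounding rule, or units were used.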
• Definition of a dataset: a collection of data
• Generally datasets can be defined as:
o Spatial – a collection of logically related features arranged in a
prescribed manner such as GIS map layers, water features, etc
o Tabular – a file, spreadsheet, data in a table
o Many tabular datasets are inherently “spatial”, e.g. water-quality
samples associated with stream collection sites
• Elements in a dataset can include:
o Values, measures, points, coordinates, conditions, qualities,
frequencies, or attributes that are a result of an observational study
• When you provide data to someone else, what types of
information would you want to include with the data?
• When you receive a dataset from an external source, what
types of details do you want to know about the data?
• Providing data:
o Why were the data created?
o What limitations, if any, do the data have?
o What do the data mean?
o How should the data be cited if they are re-used in a new study?
• Receiving data:
o What are the data gaps?
o What processes were used for creating the data?
o Are there any fees associated with the data?
o In what scale were the data created?
o What do the values in the tables mean?
o What software do I need in order to read the data?
o What projection are the data in?
o Can I give these data to someone else?
• WHO created the data?
• WHAT is the content of the data?
• WHEN were the data created?
• WHERE is it geographically?
• HOW were the data developed?
• WHY were the data developed?
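The six questions can be treated as a simple completeness checklist. The sketch below is illustrative only: the field names (`who`, `what`, and so on) and the sample values are hypothetical and do not come from any metadata standard.

```python
# Hypothetical field names: one per question listed above.
REQUIRED_QUESTIONS = ("who", "what", "when", "where", "how", "why")


def missing_answers(record):
    """Return the questions that the record leaves unanswered."""
    return [q for q in REQUIRED_QUESTIONS if not str(record.get(q, "")).strip()]


# Illustrative values only; this is not a real dataset.
record = {
    "who": "J. Doe, Example Field Station",
    "what": "Air temperature at time of frog observation",
    "when": "2010-03-01 to 2010-06-30",
    "where": "",  # left blank on purpose
    "how": "Handheld thermometer, one reading per sighting",
}

print(missing_answers(record))  # ['where', 'why']
```

A check like this only confirms that an answer exists; whether the answer is accurate and useful still requires human review.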
Metadata is: Data ‘reporting’
Author(s) Boullosa, Carmen.
Title(s) They're cows, we're pigs /
by Carmen Boullosa
Place New York : Grove Press, 1997.
Physical Descr viii, 180 p ; 22 cm.
Subject(s) Pirates Caribbean Area Fiction.
Format Fiction
• Metadata is all around…
• A standard provides a structure to describe data with:
o Common terms to allow consistency between records
o Common definitions for easier interpretation
o Common language for ease of communication
o Common structure to quickly locate information
• In search and retrieval, standards provide:
o Documentation structure in a reliable and predictable format for computer interpretation
o A uniform summary description of the dataset
[Figure: Metadata helps… (recovered labels: Data users, Organizations)]
Even if the value of data documentation is recognized,
concerns remain as to the effort required to create metadata
that effectively describe the data.
Concern | Solution
workload required to capture accurate, robust metadata | incorporate metadata creation into data development process – distribute the effort
time and resources to create, manage, and maintain metadata | include in grant budget and schedule
readability / usability of metadata | use a standardized metadata format
discipline specific information and ontologies | ‘profile’ standard to require specific information and use specific values
• Metadata allows data developers to:
o Avoid data duplication
o Share reliable information
o Publicize efforts – promote the work of a scientist and his/her contributions to a field of study
• Metadata gives a user the ability to:
o Search, retrieve, and evaluate dataset information from both inside and outside an organization
o Find data: Determine what data exists for a geographic location and/or topic
o Determine applicability: Decide if a dataset meets a particular need
o Discover how to acquire the dataset you identified; process and use the dataset
• Metadata helps ensure an organization’s
investment in data:
o Documentation of data processing steps, quality
control, definitions, data uses, and restrictions
o Ability to use data after initial intended purpose
o Offers data permanence
o Creates institutional memory
• Advertises an organization’s research:
o Creates possible new partnerships and
collaborations through data sharing
• Transcends people and time:
[Figure: data details lost over time, starting at the time of data development (From Michener et al 1997)]
o Specific details about problems with individual items or specific dates are lost relatively rapidly
o General details about datasets are lost through time
o Retirement or career change makes access to “mental storage” difficult or unlikely
o Accident or technology change may make data unusable
o Loss of data developer leads to loss of remaining information
Sound information management, including metadata development, can arrest the loss of dataset detail.
• Metadata can support:
o data distribution
o data management
o project management
• If it is:
o considered a component of the data
o created during data development
o populated with rich content
[Figure: project workflow with "meta" attached to each stage; recovered labels: PLAN, collect, classify, derive, planimetric, imagery, analysis, charette, alternative, committee, review]
• The descriptive content of the metadata file can be used to
identify, assess, and access available data resources.
IDENTIFY
• keywords
• geographic location
• time period
• attributes
ASSESS
• use constraints
• access constraints
• data quality
• availability/pricing
ACCESS
• online access
• order process
• contacts
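The identify, assess, and access steps can be sketched as a small search over a metadata collection. This is a minimal illustration under stated assumptions: the `MetadataRecord` class, its field names, the bounding-box convention, and the sample records (including the example.org link) are all hypothetical, and real standards define many more elements.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataRecord:
    # Hypothetical, simplified fields for illustration.
    title: str
    keywords: set = field(default_factory=set)
    bbox: tuple = (0.0, 0.0, 0.0, 0.0)  # (west, south, east, north) in degrees
    years: tuple = (0, 0)               # (first year, last year)
    use_constraints: str = ""
    online_link: str = ""


def overlaps(a, b):
    """True if two (west, south, east, north) boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def identify(records, keyword, bbox, years):
    """IDENTIFY: filter by keyword, geographic location, and time period."""
    return [
        r for r in records
        if keyword.lower() in {k.lower() for k in r.keywords}
        and overlaps(r.bbox, bbox)
        and r.years[0] <= years[1] and years[0] <= r.years[1]
    ]


catalog = [
    MetadataRecord("Frog temperature observations", {"amphibians", "temperature"},
                   (-124.5, 42.0, -116.5, 46.3), (2005, 2010),
                   "None", "http://example.org/frogs"),
    MetadataRecord("Stream water quality", {"water quality"},
                   (-90.0, 30.0, -80.0, 35.0), (1990, 1995)),
]

hits = identify(catalog, "Amphibians", (-123.0, 44.0, -122.0, 45.0), (2008, 2009))
for r in hits:
    # ASSESS and ACCESS: read the constraints, then follow the link.
    print(r.title, "| constraints:", r.use_constraints, "| access:", r.online_link)
```

The search works only because every record fills the same fields in the same way, which is the practical argument for using a metadata standard.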
• A metadata collection can be published to the internet via:
o website catalog
o web accessible folder (WAF)
o Z39.50 metadata clearinghouse
o metadata service
o geospatial data portal
[Figure: a user query reaches the metadata collection over the Internet or an intranet; the metadata collection points to the dataset]
• Examples of metadata search portals:
o Data.gov
• Federal e-gov geospatial data portal
• http://www.geo.data.gov
o Metacat
• Repository for data and metadata
• http://knb.ecoinformatics.org/index.jsp
o US Geological Survey
• USGS Core Science Metadata Clearinghouse: http://mercury.ornl.gov/clearinghouse
o ArcGIS Online
• ESRI sponsored national geospatial data portal
• http://www.geographynetwork.com
• Metadata records can be used to track data provenance and accuracy
• Data Maintenance:
o Are the data current?
• Do we have data older than ten years?
• Were the data collected before some political or geophysical event that resulted in significant change?
o Are the data valid?
• Were the data created prior to the most current source data?
• Were the data created prior to the most current methodologies?
• Data Update:
o Contact information
o Distribution policies, availability, pricing, URLs
o New derivations of the dataset
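The data-currency questions lend themselves to an automated check over metadata records. The sketch below is one possible approach, not a prescribed method: the function name, the ten-year default (taken from the question above), and the sample flood event are assumptions made for illustration.

```python
from datetime import date


def needs_review(last_updated, today, max_age_years=10, change_events=()):
    """Return the reasons a dataset may no longer be current."""
    reasons = []
    age_years = (today - last_updated).days / 365.25
    if age_years > max_age_years:
        reasons.append(f"older than {max_age_years} years")
    for name, event_date in change_events:
        # Data last updated before a significant event may be out of date.
        if last_updated < event_date:
            reasons.append(f"predates {name}")
    return reasons


# Hypothetical event used only to exercise the check.
events = [("2005 flood", date(2005, 8, 29))]

print(needs_review(date(2000, 5, 1), date(2012, 11, 12), change_events=events))
print(needs_review(date(2010, 1, 1), date(2012, 11, 12), change_events=events))
```

A check like this depends on the metadata carrying a reliable date, which is one reason to keep records actively maintained.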
If you create metadata, you can find your own data
If you create metadata, other people can discover your data
• Find your data by:
o themes / attributes
o geographic location
o time ranges
o analytical methods used
o sources and contributors
o data quality
Discoverable data is usable data!
• Metadata allows you to repeat a scientific process if:
o methodologies are defined
o variables are defined
o analytical parameters are defined
• Metadata allows you to defend your scientific process:
o demonstrate process
o increasingly GIS-savvy public requires metadata for consumer information
• Metadata is an exercise in data accountability. It requires
you to assess:
o What do you know about the dataset?
o What don’t you know about the dataset?
o What should you know about the dataset?
Are you willing to associate yourself with the metadata record?
• Metadata is a declaration of:
Purpose (what to do…)
o the originator’s intended application of the data
Use Constraints (what not to do…)
o inappropriate applications of the data
Completeness
o features or geographies excluded from the data
Distribution Liability
o explicit liability of the data producer and assumed liability of the consumer
Project Coordination
• Metadata records can serve as a project design
document:
o descriptions & intent of project
o geographic and temporal extent of project
o source data of project
o attribute requirements of project
• Benefits:
o expectations are clearly outlined
o metadata is integrated into the process
o provides a medium to record progress
• Use metadata to monitor:
o data development status
o QA/QC assessments
o needed changes in approach
Monitoring requires that the metadata be actively maintained and reviewed!
[Figure: milestones over time]
• Metadata can be a means to improve communications among project participants using common:
o descriptions & parameters
o keywords, vocabularies, thesauri
o contact information
o attributes
o distribution information
• If reviewed regularly by all participants, metadata created
early and updated during the project improves opportunity
for coordinating:
o source data
o analytical methods
o new information
• As a key component of the data, metadata should be part of any data deliverable
• For quality metadata from a deliverable, the record should provide:
o Citation information
o Data quality information
o Accurate geospatial information
o Clearly defined entities and attributes
o Distribution information
• Dublin Core Element Set
o Emphasis on web resources, publications
o http://dublincore.org/documents/dces/
• FGDC Content Standard for Digital Geospatial Metadata (CSDGM)
o Emphasis on geospatial data
o Biological Data Profile (BDP) of the CSDGM: a profile of the CSDGM with emphasis on biological (and geospatial) data
o http://www.fgdc.gov/metadata/geospatial-metadata-standards
• ISO 19115/19139 Geographic information: Metadata
o Emphasis on geospatial data and services
o http://www.fgdc.gov/metadata/geospatial-metadata-standards#fgdcendorsedisostandards
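To make a standard concrete, the library catalog record shown earlier in this lesson can be expressed with Dublin Core elements. The sketch below is a minimal illustration: the wrapper element name `metadata` and the choice of which catalog field maps to which Dublin Core element are assumptions, while the element names themselves (`creator`, `title`, `publisher`, `date`, `format`, `subject`) and the namespace come from the Dublin Core Element Set.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build a wrapper element holding one dc:* child per (name, value) pair."""
    root = ET.Element("metadata")  # wrapper name is an assumption
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


# Values taken from the catalog record; the field mapping is illustrative.
book = [
    ("creator", "Boullosa, Carmen"),
    ("title", "They're cows, we're pigs"),
    ("publisher", "Grove Press"),
    ("date", "1997"),
    ("format", "viii, 180 p ; 22 cm"),
    ("subject", "Pirates Caribbean Area Fiction"),
]

root = dublin_core_record(book)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Because every record uses the same element names, software can locate the title or creator without knowing anything else about the resource.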
• Ecological Metadata Language (EML)
o Focus on ecological data
o http://knb.ecoinformatics.org/eml_metadata_guide.html
• Darwin Core
o Emphasis on museum specimens
o http://rs.tdwg.org/dwc/index.htm
• Geography Markup Language (GML)
o Emphasis on geographic features (roads, highways, bridges)
o http://www.opengeospatial.org/standards/gml
EML | FGDC
Title | Title
Abstract | Abstract
Entity Description | Entity Type Definition
Intellectual Rights | Use Constraints
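A comparison like the one above is often called a crosswalk, and it can be applied mechanically. The sketch below uses only the four field pairs listed here; the labels are the display names from this lesson, not the XML element names of either standard, and the function name and sample record are hypothetical.

```python
# Crosswalk taken from the comparison above: EML label -> FGDC label.
EML_TO_FGDC = {
    "Title": "Title",
    "Abstract": "Abstract",
    "Entity Description": "Entity Type Definition",
    "Intellectual Rights": "Use Constraints",
}


def eml_to_fgdc(eml_record):
    """Rename known EML labels to FGDC labels; return (converted, unmapped)."""
    converted, unmapped = {}, {}
    for name, value in eml_record.items():
        if name in EML_TO_FGDC:
            converted[EML_TO_FGDC[name]] = value
        else:
            unmapped[name] = value
    return converted, unmapped


# Illustrative record; "Taxonomic Coverage" has no mapping in this crosswalk.
sample = {
    "Title": "Frog observations",
    "Intellectual Rights": "No restrictions",
    "Taxonomic Coverage": "Anura",
}

converted, unmapped = eml_to_fgdc(sample)
print(converted)
print(unmapped)
```

Returning the unmapped fields separately makes it visible when information would be lost in translation between standards.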
• Many standards collect similar information
• Factors to consider:
o Your data type:
• Are you working mainly with GIS data? Raster/vector or point
data? Do you have biological or shoreline information in your
dataset?
- Consider the FGDC Content Standard for Digital Geospatial Metadata
with one of its profiles: the Biological Data Profile or the Shoreline Data
Profile.
• Are you working with data retrieved from instruments such as
monitoring stations or satellites? Are you using geospatial data
services such as web-mapping applications or data modeling?
- If so, then consider using the ISO 19115-2 standard
• Are you mainly working with ecological data?
- Consider Ecological Metadata Language (EML)
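The data-type questions above can be read as a small decision rule. The sketch below encodes only the suggestions given in this lesson; the function name and parameter names are hypothetical, and a real choice would also weigh organizational policy and available tools.

```python
def suggest_standards(gis=False, biological=False, shoreline=False,
                      instruments_or_services=False, ecological=False):
    """Return the standards suggested in this lesson for the given data traits."""
    suggestions = []
    if gis:
        if biological:
            suggestions.append("FGDC CSDGM with the Biological Data Profile")
        if shoreline:
            suggestions.append("FGDC CSDGM with the Shoreline Data Profile")
        if not (biological or shoreline):
            suggestions.append("FGDC CSDGM")
    if instruments_or_services:
        suggestions.append("ISO 19115-2")
    if ecological:
        suggestions.append("Ecological Metadata Language (EML)")
    return suggestions


print(suggest_standards(gis=True, biological=True))
print(suggest_standards(instruments_or_services=True, ecological=True))
```

More than one suggestion can apply to the same dataset, which reflects the point that many standards collect similar information.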
• More factors to consider:
o Your organization’s policies: do they state which standard to use?
o What resources are available to create metadata?
Examples of tools:
• FGDC CSDGM:
- Mermaid (NOAA) http://www.ncddc.noaa.gov/metadata-standards/mermaid/
- Metavist (Forest Service) http://ncrs.fs.fed.us/pubs/viewpub.asp?key=2737
- TKME (USGS) http://geology.usgs.gov/tools/metadata/tools/doc/tkme.html
• EML:
- Morpho (http://knb.ecoinformatics.org/morphoportal.jsp)
• ISO: (http://www.fgdc.gov/metadata/iso-metadata-editor-review)
- XMLSpy or oXygen
- CatMD
o Other factors: Availability of human support; instructional materials;
use of controlled vocabularies; output formats
• Metadata is documentation of data
• A metadata record captures critical information about the content of a dataset
• Metadata allows data to be discovered, accessed, and re-used
• A metadata standard provides structure and consistency to data documentation
• Standards and tools vary – select according to defined criteria such as data type, organizational guidance, and available resources
• Metadata is of critical importance to data developers, data users, and organizations
• Metadata can be effectively used for:
o data distribution
o data management
o project management
• Metadata completes a dataset.
Creating robust metadata is in your OWN best interest!
The full slide deck may be downloaded from:
http://www.dataone.org/education-modules
Suggested citation:
DataONE Education Module: Metadata. DataONE. Retrieved Nov 12, 2012. From
http://www.dataone.org/sites/all/documents/L07_Metadata.pptx
Copyright license information:
No rights reserved; you may enhance and reuse for
your own purposes. We do ask that you provide
appropriate citation and attribution to DataONE.