Statistical Data and Metadata Exchange SDMX Metadata Common Vocabulary Status of project and issues (2004-2005) Marco Pellegrino Eurostat [email protected] Denis Ward OECD [email protected].


Statistical Data and Metadata Exchange
SDMX Metadata Common Vocabulary
Status of project and issues (2004-2005)
Marco Pellegrino
Eurostat
[email protected]
Denis Ward
OECD
[email protected]
Overview
SDMX Metadata Common Vocabulary

Background

Objectives and benefits of MCV

Status of the project

Expected benefits and use
1
Terminology problems...
2
Starting point



No universally accepted metadata framework
Semantics is important to interoperation
- registries contain related and sometimes overlapping
information
- data must be kept updated and synchronized with a
minimum effort
Common understanding on meaning of:
- general concepts (metadata, quality,…)
- basic “atomic” metadata components
3
Components of International Standards
- Business models
- Semantics (content)
- Syntax (e.g. XML)
4
Metadata Common Vocabulary

Ultimate goal of project: to develop a common
understanding of standard metadata items focusing
on descriptions of statistical concepts and
methodologies used by statisticians in the collection,
processing and dissemination of statistical data.

Immediate objective: to develop a Metadata glossary
of those standard components, consistent with
existing international standards and with terminology
being used within SDMX organizations, other
international / national agencies and related projects.
5
Main references for standardisation
ISO/IEC 11179, part 4 (Formulation of data definitions)
- Recommendations for constructing definitions for data and metadata
ISO/IEC 11179, parts 1 and 3 (Metadata registry)
- Definition of main metadata items
Quality glossaries
UN and UN/ECE-CES methodological documents and glossaries (on
metadata modelling, classifications, data editing, …)
SDMX documents (GESMES/TS user guide, ISO framework)
- Definitions of main items for data-metadata exchange
6
Tentative classification of MCV terms
(draft 3rd public release, April 2005)
Specification                           Number of terms     %
Total                                               346
of which:
  Synonyms                                           10
  Definitions                                       339   100
    General statistical terminology                 215    64
      of which: Quality (assessment)                 85    25
    Metamodelling                                    97    29
      of which: ISO                                  83    24
    Data exchange                                    24     7
      of which: GESMES/TS                            17     5
7
Metadata Standard Components
- Administrative, sources
- Concepts, coverage, definitions
- Standards
- Methodology (collection, compilation, ...)
- Quality assessment
Metadata elements describing different elements of statistical production cycle
Unambiguous accepted definition of metadata elements located in a
glossary comprising the Metadata Common Vocabulary
8
Fields of MCV glossary
Title (mandatory)
Definition (mandatory)
Context for the definition (optional, but widely used)
Definition source (mandatory)
Links to related terms within the glossary (optional)
URL to more detailed information (optional)
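The field list above can be read as a small record type. The following is a minimal sketch in Python, assuming hypothetical attribute names (`title`, `definition`, `definition_source`, `context`, `related_terms`, `url`) that are not part of the MCV itself; it only illustrates how the mandatory and optional fields might be enforced.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GlossaryEntry:
    """One glossary entry; attribute names are illustrative assumptions."""
    title: str                      # mandatory
    definition: str                 # mandatory
    definition_source: str          # mandatory
    context: Optional[str] = None   # optional, but widely used
    related_terms: List[str] = field(default_factory=list)  # optional
    url: Optional[str] = None       # optional

    def __post_init__(self) -> None:
        # Reject entries whose mandatory fields are empty or whitespace only.
        for name in ("title", "definition", "definition_source"):
            if not getattr(self, name).strip():
                raise ValueError(f"mandatory field is empty: {name}")


# Example entry built from the "Reference metadata" term of the glossary.
entry = GlossaryEntry(
    title="Reference metadata",
    definition=("Reference metadata describe statistical concepts, "
                "methodologies for the generation of data and "
                "information on data quality."),
    definition_source="Framework for SDMX standards, Version 1.0",
    related_terms=["Metadata, statistical"],
    url="www.sdmx.org",
)
```

Keeping the optional fields as defaults means an entry can be created with only the three mandatory fields and completed later.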
9
Reference Metadata
Reference metadata
Definition: Reference metadata describe statistical concepts, methodologies for the generation of
data and information on data quality.
Source: Statistical Data and Metadata Exchange (SDMX) - BIS, ECB, Eurostat, IBRD, IMF and
OECD, “Framework for SDMX standards”, Version 1.0, First revision December 2004
Hyperlinks: www.sdmx.org, www.sdmx.info
Context: Reference metadata, sometimes generated, collected or disseminated separately from the
data to which they refer, can be relevant to all instances of data described: entire collections of data,
data sets from a given country, or for a data item concerning one country and one year.
Preferably, reference metadata should include all of the following: a) "conceptual" metadata,
describing the concepts used and their practical implementation, allowing users to understand what
the statistics are measuring and, thus, their fitness for use; b) "methodological" metadata, describing
methods used for the generation of the data (e.g. sampling, collection methods, editing processes); c)
"quality" metadata, describing the different quality dimensions of the resulting statistics (e.g.
timeliness, accuracy).
Related term:
Metadata, statistical
10
Accuracy
Accuracy
Definition: Accuracy in the general statistical sense denotes the closeness of computations or
estimates to the exact or true values as contrasted with precision, which refers to reproducibility.
Source: The International Statistical Institute, "The Oxford Dictionary of Statistical Terms", edited by
Yadolah Dodge, Oxford University Press, 2003.
Hyperlinks:
Context: Accuracy refers to the closeness between the estimated value and the (unknown) true value
that the statistics were intended to measure (International Monetary Fund, "Data Quality Assessment
Framework - DQAF Glossary").
Accuracy of data or statistical information is the degree to which those data correctly estimate or
describe the quantities or characteristics that the statistical activity was designed to measure.
Accuracy has many attributes, and in practical terms there is no single aggregate or overall measure
of it. Of necessity, these attributes are typically measured or described in terms of error, or the
potential significance of error, introduced through individual major sources of error, e.g. coverage,
sampling, non-response, response, processing and dissemination (Statistics Canada, "Statistics
Canada Quality Guidelines", 3rd edition, October 1998, page 4, available at
http://www.statcan.ca/english/freepub/12-539-XIE/12-539-XIE.pdf).
Accuracy is the second quality component in the Eurostat Definition.
The third element of the IMF definition of quality is "accuracy and reliability".
Related term:
Quality (Eurostat)
Quality (IMF)
Error, statistical
Reliability (quality)
Error of estimation
Precision
11
Expected benefits and use
The use of MCV terminology would:
• Support standardisation and consistency of metadata
compiled within each institute, when associated with
SDMX standards and “key family” descriptions
• Facilitate comparisons across geographical entities
• Facilitate mapping of different metadata systems, as it
can be used independently of any specific metadata
model
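The mapping benefit can be illustrated with a small sketch. It assumes two hypothetical metadata systems whose local field names (`acc_notes`, `ref_meta`, `exactitude`, `precision_notes`) are invented for illustration; only the MCV term titles come from the glossary. Each system maps its own fields to MCV terms, and the shared term acts as the pivot between them.

```python
from typing import Dict, Optional

# Hypothetical local field names of two metadata systems,
# each mapped to the title of an MCV term.
SYSTEM_A_TO_MCV: Dict[str, str] = {
    "acc_notes": "Accuracy",
    "ref_meta": "Reference metadata",
}
SYSTEM_B_TO_MCV: Dict[str, str] = {
    "exactitude": "Accuracy",
    "precision_notes": "Precision",
}


def translate(field_name: str,
              source: Dict[str, str],
              target: Dict[str, str]) -> Optional[str]:
    """Map a source-system field to a target-system field via its MCV term."""
    term = source.get(field_name)
    if term is None:
        return None  # field is not mapped to any MCV term
    for target_field, target_term in target.items():
        if target_term == term:
            return target_field
    return None  # target system has no field for this term


print(translate("acc_notes", SYSTEM_A_TO_MCV, SYSTEM_B_TO_MCV))
```

With a common vocabulary as the pivot, each system needs one mapping to the MCV rather than a separate mapping to every other system.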
12
Metadata Common Vocabulary
For more info:
SDMX: http://www.sdmx.org
OECD: http://cs3-hq.oecd.org/scripts/stats/glossary/index.htm
CODED: http://forum.europa.eu.int/irc/dsis/coded/info/data/coded/en.htm
CIRCA: http://forum.europa.eu.int/Public/irc/dsis/metadata/library
13
Thanks for your attention
Marco Pellegrino
Eurostat
[email protected]
Denis Ward
OECD
[email protected]
14