Quality reporting within the Eurostat and the ESS metadata systems August Götzfried and Håkan Linden Eurostat Unit B6: Reference databases and metadata.

Current situation
Within the European Statistical System (ESS), reporting on
statistical data quality exists in many statistical domains …
Problem statement
… BUT:
– Quality reports do not exist for all statistical processes within the ESS;
– There is no homogeneity between the different report structures used for data quality reporting;
– Not all quality-related information is made publicly available;
– No common, standard IT infrastructure is used within the ESS.
→ The new Eurostat vision, “Improving the production method of EU statistics”, requires an improvement action.
Progress made since 2008
 2008: introduction of the Euro SDMX Metadata Structure (ESMS) at
Eurostat and the ESS for the production and dissemination of
reference metadata (Commission Recommendation 498/2009)
 01/2009: release of the new version of the ESS quality reporting
documents:
• ESS Standard for Quality Reports (ESQR)
• ESS Handbook for Quality Reports (EHQR)
 Detailed requirements following the European Statistics Code of Practice
 ESS Quality and Performance Indicators (QPI’s) defined
 03/09: EP/Council Regulation 223/2009
 Article 12 defining the quality criteria to be reported
 2009/2010: Development and deployment of the Eurostat Metadata
Handler with:
• EMIS: production and dissemination of ESMS files at Eurostat
• National Reference Metadata Editor (NRME): production, transmission and
dissemination of national ESMS files
ESMS and ESQR
➢ ESMS is oriented more towards the USERS of statistics
  → to help them understand the statistical data released
  → overly detailed information on data quality is not needed
➢ 21 SDMX cross-domain concepts used
➢ ESQR is oriented more towards the PRODUCERS of statistics
  → to monitor the quality of the statistics produced in detail
  → concentrating on the main quality concepts (which are also part of the ESS Statistics Regulation No 223/2009)
However, some information on quality criteria is common to both ESMS and ESQR.
ESMS and ESQR: the starting point
[Diagram: ESQR chapters and ESMS concepts, linked through the quality criteria RELEVANCE, ACCURACY, TIMELINESS, ACCESSIBILITY, CLARITY, COMPARABILITY and COHERENCE]

ESQR chapters:
I. Introduction to the Statistical Process and Its Outputs
II. RELEVANCE
III. ACCURACY
IV. TIMELINESS and PUNCTUALITY
V. ACCESSIBILITY and CLARITY
VI. COMPARABILITY and COHERENCE
VII. Trade-offs between Output Quality Components
VIII. Assessment of User Needs and Perceptions
IX. Performance, Cost and Respondent Burden
X. Confidentiality, Transparency, Security
XI. Conclusions

ESMS concepts:
1. Contact
2. Metadata update
3. Statistical presentation
4. Unit of measure
5. Reference period
6. Institutional mandate
7. Confidentiality
8. Release policy
9. Frequency of dissemination
10. Dissemination format
11. Accessibility of documentation
12. Quality management
13. Relevance
14. Accuracy and reliability
15. Timeliness and punctuality
16. Comparability
17. Coherence
18. Cost and burden
19. Data revision
20. Statistical processing
21. Comment
The new ESQRS
➢ Based on the ESQR, a new report structure, the ESS Standard for Quality Reports Structure (ESQRS), was created to harmonise the reporting on statistical data quality within the ESS.
➢ The ESQRS uses the main statistical data quality criteria listed in EP/Council Regulation 223/2009, which are also part of the ESMS, and details them further:
  • Relevance
  • Accuracy
  • Timeliness and Punctuality
  • Accessibility and Clarity
  • Comparability
  • Coherence
➢ A subset of the Quality and Performance Indicators (QPIs) is also covered in the new ESQRS.
ESQR and ESQRS
[Diagram: ESQR chapters mapped to the ESQRS structure]

ESQR chapters:
I. Introduction to the Statistical Process and Its Outputs
II. RELEVANCE
III. ACCURACY
IV. TIMELINESS and PUNCTUALITY
V. ACCESSIBILITY and CLARITY
VI. COMPARABILITY and COHERENCE
VII. Trade-offs between Output Quality Components
VIII. Assessment of User Needs and Perceptions
IX. Performance, Cost and Respondent Burden
X. Confidentiality, Transparency, Security
XI. Conclusions

ESQRS:
I Contact
II Introduction
III Relevance (user needs and perceptions)
IV Accuracy
V Timeliness and punctuality
VI Accessibility and clarity
VII Comparability
VIII Coherence
IX Comment
The ESQRS
ESQRS code and concept name:

I Introduction
II Relevance (user needs and perceptions)
  II.1 User needs
  II.2 User satisfaction
    II.2.1 User satisfaction index (US1)
    II.2.2 User satisfaction survey – date
  II.3 Completeness
    II.3.1 Rate of available statistics (R1)
III Accuracy
  III.1 Overall accuracy
  III.2 Sampling error
    III.2.1 Coefficient of variation (A1)
  III.3 Non-sampling error
    III.3.1 Coverage and other frame errors
      III.3.1.1 Rate of overcoverage (A2)
    III.3.2 Measurement errors
      III.3.2.1 Edit failure rate (A3)
    III.3.3 Non-response errors
      III.3.3.1 Unit response rate (A4)
      III.3.3.3 Item response rate (A5)
    III.3.4 Processing errors
      III.3.4.1 Imputation rate (A6)
    III.3.5 Model assumptions
    III.3.6 Mistakes (A7)
    III.3.7 Data revision
      III.3.7.1 Data revision – policy
      III.3.7.2 Data revision – practice
      III.3.7.3 Average size of revisions (A8)
    III.3.8 Seasonal adjustment
IV Timeliness and Punctuality
  IV.1 Timeliness
    IV.1.1 Timelag – first results (T1)
    IV.1.2 Timelag – final results (T2)
  IV.2 Punctuality
    IV.2.1 Punctuality – publication (T3)
V Accessibility and Clarity
  V.1 News release
  V.2 Publications
    V.2.1 Publications – number (AC1)
  V.3 On-line database
    V.3.1 On-line database – accesses (AC2)
  V.4 Micro-data access
  V.5 Other
  V.6 Documentation on methodology
    V.6.1 Metadata – completeness (AC3)
  V.7 Quality documentation
VI Comparability
  VI.1 Comparability – geographical
    VI.1.1 Asymmetries for statistics mirror flows (CC2)
  VI.2 Comparability – over time
    VI.2.1 Length of comparable time series (CC1)
  VI.3 Comparability – domains
VII Coherence
  VII.1 Coherence – cross domain
    VII.1.1 Coherence – subannual and annual statistics
    VII.1.2 Coherence – National Accounts
    VII.1.3 Coherence with other statistics
  VII.2 Coherence – internal

Legend on the original slide: marked entries = concepts in common with ESMS
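
The dotted ESQRS codes encode the concept hierarchy, so the parent of a concept and its quality indicator can be read directly from the code and the name. The following is a minimal Python sketch of that idea; it uses a small excerpt of the concepts listed above, and the helper names (`parent_code`, `ancestors`, `indicator`) are illustrative assumptions, not part of any Eurostat tool.

```python
import re

# Excerpt of the ESQRS concept list (code -> concept name), for illustration.
CONCEPTS = {
    "III": "Accuracy",
    "III.3": "Non-sampling error",
    "III.3.3": "Non-response errors",
    "III.3.3.1": "Unit response rate (A4)",
    "IV": "Timeliness and Punctuality",
    "IV.1": "Timeliness",
    "IV.1.1": "Timelag – first results (T1)",
}


def parent_code(code):
    """Return the code one level up, or None for a top-level concept."""
    head, sep, _ = code.rpartition(".")
    return head if sep else None


def ancestors(code):
    """Return all ancestor codes, from the top-level concept downwards."""
    chain = []
    current = parent_code(code)
    while current is not None:
        chain.append(current)
        current = parent_code(current)
    return chain[::-1]


def indicator(name):
    """Return the indicator code in trailing parentheses, e.g. 'A4', if any."""
    match = re.search(r"\(([A-Z]+\d+)\)\s*$", name)
    return match.group(1) if match else None


# Print the path from the top-level concept down to the unit response rate.
for code in ancestors("III.3.3.1") + ["III.3.3.1"]:
    print(code, CONCEPTS[code], indicator(CONCEPTS[code]))
```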
The ESMS and the ESQRS
➢ The metadata produced in the ESMS and the ESQRS need to be kept consistent. The ESQRS is based on the ESQR but does not take up all the chapters contained in the latter.
➢ The information in the ESQRS is more detailed than the information on statistical data quality contained in the ESMS.
→ The ESQRS reports in greater depth on data quality than the ESMS.
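
One way to picture the consistency requirement is a simple check over the concepts that both structures share. The sketch below is an assumption-laden illustration in Python: reports are modelled as plain dictionaries from concept name to free text, the function name `flag_for_review` is hypothetical, and the report texts are made-up sample data. Because the ESQRS is meant to be more detailed, a flagged concept is a prompt for manual review, not necessarily an error.

```python
def normalise(text):
    """Collapse whitespace and ignore case so trivial differences are not flagged."""
    return " ".join(text.split()).lower()


def flag_for_review(esms, esqrs, shared_concepts):
    """Return {concept: reason} for shared concepts that may be inconsistent."""
    flags = {}
    for concept in shared_concepts:
        esms_text = esms.get(concept)
        esqrs_text = esqrs.get(concept)
        if esms_text is None and esqrs_text is None:
            continue  # not filled in on either side; nothing to compare
        if esms_text is None:
            flags[concept] = "missing in ESMS"
        elif esqrs_text is None:
            flags[concept] = "missing in ESQRS"
        elif normalise(esms_text) != normalise(esqrs_text):
            flags[concept] = "text differs"
    return flags


# Made-up sample reports, for illustration only.
esms_report = {
    "Timeliness": "Data released 90 days after the reference period.",
    "Comparability": "Comparable across Member States.",
}
esqrs_report = {
    "Timeliness": "Data released 90 days  after the reference period.",
    "Comparability": "Break in series in 2008.",
    "Coherence": "Consistent with National Accounts.",
}
shared = ["Timeliness", "Comparability", "Coherence"]

print(flag_for_review(esms_report, esqrs_report, shared))
```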
The ESMS and the ESQRS
ESMS
Accuracy and reliability
  Description: Accuracy: closeness of computations or estimates to the exact or true values that the statistics were intended to measure. Reliability: closeness of the initial estimated value to the subsequent estimated value.
  Non-sampling error
    Description: Error in survey estimates which cannot be attributed to sampling fluctuations.

ESQRS
Accuracy
  Description: Accuracy: closeness of computations or estimates to the exact or true values that the statistics were intended to measure.
  Non-sampling error
    Description: Error in survey estimates which cannot be attributed to sampling fluctuations.
    Non-response error
      Description: The difference between the statistics computed from the collected data and those that would be computed if there were no missing values.
      Unit response rate
        Description: The ratio of the number of units for which data for at least some variables have been collected to the total number of units designated for data collection.
        Formulae unit response rate
          Description: Example calculation formulae for the unweighted unit response rate.
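
The unit response rate definition above translates directly into a ratio of two counts. The following Python sketch illustrates the unweighted case only; the `Unit` class, the function name and the sample data are assumptions made for the example and are not taken from the ESQRS guidelines.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    unit_id: str
    # Variable name -> collected value; None marks a missing value.
    responses: dict = field(default_factory=dict)


def unweighted_unit_response_rate(units):
    """Units with data for at least one variable / units designated for collection."""
    if not units:
        raise ValueError("no units designated for data collection")
    responding = sum(
        1
        for unit in units
        if any(value is not None for value in unit.responses.values())
    )
    return responding / len(units)


# Made-up sample: A and B respond (at least one variable), C and D do not.
sample = [
    Unit("A", {"turnover": 120.0, "employees": 4}),
    Unit("B", {"turnover": None, "employees": 7}),
    Unit("C", {"turnover": None, "employees": None}),
    Unit("D"),
]

rate = unweighted_unit_response_rate(sample)
print(f"{rate:.2f}")  # 0.50
```

Unit B counts as responding even though one variable is missing, because the definition only requires data for at least some variables; the missing variable would instead affect an item response rate.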
ESS Guidelines
➢ The guidelines for quality reporting from the ESS Handbook for Quality Reports (EHQR) are already used in the “ESS Guidelines” for ESMS.
➢ These guidelines will be further used in the ESQRS in order to provide detailed guidelines for 6 different statistical processes:
  • Sample survey
  • Census
  • Statistical Process using Administrative Sources
  • Statistical Process involving Multiple Data Sources
  • Price or other Economic Index Process
  • Statistical Compilation
The underlying IT infrastructure
The Eurostat Metadata Handler serves as the IT tool for the production, transmission and dissemination of the ESQRS metadata.

[Diagram: Eurostat Metadata Handler with a common user interface]
Components shown:
  • Euro SDMX Registry
  • National Metadata Editor
  • RAMON
  • EMIS
  • CODED
The ESQRS Statistical Business Process
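[Diagram: flow of the ESQRS between the National Statistical Authority and Eurostat]
Actors: National Statistical Authority; Eurostat
Elements shown: National Metadata Editor; National ESQRS; eDamis; NRME Database; National and Eurostat ESQRS; Eurostat Website
Phases: Production; Treatment and analysis; Dissemination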
Production and dissemination of metadata
at national and European level
ESQRS
  EU Member States:
  - Production of national ESQRS
  - Transmission of national ESQRS to Eurostat
  - Dissemination of national ESQRS (if decided so)
  Eurostat:
  - Production of the Eurostat ESQRS based on the national ESQRS
  - Dissemination of the Eurostat ESQRS (if decided so)
  - Checking and dissemination of the national ESQRS (dissemination only if decided so)

ESMS
  EU Member States:
  - Production of national ESMS files
  - Transmission of national ESMS files to Eurostat
  - Dissemination of national ESMS files (if decided so)
  Eurostat:
  - Production of the Eurostat ESMS files
  - Dissemination of the Eurostat ESMS files
  - Checking and dissemination of the national ESMS files (dissemination only if decided so)
Summary
➢ A new reporting structure for quality-related metadata has been created: the ESQRS.
➢ The ESQRS is based on the existing EU legislation and documentation for data quality in the ESS.
➢ The quality indicators contained in the ESQRS allow harmonised measurement and monitoring of statistical data quality within and across statistical processes.
➢ ESS quality reporting will be converted successively into the ESQRS by use of the National Reference Metadata Editor.
➢ The ESQRS needs to be further promoted and communicated within and beyond the ESS.