Quality Reports - Danmarks Statistik

Quality Reports
Ukraine
November 2014
Why quality reporting
Regulation 223/2009 on European Statistics
• European Statistics shall be produced
  - on the basis of uniform standards
  - using harmonized methods
• Users shall have access to metadata describing the quality of statistical output, in order to interpret and use the statistics correctly
2
ESS Quality Reporting
http://epp.eurostat.ec.europa.eu/portal/page/portal/quality/quality_reporting
Here you will find:
ESS Handbook for Quality reports 2014
ESS Quality and Performance Indicators 2014
Single Integrated Metadata Structure and its Technical Manual
ESS Quality Glossary
Handbook on Data Quality - Assessment Methods and Tools
Handbook on improving quality by analysis of process variables
3
Some abbreviations
• SDMX: Statistical Data and Metadata Exchange
• ESMS: Euro-SDMX Metadata Structure
  User-oriented format for quality reporting
• ESQRS: ESS Standard for Quality Reports Structure
  Producer-oriented format for quality reporting
All statistical concepts of ESMS and ESQRS have been included and streamlined in
• SIMS: Single Integrated Metadata Structure
4
Example: Accuracy and reliability
ESMS | SIMS | ESQRS
14 Accuracy and reliability | S.15 Accuracy and reliability | V Accuracy and reliability
14.1 Overall accuracy | S.15.1 Overall accuracy | V.1 Overall accuracy
14.2 Sampling error | S.15.2 Sampling error and A1. Sampling errors - indicators for U | V.2 Sampling error
- | S.15.2.1 A1. Sampling errors - indicators for P | V.2.1 Sampling errors - indicators
14.3 Non-sampling error | S.15.3 Non-sampling error and A4. Unit non-response - rate for U and A5. Item non-response - rate for U | V.3 Non-sampling error
- | S.15.3.1 Coverage error | V.3.1 Coverage error
- | S.15.3.1.1 A2. Over-coverage - rate | V.3.1.1 Over-coverage - rate
- | S.15.3.1.2 A3. Common units - proportion | -
- | S.15.3.2 Measurement error | V.3.2 Measurement error
- | S.15.3.3 Non response error | V.3.3 Non response error
- | S.15.3.3.1 A4. Unit non-response - rate for P | V.3.3.1 Unit non-response - rate
- | S.15.3.3.2 A5. Item non-response - rate for P | V.3.3.2 Item non-response - rate
- | S.15.3.4 Processing error | V.3.4 Processing error
- | - | V.3.4.1 Imputation - rate
- | - | V.3.4.2 Common units - proportion
- | S.15.3.5 Model assumption error | V.3.5 Model assumption error
- | - | V.3.7 Seasonal adjustment
5
ESS handbook - Purpose
The handbook is addressed to
• NSOs for their own internal assessment of process and output quality
• NSOs as the starting point for preparing user-oriented quality reports
• NSOs for producer-oriented quality reports to Eurostat
A single metadata structure should promote
• Both user-oriented and producer-oriented reports should be derived from the same source
• Maximum re-use of information in the metadata system
• Reduction and simplification of documents
• The user-oriented quality reports should be improved
6
ESS Handbook - Structure
1. Introduction
2. Relevance, assessment of user needs
3. Accuracy and reliability
4. Timeliness and punctuality
5. Accessibility and clarity
6. Coherence and comparability
7. Cost and burden
8. Confidentiality
9. Statistical processing
2-6: output components
7-9: process components
7
Statistical Processes
The ESS Handbook applies to the following statistical processes:
1. Sample survey
2. Census
3. Statistical process using administrative source(s)
4. Statistical process involving multiple data sources
5. Price or other economic index process
6. Statistical compilation assembling a variety of primary sources (e.g. National Accounts)
8
Part II: Guidelines for preparing detailed quality reports
Guidelines for preparing quality reports
• For all 9 headlines in the structure (Relevance, Accuracy …)
• For all statistical processes as a whole and where relevant
• For the 6 types of statistical processes (Sample Survey, Census …)
9
Part III: Annexes
1. ESS Quality and Performance Indicators
2. Technical Manual of the Single Integrated Metadata Structure (SIMS)
   - Annex 1. Relation between ESMS, SIMS and ESQRS (extract in slide 5)
   - Annex 2. Descriptions and guidelines for each item in SIMS
3. References and key documents
10
DK quality reports
• Re-organisation 2014 following the ESS handbook
• Three levels
  1. “Front page” to appear at the homepage of Statistics Denmark, with a short description of the 9 headlines in the structure. From the front page one can point at and open around 100 specified topics (SIMS)
  2. SIMS topics cover the more detailed quality report (see guidelines in Annex 2). From level 2 one can point at and open annexes for further description
  3. Annexes
• The idea is to cover all customers (national, international, EU) in one product.
• Prepared in Danish and English
11
Quality and Performance Indicators
16 quality indicators to be reported in the ESS Quality Reports to Eurostat
• Relevance: R1
• Accuracy and Reliability: A1 – A7
• Timeliness and Punctuality: TP1 – TP3
• Coherence and Comparability: CC1 – CC2
• Accessibility and Clarity: AC1 – AC3
(cover the output components in slide 7)
12
U (user) and P (producer) in SIMS
• P-fields are producer-oriented and are normally just the figure(s) asked for
• U-fields are user-oriented and are an annotated version of the figure(s) asked for
Indicator | U SIMS ID | P SIMS ID
R1 | 14.3 | 14.3.1
A1 | 15.2 | 15.2.1
A2 | - | 15.3.1.1
A3 | - | 15.3.1.2
A4 | 15.3.3 | 15.3.3.1
A5 | 15.3.3 | 15.3.3.2
A6 | 20.2 | 20.2.1
A7 | - | 21.5.1
TP1 | - | 16.1.1
TP2 | 16.1 | 16.1.2
TP3 | 16.2 | 16.2.1
CC1 | - | 17.1.1
CC2 | 17.2 | 17.2.1
AC1 | - | 11.3.1
AC2 | - | 11.5.1
AC3 | - | 12.1.1
13
A1
Sampling error
The sampling error is calculated for key variables
The sampling error is calculated either as
• Coefficient of variation: $CV(\theta) = \frac{\sqrt{V(\theta)}}{\theta}$
or
• Confidence interval, symmetric around $\theta$
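Both ways of expressing the sampling error can be sketched in a few lines. This is a minimal illustration, not the handbook's method: it assumes a simple random sample, no finite population correction, and a normal approximation for the interval. The function name and the data are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def sampling_error(sample, level=0.95):
    """Return (estimate, cv, (lower, upper)) for the sample mean.

    Assumptions (not from the slides): simple random sampling, no finite
    population correction, normal approximation for the interval.
    """
    n = len(sample)
    theta = mean(sample)                # point estimate of the key variable
    var_theta = variance(sample) / n    # estimated variance of the estimate
    se = math.sqrt(var_theta)           # standard error
    cv = se / theta                     # coefficient of variation
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return theta, cv, (theta - z * se, theta + z * se)


# Hypothetical observations of one key variable.
theta, cv, (lower, upper) = sampling_error([12.0, 15.0, 11.0, 14.0, 13.0])
print(f"estimate={theta:.2f}  CV={cv:.3f}  CI=({lower:.2f}, {upper:.2f})")
```

The interval is symmetric around the estimate by construction, which matches the second option on the slide.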
14
A2 Over-coverage
• The rate of over-coverage is the proportion of units accessible via the frame population that do not belong to the target population.
• Definition: (units in the frame population not in the target population) / (units in the frame population)
• Three main cases
  - Un-weighted rate ($w_j = 1$)
  - Design-weighted rate ($w_j = d_j$)
  - Size-weighted rate ($w_j = d_j x_j$)
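The three cases differ only in the weight attached to each unit. As a simplified sketch, let $O$ be the set of over-covered units and $F$ the set of all units in the frame. Both symbols are introduced here for illustration, and units of unknown eligibility are ignored. Reading $d_j$ as the design weight and $x_j$ as a size variable of unit $j$, the rate can be written as:

```latex
\[
  \text{Over-coverage rate}
  = \frac{\sum_{j \in O} w_j}{\sum_{j \in F} w_j},
  \qquad
  w_j = 1, \quad w_j = d_j, \quad \text{or} \quad w_j = d_j x_j .
\]
```

With $w_j = 1$ the formula reduces to a plain count of units, which is the definition given above.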
15
A3 Common units
Mixed statistical processes where some variables or data for some units come from survey data and others from administrative sources
• Measure for agreement between different sources
• Definition: (units in both the survey and the administrative source) / (units in the survey)
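The definition amounts to a set intersection. The sketch below assumes that units in both sources carry a shared identifier; the identifiers and the function name are hypothetical.

```python
def common_units_proportion(survey_ids, admin_ids):
    """Share of survey units that are also found in the administrative source."""
    survey = set(survey_ids)
    if not survey:
        raise ValueError("survey has no units")
    return len(survey & set(admin_ids)) / len(survey)


# Hypothetical unit identifiers.
survey_ids = ["u1", "u2", "u3", "u4", "u5"]
admin_ids = ["u2", "u4", "u5", "u9"]
print(common_units_proportion(survey_ids, admin_ids))
```

Three of the five survey units appear in the administrative source, so the proportion is 3/5. In practice the hard part is the record linkage that produces the shared identifiers, which this sketch takes as given.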
16
A4 Unit non-response
Unit non-response covers units with no information or unusable information. Unit non-response is a source of error
  - It reduces the number of responses
  - It might introduce bias
• Definition: (units with no or unusable information) / (units in the survey)
• Three main cases
  - Un-weighted rate ($w_j = 1$)
  - Design-weighted rate ($w_j = d_j$)
  - Size-weighted rate ($w_j = d_j x_j$)
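The three cases can be computed with one function that only swaps the weight. This is a sketch under assumptions: the record keys `d` (design weight), `x` (size variable) and `responded` are hypothetical, and units of unknown eligibility are left out.

```python
def weighted_rate(units, in_numerator, case="unweighted"):
    """Sum of w_j over flagged units divided by the sum of w_j over all units.

    units: list of dicts with hypothetical keys "d" (design weight) and
    "x" (size variable, for example number of employees).
    in_numerator: function returning True for units counted in the numerator.
    """
    weight = {
        "unweighted": lambda u: 1.0,        # w_j = 1
        "design": lambda u: u["d"],         # w_j = d_j
        "size": lambda u: u["d"] * u["x"],  # w_j = d_j * x_j
    }[case]
    total = sum(weight(u) for u in units)
    flagged = sum(weight(u) for u in units if in_numerator(u))
    return flagged / total


# Hypothetical sample of four units.
sample = [
    {"d": 10.0, "x": 5.0, "responded": True},
    {"d": 10.0, "x": 1.0, "responded": False},
    {"d": 10.0, "x": 4.0, "responded": True},
    {"d": 2.0, "x": 50.0, "responded": False},
]
for case in ("unweighted", "design", "size"):
    rate = weighted_rate(sample, lambda u: not u["responded"], case)
    print(f"{case}: {rate:.3f}")
```

The same sample gives three different rates because the non-responding large unit counts for little under design weighting and for a lot under size weighting.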
17
A5 Item non-response
• Non-response for certain variables
• Key variables and variables with major non-response
• Definition: (units that should have answered variable X but did not) / (units that should have answered variable X)
• Three main cases
  - Un-weighted rate ($w_j = 1$)
  - Design-weighted rate ($w_j = d_j$)
  - Size-weighted rate ($w_j = d_j x_j$)
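What sets item non-response apart is the denominator: only units that should have answered the variable are counted. The sketch below shows the un-weighted case. It assumes that a missing answer is stored as `None`; the record layout and the eligibility rule are hypothetical.

```python
def item_nonresponse_rate(records, variable, is_eligible=lambda r: True):
    """Un-weighted share of eligible units with no value for `variable`."""
    eligible = [r for r in records if is_eligible(r)]
    if not eligible:
        raise ValueError("no unit was expected to answer this variable")
    missing = sum(1 for r in eligible if r.get(variable) is None)
    return missing / len(eligible)


# Hypothetical records: only employed persons are expected to report a wage.
records = [
    {"employed": True, "wage": 31000},
    {"employed": True, "wage": None},
    {"employed": False, "wage": None},  # not expected to answer
    {"employed": True, "wage": 28500},
    {"employed": True, "wage": None},
]
rate = item_nonresponse_rate(records, "wage", lambda r: r["employed"])
print(rate)
```

The unit that is not employed is excluded from both numerator and denominator, so two of four eligible units are missing a wage.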
18
A6 Data revision
Can be made very complicated – read it yourself
Example: Quarterly national accounts (DK)
• Revision measured on one variable – GDP
• Two measures calculated
  - First release (P) and release after 1 year (L), as the major changes happen in this period
  - First release (P) and release after 3 years (L), after the yearly National Accounts statistics have been finalized
• Two formulas used (with n = 12 quarters)
  - $MAR = \frac{1}{n} \sum_{t=1}^{n} |X_{Lt} - X_{Pt}|$ as a measure for the variation
  - $MR = \frac{1}{n} \sum_{t=1}^{n} (X_{Lt} - X_{Pt})$ as a measure for the bias
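The two revision measures are short enough to compute directly. In the sketch below the growth rates are invented for illustration and are not Danish national accounts figures; the function name is hypothetical.

```python
def revision_measures(first, later):
    """Return (MAR, MR) for paired first (P) and later (L) releases."""
    if len(first) != len(later) or not first:
        raise ValueError("need two non-empty series of equal length")
    revisions = [late - prelim for prelim, late in zip(first, later)]
    n = len(revisions)
    mar = sum(abs(r) for r in revisions) / n  # mean absolute revision
    mr = sum(revisions) / n                   # mean revision, sign shows bias
    return mar, mr


# Hypothetical quarterly GDP growth rates in percent, n = 12 quarters.
first = [0.4, 0.1, -0.2, 0.3, 0.5, 0.0, 0.2, -0.1, 0.6, 0.3, 0.1, 0.4]
later = [0.6, 0.0, -0.1, 0.5, 0.4, 0.2, 0.2, 0.1, 0.7, 0.2, 0.3, 0.5]
mar, mr = revision_measures(first, later)
print(f"MAR={mar:.3f}  MR={mr:.3f}")
```

MAR can never be smaller than the absolute value of MR. A positive MR, as in this invented series, would mean that first releases tend to be revised upwards.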
19
A7 Imputation
Imputation is the process used to assign replacement values for missing, invalid or inconsistent data
• Rate calculated for key variables
• Definition: (units where variable X has been imputed) / (units that should have answered variable X)
• Three main cases
  - Un-weighted rate ($w_j = 1$)
  - Design-weighted rate ($w_j = d_j$)
  - Size-weighted rate ($w_j = d_j x_j$)
20