Quality reporting within the Eurostat and the ESS metadata systems
August Götzfried and Håkan Linden
Eurostat Unit B6: Reference databases and metadata

Current situation
Within the European Statistical System (ESS), reporting on statistical data quality exists in many statistical domains…

Problem statement
… BUT:
– Quality reports do not exist for all statistical processes within the ESS;
– No homogeneity between the different report structures used for data quality reporting;
– Not all the quality-related information is made publicly available;
– No common and standard IT infrastructure is used within the ESS.
The new Eurostat vision, “Improving the production method of EU statistics”, requires an improvement action.

Progress made since 2008
2008: introduction of the Euro SDMX Metadata Structure (ESMS) at Eurostat and the ESS for the production and dissemination of reference metadata (Commission Recommendation 498/2009)
01/2009: release of the new version of the ESS quality reporting documents:
• ESS Standard for Quality Reports (ESQR)
• ESS Handbook for Quality Reports (EHQR)
Detailed requirements following the European Statistics Code of Practice
ESS Quality and Performance Indicators (QPI’s) defined
03/09: EP/Council Regulation 223/2009, Article 12, defining the quality criteria to be reported
2009/2010: development and deployment of the Eurostat Metadata Handler with:
• EMIS: production and dissemination of ESMS files at Eurostat
• National Reference Metadata Editor (NRME): production, transmission and dissemination of national ESMS files

ESMS and ESQR
ESMS is more oriented to the USERS of statistics, to help them understand the statistical data released:
• there is no need for too detailed information on data quality
• 21 SDMX cross-domain concepts used
ESQR is more oriented to the PRODUCERS of statistics, to monitor the quality of the statistics produced in detail:
• concentrating on the main quality concepts (being also part of the ESS Statistics Regulation No 223/2009)
However, there is information on quality criteria which is common to both ESMS and ESQR.

ESMS and ESQR: the starting point
ESQR chapters:
I. Introduction to the Statistical Process and Its Outputs
II. RELEVANCE
III. ACCURACY
IV. TIMELINESS and PUNCTUALITY
V. ACCESSIBILITY and CLARITY
VI. COMPARABILITY and COHERENCE
VII. Trade-offs between Output Quality Components
VIII. Assessment of User Needs and Perceptions
IX. Performance, Cost and Respondent Burden
X. Confidentiality, Transparency, Security
XI. Conclusions
ESMS concepts:
1. Contact
2. Metadata update
3. Statistical presentation
4. Unit of measure
5. Reference period
6. Institutional mandate
7. Confidentiality
8. Release policy
9. Frequency of dissemination
10. Dissemination format
11. Accessibility of documentation
12. Quality management
13. Relevance
14. Accuracy and reliability
15. Timeliness and punctuality
16. Comparability
17. Coherence
18. Cost and burden
19. Data revision
20. Statistical processing
21. Comment
Quality criteria labelled in the diagram: RELEVANCE, ACCURACY, TIMELINESS, ACCESSIBILITY, CLARITY, COMPARABILITY, COHERENCE.

The new ESQRS
Based on the ESQR, a new report structure, the ESS Standard for Quality Reports Structure (ESQRS), was created for harmonising the reporting on statistical data quality within the ESS.
The ESQRS uses the main statistical data quality criteria listed in EP/Council Regulation 223/2009, which are also part of the ESMS, and details them further:
• Relevance
• Accuracy
• Timeliness and Punctuality
• Accessibility and Clarity
• Comparability
• Coherence
A subset of the Quality Performance Indicators (QPI’s) is also covered in the new ESQRS.

ESQR and ESQRS
ESQR chapters:
I. Introduction to the Statistical Process and Its Outputs
II. RELEVANCE
III. ACCURACY
IV. TIMELINESS and PUNCTUALITY
V. ACCESSIBILITY and CLARITY
VI. COMPARABILITY and COHERENCE
VII. Trade-offs between Output Quality Components
VIII. Assessment of User Needs and Perceptions
IX. Performance, Cost and Respondent Burden
X. Confidentiality, Transparency, Security
XI. Conclusions
ESQRS chapters:
I Contact
II Introduction
III Relevance (user needs and perceptions)
IV Accuracy
V Timeliness and punctuality
VI Accessibility and clarity
VII Comparability
VIII Coherence
IX Comment

The ESQRS
ESQRS concept names:
I Introduction
II Relevance (user needs and perceptions)
  II.1 User needs
  II.2 User satisfaction
    II.2.1 User satisfaction index (US1)
    II.2.2 User satisfaction survey – date
  II.3 Completeness
    II.3.1 Rate of available statistics (R1)
III Accuracy
  III.1 Overall accuracy
  III.2 Sampling error
    III.2.1 Coefficient of variation (A1)
  III.3 Non-sampling error
    III.3.1 Coverage and other frame errors
      III.3.1.1 Rate of overcoverage (A2)
    III.3.2 Measurement errors
      III.3.2.1 Edit failure rate (A3)
    III.3.3 Non response errors
      III.3.3.1 Unit response rate (A4)
      III.3.3.3 Item response rate (A5)
    III.3.4 Processing errors
      III.3.4.1 Imputation rate (A6)
    III.3.5 Model assumptions
    III.3.6 Mistakes (A7)
    III.3.7 Data revision
      III.3.7.1 Data revision – policy
      III.3.7.2 Data revision – practice
      III.3.7.3 Average size of revisions (A8)
    III.3.8 Seasonal adjustment
IV Timeliness and Punctuality
  IV.1 Timeliness
    IV.1.1 Timelag – first results (T1)
    IV.1.2 Timelag – final results (T2)
  IV.2 Punctuality
    IV.2.1 Punctuality – publication (T3)
V Accessibility and Clarity
  V.1 News release
  V.2 Publications
    V.2.1 Publications – number (AC1)
  V.3 On-line database
    V.3.1 On-line database – accesses (AC2)
  V.4 Micro-data access
  V.5 Other
  V.6 Documentation on methodology
    V.6.1 Metadata – completeness (AC3)
  V.7 Quality documentation
VI Comparability
  VI.1 Comparability – geographical
    VI.1.1 Asymmetries for statistics mirror flows (CC2)
  VI.2 Comparability – over time
    VI.2.1 Length of comparable time series (CC1)
  VI.3 Comparability – domains
VII Coherence
  VII.1 Coherence – cross domain
    VII.1.1 Coherence – subannual and annual statistics
    VII.1.2 Coherence – National Accounts
    VII.1.3 Coherence with other statistics
  VII.2 Coherence – internal
Legend on the slide: marked entries = concepts in common with ESMS.

The ESMS and the ESQRS
The metadata produced in the ESMS and the ESQRS need to be kept consistent.
The ESQRS is based on the ESQR, but does not take up all the chapters contained in the latter.
The information in the ESQRS is more detailed than the information on statistical data quality contained in the ESMS.
The ESQRS reports in more depth on data quality than the ESMS.

The ESMS and the ESQRS
ESMS
Accuracy and reliability
Description: Accuracy: closeness of computations or estimates to the exact or true values that the statistics were intended to measure. Reliability: closeness of the initial estimated value to the subsequent estimated value.
Non-sampling error
Description: Error in survey estimates which cannot be attributed to sampling fluctuations.
ESQRS
Accuracy
Description: Accuracy: closeness of computations or estimates to the exact or true values that the statistics were intended to measure.
Non-sampling error
Description: Error in survey estimates which cannot be attributed to sampling fluctuations.
Non-response error
Description: The difference between the statistics computed from the collected data and those that would be computed if there were no missing values.
Unit response rate
Description: The ratio of the number of units for which data for at least some variables have been collected to the total number of units designated for data collection.
Formulae unit resp. rate
Description: Example calculation formulae for the unweighted unit response rate.

ESS Guidelines
The guidelines for quality reporting from the ESS Handbook for Quality Reports (EHQR) are already used in the “ESS Guidelines” for ESMS.
These guidelines will be further used in the ESQRS in order to provide detailed guidelines for 6 different statistical processes:
• Sample survey
• Census
• Statistical Process using Administrative Sources
• Statistical Process involving Multiple Data Sources
• Price or other Economic Index Process
• Statistical Compilation

The underlying IT infrastructure
The Eurostat Metadata Handler serves as the IT tool for the production, transmission and dissemination of the ESQRS metadata.
[Diagram: Eurostat Metadata Handler, with the components Euro SDMX Registry, National Metadata Editor, RAMON, EMIS and CODED behind a common user interface]

The ESQRS Statistical Business Process
[Diagram: NATIONAL STATISTICAL AUTHORITY and EUROSTAT; elements shown: National Metadata Editor, National ESQRS, eDamis, NRME Database, National and Eurostat ESQRS, Eurostat Website; phases: PRODUCTION, TREATMENT AND ANALYSIS, DISSEMINATION]

Production and dissemination of metadata at national and European level
ESQRS, EU Member States:
- Production of national ESQRS
- Transmission of national ESQRS to Eurostat
- Dissemination of national ESQRS (if decided so)
ESQRS, Eurostat:
- Production of the Eurostat ESQRS based on the national ESQRS
- Dissemination of the Eurostat ESQRS (if decided so)
- Checking and dissemination of the national ESQRS (dissemination only if decided so)
ESMS, EU Member States:
- Production of national ESMS files
- Transmission of national ESMS files to Eurostat
- Dissemination of national ESMS files (if decided so)
ESMS, Eurostat:
- Production of the Eurostat ESMS files
- Dissemination of the Eurostat ESMS files
- Checking and dissemination of the national ESMS files (dissemination only if decided so)

Summary
A new reporting structure for quality-related metadata has been created: the ESQRS.
The ESQRS is based on the existing EU legislation and documentation for data quality in the ESS.
The quality indicators contained in the ESQRS allow the harmonised measurement and monitoring of statistical data quality within and across statistical processes.
The ESS quality reporting will successively be converted into the ESQRS by the use of the National Reference Metadata Editor.
The ESQRS needs to be further promoted and communicated within and beyond the ESS.
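
Worked example: unit response rate
The ESQRS comparison slide describes the unit response rate as the ratio of the number of units for which data for at least some variables have been collected to the total number of units designated for data collection, and mentions a formula for the unweighted rate without reproducing it. The sketch below follows only that stated definition. The function name, the argument names and the figures are hypothetical, and the slides give no detail on refinements such as ineligible units, so none are handled here.

```python
def unweighted_unit_response_rate(responding_units: int, designated_units: int) -> float:
    """Share of designated units that supplied data for at least some variables.

    Assumption: both counts are plain unit counts (no weights), as in the
    'unweighted' rate named on the slide; the exact ESS formula is not shown there.
    """
    if designated_units <= 0:
        raise ValueError("designated_units must be positive")
    if not 0 <= responding_units <= designated_units:
        raise ValueError("responding_units must lie between 0 and designated_units")
    return responding_units / designated_units


# Hypothetical survey: 1000 units designated, 850 returned at least some data.
rate = unweighted_unit_response_rate(850, 1000)
print(f"Unit response rate: {rate:.1%}")  # prints "Unit response rate: 85.0%"
```

Reporting the result together with both counts, rather than the percentage alone, would make the indicator easier to compare across statistical processes.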