Role of editing and imputation in integration of sources for structural business statistics

Svein Gåsemyr, Statistics Norway
Svein Nordbotten, University of Bergen
Contents of paper
• Data sources for business statistics
• Methods for integration of sources
• Use of standard models for processes
• The need to measure quality of linked files
• Work to be done

Interaction of sources and modules
• The statistical business register
• The database to coordinate samples
• The micro file of statistical surveys
• The database of available data
• The menu of data editing, imputation and estimation

Data sources to be integrated for business statistics by ISEE
Administrative business registers that are affiliated to the Legal Unit Register:
• Employer Register
• Business Enterprise Register
• Value Added Tax Register
• Tax register of business enterprises and self-employed

Enterprises and establishments of manufacturing industry
• Complex enterprises 1 086
  – Establishments 1 768
• Single enterprises 19 416

Establishment by sources
• A. Census survey 2 096
• B. In sample and selected 1 251
• C. In sample and not selected 1 092
• D. Small complex establishment 319
• E. Establishment excluded 16 426

Methods for integration
• Linkage at unit level
• Editing of a single source and linked files
• Estimation by mass imputation

Process errors in integrated records
• Errors due to incomplete registers
• Linking errors
• Observation errors
• Errors made during editing
• Imputation errors

Units identification in 2 registers
Rows: units in register 2. Columns: units in register 1.

                              Not       Incorrectly  Existing with  Existing with
Register 2 \ Register 1       existing  existing     incorrect ID   correct ID     Sum
Not existing in register      00        01           02             03             0.
Incorrectly existing          10        11           12             13             1.
Existing with incorrect ID    20        21           22             23             2.
Existing with correct ID      30        31           32             33             3.
Sum                           .0        .1           .2             .3             ..
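
The cross-classification in "Units identification in 2 registers" can be illustrated with a small sketch. It assumes that each unit's identification status in each register is already known, for example from a checked evaluation sample, and is coded 0 to 3 in the order of the table headings. The function name `cross_classify`, the unit labels and the example statuses are hypothetical and are not taken from the paper.

```python
from collections import Counter

# Status codes, in the order used by the table headings (assumed coding).
STATUSES = (
    "not existing",                # 0
    "incorrectly existing",        # 1
    "existing with incorrect ID",  # 2
    "existing with correct ID",    # 3
)


def cross_classify(status_reg1, status_reg2):
    """Count units by (status in register 2, status in register 1).

    Both arguments map a unit label to a status code 0..3. A unit missing
    from one mapping is treated as "not existing" (code 0) in that register.
    Returns the 4x4 table, its row sums, its column sums and the grand total.
    """
    units = set(status_reg1) | set(status_reg2)
    counts = Counter(
        (status_reg2.get(u, 0), status_reg1.get(u, 0)) for u in units
    )
    n = len(STATUSES)
    table = [[counts[(i, j)] for j in range(n)] for i in range(n)]
    row_sums = [sum(row) for row in table]                               # "i."
    col_sums = [sum(table[i][j] for i in range(n)) for j in range(n)]    # ".j"
    return table, row_sums, col_sums, sum(row_sums)                      # ".."


# Hypothetical statuses for five units.
status_reg1 = {"u1": 3, "u2": 3, "u3": 2, "u4": 0, "u5": 1}
status_reg2 = {"u1": 3, "u2": 2, "u3": 3, "u4": 3, "u5": 0}

table, row_sums, col_sums, total = cross_classify(status_reg1, status_reg2)

for label, row, row_sum in zip(STATUSES, table, row_sums):
    print(f"{label:28s}", row, row_sum)
print(f"{'sum':28s}", col_sums, total)
```

Under this reading, cell 33 holds the units that exist with a correct ID in both registers, which are the units an exact ID match would be expected to link correctly. The other cells correspond to potential coverage or linking errors.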

Statistical quality indicators
• Process accuracy
  – Statistical accuracy
• Quality of administrative data
• A single source
• A linked file of 2 sources
• A linked file of a large number of sources
• A small evaluation sample
• Deviation between processed data value and the correct value

Work to be done
• Improve the system for unit identification
• Develop the standard processing system
• Evaluation by controlled experiments
• Designing suitable evaluation samples
• Use of process data for process design
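
The indicator "deviation between processed data value and the correct value" from the slide on statistical quality indicators could be computed on a small evaluation sample along the following lines. This is a minimal sketch, not the authors' method: the function name `deviation_indicators`, the choice of three summary measures and the example values are all assumptions made for illustration.

```python
from math import sqrt


def deviation_indicators(processed, correct):
    """Summarise deviations (processed value minus correct value).

    `processed` holds the values after editing and imputation; `correct`
    holds the verified values for the same units in the evaluation sample.
    """
    if not processed or len(processed) != len(correct):
        raise ValueError("need two non-empty lists of equal length")
    deviations = [p - c for p, c in zip(processed, correct)]
    n = len(deviations)
    return {
        # Signed average: indicates systematic over- or understatement.
        "mean_deviation": sum(deviations) / n,
        # Average size of the deviations, ignoring sign.
        "mean_absolute_deviation": sum(abs(d) for d in deviations) / n,
        # Root mean squared deviation: penalises large deviations more.
        "root_mean_squared_deviation": sqrt(sum(d * d for d in deviations) / n),
    }


# Hypothetical evaluation sample of four units.
processed = [100.0, 250.0, 80.0, 40.0]
correct = [100.0, 240.0, 90.0, 40.0]

indicators = deviation_indicators(processed, correct)
for name, value in indicators.items():
    print(f"{name}: {value:.3f}")
```

In this example the signed deviations cancel out, so the mean deviation alone would hide the errors. That is one reason to report an absolute or squared measure alongside it.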