Role of editing and imputation in integration of sources for structural business statistics Svein Gåsemyr, Statistics Norway Svein Nordbotten, University of Bergen.

Contents of paper
• Data sources for business statistics
• Methods for integration of sources
• Use of standard models for processes
• The need to measure quality of linked files
• Work to be done
Interaction of sources and modules
• The statistical business register
• The database to coordinate samples
• The micro file of statistical surveys
• The database of available data
• The menu of data editing, imputation and estimation
Data sources to be integrated for business statistics by ISEE
Administrative business registers that are affiliated to the Legal Unit Register
• Employer Register
• Business Enterprise Register
• Value Added Tax Register
• Tax register of business enterprises and self-employed
Enterprises and establishments in the manufacturing industry
• Complex enterprises        1 086
  – Establishments           1 768
• Single enterprises        19 416
Establishments by source
• A. Census survey                    2 096
• B. In sample and selected           1 251
• C. In sample and not selected       1 092
• D. Small complex establishment        319
• E. Establishment excluded          16 426
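
The counts on the two slides above can be cross-checked against each other. The short sketch below does that arithmetic. It assumes that every single enterprise has exactly one establishment, which the slides do not state explicitly; the variable names are illustrative only.

```python
# Counts copied from the slides above.
establishments_in_complex_enterprises = 1768
single_enterprises = 19416  # assumption: one establishment per single enterprise

by_source = {
    "A. Census survey": 2096,
    "B. In sample and selected": 1251,
    "C. In sample and not selected": 1092,
    "D. Small complex establishment": 319,
    "E. Establishment excluded": 16426,
}

# Total number of establishments, computed two ways.
total_by_structure = establishments_in_complex_enterprises + single_enterprises
total_by_source = sum(by_source.values())

# Share of establishments falling in each source category.
shares = {label: count / total_by_source for label, count in by_source.items()}

for label, share in shares.items():
    print(f"{label:32s} {share:6.1%}")
print("totals agree:", total_by_structure == total_by_source)
```

Under that assumption both totals come to 21 184 establishments, so the breakdown by source covers the same population as the breakdown by enterprise type.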
Methods for integration
• Linkage at unit level
• Editing of a single source and linked files
• Estimation by mass imputation
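
A minimal sketch of how linkage at unit level and estimation by mass imputation could fit together is given below. The slides do not say which imputation model is used; ratio imputation against a register variable is used here purely as a stand-in. The variable names (`employees`, `turnover`), the helper functions and the data are all hypothetical, and the editing step is left out.

```python
def link(register, survey):
    """Linkage at unit level: attach survey values to register units by unit id.

    Units without a survey record get turnover=None.
    """
    return [
        {"id": unit_id, "employees": employees, "turnover": survey.get(unit_id)}
        for unit_id, employees in register.items()
    ]


def mass_impute(linked):
    """Fill every missing turnover value with ratio * employees.

    The ratio is estimated from the units that were actually observed.
    """
    observed = [u for u in linked if u["turnover"] is not None]
    ratio = sum(u["turnover"] for u in observed) / sum(u["employees"] for u in observed)

    completed = []
    for unit in linked:
        imputed = unit["turnover"] is None
        value = ratio * unit["employees"] if imputed else unit["turnover"]
        completed.append({**unit, "turnover": value, "imputed": imputed})
    return completed


# Hypothetical data: a register with an auxiliary variable for all units,
# and a survey that covers only some of them.
register = {"A": 10, "B": 20, "C": 5, "D": 15}
survey = {"A": 100.0, "B": 220.0}

completed = mass_impute(link(register, survey))

# With a value present for every unit, the estimate is a plain sum.
total = sum(u["turnover"] for u in completed)
print(round(total, 2))
```

Keeping an `imputed` flag on every record makes it possible to report afterwards how much of an estimate rests on observed values and how much on imputed ones.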
Process errors in integrated records
• Errors due to incomplete registers
• Linking errors
• Observation errors
• Errors made during editing
• Imputation errors
Unit identification in 2 registers
                              Units in register 1:
Units in register 2:          Not        Incorrectly   Existing with   Existing with   Sum
                              existing   existing      incorrect ID    correct ID
Not existing in register      00         01            02              03              0.
Incorrectly existing          10         11            12              13              1.
Existing with incorrect ID    20         21            22              23              2.
Existing with correct ID      30         31            32              33              3.
Sum                           .0         .1            .2              .3              ..
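
The cell labels above read as a pair of status codes: the first digit is the status of the unit in register 2, the second its status in register 1, and a dot denotes a sum over that position. The sketch below tabulates such a cross-classification from a list of status pairs. The data are invented, and the share of units in cell 33 is shown only as one possible summary, not as an indicator defined in the slides.

```python
from collections import Counter

# Status codes, in the order used by the table.
STATUSES = [
    "not existing",
    "incorrectly existing",
    "existing with incorrect ID",
    "existing with correct ID",
]


def cross_tabulate(pairs):
    """Count units per (register 2 status, register 1 status) cell.

    Returns the 4x4 table, the row sums (0. to 3.), the column sums
    (.0 to .3) and the grand total (..).
    """
    counts = Counter(pairs)
    size = len(STATUSES)
    table = [[counts[(r2, r1)] for r1 in range(size)] for r2 in range(size)]
    row_sums = [sum(row) for row in table]
    col_sums = [sum(table[r2][r1] for r2 in range(size)) for r1 in range(size)]
    return table, row_sums, col_sums, sum(row_sums)


# Hypothetical evaluation data: one (register 2, register 1) pair per unit.
pairs = [(3, 3), (3, 3), (3, 2), (0, 3), (2, 3), (3, 3)]

table, row_sums, col_sums, total = cross_tabulate(pairs)

# Share of units identified correctly in both registers (cell 33).
share_correct_in_both = table[3][3] / total
print(table[3][3], row_sums[3], col_sums[3], total, share_correct_in_both)
```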
Statistical quality indicators
• Process accuracy – Statistical accuracy
• Quality of administrative data
• A single source
• A linked file of 2 sources
• A linked file of a large number of sources
• A small evaluation sample
• Deviation between the processed data value and the correct value
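
As an illustration of the last point, the deviation between processed and correct values could be summarised over a small evaluation sample roughly as follows. The three summary measures and the function name are assumptions made for this sketch; the slides do not specify which indicators are computed, and the sample values are invented.

```python
def deviation_indicators(sample):
    """Summarise deviations (processed minus correct) over an evaluation sample.

    `sample` is a list of (processed_value, correct_value) pairs.
    """
    deviations = [processed - correct for processed, correct in sample]
    n = len(deviations)
    correct_total = sum(correct for _, correct in sample)
    return {
        # Average signed deviation: shows systematic over- or under-statement.
        "mean_deviation": sum(deviations) / n,
        # Average absolute deviation: shows typical size of the error per unit.
        "mean_absolute_deviation": sum(abs(d) for d in deviations) / n,
        # Deviation of the processed total relative to the correct total.
        "relative_deviation_of_total": sum(deviations) / correct_total,
    }


# Hypothetical evaluation sample: (processed value, correct value) per unit.
sample = [(105.0, 100.0), (190.0, 200.0), (50.0, 50.0), (82.0, 80.0)]

indicators = deviation_indicators(sample)
print(indicators)
```

The signed and absolute measures are reported separately because errors that cancel out in a total can still be large at the level of individual records.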
Work to be done
• Improve the system for unit identification
• Develop the standard processing system
• Evaluation by controlled experiments
• Designing suitable evaluation samples
• Use of process data for process design