Administrative Data and their Use in Economic Statistics
Vladimir Markhonko
United Nations Statistics Division
12/7/2007
Contents
• Definitions
• Advantages of using administrative data
• Common problems
• Quality of administrative data
• Using administrative data in practice
• Conclusions
Narrow Definition
Data Sources
• Primary (Statistical)
• Secondary (Non-statistical)
  - Public Sector
  - Private Sector
Wider Definition
Data Sources
• Primary (Statistical)
• Secondary (Non-statistical)
  - Public Sector
  - Private Sector
Administrative sources are sources containing information which is not primarily collected for statistical purposes.
Reasons for this Definition
• Privatisation of some government functions
• Growth of private sector “value-added re-sellers”
• User interest in new types of data
Benefits of Administrative Data
• Cost
  - Surveys / censuses are expensive, administrative data are often “free”
• Response burden
  - Reduced burden on data suppliers
  - Statistics can be compiled more frequently with no extra burden
Benefits of Administrative Data
• Coverage
  - Full coverage of target population
  - No survey errors and lower non-response
  - Better small-area data
• Timeliness (sometimes!)
• Public image
  - Making use of existing data can enhance the prestige of a statistical organisation by making it seem more efficient
Population Census Costs 2000-2001
• UK, €367m, €6.2 per person
• Austria, €56m, €6.9 per person
• Finland, €0.8m, €0.2 per person
Source: Eurostat, Documentation of the 2000 round of Population and Housing Censuses in the EU, EFTA and Candidate Countries; Table 22
Common Problems
• Administrative units do not always coincide with statistical units
  - Conversion via automatic rules for simple cases
  - Profiling for more complex cases
    - Gives a better understanding of complex business structures
    - Expensive and needs trained staff
Common Problems
• Different definitions and classifications
  - Administrative and statistical priorities are often different
  - Conversion matrices needed for different classifications
• Timeliness
  - Data arrive too late
  - Data relate to a different time period
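The conversion matrices mentioned above can be illustrated with a minimal Python sketch. The classification codes (`ADM-A`, `STAT-1`, ...) and the shares are invented for illustration and do not come from any real classification; the idea is only that each administrative category is split across statistical categories by fixed shares.

```python
# Hypothetical conversion matrix: for each administrative category, the share
# of its total that belongs to each statistical category. All codes and
# shares below are invented for illustration.
CONVERSION = {
    "ADM-A": {"STAT-1": 1.0},
    "ADM-B": {"STAT-1": 0.25, "STAT-2": 0.75},
    "ADM-C": {"STAT-3": 1.0},
}


def convert(admin_totals, conversion=CONVERSION):
    """Redistribute totals by administrative class into statistical classes."""
    stat_totals = {}
    for admin_code, total in admin_totals.items():
        shares = conversion[admin_code]
        # Each row of the matrix should allocate the whole administrative total.
        assert abs(sum(shares.values()) - 1.0) < 1e-9, admin_code
        for stat_code, share in shares.items():
            stat_totals[stat_code] = stat_totals.get(stat_code, 0.0) + total * share
    return stat_totals


admin_totals = {"ADM-A": 100, "ADM-B": 40, "ADM-C": 10}
stat_totals = convert(admin_totals)
print(stat_totals)
```

Because every row sums to one, the grand total is preserved by the conversion; only its breakdown changes.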
VAT Birth Lags
[Chart: frequency (thousands) against lag in days]
VAT Birth Lags
• 2/3 of businesses are on the register within 2 months of start-up
• Mean lag = 4 months due to “outliers”
• Median = Approx. 40 days
• Some pre-register - negative lags
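The gap between the mean and the median lag can be reproduced with a small Python sketch. The lag values below are invented to show the effect and are not the VAT data behind the slide: a few very late registrations pull the mean far above the median, while negative values represent businesses that registered before start-up.

```python
from statistics import mean, median

# Invented lags in days between start-up and registration (not the source data).
# Negative values stand for businesses that registered before start-up.
lags_days = [-10, 5, 20, 30, 40, 40, 45, 55, 400, 900]

median_lag = median(lags_days)
mean_lag = mean(lags_days)
# Share of businesses registered within roughly two months of start-up.
share_within_60 = sum(1 for lag in lags_days if lag <= 60) / len(lags_days)

print(f"median = {median_lag} days, mean = {mean_lag} days")
print(f"share within 60 days = {share_within_60:.0%}")
```

With skewed lags like these, the median is the more representative summary of a typical registration delay.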
Common Problems
• Change
  - Risk of changes in government policy, thresholds, definitions, coverage etc.
  - Need contingency plans
• Data management from multiple sources
  - Matching / linking issues
  - Data conflicts: priority rules
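One way to read "priority rules" for data conflicts is as a ranking of sources per variable. The Python sketch below assumes such a ranking; the source names, the order chosen and the `resolve` helper are hypothetical and not taken from any actual register system.

```python
# Hypothetical ranking of sources per variable: earlier sources win.
# The order is an assumption made for illustration only.
PRIORITY = {
    "employment": ["survey", "paye", "vat"],
    "turnover": ["survey", "vat", "paye"],
}


def resolve(variable, values_by_source, priority=PRIORITY):
    """Return (source, value) from the highest-priority source with a value."""
    for source in priority[variable]:
        value = values_by_source.get(source)
        if value is not None:
            return source, value
    # No ranked source supplied a value for this variable.
    return None, None


# Two sources disagree on employment; the higher-ranked one is kept.
chosen = resolve("employment", {"vat": 12, "paye": 10})
print(chosen)
```

Keeping the winning source alongside the value makes it possible to audit later which rule produced each figure.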
Quality of Administrative Data
• There are many aspects to quality
• Administrative data will be better than survey data in some aspects but not in others
• It is important to look at overall quality
• Do the data meet the needs of users?
Three Aspects of Quality
• Quality of incoming data
• Quality of processing (matching, merging, ...)
• Quality of outputs - likely to be different to survey based outputs, but are they better?
Quality Measurement
• How to measure the quality of data from administrative sources?
  - Comparing sources
  - Quality check surveys
  - Knowledge of source (metadata)
  - Quality reports / templates
Quality Templates
Companies House Data
• Framework: Contract
• Frequency: Quarterly updates, continuous on-line access
• Timeliness: Good
• Quality: Good
• Delivery: CD-ROM / Internet
• Key content: Legal name, company number
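A template like the one above could be held as a small structured record so that every source is described by the same fields. The sketch below is one possible Python representation; the class name, the field names and the `missing_fields` helper are assumptions, while the values are those shown on the slide.

```python
from dataclasses import dataclass, fields


@dataclass
class QualityTemplate:
    """One record per administrative source (field names are assumed)."""
    source: str
    framework: str
    frequency: str
    timeliness: str
    quality: str
    delivery: str
    key_content: str


companies_house = QualityTemplate(
    source="Companies House Data",
    framework="Contract",
    frequency="Quarterly updates, continuous on-line access",
    timeliness="Good",
    quality="Good",
    delivery="CD-ROM / Internet",
    key_content="Legal name, company number",
)


def missing_fields(template):
    """List the template fields that have been left blank."""
    return [f.name for f in fields(template) if not getattr(template, f.name).strip()]


print(missing_fields(companies_house))
```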
Using Administrative Data
• Conversion to statistical concepts and definitions
• Linking / Matching
  - Exact Matching - linking records from two or more sources, often using common identifiers
  - Probabilistic Matching - determining the probability that records from different sources should match, using a combination of variables
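The two matching approaches can be contrasted in a short Python sketch. Everything here is illustrative: the record layout, the variable names and the weights are assumptions, and the score is a simplified weighted agreement count standing in for a full probabilistic model (such as Fellegi-Sunter), where weights would be estimated from the data rather than fixed by hand.

```python
def exact_match(records_a, records_b, key):
    """Pair records that share the same value of a common identifier."""
    index_b = {rec[key]: rec for rec in records_b if rec.get(key)}
    return [(rec, index_b[rec[key]]) for rec in records_a if rec.get(key) in index_b]


# Assumed agreement weights in points (maximum 10); not estimated from data.
WEIGHTS = {"name": 5, "postcode": 3, "activity": 2}


def _norm(value):
    """Normalise a value so that case and stray spaces do not block agreement."""
    return str(value or "").strip().lower()


def match_score(rec_a, rec_b, weights=WEIGHTS):
    """Sum the weights of the variables on which two records agree."""
    return sum(
        weight
        for var, weight in weights.items()
        if _norm(rec_a.get(var)) and _norm(rec_a.get(var)) == _norm(rec_b.get(var))
    )


vat = [
    {"company_number": "01234567", "name": "Acme Ltd", "postcode": "AB1 2CD", "activity": "retail"},
    {"company_number": None, "name": "Bolt & Co", "postcode": "EF3 4GH", "activity": "transport"},
]
register = [
    {"company_number": "01234567", "name": "ACME LTD", "postcode": "AB1 2CD", "activity": "retail"},
    {"company_number": None, "name": "bolt & co", "postcode": "EF3 4GH", "activity": "haulage"},
]

exact_pairs = exact_match(vat, register, "company_number")
# No common identifier for the second pair, so fall back to the score.
score = match_score(vat[1], register[1])
print(len(exact_pairs), score)
```

A pair scoring above a chosen threshold would be accepted as a link, and borderline scores would typically be sent for clerical review.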
UK Business Register
[Diagram: the Business Register and the elements linked to it]
• VAT
• PAYE
• Company registrations
• Dun and Bradstreet
• Survey inputs
• Satellite registers
• Geographic information systems
Satellite Registers
Examples of Satellite Registers
• Tourism - hotel register (category, number of beds)
• Transport - vehicle or ship register (type, capacity)
• Distributive trades - buildings register (building size, sales area)
Conclusions
• Administrative sources should be defined in the widest sense
• There are many benefits in using administrative data, particularly reduced costs
• There are problems when using administrative data, but usually someone has found a solution
Conclusions
• Most problems can be reduced by effective planning and detailed knowledge of the source
• The benefits are often greater than the costs
Thank you for your attention.