The Use of Administrative Sources for Statistical Purposes

The Use of Administrative
Sources for Statistical
Purposes
Steven Vale
United Nations
Economic Commission for Europe
Day 1
09.00 – 10.00
Introductions and overview of the
course
10.00 – 12.30
Introduction to administrative
sources - definitions, benefits and
quality considerations
14.00 – 15.30
Frameworks for the access and
use of administrative data
15.45 – 17.30
Frameworks in Finland and the
Medstat countries
Day 2
09.00 – 10.30
Common problems and solutions
10.45 – 12.30
Common problems and solutions
14.00 – 15.30
Presentations from participating
countries
15.45 – 17.30
Presentations from participating
countries and discussion
Day 3
09.00 – 10.45
Introduction to Matching
11.00 – 12.30
Presentations from participating
countries
14.00 – 15.45
Administrative sources in statistical
registers
16.00 – 17.30
Presentations from participating
countries and discussion
Day 4
09.00 – 10.30
Case Study – The use of
administrative sources in Finland
10.45 – 12.30
Case Study – The use of
administrative sources in Finland
14.00 – 15.00
International work in the field of
administrative sources
15.00 – 15.30
Questions and answers
15.45 – 17.00
Closing session, feedback and
evaluation
Format of the Course
• Presentations and case studies
– Course leaders
– Participants
• Group exercises and discussions
– Watch out for the orange screen!
• Time for questions
– but please ask if anything is unclear
Group Exercise
What are
Administrative
Sources?
In less than 20 words!
Gordon Brackstone,
Statistics Canada (1987)
Four distinguishing features of
administrative data
1. The agent that supplies the data
to the statistical agency and the
unit to which the data relate are
different (in contrast to most
statistical surveys);
2. The data were originally collected
for a definite non-statistical
purpose that might affect the
treatment of the source unit;
3. Complete coverage of the target
population is the aim;
4. Control of the methods by which
the administrative data are
collected and processed rests
with the administrative agency.
Eurostat ‘CODED’ Glossary:
An administrative source is the organisational
unit responsible for implementing an
administrative regulation (or group of
regulations) for which the corresponding
register of units and the transactions are
viewed as a source of statistical data.
Source: OECD and others, "Measuring the
Non-Observed Economy: A Handbook",
A Wider Definition?
First introduced in the Final Report of
the Eurostat internal Task Force on
Administrative Sources, 1997
Types of Data Source
Data Sources
• Primary (Statistical)
• Secondary (Non-statistical)
– Public Sector
– Private Sector
Narrow Definition
Data Sources
• Primary (Statistical)
• Secondary (Non-statistical)
– Public Sector
– Private Sector
Wider Definition
Data Sources
• Primary (Statistical)
• Secondary (Non-statistical)
– Public Sector
– Private Sector
Administrative sources
are sources containing
information which is not
primarily collected for
statistical purposes.
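The difference between the two definitions can be sketched as a small classification rule. This is only an illustration: it assumes the narrow definition covers secondary (non-statistical) sources in the public sector only, which is inferred from the diagram and the Eurostat glossary wording rather than stated outright, and the function and argument names are hypothetical.

```python
def is_administrative(purpose, sector, definition="wider"):
    """Return True if a source counts as administrative.

    purpose:    "statistical" (primary) or "non-statistical" (secondary)
    sector:     "public" or "private"
    definition: "narrow" or "wider" (assumed meaning: the narrow
                definition is limited to public-sector sources)
    """
    if purpose == "statistical":
        # Primary sources are never administrative under either definition.
        return False
    if definition == "narrow":
        return sector == "public"
    # Wider definition: any source not primarily collected for statistics.
    return True


# A store-card database held by a retailer (private, non-statistical).
print(is_administrative("non-statistical", "private", "narrow"))
print(is_administrative("non-statistical", "private", "wider"))
```

Under these assumptions the store-card database is excluded by the narrow definition and included by the wider one.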
Reasons for this definition
• Increasing privatisation of
government functions
• Growth of private sector data and
“value-added re-sellers”
• User interest in new types of data
Examples of
Administrative
Sources
• Tax data
- Personal income tax
- Value Added Tax (VAT)
- Business / profits tax
• Social security data
• Health / education records
• Registration systems for persons /
businesses / property / vehicles
• Identity cards / passports / driving
licences
• Electoral register
• Register of farms
• Local council registers
• Building permits
• Licensing systems e.g. television, sale
of restricted goods, import / export
• Published business accounts
• Internal accounting data
• Data held by private businesses:
- credit agencies
- business analysts
- utility companies
- telephone directories
- retailers with store cards etc.
Store Cards
In return for a few
benefits, users give
the stores a lot of data:
• Name, address, sex, age
• Family circumstances (e.g. baby products,
toys, pet food)
• Indicators of work status and income (time
of shopping and type of goods)
• Car ownership (petrol purchases)
The Benefits of Using
Administrative
Sources
Cost
• Surveys are expensive and a census is
even more so; data from administrative
sources are often “free”
• Fewer staff are needed to process
administrative data - no need for
response chasing.
Population census costs
2000-2001
• Austria, €56m, €6.90 per person
• UK, €367m, €6.20 per person
• Finland, €0.8m, €0.20 per person
Source: Eurostat – Documentation of the 2000 Round of Population and
Housing Censuses in the EU, EFTA and Candidate Countries; Table 22
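To make the size of the gap concrete, the per-person figures above can be compared directly. The short sketch below uses only the three per-person costs quoted on the slide; the function name is illustrative.

```python
# Per-person census costs in euros, as quoted on the slide.
cost_per_person = {"Austria": 6.90, "UK": 6.20, "Finland": 0.20}


def cost_ratio(country, baseline="Finland"):
    """Per-person cost of `country` as a multiple of the baseline country."""
    return cost_per_person[country] / cost_per_person[baseline]


for country in ("Austria", "UK"):
    ratio = cost_ratio(country)
    print(f"{country}: about {ratio:.1f} times Finland's per-person cost")
```

On these rounded figures, the Austrian and UK censuses cost roughly 34.5 and 31 times as much per person as the Finnish one.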
Response Burden
• Using administrative sources:
– Reduces the burden on data suppliers
– Allows statistics to be compiled more
frequently with no extra burden
• Data suppliers complain if they are
asked to provide the same information
many times by different government
departments
Coverage
• Administrative sources usually offer
better coverage of target populations,
and can make statistics more accurate:
– No survey errors
– No (or low) non-response
• Better coverage gives:
– Better small-area data
– More detailed information
Timeliness
• Producing statistics from administrative
sources can sometimes be quicker
than using surveys
• No need for:
– forms design;
– pilot surveys;
– sample design etc.
Public Image
• Making more use of existing data
can enhance the prestige of a
statistical organisation by making it
seem more efficient
• The concept of “Joined-up
government” is politically appealing
Group Discussion
What are the actual and
potential benefits of
using administrative
sources in your
countries?
Quality and
Administrative
Sources
Quality
• Are data from administrative sources
as good as data from surveys?
• Who should judge this?
• How can we measure quality?
• How should we report and
communicate quality?
Definition of Quality
International Standard
ISO 9000:2005 defines quality as:
‘The degree to which a set of
inherent characteristics
fulfils requirements.’
What does this mean?
• Whose requirements?
– The user of the goods or services
• A set of inherent characteristics?
– Users judge quality against a set of
criteria concerning different
characteristics of the goods or services
• Therefore, quality is all about providing
goods and services that meet the
needs of users (customers)
Quality Components
• Different statistical organisations use
different lists of components
- but all lists are quite similar
• UNECE list:
– Relevance
– Accuracy
– Timeliness
– Punctuality
– Accessibility
– Clarity
– Comparability
Relevance
• Are the statistics that are produced
needed?
• Are the statistics that are needed
produced?
• Do the concepts, definitions and
classifications meet user needs?
Accuracy
• The closeness of statistical
estimates to true values
• In the past: quality = accuracy
• Administrative sources can help to
improve accuracy by removing
survey errors
Timeliness
• The length of time between data
being made available and the event
or phenomenon they describe
Punctuality
• The time lag between the actual
delivery date and the promised
delivery date
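Timeliness and punctuality are easy to confuse, so a small worked example may help. The sketch below treats both as differences between dates; the dates and function names are hypothetical and chosen only for illustration.

```python
from datetime import date


def timeliness_days(reference_period_end, release_date):
    """Days between the end of the period described and the release."""
    return (release_date - reference_period_end).days


def punctuality_days(promised_date, actual_date):
    """Days between the promised and the actual delivery date.

    Zero means on time; a positive value means the release was late.
    """
    return (actual_date - promised_date).days


# Hypothetical release of statistics for the year ending 31 December 2008.
reference_end = date(2008, 12, 31)
promised = date(2009, 3, 16)
actual = date(2009, 3, 20)

print(timeliness_days(reference_end, actual))   # 79
print(punctuality_days(promised, actual))       # 4
```

Here the statistics appeared 79 days after the period they describe (timeliness) and 4 days after the promised date (punctuality).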
Accessibility
• The ways in which users can obtain or
benefit from statistical services
(pricing, format, location, language
etc.)
Clarity
• The availability of additional material
(e.g. metadata, charts etc.) to allow
users to understand outputs better
Comparability
• The extent to which differences are
real, or due to methodological or
measurement differences
– Comparability over time
– Comparability through space (e.g.
between countries / regions)
– Comparability between statistical domains
(sometimes referred to as coherence)
Other Considerations
• Cost / efficiency
• Integrity / trust
• Professionalism
– Adherence to international standards
(e.g. UN Fundamental Principles of
Official Statistics)
Quality Measurement
• How can we measure the quality of
data from administrative sources?
• There are established methods for
measuring the quality of survey data,
but these are not always relevant for
administrative data
Three Aspects of Quality
• To understand the quality of
administrative sources we need to
consider:
– Quality of incoming data
– Quality of processing
– Quality of outputs
Incoming Data
• Timeliness
• Completeness – are there any
missing units or variables?
• Comparability with other sources
• Quality check survey?
• Knowledge of the source is vital!
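The completeness check on incoming data could be sketched as follows. This is a minimal illustration, assuming each record is a dictionary with a `unit_id` key and that a list of expected units (for example from a statistical register) is available; the variable names and sample values are hypothetical.

```python
def completeness_report(records, expected_ids, required_vars):
    """Return (missing unit ids, share of records missing each variable)."""
    received_ids = {r["unit_id"] for r in records}
    # Units we expected but did not receive at all.
    missing_units = sorted(set(expected_ids) - received_ids)

    missing_rate = {}
    for var in required_vars:
        # A variable counts as missing if it is absent, None or empty.
        n_missing = sum(1 for r in records if r.get(var) in (None, ""))
        missing_rate[var] = n_missing / len(records) if records else 0.0
    return missing_units, missing_rate


# Hypothetical delivery from an administrative source.
records = [
    {"unit_id": "A1", "turnover": 120.0, "employees": 4},
    {"unit_id": "A2", "turnover": None, "employees": 10},
    {"unit_id": "A4", "turnover": 75.5, "employees": None},
]
missing_units, missing_rate = completeness_report(
    records, ["A1", "A2", "A3", "A4"], ["turnover", "employees"]
)
print(missing_units)
print(missing_rate)
```

In this example unit A3 is missing altogether, and one record in three lacks each of the two variables.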
Processing
• Quality of matching / linking
• Outlier detection and treatment
• Quality of data editing
• Quality of imputation
• Keep raw data / metadata to refer
back to if necessary
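As an illustration of the outlier detection step, the sketch below flags values using the modified z-score based on the median and the median absolute deviation (MAD). This is one common rule, not one the course prescribes; the 3.5 threshold is a conventional default and the sample values are hypothetical.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread around the median: the score is undefined, flag nothing.
        return []
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical employee counts reported for similar businesses.
employees = [10, 12, 11, 13, 12, 11, 250]
print(flag_outliers(employees))  # [250]
```

Flagged values would then go to the treatment step (checking against the raw data, editing or imputation) rather than being dropped automatically.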
Outputs
• Are the users satisfied?
• Are the outputs comparable with
data from other sources?
• What is the impact on time series?
• Are the outputs cost-effective?
• Quality reports to measure and
communicate differences?
Quality Reports
• Formats proposed by Eurostat Quality
Working Group for data from:
– A single source
– Combined sources
See Eurostat paper: “Quality assessment of
administrative data for statistical purposes”
Metadata
• Knowledge and documentation of
the source is vital to help us to
understand quality:
– How the data are collected
– Why they are collected
– How they are processed
– Concepts and definitions used
– etc…
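One way to keep this documentation in a consistent form is a simple metadata record per source, with a check for items that have not yet been filled in. The sketch below is illustrative only: the class, its field names and the example values are hypothetical, and a real record would hold more than the four items listed above.

```python
from dataclasses import dataclass, fields


@dataclass
class SourceMetadata:
    source_name: str
    how_collected: str = ""
    why_collected: str = ""
    how_processed: str = ""
    concepts_and_definitions: str = ""

    def undocumented(self):
        """Names of the fields that are still empty."""
        return [
            f.name for f in fields(self)
            if not getattr(self, f.name).strip()
        ]


# Hypothetical, partly documented source.
vat = SourceMetadata(
    source_name="VAT returns",
    how_collected="Periodic returns filed by registered businesses",
    why_collected="Tax assessment",
)
print(vat.undocumented())  # ['how_processed', 'concepts_and_definitions']
```

Listing the empty fields gives a quick view of what still needs to be asked of the administrative agency.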
Group Discussion
Experiences of quality
measurement in
practice?