Quality reporting
in a short-term business survey
based on administrative data
M. Carla Congia
Fabio Rapiti
ISTAT - Italy
European Conference on Quality in Official Statistics
Session on Quality reporting
Rome, 8-11 July 2008
Outline
• The Italian Oros Survey
• Quality issues in using administrative data
• Peculiarities of data quality assessment
• Oros quality indicators and reporting
• Final remarks
Q2008 - Rome, 8-11 July 2008
The Oros Survey
• Since 2003 the Oros survey has released quarterly indicators on gross
wages and total labour cost per FTE, covering enterprises of all sizes in the
private non-agricultural sector (sections C to K of NACE Rev. 1.1)
• Based on extensive use of administrative data (National Social
Security Institute - INPS) combined with survey data on large firms with
more than 500 employees (Monthly Large Enterprise Survey)
• Provisional estimates based on the “provisional population” are
released with a 70-day delay
• Final estimates are produced after 5 quarters on the basis of the
“whole population” and complete updated information
• Also meets the requirements of the European regulations:
  • STS - Short-Term Statistics
  • LCI - Labour Cost Index (hourly labour cost index)
The administrative source
National Social Security Institute - INPS
All Italian firms in the private sector with at least one employee have to
pay monthly social security contributions to INPS (roughly 1.3 million
employers and 12 million employees)
DM10 form
The Monthly Declaration is a highly detailed grid where information on
total employment, wage-bills, paid days, overtime hours and social
contributions is identified by specific administrative codes (about 5,000
valid codes)
Each DM10 is spread over several records (8 on average)
Data capturing
• Every month each firm transmits the DM10 to INPS in electronic format,
no later than 30 days after the reference period
• The complete raw declarations are then forwarded to Istat 35 days after
the end of the reference period (about 10 million records each month)
The strategy for exploiting administrative data
A constraint became an opportunity
At first INPS could not aggregate the DM10 data in the format required
for Oros purposes within the very tight schedule. So the Istat
strategy became
“Catch what you can” “as quick as you can”
moving from a typical “one collection for one single output/product” approach
to a focus on the “whole data source”: the wage and contribution system
Advantages
• microdata are exactly those sent by firms and this allows a more direct control of
the aggregation/translation process
• a lot of information available for many other different statistical purposes
Disadvantages
• a complex preliminary phase of checks and computation inside the single DM10
to get to the target variables at micro level
• a lot of data not necessarily useful for short-term objectives
Quality issues in using INPS administrative data
The Oros challenge is to produce short-term indicators by processing
• a huge quantity of very detailed microdata
• on a very tight schedule
• coping with the frequent changes in the basic INPS metadata:
enterprises have to use the DM10 form to take advantage of labour cost
reduction policies, and these contribution laws change continuously
After preliminary studies, INPS data were considered suitable for Oros
purposes, but statisticians still have no quarterly ex-ante control
over the quality of the raw administrative data
Only a complex quality-oriented production process can assure
ex-post quality, coping with unusual problems
Quality issues in using INPS administrative data
Diagram boxes, in the order they appear on the slide:
• Fragmented and insufficient INPS metadata
• In-house metadata database
• Highly disaggregated raw data
• Preliminary checks and accurate translation into statistical variables
• Integration with LE Survey data
• Checks to avoid double counting
• Continuous legislation changes
• Final key checks macroediting
Peculiarities of data quality assessment
For the quality assessment of administrative data, Eurostat
recommends producing two reports: a source-specific one and a product-specific one
In the Oros case the non-conventional use of administrative data implies
that the two reports overlap, while new approaches to administrative
data quality assessment are explored empirically
Oros practice has been developed trying:
• to find better tools to assess quality
• to manage the measurement of rather new indicators on:
  • efficient and stable data capturing
  • completeness and consistency of metadata
  • stable translation/retrieval of target statistical variables
  • correct integration with LE survey data
• to produce quality indicators every quarter along the whole production
process
• to meet both Istat and Eurostat requests on quality reporting
Oros quality reporting: an overview
[Overview chart: reporting tools by target group and frequency, on an axis from + PROCESS to + PRODUCT]
Target group
• Oros producers
• Internal users (of micro and macro data)
• Top managers and central quality managers
• External expert users (Eurostat, IMF, ECB, ...)
• General public
Frequency
• Once
• Annually
• Quarterly
Reporting tools
• Survey Documentation and Methodological Handbook
• Metadata in SDDS
• Oros PR explanatory notes
• SIDI information system for survey documentation
• LCI Quality Report
• Oros Process Monitoring Report
• Quarterly LCI meta information
• Istat Quality Report
Survey Documentation and Methodological Handbook
Initial basic quality assessment of the INPS administrative source to
evaluate the suitability for the production of quarterly labour market
indicators
• Concepts and definitions of variables and population
• Translation scheme of administrative information into statistical
variables
• Coverage
• Reference time
• Accuracy
• Stability over time
It also contains the survey methods and
a description of the whole production process
Metadata in SDDS format
Metadata in Special Data Dissemination Standard (SDDS) format, used to deliver
information to the IMF
• Base page: data, access by the public, integrity and quality
• Summary methodology statements: key features enabling users to
assess the suitability of the data for their purposes
• totally qualitative and compiled once: it is updated following
relevant changes in the methodology
• compiled for the 3 outputs (Oros, LCI, STS) and different users:
ConIstat (the short-term indicators' time series database on the Istat website) and Eurostat
• efforts to systematize
Process Monitoring Report 1
Quantitative indicators to keep quality continuously under control and
improve it along the whole Oros production process
Some of them are also warning indicators: they signal serious problems
or detect sources of error
Main quality indicators for some key steps of the process:
Data capturing
• Number of monthly records
• Number of DM10 forms
• Time lag between scheduled and actual delivery dates
Metadata database updating
• Date of last update of DM10 metadata on the INPS website
• Number of new and expired DM10 codes by type
• Rate of new DM10 codes to include/exclude
• Number of official INPS acts to analyse
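The data capturing and metadata updating indicators listed above could be computed along the following lines. This is only a minimal sketch: the function names are hypothetical, and the slide does not define the denominator of the "rate of new DM10 codes", so the sketch assumes it is the number of codes valid in the current period.

```python
from datetime import date


def delivery_lag_days(scheduled, actual):
    """Time lag in days between scheduled and actual delivery (positive = late)."""
    return (actual - scheduled).days


def code_turnover(previous_codes, current_codes):
    """Count new and expired codes between two snapshots of the metadata.

    Assumption: the rate of new codes is taken over the codes valid in the
    current snapshot; the source does not specify the denominator.
    """
    previous_codes = set(previous_codes)
    current_codes = set(current_codes)
    new = current_codes - previous_codes
    expired = previous_codes - current_codes
    new_rate = len(new) / len(current_codes) if current_codes else 0.0
    return {"new": len(new), "expired": len(expired), "new_rate": new_rate}


if __name__ == "__main__":
    # Hypothetical dates and codes, for illustration only.
    print(delivery_lag_days(date(2008, 5, 5), date(2008, 5, 8)))
    print(code_turnover({"A", "B", "C"}, {"B", "C", "D", "E"}))
```

A positive lag or a jump in the share of new codes would then act as a warning indicator for the quarter.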
Process Monitoring Report 2
Preliminary checks on administrative data
• DM10 codes error rate = Number of impossible codes / Total number of codes
• DM10 codes edit rate = Number of codes changed by editing / Number of impossible codes
• Rate of duplicate units = Number of duplicate units / Total number of units
Micro editing
• Edit rate = Number of units edited / Total number of units in scope for the item
• Total contribution to key estimates from edited values = Total weighted quantity for edited values / Total weighted quantity for all final values
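The ratios defined above are simple to compute once the counts are available. The following is a minimal sketch with hypothetical function and argument names; the record layout `(weight, final_value, was_edited)` is an assumption made for illustration, not the Oros data structure.

```python
def ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def preliminary_check_rates(total_codes, impossible_codes, edited_codes,
                            total_units, duplicate_units):
    """Rates from the preliminary checks on the administrative data."""
    return {
        "code_error_rate": ratio(impossible_codes, total_codes),
        "code_edit_rate": ratio(edited_codes, impossible_codes),
        "duplicate_rate": ratio(duplicate_units, total_units),
    }


def edited_contribution(records):
    """Share of the weighted total that comes from edited values.

    records: iterable of (weight, final_value, was_edited) tuples.
    """
    records = list(records)
    total = sum(w * v for w, v, _ in records)
    edited = sum(w * v for w, v, was_edited in records if was_edited)
    return ratio(edited, total)


if __name__ == "__main__":
    # Made-up counts and values, for illustration only.
    print(preliminary_check_rates(1000, 20, 15, 200, 5))
    print(edited_contribution([(1.0, 100.0, False),
                               (2.0, 50.0, True),
                               (1.0, 200.0, False)]))
```

The micro editing edit rate has the same form: `ratio(units_edited, units_in_scope)`.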
Process Monitoring Report 3
Integration with LE survey data
• Number of units manually checked due to record linkage
problems (i.e. mergers or split-ups recorded at different
times)
Macroediting
• Number of suspicious aggregates identified automatically by
TERROR or through graphical checks
• Number of outliers treated at micro or macro level
• Total contribution to the estimates from treated values
• Length of the homogeneous time series
Istat Quality Report
Still experimental
Oros has been involved in the pilot test
• quality indicators within a framework of a qualitative report coherent
with Eurostat quality components
• disseminated within the System on Quality (SIQual) available on
the Istat website
• external-user oriented
• subset of standard quality indicators appropriately chosen among
those available from the Information System for Survey Documentation
(SIDI):
  • Response Rate
  • Indicators on the Revision policy (MR, MAR)
  • Timeliness for provisional data release
  • Timeliness for definitive data release
  • Length of the homogeneous time series
• description of non-sampling error, relevance, accessibility
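The revision indicators MR and MAR are not spelled out on the slide. Assuming they stand for the mean revision and the mean absolute revision between provisional and final estimates, which is the usual reading, a minimal sketch of their computation could look like this (function names and figures are hypothetical):

```python
def _revisions(provisional, final):
    """Revisions (final minus provisional) for matching reference periods."""
    provisional, final = list(provisional), list(final)
    if not provisional or len(provisional) != len(final):
        raise ValueError("series must be non-empty and of equal length")
    return [f - p for p, f in zip(provisional, final)]


def mean_revision(provisional, final):
    """MR: average signed revision; a value far from zero suggests bias."""
    revisions = _revisions(provisional, final)
    return sum(revisions) / len(revisions)


def mean_absolute_revision(provisional, final):
    """MAR: average size of the revisions, regardless of sign."""
    revisions = _revisions(provisional, final)
    return sum(abs(r) for r in revisions) / len(revisions)


if __name__ == "__main__":
    # Made-up growth rates (per cent) for four quarters, for illustration only.
    provisional = [2.0, 3.0, 1.0, 4.0]
    final = [2.5, 2.5, 1.5, 4.0]
    print(mean_revision(provisional, final))
    print(mean_absolute_revision(provisional, final))
```

Reporting both is useful because revisions of opposite sign cancel out in MR but not in MAR.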
LCI Quality Report
Required by Eurostat to evaluate the quality of the national LCIs used to
produce the European aggregate index
LCI was established with a
“harmonization of output” and not a “harmonization of input” approach
• since 2004 the LCI QR has been produced annually
• standard structure based on the Eurostat dimensions of quality with a
further aspect: “completeness”
• main standard quality indicators used:
  • Revision policy (MR, MAR)
  • Timeliness for provisional data release
• description of the method for compiling hours worked (LCI denominator)
Quarterly LCI meta information
• Standard template, mainly qualitative, release-specific
• Changes in the labour market (collective agreements, laws) which
have an impact on wages and labour cost
• Reasons for revisions in NSA, WDA and SA data
Final remarks
The innovative quarterly use of administrative data in Oros makes it
necessary to monitor peculiar aspects of quality not usually taken into
consideration in the standard quality assessment approach
suggested by Eurostat
Several specific indicators to assess the quality of the process, in
particular the metadata updating and the translation/aggregation of raw
INPS data, have been implemented, but they need to be further
systematized
These specific indicators are essential from the producer's point of view,
but they could also be used to report to users on the quality of some
key issues
On the other hand, the Oros survey satisfies the internal (SIDI, SIQual)
and external (Eurostat) requests for standard quality reports
A better integration of all the reviewed quality reporting tools is desirable
but only partially achievable