SESADP Structure of Earnings Survey – Administrative Data
National Employment Survey Unit
Methodology Division, CSO
Project Team:
Kevin McCormack, Dr. Mary Smyth, Sinead Phelan, Ann O’Dwyer
Overview
Structure of Earnings Survey
EU Regulation – 4 years
→ met by National Employment Survey (NES)
Microdata: 60,000 employees - Annual & Hourly earnings; Hours worked:
Age
Gender
Education
Occupation
NACE
Full/part-time
Nationality
Length of Service
EU Annual Earnings, GPG
National Earnings Statistics
RMFs
NES Publication
Example of Tables
Mean hourly earnings in October 2007 by educational attainment, full/part-time status and sex

Level of Educational         Male                  Female                Total
Attainment                   Full-time  Part-time  Full-time  Part-time  Full-time  Part-time
                             €          €          €          €          €          €
Primary or Lower Secondary   17.62      13.15      14.78      13.06      16.88      13.08
Higher Secondary             18.68      12.44      16.36      14.56      17.78      14.15
Post Leaving Cert            20.00      13.31      15.91      15.11      18.89      14.80
Third level non-degree       23.06      14.75      19.37      17.20      21.02      16.90
Degree or higher             31.44      20.06      27.20      23.07      29.18      22.47
Total                        21.69      14.11      20.42      15.69      21.17      15.40
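As an illustration of how the GPG mentioned earlier relates to a table like this one, the sketch below applies the standard unadjusted gender pay gap formula (the male minus female mean hourly earnings, as a percentage of the male figure) to the Total row. The function name is hypothetical, and splitting the gap by full-time and part-time status is only an assumption made for illustration; the published GPG is calculated over all employees.

```python
def unadjusted_gpg(male_hourly, female_hourly):
    """Unadjusted gender pay gap as a percentage of male mean hourly earnings."""
    return (male_hourly - female_hourly) / male_hourly * 100


# Total row of the October 2007 table (mean hourly earnings in euro).
total_row = {
    "Full-time": {"male": 21.69, "female": 20.42},
    "Part-time": {"male": 14.11, "female": 15.69},
}

gaps = {
    status: unadjusted_gpg(values["male"], values["female"])
    for status, values in total_row.items()
}

for status, gap in gaps.items():
    print(f"{status}: {gap:.1f}%")
```

A negative value means that women's mean hourly earnings exceed men's in that group, as is the case for the part-time figures here.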
SES - ADP
Structure of Earnings Survey - Administrative Data Project
Project Goal:
2011 & 2012 Annual Earnings Data required - EU & Nationally
Administrative Data
Response Burden, Cost Effective, Quality, Representative
NES Annual Publication
Roll-out Infrastructure:
2013
SES 2014
5 Modules
1) Research & Identify Potential Sources – ADS
2) Linking Data Sources
3) Modelling non-available characteristics
4) Construction of the SESADS
5) Publish Results
(M1) Research & Identify ADS
7 Administrative Data Sources
2 External
Revenue P35L
Dept. Social Protection
5 CSO
Census
EHECS
SILC
CBR
QNHS
Fig. 1: SESADS primary data sources (DSP, P35L, CBR, QNHS, EHECS, SILC and COP, all feeding into the SESADS)
(M2) Linking Data Sources
An analysis was undertaken of the data fields contained
within the SESADS sources.
Unique Identifiers:
Per_IdNo. (anonymised PPS No.) - employees
Ent_nbr (unique Enterprise Number) - employers
Most suitable unique identifiers (UI) to link:
CSO’s data sources,
DSP and
Revenue Commissioners P35L data files
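A minimal sketch of this kind of linkage is given below, assuming each source has already been read into plain Python structures. Only Per_IdNo and Ent_nbr come from the slides; every other field name and all of the sample values are hypothetical, and the real linkage will involve far more validation than this.

```python
# Hypothetical miniature extracts; only Per_IdNo and Ent_nbr are real field names.
p35l = [
    {"Per_IdNo": "A001", "Ent_nbr": "E10", "gross_annual_earnings": 42000, "weeks_worked": 52},
    {"Per_IdNo": "A002", "Ent_nbr": "E11", "gross_annual_earnings": 18500, "weeks_worked": 30},
]
cbr = {  # enterprise register, keyed on Ent_nbr
    "E10": {"nace": "85", "size_group": "250+"},
    "E11": {"nace": "47", "size_group": "10-49"},
}
dsp = {  # person-level demographics, keyed on Per_IdNo
    "A001": {"gender": "F"},
    "A002": {"gender": "M"},
}


def link_records(p35l_rows, cbr_by_ent, dsp_by_person):
    """Attach enterprise and person attributes to each P35L employment record."""
    linked = []
    for row in p35l_rows:
        merged = dict(row)
        # Employer attributes are looked up through the enterprise number.
        merged.update(cbr_by_ent.get(row["Ent_nbr"], {}))
        # Employee attributes are looked up through the anonymised person id.
        merged.update(dsp_by_person.get(row["Per_IdNo"], {}))
        linked.append(merged)
    return linked


sesads = link_records(p35l, cbr, dsp)
```

Records with no match in a lookup source simply keep their P35L fields, which makes unmatched cases easy to count afterwards.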
Fig. 2: Construction of the SESADS
Sources feeding the SESADS, with linking keys and contributed fields:
COP/QNHS/SILC: Per_IdNo & ICA; Occupation, NACE, Demographics, Education, Earnings
DSP: Per_IdNo.; Demographics
CBR/EHECS: Per_IdNo., Ent_nbr; Enterprise location, Size and NACE
P35L: Per_IdNo., Ent_nbr; Gross annual earnings, Weeks worked
Identity Correlation Approach (1)
Census
No Unique Identifier
Linking social data sources (Census) is a greater challenge for the CSO.
These sources carry no Unique Identifiers (UIs), such as a PPS No.
UIs were therefore developed by following an identity correlation approach (ICA),
e.g. combining date of birth, gender, county of residence and NACE.
E.g. 29101990|F|CORK|85|
This identity correlation approach enabled the social data sources to be linked
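The key construction described on this slide can be sketched as follows. The function and field names are hypothetical, as is the rule of linking only keys that are unique on both sides; the slides do not say how ambiguous keys are handled.

```python
from collections import Counter
from datetime import date


def ica_key(dob, gender, county, nace):
    """Build a key in the slide's format, e.g. 29101990|F|CORK|85|."""
    return f"{dob:%d%m%Y}|{gender.upper()}|{county.upper()}|{nace}|"


def row_key(row):
    return ica_key(row["dob"], row["gender"], row["county"], row["nace"])


def link_on_ica(left_rows, right_rows):
    """Link rows whose ICA key is unique on both sides; ambiguous keys are skipped."""
    left_counts = Counter(row_key(r) for r in left_rows)
    right_counts = Counter(row_key(r) for r in right_rows)
    right_by_key = {row_key(r): r for r in right_rows}
    linked = []
    for row in left_rows:
        key = row_key(row)
        if left_counts[key] == 1 and right_counts[key] == 1:
            linked.append({**right_by_key[key], **row, "ica_key": key})
    return linked


# Hypothetical sample records.
admin = [{"dob": date(1990, 10, 29), "gender": "F", "county": "CORK",
          "nace": "85", "Per_IdNo": "A001"}]
census = [{"dob": date(1990, 10, 29), "gender": "F", "county": "Cork",
           "nace": "85", "occupation": "Teacher"}]

linked = link_on_ica(admin, census)
```

Skipping keys that occur more than once trades coverage for accuracy: a person is linked only when the combination of attributes points to exactly one record in each source.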
SESADS:
Currently contains 1 million of the approx. 1.3 million full/part-time
employees in the State
800,000 records quality checked
Representative of the NACE sectors
Identity Correlation Approach (2)
Annual births in year of birth (YoB) = 63,000
DoB: 63,000 / 365 days = 173
Gender: ÷ 2 = 86
NACE: ÷ 14 = 6 (17)
County: ÷ 26 (3) = 1 (5)
E.g. 29101990|F|85|CORK|
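The arithmetic on this slide can be reproduced with a short sketch. It assumes that people are spread evenly across days of the year, genders, NACE sectors and counties, which real data will not satisfy exactly, so the results are rough expected counts rather than guarantees of uniqueness. The function name is hypothetical.

```python
def expected_matches(annual_births, divisors):
    """Expected number of people sharing a key after each attribute is added."""
    remaining = float(annual_births)
    steps = []
    for label, n_categories in divisors:
        # Each extra attribute splits the remaining group evenly (uniformity assumed).
        remaining /= n_categories
        steps.append((label, remaining))
    return steps


steps = expected_matches(
    63_000,
    [("DoB", 365), ("Gender", 2), ("NACE", 14), ("County", 26)],
)

for label, value in steps:
    print(f"{label:<7} {value:8.2f}")
```

Under these assumptions the final step falls below one person per key, which the slide shows as 1.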
On completion of Module 2, the SESADS will contain all employees in
the State, with gross annual/weekly earnings classified by:
Variables                                        Sources
NACE, Gender, Enterprise size group,             P35L, CBR, EHECS, DSP
Public/Private sector, Weeks worked
------------------------------------------------------------------------
Occupation, Area of residence, Education,        COP, QNHS, SILC
Age, Nationality
Module 3: Modelling of non-available
characteristics
Employee characteristics to be modelled are:
(1) Hours Worked
(2) Annual bonuses
(3) BIK (benefit in kind)
(4) Full/part-time employment status
A multiple imputation methodology will be employed to carry out this stage of
the Project.
EHECS, QNHS and SILC data sources will be leveraged to provide the base
information.
Once this model is completed, the SESADS will fulfil both the annual
and the 4-yearly Eurostat SES earnings requirements.
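The slides do not specify the imputation model, so the following is only a sketch of one simple multiple imputation technique: an approximate Bayesian bootstrap hot deck for missing hours worked, with the mean pooled across the completed datasets using Rubin's rules. All names and sample values are hypothetical, and a production model would draw donors within classes built from EHECS, QNHS and SILC rather than from a single pool.

```python
import random
import statistics


def abb_impute(values, rng):
    """Return one completed copy of `values` via the approximate Bayesian bootstrap."""
    observed = [v for v in values if v is not None]
    # Step 1: resample the donor pool with replacement (reflects parameter uncertainty).
    donors = rng.choices(observed, k=len(observed))
    # Step 2: fill each gap with a random draw from that resampled pool.
    return [v if v is not None else rng.choice(donors) for v in values]


def pool_means(completed_sets):
    """Pool the mean across completed datasets using Rubin's rules."""
    m = len(completed_sets)
    estimates = [statistics.fmean(d) for d in completed_sets]
    # Within-imputation variance: average sampling variance of the mean.
    within = statistics.fmean([statistics.variance(d) / len(d) for d in completed_sets])
    # Between-imputation variance: spread of the estimates across imputations.
    between = statistics.variance(estimates)
    total_variance = within + (1 + 1 / m) * between
    return statistics.fmean(estimates), total_variance


rng = random.Random(2011)  # fixed seed so the sketch is repeatable
hours = [39.0, 37.5, None, 20.0, None, 39.0, 18.0, None, 40.0, 35.0]

completed = [abb_impute(hours, rng) for _ in range(5)]
pooled_mean, pooled_variance = pool_means(completed)
```

Creating several completed datasets rather than one lets the pooled variance reflect the uncertainty introduced by the imputation itself.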
Module 4: Construction of the SESADS
The SESADS will be constructed in the CSO’s
Administrative Data Centre (ADC)
Structures (known as layers) consistent with those
outlined in the ESSnet on microdata linking and data
warehousing in statistical production.
SES – EU microdata format
Module 5: Publication of Results
The first set of SES statistics for 2011 and 2012 (gender pay
gap and average earnings) were submitted to Eurostat in
November 2013.
Finalised datasets with more detail will be available mid-2015
NES Publication
Timetable
SESADP – signed off 2015
SES 2011 & 2012 Data
NES Publication
Roll out Project infrastructure
for 2013 & 2014 data
Assess by end 2015
SES 2014
Microdata submitted to Eurostat mid-2016
Cost Benefit Analysis
               Business Survey        vs    SESADP
Staff          15 persons                   1 FTE (3.5)
Cost           € 1.5 million                € 0.1 m (€0.2m)
Timeliness     T+ 18 months                 T+ 10 months
Quality        Data edits                   Revenue data
Sample         70K                          1 million
Burden         10K Ents, 70K ees            None
Thanks to:
CSO Divisions: Cork
Dublin
STS – cross division support
EHECS
ADC
CENSUS
Earnings Analysis
CBR
QNHS
SILC
IT
Etc.