An attempt to identify grey employment – Estimation of

SEBA – IE CASS – IEHAS
Economics of Crisis, Education and Labour
Chinese - Hungarian International Conference
30th June -1st July 2011, Budapest
An attempt to identify grey employment
Estimation of wage under-reporting
and tests of the predictions
Péter Elek - János Köllő – Balázs Reizer - Péter A. Szabó
Eötvös Loránd University, Institute of Economics, Central European University, Reformed Presbyterian Church of Central and Eastern Europe
We try to identify cases where total remuneration consists of a reported
minimum wage (MW) and an unreported ‘envelope wage’. We do so in 3 steps:
(i) Estimate a double hurdle (DH) model of the wage distribution, which
takes into account:
    (a) the crowding of low-productivity workers at the MW (truncation)
    (b) reporting of MW instead of the full wage (tax evasion)
(ii) Relying on the DH results:
    (a) we estimate the probability that a MW earner is paid an ‘envelope wage’
    (b) simulate the ‘genuine’ wages of MW earners
    (c) classify MW earners and their firms as ‘cheaters’ and ‘non-cheaters’
(iii) Test if our DH estimates have predictive power
We look at strong exogenous shocks (Hungary 2001-2, 2007) to which
cheaters and non-cheaters were expected to respond differently
Test 1: Doubling of the MW (2001-2002)
The expectation is that under-reporting contained the growth of labor costs,
so the MW shock had a weaker effect on cheating firms
Test 2: Introduction of a minimum contribution base = 2MW (2007)
Main rule: firms had to pay 2MW contribution even for wages lower than
2MW
Firms were allowed to pay w<2MW but they faced a high risk of audit. Firms
continuing to pay MW faced particularly high risk.
Cheating firms had an incentive to raise the reported wages of their ‘disguised’
MW earners (up to paying them an official wage of 2MW)
Furthermore, we expect that cheating firms were adversely affected by the
reform so their output and employment fell
Motivation (1): Scarce results on „fake” MWs
Ample anecdotal evidence of „fake” MWs but scarce results on their magnitude
and distribution across sectors, occupations, firm size, etc.
Inspection of aggregate, country-level data
Share of MW earners versus the Kaitz-index
Size of the spike at the MW is correlated with estimated size of the informal
economy (Tonin 2006)
Survey-based evidence
Turkey (Erdogdu 2008), Baltic states (Masso & Krillo 2009, Meriküll & Staehr
2008, Kris et al. 2007), EU (Eurobarometer 2007)
Indirect evidence from gap between reported income and consumption
Benedek et al. (2006): consumption fell in high-income households with MW
earners during the large hikes in Hungary
Tonin (2007): food consumption fell in households affected by the hikes
compared to unaffected households of similar income
Motivation (2): policy relevance
• Mostly in CEEs, MW policies are strongly influenced by the
belief that ‘nearly all’ MW earners are paid envelope wages
• Governments are tempted to whiten the grey economy by
raising the MW and/or increasing the tax burden on low
wages (Bulgaria 2003, Croatia 2003, Hungary 2001-2002,
2007)
Motivation (3): Hungary’s unusual MW policies
• Hungary’s unusual MW policies provide a unique
opportunity to study wage under-reporting. Furthermore,
the data background is better than in most countries
MW in Hungary – Doubling the MW in 2001-2002
[Figure, 1990-2010. Panels: MW/average wage (MW/AW) and MW/median wage (MW/MEDW); fraction paid near the MW (5%) in firms with >5, >10 and >20 employees]
Decision to raise the MW from Ft 25,500 to Ft 40,000 (2001) and Ft 50,000 (2002).
Over 70% increase in real terms, given anticipated inflation. Primarily motivated by
‘making work pay’
Followed by a huge increase in the share of MW earners. The data clearly suggest that
it was partly explained by the spread of envelope wages 
After the large hikes: many MW earners in high-wage occupations
By 2003, the share of MW earners reached high levels among
• Small firm managers (27.4%)
• Managers of larger firms (11.9%)
• Lawyers, business and tax advisors etc. (14.9%)
• Professionals in construction (17.9%)
In businesses engaged in cash transactions with customers
• Blue collars in house building (20.6%) versus civil engineering (4.3%)
• Personal services (22%) versus other branches of services (<7%)
After the large hikes: wage distributions
[Figure: kernel density estimates of log gross monthly earnings compared with the normal density, WS 2003. Panels: Engineers and science; Unskilled laborers and casual workers; Top managers]
MW in Hungary: minimum contribution base, 2007
[Figure, 1990-2010. Panels: MW/average wage (MW/AW) and MW/median wage (MW/MEDW); fraction paid near the MW (5%) in firms with >5, >10 and >20 employees]
Introduction of a minimum contribution base (2MW). MW viewed as a signal of wage
under-reporting
MW earners suddenly disappeared in all categories of firms. We believe it was partly
explained by cheaters’ reaction to the increased risk of audit
Data
• All data come from the Wage Survey (WS): linked employer-employee data
covering over 150,000 workers in more than 15,000 firms, annually.
• The WS covers all large firms (>20 employees) and a random sample of
smaller firms (5-20)
• SMEs (5-50) report data on all workers. Larger firms report data on a
sample of workers
• The surveys are cross-section but firms can be linked across years directly
and workers can be linked indirectly
***
• DH model: cross-sections 2003, 2006
• Test 1: panel of small firms observed in 2000 and 2003
• Test 2: panels of workers and firms observed in 2006 and 2007
The double hurdle (DH) model
• A worker’s genuine wage is observed if
– her productivity is above the MW (jumps the first hurdle)
– and her wage is fully reported (jumps the second hurdle)
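• The genuine log wage:
\[ y = X\beta + u \]
• The reported log wage (where m=log(MW)):
\[ y^* = \begin{cases} y & \text{if } X\beta + u > m \text{ and } Z\gamma + v > 0 \\ m & \text{otherwise} \end{cases} \]
• where (u,v) is normally distributed with variance matrix:
\[ S = \begin{pmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & 1 \end{pmatrix} \]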
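To make the two hurdles concrete, the following is a minimal simulation sketch of the model on this slide, not the authors' estimation code. It assumes a single worker type with fixed indices Xβ (`xb`) and Zγ (`zg`); all parameter values and the function name are hypothetical and chosen only for illustration.

```python
import math
import random


def simulate_reported_log_wages(xb, zg, m, sigma, rho, n, seed=0):
    """Draw n reported log wages y* from the double hurdle model.

    xb, zg : the indices X*beta and Z*gamma (held fixed here for simplicity)
    m      : log of the minimum wage
    sigma  : standard deviation of u; v has unit variance
    rho    : correlation between u and v
    """
    rng = random.Random(seed)
    reported = []
    for _ in range(n):
        e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        u = sigma * e1                                    # Var(u) = sigma^2
        v = rho * e1 + math.sqrt(1.0 - rho ** 2) * e2     # Var(v) = 1, Corr(u, v) = rho
        y = xb + u                                        # genuine log wage
        jumps_first = y > m                               # productivity above the MW
        jumps_second = zg + v > 0                         # wage fully reported
        reported.append(y if (jumps_first and jumps_second) else m)
    return reported


# Hypothetical parameter values, for illustration only.
m = math.log(50_000)
wages = simulate_reported_log_wages(xb=m + 0.3, zg=1.0, m=m, sigma=0.5, rho=0.3, n=10_000)
spike = sum(w == m for w in wages) / len(wages)
print(f"share reporting exactly the MW: {spike:.3f}")
```

Setting `zg` to a very large value makes the second hurdle ineffective, so the simulated spike at the MW then comes from low productivity alone, which is the Tobit case.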
The DH model – More insight
• Tobit is a special case if the second hurdle is
not effective
• The DH model was first proposed by Cragg (1971) and has since been
widely used in environmental economics,
models of consumer choice, banking etc.
• But only Shelkova (2007) has used it to analyse wage
distributions
Assumptions behind the DH model
• Unlike equilibrium models (e.g. Tonin 2006), we assume
that many workers with productivity below the MW stayed
in their jobs during/just after the episodes under
investigation. Both hurdles were effective
– When the plan of raising the MW to Ft 50,000 was announced, 32.7% of the
employees earned less than that
– When the double contribution base was announced, 58% earned less than 2MW
• Generally, taxes can be evaded by reporting any wage
below the genuine wage. Reporting the MW is the cost-minimising choice only if it does not increase the risk of
audit. We assume that was true in Hungary prior to 2007*
*) Elek and Szabó (2009) model a case of under-reporting in which the observed wage is not necessarily equal to the MW
Preliminary transformation of wages
• Log wages are not truncated normal because of the
crowding of wages just above the MW
• Preliminary transformation is needed
Martinez-Espineira (2006) and Moffatt (2005) use Box-Cox; Yen and Jones (1997) use inverse hyperbolic sine
[Figure: the transformation g(y) plotted against y]
• We apply:
g x  x  r  exp x  m  r  / r  if x  m  r
Transformed log wages are normal
g x  x  r  exp x  m  r  / r  if x  m  r
• r is estimated by two methods:
– maximum likelihood on a cross section
– from a quasi panel
• Transformed log wages are approximately truncated normal
[Figure: densities of log wages and of transformed log wages (Panels A and B)]
The DH model – Estimation
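• Likelihood function is given as:
\[ L = \prod_{y_i^* = m} \left[ 1 - \Phi_{\sigma,\rho,1}\left(x_i\beta - m,\ z_i\gamma\right) \right] \prod_{y_i^* > m} \frac{1}{\sigma}\,\varphi\!\left(\frac{y_i^* - x_i\beta}{\sigma}\right) \Phi\!\left(\frac{z_i\gamma + (\rho/\sigma)\left(y_i^* - x_i\beta\right)}{\sqrt{1-\rho^2}}\right) \]
where \(\Phi_{\sigma,\rho,1}\) is the bivariate normal cdf with standard deviations \(\sigma\) and 1 and correlation \(\rho\)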
• Maximum likelihood estimate is consistent
and asymptotically normal if the
distributional assumptions are correct
Calculation of under-reporting probabilities
and simulation of genuine wages
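• Under-reporting probabilities for MW-earners:
\[ P\left(X\beta + u > m,\ Z\gamma + v < 0 \mid y^* = m\right) = \frac{\Phi\left((X\beta - m)/\sigma\right) - \Phi_{\sigma,\rho,1}\left(X\beta - m,\ Z\gamma\right)}{1 - \Phi_{\sigma,\rho,1}\left(X\beta - m,\ Z\gamma\right)} \]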
• The „genuine” wage of each MW earner can be
simulated:
– Simulate (u,v) bivariate normal variables with
covariance matrix S until Xβ+u<m or Zγ+v<0 holds
– Then the genuine log wage: y=max(Xβ+u, m)
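The simulation step on this slide can be sketched as a rejection sampler. The code below is an illustration under assumed parameter values, not the authors' implementation; the function names are hypothetical. It also approximates the under-reporting probability by Monte Carlo, as the share of accepted draws whose genuine wage exceeds the MW, instead of evaluating the bivariate normal cdf.

```python
import math
import random


def draw_genuine_log_wage(xb, zg, m, sigma, rho, rng):
    """Redraw (u, v) until the MW would be reported, then return max(xb + u, m)."""
    while True:
        e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        u = sigma * e1
        v = rho * e1 + math.sqrt(1.0 - rho ** 2) * e2
        if xb + u < m or zg + v < 0:      # at least one hurdle is not jumped
            return max(xb + u, m)


def underreporting_probability(xb, zg, m, sigma, rho, n_draws=20_000, seed=0):
    """Monte Carlo estimate of P(Xb + u > m, Zg + v < 0 | reported wage = m)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        if draw_genuine_log_wage(xb, zg, m, sigma, rho, rng) > m:
            hits += 1                     # genuine wage above the MW: under-reporting
    return hits / n_draws


# Hypothetical MW earner, for illustration only.
m = math.log(50_000)
p = underreporting_probability(xb=m + 0.3, zg=0.5, m=m, sigma=0.5, rho=0.3)
print(f"estimated under-reporting probability: {p:.3f}")
```

The sampler can be slow when reporting the MW is a very rare outcome for the given indices, because almost every draw is rejected.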
Classification of workers and firms
• Workers: different criteria applied: cheater if MW is
reported and:
(1) P>0.5,
(2) w>MW
(3) w>1.5 MW.
Further thresholds were tested (1.1MW, 2MW) without the
qualitative conclusions being affected
• Firms: cheater if at least one employee is caught
cheating/victim of cheating
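The classification rules above can be sketched as follows. This is an illustrative sketch only: the `MWEarner` record, its field names and the example values are hypothetical, and the input is assumed to contain only workers whose reported wage equals the MW.

```python
from dataclasses import dataclass


@dataclass
class MWEarner:
    firm_id: str
    p_underreport: float     # estimated under-reporting probability P
    simulated_wage: float    # simulated 'genuine' wage w


def is_cheater(worker, mw, criterion):
    """Apply one of the three worker-level criteria listed on the slide."""
    if criterion == "P>0.5":
        return worker.p_underreport > 0.5
    if criterion == "w>MW":
        return worker.simulated_wage > mw
    if criterion == "w>1.5MW":
        return worker.simulated_wage > 1.5 * mw
    raise ValueError(f"unknown criterion: {criterion}")


def cheating_firms(mw_earners, mw, criterion):
    """A firm is a cheater if at least one of its MW earners is classified as one."""
    return {w.firm_id for w in mw_earners if is_cheater(w, mw, criterion)}


# Hypothetical MW earners in three firms.
earners = [
    MWEarner("A", 0.80, 120_000.0),
    MWEarner("A", 0.10, 50_000.0),
    MWEarner("B", 0.30, 60_000.0),
    MWEarner("C", 0.20, 50_000.0),
]
for rule in ("P>0.5", "w>MW", "w>1.5MW"):
    print(rule, sorted(cheating_firms(earners, mw=50_000.0, criterion=rule)))
```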
Results - DH
DH estimates (for 2003)
• 5-8% of all employees and 35-55% of the MW earners estimated to
receive envelope wages
• Mean simulated (‘genuine’) earnings of ‘cheating’ MW earners
exceeded 220% of the MW
• But we still have a huge spike at the MW (true MW earners)
• Similar results for 2006
[Figure: histogram (frequency) of the simulated wage of MW earners, 2003, horizontal axis from 0 to 500,000]
Cheating and non-cheating MW earners
by occupation
MW earners classified on the basis of the DH estimates (cheater if P>0.5)
[Figure: bar chart of non-cheating and cheating MW earners by occupation, axis from 0 to 30. Categories: Teachers and doctors, Professionals, Technicians, Managers, Assemblers, Administrators, Services, Office clerks, Agriculture, Industry, Architects et al., Porters and guards, Drivers, Cleaners, Trade, Construction, Unskilled]
Cheating and non-cheating MW earners
by firm size
MW earners classified on the basis of the DH estimates (cheater if P>0.5)
[Figure: bar chart of non-cheating and cheating MW earners by firm size, axis from 0 to 40. Categories: 301 or more, 51-300, 21-50, 11-20, 5-10 employees]
Cheating and non-cheating MW earners
by industry
MW earners classified on the basis of the DH estimates (cheater if P>0.5)
[Figure: bar chart of non-cheating and cheating MW earners by industry, axis from 0 to 30. Categories: Electricity et al., Financial, Transport, Mining, Services, Agriculture, Manufacturing, Real estate, Trade, Hotels and restaurants, Construction]
Cheating and non-cheating MW earners
by ownership
MW earners classified on the basis of the DH estimates (cheater if P>0.5)
[Figure: bar chart of non-cheating and cheating MW earners by ownership, axis from 0 to 20. Categories: Foreign, Mixed, Domestic]
Tests of the predictions
Test 1 – Empirical specification
The MW substantially increased the costs of employing low-wage workers
For firms starting or continuing under-reporting, the implied cost
increase was smaller (for identical workers)
The difference between cheaters and non-cheaters in terms of total cost
increase varied with exposure
[Figure: implied growth in the cost of employing low-wage workers plotted against exposure to the MW hike, with separate lines for non-cheaters and cheaters]
Therefore we test if β1=β2 for
• wage growth
• residual wage growth
• employment growth
• share of unskilled workers
between 2000 and 2003
Δln(.) = β1(exposure×cheater) + β2(exposure×non-cheater) + Zγ + u
Test 1 - Measurement
We have 4x3x2x2=48 equations/dependent var
Alternative measures of exposure (4x)
• Fraction affected = earning less than Ft 40,000 in May 2000
• Fraction affected = earning less than Ft 50,000 in May 2000
• MW shock = average wage increase implied by the first MW hike under full compliance,
constant employment and no spillover (Machin-Manning-Rahman 2003)
• MW shock = average wage increase implied by the first and second MW hikes under full
compliance, constant employment and no spillover (Machin-Manning-Rahman 2003)
Alternative measures of fraudulent behavior (3x)
• Cheater if for at least one worker w>MW or w>1.5MW or P>0.5
Controls - base-period values (yes, no)
• Firm size
• Average wage
• Capital/labor ratio
• Profit/worker
• Dummy for value subtractors
• Skill shares, average age, share of men
• Local unemployment rate
• Industry dummies
Alternative samples (2x)
• All firms versus only low-wage firms
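The two kinds of exposure measure listed on this slide can be illustrated for a single firm as follows. This is a sketch under stated assumptions: the function names and the example wages are hypothetical, and the MW shock is computed here as a proportional (not log) change in the firm's average wage, which the slides do not specify.

```python
def fraction_affected(wages, new_mw):
    """Share of the firm's workers earning less than the new MW before the hike."""
    return sum(w < new_mw for w in wages) / len(wages)


def mw_shock(wages, new_mw):
    """Proportional rise in the firm's average wage if every wage below new_mw
    is lifted exactly to new_mw, with unchanged employment and no spillover."""
    old_mean = sum(wages) / len(wages)
    new_mean = sum(max(w, new_mw) for w in wages) / len(wages)
    return new_mean / old_mean - 1.0


# Hypothetical small firm: gross monthly wages in May 2000 (Ft).
wages_2000 = [25_500, 30_000, 45_000, 60_000, 80_000]

print(fraction_affected(wages_2000, 40_000))   # exposure to the first hike
print(fraction_affected(wages_2000, 50_000))   # exposure to both hikes
print(mw_shock(wages_2000, 40_000))
print(mw_shock(wages_2000, 50_000))
```

Both measures rely on observing every worker's wage in the firm, which is one reason the small-firm panel, where all employees are reported, suits this test.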
Test 1 - sample
• 263 small firms (5-20 workers) observed in 2000 and
2003 in the WS
• We choose small firms because they report data on all of
their employees, so exposure and skill composition are
precisely measured
• Disadvantage: small firms are randomly sampled, year
by year, so the panel is rather small
• Selection to the estimation sample from the base-period
population of small firms is examined with probit. The
results hint at random selection
Test 1 – Results (example)
Effects of exposure to the MW hike on wages and employment in 2000-2003
(Excerpts from Tables 6-9)

Dependent variable                         β1 (cheaters)   β2 (non-cheaters)   F-test, H0: β1=β2
Log change of the average wage             0.9665***       1.5171***           9.86***
Log change of the average residual wage    0.7571***       1.1218***           4.60***
Log change of employment                   -0.4276*        -1.1279***          4.75**
Pct points change in unskilled share       -0.2472***      -0.6401***          9.32***

Proxy of exposure: MW shock (average wage increase if firms increase the wages of affected workers to
Ft 40,000 and nothing else happens)
Proxy of cheating: P>0.5 for at least one MW employee in 2003
Sample: only low-wage firms (at least one worker earning less than Ft 40,000 in May 2000)
Controls: yes
Residual wage: firm-level mean residuals from benchmark Mincer equations estimated using WS 2000 and WS 2003
Test 1 – Results from different specifications
Share of specifications, in per cent (48 specifications = 100%)

                                                             Dependent: change of the
                                                             Average    Residual   Employ-    Unskilled
                                                             wage (+)   wage (+)   ment (-)   share (-)
Non-cheaters: |β2|>0, effect of MW hike significant          100        100        77         100
Cheaters: |β1|>0, effect of MW hike significant              100        100        10         40
|β2| > |β1|, effect of MW hike on non-cheaters is stronger   100        100        100        100
H0: β1=β2 rejected                                           100        73         69         100
Test 2 – Models and samples
• Wage change regressions using the data of MW earners
(as of 2006) also observed in 2007
Δw* | (w0*=MW0) = f(X, cheater dummy)
• Probits. Same worker panel
Pr(w1*=2MW1 | w0*=MW) = Φ(X, cheater dummy)
• Firm-level regressions for wages, employment and
sales. Sample: firm panel 2006-2007
ΔlnL = h(Z, cheater dummy)
Test 2 – Results
individual regressions
Table 10: The effect of estimated cheating behavior on wage growth
between May 2006 and May 2007
(OLS, 7042 observations)

             Proxy used
             P>0.5                     w>MW                      w>1.5 MW
Controls     Coefficient   St. error   Coefficient   St. error   Coefficient   St. error
No           19825***      1552        8703***       1072        10706***      1268
Education    15699***      1275        5742***       980.1       7263***       1141
All          12095***      1339        3472***       1014        5089***       1173

*** p<0.01, ** p<0.05, * p<0.1.
Controls (all variables relate to 2006): Dummies for education (college graduate, secondary school and
vocational school), work experience in years, dummies for gender, municipality and firm size categories
Table 11: The effect of estimated cheating behavior on the probability that a
worker paid the MW in May 2006 was paid 2MW in May 2007
(Probit marginal effects, 7042 observations)

             Proxy used
             P>0.5                     w>MW                      w>1.5 MW
Controls     Coefficient   St. error   Coefficient   St. error   Coefficient   St. error
No           0.133***      0.0175      0.0479***     0.00903     0.0583***     0.0107
Education    0.0845***     0.0141      0.0256***     0.00753     0.0296***     0.00868
All          0.0431**      0.0192      0.0111        0.00739     0.0144        0.00901

*** p<0.01, ** p<0.05, * p<0.1
Controls (all variables relate to 2006): Dummies for education (college graduate, secondary school and
vocational school), work experience in years, dummies for gender, municipality and firm size categories
Test 2 – Results
firm-level regressions
Effect of cheating behavior on changes of wages, employment and sales

                                            Proxies of ‘cheating’ behavior
                                            P>0.5                    w>MW                     w>1.5 MW
Dependent variable               Controls   Coefficient  St. error   Coefficient  St. error   Coefficient  St. error
Change of average wage (log)     No         0.134***     0.00860     0.0775***    0.00636     0.0365***    0.00740
                                 Yes        0.0778***    0.00992     0.0875***    0.00668     0.0432***    0.00750
Change of employment (log)       No         -0.0329***   0.00907     -0.0274***   0.00645     -0.0196***   0.00681
                                 Yes        -0.0262***   0.00960     -0.0314***   0.00688     -0.0244***   0.00725
Change of sales revenues (log)   No         -0.0640***   0.0152      -0.0242**    0.0113      -0.0151      0.0120
                                 Yes        -0.0499***   0.0167      -0.0284**    0.0128      -0.0170      0.0136

*** p<0.01, ** p<0.05, * p<0.1.
Sample: Firms observed in the Wage Survey in 2006 and 2007. Number of observations 4150 except
for sales revenues (4173).
Controls include skill shares, average wage, average age and dummies for sectors, regions, type of
municipality and state ownership
Conclusions
• The DH model seems to locate ‘fake’ MWs with some
precision
• The model might be used for statistical profiling but,
more importantly, it points to the limits of tax enforcement
• True MW earners exist. Substantially raising the MW
(+taxes) in order to ‘whiten the grey economy’ may
adversely affect non-cheating firms and workers
• Research: merging cheaters and non-cheaters leads to
strongly biased estimates of MW effects