Assessing inconsistencies in reported job characteristics of employed stayers: An analysis on two-wave panels from the Italian Labour Force Survey, 1993-2003
Francesca Bassi*, Alessandra Padoan** and Ugo Trivellato***
*Statistics Department, University of Padova
**Statistics Office, Regione Veneto
***Statistics Department, University of Padova, and CESifo
European Conference on Quality in Official Statistics, Rome, 8-11 July 2008
FOCUS OF THE PAPER
• Measurement error in information on industry and occupation.
• Yearly transition matrices for workers who were continuously employed over the year and did not change job (263,884 units).
• Italian Quarterly Labour Force Survey, 1993-2003.
OUTLINE OF THE PAPER
1) The context of the analyses
2) Descriptive indicators of (dis)agreement
3) Testing whether the consistency of information increases when the number of categories is collapsed
4) Examination of the patterns of inconsistencies among response categories
5) Comparison of alternative classifications jointly by occupation and industry
1. THE CONTEXT OF THE ANALYSES
INDUSTRY
• Collected by an open-ended question
• 12 categories (ATECO2002):
Agriculture; Mining and raw material extraction; Manufacturing; Construction; Wholesale and retail trade; Accommodation and food services; Transportation and communication; Financial and real estate activities; Professional and support service activities; Public Administration, defence and compulsory social security; Education, health and other social services; Other public, social and personal service activities
• Istat recommends using the 12-category classification
Table 1: Transition matrix by industry, April 1993 to April 1994
Rows: industry reported in 1993. Columns: industry reported in 1994.
              Agric.    Mining    Manuf.   Constr.   Wholes.    Accom.   Transp.   Finance  Profess.      P.A. Education     Other
Agric.         1,624         4        26         8        17         4         2         1         3        30         7         4
Mining             1       270        32        22        14         0         2         4         2         8         1         1
Manuf.            16        28     5,315        87       179        15        41        11        43        29        37        36
Constr.           12        12        90     1,721        28         0        16         6        27        17        12        21
Wholes.           24         6       171        55     3,648        40        31         6        22        19        28        34
Accom.             2         1         6         5        26       705        10         0         0         6        15        14
Transp.           11         2        47        22        31         4     1,306        10         4        44        11        21
Finance            2         5        17         8        21         2        10       777        12        12        11        10
Profess.           2         2        39        39        32         3         9        23       814        22        22        65
P.A.              23        11        37        22        25        10        48        13        12     2,098       164        27
Education          7         2        25        10        28        13        10        10        11       108     3,188        38
Other              9         5        25        14        39        19        25         8        36        40        58       938
OCCUPATION
• Collected by a closed-form question
• 11 categories:
Manager; Executive; Clerk; Workman; Apprentice; Outworker; Entrepreneur; Professional; Own-account worker; Member of a producers’ cooperative; Contributing family worker
• Istat recommends using the binary classification: Employee, Self-employed
Table 2: Transition matrix by occupation, April 1993 to April 1994
Rows: occupation reported in 1993. Columns: occupation reported in 1994.
                    Manager Executive     Clerk   Workman     Appr.     Outw.   Entrep.     Prof.    Own-a.     Coop.    Contr.
Manager                 243        89        39        14         0         0         6         7         4         0         0
Executive                65       684       218        11         0         0         0        11         4         0         1
Clerk                    55       284     6,790       465         0         2        13        31        44         8        12
Workman                  15        16       557     7,882        25        11         8         6       138        22        35
Apprentice                0         0        11        79        98         1         0         1         2         0         1
Outworker                 0         0         4        18         0        29         0         1         5         0         0
Entrepreneur              4         0         6         4         0         0       238        21       123         6        10
Professional              6        13        36         6         0         1         7       641        93         2         3
Own-account w.            4         4        40       118         2         6       102        89     4,353        74       102
Coop’s member             0         0         6        14         0         0         8         3        54       115         7
Contr. family w.          0         3        36        36         5         1        12         7       117        14       767
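2. DESCRIPTIVE INDICATORS OF (DIS)AGREEMENT
P = percentage of frequencies outside the main diagonal
e_i = net difference rate
I_i = index of inconsistency
K = Cohen’s Kappa

$$e_i = \frac{X_{\cdot i} - X_{i\cdot}}{X_{\cdot\cdot}} \times 100$$

$$I_i = \frac{X_{\cdot i} + X_{i\cdot} - 2X_{ii}}{\left[ X_{\cdot i}\,(X_{\cdot\cdot} - X_{i\cdot}) + X_{i\cdot}\,(X_{\cdot\cdot} - X_{\cdot i}) \right] / X_{\cdot\cdot}}$$

$$K = \frac{p_o - p_e}{1 - p_e}$$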
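
The four indicators can be computed directly from a square transition table whose rows are the first-wave categories and whose columns are the second-wave categories. What follows is a minimal sketch in Python with NumPy, not the authors' code: the function name `agreement_indicators` and the 3x3 counts in `toy` are hypothetical, and I_i is returned as a proportion rather than a percentage.

```python
import numpy as np

def agreement_indicators(table):
    """Return (P, e_i, I_i, K) for a square table of counts.

    Rows are wave-1 categories, columns are wave-2 categories.
    """
    x = np.asarray(table, dtype=float)
    n = x.sum()              # X..
    row = x.sum(axis=1)      # X_i.
    col = x.sum(axis=0)      # X_.i
    diag = np.diag(x)        # X_ii

    # P: percentage of units outside the main diagonal
    p_off = 100.0 * (n - diag.sum()) / n
    # e_i: net difference rate, in percent
    e = 100.0 * (col - row) / n
    # I_i: index of inconsistency (observed vs. expected gross difference)
    inconsistency = (col + row - 2.0 * diag) / (
        (col * (n - row) + row * (n - col)) / n
    )
    # K: Cohen's kappa
    p_o = diag.sum() / n
    p_e = (row * col).sum() / n**2
    kappa = (p_o - p_e) / (1.0 - p_e)
    return p_off, e, inconsistency, kappa

# Hypothetical counts, for illustration only (not survey data)
toy = [[80, 10, 10],
       [5, 90, 5],
       [10, 0, 90]]

p_off, e, inc, kappa = agreement_indicators(toy)
print(f"P = {p_off:.2f}%, K = {kappa:.4f}")
print("e_i =", np.round(e, 2))
print("I_i =", np.round(inc, 4))
```

For this toy table 40 of 300 units lie off the diagonal, so P is about 13.3 and K is 0.8; the net difference rates sum to zero by construction.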
MAIN RESULTS
• Industry is reported with less inconsistency than occupation.
• There is no significant trend in the indices.
• K coefficients are high and statistically significant, but we would expect them to equal 1, since these workers did not change job.
• P takes non-negligible values.
Table 3: Measures of inconsistencies with reference to industry and occupation
Panels   Industry P   Industry K   Occupation P   Occupation K
93-94          11.8       0.8672           14.0         0.8132
94-95          10.6       0.8785           13.1         0.8255
95-96          10.9       0.8750           13.1         0.8265
96-97           9.5       0.8915           12.5         0.8349
97-98           9.7       0.8896           12.9         0.8297
98-99          10.6       0.8787           12.9         0.8301
99-00          10.9       0.8758           13.1         0.8276
00-01          10.9       0.8754           13.7         0.8195
01-02          10.3       0.8822           12.5         0.8346
02-03           9.7       0.8892           12.4         0.8355
3. COLLAPSING CATEGORIES
The hierarchical Kappa coefficient makes it possible to test whether aggregating categories improves agreement.
K2 implies a less disaggregated classification than K1.
The weights w_{ii'} are chosen so that they aggregate categories identifying similar employment.

$$\hat{K} = \frac{\sum_{i=1}^{I}\sum_{i'=1}^{I} w_{ii'}\,p_{ii'} - \sum_{i=1}^{I}\sum_{i'=1}^{I} w_{ii'}\,p_{i\cdot}\,p_{\cdot i'}}{1 - \sum_{i=1}^{I}\sum_{i'=1}^{I} w_{ii'}\,p_{i\cdot}\,p_{\cdot i'}}$$

$$H_0: \hat{K}_2 = \hat{K}_1 \qquad H_1: \hat{K}_2 > \hat{K}_1$$
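
With 0/1 weights that equal 1 whenever two detailed categories fall in the same aggregate, the weighted Kappa of the detailed table coincides with Cohen's Kappa of the collapsed table. The sketch below illustrates this under stated assumptions: the helper names (`weighted_kappa`, `block_weights`, `collapse`), the 4x4 counts and the grouping are all hypothetical, and the Wald test used in the paper is not implemented here.

```python
import numpy as np

def weighted_kappa(table, weights):
    """Weighted kappa: (sum w*p - sum w*p_i.*p_.i') / (1 - sum w*p_i.*p_.i')."""
    p = np.asarray(table, dtype=float)
    p = p / p.sum()                                     # cell proportions p_ii'
    w = np.asarray(weights, dtype=float)
    expected = np.outer(p.sum(axis=1), p.sum(axis=0))   # p_i. * p_.i'
    observed_w = (w * p).sum()
    expected_w = (w * expected).sum()
    return (observed_w - expected_w) / (1.0 - expected_w)

def block_weights(groups):
    """w_ii' = 1 if categories i and i' belong to the same aggregate, else 0."""
    g = np.asarray(groups)
    return (g[:, None] == g[None, :]).astype(float)

def collapse(table, groups):
    """Sum the rows and columns of a table within each aggregate."""
    g = np.asarray(groups)
    labels = np.unique(g)
    member = (labels[:, None] == g[None, :]).astype(float)
    return member @ np.asarray(table, dtype=float) @ member.T

# Hypothetical 4-category table and a hypothetical 2-group aggregation
toy = [[50, 10, 2, 1],
       [8, 40, 3, 2],
       [1, 2, 60, 9],
       [2, 1, 7, 45]]
groups = [0, 0, 1, 1]

k1 = weighted_kappa(toy, np.eye(4))                  # detailed classification
k2 = weighted_kappa(toy, block_weights(groups))      # aggregated via weights
k2_collapsed = weighted_kappa(collapse(toy, groups), np.eye(2))
print(f"K1 = {k1:.4f}, K2 = {k2:.4f}, collapsed table = {k2_collapsed:.4f}")
```

In this toy example most disagreement occurs within the two aggregates, so K2 exceeds K1; whether the gain is statistically significant is what the Wald test addresses.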
MAIN RESULTS
Industry:
• Switching from 12 to 6 categories significantly improves agreement in all panels
• Reducing further to 5 categories significantly improves agreement in 7 out of 10 panels
• No significant increase is obtained when reducing to 3 categories
• Istat uses the 12-category classification
Table 4: Hierarchical Kappa coefficients and Wald test: industry
          Kappa coefficients                       Wald test
Panels    12 cat.   6 cat.   5 cat.   3 cat.       6 vs. 12    5 vs. 6    3 vs. 5
93-94      0.8672   0.8833   0.8940   0.9020      217.75***   96.26***   26.59***
94-95      0.8785   0.8939   0.9037   0.9113      174.26***   71.25***   21.41**
95-96      0.8750   0.8899   0.8989   0.9005      193.51***   71.79***    1.17
96-97      0.8915   0.9044   0.9104   0.9159      165.64***   38.77***   14.39
97-98      0.8896   0.8982   0.9044   0.9082       81.08***   34.85***    6.19
98-99      0.8787   0.8894   0.8964   0.9008      107.82***   40.62***    7.65
99-00      0.8758   0.8880   0.8931   0.8944      132.95***   21.90**     0.65
00-01      0.8754   0.8865   0.8883   0.8903      109.25***    2.91       1.39
01-02      0.8822   0.8910   0.8926   0.8943       80.50***    2.47       1.08
02-03      0.8892   0.9008   0.9046   0.9044      131.86***   14.05       0.02
Table A1: Industry
12 categories: Agriculture; Mining and raw material extraction; Manufacturing; Construction; Wholesale and retail trade; Accommodation and food services; Transportation and communication; Financial and real estate activities; Professional and support service activities; Public Administration, defence and compulsory social security; Education, health and other services; Other public, social and personal service activities
6 categories: Agriculture; Manufacturing and mining; Construction; Wholesale and retail trade; Services; Public Administration
5 categories: Agriculture; Manufacturing and mining; Construction; Wholesale and retail trade; Other activities
3 categories: Agriculture; Industrial sector; Services
MAIN RESULTS
Occupation:
• Switching from 11 to 6 categories significantly improves agreement in all panels
• Reducing further to 2 categories significantly improves agreement in all panels
• Istat uses the 2-category classification
Table 5: Hierarchical Kappa coefficients and Wald test: occupation
          Kappa coefficients              Wald test
Panels    11 cat.   6 cat.   2 cat.       6 vs. 11       2 vs. 6
93-94      0.8132   0.8709   0.9317      1,035.71***    621.99***
94-95      0.8255   0.8803   0.9371        816.87***    486.58***
95-96      0.8265   0.8804   0.9361        931.28***    560.05***
96-97      0.8349   0.8904   0.9402        965.87***    487.31***
97-98      0.8297   0.8850   0.9406        927.84***    552.70***
98-99      0.8301   0.8863   0.9398        922.88***    512.05***
99-00      0.8276   0.8816   0.9388        874.81***    565.95***
00-01      0.8195   0.8764   0.9317        893.42***    481.62***
01-02      0.8346   0.8911   0.9413        922.73***    466.68***
02-03      0.8355   0.8894   0.9432        893.83***    527.38***
Table A2: Occupation
11 categories: Manager; Executive; Clerk; Workman; Apprentice; Outworker; Entrepreneur; Professional; Own-account worker; Member of a producers’ cooperative; Contributing family worker
6 categories: White-collar; Blue-collar; Outworker; Self-employed; Member of a producers’ cooperative; Contributing family worker
2 categories: Employee; Self-employed
4. PATTERNS OF INCONSISTENCIES
Goodman's quasi-independence model is used to evaluate whether, once the main-diagonal cells are set aside, the remaining cells show particular patterns of disagreement.
Accepting the model implies that errors in reporting employment occur randomly.
Rejecting the model implies that there are systematic patterns of association in the errors.

$$\log F_{ij} = \mu + \alpha_i + \beta_j + \delta_{ij}, \qquad \delta_{ij} = 0 \ \text{for} \ i \neq j$$
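
One way to fit a quasi-independence model is iterative proportional fitting on the off-diagonal cells only, treating the diagonal as fitted exactly. The following is a hedged sketch rather than the authors' procedure: the function name `fit_quasi_independence`, both toy tables, and the BIC form G² − df·ln(N) with N the table total are assumptions made for illustration.

```python
import numpy as np

def fit_quasi_independence(table, max_iter=1000, tol=1e-10):
    """Fit quasi-independence to a square table by IPF on off-diagonal cells.

    Assumes every row and column has a positive off-diagonal total.
    Returns (fitted, G2, df, BIC, Pearson residuals).
    """
    x = np.asarray(table, dtype=float)
    k = x.shape[0]
    off = ~np.eye(k, dtype=bool)          # mask of off-diagonal cells
    obs = np.where(off, x, 0.0)
    fit = off.astype(float)               # start from 1 off the diagonal
    for _ in range(max_iter):
        previous = fit.copy()
        fit *= (obs.sum(axis=1) / fit.sum(axis=1))[:, None]   # match row totals
        fit *= (obs.sum(axis=0) / fit.sum(axis=0))[None, :]   # match column totals
        if np.abs(fit - previous).max() < tol:
            break
    positive = off & (obs > 0)
    g2 = 2.0 * (obs[positive] * np.log(obs[positive] / fit[positive])).sum()
    df = (k - 1) ** 2 - k                 # independence df minus k diagonal cells
    bic = g2 - df * np.log(x.sum())       # assumed BIC form, see lead-in
    residuals = np.zeros_like(x)
    residuals[off] = (obs[off] - fit[off]) / np.sqrt(fit[off])
    fit[~off] = np.diag(x)                # diagonal cells are reproduced exactly
    return fit, g2, df, bic, residuals

# Hypothetical table 1: off-diagonal errors are a pure row x column product
random_errors = np.outer([1, 2, 3, 4], [10, 20, 30, 40]).astype(float)
np.fill_diagonal(random_errors, 500.0)

# Hypothetical table 2: errors concentrate between category pairs (0,1) and (2,3)
patterned_errors = np.array([[500, 50, 2, 3],
                             [45, 500, 4, 2],
                             [3, 2, 500, 40],
                             [2, 4, 38, 500]], dtype=float)

_, g2_random, df_random, _, _ = fit_quasi_independence(random_errors)
_, g2_patterned, df_patterned, bic_patterned, res = fit_quasi_independence(patterned_errors)
print(f"random errors:    G2 = {g2_random:.4f} on {df_random} df")
print(f"patterned errors: G2 = {g2_patterned:.2f} on {df_patterned} df, BIC = {bic_patterned:.2f}")
```

A G² close to zero means the off-diagonal cells are compatible with random misreporting; a large G² together with large residuals in specific cells points to systematic confusion between particular categories.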
MAIN RESULTS
Industry:
• The quasi-independence model is always rejected (12, 6, 5 and 3 categories)
• The BIC index is lowest with 6 categories
• Estimated residuals suggest non-random measurement error affecting responses in each wave of the survey
Occupation:
• The quasi-independence model is always rejected (11, 6 and 2 categories)
• The BIC index is lowest with 2 categories
• Estimated residuals suggest that the binary classification has become too rigid for the Italian labour market
5. ALTERNATIVE CLASSIFICATIONS JOINTLY BY OCCUPATION AND INDUSTRY
4-category joint classification:
Self-employed
Employee in agriculture
Employee in industrial sector
Employee in services
4-category alternative classification:
Self-employed
Employee in agriculture
Employee in industrial sector and private services
Employee in Public Administration and social services
Table A3: Joint classification by occupation and industry
13 categories: Self-employed; Employee in: Agriculture; Mining and raw material extraction; Manufacturing; Construction; Wholesale and retail trade; Accommodation and food services; Transportation and communication; Financial and real estate activities; Professional and support service activities; Public Administration, defence and compulsory social security; Education, health and other services; Other public, social and personal service activities
7 categories: Self-employed; Employee in: Agriculture; Manufacturing and mining; Construction; Wholesale and retail trade; Services; Public Administration
6 categories: Self-employed; Employee in: Agriculture; Manufacturing and mining; Construction; Wholesale and retail trade; Other activities
4 categories: Self-employed; Employee in: Agriculture; Industrial sector; Services
4 categories, alternative classification: Self-employed; Employee in: Agriculture; Industrial sector and private services; Public Administration and social services
MAIN RESULTS
The ‘alternative classification’ has a higher level of agreement in 9 out of 10 panels, and a significantly higher one in 8 of them.
The ‘alternative classification’ has a significantly higher level of agreement in the overall sample.
CONCLUSIONS - 1
1) Aggregating categories improves agreement
2) The best levels of aggregation are
for industry:
Agriculture, Manufacturing and mining, Construction, Wholesale and retail trade, Other activities
for occupation:
Self-employed, Employee
for occupation and industry jointly:
Self-employed, Employee in agriculture, Employee in industrial sector and private services, Employee in Public Administration and social services
CONCLUSIONS - 2
3) Estimated residuals from the quasi-independence model suggest that even cross-section information is affected by non-random measurement error
4) Could dependent interviewing help reduce inconsistencies?
SELECTED REFERENCES
• Bound J., C. Brown and N. Mathiowetz (2001). Measurement error in survey data. In J.J. Heckman and E. Leamer (Eds.), Handbook of Econometrics, Volume 5. Amsterdam, Elsevier Science, 3705-3843.
• Goodman L.A. (1968). The analysis of cross-classified data: independence, quasi-independence, and interaction in contingency tables. Journal of the American Statistical Association, 63, 1091-1131.
• Landis J.R. and G.G. Koch (1977). The measurement of observer agreement for categorical data. Biometrics, 33, 159-174.
• Mathiowetz N. (1992). Errors in reports of occupation. Public Opinion Quarterly, 56, 352-355.
• Sala E. and P. Lynn (2006). Measuring change in employment characteristics: the effects of dependent interviewing. International Journal of Public Opinion Research, 18, 500-509.