Using Quasi-variance to Communicate Sociological Results


Presenting results from statistical models
Professor Vernon Gayle and Dr Paul Lambert
(Stirling University)
Wednesday 1st April 2009
Structure of the Seminar
Should take 1 semester!!!
1. Principles of model construction and interpretation
2. Key variables – measurement and functional form
3. Presenting results
4. Longitudinal data analysis
5. Individuals in households – multilevel models
“One of the useful things about mathematical
and statistical models [of educational
realities] is that, so long as one states the
assumptions clearly and follows the rules
correctly, one can obtain conclusions which
are, in their own terms, beyond reproach.
The awkward thing about these models is
the snares they set for the casual user; the
person who needs the conclusions, and
perhaps also supplies the data, but is
untrained in questioning the assumptions….
…What makes things more difficult is that, in
trying to communicate with the casual user,
the modeller is obliged to speak his or her
language – to use familiar terms in an
attempt to capture the essence of the model.
It is hardly surprising that such an enterprise
is fraught with difficulties, even when the
attempt is genuinely one of honest
communication rather than compliance with
custom or even subtle indoctrination”
(Goldstein 1993, p. 141).
Structure of this session
1. Presenting results
• This talk could also take weeks on end
• Two topics only - not the final word
  – Quasi-Variances
  – Sample Enumeration methods
• Many more topics emerging,
  – propensity score matching
  – simulation modelling
Using Quasi-variance to Communicate Sociological
Results from Statistical Models
Vernon Gayle & Paul S. Lambert
University of Stirling
Gayle and Lambert (2007) Sociology, 41(6):1191-1208
A little biography (or narrative)…
• Since being at the Centre for Applied Stats in 1998/9 I
have been thinking about the issue of model
presentation
• Done some work on Sample Enumeration Methods
with Richard Davies
• Summer 2004 (with David Steele’s help) began to
think about “quasi-variance”
• Summer 2006 began writing a paper with Paul
Lambert
The Reference Category Problem
• In standard statistical models the effects of a
categorical explanatory variable are assessed
by comparison to one category (or level) that
is set as a benchmark against which all other
categories are compared
• The benchmark category is usually referred to
as the ‘reference’ or ‘base’ category
The Reference Category Problem
An example of Some English Government
Office Regions
0 = North East of England
---------------------------------------------------------------
1 = North West England
2 = Yorkshire & Humberside
3 = East Midlands
4 = West Midlands
5 = East of England
Government Office Region
Table 1: Logistic regression prediction that self-rated health is ‘good’
(Parameter estimates for model 1)

                              1        2         3           4
                            Beta   Standard    Prob.   95% Confidence
                                    Error                Intervals
No Higher qualifications      -       -          -        -      -
Higher Qualifications       0.65    0.0056     <.001    0.64   0.66
Males                         -       -          -        -      -
Females                    -0.20    0.0041     <.001   -0.21  -0.20
North East                    -       -          -        -      -
North West                  0.09    0.0102     <.001    0.07   0.11
Yorkshire & Humberside      0.12    0.0107     <.001    0.10   0.14
East Midlands               0.15    0.0111     <.001    0.13   0.17
West Midlands               0.13    0.0106     <.001    0.11   0.15
East of England             0.32    0.0107     <.001    0.29   0.34
South East                  0.36    0.0101     <.001    0.34   0.38
South West                  0.26    0.0109     <.001    0.24   0.28
Inner London                0.17    0.0122     <.001    0.15   0.20
Outer London                0.27    0.0111     <.001    0.25   0.29
Constant                    0.48    0.0090     <.001    0.46   0.50
                            Beta   Standard    Prob.   95% Confidence
                                    Error                Intervals
North East                    -       -          -        -      -
North West                  0.09    0.0102     <.001    0.07   0.11
Yorkshire & Humberside      0.12    0.0107     <.001    0.10   0.14
Conventional Confidence Intervals
• Since these confidence intervals overlap we might be beguiled
into concluding that the two regions are not significantly
different to each other
• However, this conclusion represents a common
misinterpretation of regression estimates for categorical
explanatory variables
• These confidence intervals are not estimates of the difference
between the North West and Yorkshire and Humberside, but
instead they indicate the difference between each category and
the reference category (i.e. the North East)
• Critically, there is no confidence interval for the reference
category because it is forced to equal zero
Formally Testing the Difference
Between Parameters
$$ t = \frac{\hat{\beta}_2 - \hat{\beta}_3}{\mathrm{s.e.}(\hat{\beta}_2 - \hat{\beta}_3)} $$
The banana skin is here (the standard error of the difference)!
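Standard Error of the Difference
$$ \mathrm{s.e.}(\hat{\beta}_2 - \hat{\beta}_3) = \sqrt{\mathrm{var}(\hat{\beta}_2) + \mathrm{var}(\hat{\beta}_3) - 2\,\mathrm{cov}(\hat{\beta}_2, \hat{\beta}_3)} $$
• var(β̂2): Variance North West (s.e.²)
• var(β̂3): Variance Yorkshire & Humberside (s.e.²)
• cov(β̂2, β̂3): Only available in the variance covariance matrix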
Table 2: Variance Covariance Matrix of Parameter Estimates for the Govt Office Region variable in Model 1
(Columns 1 to 9 are the same regions, in the same order, as rows 1 to 9)

Row                             1          2          3          4          5          6          7          8          9
1 North West                .00010483
2 Yorkshire & Humberside    .00007543  .00011543
3 East Midlands             .00007543  .00007543  .00012312
4 West Midlands             .00007543  .00007543  .00007543  .00011337
5 East England              .00007544  .00007543  .00007543  .00007543  .0001148
6 South East                .00007545  .00007544  .00007544  .00007544  .00007545  .00010268
7 South West                .00007544  .00007543  .00007544  .00007543  .00007544  .00007546  .00011802
8 Inner London              .00007552  .00007548  .0000755   .00007547  .00007554  .00007572  .00007558  .00015002
9 Outer London              .00007547  .00007545  .00007546  .00007545  .00007548  .00007555  .00007549  .00007598  .00012356

Covariance of North West and Yorkshire & Humberside: row 2, column 1 (.00007543)
Standard Error of the Difference
$$ 0.0083 = \sqrt{0.00010483 + 0.00011543 - 2\,(0.00007543)} $$
• 0.00010483: Variance North West (s.e.²)
• 0.00011543: Variance Yorkshire & Humberside (s.e.²)
• 0.00007543: Covariance, only available in the variance covariance matrix
Formal Tests
t = -0.03 / 0.0083 = -3.6
Wald χ² = (-0.03 / 0.0083)² = 12.97; p = 0.0003
Remember – earlier because the two sets of
confidence intervals overlapped we could wrongly
conclude that the two regions were not
significantly different to each other
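The arithmetic of the formal test can be checked with a short script. This is only a sketch: the estimates, variances and covariance are copied from Tables 1 and 2, the variable names are our own, and the p-value uses the standard identity for a chi-square with 1 degree of freedom rather than any particular statistics package.

```python
import math

# Values copied from Table 1 (estimates) and Table 2 (variances, covariance).
beta_nw, beta_yh = 0.09, 0.12          # North West, Yorkshire & Humberside
var_nw, var_yh = 0.00010483, 0.00011543
cov_nw_yh = 0.00007543

# Standard error of the difference between the two parameter estimates.
se_diff = math.sqrt(var_nw + var_yh - 2 * cov_nw_yh)

# t statistic and Wald chi-square (the square of t) for the difference.
t_stat = (beta_nw - beta_yh) / se_diff
wald = t_stat ** 2

# With 1 degree of freedom the chi-square upper tail is erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(wald / 2))

print(f"s.e. difference = {se_diff:.4f}")
print(f"t = {t_stat:.1f}, Wald chi2 = {wald:.2f}, p = {p_value:.4f}")
```

The unrounded standard error is carried through the calculation, which is why the Wald statistic agrees with the 12.97 reported on the slide.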
Comment
• Only the primary analyst has the
opportunity to make formal comparisons
• Reporting the matrix is seldom, if ever, feasible
in paper-based publications
• In a model with q parameters there would, in
general, be ½q (q-1) covariances to report
Firth’s Method (made simple)
$$ \mathrm{s.e.\ difference} \approx \sqrt{\mathrm{quasivar}(\hat{\beta}_2) + \mathrm{quasivar}(\hat{\beta}_3)} $$
Table 1: Logistic regression prediction that self-rated health is ‘good’ (Parameter estimates for model 1, featuring
conventional regression results, and quasi-variance statistics)

                              1        2         3           4               5
                            Beta   Standard    Prob.   95% Confidence   Quasi-Variance
                                    Error                Intervals
No Higher qualifications      -       -          -        -      -            -
Higher Qualifications       0.65    0.0056     <.001    0.64   0.66           -
Males                         -       -          -        -      -            -
Females                    -0.20    0.0041     <.001   -0.21  -0.20           -
North East                    -       -          -        -      -        0.0000755
North West                  0.09    0.0102     <.001    0.07   0.11       0.0000294
Yorkshire & Humberside      0.12    0.0107     <.001    0.10   0.14       0.0000400
Firth’s Method (made simple)
$$ \mathrm{s.e.\ difference} \approx \sqrt{\mathrm{quasivar}(\hat{\beta}_2) + \mathrm{quasivar}(\hat{\beta}_3)} $$
$$ 0.0083 = \sqrt{0.0000294 + 0.0000400} $$
t = (0.09 - 0.12) / 0.0083 = -3.6
Wald χ² = (-0.03 / 0.0083)² = 12.97; p = 0.0003
These results are identical to the results calculated by
the conventional method
The QV based ‘comparison intervals’ no longer overlap
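A reader who has only the published table can repeat these comparisons. The sketch below uses the estimates and quasi-variances reported in Table 1; it does not estimate quasi-variances itself (that is what Firth's calculator does), and the function names and the 1.96 multiplier for a 95% comparison interval are our own illustrative choices.

```python
import math

# (estimate, quasi-variance) pairs as reported in Table 1.
# The reference category has an estimate of zero by construction.
regions = {
    "North East": (0.00, 0.0000755),
    "North West": (0.09, 0.0000294),
    "Yorkshire & Humberside": (0.12, 0.0000400),
}

def qv_se_difference(a, b):
    """Approximate s.e. of the difference between categories a and b."""
    return math.sqrt(regions[a][1] + regions[b][1])

def comparison_interval(name, z=1.96):
    """Quasi-variance based comparison interval for one category."""
    beta, qv = regions[name]
    half_width = z * math.sqrt(qv)
    return beta - half_width, beta + half_width

se_nw_yh = qv_se_difference("North West", "Yorkshire & Humberside")
se_ne_nw = qv_se_difference("North East", "North West")
nw_low, nw_high = comparison_interval("North West")
yh_low, yh_high = comparison_interval("Yorkshire & Humberside")

print(f"s.e. (NW vs Y&H) = {se_nw_yh:.4f}")
print(f"s.e. (NE vs NW)  = {se_ne_nw:.4f}")
print(f"NW  interval: {nw_low:.3f} to {nw_high:.3f}")
print(f"Y&H interval: {yh_low:.3f} to {yh_high:.3f}")
```

Comparing the North West with the reference category gives a standard error close to the conventional 0.0102 in Table 1, which is a useful check that the quasi-variances carry the same information as the matrix.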
Firth QV Calculator (on-line)
Information from the Variance-Covariance
Matrix Entered into the Data Window (Model 1)
0
0 0.00010483
0 0.00007543 0.00011543
0 0.00007543 0.00007543 0.00012312
0 0.00007543 0.00007543 0.00007543 0.00011337
0 0.00007544 0.00007543 0.00007543 0.00007543 0.00011480
0 0.00007545 0.00007544 0.00007544 0.00007544 0.00007545 0.00010268
0 0.00007544 0.00007543 0.00007544 0.00007543 0.00007544 0.00007546 0.00011802
0 0.00007552 0.00007548 0.00007550 0.00007547 0.00007554 0.00007572 0.00007558 0.00015002
0 0.00007547 0.00007545 0.00007546 0.00007545 0.00007548 0.00007555 0.00007549 0.00007598 0.00012356
[Figure: 5+ GCSE Passes Year 11, Ethnicity Effects. Parameter estimates by ethnicity (Black, Indian, Chinese, Bangladeshi, White, Pakistani), shown with 95% confidence intervals and 95% QV comparison intervals.]
Source: YCS Cohort 9, n=12789.
Model: Logistic regression estimating '5+ GCSE Passes A*-C'.
QV Conclusion –
We should start using the method
Benefits
• Overcomes the reference
category problem when
presenting models
• Provides reliable results
(even though based on an
approximation)
• Easy(ish) to calculate
• Has extensions to other
models
Costs
• Extra column in results
• Time convincing colleagues
that this is a good thing
Example
Drew, D., Gray, J. and Sime, N. (1992)
Against the odds: The Education and Labour
Market Experiences of Black Young People
Comparison of Odds
Greater than 1 “higher odds”
Less than 1 “lower odds”
Naïve Odds
• In this model (after controlling for other
factors)
White pupils have an odds of 1.0
Afro Caribbean pupils have an odds of 3.2
• Reporting this in isolation is a naïve
presentation of the effect because it ignores
other factors in the model
A Comparison
Pupil with                  Pupil with
4+ higher passes            0 higher passes
White                       Afro-Caribbean
Professional parents        Manual parents
Male                        Male
Graduate parents            Non-Graduate parents
Two parent family           One parent family
Odds are multiplicative
                        Pupil with          Pupil with
                        4+ higher passes    0 higher passes
4+ Higher Grades              1.0                 1.0
Ethnic Origin                 1.0                 3.2
Social Class                  1.0                 0.5
Gender                        1.0                 1.0
Parental Education            1.0                 0.6
No. of Parents                1.0                 0.9
Odds                          1.0                 0.86
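The combined figure is simply the product of the separate odds. A minimal sketch, using the values listed for the second pupil (the dictionary keys are the row labels from the slide; the variable names are ours):

```python
import math

# Odds for the second pupil on each factor, relative to the first pupil,
# whose odds are 1.0 on every factor.
second_pupil_odds = {
    "4+ Higher Grades": 1.0,
    "Ethnic Origin": 3.2,
    "Social Class": 0.5,
    "Gender": 1.0,
    "Parental Education": 0.6,
    "No. of Parents": 0.9,
}

# Odds are multiplicative, so the overall odds are the product of the parts.
combined_odds = math.prod(second_pupil_odds.values())

print(f"Combined odds = {combined_odds:.2f}")
```

The isolated ethnicity odds of 3.2 shrink to below 1 once the other characteristics of the pupil are taken into account, which is the point of the warning about naïve odds.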
Naïve Odds
• Drew, D., Gray, J. and Sime, N. (1992) warn of this
danger….
• …Naïvely presenting isolated odds ratios is still
widespread (e.g. Connolly 2006 Brit. Ed. Res.
Journal 32(1),pp.3-21)
• We should avoid reporting isolated odds ratios
where possible!
Logit scale
• Generally, people find it hard to directly interpret results
on the logit scale – i.e. β
Log Odds, Odds, Probability
• Log odds converted to odds = exp(log odds)
• Probability = odds/(1+odds)
• Odds = probability / (1-probability)
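These three conversions are easy to script. The helper names below are illustrative only and are not taken from any package:

```python
import math

def log_odds_to_odds(log_odds):
    """Odds = exp(log odds)."""
    return math.exp(log_odds)

def odds_to_probability(odds):
    """Probability = odds / (1 + odds)."""
    return odds / (1.0 + odds)

def probability_to_odds(p):
    """Odds = probability / (1 - probability)."""
    return p / (1.0 - p)

def probability_to_log_odds(p):
    """Log odds (the logit) of a probability."""
    return math.log(probability_to_odds(p))

# Round trip for a handful of probabilities.
for p in (0.99, 0.8, 0.5, 0.2, 0.01):
    odds = probability_to_odds(p)
    print(f"p = {p:<5} odds = {odds:7.2f}  ln odds = {math.log(odds):6.2f}")
```

Note that probabilities equally far from 0.5 give log odds of equal size and opposite sign, whereas the odds themselves are not symmetric around 1.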
Log Odds, Odds, Probability
  Odds    ln odds      p
 99.00      4.60     0.99
 19.00      2.94     0.95
  9.00      2.20     0.9
  4.00      1.39     0.8
  2.33      0.85     0.7
  1.50      0.41     0.6
  1.00      0.00     0.5
  0.67     -0.41     0.4
  0.43     -0.85     0.3
  0.25     -1.39     0.2
  0.11     -2.20     0.1
  0.05     -2.94     0.05
  0.01     -4.60     0.01
Odds are asymmetric – beware!
Divide by 4 rule
• Gelman and Hill (2008) suggest dividing coefficients from logit models by 4 as
a guide for assessing the effects of the β estimated for a given explanatory
variable as a probability
• They assert that β/4 provides a ‘rule of convenience’ for estimating the upper
bound of the predictive difference corresponding to a unit change in the
explanatory variable.
• Gelman and Hill (2008) are careful to report that this is an approximation and
that it performs best near the midpoint of the logistic curve
• We believe that this has some merit as a rough and ready method of
interpreting the effects of estimates and is a useful tool especially when tables
of coefficients are rapidly flashed up at a conference presentation
Gelman, A. and J. Hill (2008) Data Analysis Using Regression and
Multilevel/Hierarchical Models, Cambridge: Cambridge University Press
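To see how the rule behaves, the sketch below compares β/4 with the actual change in predicted probability at a few baseline points on the logistic curve. The coefficient 0.65 is the Higher Qualifications estimate from Table 1; the baseline values and function names are our own illustrative assumptions.

```python
import math

def logistic(x):
    """Inverse logit: converts a log odds value into a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def divide_by_four(beta):
    """Rule-of-convenience upper bound on the change in probability."""
    return beta / 4.0

def predictive_difference(beta, baseline_logit):
    """Actual change in probability for a unit change in the variable."""
    return logistic(baseline_logit + beta) - logistic(baseline_logit)

beta = 0.65                      # Higher Qualifications, Table 1
bound = divide_by_four(beta)
baselines = (-2.0, 0.0, 2.0)     # hypothetical starting points (log odds)
differences = [predictive_difference(beta, b) for b in baselines]

print(f"beta / 4 = {bound:.4f}")
for b, d in zip(baselines, differences):
    print(f"baseline logit {b:+.1f}: change in probability = {d:.4f}")
```

The change is largest near the midpoint of the curve and never exceeds β/4, because the slope of the logistic curve is at most one quarter.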
Communicating Results
(to non-technically informed audiences)
• Davies (1992) Sample Enumeration
• Payne (1998) Labour Party campaign data
• Gayle et al. (2002)
• War against the uninformed use of odds (e.g.
on breakfast t.v.)
Sample Enumeration Methods
In a nutshell…
“What if” – what if the gender effect was removed
1. Fit a model (e.g. logit)
2. Focus on a comparison (e.g. boys and girls)
3. Use the fitted model to estimate a fitted value
for each individual in the comparison group
4. Sum these fitted values and construct a
sample enumerated % for the group
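The four steps can be sketched in a few lines. Everything numerical here is hypothetical: the coefficients stand in for a fitted logit model, the sample is synthetic, and only one control variable is included to keep the example short. It is not the model or the data used in the talk.

```python
import math
import random

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

# Step 1: a "fitted" logit model (hypothetical coefficients, girls as reference).
INTERCEPT = 0.3
B_BOY = -0.4        # gender effect
B_MANUAL = -0.8     # one illustrative control: manual-class parents

def fitted_probability(person, boy_effect=B_BOY):
    """Fitted value for one individual under a given gender effect."""
    eta = INTERCEPT + boy_effect * person["boy"] + B_MANUAL * person["manual"]
    return logistic(eta)

def sample_enumerated_percent(people, boy_effect=B_BOY):
    """Steps 3 and 4: sum the individual fitted values and express as a %."""
    total = sum(fitted_probability(p, boy_effect) for p in people)
    return 100.0 * total / len(people)

# Step 2: focus on the comparison group (boys) in a synthetic sample.
rng = random.Random(1)
sample = [{"boy": rng.randint(0, 1), "manual": rng.randint(0, 1)}
          for _ in range(2000)]
boys = [p for p in sample if p["boy"] == 1]

with_effect = sample_enumerated_percent(boys)
# "What if" the gender effect was removed: set it to zero for the same boys.
without_effect = sample_enumerated_percent(boys, boy_effect=0.0)

print(f"Boys, model as fitted:        {with_effect:.1f}%")
print(f"Boys, gender effect removed:  {without_effect:.1f}%")
print(f"Difference due 'directly' to gender: {without_effect - with_effect:.1f}%")
```

Because the same individuals are used in both calculations, everything except the gender effect is held at its observed value, so the gap between the two percentages can be read as the part of the difference attributable directly to gender.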
Naïve Odds
• Naïvely presenting odds ratios is
widespread (e.g. Connolly 2006)
• In this model naïvely (after controlling for
other factors)
Girls have an odds of 1.0
Boys have an odds of .58
We should avoid this where possible!
Logit Model
• Example from YCS 11
(these pupils took GCSE in 2001)
y=1 5+ GCSE passes (A* - C)
X vars
gender; family social class (NS-SEC);
ethnicity; housing tenure; parental
education; parental employment;
school type; family type
Naïve Odds
• Example from YCS 11
(these pupils took GCSE in 2001)
• In this model naïvely (after controlling for
other factors)
Girls have an odds of 1.0
Boys have an odds of .66
We should avoid this where possible!
Sample Enumeration Results
Percentage with 5+ GCSE (A*-C)
All                                     52%
Girls                                   58%
Boys                                    47%
(Sample enumeration est. boys)         (50%)
Observed difference                     11%
Difference due ‘directly’ to gender      3%
Difference due to other things           8%
Pseudo Confidence Interval
Sample Enumeration Male Effect
Upper Bound    50.32%
Estimate       49.81%
Lower Bound    49.30%
Bootstrapping to construct a pseudo confidence interval
(1000 Replications)
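One way such an interval might be built is a percentile bootstrap. The sketch below is a simplification under stated assumptions: it resamples hypothetical fitted probabilities and does not refit the model in each replication, and the talk does not say which variant was used. The function name, the seed and the synthetic fitted values are all our own.

```python
import random

def percentile_interval(fitted_values, replications=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for a sample enumerated percentage."""
    rng = random.Random(seed)
    n = len(fitted_values)
    estimates = []
    for _ in range(replications):
        # Resample individuals with replacement and recompute the percentage.
        resample = rng.choices(fitted_values, k=n)
        estimates.append(100.0 * sum(resample) / n)
    estimates.sort()
    lower_index = int((1 - level) / 2 * replications)
    upper_index = int((1 + level) / 2 * replications) - 1
    return estimates[lower_index], estimates[upper_index]

# Hypothetical fitted probabilities for one group (not the YCS data).
data_rng = random.Random(42)
fitted = [data_rng.uniform(0.2, 0.8) for _ in range(500)]

estimate = 100.0 * sum(fitted) / len(fitted)
lower, upper = percentile_interval(fitted)

print(f"Upper Bound {upper:.2f}%")
print(f"Estimate    {estimate:.2f}%")
print(f"Lower Bound {lower:.2f}%")
```

A fuller version would re-estimate the logit model on each resampled data set before computing the fitted values, so that uncertainty in the coefficients is also reflected in the interval.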
Reference
• A technical explanation of the issue is given in
Davies, R.B. (1992) ‘Sample Enumeration Methods for Model Interpretation’ in
P.G.M. van der Heijden, W. Jansen, B. Francis and G.U.H. Seeber (eds)
Statistical Modelling, Elsevier
We have recently written a working paper on logit models
http://www.dames.org.uk/publications.html
Conclusion –
Why have we told you this…
• Categorical X vars are ubiquitous
• Interpretation of coefficients is critical to
sociological analyses
– Subtleties / slipperiness
– (e.g. in Economics where emphasis is often on
precision rather than communication)