Correction for measurement error
in survey research
using SQP
Willem E. Saris
RECSM 2013
Introduction
• All researchers agree that survey data
contain measurement errors
• Procedures for correcting measurement errors have
been known since 1971 (Duncan and Goldberger)
• However, very few researchers try to correct
for these errors
Attention to measurement problems
in social science journals of 2011
Journal  Year  No. of papers  Survey research used  Errors mentioned  Errors corrected
ESR      2011  48             41                    9                 1
EJPR     2011  32             20                    4                 1
POQ      2011  33             32                    4                 1
AJPS     2011  54             23                    3                 0
JM       2011  47             27                    11                8
ESR = European Sociological Review,
EJPR = European Journal of Political Research,
POQ = Public Opinion Quarterly,
AJPS = American Journal of Political Science,
JM = Journal of Marketing
Why does this happen?
1. because the effect of measurement
errors is very small?
or
2. because it is very difficult to correct for
measurement error?
or
3. because the information about the size of
the measurement errors is not available?
1. Is the effect of the measurement
errors very small?
• Around 1971, Alwin, Andrews and I
detected, using LISREL, that the errors in
survey questions are very large
• All three of us have spent our academic lives on
the estimation of and correction for
measurement error
• Duane Alwin (2007) concentrated on the
Quasi Simplex approach
• Frank Andrews (1984) and I used the MTMM
approach
The size of the error variance
• Our estimate was that, on average, 50% of the
variance of responses to survey questions is
error
• So there is a considerable difference
between the variable one likes to measure
and the observed variable
Consequences of measurement
error
• The consequences will be discussed for
• The observed correlations
• The regression analysis
• Comparative research
The consequences for the
correlation
• Imagine that we are interested in the
correlation between
– f1 = job satisfaction
– f2= life satisfaction
• We ask: How satisfied are you with your job?
and: How satisfied are you with your life?
• The responses are represented by y1 and y2
• We know that there is quite a difference
between f1 and y1 and between f2 and y2
A very simple model
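[Path diagram: f1 and f2 are correlated, r(f1f2); f1 → y1 with quality coefficient q1 and error e1; f2 → y2 with quality coefficient q2 and error e2]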
If the variables fi and yi are standardized
• $q_i^2$ = the quality of indicator i for
latent variable i
• $1 - q_i^2$ = the error variance of indicator i
for latent variable i
• It can be proven that:
$r(y_1 y_2) = r(f_1 f_2)\, q_1 q_2$
Consequences for correlations
• If the correlation between the latent variables is r(f1,f2) = .9, the
correlation between the observed variables will be as follows
Quality coefficient q1  Quality coefficient q2  Observed correlation r(y1, y2)
1.0                     1.0                     .90
.9                      .9                      .73
.8                      .8                      .58
.7                      .7                      .44
.6                      .6                      .32
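The rows of this table follow directly from the formula r(y1y2) = r(f1f2) q1 q2. The following is a minimal Python sketch that recomputes them; it is not part of the original slides, and the function name `observed_correlation` is an illustrative assumption.

```python
def observed_correlation(latent_r: float, q1: float, q2: float) -> float:
    """Expected correlation between two observed variables, given the
    correlation between the latent variables and the two quality
    coefficients (all variables assumed to be standardized)."""
    return latent_r * q1 * q2


# The latent correlation is fixed at .9 and both questions are given
# the same quality coefficient, as in the table.
latent_r = 0.9
for q in (1.0, 0.9, 0.8, 0.7, 0.6):
    r_obs = observed_correlation(latent_r, q, q)
    print(f"q1 = q2 = {q:.1f}  ->  r(y1,y2) = {r_obs:.3f}")
```

The observed correlation shrinks with the product of the two quality coefficients, so even moderately imperfect questions strongly reduce a high latent correlation.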
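Consequences for correlations and
regressions
[Path diagram: latent variables JS*, Age* and LS*, each measured by one observed variable: JS (coefficient .6, error e1), Age (coefficient .99, error e2) and LS (coefficient .6, error e3). LS* depends on JS* and Age* (effects of .4 each), with disturbance u3]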
Consequences for correlations and
regressions
Correlations between latent variables
      JS*   Age*  LS*
JS*   1.0
Age*  0.0   1.0
LS*   .4    .4    1.0
Regression
LS* = .4 JS* + .4 Age* + u3
Correlations between observed variables
      JS    Age   LS
JS    1.0
Age   0.0   1.0
LS    .13   .24   1.0
Regression
LS = .13 JS + .24 Age + e3
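Consequences for cross cultural
comparison
[Two path diagrams of the same model, one per country: f1 and f2 are correlated; f1 → y1 and f2 → y2, with errors e1 and e2]
Country A: quality coefficients .9 and .9, Corr(Y1,Y2) = .8 × .9 × .9 ≈ .65
Country B: quality coefficients .7 and .7, Corr(Y1,Y2) = .8 × .7 × .7 ≈ .4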
Conclusions
• Research by Andrews, Alwin, myself and others
shows that the error variance in survey data is
rather large
• Because of these errors, correlations and
regression coefficients between observed
variables can be very different from those
between latent variables
• Differences in error variances across countries
will make comparisons across countries
impossible
2. Is correction for measurement errors very
difficult?
The standard SEM approach
[Path diagram of a structural equation model with the latent variables Environmental values, Understanding politics, Perception of environmental damage, Influence and Environment friendly behavior, measured by the indicators y1 to y6 (with errors e1 to e6) and x1 to x4]
Is this the approach to use?
• In principle this approach is correct, but in
practice it leads to a lot of complications and
errors
• This may be a reason why researchers don't
correct for measurement errors
• There should be simpler procedures
2. Is correction for measurement
errors very difficult?
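[Path diagram: f1 and f2 are correlated, r(f1f2); f1 → y1 with quality coefficient q1 and error e1; f2 → y2 with quality coefficient q2 and error e2]
If this model holds:
$r(y_1 y_2) = r(f_1 f_2)\, q_1 q_2$
Then it also holds that
$r(f_1 f_2) = r(y_1 y_2) / (q_1 q_2)$
So correction for measurement
error is very simple
This holds for single questions as
well as composite scores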
Quality estimates of two scales in
the last Pilot of the ESS
• Two scales were constructed:
– one based on opinions about liberal rights called
“liberal democracy” and
– one based on opinions about electoral
requirements called “electoral democracy”
• The quality of the scales is:
– for liberal democracy .79
– for electoral democracy .77
Correction for measurement error
• The observed correlation between the two
scales is r(y1y2) = .638
• So $r(f_1 f_2) = .638 / \sqrt{.79 \times .77} = .82$
• So while the observed correlation is not very
high, the correlation corrected for
measurement error indicates quite a strong
relationship between the two scales
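The correction for the two democracy scales can be reproduced in a few lines. This is a minimal Python sketch, not part of the original slides; the function name `corrected_correlation` is an illustrative assumption, and the reported scale qualities (.79 and .77) are treated as squared quality coefficients, which is why the square root of their product is used.

```python
import math


def corrected_correlation(observed_r: float, quality1: float, quality2: float) -> float:
    """Correct an observed correlation for measurement error.

    quality1 and quality2 are the qualities (q squared) of the two
    measures, so q1 * q2 equals sqrt(quality1 * quality2).
    """
    return observed_r / math.sqrt(quality1 * quality2)


# Liberal democracy (quality .79) and electoral democracy (quality .77).
r_latent = corrected_correlation(0.638, 0.79, 0.77)
print(round(r_latent, 2))  # 0.82
```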
Relationships with other variables
• We expect that the scale of liberal
democracy should correlate with the
variables :
– Just (no poverty), quality = .51
– Direct (referenda), quality = .62
– Income (household), quality = .92
• We will now show how simply regression
analysis can be done with and without
correcting for measurement errors
Procedure to correct for
measurement error using LISREL
Without correction for measurement error (here 1 on the diagonal):
Effects on liberal democracy in the UK
da ni=4 no=378 ma=km
km
1.0
.495 1.0
.401 .413 1.0
.210 -.053 -.116 1.0
labels
liberal just direct income
model ny=1 nx=3
out
With correction for measurement error (here quality on the diagonal):
Effects on liberal democracy in the UK
da ni=4 no=378 ma=km
cm
.79
.495 .51
.401 .413 .62
.210 -.053 -.116 .92
labels
liberal just direct income
model ny=1 nx=3
out
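The logic of the second LISREL input can be sketched outside LISREL. The Python sketch below is an illustration under stated assumptions, not the author's program: it assumes that putting the qualities on the diagonal and analysing the resulting correlation matrix amounts to dividing each observed correlation by the square root of the product of the two qualities, and it then computes standardized regression coefficients from the correlation matrix. All function names are hypothetical, and standard errors are not computed.

```python
import math

# Observed correlations from the LISREL input above
# (order: liberal, just, direct, income).
observed = [
    [1.000, 0.495, 0.401, 0.210],
    [0.495, 1.000, 0.413, -0.053],
    [0.401, 0.413, 1.000, -0.116],
    [0.210, -0.053, -0.116, 1.000],
]
quality = [0.79, 0.51, 0.62, 0.92]  # quality (q squared) of each variable


def correct_for_error(corr, quality):
    """Divide every off-diagonal correlation by sqrt(quality_i * quality_j)."""
    n = len(corr)
    return [
        [1.0 if i == j else corr[i][j] / math.sqrt(quality[i] * quality[j])
         for j in range(n)]
        for i in range(n)
    ]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def standardized_regression(corr):
    """Regress the first variable on all others; return (betas, R squared)."""
    rxx = [row[1:] for row in corr[1:]]   # correlations among the predictors
    rxy = [row[0] for row in corr[1:]]    # predictors with the dependent variable
    betas = solve(rxx, rxy)
    r_squared = sum(b * r for b, r in zip(betas, rxy))
    return betas, r_squared


for label, matrix in (("without correction", observed),
                      ("with correction", correct_for_error(observed, quality))):
    betas, r2 = standardized_regression(matrix)
    print(label, [round(b, 2) for b in betas], f"R2 = {r2:.2f}")
```

Under these assumptions the coefficients should come out close to the LISREL results reported on the slides (roughly .40, .27, .26 without correction and .76, .07, .32 with correction).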
The correlations and regression
Without correction for measurement errors
Correlations
          liberal  just    direct  income
liberal   1.00
just      0.50     1.00
direct    0.40     0.41    1.00
income    0.21     -0.05   -0.12   1.00
Regression (36% explained)
          just     direct  income
liberal   0.40     0.27    0.26
s.e.      (0.05)   (0.05)  (0.04)
t-value   8.77     5.84    6.29
With correction for measurement errors
Correlations
          liberal  just    direct  income
liberal   1.00
just      0.78     1.00
direct    0.57     0.73    1.00
income    0.25     -0.08   -0.15   1.00
Regression (70% explained)
          just     direct  income
liberal   0.76     0.07    0.32
s.e.      (0.04)   (0.04)  (0.03)
t-value   18.22    1.59    11.06
Generalization
• The same can be done for causal models
with several variables and composite scores
• It can be done for standardized and
unstandardized coefficients
• Stata can also correct for measurement error,
but its procedure is less general
Procedure to correct for
measurement error using Stata
Limitations:
• One can apply it only on regression, not on
causal models in general
• Only correction for measurement error in the
independent variables
• Only unstandardized analysis
Regression without correction in
STATA
regress liberal socjustice direct income if cntry==1
The procedure for correction in
STATA
eivreg liberal socjustice direct income if cntry==1, r(socjustice .51 direct .62 income .92)
Conclusions
• Correction for measurement errors is
nowadays very simple
• Correction for measurement errors is also
necessary
3. Is it difficult to estimate the quality
of questions and composite scores?
• There are a lot of different procedures
• They all require at least 2 questions for each
concept and the estimates are specific for
the formulations of these questions
• That means that the questionnaires become
twice as long and more expensive
The Multi-Trait Multi Method
approach
• There are many procedures developed to
obtain estimates of the quality of questions
and composite scores (Saris & Gallhofer 2007)
• We have chosen the MTMM design
– proposed by Campbell and Fiske (1959)
– further developed by Andrews (1984), Saris and
Andrews (1991), Saris, Satorra and Coenders
(2004)
An example
• Three ESS questions about satisfaction:
– On the whole, how satisfied are you with the present
state of the economy in Britain?
– Now think about the national government. How
satisfied are you with the way it is doing its job?
– And on the whole, how satisfied are you with the way
democracy works in Britain?
Three alternative response scales
The first (M1):
1) very satisfied, 2) fairly satisfied, 3) fairly dissatisfied or 4) very dissatisfied
The second (M2):
very dissatisfied 0 1 2 3 4 5 6 7 8 9 10 very satisfied
The third (M3):
1) not at all satisfied 2) satisfied 3) rather satisfied 4) very satisfied
Estimation
• In this way one gets 45 variances and
covariances for the 9 questions (3 topics × 3 response scales)
• Using these data, the quality coefficients for
these 9 questions can be estimated
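The number 45 follows from the design: three questions asked with three response scales give 9 observed variables, and a symmetric covariance matrix of k variables has k(k + 1)/2 distinct elements. A minimal Python sketch of this count (the function name is an illustrative assumption):

```python
def distinct_moments(n_variables: int) -> int:
    """Number of distinct variances and covariances among n variables."""
    return n_variables * (n_variables + 1) // 2


n_traits = 3   # economy, government, democracy
n_methods = 3  # response scales M1, M2, M3
n_variables = n_traits * n_methods

print(n_variables)                    # 9
print(distinct_moments(n_variables))  # 45
```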
Limitation of these experiments
• In the ESS, 3,000 questions have been
evaluated with respect to quality up to now
• However, in the same period 60,000 questions
have been asked
• One can never evaluate all questions
• So an alternative procedure is necessary
An alternative procedure
• Frank Andrews already studied the
relationship between the characteristics of
the questions and the quality of questions
• My idea was that if these relationships are
strong one can use them for the prediction
of the quality of new questions
• I also thought of creating a program that
could make these quality predictions
MTMM experiments in IRMCS
1990 - 2000
• 87 MTMM experiments were collected in the US
(Andrews), the Netherlands (Scherpenzeel),
Belgium (Billiet)and Austria (Költringer)
containing 1023 questions
• A first meta analysis was done to see if the
quality of the questions could be explained by
question characteristics
• The results were very promising: the explained
variance was .50 for reliability and .60 for
validity (Saris & Gallhofer 2007)
MTMM experiments in the ESS
2000 - 2012
• In the European Social Survey, 4 to 6 experiments
were carried out in each round in each country
• That means that in each round 1,000 questions
in more than 20 different European languages
were evaluated
• After 3 rounds, we had information about the
quality of 3,000 questions
• We expected to be able to predict the quality
of the questions from the question
characteristics
The long way to the solution: SQP
• We coded the question characteristics of the
MTMM questions
• And we estimated the relationship between these
characteristics and the quality of the questions
• Without going into details (Oberski et al. 2012), we
could predict reliability with an R² of .8 and validity
with an R² of .9 for the present 3,700 MTMM questions
• The prediction procedure was implemented in the
program SQP 2.0
The quality predictions of SQP 2.0
• So we are quite confident that SQP can
make rather good predictions of the quality of new
questions on the basis of the characteristics
of the question
Let us have a look
Available here: http://sqp.upf.edu/
Can be used free of charge!
You just need to register and then
you can use it directly online
Conclusions
• It seems that it is easy to get information about the
quality of questions
• For many questions, SQP gives research-based information
about their quality
• SQP can also be used to predict the quality of questions
that have not been studied
• Users can bring in their own questions and by coding
the question obtain a prediction of the quality
• If the qualities of single questions are known, the quality
of composite scores can also be derived
Conclusions
• The program SQP is an internet application
• So all users who code questions add
information about the quality of new questions
to the database
• In this way, one gets a growing database of
questions with their quality: a Wikipedia for
questions
Conclusions
Is there any reason not to correct for
measurement error ?
1. Is the effect of measurement errors very
small? NO!
2. Is it very difficult to correct for measurement
error? NO!
3. Is the information about the size of the
measurement errors missing? NO!
Conclusions
• There is no reason anymore to analyze data
without correction for measurement error
• If one takes research seriously, one has to
make the correction for measurement errors
• Otherwise one cannot trust the results from
the research
Summary
• A summary of all details and problems of this
approach using ESS data will be provided in
a second edition of
• Saris and Gallhofer, Design, Evaluation and
Analysis of Questionnaires for Survey
Research. Hoboken: Wiley
• The book will appear in 2014
A FINAL ILLUSTRATION OF
CORRECTION FOR A MORE
COMPLEX CASE
• A very popular topic of research is the
explanation of opinions about
immigration of people from outside Europe
[Path diagram with the variables: Economic threat, Culture threat, Better life, and Allow more people from outside Europe]
Summary of the predicted values of
the quality indicators in Ireland
Variable  Method   r²    v²    m²    q²
Allow     SQP2.0   .826  .906  .094  .747
Economy   SQP2.0   .770  .780  .220  .601
Culture   SQP2.0   .761  .705  .295  .537
Better    SQP2.0   .748  .725  .275  .543
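The columns of this table are related: the values are consistent with the method variance being m² = 1 − v² and the quality being the product of reliability and validity, q² = r² × v². The following Python sketch checks this against the tabled predictions; the data structure and names are illustrative assumptions, and small differences are expected because the table is rounded to three decimals.

```python
# SQP 2.0 predictions for Ireland, copied from the table:
# (reliability r2, validity v2, method variance m2, quality q2)
predictions = {
    "Allow":   (0.826, 0.906, 0.094, 0.747),
    "Economy": (0.770, 0.780, 0.220, 0.601),
    "Culture": (0.761, 0.705, 0.295, 0.537),
    "Better":  (0.748, 0.725, 0.275, 0.543),
}


def derived_indicators(r2: float, v2: float) -> tuple[float, float]:
    """Return (method variance, quality) implied by reliability and validity."""
    m2 = 1.0 - v2
    q2 = r2 * v2
    return m2, q2


for name, (r2, v2, m2_table, q2_table) in predictions.items():
    m2, q2 = derived_indicators(r2, v2)
    print(f"{name:8s} m2 = {m2:.3f} (table {m2_table:.3f})  "
          f"q2 = {q2:.3f} (table {q2_table:.3f})")
```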
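Correction for errors taking cmv into
account
[Path diagram: f1 and f2 are correlated, ρ(f1,f2); each fi affects the true score tij with validity coefficient vij; the method factor Mj affects t1j and t2j with method effects m1j and m2j; each tij affects the observed variable yij with reliability coefficient rij; e1j and e2j are the random errors]
fi = ith variable of interest
vij = validity coefficient for variable i
Mj = method factor for both variables
mij = method effect on variable i
tij = true score for yij
rij = reliability coefficient
yij = the observed variable
eij = the random error in variable yij
$r(y_{1j}, y_{2j}) = r(f_1, f_2)\, q_{1j} q_{2j} + \mathrm{cmv}$
$r(f_1, f_2) = [r(y_{1j}, y_{2j}) - \mathrm{cmv}] / (q_{1j} q_{2j})$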
Correction of the correlations for
random errors and CMV
Estimates of the parameters with
and without correction
Conclusion
• This example shows again that
correction for measurement error is
necessary
• Now there is also no excuse anymore.
• The procedures to correct are simple
• And SQP provides information about the
quality of questions even without collecting
extra new data
We did not do this work alone
• Hubert Blalock, Karl Jöreskog, Frank Andrews, Albert
Satorra, Marius de Pijper, Anuska Ferligoj, Roger
Jowell, Joan Manuel Batista
• Past Students: Annette Scherpenzeel, Richard
Költringer, Germà Coenders, Chris Aalbers, Irmgard
Corten, William van der Veld, Luis Coromina, Laura
Guillen, Desiree Knoppen
• The new generation: Melanie Revilla, Diana Zavala,
Laur Lilleoja, Wiebke Weber
• Special group: Daniel Oberski and Tom Gruner
The Future
• Of course the predictions are not perfect
• Improvement is always possible
- Alternative quality estimation procedures
can be developed
• Extension is necessary for
- new question forms and
- other languages
• But…
Future
• I leave this task for the RECSM researchers:
– Wiebke Weber, Melanie Revilla, Diana Zavala,
Anna de Castellarnau, Lydia Repke, Jennifer
Neumann, Bruno Arpino, Paolo Moncagatta and
André Pirralha
• I have a lot of confidence that they will make
the right decisions in the future to
maintain and improve the present tool
• So that I can concentrate on other things…
Club Pati Barcelona
www.upf.edu/survey