Part 2 DIF detection in STATA

Dif Detect - Stata
Developed by Paul Crane et al., University of Washington
Based on ordinal logistic regression (Zumbo, 1999)
Handles ordinal or continuous covariates (i.e. not restricted to binary).
Model incorporates latent trait scores rather than sum scores (advantage over parametric methods)
http://www.alz.washington.edu/DIFDETECT/welcome.html
difwithpar is also available, but it requires the Parscale software
Publications:
Crane, van Belle and Larson (2004). Test bias in a cognitive test: differential item functioning in
the CASI. Statistics in Medicine, 23, 241-256.
DifD website
DifD Model specification
Tests for Uniform and Non-uniform DIF
ologit itemresponse ability group c.ability#c.group
The program examines 3 ordinal logistic regression models for each item:
(1) f(item response) = cut + β1 θ
(2) f(item response) = cut + β1 θ + β2 group
(3) f(item response) = cut + β1 θ + β2 group + β3 θ × group
Cut represents the cutpoint(s) for each level in the proportional odds ologit model
θ (theta) is the IRT estimate of ability (e.g. derived from Mplus)
Group is the indicator for the covariate
In model 3, β3 is the coefficient for the ability-group interaction term.
Gibbons et al (2009) International Psychogeriatrics, 21:1
Detecting Non-Uniform & Uniform DIF
Non-Uniform DIF
(e.g. demographic interference between ability and item responses differs
at varying levels of the trait)
Log likelihoods of models 2 & 3 are compared to test the significance of
the interaction term
Uniform DIF
Fits models with and without group assignment (i.e. models 1 & 2)
Compares the relative difference between the parameters associated with θ in the two models
If the relative difference is >10% then uniform DIF is present.
Option to also compare the -2 log likelihoods of the two models. Default alpha 0.20
(Maldonado & Greenland, 1993)
Gibbons et al (2009) International Psychogeriatrics, 21:1
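The three models and the two DIF criteria can also be worked through by hand, which helps in seeing what difd automates. The following is a minimal sketch on simulated data, not the difd code itself: the variable names item, theta and group, the simulated effect sizes, and the exact form of the relative-change formula are all assumptions made for illustration.

```stata
* Hypothetical simulated data (names and effect sizes are illustrative only)
clear
set seed 2024
set obs 2000
generate theta = rnormal()                 // stand-in for the IRT ability score
generate group = runiform() < 0.5          // binary covariate coded 0/1
* latent response with a built-in uniform group effect and logistic error
generate ystar = 1.5*theta + 0.5*group + logit(runiform())
generate item  = (ystar > -0.5) + (ystar > 1)   // ordinal item scored 0/1/2

* Model 1: ability only
quietly ologit item theta
estimates store m1
scalar b1_m1 = _b[theta]

* Model 2: ability + group
quietly ologit item theta group
estimates store m2
scalar b1_m2 = _b[theta]

* Model 3: ability + group + ability-by-group interaction
quietly ologit item theta group c.theta#c.group
estimates store m3

* Non-uniform DIF: likelihood-ratio test of model 2 against model 3
lrtest m2 m3

* Uniform DIF: relative change in the ability coefficient, model 1 vs model 2
* (assumed formula; difd may scale the difference differently)
display "Relative change in beta1 = " (b1_m2 - b1_m1)/b1_m1

* Optional likelihood-ratio criterion for uniform DIF
lrtest m1 m2
```

With real data, theta would be replaced by the saved factor score and group by the covariate of interest.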
How to run Dif Detect - Stata
In Stata, type: findit difd
1 package found (Stata Journal and STB listed first)
difd from http://fmwww.bc.edu/RePEc/bocode/d
'DIFD': module to evaluate test items for differential item functioning
(DIF) / DIF detection is a first step in assessing bias in test items. /
difd detects DIF in test items between groups, conditional on the trait that the test is
measuring, using logistic regression.
Gives installation files:
difd.ado
difd.hlp
(click here to install)
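Because the findit result points to the bocode archive, the package can probably also be installed from the command line instead of clicking through. This is a sketch that assumes difd is still hosted on SSC under the same name.

```stata
* Assumes the package is still available from SSC (the bocode archive)
ssc install difd
* Open the help file that documents the syntax
help difd
```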
How to run DIFd in Stata
Code for detection of differential item functioning (DIF) from difd.hlp.
difd varlist , ID(var) ABility(varlist) GRoups(varlist) CATegorical(varlist) RUnname(str)
NUL(#) NUP(#) NUPValue(#) UBeta(#) UBP(#) UL(#) ULPV(#) UP(#) UPPValue(#)
ITemsub(#)
where:
varlist is the list of variables (items) to be tested for DIF
id is the ID variable (nb. leave this out).
ability is the ability variable(s) (derived from Mplus or similar).
groups is the list of grouping variables (binary, ordinal or continuous 'grouping'
variables can be used).
Options
categorical is the list of any group variables that are categorical and have more than 2
levels. Note can omit dichotomous variables from this list. Default is none (all
continuous or dichotomous).
runname names the log file DIFdrunname.log, where runname is the string supplied. Default is DIFd.log.
DIFd in Stata
Code for detection of differential item functioning (DIF) from difd.hlp.
difd varlist , ID(var) ABility(varlist) GRoups(varlist) CATegorical(varlist) RUnname(str)
NUL(#) NUP(#) NUPValue(#) UBeta(#) UBP(#) UL(#) ULPV(#) UP(#) UPPValue(#)
ITemsub(#)
Options cont….
• ul indicates whether the log-likelihood test will be used as a criterion for
uniform DIF. Default is no (0). UL(1) will include this criterion.
• ulpvalue is the p-value for testing uniform DIF with the log-likelihood method.
Default is 0.05.
Note: DIF results for categorical grouping variables will be in terms of the ordered
values of group. For example, if ethnic has 3 levels, 3 sets of DIF results will be
reported: ethnic12, ethnic13, ethnic23, where ethnic12 compares the 2 lowest
values of ethnic, ethnic13 the lowest and highest, etc.
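A call for the three-level case described in the note might look like the sketch below. The variable ethnic is hypothetical (it comes from the note's example, not from the data set used in these slides), and the option abbreviations are the ones shown in the syntax diagram.

```stata
* ethnic is a hypothetical three-level grouping variable (coded 1, 2, 3),
* assumed to be in memory alongside the items and the ability score
difd rut03 rut04 rut10 rut14 rut18, ab(conduct) gr(ethnic) cat(ethnic)
```

Listing ethnic in cat() is what requests the pairwise comparisons, reported as ethnic12, ethnic13 and ethnic23.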
DIFd Stata – back to Mplus
Run basic CFA model and save factor scores (ability) from Mplus to a data file
1) Add syntax to specify your ID in VARIABLE command:
idvar is caseno;
2) Add AUXILIARY in VARIABLE command to ensure covariates to be
used to test for DIF are included in the saved data file (as these will not be in
your basic CFA model or USEVARIABLES statement)
auxiliary is sex;
3) Add SAVEDATA following OUTPUT statement and specify file
name and location
SAVEDATA: SAVE=FSCORES; FILE=C:\DATA\bext16.DAT;
Mplus to save F scores (ability)
USEVARIABLES are rut03 rut04 rut10 rut14 rut18;
CATEGORICAL are rut03 rut04 rut10 rut14 rut18;
idvar is caseno;
AUXILIARY = sex;
missing are all ( 88 999 );
ANALYSIS:
! TYPE is missing H1;
ESTIMATOR IS wlsmv;
ITERATIONS = 1000;
CONVERGENCE = 0.00005;
MODEL:
Conduct by rut03 rut04 rut10 rut14 rut18;
OUTPUT: SAMPSTAT STANDARDIZED RES MOD(10) ;
SAVEDATA: SAVE=FSCORES; FILE=C:\DATA\bext16.DAT;
Mplus output (save data)
SAVEDATA INFORMATION
Order and format of variables
RUT03     F10.3
RUT04     F10.3
RUT10     F10.3
RUT14     F10.3
RUT18     F10.3
CASENO    I6
SEX       F10.3
CONDUCT   F10.3
Save file
C:\DATA\bext16.DAT
Save file format
5F10.3 I6 2F10.3
Save file record length 5000
RUT03 to RUT18 are the item responses; CONDUCT holds the individual factor scores (ability scores).
Import .dat file to SPSS
Open the .dat file in SPSS using the Text Import Wizard
nb. Step 2: select fixed width
Step 4: make sure column breaks are right aligned because of missing data
Step 6: check that each variable is numeric
Save as a Stata file!
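As an alternative to going through SPSS, the fixed-width file can probably be read straight into Stata with infix. This is a sketch only: the column positions are worked out from the Mplus formats listed above (five F10.3 items, I6 for caseno, then F10.3 each for sex and conduct), and the output file name bext16.dta is an assumption.

```stata
* Column ranges derived from the Mplus save format:
* five 10-character items, a 6-character ID, then two 10-character fields
infix rut03 1-10 rut04 11-20 rut10 21-30 rut14 31-40 rut18 41-50 ///
      long caseno 51-56 sex 57-66 conduct 67-76                  ///
      using "C:\DATA\bext16.DAT", clear

* Check the import before running difd
describe
summarize conduct

* Hypothetical output file name
save "C:\DATA\bext16.dta", replace
```

It is worth checking that cases with missing item responses come through as Stata missing values, since the column alignment issue noted for SPSS applies here too.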
DIFd in Stata
difd rut03 rut04 rut10 rut14 rut18, ru(difd16ext) ab(conduct) gr(sex) cat(sex) ul(1)
log: C:\data\DIFdfin16ext.log
(0 observations deleted)
There are 8773 observations.
The 5 items of interest: rut03 rut04 rut10 rut14 rut18.
The 1 group of interest: sex.
The 1 ability of interest: conduct.
_______________________________________________________________
Non-Uniform Differential Item Functioning
-> group = sex
----------------------------------------------
ability   |
and item  |  P(Dif.(LL))   Non-Uniform DIF
----------+-----------------------------------
conduct   |
    rut03 |   .8930773     no
    rut04 |   .0532258     no
    rut10 |   .0515209     no
    rut14 |   .0001041     yes
    rut18 |    .245261     no
----------------------------------------------
Non-Uniform DIF if P(Dif.(LL)) < .05
DIFd in Stata (Uniform DIF output)
Uniform Differential Item Functioning
-> group = sex
-----------------------------------------------------------
ability   |
and item  |  Change in Est.   P(Dif.(LL))   Uniform DIF
----------+------------------------------------------------
conduct   |
    rut03 |    .0042654        3.46e-16     yes
    rut04 |    .0135444        5.35e-06     yes
    rut10 |   -.0003362        .8060181     no
    rut14 |    .0011879         .159341     no
    rut18 |    .0006344        .6337193     no
-----------------------------------------------------------
Uniform DIF if Change in Est. > .1 or P(Dif.(LL)) < .05
This output was produced using DIFd version 1.0 by Paul Crane, Laura Gibbons, Lance Jolley, and
Gerald van Belle, University of Washington. Copyright 2005
DIFd in Stata (output file)
Saves the parameter estimates in an output data set, DIFd.dta, which includes the model
results, with Brant test p-values for ordinal items and Hosmer-Lemeshow p-values
for binary items.
Example extract of difd.dta file
type item   group  ability  ll        bab    sebab  bgp     sebgp  bi      sebintx  pHL  pBrant  model  dif    ll1
1    rut03  sex    b16ext   -1107.5   4.115  0.414  -0.982  0.313  -0.039  0.287         0.183   1      0.004  -1107.5
1    rut03  sex    b16ext   -1107.5   4.063  0.143  -1.021  0.130                        0.125   2      0.004  -1107.5
1    rut03  sex    b16ext   -1140.8   4.046  0.140                                       0.080   3      0.004  -1107.5
2    rut04  sex    b16ext   -2315.2   2.830  0.253  0.160   0.135  0.313   0.162         0.045   1      0.014  -2315.24
2    rut04  sex    b16ext   -2317.1   3.297  0.085  0.369   0.081                        0.005   2      0.014  -2315.24
2    rut04  sex    b16ext   -2327.5   3.253  0.084                                       0.038   3      0.014  -2315.24
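The uniform DIF result for rut03 can be checked approximately from the values in the difd.dta extract. This is a worked sketch under two assumptions: that the rows with ll = -1107.5 (bab 4.063) and ll = -1140.8 (bab 4.046) are the models with and without the group term, and that the relative change is taken against the ability-only coefficient. The inputs are the rounded values from the extract, so the results will only roughly match the reported .0042654 and 3.46e-16.

```stata
* Rounded values copied by hand from the rut03 rows of the extract
scalar b_nogroup  = 4.046       // bab, ability-only model
scalar b_group    = 4.063       // bab, ability + group model
scalar ll_nogroup = -1140.8     // log likelihood, ability-only model
scalar ll_group   = -1107.5     // log likelihood, ability + group model

* Change-in-estimate criterion (assumed formula)
display "Relative change in ability coefficient = " (b_group - b_nogroup)/b_nogroup

* Likelihood-ratio criterion with 1 degree of freedom
scalar lr = 2*(ll_group - ll_nogroup)
display "LR chi2(1) = " lr
display "p-value    = " chi2tail(1, lr)
```

The relative change is well under .1, so rut03 is flagged for uniform DIF only because the log-likelihood criterion was switched on with ul(1).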
DIFd in Stata
Advantages:
Uniform and non-uniform DIF
Continuous, binary and polytomous items
Use ability scores rather than total scores
Disadvantages:
One subgroup at a time
Designed for unidimensional IRT (not multidimensional scales)
Based on ordinal logistic regression model so assumes proportional
slopes
To assess the impact of DIF
IRT scores that account for DIF can be compared with scores that do not
account for DIF, participant by participant
References
Camilli, G. and Shepard, L. A. (1994). Methods for Identifying Biased Test Items. Thousand Oaks, CA: Sage.
Crane, P. K., van Belle, G. and Larson, E. B. (2004). Test bias in a cognitive test: differential item functioning in the CASI. Statistics in Medicine, 23, 241–256.
Crane, P. K., Gibbons, L. E., Jolley, L. and van Belle, G. (2006). Differential item functioning analysis with ordinal logistic regression techniques: DIFdetect and difwithpar. Medical Care, 44, S115–S123.
Gibbons, L. E., McCurry, S., et al. (2009). Japanese–English language equivalence of the Cognitive Abilities Screening Instrument among Japanese-Americans. International Psychogeriatrics, 21(1), 129–137.
Jones, R. N. (2006). Identification of measurement differences between English and Spanish language versions of the Mini-Mental State Examination: detecting differential item functioning using MIMIC modeling. Medical Care, 44, S124–S133.
Mellenbergh, G. (1989). Item bias and item response theory. International Journal of Educational Research, 13, 127–143.
Reise, S. P., Widaman, K. F. and Pugh, R. H. (1993). Confirmatory factor analysis and item response theory: two approaches for exploring measurement invariance. Psychological Bulletin, 114(3), 552–566.
Teresi, J. (2006). Different approaches to differential item functioning in health applications: advantages, disadvantages and some neglected topics. Medical Care, 44(11), S152–S170.
Zumbo, B. D. (1999). A handbook on the theory and methods of differential item functioning (DIF): Logistic regression modeling as a unitary framework for binary and Likert-type (ordinal) item scores. Ottawa, Canada: Directorate of Human Resources Research and Evaluation, Department of National Defense.