MCUAAAR: Methods & Measurement Core Workshop:
Structural Equation Models for Longitudinal Analysis of Health Disparities Data
April 11th, 2007, 11:00 to 1:00
ISR 6050
Thomas N. Templin, PhD
Center for Health Research, Wayne State University
Many hypotheses concerning health disparities involve the comparison of longitudinal repeated measures data across one or more groups. A chief advantage of this type of design is that individuals act as their own controls, reducing confounding.
SEM Models for Balanced Continuous Longitudinal Data
 Early Models (Jöreskog, 1974, 1977)
- Autoregressive (2-wave or multi-wave)
- Covariance structure only (means were not modeled)
- Simplex, Markov, and other models for correlated error structure
 Contemporary Models
- Autoregressive models with means structures (Arbuckle, 1996)
- Growth curve models
  - Latent means with no variance (Jöreskog, 1989)
  - Latent factors with means and variance (Tisak & Meredith, 1990)
 Multigroup and Cohort Sequential Designs
- Latent means and variance modeled separately (random-effects mixed design) (Rovine & Molenaar, 2001)
- Latent change and difference models (McArdle & Hamagami, 2001)
SEM Models for Balanced Continuous Longitudinal Data
 Contemporary Models (cont.)
- Growth curve models (cont.)
  - Growth models for experimental designs (Muthén & Curran, 1997)
  - Biometric models (McArdle et al., 1998)
  - Pooled interrupted time series model (Duncan & Duncan, 2004)
  - Latent class GC models (Muthén, M-Plus)
  - Multilevel GC models
MG-Latent Identity Basis Model
 Unlike the familiar two-wave autoregressive model, latent growth curve and change and difference models involve a different approach to SEM modeling.
 Many of these models appear to be variations of one another.
 I formulated what I am calling a multigroup latent identity basis model (MG-LBM) that serves as a starting point for more specific longitudinal models.
 I will formulate this model and then derive latent difference and growth, random effects, and other kinds of models that have appeared in the literature.
MG-Latent Basis Model
 Two parts:
- Means structure
  - Within-group coding of within-subject contrasts. Test parameters by comparing models with and without equality constraints.
  - Between- plus within-group coding. Test parameters directly.
- Covariance structure
  - Model error directly (replace error covariances with latent factors, etc.)
  - Model error indirectly (add latent structure to prediction equations)
Means Structure Notation

Y_i = I μ + ε_i

where
  Y_i = (y_i1, y_i2, y_i3, y_i4)'  is the vector of repeated measures for subject i,
  I is the 4 x 4 identity matrix,
  μ = (μ_1, μ_2, μ_3, μ_4)'  is the vector of latent means,
  ε_i = (ε_i1, ε_i2, ε_i3, ε_i4)'  is the error vector,

with E(ε) = 0 (4 x 1), E(εε') = cov(Y) (4 x 4), and E(Y) = μ (4 x 1).
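A quick numeric sketch of this identity-basis means structure (the means and error variance below are illustrative values, not the workshop data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Identity basis: the 4 x 4 identity matrix maps each latent mean
# onto exactly one measurement occasion.
I4 = np.eye(4)
mu = np.array([21.2, 22.2, 23.1, 24.1])   # illustrative latent means
n = 10_000

# Y_i = I mu + eps_i; errors are kept uncorrelated here for simplicity.
eps = rng.normal(scale=2.0, size=(n, 4))
Y = I4 @ mu + eps

print(Y.mean(axis=0))   # sample occasion means recover mu
```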
Amos Setup: Within-Group Coding of Means Structure for Girls Group

[Amos path diagram: observed variables y1-y4, each with an error term e1-e4 (means 0, variances g1-g4, covariances g12-g34) and intercept ig1-ig4; each y loads 1 on its own latent mean factor m1-m4 (variances fixed to 0).]
Amos Setup: Within-Group Coding for Boys Group

[Amos path diagram: same structure as the girls group, with error variances b1-b4, covariances b12-b34, and intercepts ib1-ib4.]
Within-Group Coding
Parameter constraints identified in "manage models":
All intercepts are constrained to 0:
ib1 = ib2 = ib3 = ib4 = ig1 = ig2 = ig3 = ig4 = 0
Estimated Means Structure Model for Girls Group
MG-LBM Model (within-group coding)
Girls
Chi Square = .000, DF = 0
Chi Square Probability = \p, RMSEA = \rmsea, CFI = 1.000

[Estimated path diagram: latent means m1 = 21.18, m2 = 22.23, m3 = 23.09, m4 = 24.09; error variances 4.10, 3.29, 5.08, 5.40; error covariances 3.05 to 4.97.]
Estimated Means Structure Model for Boys Group
MG-LBM Model (within-group coding)
Boys
Chi Square = .000, DF = 0, Chi Square Probability = \p, RMSEA = \rmsea, CFI = 1.000

[Estimated path diagram: latent means m1 = 22.88, m2 = 23.81, m3 = 25.72, m4 = 27.47; error variances 5.64, 4.28, 6.59, 4.08; error covariances 1.51 to 3.40.]
Contrast Coding Across Groups
 In order to explicitly estimate between-group effects and interactions you need one design matrix for within and between effects.
 The more general coding described next will provide a foundation for this.
 With 4 repeated measures and 2 groups, a total of 8 contrasts or identity vectors are needed.
 The same 8 means will be estimated, but now there is one design matrix across both groups.
 This is achieved by constraining the parameter estimates for each of the 8 identity vectors to be equal across groups.
Design Matrix to Code Within and Between Effects

Girls: Y_1 = (y_11, y_12, y_13, y_14)'
Boys:  Y_2 = (y_21, y_22, y_23, y_24)'

        1  2  3  4  5  6  7  8
y_11    1  0  0  0  0  0  0  0
y_12    0  1  0  0  0  0  0  0
y_13    0  0  1  0  0  0  0  0
y_14    0  0  0  1  0  0  0  0
y_21    0  0  0  0  1  0  0  0
y_22    0  0  0  0  0  1  0  0
y_23    0  0  0  0  0  0  1  0
y_24    0  0  0  0  0  0  0  1
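As a check, the across-group design above is just an 8 x 8 identity; a minimal sketch (placeholder means, not the workshop data):

```python
import numpy as np

# Across-group identity basis: 4 occasions x 2 groups = 8 identity vectors.
design = np.eye(4 * 2)

# Rows follow the stacked order from the slide: girls y11..y14, boys y21..y24.
labels = [f"y{g}{t}" for g in (1, 2) for t in (1, 2, 3, 4)]

# Each column selects exactly one of the 8 means, so one design matrix
# reproduces all 8 occasion-by-group means across both groups.
mu = np.arange(1, 9)          # placeholder means 1..8
print(dict(zip(labels, design @ mu)))
```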
Amos Coding for Means Structure: Girls Group

[Amos path diagram: all 8 latent mean factors m1-m8 (means mg1-mg8, variances 0) appear in the girls group; one set of four factors loads 1 on y1-y4 and the other set loads 0, so that together with the boys group the 8 factors form the across-group identity basis. Error structure as before: variances g1-g4, covariances g12-g34, intercepts ig1-ig4.]
Amos Coding for Means Structure: Boys Group

[Amos path diagram: the same 8 factors m1-m8, here with means mb1-mb8; the 1/0 loading pattern is the complement of the girls group. Error structure: variances b1-b4, covariances b12-b34, intercepts ib1-ib4.]
Alternate Coding: Girls Group

[Amos path diagram: an alternate layout of the same girls-group coding, with factors m1-m8 and means mg1-mg8.]
Alternate Coding: Boys Group

[Amos path diagram: an alternate layout of the same boys-group coding, with factors m1-m8 and means mb1-mb8.]
Parameter Constraints
Parameter constraints identified in "manage models":
All intercepts are constrained to 0:
ib1 = ib2 = ib3 = ib4 = ig1 = ig2 = ig3 = ig4 = 0
Each of the p x q latent means is constrained to equality across groups (boys = girls):
mb1 = mg1
mb2 = mg2
mb3 = mg3
mb4 = mg4
mb5 = mg5
mb6 = mg6
mb7 = mg7
mb8 = mg8
Estimated Means
MG-LBM Model
Girls
Chi Square = .000, DF = 0
Chi Square Probability = \p, RMSEA = \rmsea, CFI = 1.000

[Estimated path diagram: all 8 latent means estimated in the girls group: m1-m4 = 21.18, 22.23, 23.09, 24.09 and m5-m8 = 22.88, 23.81, 25.72, 27.47; girls' error variances 4.10, 3.29, 5.08, 5.40 with covariances 3.05 to 4.97.]
Estimated Means
MG-LBM Model
Boys
Chi Square = .000, DF = 0, Chi Square Probability = \p, RMSEA = \rmsea, CFI = 1.000

[Estimated path diagram: the same 8 latent mean estimates as in the girls group (the means are constrained equal across groups); boys' error variances 5.64, 4.28, 6.59, 4.08 with covariances 1.51 to 3.40.]
Application
 This method is used to construct models for cohort sequential designs and for missing value treatments when there are distinct patterns of missingness.
 It may be useful for family models where the groups represent families of different sizes or composition.
Remember Everything You Used to Know About Coding Regression
 With this mean structure basis you can now apply any of the familiar regression coding schemes to test contrasts of interest.
 You can use dummy, contrast, or effects coding. Polynomial coding is used for growth curve models. Dummy coding will compare baseline to each follow-up measurement.
 Interactions are coded in the usual way as product design vectors.
 Using the inverse transform of Y you can construct contrasts specific to your hypothesis if the standard ones are not adequate.
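For the growth-curve case mentioned above, a hedged sketch of polynomial coding (the time scores are illustrative equal spacings, not the workshop's design; the means are the girls' estimated occasion means):

```python
import numpy as np

# Polynomial (growth-curve) coding for 4 occasions:
# columns are the unit vector (intercept), linear, and quadratic trends.
t = np.array([0.0, 1.0, 2.0, 3.0])        # illustrative time scores
basis = np.column_stack([np.ones_like(t), t, t**2])

# With 3 columns and 4 means this is a least-squares projection of the
# occasion means onto intercept/linear/quadratic growth coefficients.
mu = np.array([21.18, 22.23, 23.09, 24.09])
kappa, *_ = np.linalg.lstsq(basis, mu, rcond=None)
print(kappa)                               # intercept, slope, quadratic
```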
Dummy Coding to Compare Each Follow-up Measure With the Baseline Measure

        1  2  3  4
  y1    1  0  0  0
  y2    1  1  0  0
  y3    1  0  1  0
  y4    1  0  0  1

Note that here we include the unit vector in the dummy coding. In regression, the unit vector is included automatically so you don't usually think about it.
Amos Setup: Dummy Coding to Compare Each Follow-up Measure With the Baseline Measure

[Amos path diagram: factors f1-f4 with means free and variances 0; f1 (the unit vector) loads 1 on all of y1-y4, while f2, f3, and f4 each load 1 on their own follow-up occasion; observed-variable intercepts fixed to 0; error variances g1-g4 with covariances g12-g34.]
Comments & Interpretation
 There is nothing intuitive about the coding. It is based on the inverse transform.
 Here it looks like we are taking the average of all the measures to compare with each follow-up measure.
 In reality, we are just comparing baseline (i.e., Y1) with each follow-up measure.
 The latent means estimate Y1, Y2-Y1, Y3-Y1, and Y4-Y1.
 Check this out against the means in the handout.
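The inverse-transform logic can be checked directly; this sketch uses the dummy basis from the earlier slide together with the girls' estimated occasion means:

```python
import numpy as np

# Dummy-coding basis L from the slide: unit vector plus one indicator
# per follow-up occasion.
L = np.array([
    [1, 0, 0, 0],   # y1 (baseline)
    [1, 1, 0, 0],   # y2
    [1, 0, 1, 0],   # y3
    [1, 0, 0, 1],   # y4
], dtype=float)

# The latent means are kappa = L^{-1} mu, so the rows of L^{-1} show the
# contrasts actually estimated: Y1, Y2-Y1, Y3-Y1, Y4-Y1.
Linv = np.linalg.inv(L)
print(Linv)

mu = np.array([21.18, 22.23, 23.09, 24.09])   # girls' occasion means
print(Linv @ mu)   # baseline mean followed by the three changes
```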
Change From Baseline Model
Dummy Coding of Lambda
kappa(i+1) = Time(i+1) mean - Time(1) mean
Girls
Chi Square = .000, DF = 0
Chi Square Probability = \p, RMSEA = \rmsea, CFI = 1.000

[Estimated path diagram: f1 = 21.18 (baseline mean); f2 = 1.05, f3 = 2.00, f4 = 2.91 (changes from baseline); error variances 4.10, 3.29, 5.42, 5.40 with covariances 3.05 to 5.14.]
Statistical Tests of Change Contrasts
Asymptotic Test

        Estimate    S.E.    C.R.      P
  f1     21.182     .641    33.067    ***
  f2      1.045     .360     2.907    .004
  f3      2.000     .421     4.750    ***
  f4      2.909     .398     7.312    ***
Statistical Tests of Change Contrasts
Bootstrapped Tests and 95% CI

  Parameter   Estimate    Lower     Upper     P
  f1           21.182     20.078    22.517    .010
  f2            1.045       .331     1.666    .010
  f3            2.000      1.145     2.743    .010
  f4            2.909      2.068     3.602    .010
Novel Contrast Using Inverse Transform

[Slide shows a 4 x 4 contrast matrix A built from 0 and 1 entries (first row the unit vector) together with its inverse, whose entries include halves (1/2); the inverse supplies the loadings used in the next diagram.]
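The slide's exact contrast matrix is not fully legible in this transcript, so the sketch below uses a hypothetical full-rank contrast set to show the mechanics: pick contrasts C, then enter Lambda = C^{-1} as the loadings.

```python
import numpy as np

# Hypothetical contrast set (any full-rank 4 x 4 matrix works):
# row 1 = overall level, rows 2-4 = custom comparisons.
C = np.array([
    [1,  1,  1,  1],   # sum / overall level
    [1,  1, -1, -1],   # first half vs second half
    [1, -1,  0,  0],   # occasion 1 vs 2
    [0,  0,  1, -1],   # occasion 3 vs 4
], dtype=float)

# Latent means kappa = C mu; the loadings to enter in Amos are
# Lambda = C^{-1}, since mu = Lambda kappa.
Lambda = np.linalg.inv(C)

mu = np.array([21.18, 22.23, 23.09, 24.09])   # girls' occasion means
kappa = C @ mu
# Round-trip check: the loadings reproduce the occasion means.
print(np.allclose(Lambda @ kappa, mu))
```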
Novel Coding Using Inverse Transform
Girls
Chi Square = .000, DF = 0, Chi Square Probability = \p, RMSEA = \rmsea, CFI = 1.000

[Estimated path diagram: loadings of 1.00, .50, and -.50 taken from the inverse transform; estimated latent means f1 = -3.86, f2 = -.91, f3 = -1.05, f4 = 24.09; error variances 4.10, 3.29, 5.42, 5.40 with covariances 3.05 to 5.14.]
Growth Curve Model with Fixed Effects Only
Jöreskog, 1989

[Amos path diagrams for the girls and boys groups: latent ICEPT and Slope factors with means free and variances fixed to 0; ICEPT loads 1 on y1-y4 and Slope loads 0, 2, 4, 6; intercepts m1-m4; error terms with variances g1-g4 (girls) / b1-b4 (boys) and freely correlated errors g12-g34 / b12-b34.]
Constraints on Model Parameters
Constraints on covariance matrix (homogeneity of covariance assumption):
b12 = g12
b13 = g13
b14 = g14
b23 = g23
b24 = g24
b34 = g34
Intercepts set to zero in both groups:
m1 = m2 = m3 = m4 = 0
Y variable variances are set equal within group:
b1 = b2 = b3 = b4
g1 = g2 = g3 = g4
Growth Curve Model
Girls
Jöreskog & Sörbom (1989, LISREL 7 User Guide, 2nd Ed., p. 261)
Chi Square = 11.454, DF = 16
Chi Square Probability = .781, RMSEA = .000, CFI = 1.000

[Estimated path diagram: ICEPT mean 21.23 and Slope mean .48 (variances 0); error variances all 3.70 with covariances 2.90 to 3.45.]
Growth Curve Model
Boys
(Potthoff & Roy, 1964)
Jöreskog & Sörbom (1989, LISREL 7 User Guide, 2nd Ed., p. 261)
Chi Square = 11.454, DF = 16, Chi Square Probability = .781, RMSEA = .000, CFI = 1.000

[Estimated path diagram: ICEPT mean 22.60 and Slope mean .79 (variances 0); error variances all 5.84 with covariances constrained equal to the girls' values.]

Compare to the data in the handout: do the slope and intercept estimates look reasonable for each group?
Part II: Covariance Structure for Correlated Observations
 Standard techniques like OLS regression, ANOVA, and MANOVA compare means and leave the correlated error unanalyzed.
 The SEM approach, and modern regression procedures like HLM, tap the information in the correlation structure.
 Latent structure can be brought out of the error side or the observed variable side of the model.
Amos Setup: Growth Curve Model with Random Slope and Intercept

[Amos path diagram, girls group: ICEPT and Slope are now latent factors with free variances and covariance; ICEPT loads 1 on y1-y4 and Slope loads 0, 2, 4, 6; intercepts mg1-mg4; error terms e1-e4 with variances g1-g4 and covariances g12-g34.]
Model Constraints
Correlations among error terms are fixed to 0:
b12 = g12 = 0
b13 = g13 = 0
b14 = g14 = 0
b23 = g23 = 0
b24 = g24 = 0
b34 = g34 = 0
b3 = b4
Intercepts fixed to 0:
m1 = m2 = m3 = m4 = mg1 = mg2 = mg3 = mg4 = 0
Growth Curve Model (Tisak & Meredith, 1990)
Girls
Chi Square = 10.144, DF = 11, Probability = .517, RMSEA = .000, CFI = 1.000

The covariance among the measures is now accounted for by the random effects.

[Estimated path diagram: ICEPT mean 21.24 and variance 2.88; Slope mean .48 and variance .02; ICEPT-Slope covariance .14; Slope loadings 0, 2, 4, 6; error variances .89, .50, .39, .17 with covariances fixed to 0.]
Growth Curve Model
Boys
Chi Square = 10.144, DF = 11, Probability = .517, RMSEA = .000, CFI = 1.000

[Estimated path diagram: ICEPT mean 22.51 and variance 2.11; Slope mean .81 and variance -.01; ICEPT-Slope covariance .06; error variances 3.78, 2.36, 2.55, 2.55 with covariances fixed to 0.]
Mixed Model (Rovine & Molenaar, 2001)
Girls
Latent variable parameters constrained equal across groups
Chi Square = 32.467, DF = 20
Chi Square Probability = .039, RMSEA = .158, CFI = .806

The fixed and random parts can be separated at the latent level. The mathematical equivalence of this type of SEM and the hierarchical or mixed-effects model with balanced data was shown by Rovine & Molenaar (2001). Extensions to other kinds of multilevel or clustered data have appeared in the literature.

[Estimated path diagram: fixed-part factors ICEPT-m (mean 21.21) and slope-m (mean .48) with variances 0; random-part factors ICEPT (mean 0, variance 2.91) and Slope (mean 0, variance .02) with covariance -.01; Slope loadings 0, 2, 4, 6; error variances all 1.72 with covariances fixed to 0.]
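A minimal numpy sketch of that equivalence, using illustrative parameter values loosely based on the girls' estimates (this is not a fit of the actual data):

```python
import numpy as np

rng = np.random.default_rng(1)

# Balanced random-intercept/random-slope model, the mixed-model twin of
# the SEM above: y_i = X beta + Z b_i + e_i.
t = np.array([0.0, 2.0, 4.0, 6.0])        # time scores / slope loadings
X = np.column_stack([np.ones(4), t])      # fixed-effects design
Z = X                                     # random effects share the design
beta = np.array([21.2, 0.48])             # fixed effects (assumed values)
G = np.diag([2.9, 0.02])                  # random-effect covariance (assumed)
sigma2 = 1.7                              # residual variance (assumed)

n = 5000
b = rng.multivariate_normal(np.zeros(2), G, size=n)
e = rng.normal(scale=np.sqrt(sigma2), size=(n, 4))
Y = beta @ X.T + b @ Z.T + e              # n x 4 repeated measures

# The implied covariance has the SEM form: Z G Z' + sigma^2 I.
implied = Z @ G @ Z.T + sigma2 * np.eye(4)
print(np.abs(np.cov(Y.T) - implied).max())   # sampling error only
```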
[Amos path diagram: the random ICEPT and Slope factors (Slope loadings 0, 2, 4, 6) are used as predictors of a Health Outcome variable.]

If the latent factors have sufficient variance, they can be used as variables in a more comprehensive model. Here the intercept has substantial variance but the slope does not. Individual differences in the intercept could be an important predictor of health outcome.
[Amos path diagram: a variable correlated with race/ethnicity predicts the random ICEPT factor, which in turn predicts a Health Outcome variable.]

Here individual differences in the intercept are modeled as a mediator of health outcome.
The longitudinal repeated measures advantage only applies for constructs that actually do change over time. In the example below, individual differences exist only in the average score, or the intercept, resulting in a between-groups analysis subject to all the usual confounding.

[Figure: Y plotted against Time; trajectories differ in level only.]

Change in Y would only be related to other variables by chance. In longitudinal analysis, determining the variance in true change is critical, but how to do it is somewhat of an issue. For example, in the figure below, true change exists at the population level but is constant within groups.

[Figure: Y plotted against Time; each group changes at a constant rate.]

Once group is taken into account there are no individual differences in rate of change. Hence hypotheses concerning change in Y at the group level should be recognized as untestable.
Pooled Interrupted Time Series Analyses
Duncan & Duncan, 2004

Change From Baseline Model
Dummy Coding of Lambda
kappa(i+1) = Time(i+1) mean - Time(1) mean
Girls
Chi Square = 7.615, DF = 5, Chi Square Probability = .179, RMSEA = .229, CFI = .946

[Estimated path diagram: the baseline-contrast factors f1 = 21.18, f2 = 1.05, f3 = 2.00, f4 = 2.91 as before, but the error covariances are now carried by residual factors r1-r4 (variances 2.56, 2.56, .75, .75) with loadings including 1.00, 1.16, .74, and .95; observed-variable error variances fixed to 1.00.]

Amos default growth model

[Template path diagram: observed variables X1-X4 with errors E1-E4 loading on latent ICEPT and SLOPE factors.]