
General Structural Equations
(LISREL)
Week 3 #4
Mean Models Reviewed
Non-parallel slopes
Non-normal data
Models for Means and Intercepts (continued)

Multiple Group Models: for "zero order" latent variable mean differences:

• "Free" the individual measurement equation intercepts, but constrain them to equality across groups
• Fix the latent variable means to 0 in group 1
• Free the latent variable means in groups 2 through k
• If the latent variables of interest are endogenous and there are exogenous latent variables in the model, constrain the construct equation path coefficients to zero

Each latent variable mean parameter represents a contrast with (difference from) the "reference group" (the group whose latent variable mean is set to zero). LR tests are requested for joint hypotheses (e.g., a model constraining the means to zero in all groups vs. a model with the means in groups 2 through k freed).

Check the modification indices on the measurement equation intercepts to verify that the "proportional indicator differences" assumption holds (or at least holds approximately). (See the notational sketch below.)
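As a notational sketch (standard LISREL mean-structure algebra, not taken verbatim from the slides), the setup above implies, for the x-side measurement model in group g:

x^{(g)} = \tau_x + \Lambda_x \xi^{(g)} + \delta^{(g)}, \qquad
E\bigl(\xi^{(1)}\bigr) = 0, \qquad E\bigl(\xi^{(g)}\bigr) = \kappa^{(g)}, \quad g = 2,\dots,k

\mu_x^{(g)} = \tau_x + \Lambda_x \kappa^{(g)}
\quad\Longrightarrow\quad
\mu_x^{(g)} - \mu_x^{(1)} = \Lambda_x \kappa^{(g)}

So each kappa is the latent-mean contrast with the reference group, and the implied indicator mean differences are proportional to the loadings, which is the "proportional indicator differences" assumption that the TAU-X modification indices are used to check.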
AMOS Programming

• Check off "means and intercepts"
• Means and intercepts will now appear on the diagram. Where variances used to appear, there will now be two parameters (mean + variance); where the variable is dependent, one parameter (intercept) will appear.
• Impose appropriate parameter constraints
[insert brief demonstration here!]
Review yesterday's slides from slide 52

• Uses World Values Study 1990 data for an example
• We'll use an updated version (new data, some difference in countries) today
• Refer to handout (slides not reproduced)
Means1a.LS8 - tau-x elements allowed to vary between countries. Must fix kappa (mean of ksi's) to 0 since otherwise not identified.
Chi-square=233.65 df=42

United States:
TAU-X
             A006      F028      F066      F063      F118      F119      F120      F121
           1.6191    3.6383    2.2287    8.5530    4.7504    2.9739    4.3443    5.9000
         (0.0263)  (0.0688)  (0.0563)  (0.0733)  (0.0941)  (0.0749)  (0.0883)  (0.0757)
          61.4969   52.8937   39.5980  116.6334   50.4717   39.6838   49.2263   77.9553

Canada:
TAU-X
             A006      F028      F066      F063      F118      F119      F120      F121
           2.1202    4.7402    3.2042    7.4657    5.4974    3.3091    4.4986    6.0079
         (0.0236)  (0.0612)  (0.0551)  (0.0706)  (0.0812)  (0.0646)  (0.0713)  (0.0636)
          89.9232   77.4453   58.1887  105.7780   67.7035   51.2445
Means1b.ls8
Measurement model like Means1a, but now we express the group 1 versus group 2 differences in means with 2 parameters (1 for each latent variable), as opposed to calculating them for each indicator using, e.g., TX1[1] - TX1[2].
Chi-square=276.27 df=48

KAPPA in Group 2 (Canada)   [KAPPA in Group 1 is fixed to zero]
            KSI 1     KSI 2
           1.0712    0.3236
         (0.0731)  (0.0948)
          14.6538    3.4138

The above provides significance tests for:
Canada-U.S. differences in religiosity (z=14.6538, p<.001)
Canada-U.S. differences in sex/morality attitudes (z=3.4138, p<.001)

For a joint significance test of whether the means for Religiosity and Sex/morality are both different (null hypothesis: both differences = 0), see program Means1c.ls8.
Chi-square = 512.9661, df=50 for that model; subtract the chi-squares (512.97 - 276.27 = 236.70) for the test (df = 50 - 48 = 2).
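As an illustrative check of that likelihood-ratio comparison (a sketch using scipy, not part of the original handout), the chi-square difference is referred to a chi-square distribution with df equal to the difference in degrees of freedom:

# Chi-square difference test: Means1c (both latent mean differences
# constrained to 0) vs. Means1b (latent means free in group 2).
from scipy.stats import chi2

chisq_constrained, df_constrained = 512.9661, 50   # Means1c
chisq_free, df_free = 276.27, 48                   # Means1b

delta_chisq = chisq_constrained - chisq_free       # about 236.70
delta_df = df_constrained - df_free                # 2
p_value = chi2.sf(delta_chisq, delta_df)           # essentially 0

print(f"delta chi-square = {delta_chisq:.2f} on {delta_df} df, p = {p_value:.3g}")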
Diagnostics for this model:
See Modification Indices for the TX vectors:

USA
Modification Indices for TAU-X
             A006      F028      F066      F063      F118      F119      F120      F121
           0.6495    0.2995    2.8724    8.2808   27.0494    2.0749   12.1313    5.2727

CANADA
Modification Indices for TAU-X
             A006      F028      F066      F063      F118      F119      F120      F121
           0.6495    0.2995    2.8725    8.2808   27.0495    2.0749   12.1312    5.2728

Expected Change for TAU-X
             A006      F028      F066      F063      F118      F119      F120      F121
           0.0164    0.0238    0.0593    0.1261    0.3003    0.0637   -0.1981   -0.1015
Means2a
Model with exogenous single-indicator variables.
Single-indicator ksi-variables: gender, age, education.
Specification GA=IN in group 2 implies a parallel slopes model.
Thus, the AL parameters in group 2 can be interpreted as "group 1 vs. group 2 differences, controlling for differences in sex, education and age."

TAU-X
           GENDER       AGE      EDUC
           0.4217   42.3840    4.5365
         (0.0146)  (0.4750)  (0.0413)
          28.9360   89.2300  109.8409

ALPHA
            ETA 1     ETA 2
           1.2272    0.5898
         (0.0714)  (0.0954)
          17.1899    6.1819

KAPPA
           GENDER       AGE      EDUC
          -0.0196    3.3360   -0.4151
         (0.0187)  (0.6297)  (0.0504)
          -1.0482    5.2977   -8.2333
Diagnostics:
Test of the equal slopes (GA=IN) assumption:

Modification Indices for GAMMA
           GENDER       AGE      EDUC
 ETA 1     7.7083    6.9705    0.2122
 ETA 2     3.1923    0.1765    9.3836

A global test will require the estimation of a separate model (Means2b) with GA=PS (parallel slopes assumption relaxed).

Chi-square comparisons:
             Chi-square    df     CFI
Means2a:        699.807    90   .9635
Means2b:        669.594    84   .9649
Means2b

ALPHA - Canada (fixed to 0 in US)
            ETA 1     ETA 2
           1.2545    0.6371
         (0.0725)  (0.0968)
          17.3057    6.5809

GAMMA - USA
           GENDER       AGE      EDUC
 ETA 1     0.6845   -0.0170    0.0817
         (0.1003)  (0.0031)  (0.0352)
           6.8230   -5.5398    2.3209
 ETA 2     0.0624   -0.0144    0.3074
         (0.1462)  (0.0045)  (0.0520)

GAMMA - Canada
           GENDER       AGE      EDUC
 ETA 1     0.9597   -0.0308    0.1525
         (0.0931)  (0.0028)  (0.0389)
          10.3099  -11.1125    3.9173
 ETA 2    -0.0936   -0.0246    0.5333
         (0.1200)  (0.0036)  (0.0521)
Expressing effects when the parallel slopes assumption is relaxed:
is the pattern diverging, converging, crossover?

Equations:
Eta1 = alpha1 + gamma1 Ksi1 + gamma2 Ksi2 + gamma3 Ksi3 + zeta1

Hold all Ksi variables except one constant at 0. This is not quite the overall mean (Ksi=0 in group 1, but in group 2 it is 0 + kappa), but close enough.

In group 1, alpha1 = 0, so the equation is:
Eta1 = gamma1[1] Ksi1   [+ alpha1 = 0 + gamma2 Ksi2 = 0 + gamma3 Ksi3 = 0 + zeta1, where E(zeta1) = 0]

In group 2, alpha1 = alpha1[2]:
Eta1 = alpha1[2] + gamma1[2] Ksi1   [+ other terms = 0]

Now, the question is: at what values do we evaluate the equation?
1. Ksi1 = 0. This is the Ksi1 mean in group 1. (We could alternatively use something like kappa1[2]/2, which is halfway between the group 1 and group 2 means of Ksi1, or even a weighted version.)
2. Ksi1 = 0 + k standard deviations, where k can be any reasonable number (1? 1.5? 2.0?)
3. Ksi1 = 0 - k standard deviations.
How do we find the standard deviation of Ksi?
Look at the PHI matrix to obtain variances, and take the square root of these!

PHI (USA)
           GENDER       AGE      EDUC
 GENDER    0.2441
         (0.0102)
          23.9687
 AGE      -0.4381  259.2400
         (0.2350) (10.8158)
          -1.8642   23.9687
 EDUC      0.0251    1.7457    1.9599
         (0.0204)  (0.6670)  (0.0818)
For education, if we had a pooled estimate (Canada + US) we could use it; otherwise we can approximate with the average of the two group variances, (1.9599 + 1.4733)/2 ~ 1.72, and sqrt(1.72) = 1.3. So we will want to evaluate at EDUC=0, EDUC=+1.3 (or perhaps +2.6?), and EDUC=-1.3 (or perhaps -2.6?).

At Educ = 0, the Canada-US difference is 1.2545 (see the alpha parameter above): USA = 0, Canada = 1.2545.
At Educ = -2.6: USA = 0 + (-2.6 * .0817) = -.2124        [USA gamma for educ = .0817]
                Canada = 1.2545 + (-2.6 * .1525) = .858   [Canadian gamma for educ = .1525]
At Educ = +2.6: USA = 0 + (2.6 * .0817) = .2124
                Canada = 1.2545 + (2.6 * .1525) = 1.651
(A code sketch reproducing these values follows below.)
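As an illustrative sketch (not part of the original handout; the parameter values are taken from the Means2b output above), the same evaluation can be scripted:

import math

# Approximate pooled SD of education (group variances as given on the slide)
sd_educ = math.sqrt((1.9599 + 1.4733) / 2)        # about 1.31

alpha_canada = 1.2545                              # Canada intercept (USA fixed to 0)
gamma_usa, gamma_canada = 0.0817, 0.1525           # education slopes for ETA 1

for educ in (-2 * sd_educ, 0.0, 2 * sd_educ):      # roughly -2.6, 0, +2.6
    usa = gamma_usa * educ
    canada = alpha_canada + gamma_canada * educ
    print(f"EDUC = {educ:+.2f}:  USA = {usa:+.4f}   Canada = {canada:+.4f}")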
[Figure: predicted values for USA and Canada plotted at Education = -2.6, 0, and +2.6 (y-axis from -0.5 to 2).]
For age, the approximate standard deviation is sqrt(270) = 16.43. We could thus use 0 ± 16.43, or 0 ± 32.86, or 0 ± (1.5 * 16.43); or, since we know that the mean is approximately 42 (see the tau-x parameter), we could simply do something like ± 20 years (more intuitive).

[Figure: predicted values for USA and Canada plotted at Age = -20, 0, and +20, where 0 = 42 years (y-axis from -0.5 to 2).]
Models for Four Groups
• Canada
• U.S.A.
• Germany
• U.K.

Means3a  GA=PS  Chi-square = 1892.25  df=180
Means3b  GA=IN  Chi-square = 1986.94  df=198
Spreadsheet formulas:
USA:      =0.0738*B8
Canada:   =1.087+(B8*0.1457)
UK:       =2.4339+(B8*-0.1167)
Germany:  =1.8139+(B8*0.0957)
[B8 refers to the first education row; the formula becomes B9, B10 for the rows below.]

Value of Educ        USA     Canada         UK    Germany
        -2.6    -0.19188    0.70818    2.73732    1.56508
         0       0          1.087      2.4339     1.8139
         2.6     0.19188    1.46582    2.13048    2.06272
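A small Python sketch (not in the original handout; the intercepts and education slopes are taken from the spreadsheet formulas above) that reproduces the table:

# Predicted values by country at selected education values, using the
# intercepts (alpha) and education slopes (gamma) from the four-group model.
groups = {
    "USA":     (0.0,     0.0738),
    "Canada":  (1.087,   0.1457),
    "UK":      (2.4339, -0.1167),
    "Germany": (1.8139,  0.0957),
}

for educ in (-2.6, 0.0, 2.6):
    row = {name: alpha + gamma * educ for name, (alpha, gamma) in groups.items()}
    print(educ, {k: round(v, 5) for k, v in row.items()})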
[Figure: predicted values for USA, Canada, UK, and Germany plotted at Education = -2.6, 0, and +2.6 (y-axis from -0.5 to 3).]
Dealing with data that are not normally distributed within the traditional LISREL framework

Questions:
- How bad is it if our data are not normally distributed?
- What can we do about it?
- Are there easy "fixes"?
Non-Normal Data
How about just ignoring the problem?
Early 1980s: robustness studies.
Major findings:
• In almost all cases, using LV models is better than OLS even if the data are non-normal (assumes multiple indicators are available)
• Some discussion of conditions under which parameter estimates might not be accurate (e.g., models with low measurement coefficients)
Non-Normal Data
Early articles:
• A. Boomsma, On the Robustness of LISREL
• Johnson and Creech, American Sociological Review, 48(3), 1983, 398-403
• Henry, ASR, 47: 299-307
• (related: Bollen and Barb, ASR, 46: 232-39)
See a good summary of early and later simulation studies: West, Finch and Curran in Hoyle.
Non-Normal Data
See a good summary of early and later simulation studies: West, Finch and Curran in Hoyle.

Formal properties:
                             Consistent?   Asymp. efficient?   Acov(θ)    X2
Multinormal (no kurtosis)        √                √               √        √
Elliptical                       √                √               X        X
Arbitrary                        √                X               X        X
Non-Normal Data
Many of the studies have involved CFA models
• E.g., Curran, West, Finch, Psych. Methods, 1(1), 1996.
• General findings (non-normal data):
  • ML, GLS produce X2 values that are too high
    • Overestimated by 50% in simulations
  • GLS, ML produce X2 values slightly larger when sample sizes are small, even when the data are normally distributed
  • Underestimation of NFI, TLI, CFI
    • Also underestimated in small samples, esp. NFI
  • Moderate underestimation of std. errors (phi 25%, lambda 50%)
Non-Normality
Detection:
• Central moments: u_r = E(x - u)^r; kurtosis is based on the 4th moment
• Standardized kurtosis: u4 / u2^2 (equals 3 for a normal distribution)
• Standardized 3rd moment (skewness): u3 / (u2)^(3/2)
• Tests of statistical significance usually available (Bollen, p. 421) for b1, b2 (skew, kurtosis)
  • N(0,1) test statistic for kurtosis (H0: B2 - 3 = 0)
  • Different tests (one approximation requires N > 1000)
  • Joint test kappa^2, approximately distributed as X2, df = 2
  • Mardia's multivariate tests: skewness, kurtosis, joint
(A univariate sketch of these checks follows below.)
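As an illustrative univariate check (a sketch using scipy on deliberately skewed example data; Mardia's multivariate tests are not in scipy, so only the univariate analogues are shown):

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.exponential(size=1000)                   # skewed example variable

print("skewness:", stats.skew(x))                # standardized 3rd moment
print("kurtosis:", stats.kurtosis(x))            # excess kurtosis (0 under normality)
print("skew test:", stats.skewtest(x))           # z test, H0: skewness = 0
print("kurtosis test:", stats.kurtosistest(x))   # z test, H0: excess kurtosis = 0
print("joint test:", stats.normaltest(x))        # K^2 statistic, approx. chi-square, df = 2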
Non-Normality
An alternative estimator:
Fwls (also called Fagls):
   Fwls = [s - σ(θ)]' W^-1 [s - σ(θ)]
Browne, British Journal of Mathematical and Statistical Psychology, 41 (1988), 193ff.; also 37 (1984), 62-83.

Optimal weight matrix? The asymptotic covariance matrix of the sij:
   Acov(sij, sgh) = N^-1 (σijgh - σij σgh)
   sijgh = (1/N) Σ (zi)(zj)(zg)(zh), where the z's are mean-deviated values
If multinormal: σijgh = σij σgh + σig σjh + σih σjg (reduces to GLS)
W (and hence W^-1) is of order ½(k)(k+1) x ½(k)(k+1)
Non-Normality
An alternative estimator:
Fwls (also called Fagls):
   Fwls = [s - σ(θ)]' W^-1 [s - σ(θ)]
W is of order ½(k)(k+1) x ½(k)(k+1)

Computationally intense:
   20 variables: 22,155 distinct elements in W
To be non-singular, N must be > p + ½(p)(p+1)
   20 variables: minimum N = 230
   30 variables: minimum N = 495
Older versions of LISREL used to impose higher restrictions (refused to run until thresholds well above the minima shown above were reached).
(A quick sketch of these counts follows below.)
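A quick arithmetic sketch (not from the handout) showing where those numbers come from:

def wls_size(p):
    """Counts implied by the WLS weight matrix for p observed variables."""
    p_star = p * (p + 1) // 2                  # distinct elements of S
    w_distinct = p_star * (p_star + 1) // 2    # distinct elements of the symmetric W
    n_min = p + p * (p + 1) // 2               # minimum N for W to be non-singular
    return p_star, w_distinct, n_min

print(wls_size(20))   # (210, 22155, 230)
print(wls_size(30))   # (465, 108345, 495)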
Non-Normality
An alternative estimator:
Fwls (also called Fagls):
   Fwls = [s - σ(θ)]' W^-1 [s - σ(θ)]
W is of order ½(k)(k+1) x ½(k)(k+1)

The AGLS estimator is commonly available in SEM software:
• LISREL 8
• AMOS
• SAS-CALIS
• EQS
Be careful! It is not really suitable for small-N problems. It is a good idea to have sample sizes in the thousands, not hundreds.
Non-Normality
An alternative estimator:
Fwls (also called Fagls):
   Fwls = [s - σ(θ)]' W^-1 [s - σ(θ)]
W is of order ½(k)(k+1) x ½(k)(k+1)

The AGLS estimator is commonly available in SEM software:
• LISREL 8: ME=WL in the OU statement; must also provide the asymptotic covariance matrix generated by PRELIS (the AC FI= statement follows the CM FI= statement)
• AMOS: check box on analysis options
Again, the problem is that this estimator can be unstable given the size of the matrix (acov) that needs to be inverted (especially in moderate sample sizes).
Non-Normality
Sample program in LISREL with the ADF estimator:

LISREL model for religiosity and moral conservatism
Part 2: ADF estimation
DA NI=14 NO=1456
CM FI=h:\icpsr2003\Week4Examples\nonnormaldata\relmor1.cov
ACC FI=h:\icpsr2003\Week4Examples\nonnormaldata\relmor1.acc
SE
1 2 3 4 5 6 7 8 9 10 11 12 13 14/
MO NY=11 NX=3 NE=2 Nk=3 fixedx ly=fu,fi ga=fu,fr c
ps=sy,fr te=sy
va 1.0 ly 1 1 ly 8 2
fr ly 2 1 ly 3 1 ly 4 1 ly 5 1
fr ly 6 2 ly 7 2 ly 9 2 ly 10 2 ly 11 2
fr te 2 1 te 11 10 te 7 6
ou me=wl se tv sc nd=3 mi
Non-Normality
Generating the asymptotic covariance matrix in PRELIS.
The resultant matrix will be much larger than the covariance matrix.
Non-Normality ADF estimation
LISREL model for religiosity and moral conservatism
Part 2: ADF estimation
DA NI=14 NO=1456
CM FI=h:\icpsr99\nonnorm\relmor1.cov
ACC FI=h:\icpsr99\nonnorm\relmor1.acc
SE
1 2 3 4 5 6 7 8 9 10 11 12 13 14/
MO NY=11 NX=3 NE=2 Nk=3 fixedx ly=fu,fi ga=fu,fr c
ps=sy,fr te=sy
va 1.0 ly 1 1 ly 8 2
fr ly 2 1 ly 3 1 ly 4 1 ly 5 1
fr ly 6 2 ly 7 2 ly 9 2 ly 10 2 ly 11 2
fr te 2 1 te 11 10 te 7 6
ou me=wl se tv sc nd=3 mi
Non-Normality ML, scaled statistics
LISREL model for religiosity and moral conservatism
Part 3: ML estimation with scaled statistics
DA NI=14 NO=1456
CM FI=h:\icpsr2003\Week4Examples\nonnormaldata\relmor1.cov
ACC FI=h:\icpsr2003\Week4Examples\nonnormaldata\relmor1.acc
SE
1 2 3 4 5 6 7 8 9 10 11 12 13 14/
MO NY=11 NX=3 NE=2 Nk=3 fixedx ly=fu,fi ga=fu,fr c
ps=sy,fr te=sy
va 1.0 ly 1 1 ly 8 2
fr ly 2 1 ly 3 1 ly 4 1 ly 5 1
fr ly 6 2 ly 7 2 ly 9 2 ly 10 2 ly 11 2
fr te 2 1 te 11 10 te 7 6
ou me=ml se tv sc nd=3 mi
Non-Normality
Low-tech solutions:
For variables that are continuous, TRANSFORMATION
• See classic regression texts such as Fox
• Common transformations:
  • X -> log(X) (usually the natural log)
  • X -> sqrt(X)
  • X -> X^2
  • X -> 1/X (even harder to interpret, since this will result in sign reversal)
• Transforming to remove skewness often/usually removes kurtosis, but this is not guaranteed
• "Normalization" as an extreme option (e.g., map rank-ordered data onto an N(0,1) distribution)
(A sketch of these transformations follows below.)
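An illustrative sketch with numpy/scipy (not from the handout; the right-skewed example variable is simulated for the illustration) of applying common transformations and checking skew and kurtosis:

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.lognormal(mean=0.0, sigma=0.8, size=2000)   # right-skewed example variable

candidates = {
    "raw":        x,
    "log":        np.log(x),          # x is strictly positive here; otherwise add an offset first
    "sqrt":       np.sqrt(x),
    "reciprocal": 1.0 / x,            # note: reverses the ordering of cases
}

for name, v in candidates.items():
    print(f"{name:10s}  skew = {stats.skew(v):6.2f}   excess kurtosis = {stats.kurtosis(v):6.2f}")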
Non-Normality
Generally, if kurtosis is between +1 and -1, it is not considered too problematic (see Bollen, 1989).
Transformations
AMOS: Transformations must be performed on the SPSS dataset. Save a new dataset and work from this. (e.g., COMPUTE X1 = LOG(X1).)
LISREL: Transformations can be performed in PRELIS.
• PRELIS already provides distribution information on variables as a "check"
• PRELIS "compute" dialogue box under Transformations
Remember to SAVE the PRELIS dataset after each transformation. Use of a stat package (SPSS, Stata, SAS) may be preferable.
Transformations
All the usual caveats apply:
1. If a variable only has 4-5 values, transformation will not normalize it (at the very least, it will still have tucked-in tails), though it could help bring kurtosis closer to the +1 to -1 range
2. If a categorized variable has one value with a majority of cases, then no transformation will work
3. If the variable has negative values, make sure to add a constant ("offset") before logging
Other solutions:
1. Robust test statistics (Bentler). Implementation: EQS, LISREL
2. Muthen has recently developed WLSM (mean-adjusted) and WLSMV (mean- and variance-adjusted) estimators. Implementation: MPLUS only
3. Bootstrapping. Implementation: AMOS (easy to use), LISREL (awkward)
4. CATEGORICAL VARIABLE MODELS (CVM)
Bootstrapping
• Computationally intensive
• Sampling with replacement; from resampling space R, draw bootstrap sample S*n,j where j = # of samples, n = bootstrap n
  • Typically, bootstrap N = sample N
• Repeat the resampling B times to get a set of values
• Issue: what if, across 200 resamples, 2 of them have ill-defined matrices?
  • Usually, these are discarded
• Tests: 5% confidence intervals (want a large # of samples); confidence intervals do not need to be symmetric (can look at the value at the 95th percentile and at the 5th among the bootstrapped samples)
• More common to compute standard errors
(A sketch of the resampling logic follows below.)
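An illustrative sketch of the resampling logic in numpy (not from the handout; a sample correlation on toy data stands in for whatever model parameter would actually be re-estimated in each resample):

import numpy as np

rng = np.random.default_rng(2)
n = 400
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)            # toy data; the "parameter" is corr(x, y)
data = np.column_stack([x, y])

B = 2000                                     # number of bootstrap resamples
estimates = []
for _ in range(B):
    idx = rng.integers(0, n, size=n)         # sample n rows with replacement
    resample = data[idx]
    estimates.append(np.corrcoef(resample[:, 0], resample[:, 1])[0, 1])

estimates = np.array(estimates)
print("bootstrap SE:", estimates.std(ddof=1))
print("percentile interval (5th, 95th):", np.percentile(estimates, [5, 95]))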
Bootstrapping
• Overall model X2 correction (available in AMOS): Bollen and Stine.
• Yang and Bentler (chapter in Marcoulides & Schumacker):
  • "faith" in the bootstrap is based on its appropriateness in other applications
  • Simulation study (1995) of exploratory factor analysis: rotated solutions close, but not so with unrotated solutions
  • "It seems that in the present stage of development, the use of the bootstrap estimator in covariance structure analysis is still limited. It is not clear whether one can trust the bias estimates."
Bootstrapping
• Ichikawa and Konishi, 1995:
  • When the data are multinormal, bootstrap SEs are not as good as ML
  • The bootstrap doesn't seem to work when N < 150: consistent overestimation (at N=300, not a problem, though)
The Categorical Variable Model
Conceptual background:
We observe y, but are interested in the latent y*. With C discrete values:

  yi = C - 1   if  v(i,C-1) < yi*
  yi = C - 2   if  v(i,C-2) < yi* <= v(i,C-1)
  yi = C - 3   if  v(i,C-3) < yi* <= v(i,C-2)
  .....
  yi = 1       if  v(i,1) < yi* <= v(i,2)
  yi = 0       if  yi* <= v(i,1)

where the v's are threshold parameters to be estimated.
(A small sketch of this mapping follows below.)
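An illustrative sketch in numpy (not from the handout) of how a continuous y* is cut into observed categories by thresholds; here the thresholds are simply chosen for the example rather than estimated:

import numpy as np

rng = np.random.default_rng(3)
y_star = rng.normal(size=10)                 # latent continuous variable y*
thresholds = [-0.8, 0.0, 0.8]                # v1 < v2 < v3: gives C = 4 observed categories

# np.digitize with right=True returns 0 if y* <= v1, 1 if v1 < y* <= v2, ..., 3 if y* > v3
y_observed = np.digitize(y_star, thresholds, right=True)
print(np.round(y_star, 2))
print(y_observed)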
The Categorical Variable Model
Observed and Latent Correlations

X-variable scale   Y-variable scale   Observed correlation   Latent correlation
Continuous         continuous         Pearson                Pearson
Continuous         categorical        Pearson                polyserial
Continuous         dichotomous        point-biserial         biserial
Categorical        categorical        Pearson                polychoric
Dichotomous        dichotomous        phi                    tetrachoric

If it is reasonable to assume that continuous and normally distributed y* variables underlie the categorical y variables, a variety of latent correlations can be specified.
The Categorical Variable Model
If it is reasonable to assume that continuous and normally distributed y* variables underlie the categorical y variables, a variety of latent correlations can be specified.

First step: estimate the thresholds using ML.
Second step: estimate the latent correlations.
Third step: obtain a consistent estimator of the asymptotic covariance matrix of the latent correlations (for use in a weighted least squares estimator in the SEM model).

Extreme case: the ability to recover the y* model when variables are split into 25%/75% dichotomies is promising (though X2 is underestimated).