Path analysis: Observed variables


• Much has been written about path analysis; it has been
around for decades (Sewall Wright introduced it in genetics in
the 1920s) and entered the social sciences through sociology.
• Usually has been performed with multiple regression.
• Multiple regression is awkward because you have to make
several passes and then put all of the results together.
• However, running a series of multiple regressions is perfectly fine.
• Path analysis with LISREL will not yield different results!
• Why do it? More elegant. Can do one run. Can compare
parameters between groups more easily.
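As a minimal sketch of the "several passes" approach, the snippet below estimates a small recursive path model by running one ordinary least squares regression per dependent variable. The data are simulated, and the variable names (x1, y1, y2, y3) and true coefficients are made up for illustration; none of them come from the slides.

```python
import numpy as np

def ols(y, *predictors):
    """Return the OLS slopes of y on the predictors (intercept dropped)."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs[1:]

rng = np.random.default_rng(0)
n = 5000
# Simulated data with hypothetical true path coefficients.
x1 = rng.normal(size=n)
y1 = 0.5 * x1 + rng.normal(size=n)
y2 = 0.3 * x1 + 0.4 * y1 + rng.normal(size=n)
y3 = 0.2 * y1 + 0.6 * y2 + rng.normal(size=n)

# One regression pass per dependent variable; the results are then
# assembled by hand into a single set of path coefficients.
paths = {
    "y1 <- x1": ols(y1, x1),
    "y2 <- x1, y1": ols(y2, x1, y1),
    "y3 <- y1, y2": ols(y3, y1, y2),
}
for label, est in paths.items():
    print(label, np.round(est, 2))
```

Each regression recovers the paths pointing into one dependent variable, which is why the results have to be put together afterwards.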
Assumptions
• Multiple DVs: otherwise you’d just do a simple
multiple regression
• A single indicator for each measure (not latent).
• Each variable is assumed to be perfectly reliable
(no error).
• Sufficient sample size: conservative estimate says
at least 10 subjects per parameter; can sometimes
get away with 5
Advantages
• Forces you to explicitly state your model
• Allows you to decompose your effects into
direct and indirect effects
• Can do model modification more easily:
Remember, you must have a sufficiently
large sample size to have exploratory and
confirmatory samples
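The decomposition into direct and indirect effects can be sketched with a little matrix algebra. For a recursive model, the total effects of X on Y are (I - B)^-1 Γ, the direct effects are Γ itself, and the indirect effects are the difference. The coefficient values below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import numpy as np

# Hypothetical path coefficients (not estimates from any real data).
# gamma: effects of X1 on Y1, Y2, Y3; beta: effects among Y1, Y2, Y3.
gamma = np.array([[0.5],
                  [0.3],
                  [0.0]])
beta = np.array([[0.0, 0.0, 0.0],
                 [0.4, 0.0, 0.0],
                 [0.2, 0.6, 0.0]])

identity = np.eye(3)
inv = np.linalg.inv(identity - beta)

total_x = inv @ gamma            # total effects of X on each Y
indirect_x = total_x - gamma     # indirect = total - direct
total_y = inv - identity         # total effects among the Ys
indirect_y = total_y - beta      # indirect = total - direct

print("X -> Y total:   ", np.round(total_x.ravel(), 2))
print("X -> Y indirect:", np.round(indirect_x.ravel(), 2))
print("Y1 -> Y3 indirect:", round(float(indirect_y[2, 0]), 2))
```

In this made-up example X1 has no direct path to Y3, so its entire effect on Y3 (0.4) is indirect, running through Y1 and Y2.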
An example
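[Figure: path diagram. X1 (variance φ1) predicts Y1 (γ1,1) and Y2 (γ2,1); Y1 predicts Y2 (β2,1) and Y3 (β3,1); Y2 predicts Y3 (β3,2). Disturbances ζ1, ζ2, ζ3 attach to Y1, Y2, Y3.]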
Details . . .
• What is known and unknown?
• Knowns = (N)(N+1)/2, where N is the number of observed
variables; with N = 4, that is 10.
• What is being estimated? One variance (phi
for X1); 2 gammas; 3 betas; and 3 zetas = 9
unknowns.
• Therefore, will run this path model with
10 - 9 = 1 df.
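The counting above can be written out as a short sketch. The function name and the dictionary of parameter counts are illustrative only; the numbers are the ones given on this slide (4 observed variables, 9 free parameters).

```python
def path_model_df(n_observed, n_free_params):
    """Return (knowns, degrees of freedom) for an observed-variable path model.

    knowns = unique variances and covariances among the observed variables.
    """
    knowns = n_observed * (n_observed + 1) // 2
    return knowns, knowns - n_free_params

# Free parameters in the example: 1 phi, 2 gammas, 3 betas, 3 zetas.
free = {"phi": 1, "gamma": 2, "beta": 3, "zeta": 3}
knowns, df = path_model_df(4, sum(free.values()))
print(knowns, sum(free.values()), df)  # 10 9 1
```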
. . . .details
• Will focus on two chief matrices, first:
Gamma:
      X1
Y1    free
Y2    free
Y3    0
(this is where we get 1 df)
Beta matrix
• Now the Beta matrix:
      Y1    Y2    Y3
Y1    ----  ----  ----
Y2    free  ----  ----
Y3    free  free  ----
Note that the diagonal is not meaningful, and that the upper triangle
of the matrix (above the diagonal) is reserved for nonrecursive path
models. In LISREL syntax, this matrix form is called SD (or sub-diagonal).
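To see how the Gamma and Beta matrices feed into model fitting, here is a rough sketch of the covariance matrix implied by a recursive path model, using the standard LISREL-style expressions with A = (I - B)^-1: Σyy = A(ΓΦΓ' + Ψ)A', Σyx = AΓΦ, Σxx = Φ. All numeric values are hypothetical, and the function name is made up for this example.

```python
import numpy as np

def implied_cov(gamma, beta, phi, psi):
    """Model-implied covariance matrix, ordered (Y1..Yp, X1..Xq)."""
    a = np.linalg.inv(np.eye(beta.shape[0]) - beta)
    yy = a @ (gamma @ phi @ gamma.T + psi) @ a.T
    yx = a @ gamma @ phi
    return np.block([[yy, yx], [yx.T, phi]])

# Hypothetical parameter values matching the pattern of the example:
# Gamma has a fixed zero for Y3, Beta is sub-diagonal.
gamma = np.array([[0.5], [0.3], [0.0]])
beta = np.array([[0.0, 0.0, 0.0],
                 [0.4, 0.0, 0.0],
                 [0.2, 0.6, 0.0]])
phi = np.array([[1.0]])          # variance of X1
psi = np.diag([1.0, 1.0, 1.0])   # disturbance variances (the zetas)

sigma = implied_cov(gamma, beta, phi, psi)
print(np.round(sigma, 3))
```

Fit indices summarize how far a matrix like `sigma`, built from the estimated parameters, is from the observed covariance matrix.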
Model fitting?
• It is important to know that there will be no
iterations. That means that there is no
maximum likelihood generation of a latent
variable (e.g., a ksi).
• Still, the program does generate a host of fit
indices to tell you whether your model fits
the data well or not. Let’s look at this.
Path model of Mueller’s data
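[Figure: path diagram. Exogenous X1, X2, X3 covary (φ2,1, φ3,1, φ3,2). X1 predicts Y1 (γ1,1) and Y2 (γ2,1); X2 predicts Y1 (γ1,2) and Y2 (γ2,2); X3 predicts Y1 (γ1,3), Y2 (γ2,3), and Y3 (γ3,3). Y1 predicts Y2 (β2,1) and Y3 (β3,1); Y2 predicts Y3 (β3,2). Disturbances ζ1, ζ2, ζ3 attach to Y1, Y2, Y3.]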
Now, with actual variables . . .
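[Figure: path diagram with variable names. X1 = Mother Educ., X2 = Father Educ., X3 = Parent income; Y1 = Academic ability, Y2 = Highest degree, Y3 = Income 5 yrs. grad. The exogenous variables covary (φ2,1, φ3,1, φ3,2). Mother Educ. and Father Educ. each predict Academic ability (γ1,1, γ1,2) and Highest degree (γ2,1, γ2,2); Parent income predicts all three outcomes (γ1,3, γ2,3, γ3,3). Academic ability predicts Highest degree (β2,1) and Income 5 yrs. grad. (β3,1); Highest degree predicts Income 5 yrs. grad. (β3,2). Disturbances ζ1, ζ2, ζ3 attach to Y1, Y2, Y3.]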
LISREL syntax: oh my, oh my
Note: This is an observed path model on Mueller's data on college graduation
DA NG=1 NI=15 NO=3094 MA=CM
KM FI=a:\assign3\mueller.cor
SD FI=a:\assign3\mueller.sds
LA
mothed fathed parincm hsrank desfin confin acaabil drvach selfcon
degasp typecol colsel highdeg occpres incgrad
SE
acaabil highdeg incgrad mothed fathed parincm/
MO NY=3 NX=3 PH=SY,FR PS=DI,FR GA=FU,FI BE=FU,FI
FR GA(1,1) GA(1,2) GA(2,1) GA(2,2) GA(1,3) GA(2,3) GA(3,3) C
BE(3,1) BE(3,2) BE(2,1)
PD
OU SC EF TV AD=50
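The DA line asks for a covariance matrix to be analyzed (MA=CM), while the input files supply correlations (KM) and standard deviations (SD). The conversion between the two is simple and is sketched below. The function name and the 2 x 2 numbers are illustrative only; they are not Mueller's data.

```python
import numpy as np

def corr_to_cov(corr, sds):
    """Rescale a correlation matrix to a covariance matrix: cov_ij = r_ij * sd_i * sd_j."""
    sds = np.asarray(sds, dtype=float)
    return np.asarray(corr, dtype=float) * np.outer(sds, sds)

# Hypothetical two-variable example.
corr = [[1.0, 0.5],
        [0.5, 1.0]]
sds = [2.0, 3.0]
cov = corr_to_cov(corr, sds)
print(cov)  # variances 4 and 9 on the diagonal, covariance 3 off it
```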
the matrices . . .
Gamma matrix: G
      X1    X2    X3
Y1    free  free  free
Y2    free  free  free
Y3    0     0     free

Beta matrix: B
      Y1    Y2    Y3
Y1    ----  ----  ----
Y2    free  ----  ----
Y3    free  free  ----
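How did the path coefficients turn out?
[Figure: path diagram among Mother Educ., Father Educ., Parent income, Academic ability, Highest degree, and Income 5 yrs. grad., annotated with the estimates; * marks a significant estimate. Estimates shown: .5*, .05*, .07*, 2.6*, .07, 1.1*, 1.5*, .28*, .01, .01, 2.1*, .02, .05*, .15*, .03*, .86*.]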
Model fit indices
Measures of absolute fit
• χ²(2) = 19.98
• GFI = 1.00
• Critical N = 1426.88
• RMSEA = .054
• AGFI = .98
• PGFI = .095 (i.e., not parsimonious)
Measures of relative fit
• NFI = .99
• RFI = .95
• PNFI = .13 (not parsimonious)
• NNFI = .96
• CFI = .99
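Several of these indices can be reproduced by hand from the chi-square, the degrees of freedom, and the sample size. The sketch below assumes the common formulas RMSEA = sqrt(max(χ² - df, 0) / (df (N - 1))), PGFI = GFI x df / (p(p+1)/2), and PNFI = NFI x df / (p(p-1)/2), where p is the number of observed variables; software packages differ in small details (for example N versus N - 1), so treat this as an approximation rather than LISREL's exact computation.

```python
import math

def rmsea(chi2, df, n):
    """Root mean square error of approximation (N - 1 version)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def pgfi(gfi, df, n_observed):
    """Parsimony GFI: GFI scaled by df over the number of knowns."""
    return gfi * df / (n_observed * (n_observed + 1) / 2)

def pnfi(nfi, df, n_observed):
    """Parsimony NFI: NFI scaled by df over the independence-model df."""
    return nfi * df / (n_observed * (n_observed - 1) / 2)

# Values reported for the full model: chi-square(2) = 19.98, N = 3094,
# 6 observed variables, GFI = 1.00, NFI = .99.
print(round(rmsea(19.98, 2, 3094), 3))  # 0.054
print(round(pgfi(1.00, 2, 6), 3))       # 0.095
print(round(pnfi(0.99, 2, 6), 2))       # 0.13
```

With only 2 df out of 21 knowns, the parsimony ratios are small, which is why PGFI and PNFI look poor even though the other indices are excellent.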
Where do we go from here?
• We obtained good model fit indices. . . alright,
they’re damn good, except for parsimony.
• Can we do better? Where can we trim the model?
Delete the nonsignificant paths. This is model
modification—do not attempt this without a
confirmation sample, unless you want to claim
that your model is merely exploratory.
New pruned model
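[Figure: path diagram among Mother Educ., Father Educ., Parent income, Academic ability, Highest degree, and Income 5 yrs. grad. after the nonsignificant paths were deleted; * marks a significant estimate. Estimates shown: .5*, .06*, .08*, 2.6*, 1.1*, 1.4*, .29*, .05*, 2.1*, .16*, .04*, .86*.]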
Pruned model fit indices
Measures of absolute fit
• χ²(6) = 30.19
• GFI = 1.00
• Critical N = 1723.67
• RMSEA = .036 (outstanding!)
• AGFI = .99
• PGFI = .28 (better)
Measures of relative fit
• NFI = .99
• RFI = .98
• PNFI = .40 (better)
• NNFI = .98
• CFI = .99
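Because the pruned model is nested in the full model, the two chi-squares can also be compared with a chi-square difference test. The slides do not report this test; the sketch below is an added illustration using the two reported chi-squares (19.98 on 2 df and 30.19 on 6 df). To stay free of external libraries it uses the closed-form upper-tail probability that holds only for even degrees of freedom.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # sf(x) = exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

delta_chi2 = 30.19 - 19.98   # pruned minus full
delta_df = 6 - 2
p_value = chi2_sf_even_df(delta_chi2, delta_df)
print(round(delta_chi2, 2), delta_df, round(p_value, 3))
```

With a sample of about 3,000, even a small loss of fit from dropping four paths can reach conventional significance, so this test is best read alongside the other indices.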
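How about a randomly generated model?
[Figure: the same path diagram with the variable labels shuffled among the boxes (Income at grad., Mother Educ., Highest degree, Academic ability, Parent income, Father Educ.). Estimates shown: .5*, .05*, .07*, 2.6*, .07, 1.1*, 1.5*, .28*, .01, .01, 2.1*, .02, .05*, .15*, .03*, .86*.]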
Fit for randomly generated model
Measures of absolute fit
• χ²(2) = 153.87
• GFI = .98
• Critical N = 186.16
• RMSEA = .15
• AGFI = .83
• PGFI = .09
Measures of relative fit
• NFI = .95
• RFI = .62
• PNFI = .13
• NNFI = .62
• CFI = .95
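The Critical N values on these slides can be approximately reproduced with one form of Hoelter's formula, CN = χ²crit(df, α) / Fmin + 1 with Fmin = χ² / (N - 1). The choice of α = .01 is an assumption that happens to match the reported values closely; it is not stated on the slides. The critical value is found by bisection on a closed-form tail probability that is valid only for even df.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

def chi2_critical(df, alpha, lo=0.0, hi=200.0):
    """Find x with P(chi-square > x) = alpha by bisection."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf_even_df(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def critical_n(chi2, df, n, alpha=0.01):
    """Hoelter-style Critical N (assumed form; see the text above)."""
    f_min = chi2 / (n - 1)
    return chi2_critical(df, alpha) / f_min + 1

models = [("full", 19.98, 2), ("pruned", 30.19, 6), ("random", 153.87, 2)]
for label, chi2, df in models:
    print(label, round(critical_n(chi2, df, 3094), 1))
```

The results land within a fraction of a unit of the reported 1426.88, 1723.67, and 186.16. The random model's Critical N falls far below the actual sample size, which is one of the clearer signs on these slides that it fits poorly.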
Moral of the story
• Some indices are affected more than others
• When you have a huge sample size, and a host of
correlated measures, you’ll still end up with some
acceptable fit indices. So beware!
• With smaller sample sizes and stinky variables
(low internal reliability), covariances will be
smaller, and model fit will suffer accordingly. So,
don’t get used to a sample size of 3,000.
Mediation or moderation?
• All of the models proposed thus far have
featured mediation: A => B => C.
• As you probably know, I like moderation
too. Much confusion over which to use.
• Baron & Kenny’s rules for mediation: must have sig.
covariation among all of the variables before
attempting it. This is not always obtained.
• So how would one do moderation?
Mediation and moderation
[Figure: two diagrams. Mediation: Stress => Coping => Outcome. Moderation: Coping alters the strength of the Stress => Outcome path.]
Statistically, how are they
different or similar?
• Both can be performed on either observed
or latent variables (although a moderational path
model has not been standardized yet).
• We’ve seen the mediation model, let’s
consider the moderation model.
• The chief issue is that there is one Y
variable (outcome), and all other variables
are considered to be X variables.
The figure
[Figure: path diagram. Stress, Coping, and the product term Stress X Coping each predict Outcome.]
Syntax
Note: This is an observed path model for the moderation of stress on outcome by coping
DA NG=1 NI=4 NO=0 MA=CM
KM FI=a:\stress.dat
LA
stress coping strxcop outcome
SE
outcome stress coping strxcop/
MO NY=1 NX=3 PH=SY,FR PS=DI,FR GA=FU,FI
FR GA(1,1) GA(1,2) GA(1,3)
PD
OU SC EF TV AD=50
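Since this moderation model has a single Y variable, it amounts to one regression of the outcome on stress, coping, and their product. Here is a minimal sketch on simulated data. The true coefficients are hypothetical, and centering the predictors before forming the product is a common convention that is assumed here; the slides do not say how strxcop was computed.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
stress = rng.normal(size=n)
coping = rng.normal(size=n)
# Hypothetical data-generating model with a true interaction of -0.3.
outcome = (0.5 * stress - 0.2 * coping
           - 0.3 * stress * coping + rng.normal(size=n))

# Center the predictors (an assumption), then form the product term.
stress_c = stress - stress.mean()
coping_c = coping - coping.mean()
strxcop = stress_c * coping_c

# One Y variable, three X variables: the three gammas of the path model.
X = np.column_stack([np.ones(n), stress_c, coping_c, strxcop])
coefs, *_ = np.linalg.lstsq(X, outcome, rcond=None)
b_stress, b_coping, b_inter = coefs[1:]
print(np.round(coefs[1:], 2))
```

A coefficient on the product term that differs reliably from zero is what indicates moderation: the effect of stress on the outcome depends on the level of coping.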