Transcript Slide 1

SEM is Based on the Analysis of Covariances!
Why?
Analysis of correlations represents loss of information.
Illustration with regressions having the same slope and intercept:

[Figure: two scatter plots, A and B, of y (0-100) against x (0-30) with identical fitted lines; panel A has r = 0.86, panel B has r = 0.50.]
Analysis of covariances allows for estimation of both standardized and
unstandardized parameters.
I. SEM Essentials:
1. SEM is a form of graphical modeling, and therefore, a system in which relationships
can be represented in either graphical or equational form.
graphical form:

x1 --(γ11)--> y1, with error ζ1

equational form:

y1 = γ11 x1 + ζ1
2. An equation is said to be structural if there exists sufficient evidence from all
available sources to support the interpretation that x1 has a causal effect on y1.
3. Structural equation modeling can be defined as the use of two or more structural
equations to represent complex hypotheses.
Complex Hypothesis (graphical form):

[Path diagram: x1 with paths to y1, y2, and y3; y1 with a path to y2; y2 with a path to y3; errors ζ1, ζ2, ζ3.]

Corresponding Equations:

y1 = γ11 x1 + ζ1
y2 = β21 y1 + γ21 x1 + ζ2
y3 = β32 y2 + γ31 x1 + ζ3
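As a check on the notation, these three equations can be simulated directly; a minimal sketch, with all coefficient and error-scale values hypothetical (they are not taken from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# hypothetical coefficient values, for illustration only
g11, g21, g31 = 0.5, 0.3, 0.2   # gammas: effects of x1 on y1, y2, y3
b21, b32 = 0.4, 0.6             # betas: effects among the endogenous y's

x1 = rng.normal(size=n)
z1, z2, z3 = 0.5 * rng.normal(size=(3, n))  # errors zeta1..zeta3

# the three structural equations, evaluated in causal order
y1 = g11 * x1 + z1
y2 = b21 * y1 + g21 * x1 + z2
y3 = b32 * y2 + g31 * x1 + z3
```

With var(x1) = 1, the slope of y1 regressed on x1 recovers γ11 up to sampling error.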
4. Some practical criteria for supporting an assumption of causal relationships in
structural equations:
a. manipulations of x can repeatedly be demonstrated to be followed by responses
in y, and/or
b. we can assume that the values of x that we have can serve as indicators for the
values of x that existed when effects on y were being generated, and/or
c. if it can be assumed that a manipulation of x would result in a subsequent
change in the values of y
Relevant References:
Pearl (2000) Causality. Cambridge University Press.
Shipley (2000) Cause and Correlation in Biology. Cambridge University Press.
The Methodological Side of SEM
[Figure: bar chart (scale 0-100) of topics in the SEM methodological literature: software, hypothesis testing, statistical modeling, factor analysis, regression.]
6. SEM is a framework for building and evaluating multivariate hypotheses about
multiple processes. It is not dependent on a particular estimation method.
7. When it comes to statistical methodology, it is important to distinguish
between the priorities of the methodology versus those of the scientific
enterprise. Regarding the diagram below, in SEM we use statistics for the
purposes of the scientific enterprise.
[Diagram: "Statistics and other Methodological Tools, Procedures, and Principles" alongside "The Scientific Enterprise".]
The Relationship of SEM to the Scientific Enterprise
[Figure (modified from Starfield and Bleloch 1991): methods arranged along an axis of "Understanding of Processes", from univariate data modeling and univariate descriptive statistics, through multivariate descriptive statistics, to structural equation modeling; the corresponding models range from simplistic models to realistic predictive models to detailed process models, in the service of data exploration, methodology, and theory development.]
8. SEM seeks to progress knowledge through cumulative learning. Current work is
striving to increase the capacity for model memory and model generality.
[Diagram: structural equation modeling spans a continuum from exploratory/model-building applications to confirmatory/hypothesis-testing applications; bridging the two is labeled as one aim of SEM.]
11. An interest in systems under multivariate control motivates us to explicitly
consider the relative importance of multiple processes and how they
interact. We seek to consider simultaneously the main factors that
determine how system responses behave.
12. SEM is one of the few applications of statistical inference where the results
of estimation are frequently “you have the wrong model!”. This feedback
comes from the unique feature that in SEM we compare patterns in the
data to those implied by the model. This is an extremely important form of
learning about systems.
13. Illustrations of fixed-structure protocol models:
[Figure: "Univariate Models" relate predictors x1-x5 to a single response y1; "Multivariate Models" relate x1-x5 to multiple responses y1-y5 through a factor F.]
Do these model structures match the causal forces that influenced
the data? If not, what can they tell you about the processes operating?
14. Structural equation modeling and its associated scientific goals represent an
ambitious undertaking. We should be both humbled by the limits of our
successes and inspired by the learning that takes place during the journey.
Some Terminology
[Path diagram: exogenous variable x1 with path γ11 to y1 and path γ21 to y2; endogenous variables y1 (error ζ1) and y2 (error ζ2), with path β21 from y1 to y2.]

path coefficients: γ11, γ21, β21
direct effect of x1 on y2: γ21
indirect effect of x1 on y2: γ11 times β21
First Rule of Path Coefficients: the path coefficients for
unanalyzed relationships (curved arrows) between exogenous variables are
simply the correlations (standardized form) or covariances (unstandardized
form).
[Path diagram: correlated exogenous variables x1 and x2 (curved arrow, .40), each with a path to y1.]

        x1     x2     y1
x1     1.0
x2     0.40   1.0
y1     0.50   0.60   1.0
Second Rule of Path Coefficients: when variables are connected by a
single causal path, the path coefficient is simply the standardized or
unstandardized regression coefficient (note that with a single predictor,
a standardized regression coefficient equals the simple correlation).
[Path diagram: x1 --(γ11 = .50)--> y1 --(β21 = .60)--> y2]

        x1     y1     y2
x1     1.0
y1     0.50   1.0
y2     0.30   0.60   1.0
γ (gamma) is used to represent the effect of an exogenous variable on an endogenous variable.
β (beta) is used to represent the effect of an endogenous variable on another endogenous variable.
Third Rule of Path Coefficients: strength of a compound path is the
product of the coefficients along the path.
x1 --(.50)--> y1 --(.60)--> y2
Thus, in this example the effect of x1 on y2 = 0.5 x 0.6 = 0.30
Since the strength of the indirect path from x1 to y2 equals the
correlation between x1 and y2, we say x1 and y2 are
conditionally independent.
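A quick numeric check of the third rule, simulating standardized variables with the path coefficients from this example (the simulation itself is illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# standardized variables built with the example's path coefficients
x1 = rng.normal(size=n)
y1 = 0.50 * x1 + np.sqrt(1 - 0.50**2) * rng.normal(size=n)
y2 = 0.60 * y1 + np.sqrt(1 - 0.60**2) * rng.normal(size=n)

# compound path = 0.50 * 0.60 = 0.30; r(x1, y2) should match it
r_x1y2 = np.corrcoef(x1, y2)[0, 1]
```

Because r(x1, y2) equals the product of the two coefficients, x1 and y2 are conditionally independent given y1.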
What does it mean when two separated variables
are not conditionally independent?
[Path diagram: x1 --(r = .55)--> y1 --(r = .60)--> y2]

        x1     y1     y2
x1     1.0
y1     0.55   1.0
y2     0.50   0.60   1.0
0.55 x 0.60 = 0.33, which is not equal to 0.50
The inequality implies that the true model is:

[Path diagram: x1 → y1 → y2, plus a direct path from x1 to y2 representing an additional process.]
Fourth Rule of Path Coefficients: when variables are
connected by more than one causal pathway, the path
coefficients are "partial" regression coefficients.
Which pairs of variables are connected by two causal paths?
answer: x1 and y2 (obvious one), but also y1 and y2, which are connected by
the joint influence of x1 on both of them.
And for another case:

[Path diagram: x1 and x2 joined by a curved arrow, each with a path to y1.]

A case of shared causal influence: the unanalyzed relation
between x1 and x2 represents the effects of an unspecified
joint causal process. Therefore, x1 and y1 are connected by two
causal paths, and likewise x2 and y1.
How to Interpret Partial Path Coefficients:
- The Concept of Statistical Control
[Path diagram: x1 --(.40)--> y1, x1 --(.31)--> y2, y1 --(.48)--> y2]
The effect of y1 on y2 is controlled for the joint effects of x1.
I have an article on this subject that is brief and to the point:
Grace, J.B. and K.A. Bollen. 2005. Interpreting the results from multiple
regression and structural equation models. Bulletin of the Ecological
Society of America 86:283-295.
Interpretation of Partial Coefficients
Analogy to an electronic equalizer [image from Sourceforge.net]
With all other variables in model held to their means, how much does a
response variable change when a predictor is varied?
Fifth Rule of Path Coefficients: paths from error variables are
correlations or covariances.

[Path diagram: x1 --(.40)--> y1, x1 --(.31)--> y2, y1 --(.48)--> y2; for y1, R2 = 0.16 and the path from ζ1 = √0.84 ≈ 0.92; for y2, R2 = 0.44 and the path from ζ2 = √0.56 ≈ 0.75.]

equation for a path from an error variable: √(1 - R2)

An alternative is to show values for the ζs, which equal 1 - R2.
Now, imagine y1 and y2 are joint responses:

[Path diagram: x1 --(.40)--> y1 (R2 = 0.16), x1 --(.50)--> y2 (R2 = 0.25), with a residual correlation between ζ1 and ζ2.]

        x1     y1     y2
x1     1.0
y1     0.40   1.0
y2     0.50   0.60   1.0

Sixth Rule of Path Coefficients: unanalyzed residual correlations
between endogenous variables are partial correlations or covariances.
Seventh Rule of Path Coefficients: total effect one variable has on
another equals the sum of its direct and indirect effects.
[Path diagram: x1 --(.64)--> y1, x2 --(-.11)--> y1, x1 and x2 correlated (.80), y1 --(.27)--> y2, x1 --(.15)--> y2; errors ζ1, ζ2.]

Total Effects:
        x1      x2      y1
y1     0.64   -0.11    ---
y2     0.32   -0.03    0.27
Eighth Rule of Path Coefficients:
sum of all pathways between two variables
(causal and noncausal) equals the
correlation/covariance.
note: the correlation between x1 and y1 = 0.55, which equals 0.64 - (0.80)(0.11)
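In matrix terms, total effects can be obtained directly from the coefficient matrices: effects of the x's on the y's are (I - B)^-1 Γ, and effects among the y's are (I - B)^-1 - I. A sketch using the coefficients in this example:

```python
import numpy as np

# path coefficients from the example (rows: y1, y2)
B = np.array([[0.00, 0.0],     # y1 depends on no other y
              [0.27, 0.0]])    # y2 <- 0.27 * y1
G = np.array([[0.64, -0.11],   # y1 <- 0.64 * x1 - 0.11 * x2
              [0.15,  0.00]])  # y2 <- 0.15 * x1

A = np.linalg.inv(np.eye(2) - B)
total_xy = A @ G          # total effects of x1, x2 on y1, y2
total_yy = A - np.eye(2)  # total effects among the y's
```

total_xy reproduces the Total Effects table: 0.64 and -0.11 for y1, and 0.15 + (0.27)(0.64) ≈ 0.32 and (0.27)(-0.11) ≈ -0.03 for y2.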
Path Tracing Rules
• No loops
• No going forward & then backward
• A maximum of one curved arrow per path
Suppression Effect - when presence of another variable causes path
coefficient to strongly differ from bivariate correlation.
[Path diagram: x1 --(.64)--> y1, x2 --(-.11)--> y1, x1 and x2 correlated (.80), y1 --(.27)--> y2, x1 --(.15)--> y2; errors ζ1, ζ2.]

        x1     x2     y1     y2
x1     1.0
x2     0.80   1.0
y1     0.55   0.40   1.0
y2     0.30   0.23   0.35   1.0
The path coefficient for x2 to y1 (-0.11) is very different from the bivariate
correlation (0.40); this results from the overwhelming influence of x1.
2. Estimation (cont.) – analysis of covariance structure
The most commonly used method of estimation over the past 3
decades has been through the analysis of covariance structure
(think – analysis of patterns of correlations among variables).
compare

Observed Correlations*:

S = { 1.0
      .24  1.0
      .01  .70  1.0 }

with the Model-Implied Correlations:

Σ = { σ11
      σ12  σ22
      σ13  σ23  σ33 }

* typically the unstandardized correlations, or covariances
3. Evaluation
Hypothesized Model:

[Path diagram relating x1, y1, and y2]

Observed Covariance Matrix:

S = { 1.3
      .24  .41
      .01  9.7  12.3 }

Implied Covariance Matrix:

Σ = { σ11
      σ12  σ22
      σ13  σ23  σ33 }

compare S with Σ → Model Fit Evaluations + Parameter Estimates
1. The Multiequational Framework
(a) the observed variable model
We can model the interdependences among a set of predictors and responses
using an extension of the general linear model that accommodates the
dependences of response variables on other response variables.
y = α + By + Γx + ζ

where:
α = p x 1 vector of intercepts
y = p x 1 vector of responses
x = q x 1 vector of exogenous predictors
ζ = p x 1 vector of errors for the elements of y
B = p x p coefficient matrix of ys on ys
Γ = p x q coefficient matrix of ys on xs
Φ = cov(x) = q x q matrix of covariances among xs
Ψ = cov(ζ) = p x p matrix of covariances among errors
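Because y appears on both sides, the model solves to the reduced form y = (I - B)^-1 (α + Γx + ζ); a minimal sketch with hypothetical coefficient values (p = q = 2):

```python
import numpy as np

p = 2
alpha = np.zeros(p)
B = np.array([[0.0, 0.0],
              [0.4, 0.0]])   # hypothetical: y2 depends on y1
G = np.array([[0.5, 0.0],
              [0.0, 0.3]])   # hypothetical effects of x1, x2 on y1, y2
x = np.array([1.0, 2.0])
zeta = np.zeros(p)           # errors set to zero for a deterministic check

# reduced form: solve (I - B) y = alpha + G x + zeta
y = np.linalg.solve(np.eye(p) - B, alpha + G @ x + zeta)
```

The solution y then satisfies the structural equation exactly: y = α + By + Γx + ζ.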
(b) the latent variable model
η = α + Bη + Γξ + ζ

The LISREL Equations (Jöreskog 1973)

where:
η is a vector of latent responses,
ξ is a vector of latent predictors,
B and Γ are matrices of coefficients,
ζ is a vector of errors for η, and
α is a vector of intercepts for η
(c) the measurement model

x = Λx ξ + δ
y = Λy η + ε

where:
Λx is a matrix of loadings that link observed x variables to latent predictors,
Λy is a matrix of loadings that link observed y variables to latent responses, and
δ and ε are vectors of errors
Notation

• Covariance Matrices of Interest:
– Φ : Exogenous Constructs
– Ψ : Structural Error Terms
– Θδ : Measurement Error X
– Θε : Measurement Error Y
Confirmatory Factor Analysis

x = Λx ξ + δ

x1 = λ11 ξ1 + δ1
x2 = λ21 ξ1 + δ2
x3 = 1 · ξ1 + δ3

[Path diagram: latent factor ξ1, "Trust in Individuals", with loading λ11 on "people are helpful" (x1), loading λ21 on "people can be trusted" (x2), and a loading fixed to 1 on "people are fair" (x3); measurement errors δ1, δ2, δ3.]

Θδ = { VAR(δ1)
       0        VAR(δ2)
       0        0        VAR(δ3) }
Form of Σ(θ)

• The form of Σ(θ) can be worked out with expectation operators:

Σ(θ) = { Σxx(θ)  Σxy(θ)
         Σyx(θ)  Σyy(θ) }

with, writing A = (I - B)^-1:

Σyy(θ) = A (ΓΦΓ' + Ψ) A'
Σyx(θ) = A Γ Φ
Σxx(θ) = Φ

G89.2247 Lecture 5
Fitting Functions

• ML minimizes
F(θ) = ln|Σ(θ)| - ln|S| + tr[S Σ(θ)^-1] - (p + q)

• ULS minimizes
F(θ) = (1/2) tr[(S - Σ(θ))(S - Σ(θ))]

• GLS minimizes
F(θ) = (1/2) tr[((S - Σ(θ)) S^-1)((S - Σ(θ)) S^-1)]

• ADF minimizes a weighted least squares
criterion, but with a weight different than GLS
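The three closed-form fitting functions translate into short helpers; a sketch, where S and Σ(θ) are passed as square arrays and k = p + q is the number of observed variables:

```python
import numpy as np

def f_ml(S, Sigma):
    """ML: ln|Sigma| - ln|S| + tr(S Sigma^-1) - (p + q)."""
    k = S.shape[0]
    return (np.log(np.linalg.det(Sigma)) - np.log(np.linalg.det(S))
            + np.trace(S @ np.linalg.inv(Sigma)) - k)

def f_uls(S, Sigma):
    """ULS: (1/2) tr[(S - Sigma)(S - Sigma)]."""
    D = S - Sigma
    return 0.5 * np.trace(D @ D)

def f_gls(S, Sigma):
    """GLS: (1/2) tr[((S - Sigma) S^-1)((S - Sigma) S^-1)]."""
    D = (S - Sigma) @ np.linalg.inv(S)
    return 0.5 * np.trace(D @ D)
```

All three functions are zero when Σ(θ) = S, i.e., when the model reproduces the observed matrix exactly.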
Estimation Example using Excel

Parameter matrices:

B = { 0       0.3205
      0.5059  0      }

Γ = { 0.4485  0
      0       0.3975 }

Φ = { 1.0452  0.5004
      0.5004  1.1538 }

Ψ = { 0.566   -0.22
      -0.22    0.67 }

Observed matrix S:

        X1      X2      Y1      Y2
X1    1.0452
X2    0.5004  1.1538
Y1    0.6356  0.4433  1.1075
Y2    0.5204  0.6829  0.7256  1.3033

Implied matrix Σ(θ):

        X1      X2      Y1      Y2
X1    1.0452
X2    0.5004  1.1538
Y1    0.6356  0.4433  1.1075
Y2    0.5205  0.6829  0.7256  1.3033

Residuals S - Σ(θ): all 0.000

Fit: ML = 0.0000, ULS = 0.0000, GLS = 0.0000 (LR = 6E-07), N = 250

Parameter vector θ:
1. 0.3205   2. 0.5059   3. 0.4485   4. 0.3975   5. 1.0452
6. 1.1538   7. 0.5004   8. 0.566    9. 0.6703   10. -0.224

Estimated equations:
Y1 = 0.3205 Y2 + 0.448 X1 + 0 X2 + E1
Y2 = 0.5059 Y1 + 0 X1 + 0.4 X2 + E2
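The implied matrix Σ(θ) in this example can be reproduced from the parameter matrices using the partitioned formulas Σyy = A(ΓΦΓ′ + Ψ)A′ and Σyx = AΓΦ, with A = (I - B)^-1; a sketch with the values transcribed from the example:

```python
import numpy as np

# parameter matrices from the example (y order: Y1, Y2; x order: X1, X2)
B   = np.array([[0.0,    0.3205],
                [0.5059, 0.0   ]])
G   = np.array([[0.4485, 0.0   ],
                [0.0,    0.3975]])
Phi = np.array([[1.0452, 0.5004],
                [0.5004, 1.1538]])
Psi = np.array([[ 0.566, -0.224],
                [-0.224,  0.6703]])

A   = np.linalg.inv(np.eye(2) - B)      # (I - B)^-1
Syy = A @ (G @ Phi @ G.T + Psi) @ A.T   # implied covariances among the y's
Syx = A @ G @ Phi                       # implied covariances of y's with x's

# assemble the full implied matrix, ordered X1, X2, Y1, Y2
Sigma = np.block([[Phi, Syx.T],
                  [Syx, Syy  ]])
```

The result matches the S and Σ(θ) tables above to the displayed precision.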
Measures of Fit: Chi Square
• If model is not saturated, and
• If residuals of Y have multivariate normal
distribution
– ML*(N-1) and GLS*(N-1) have large-sample chi-squared
distributions
– Degrees of freedom given by difference in number
of parameters in model compared to saturated
model
Illustration of the use of Χ2

issue: should there be a path from x to y2?

[Path diagram: x --(0.40)--> y1 --(0.50)--> y2]

correlation matrix:
        x      y1     y2
x      1.0
y1     0.4    1.0
y2     0.35   0.5    1.0

rxy2 expected to be 0.2 (0.40 x 0.50)
X2 = 1.82 with 1 df and 50 samples, P = 0.18
X2 = 3.64 with 1 df and 100 samples, P = 0.056
X2 = 7.27 with 1 df and 200 samples, P = 0.007

Essentially, our ability to detect significant differences from our
base model depends, as usual, on sample size.
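These Χ2 values can be reproduced; a sketch assuming Χ2 = N × F_ML, with Σ taken as the fitted matrix (r(x, y2) forced to its implied value 0.20, other entries as observed) and the 1-df upper-tail probability computed with erfc:

```python
import math
import numpy as np

# observed and fitted correlation matrices (order: x, y1, y2)
S     = np.array([[1.00, 0.40, 0.35],
                  [0.40, 1.00, 0.50],
                  [0.35, 0.50, 1.00]])
Sigma = np.array([[1.00, 0.40, 0.20],   # fitted model implies r(x, y2) = 0.20
                  [0.40, 1.00, 0.50],
                  [0.20, 0.50, 1.00]])

# ML fitting function for 3 observed variables
F = (math.log(np.linalg.det(Sigma)) - math.log(np.linalg.det(S))
     + np.trace(S @ np.linalg.inv(Sigma)) - 3)

for n in (50, 100, 200):
    chi2 = n * F
    p = math.erfc(math.sqrt(chi2 / 2))  # P(X > chi2) for chi-square with 1 df
    print(f"N = {n}: X2 = {chi2:.2f}, P = {p:.3f}")
```

This reproduces Χ2 ≈ 1.82, 3.64, and 7.27 at N = 50, 100, and 200 samples.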
Chi Square Test Issues
• Appeal of Chi Square Test
– Makes model fit appear confirmatory
– Can reject an ill fitting model
– Can compare nested models
– Can calculate power
• Problems with Chi Square Test
– Global test will reject a good model if data are not
multivariate normal
– Usual issues of significance testing
Diagnosing Causes of Lack of Fit (misspecification)
Modification Indices: Predicted effects of model modification on the model chi-square.
Residuals: Most fit indices represent average of residuals between observed and
predicted covariances. Therefore, individual residuals should be inspected.
Correlation Matrix to be Analyzed
        y1     y2     x
y1    1.00
y2    0.50   1.00
x     0.40   0.35   1.00

Fitted Correlation Matrix
        y1     y2     x
y1    1.00
y2    0.50   1.00
x     0.40   0.20   1.00

residual = 0.15
Six Steps to Modeling
• Specification
• Implied Covariance Matrix
• Identification
• Estimation
• Model Fit
• Respecification
Specification
• Theorize your model
– What observed variables?
• How many observed variables?
– What latent variables?
• How many latent variables?
– Relationship between latent variables?
– Relationship between latent variables and
observed variables?
– Correlated errors of measurement?
Identification
• Are there unique values for parameters?
• Property of model, not data
• 10 = x + y has no unique solution:
  – (2, 8)
  – (-1, 11)
  – (4, 6)
• adding a constraint, such as x = y, identifies a unique solution (x = y = 5)
Identification
• Underidentified
• Just identified
• Overidentified
Identification
• Rules for Identification
– By type of model
• Classic econometric
– e.g., recursive rule
• Confirmatory factor analysis
– e.g., three indicator rule
• General Model
– e.g., two-step rule
Identification

[Path diagram: latent factor ξ1, "Trust in Individuals", with loading λ11 on "people are helpful" (x1), loading λ21 on "people can be trusted" (x2), and a loading fixed to 1 on "people are fair" (x3); measurement errors δ1, δ2, δ3.]

• Identified? Yes (just), by the 3-indicator rule.
A Perspective on “Fit”