Structural Equation Modeling
(SEM)
Niina Kotamäki
SEM
Covariance structure analysis
Causal modeling
Simultaneous equations modeling
Path analysis
Confirmatory factor analysis
Latent variable modeling
LISREL-modeling
Highly flexible “modeling toolbox”
Extension of the general linear model (GLM)
SEM

Quite recent innovation (late 1960s, early 1970s)

Extensively applied in social sciences, psychology, economics, chemistry and
biology
• Applications in ecology and environmental sciences are limited
• Even less common in aquatic ecosystems

Tests theoretical hypotheses about causal relationships

Tests relationships between observed and unobserved variables

Combines regression analysis (path analysis) and factor analysis

Researchers use SEM to determine whether a certain model is valid
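Regression model:
Y = aX1 + bX2 + ε
[Path diagram: the INDEPENDENT variables X1 and X2 (correlated with each other, "corr") point to the DEPENDENT variable Y with coefficients a and b; ε is the error term of Y]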
LIMITATIONS
• Multiple dependent (Y) variables are not permitted
• Each independent variable (X) is assumed to be measured without error
  → controlled experiments: measurement errors are negligible and uncontrolled variation is at a minimum
  → observational studies: all variables are subject to measurement error and uncontrolled variation
• Strong correlation among the independent variables (multicollinearity) may cause unstable parameter estimates and inflated standard errors
• Indirect effects (mediating variables) cannot be included
• The error or residual variable is the only unobserved variable
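The measurement-error limitation can be illustrated with a small simulation. This is only a sketch under assumed values (a hypothetical true slope of 2, and measurement error with the same variance as X); none of the numbers come from the slides. When X is observed with error, the ordinary regression slope shrinks towards zero.

```python
import random

def ols_slope(x, y):
    """Least-squares slope of y on a single predictor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

rng = random.Random(42)   # fixed seed so the run is repeatable
n = 20000
true_slope = 2.0          # hypothetical effect of X on Y

x_true = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [true_slope * xi + rng.gauss(0.0, 1.0) for xi in x_true]
# Observed X = true X + measurement error with the same variance as X
x_observed = [xi + rng.gauss(0.0, 1.0) for xi in x_true]

slope_clean = ols_slope(x_true, y)      # near the true slope
slope_noisy = ols_slope(x_observed, y)  # attenuated, roughly half the true slope
print(round(slope_clean, 2), round(slope_noisy, 2))
```

With equal variances for X and its measurement error, the expected attenuation factor is var(X) / (var(X) + var(error)) = 0.5.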
SEM deals with these limitations
Works with multiple, related equations simultaneously
Allows reciprocal relationships
Ability to model constructs as latent variables
Allows the modeller to explicitly capture unreliability of measurement in the
model
Indirect effects / mediating variables
Compares the performance of a model across multiple populations
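As a rough illustration of indirect effects, standard path analysis treats the indirect effect of X on Y through a mediator M as the product of the two path coefficients, and the total effect as direct plus indirect. The coefficient values below are hypothetical and serve only to show the arithmetic.

```python
def indirect_effect(path_x_to_m, path_m_to_y):
    """Indirect effect of X on Y through the mediator M (product of paths)."""
    return path_x_to_m * path_m_to_y

def total_effect(direct_x_to_y, path_x_to_m, path_m_to_y):
    """Total effect = direct effect + indirect effect through M."""
    return direct_x_to_y + indirect_effect(path_x_to_m, path_m_to_y)

# Hypothetical standardized path coefficients (not taken from the slides)
x_to_m = 0.5
m_to_y = 0.4
x_to_y = 0.3

indirect = indirect_effect(x_to_m, m_to_y)
total = total_effect(x_to_y, x_to_m, m_to_y)
print(round(indirect, 3), round(total, 3))
```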
Steps of SEM analysis
1. Development of hypothesis / theory
2. Construction of path diagram
3. Model specification
4. Model identification
5. Parameter estimation
6. Model evaluation
7. Model modification
1. Development of hypothesis
SEM is a confirmatory technique:
→ the researcher needs to have an established theory about the relationships
→ suited for theory testing, rather than theory development
2. Construction of path diagram
[Path diagram: an exogenous latent variable (ξ) and endogenous latent variables (η), connected by paths with path coefficients and by a correlation; error terms are drawn on the diagram]
3. Model Specification

Creating a hypothesized model that you think explains the
relationships among multiple variables

Converting the model to multiple equations
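Converting a path diagram into equations can be sketched as follows: every variable that receives at least one arrow gets its own equation, with its causes on the right-hand side plus an error term. The variable names (X, M, Y) and the coefficient naming scheme `b_effect_cause` are hypothetical, chosen only for this illustration.

```python
from collections import defaultdict

def equations_from_paths(paths):
    """Return one regression-style equation (as text) per endogenous variable.

    `paths` is a list of (cause, effect) pairs, one per arrow in the diagram.
    """
    predictors = defaultdict(list)
    for cause, effect in paths:
        predictors[effect].append(cause)
    equations = []
    for effect, causes in predictors.items():
        terms = " + ".join(f"b_{effect}_{cause}*{cause}" for cause in causes)
        equations.append(f"{effect} = {terms} + e_{effect}")
    return equations

# Hypothetical diagram: X -> M, X -> Y, M -> Y
paths = [("X", "M"), ("X", "Y"), ("M", "Y")]
equations = equations_from_paths(paths)
for equation in equations:
    print(equation)
```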
4. Model Identification

(Just) identified
• a unique estimate for each parameter
• number of equations = number of parameters to be estimated
• a+b=5, a-b=2

Under-identified (not identified)
• number of equations < number of parameters
• infinite number of solutions
• a+b=7
• model cannot be estimated

Over-identified
• number of equations > number of parameters
• the model can be tested against the data, and so it can turn out to be wrong
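Just identified model
[Path diagram with exogenous latent variables ξ1, ξ2, ξ3 and endogenous latent variables η1, η2]
Over-identified model (SEM usually)
[Path diagram with exogenous latent variables ξ1, ξ2, ξ3 and endogenous latent variables η1, η2]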
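The counting rule behind identification can be sketched in a few lines. The toy equations are the ones from the slide (a+b=5, a-b=2). The degrees-of-freedom helper assumes the usual convention that p observed variables supply p(p+1)/2 unique variances and covariances as "equations"; the function names and the example counts are hypothetical. Note that this count is a necessary condition only, not a guarantee that a model is identified.

```python
def identification_status(n_equations, n_parameters):
    """Classify a model by comparing known quantities with unknowns."""
    if n_equations < n_parameters:
        return "under-identified"
    if n_equations == n_parameters:
        return "just identified"
    return "over-identified"

def solve_sum_and_difference(total, difference):
    """Solve a + b = total and a - b = difference for (a, b)."""
    a = (total + difference) / 2
    b = (total - difference) / 2
    return a, b

def degrees_of_freedom(n_observed, n_free_parameters):
    """Unique variances/covariances minus free parameters."""
    n_moments = n_observed * (n_observed + 1) // 2
    return n_moments - n_free_parameters

# Two equations, two unknowns: a unique solution exists
a, b = solve_sum_and_difference(5, 2)
print(identification_status(2, 2), a, b)

# One equation (a + b = 7), two unknowns: infinitely many solutions
print(identification_status(1, 2))

# Hypothetical SEM: 4 observed variables, 8 free parameters
print(degrees_of_freedom(4, 8))
```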
5. Parameter estimation

technique used to calculate parameters

testing how well a model fits the data

expected covariance structure is tested against the covariance
matrix of observed data, H0: Σ = Σ(θ)

estimation methods: e.g. maximum likelihood (ML), ordinary least
squares (OLS), etc.
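For reference, the maximum likelihood method is commonly written as minimizing the discrepancy function below. The symbols S (sample covariance matrix), p (number of observed variables), N (sample size) and q (number of free parameters) are not defined on the slides and are introduced here as assumptions.

```latex
F_{ML}(\theta) = \ln\lvert\Sigma(\theta)\rvert
  + \operatorname{tr}\!\left(S\,\Sigma(\theta)^{-1}\right)
  - \ln\lvert S\rvert - p,
\qquad
T = (N-1)\,F_{ML}(\hat{\theta}) \sim \chi^2_{df},
\quad df = \frac{p(p+1)}{2} - q
```

F_ML is zero when Σ(θ) reproduces S exactly, so larger values of T indicate stronger evidence against H0: Σ = Σ(θ).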


Measurement model
• The part of the model that relates indicators to latent factors
• The measurement model is the factor analytic part of SEM
• The respective regression coefficient is called lambda (λ) / loading
Structural model
• This is the part of the model that includes the relationships between the latent variables
• The relation between an endogenous and an exogenous construct is called gamma (γ), and the relation between two endogenous constructs is called beta (β)
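In the LISREL notation used on these slides (λ, γ, β, ξ, η, δ, ε), the two parts of the model are usually written as the matrix equations below. The structural disturbance term ζ does not appear in the text above and is added here as an assumption, following the standard notation.

```latex
\begin{aligned}
x    &= \Lambda_x\,\xi + \delta            && \text{(measurement model, exogenous side)} \\
y    &= \Lambda_y\,\eta + \varepsilon      && \text{(measurement model, endogenous side)} \\
\eta &= B\,\eta + \Gamma\,\xi + \zeta      && \text{(structural model)}
\end{aligned}
```

The loadings λ are the elements of Λx and Λy, the γ coefficients fill Γ, and the β coefficients fill B.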
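[Path diagram. Measurement model: indicators X1 to X6 load on the exogenous latent variables ξ1 (X1, X2; loadings λx11, λx21), ξ2 (X3, X4; λx32, λx42) and ξ3 (X5, X6; λx53, λx63), with measurement errors δ; indicators y1 to y4 load on the endogenous latent variables η1 (y1, y2; λy11, λy21) and η2 (y3, y4; λy32, λy42), with measurement errors ε1 to ε4. Structural model: paths γ11, γ12, γ22, γ23 from the exogenous to the endogenous latent variables, path β21 from η1 to η2, and correlations ϕ21, ϕ31, ϕ32 among the exogenous latent variables.]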
6. Model evaluation
Total model
• Chi-square (χ²) test
• Because we are dealing with a measure of misfit, the p-value for χ² should be larger than .05 to decide that the theoretical model fits the data
• Fit indices, e.g. RMSEA, CFI, NNFI
• The theoretically expected values vs. the empirical data
Model parts
• t-value for each estimated parameter, showing whether it is different from 0 (or any other value that we want to fix!); |t| > 1.96, p < .05
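The overall fit test can be reproduced with the standard library alone. The sketch below computes the upper-tail p-value of a χ² statistic from a series expansion of the incomplete gamma function, plus RMSEA from its usual formula. The χ² and df values are the ones reported in the ecological example later in this presentation; the sample size is not given on the slides, so `n_obs = 200` is a purely hypothetical value, and the function names are illustrative.

```python
import math

def chi_square_p_value(chi2, df):
    """Upper-tail probability P(X >= chi2) for a chi-square variable.

    Uses the series for the regularized lower incomplete gamma function;
    adequate for moderate values, not tuned for extremely small p-values.
    """
    a = df / 2.0
    x = chi2 / 2.0
    if x <= 0:
        return 1.0
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= x / (a + n)
        total += term
    lower = total * math.exp(-x + a * math.log(x) - math.lgamma(a))
    return 1.0 - lower

def rmsea(chi2, df, n_obs):
    """Root mean square error of approximation (N - 1 convention)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n_obs - 1)))

p_value = chi_square_p_value(22.473, 19)
model_fits = p_value > 0.05          # misfit is not significant
fit_rmsea = rmsea(22.473, 19, 200)   # n_obs = 200 is an assumed sample size
print(round(p_value, 3), model_fits, round(fit_rmsea, 3))
```

Some programs divide by N instead of N - 1 in RMSEA, so small differences from software output are to be expected.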
7. Model modification

Simplify the model (i.e., delete non-significant parameters or
parameters with large standard error)

Expand the model (i.e., include new paths)

Confirmatory vs. exploratory
• Don’t go too far with model modification!
Advantages of SEM

use of confirmatory factor analysis to reduce measurement error by having
multiple indicators per latent variable

graphical modeling interface

testing models overall rather than coefficients individually

testing models with multiple dependents

modeling mediating variables (indirect effects)

testing coefficients across multiple between-subjects groups

handling difficult data (time series with autocorrelated error, non-normal data,
incomplete data).
SEM in ecology, example
Structural model
[Path diagram of the latent variables: Physical environment, Water clarity, Phytoplankton dynamics, Nutrients, Herbivore]
Example from: G.B. Arhonditsis, C.A. Stow, L.J. Steinberg, M.A. Kenney,
R.C. Lathrop, S.J. McBride, K.H. Reckhow. Exploring ecological patterns
with structural equation modeling and Bayesian analysis.
Ecological Modelling 192 (2006) 385-409
[Path diagram with indicators added. Phytoplankton dynamics: Biovolume, Chlorophyll a. Physical environment: Epilimnion depth. Nutrients: Phosphorus (SRP), Nitrogen (DIN). Herbivore: Daphnia, Zooplankton. Water clarity is also shown.]
[The same path diagram with parameter labels: loadings λ2 to λ7; structural paths γ1, γ2, β1, β2; correlation φ12; variances ψ11, ψ22, ψ33; measurement errors δ2, δ3, ε1, ε2, ε4, ε5]
[The same path diagram with the estimated values. Model fit: χ² = 22.473; df = 19; p = 0.261 > 0.05, OK! Estimates printed on the diagram (their positions on the paths are lost in this transcript): 0.67, 0.79, 0.89, 0.82, -0.07, 0.42, -0.84, 0.84, -0.92, -0.66, 0.43, 0.84, 0.76, 0.96, 0.99, 0.91, 0.71, 0.98, 0.93, 0.83]
SEM Software packages
LISREL
AMOS
Function sem in R
MPlus
EQS
Mx
SEPATH
References:
http://www.upa.pdx.edu/IOA/newsom/semrefs.htm