Transcript PowerPoint
Structural Equation Modeling (SEM)
Niina Kotamäki

SEM
• Covariance structure analysis
• Causal modeling
• Simultaneous equations modeling
• Path analysis
• Confirmatory factor analysis
• Latent variable modeling
• LISREL modeling
Highly flexible “modeling toolbox”
Extension of the general linear model (GLM)

SEM
• Quite recent innovation (late 1960s, early 1970s)
• Extensively applied in social sciences, psychology, economics, chemistry and biology
• Applications in ecology and environmental sciences are limited
• Even less common in aquatic ecosystems
• Tests theoretical hypotheses about causal relationships
• Tests relationships between observed and unobserved variables
• Combines regression analysis (path analysis) and factor analysis
• Researchers use SEM to determine whether a certain model is valid

Regression model: Y = aX1 + bX2 + ε
[Path diagram: the independent variables X1 and X2, which are correlated, point to the dependent variable Y with coefficients a and b; ε is the error term.]

LIMITATIONS
• Multiple dependent (Y) variables are not permitted
• Each independent variable (X) is assumed to be measured without error
  - controlled experiments: measurement errors are negligible and uncontrolled variation is at a minimum
  - observational studies: all variables are subject to measurement error and uncontrolled variation
• Strong correlation between independent variables (multicollinearity) may cause biased parameter estimates and inflated standard errors
• Indirect effects (mediating variables) cannot be included
• The error or residual variable is the only unobserved variable

SEM deals with these limitations
• Works with multiple, related equations simultaneously
• Allows reciprocal relationships
• Can model constructs as latent variables
• Allows the modeller to explicitly capture unreliability of measurement in the model
• Handles indirect effects / mediating variables
• Compares the performance of a model across multiple populations

Steps of SEM analysis
1. Development of hypothesis / theory
2. Construction of path diagram
3. Model specification
4. Model identification
5. Parameter estimation
6. Model evaluation
7. Model modification

1. Development of hypothesis
• SEM is a confirmatory technique: the researcher needs to have an established theory about the relationships
• Suited for theory testing rather than theory development

2. Construction of path diagram
[Path diagram: an exogenous latent variable (ξ) and two endogenous latent variables (η), connected by paths with path coefficients and by a correlation, with error terms.]

3. Model specification
• Creating a hypothesized model that you think explains the relationships among multiple variables
• Converting the model to multiple equations

4. Model identification
(Just) identified
• a unique estimate for each parameter
• number of equations = number of parameters to be estimated
• e.g. a + b = 5, a - b = 2
Under-identified (not identified)
• number of equations < number of parameters
• infinite number of solutions
• e.g. a + b = 7
• the model cannot be estimated
Over-identified
• number of equations > number of parameters
• the model can be wrong (so its fit can be tested)
[Figure: just-identified model with ξ1, ξ2, ξ3, η1, η2.]
[Figure: over-identified model (the usual case in SEM) with ξ1, ξ2, ξ3, η1, η2.]

5. Parameter estimation
• Technique used to calculate the parameters
• Tests how well the model fits the data
• The expected covariance structure is tested against the covariance matrix of the observed data: H0: Σ = Σ(θ)
• Estimation methods: e.g. maximum likelihood (ML), ordinary least squares (OLS)

Measurement model
• The part of the model that relates indicators to latent factors
• The measurement model is the factor-analytic part of SEM
• The respective regression coefficient is called lambda (λ) or loading

Structural model
• The part of the model that includes the relationships between the latent variables
• The relation between an exogenous and an endogenous construct is called gamma (γ), and the relation between two endogenous constructs is called beta (β)

[Figure: full SEM path diagram. Measurement model: exogenous latent variables ξ1 (indicators X1, X2; loadings λx11, λx21), ξ2 (X3, X4; λx32, λx42) and ξ3 (X5, X6; λx53, λx63), with measurement errors δ and correlations ϕ21, ϕ31, ϕ32; endogenous latent variables η1 (indicators y1, y2; loadings λy11, λy21) and η2 (y3, y4; λy32, λy42), with measurement errors ε1 to ε4. Structural model: paths γ11, γ12, γ22, γ23 between the exogenous and endogenous latent variables, and β21 between the endogenous ones.]

6. Model evaluation
Total model
• Chi-square (χ²) test
• Because χ² is a measure of misfit, its p-value should be larger than .05 to conclude that the theoretical model fits the data
• Fit indices, e.g. RMSEA, CFI, NNFI
• Compares the theoretically expected values with the empirical data
Model parts
• t-value for the estimated parameters, showing whether they differ from 0 (or from any other value that we want to fix): |t| > 1.96, p < .05

7. Model modification
• Simplify the model (i.e. delete non-significant parameters or parameters with large standard errors)
• Expand the model (i.e. include new paths)
• Confirmatory vs. exploratory
• Don’t go too far with model modification!

Advantages of SEM
• Use of confirmatory factor analysis to reduce measurement error by having multiple indicators per latent variable
• Graphical modeling interface
• Testing models overall rather than coefficients individually
• Testing models with multiple dependent variables
• Modeling indirect (mediating) variables
• Testing coefficients across multiple between-subjects groups
• Handling difficult data (time series with autocorrelated error, non-normal data, incomplete data)

SEM in ecology, example
Example from: G.B. Arhonditsis, C.A. Stow, L.J. Steinberg, M.A. Kenney, R.C. Lathrop, S.J. McBride, K.H. Reckhow. Exploring ecological patterns with structural equation modeling and Bayesian analysis. Ecological Modelling 192 (2006) 385-409.
[Figure: structural model with the nodes Physical environment, Water clarity, Phytoplankton dynamics, Nutrients and Herbivore.]
[Figure: the same model with the variables Biovolume, Chlorophyll a, Epilimnion depth, Phosphorus (SRP), Nitrogen (DIN), Daphnia and Zooplankton added.]
[Figure: the full model with parameter labels λ2 to λ7, γ1, γ2, β1, β2, φ12, ψ11, ψ22, ψ33, δ2, δ3, ε1, ε2, ε4, ε5.]
Model fit: χ² = 22.473; df = 19; p = 0.261 > 0.05, OK!
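The fit decision in the example rests on the upper-tail probability of the χ² statistic for the given degrees of freedom. As a rough sketch of how that p-value can be checked, the snippet below evaluates the χ² survival function with the Python standard library only. The function name chi2_sf is a hypothetical helper introduced here, not part of any SEM package, and the series formula is a generic textbook method rather than what the cited authors used. For χ² = 22.473 and df = 19 it should land close to the p = 0.261 reported on the slide.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X >= x) for a chi-square variable with df degrees of freedom.

    Hypothetical helper for illustration: it uses the series expansion of the
    regularized lower incomplete gamma function P(a, z) with a = df/2, z = x/2.
    """
    if x <= 0:
        return 1.0
    a = df / 2.0
    z = x / 2.0
    # Series: sum over n >= 0 of z^n / (a * (a+1) * ... * (a+n))
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= z / (a + n)
        total += term
        n += 1
    # Multiply by z^a * exp(-z) / Gamma(a), computed on the log scale for stability
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - lower


# Values taken from the example slide
chi_square, df = 22.473, 19
p = chi2_sf(chi_square, df)
print(f"chi-square = {chi_square}, df = {df}, p = {p:.3f}")
# Chi-square measures misfit, so a p-value above .05 means the model is not rejected
print("model not rejected" if p > 0.05 else "model rejected")
```

In practice the SEM packages listed at the end of the slides report this p-value directly; the sketch only makes the decision rule from step 6 concrete.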
[Figure: the same path diagram with the estimated coefficients, showing Biovolume, Chlorophyll a, Epilimnion depth (physical environment), Phytoplankton dynamics, water clarity, Nutrients, Herbivore, Phosphorus (SRP), Nitrogen (DIN), Daphnia and Zooplankton.]

SEM software packages
• LISREL
• AMOS
• Function sem in R
• Mplus
• EQS
• Mx
• SEPATH

References: http://www.upa.pdx.edu/IOA/newsom/semrefs.htm