A simple guide to Mediation

Mediation and Multi-group Analyses
Lyytinen & Gaskin
Mediation
• In an intervening variable model, variable X is postulated to exert an effect on an outcome variable, Y, through one or more intervening variables called mediators (M)
• “mediational models advance an X → M → Y causal sequence, and seek to illustrate the mechanisms through which X and Y are related.” (Mathieu & Taylor)
[Figure: X → M → Y]
Why Mediation?
• Seeking a more accurate explanation of the causal effect the antecedent (predictor) has on the DV (criterion, outcome): focus on the mechanisms that make the causal chain possible
• Missing variables in the causal chain
  ◦ Intelligence → Performance
  ◦ Intelligence → Work Effectiveness → Performance
Conditions for mediation
(1) justify the causal order of variables including temporal
precedence;
(2) reasonably exclude the influence of outside factors;
(3) demonstrate acceptable construct validity of their
measures;
(4) articulate, a priori, the nature of the intervening effects
that they anticipate; and
(5) obtain a pattern of effects that are consistent with their
anticipated relationships while also disconfirming
alternative hypotheses through statistical tests.
Conditions for mediation
• Inferences of mediation are founded first and foremost on theory, research design, and the construct validity of the measures employed, and second on statistical evidence of relationships.
• Mediation analysis requires:
1) Inferences concerning mediational X → M → Y relationships hinge on the validity of the assertion that the relationships depicted unfold in that sequence (Stone-Romero & Rosopa, 2004). As with SEM, multiple qualitatively different models can be fit equally well to the same covariance matrix. Using the exact same data, one could as easily ‘confirm’ a Y → M → X mediational chain as an X → M → Y sequence (MacCallum, Wegener, Uchino, & Fabrigar, 1993).
2) The purpose of experimental designs is to isolate and test, as best as possible, X → Y relationships from competing sources of influence. In mediational designs, however, this focus is extended to a three-phase X → M → Y causal sequence, requiring random assignment to both X and M and related treatments.
“Because researchers may not be able to randomly assign participants to conditions, the causal sequence of X → M → Y is vulnerable to any selection related threats to internal validity (Cook & Campbell, 1979; Shadish et al., 2002). To the extent that individuals’ status on a mediator or criterion variable may alter their likelihood of experiencing a treatment, the implied causal sequence may also be compromised. For example, consider a typical: training → self-efficacy → performance, mediational chain. If participation in training is voluntary, and more efficacious people are more likely to seek training, then the true sequence of events may well be self-efficacy → training → performance. If higher performing employees develop greater self-efficacy (Bandura, 1986), then the sequence could actually be performance → efficacy → training. If efficacy and performance levels remain fairly stable over time, one could easily misconstrue and find substantial support for the training → efficacy → performance sequence when the very reverse is actually occurring.” (Mathieu and Taylor 2006)
Conditions of mediation
• It is a hallmark of good theories that they articulate how and why variables are ordered in a particular way (e.g., Sutton & Staw, 1995; Whetten, 1989). This is perhaps the only basis for advancing a particular causal order in non-experimental studies with simultaneous measurement of the antecedent, mediator, and criterion variables (i.e., classic cross-sectional designs).
• Implicitly, mediational designs advance a time-based model of events whereby X occurs before M, which in turn occurs before Y. It is the temporal relationships of the underlying phenomena that are at issue, not necessarily the timing of measurements.
• In mediation analyses, omitted variables represent a significant threat to the validity of the X → M relationship if they are related both to the antecedent and to the mediator, and have a unique influence on the mediator. Likewise, omitted variables (and related paths) may lead one to conclude falsely that no direct X → Y effect exists when in fact it holds in the population.
Importance of theory – Cause and effect
[Figure: four alternative orderings of the same three variables]
• Training, Self-efficacy, Performance
• Training, Performance, Self-efficacy
• Performance, Self-efficacy, Training
• Self-efficacy, Training, Performance
Types of Mediation
[Figure: three path diagrams among X, M, and Y, labelled Indirect Effect, Partial Mediation, and Full Mediation; the legend distinguishes significant paths from insignificant paths]
More complex mediation structures
[Figure: two path diagrams]
• Chain Model: X → M1 → M2 → M3 → Y
• Parallel Model: X → {M1, M2, M3} → Y
Hypothesizing Mediation
• All types of mediation need to be hypothesized explicitly, with sound theoretical reasoning and logic, before they are tested
• Indirect Effect
  ◦ You still need to assume and test that X has an indirect effect on Y, even though there is no effect in the direct path X → Y
  ◦ “X has an indirect, positive effect on Y, through M.”
• Partial or Full
  ◦ “M partially/fully mediates the effect of X on Y.”
  ◦ “The effect of X on Y is partially/fully mediated by M.”
  ◦ “The effect of X on Y is partially/fully mediated by M1, M2, & M3.”
Statistical evidence of relationships
• Each type of mediation needs to be backed by appropriate statistical analysis
• Sometimes the analysis can be based on OLS, but in most cases it needs to be backed by SEM-based path analysis
• There are four types of analyses to detect the presence of mediation relationships:
1. Causal steps approach (Baron & Kenny 1986): tests for the significance of different paths
2. Difference in coefficients: evaluates the changes in betas/coefficients and their significance when new paths are added to the model
3. Product of effects approach: tests for the indirect effect a*b; this always needs to be tested or evaluated using bootstrapping
4. Sometimes, evaluating differences in R squares
Statistical evidence of relationships
• Convergent validity is critical for mediation tests because it forms the basis for reliability. Poor reliability of the mediator is a particular concern: “to the extent that a mediator is measured with less than perfect reliability, the M → Y relationship would likely be underestimated, whereas the X → Y would likely be overestimated when the antecedent and mediator are considered simultaneously” (see Baron & Kenny 1986)
• Discriminant validity must be gauged in the context of the larger nomological network within which the relationships being considered are believed to reside. Discriminant validity does not imply that measures of different constructs are uncorrelated; the issue is whether measures of different variables are so highly correlated as to raise questions about whether they are assessing different constructs. It is incumbent on researchers to demonstrate that their measures of X, M, and Y evidence acceptable discriminant validity before any mediational tests are justified.
Statistical evidence of the relationships
• In simple partial mediation, βmx is the coefficient of X in predicting M, and βym.x and βyx.m are the coefficients of M and X, respectively, in predicting Y from both. Here βyx.m is the direct effect of X, whereas the product βmx*βym.x quantifies the indirect effect of X on Y through M. If all variables are observed, then βyx = βyx.m + βmx*βym.x, or equivalently βmx*βym.x = βyx - βyx.m
• The indirect effect is the amount by which two cases who differ by one unit on X are expected to differ on Y through X’s effect on M, which in turn affects Y
• The direct effect is the part of the effect of X on Y that is independent of the pathway through M
• Similar logic can be applied to more complex situations
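The decomposition of the total effect into a direct and an indirect part can be checked numerically. The following is a minimal sketch, not the authors' procedure: it assumes observed variables, ordinary least squares, and synthetic data whose true coefficients (0.5, 0.3, 0.4) are arbitrary illustration values. The function names cov and mediation_paths are hypothetical helpers.

```python
import random

def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)

def mediation_paths(x, m, y):
    """Closed-form OLS coefficients for the simple X -> M -> Y model."""
    sxx, smm = cov(x, x), cov(m, m)
    sxm, sxy, smy = cov(x, m), cov(x, y), cov(m, y)
    det = sxx * smm - sxm ** 2
    return {
        "b_yx": sxy / sxx,                        # total effect: Y on X alone
        "b_mx": sxm / sxx,                        # M on X
        "b_yx_m": (sxy * smm - smy * sxm) / det,  # direct effect: X, controlling for M
        "b_ym_x": (smy * sxx - sxy * sxm) / det,  # M, controlling for X
    }

# Synthetic data; the generating coefficients are assumptions for illustration
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(500)]
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.3 * xi + 0.4 * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]

p = mediation_paths(x, m, y)
indirect = p["b_mx"] * p["b_ym_x"]
print(f"total  b_yx            = {p['b_yx']:.4f}")
print(f"direct b_yx.m          = {p['b_yx_m']:.4f}")
print(f"indirect b_mx * b_ym.x = {indirect:.4f}")
print(f"direct + indirect      = {p['b_yx_m'] + indirect:.4f}")
```

With OLS on observed variables the last two printed totals agree up to floating-point error, which is the identity stated on the slide. With latent variables or other estimators the equality need not hold exactly.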
What would be the paths here?
Statistical analysis
• How the existence of the mediational effect is tested depends on the type of indirect effect
• The lack of a direct effect X → Y (βyx is either zero or not significant) does not demonstrate the lack of a mediated effect
• Therefore three different situations prevail (in this order):
1. The presence of an indirect effect (βmx*βym.x is significant)
2. The presence of full mediation (βyx is significant but βyx.m is not)
3. The presence of partial mediation (βyx is significant, and βyx.m is nonzero and significant)
[Figure: Testing for indirect effect]
[Figure: Testing for full mediation]
[Figure: Testing for partial mediation]
Observations of statistical analysis
• The key is to test for the presence of a significant indirect effect; just demonstrating the significance of the paths βyx, βyx.m, βym.x, and βmx is not enough
• One reason is that significance tests of the individual paths are not inferences about the indirect effect itself, which is a product of path coefficients
• The indirect effect can be tested using either the Sobel test (see e.g. www.quantpsy.org) or bootstrapping
• The Sobel test assumes normality of the product term and relatively large sample sizes (>200)
• It lacks power with small sample sizes or if the distribution is not normal
Bootstrapping
• Bootstrapping is available in most statistical packages, or additional code exists to accomplish it for most software packages
• It samples the distribution of the indirect effect by treating the obtained sample of size n as a miniature representation of the population, and then randomly resampling it with replacement: a resample of size n is built by drawing cases from the original sample, with every drawn case thrown back so that it can be redrawn
• βmx and βym.x and their product are estimated and recorded for each resample
• The process is repeated k times, where k is large (>1000)
• Hence we have k estimates of the indirect effect, and their distribution functions as an empirical approximation of the sampling distribution of the indirect effect when taking a sample of size n from the original population
• Upper and lower bounds for the confidence interval are established by finding the i-th lowest and j-th largest values in the rank-ordered estimates; if the interval excludes zero, the null hypothesis that the indirect effect is zero is rejected at, e.g., the 95% level of confidence
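The resampling procedure described above can be sketched in a few lines. This is an illustrative percentile bootstrap under stated assumptions (observed variables, OLS paths, synthetic data with arbitrary true coefficients of 0.5), not the routine any particular package implements. The names indirect_effect and bootstrap_ci are hypothetical.

```python
import random

def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)

def indirect_effect(x, m, y):
    """Product of the M-on-X slope and the Y-on-M slope controlling for X."""
    sxx, smm = cov(x, x), cov(m, m)
    sxm, sxy, smy = cov(x, m), cov(x, y), cov(m, y)
    a = sxm / sxx
    b = (smy * sxx - sxy * sxm) / (sxx * smm - sxm ** 2)
    return a * b

def bootstrap_ci(x, m, y, k=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(k):
        # Draw n case indices with replacement; whole cases are resampled together
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    lo_rank = int(k * (1 - level) / 2)   # position of the lower bound in the ordered estimates
    hi_rank = k - lo_rank - 1            # mirror position for the upper bound
    return estimates[lo_rank], estimates[hi_rank]

# Synthetic full-mediation data (assumed values, for illustration only)
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(200)]
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.5 * mi + rng.gauss(0, 1) for mi in m]

lo, hi = bootstrap_ci(x, m, y)
print(f"indirect effect = {indirect_effect(x, m, y):.3f}")
print(f"95% bootstrap CI = [{lo:.3f}, {hi:.3f}]")
print("reject H0 (indirect effect = 0):", lo > 0 or hi < 0)
```

Because whole cases are resampled, missing values in any of the three variables would break the estimate for that resample, which is consistent with the requirement of complete data for bootstrapping.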
Observations of statistical analysis
• In full and partial mediation, the bivariate X → Y relationship (assessed via the correlation rYX or the coefficient βyx) must be nonzero in the population if the effects of X on Y are mediated by M
• Establishing a significant bivariate relationship, however, is conditional on sample size
• For example, assume that N=100 and sample correlations rXM=.30 and rMY=.30; both would be significant at p<.05. However, the sample correlation rXY=.09 would not!
• Hence tests for full mediation can be precluded even if this is the true model in the population
• This point becomes even more challenging when complex mediations X → M1 → M2 → M3 → Y are present
• Hence full mediations often go undetected due to underpowered designs; the same holds for interactions or suppression variables. In fact, the four-step Baron-Kenny approach has a power of .52 with a sample size of 200 to detect a medium effect!
• This can be overcome by bootstrapping
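The N=100 example can be reproduced with a short calculation. The sketch below uses the Fisher z approximation for the significance of a correlation, which is an assumption of this example (the slides do not say which test was used); an exact t-test gives very similar p-values at this sample size. The function name correlation_p_value is hypothetical.

```python
import math

def correlation_p_value(r, n):
    """Two-tailed p-value for H0: rho = 0, using the Fisher z approximation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))

n = 100
r_xm, r_my = 0.30, 0.30
# Under full mediation with standardized variables, r_XY is the product of the two paths
r_xy = r_xm * r_my

for label, r in [("r_XM", r_xm), ("r_MY", r_my), ("r_XY", r_xy)]:
    p = correlation_p_value(r, n)
    print(f"{label} = {r:.2f}  p = {p:.3f}  significant at .05: {p < 0.05}")
```

Both component correlations pass the .05 threshold while their product does not, so a causal-steps test that first requires a significant X → Y relationship would stop before the mediator is ever examined.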
Observations of statistical analysis
• Testing for full mediation requires that βyx.m is zero. When βyx.m does not drop to zero, the evidence supports partial mediation. This requires researchers to make a priori hypotheses concerning full or partial mediation; otherwise confirmatory tests turn into exploratory data mining
• What counts as a significant reduction from βyx to βyx.m is not clear (cf. from .15 to .05 vs. from .75 to .65)
• Typically the baseline model for mediation is partial mediation, while theoretical clarity and Occam’s razor would speak for full mediation
Testing for Mediation in AMOS
• Direct Effects First
Regression Weights
                       Estimate   S.E.    C.R.     P
loylong <--- ctrust    .282       .048    5.812    ***
loylong <--- atrust    .184       .048    3.850    ***
Testing for Mediation in AMOS
• Add Mediator
Regression Weights
                       Estimate   S.E.    C.R.     P
value   <--- atrust    .210       .048    4.400    ***
value   <--- ctrust    .602       .048    12.452   ***
loylong <--- ctrust    .089       .056    1.592    .111
loylong <--- atrust    .123       .047    2.638    .008
loylong <--- value     .312       .052    5.935    ***
Testing significance of partially mediated paths – Sobel Test
• Use for partially mediated relationships.
• Use the Sobel Test online calculator: http://www.danielsoper.com/statcalc/calc31.aspx
• Assumes normal distribution and sufficiently large sample
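As an alternative to the online calculator, the Sobel statistic can be computed directly. The sketch below uses the first-order Sobel standard error, sqrt(b²·SEa² + a²·SEb²), and plugs in the regression weights reported for the mediated model (atrust → value and ctrust → value as path a, value → loylong as path b). Treating those unstandardized estimates as the inputs is an assumption of this example, and the function name sobel_test is hypothetical.

```python
import math

def sobel_test(a, se_a, b, se_b):
    """Sobel z statistic and two-tailed p-value for the indirect effect a*b."""
    se_ab = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    z = (a * b) / se_ab
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed p from the standard normal
    return z, p

# Path b: value -> loylong (estimate .312, S.E. .052)
b, se_b = 0.312, 0.052

# Path a for each antecedent: (estimate, S.E.) of the antecedent -> value path
paths_a = {"atrust": (0.210, 0.048), "ctrust": (0.602, 0.048)}

results = {}
for name, (a, se_a) in paths_a.items():
    results[name] = sobel_test(a, se_a, b, se_b)
    z, p = results[name]
    print(f"{name} -> value -> loylong: z = {z:.3f}, p = {p:.5f}")
```

The normal-theory p-value shares the limitation noted earlier: the product a*b is generally not normally distributed in small samples, which is why bootstrapping is preferred there.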
Testing significance of indirect effects – Bootstrapping
• At least 1000 bootstrap samples
• No missing values allowed!
[Figure: bootstrap p-values]
No Mediation
• If Indirect is > 0.05
Full Mediation
• Given the direct effects were significant prior to adding the mediator
• If Indirect < 0.05 and Direct is > 0.05
Partial Mediation
• If Direct & Indirect < 0.05, check Total.
• If Total < 0.05 then partial mediation is significant.

Direct Effects - Two Tailed Significance
         burnm   burnc   satc    satw
wu       0.003   0.004   0.845   0.004
wf       0.033   0.969   0.026   0.836
aut      0.026   0.435   0.260   0.020
burnm    ...     ...     0.016   0.011
burnc    ...     ...     0.007   0.009

Indirect Effects - Two Tailed Significance
         burnm   burnc   satc    satw
wu       ...     ...     0.005   0.003
wf       ...     ...     0.546   0.115
aut      ...     ...     0.016   0.016
burnm    ...     ...     ...     ...
burnc    ...     ...     ...     ...

Total Effects - Two Tailed Significance
         burnm   burnc   satc    satw
wu       0.003   0.004   0.033   0.003
wf       0.033   0.969   0.024   0.174
aut      0.026   0.435   0.026   0.020
burnm    ...     ...     0.016   0.011
burnc    ...     ...     0.007   0.009
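The decision rules for reading the three significance tables can be written as a small function. This is a sketch only: the function name classify_mediation and the labels "indirect effect only" and "inconclusive" are hypothetical, the treatment of a p-value exactly equal to 0.05 is an assumption, and the example p-values assume the flattened tables pair each predictor (wu, wf) with each outcome (satc, satw) in the order listed. Whether the direct effect was significant before the mediator was added is not reported in the tables, so it is passed in as an assumed flag.

```python
def classify_mediation(p_direct, p_indirect, p_total,
                       direct_sig_before_mediator, alpha=0.05):
    """Apply the slide's decision rules to bootstrap two-tailed p-values."""
    if p_indirect > alpha:
        return "no mediation"
    if p_direct > alpha:
        # Full mediation presupposes a significant direct effect
        # before the mediator was added to the model
        if direct_sig_before_mediator:
            return "full mediation"
        return "indirect effect only"      # hypothetical label
    # Direct and indirect are both significant: check the total effect
    if p_total < alpha:
        return "partial mediation"
    return "inconclusive"                  # hypothetical label

# (direct, indirect, total) p-values as read from the tables; pairing is assumed
cases = {
    "wu -> satw": (0.004, 0.003, 0.003),
    "wu -> satc": (0.845, 0.005, 0.033),
    "wf -> satc": (0.026, 0.546, 0.024),
}
for name, (p_dir, p_ind, p_tot) in cases.items():
    label = classify_mediation(p_dir, p_ind, p_tot, direct_sig_before_mediator=True)
    print(f"{name}: {label}")
```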
Findings
[Figure: path diagram with coefficients .23***, .37***, .20**, .17*, and .08, labelled Partial Mediation and Full Mediation]
WORDING
Overall value partially mediates the effect of trust in agent on long-term loyalty (p < 0.001).
Overall value fully mediates the effect of trust in company on long-term loyalty (p < 0.001).
Using AMOS for testing chain models and parallel models
Moderation concept
• Based on the observation that the relationship between an independent and a dependent variable is affected by another independent variable
• This situation is called a moderator effect: it occurs when a moderator variable, a second independent variable, changes the form of the relationship between another independent variable and the DV
• Can be expanded to a situation where the mediated relationship is moderated
Moderation: affecting the effect
• Moderating variables must be chosen with strong theoretical support (Hair et al 2010)
• The causality of the moderator cannot be tested directly
• The moderator effect becomes potentially confounded as the moderator becomes correlated with either of the variables in the relationship
• Testing is easiest when the moderator has no significant relationship with other constructs
• This assumption is important in distinguishing moderators from mediators, which (by definition) are related to both constructs of the mediated relationship
Moderation: Multi-group
• Non-metric moderators: categorical variables are hypothesized as moderators (gender, age, turbulence vs. non-turbulence, non-customer vs. customer)
• For non-metric variables a multi-group analysis is applied, i.e. the data are split into separate groups based on the variable's values, and the groups are analyzed separately and tested for statistical differences (both for the measurement and the structural model)
Multi-group example
[Figure: the Exercise → Weight Loss path shown separately for a Low Protein group and a High Protein group]
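The idea of the multi-group example can be illustrated with a simplified stand-in: estimate the Exercise → Weight Loss slope separately in each protein group and test whether the two slopes differ. The sketch below uses plain OLS and a z-test on the difference of independent slopes, which is an assumption of this example; in SEM software the comparison is typically made by contrasting constrained and unconstrained multi-group models. The data are synthetic, the true slopes (0.2 and 0.8) are arbitrary, and the helper names are hypothetical.

```python
import math
import random

def slope_and_se(x, y):
    """OLS slope of y on x and its standard error."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    rss = sum((yi - my - b * (xi - mx)) ** 2 for xi, yi in zip(x, y))
    return b, math.sqrt(rss / (n - 2) / sxx)

def compare_slopes(group_a, group_b):
    """z-test for the difference between the slopes of two independent groups."""
    b_a, se_a = slope_and_se(*group_a)
    b_b, se_b = slope_and_se(*group_b)
    z = (b_a - b_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed p from the standard normal
    return b_a, b_b, z, p

def simulate(true_slope, n, rng):
    """Synthetic (exercise, weight_loss) data for one group."""
    exercise = [rng.gauss(0, 1) for _ in range(n)]
    weight_loss = [true_slope * e + rng.gauss(0, 1) for e in exercise]
    return exercise, weight_loss

rng = random.Random(7)
low_protein = simulate(0.2, 300, rng)    # assumed weaker effect of exercise
high_protein = simulate(0.8, 300, rng)   # assumed stronger effect of exercise

b_low, b_high, z, p = compare_slopes(low_protein, high_protein)
print(f"slope (low protein)  = {b_low:.3f}")
print(f"slope (high protein) = {b_high:.3f}")
print(f"z = {z:.2f}, p = {p:.4f}, moderation supported: {p < 0.05}")
```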
Moderator vs. Mediator
• Mediator: the means by which IV affects DV
[Figure: path diagram with nodes A, K, M, C, B, E]
• Moderator: a variable that influences the magnitude of the effect an IV has on a DV
[Figure: path diagram with nodes M, K, E]
Mediation vs. Moderation Example
Notice that the mediator and the moderator can be the same!
Can a mediator also be used as a moderator?
Yes - see Baron and Kenny 1986 for a complex example
Some Theory-based Criteria
(i.e., arguments for mediation and moderation are based on theory first, rather than statistical correlations)
• Mediator
  ◦ Logical effect of IV
  ◦ Logical cause of DV
• Moderator
  ◦ Not logically correlated to IV or DV (if categorical)
  ◦ Holistic/multiplicative effect (interaction)
  ◦ Varying effect for different categorical values (multi-group)
Either, Neither,
One or the
Other?
Driving home the point:
Moderator or Mediator?
 Caloric intake
 Exercise partner
 Positive reinforcement
 Exercise IQ
 Gender
 Activity level
 Age
 Protein intake
 Heredity
 Attitude
Exercise
40
M
M
Weight
Loss
Exercise
Weight
Loss
Koufteros & Marcoulides 2006
41