META-ANALYSIS OF RESEARCH


Texas A&M University
SUMMER STATISTICS INSTITUTE
June 2009
Victor L. Willson, Instructor
TOPIC AREAS
 Background
 Research focus for meta-analysis
 Finding studies
 Coding studies
 Computing effect sizes
 Effect size distribution
 Mediators
 Moderators
 Report-writing
 Current issues
Background
 Purposes
 Historical
 Meta-analysis as survey research
 Strengths
 Weaknesses
Purposes for Meta-Analysis
• Cumulate findings of studies on a particular topic
• Examine homogeneity of outcomes
• Estimate effects of independent variables on outcomes
in a standardized format
– Evaluate moderator and mediator effects on outcomes
– Differentiate different types or classes of outcome effects
Historical background
 Criticism of traditional narrative reviews of research
 Exasperation in the social sciences with constructs measured in different ways when trying to determine consistencies
 Need to formulate theoretical relationships based on
many studies
History part 2
 Early 1970s efforts focused on significance testing and
“vote counts” of significance
 Glass (1976) presented a method he called “meta-analysis” in his Am. Ed. Research Assn. presidential address
 Others proposed related methods, but Glass and
colleagues developed the most widely used approach
(Glass, McGaw, & Smith, 1981)
Meta-Analysis as Survey Research
• Research articles as unit of focus
• Population defined
– Conditions for inclusion of articles
• Data requirements needed for inclusion
• Completeness of data available in article or estimable
• Publication sources available, selected
• Sample vs. population acquisition
– Availability of publications and cost
– Time to acquisition (how long to wait for retrieval)
Strengths of Meta-Analysis
 Definition of effect and effect size beyond “significant
or not”
 Focus on selection threats in traditional reviews (bias
in selection of articles for review)
 Systematic consideration of potential mediators and
moderators of effects
 Data organization of articles for public review
Weaknesses of Meta-Analysis
 Methodologically sophisticated and expensive
 Potential ignoring of contextual effects not easily
quantified; eg. historical/environmental placement
of research
 Potential improper mixing of studies
 Averages hiding important subgroupings
 Improperly weighting studies with different
methodological strength/rigor
Research focus for meta-analysis
 Defining and delineating the construct
 Determining a research outlet
 Meta-Analysis as an interactive, developing process
Recent Criticism
 Suri & Clarke (2009): Advancements in Research
Synthesis Methods: From a Methodologically Inclusive
Perspective (Review of Educational Research, pp. 395-430)
 They propose 6 overlapping approaches:
1. Statistical research syntheses (eg. meta-analysis)
2. Systematic reviews
3. Qualitative research syntheses
4. Qualitative syntheses of qualitative and quantitative
research
5. Critical impetus in reviewing research
6. Exemplary syntheses
Some critical comments on Suri & Clarke (2009)
 Systematic reviews- original Glass criticisms hold: what is
the basis for inclusion and exclusion; why are certain
articles privileged?
 Qualitative research syntheses- how can these be done with situated contexts, small samples, environmentally developed variables, sources, etc.? Will there be a review for every reader, or for every researcher? Same limitation as all qual research
 Qual syntheses of quant and qual research- potentially
doable, with an alternating order: qual first to focus emphases
in the quant analysis, or quant first to be validated with the qual studies
of particular environments and populations- do they fit/match in
reasonable ways?
 Critical impetus- code words for critical theory/Marxist etc.
Answer is already known, why do the research?
 Exemplary syntheses- what is the purpose?
Defining and Delineating the Research Topic
• Outcome construct definition
– Importance to the field to know what has been learned
– How big is it? How many potential studies?
– Conduct preliminary searches using various databases
• Refining the construct
– How much resource is available? E.g., 1000 studies = 2-3 years of work
– Are there specific sub-constructs more important than others? Select them or one of them
– Are there time limitations (no studies before 19xx)?
– Are there too few studies for the given construct; should it be broadened? Too few -> fewer than 10?
Defining and Delineating the Research Topic
 What is the typical research approach for the topic
area?
 All quantitative
 All qualitative
 Mixed quantitative and qualitative
 Are there sufficient quantitative studies to provide
evidence for findings?
 Can qualitative studies be included as a separate part of
the study? How?
Determining Research Outlet
• Does the proposed journal
– publish research on the construct?
– Publish reviews or meta-analyses?
• Is there a journal devoted to reviews that your project
would fit with?
• Has a recent similar meta-analysis been published? If so,
will yours add anything new?
– Ex. Allen, et al (under review) evaluated articles on first grade
retention after 1990 focusing on the quality of the research
design in each study to determine if the effects were different
from a fairly recent meta-analysis by Jimerson (2001)
Meta-Analysis as an interactive,
developing process
• View meta-analysis as evolutionary
– As studies are reviewed and included, purpose and
scope may change
• Assume initial conceptualizations about both
outcomes and potential predictors may change over
time
– Definitions, instruments, coding may all change as
studies are found and included
• Plan for revisions to all aspects of the meta-analysis
FINDING STUDIES
 Searches
 Selection criteria
Searches
• Traditional literature review methods:
– Current studies are cumulated
– Branching backward search uses the reference lists of current studies
• Electronic searches
– Google, Google Scholar, PsycINFO, research library catalogs (for major
research institution libraries)
– Searches of major journal article titles and abstracts (commonly
available now through electronic libraries)
• Abstract vs. full content searches- electronic, pdf, hard copy
• Author requests: email or hard copy requests for newly
published articles or other works not found in typical search
outcomes
Selection Criteria
• In or out:
– Any quantitative data available?
• Descriptive data- means and SDs for all groups of interest?
• Analysis summaries- F- or t-tests, ANOVA tables, etc. available that may be utilized?
• Iterative process: outs may come back in given broader
definitions of a construct
• Duplicated articles/data reports? Decide on which to keep (earliest? most complete?). Why were multiple articles prepared? New groups included that can be used?
• Keep records of every study considered- excel or hard copy,
for example
Selection Criteria
• Useful procedure:
• Create an index card for each study along with notes of
each to refer to
• Organize studies into categories or clusters
• Review periodically as new studies are added, revise or
regenerate categories and clusters
• Consider why you organized the studies this way- does
it reflect the scope of research, construct organization,
or other classes?
CODING STUDIES
 Dependent variable(s)
 Construct(s) represented
 Measure name and related characteristics
 Effect size and associated calculations
 Independent variables
 Population
 Sample
 Design
 Potential Mediators and Moderators
 Bias mechanisms and threats to validity
CODING STUDIES- Dependent
Variables
 Construct name(s): eg. Receptive or Expressive Vocabulary
 Measurement name: Willson EV Test
 Raw score summary data (mean, SD for each group or
summary statistics and standard errors for dep. var):
Exp Mean= 22 Exp SD = 5 n=100, Con Mean = 19 Con SD = 4,
n=100
 Effect size (mean difference or correlation)
e = (22-19)/(20.5)
 Effect size transformation used (if any) for mean differences:
 t-test transform: e = t*(1/n1 + 1/n2)½
 F-statistic transform: F½ = t for df = 1, 198
 probability transform to t-statistic: t(198) = [probt(.02)]
 point-biserial transform to t-statistic, regression coefficient t-statistic
 Effect size transformations used (if any) for correlations:
 t-statistic to correlation: r² = t²/(t² + df)
 Regression coefficient t-statistic to correlation
CODING STUDIES- Independent
variables
 Population(s): what is the intended population, what
characterizes it?
Gender? Ethnicity? Age? Physical characteristics, Social
characteristics, Psychosocial characteristics? Cognitive
characteristics?
 Sample: population characteristics in Exp, Control
samples
eg. % female, % African-American, % Hispanic, mean IQ,
median SES, etc.
CODING STUDIES- Independent
variables
Design (mean difference studies):
1. Random assignment, quasi-experimental, or
nonrandom groups
2. Treatment conditions: treatment variables of
importance (eg. duration, intensity, massed or
distributed etc.); control conditions same
3. Treatment givers: experience and background
characteristics: teachers, aides, parents
4. Environmental conditions (eg. classroom, after-school location, library)
CODING STUDIES- Independent
variables
Design (mean difference studies)
5. Time characteristics (when during the year, year of
occurrence)
6. Internal validity threats:
 maturation,
 testing,
 instrumentation,
 regression,
 history,
 selection
CODING STUDIES- Independent
variables
Mediators and Moderators
Mediators are indirect effects that explain part or all of
the relationship between hypothesized treatment and
effect:
(T) → M → e
In meta-analysis we establish that the effect of T on the outcome is nonzero, then test whether M is significantly related to the effect e. We do not routinely test whether T predicts M.
CODING STUDIES- Independent
variables
Mediators and Moderators
Moderators are variables for which the relationship changes from one moderator value to the next:
(T) → e = .3 for M=1
(T) → e = .7 for M=2
Coding Studies- Bias Mechanisms
 Researcher potential bias- membership in publishing
cohort/group
 Researcher orientation- theoretical stance or background
 Type of publication:
 Refereed vs. book chapter vs. dissertation vs. project report: do not
assume refereed articles are necessarily superior in design or
analysis- Mary Lee Smith’s study of gender bias in psychotherapy indicated publication bias by refereed journals against mixed-gender research showing no effects, with lower quality designs than non-refereed works
 Year of publication- have changing definitions affected
effects? Eg. Science interest vs. attitude- terms used
interchangeably in 1940s-1950s; shift to attitude in 1960s
 Journal of publication- do certain journals only accept
particular methods, approaches, theoretical stances?
Computing Effect Sizes- Mean
Difference Effects
 Glass: e = (MeanExperimental – MeanControl)/SD
o SD = Square Root (average of two variances) for randomized
designs
o SD = Control standard deviation when treatment might affect
variation (causes statistical problems in estimation)
 Hedges: Correct for sampling bias:
g = e[ 1 – 3/(4N – 9) ]
 where N = total # in experimental and control groups
 Sg = [ (Ne + Nc)/(NeNc) + g²/(2(Ne + Nc)) ]½
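A minimal Python sketch of these formulas (function names and the worked numbers, taken from the coding example earlier, are illustrative only):

```python
import math

def glass_e(mean_e, mean_c, sd_e, sd_c):
    # Glass e: mean difference over the pooled SD
    # (square root of the average of the two variances).
    sd_pooled = math.sqrt((sd_e ** 2 + sd_c ** 2) / 2)
    return (mean_e - mean_c) / sd_pooled

def hedges_g(e, n_e, n_c):
    # Hedges' small-sample correction: g = e[1 - 3/(4N - 9)]
    n = n_e + n_c
    return e * (1 - 3 / (4 * n - 9))

def se_g(g, n_e, n_c):
    # Sg = [ (Ne + Nc)/(Ne*Nc) + g^2/(2(Ne + Nc)) ]^(1/2)
    return math.sqrt((n_e + n_c) / (n_e * n_c) + g ** 2 / (2 * (n_e + n_c)))

# Numbers from the earlier coding slide (Exp 22/5, Con 19/4, n = 100 each):
e = glass_e(22, 19, 5, 4)                 # ≈ 0.66
g = hedges_g(e, 100, 100)
print(round(e, 4), round(g, 4), round(se_g(g, 100, 100), 4))
```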
Computing Effect Sizes- Mean Difference Effects Example from
Spencer ADHD Adult study
 Glass: e = (MeanExperimental – MeanControl)/SD
= |82 – 101|/21.55
= .8817
 Hedges: Correct for sampling bias:
g = e[ 1 – 3/(4N – 9) ]
= .8817 [1 – 3/(4*110 – 9)]
= .8762
Note: SD computed from the t-statistic of 4.2 given in the article:
e = t*(1/NE + 1/NC)½
(Excel workbook “Computing mean effects example”, eight effects)

effect | Mean E | Mean C | SDE | SDC | d | Hedges g | Ctrl N | Trmt N | N | w | wd
1 | 1 | 0.2 | 1 | 1 | 0.60 | 0.58 | 10 | 13 | 23 | 5.43 | 3.14
2 | 0.3 | -0.4 | 1 | 1 | -0.05 | -0.05 | 21 | 20 | 41 | 10.24 | -0.53
3 | 0.8 | 0.28 | 1 | 1 | 0.54 | 0.52 | 9 | 9 | 18 | 4.35 | 2.25
4 | 0.5 | -0.46 | 1 | 1 | 0.02 | 0.02 | 18 | 21 | 39 | 9.69 | 0.20
5 | 0.2 | -0.8 | 1 | 1 | -0.30 | -0.30 | 73 | 94 | 167 | 40.65 | -12.10
6 | 0.4 | -0.12 | 1 | 1 | 0.14 | 0.14 | 52 | 71 | 123 | 29.94 | 4.20
7 | 1 | 0.36 | 1 | 1 | 0.68 | 0.68 | 117 | 115 | 232 | 54.85 | 37.17
8 | 0.46 | -0.5 | 1 | 1 | -0.02 | -0.02 | 8 | 8 | 16 | 4.00 | -0.06
mean: weighted mean effect = 0.2154; mean Ctrl N = 38.50, Trmt N = 43.88, N = 82.38; Σw = 159.15, Σwd = 34.28
s(mean) = 0.0793
Computing Mean Difference Effect Sizes from
Summary Statistics
 t-statistic: e = t*(1/NE + 1/NC)½
 F(1, dferror): e = F½*(1/NE + 1/NC)½
 Point-biserial correlation: e = r*(dfe/(1 – r²))½*(1/NE + 1/NC)½
 Chi square (Pearson association): Φ = χ²/(χ² + N)
e = Φ½*(N/(1 – Φ))½*(1/NE + 1/NC)½
 ANOVA results: Compute R² = SSTreatment/SStotal
Treat R as a point-biserial correlation
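A hedged Python sketch of these conversions (function names are mine; the chi-square conversion follows the Φ formulation reconstructed above):

```python
import math

def e_from_t(t, n_e, n_c):
    # e = t * (1/Ne + 1/Nc)^(1/2)
    return t * math.sqrt(1 / n_e + 1 / n_c)

def e_from_f(f, n_e, n_c):
    # For a 1-df F statistic, t = F^(1/2)
    return e_from_t(math.sqrt(f), n_e, n_c)

def e_from_point_biserial(r, n_e, n_c):
    # e = r * (df_error/(1 - r^2))^(1/2) * (1/Ne + 1/Nc)^(1/2)
    df_error = n_e + n_c - 2
    return r * math.sqrt(df_error / (1 - r ** 2)) * math.sqrt(1 / n_e + 1 / n_c)

def e_from_chi_square(chi_sq, n_e, n_c):
    # Phi = chi^2/(chi^2 + N); e = Phi^(1/2) * (N/(1 - Phi))^(1/2) * (1/Ne + 1/Nc)^(1/2)
    n = n_e + n_c
    phi = chi_sq / (chi_sq + n)
    return math.sqrt(phi) * math.sqrt(n / (1 - phi)) * math.sqrt(1 / n_e + 1 / n_c)

# Spencer example: t = 4.2 with Ne = 78, Nc = 32 reproduces e ≈ 0.8817
print(round(e_from_t(4.2, 78, 32), 4))
print(round(e_from_f(17.64, 78, 32), 4))
```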
Excel workbook for Mean difference computation
STUDY# | OUTCOME# | STATISTIC TYPE | MEAN E | MEAN C | SD E | SD C | Ne | Nc | N | SUMMARY STATISTIC | d | hedges g
1 | 1 | means, SDs | 82 | 101 | 22 | 21 | 78 | 32 | 110 | 19 (mean difference) | 0.8817 | 0.875563
1 | 2 | t-statistic | 82 | 101 | | | 78 | 32 | 110 | 4.2 | 0.881705 | 0.875568
1 | 3 | F-statistic | | | | | 78 | 32 | 110 | 17.64 | 0.881705 | 0.875568
1 | 4 | point-biserial r | | | | | 78 | 32 | 110 | 0.374701 | 0.881705 | 0.875568
1 | 5 | chi square | | | | | 47 | 76 | 123 | 3.66 | 0.654634 | 0.650568
1 | 6 | p(t-statistic) | | | | | 47 | 76 | 123 | 0.05 | 0.654634 | 0.650568
(Intermediate computations shown in the workbook: Φ½ = 0.169989 for the chi-square row; t = 1.979764 recovered from p = .05 for the probability row.)
Story Book Reading
References
1 Wasik & Bond: Beyond the Pages of a Book: Interactive Book Reading
and Language Development in Preschool Classrooms. J. Ed Psych 2001
2 Justice & Ezell. Use of Storybook Reading to Increase Print Awareness in
At-Risk Children. Am J Speech-Language Path 2002
3 Coyne, Simmons, Kame’enui, & Stoolmiller. Teaching Vocabulary During
Shared Storybook Readings: An Examination of Differential Effects.
Exceptionality 2004
4 Fielding-Barnsley & Purdie. Early Intervention in the Home for Children
at Risk of Reading Failure. Support for Learning 2003
Coding the Outcome
1 open Wasik & Bond pdf
2 open excel file “computing mean effects example”
3 in Wasik find Ne and Nc
4 decide on effect(s) to be used- three outcomes are
reported: PPVT, receptive, and expressive vocabulary
at classroom and student level: what is the unit to be
focused on? Multilevel issue of student in classroom,
too few classrooms for reasonable MLM estimation,
classroom level is too small for good power- use
student data
Coding the Outcome
5 Determine which reported
data is usable: here the AM and
PM data are not usable because we don’t have the
breakdowns by teacher-classroom- only summary tests can
be used
6 Data for PPVT were analyzed
as a pre-post treatment
design, approximating a covariance analysis; thus, the
interaction is the only usable summary statistic, since it is
the differential effect of treatment vs. control adjusting for
pretest differences with a regression weight of 1 (ANCOVA
with a restricted covariance weight):
Interactionij = Grand Mean – Treatment effect – Pretest effect
= Y... – ai.. – b.j.
Graphically, this is the difference between the gain in Treatment (post – pre) and the gain in Control (post – pre)
• F for the interaction was F(1, 120) = 13.69, p < .001.
• Convert this to an effect size using excel file Outcomes Computation
• What do you get? (.6527)
Coding the Outcome
[Figure: pre/post means for Treatment and Control, showing the gains in each group; the treatment gain not “predicted” from the control gain is the interaction effect]
Coding the Outcome
7 For Expressive and Receptive Vocabulary, only the F-tests for Treatment-Control posttest results are given:
Receptive: F(1, 120) = 76.61, p < .001
Expressive: F(1, 120) = 128.43, p < .001
What are the effect sizes? Use Outcomes Computation
1.544
1.999
Getting a Study Effect
• Should we average the outcomes to get a single study
effect or
• Keep the effects separate as different constructs to
evaluate later (Expressive, Receptive) or
• Average the PPVT and receptive outcome as a total
receptive vocabulary effect?
Comment- since each effect is based on the same sample
size, the effects here can simply be averaged. If missing
data had been involved, then we would need to use the
weighted effect size equation, weighting the effects by
their respective sample size within the study
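A small sketch of that weighting idea, assuming sample-size weights (the outcome values are the ones computed on the preceding slides):

```python
def study_effect(effects, ns):
    # Weighted average of a study's outcome effects, weighting each effect by
    # the sample size it is based on; with equal Ns this is the plain average.
    return sum(e * n for e, n in zip(effects, ns)) / sum(ns)

# Wasik & Bond outcomes from the preceding slides (equal Ns, so a plain average):
print(round(study_effect([0.6527, 1.544, 1.999], [124, 124, 124]), 4))   # ≈ 1.40
```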
Getting a Study Effect
 For this example, let’s average the three effects to put into the Computing mean effects example excel file. Note that since we do not have means and SDs, we can put MeanC = 0 and MeanE = the effect size we calculated, put in the SDs as 1, and put in the correct sample sizes to get the Hedges g, etc.
 (.6567 + 1.553 + 2.01)/3 = 1.4036
2 Justice & Ezell
 Receptive: 0.403
 Expressive: 0.8606
 Average = 0.6303
3 Coyne et al
• Taught Vocab: 0.9385
• Untaught Vocab: 0.3262
• Average = 0.6323
4 Fielding
• PPVT: -0.0764
Computing mean effect size
 Use e:\\Computing mean effects1.xls
Study | Mean E | Mean C | SDE | SDC | d | Hedges g | Ctrl N | Trmt N | N | w | wd
1 | 1.4036 | 0 | 1 | 1 | 0.65 | 1.40 | 61 | 63 | 124 | 24.87 | 34.91
2 | 0.6303 | 0 | 1 | 1 | 0.63 | 0.61 | 15 | 15 | 30 | 7.16 | 4.39
3 | 0.6323 | 0 | 1 | 1 | 0.63 | 0.62 | 30 | 34 | 64 | 15.20 | 9.49
4 | 0.5 | -0.46 | 1 | 1 | 0.02 | -0.08 | 23 | 26 | 49 | 12.20 | -0.93
mean: weighted mean effect = 0.8054; mean Ctrl N = 32.25, Trmt N = 34.50, N = 66.75; Σw = 59.43, Σwd = 47.86
s(mean) = 0.1297
Computing Correlation Effect Sizes
 Reported Pearson correlation- use that
 Regression b-weight: use the t-statistic reported, e = t*(1/NE + 1/NC)½
 t-statistics: r = [ t²/(t² + dferror) ]½
 Sums of Squares from ANOVA or ANCOVA:
r = (R²partial)½
R²partial = SSTreatment/SStotal
Note: Partial ANOVA or ANCOVA results should be noted as such and compared with unadjusted effects
Computing Correlation Effect Sizes
 To compute correlation-based effects, you can use the excel program “Outcomes Computation correlations”
 The next slide gives an example.
 Emphasis is on disaggregating effects of unreliability and sample-based attenuation, and correcting sample-specific bias in correlation estimation
 For more information, see Hunter and Schmidt (2004): Methods of Meta-Analysis. Sage.
 Correlational meta-analyses have focused more on validity issues for particular tests vs. treatment or status effects using means
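A minimal sketch of the disattenuation idea referenced here, assuming the usual Hunter-Schmidt reliability correction (the spreadsheet on the next slide may apply additional corrections):

```python
import math

def disattenuate(r, rxx, ryy):
    # Correct an observed correlation for unreliability in x and y:
    # r_dis = r / sqrt(rxx * ryy)
    return r / math.sqrt(rxx * ryy)

def n_weighted_mean_r(rs, ns):
    # Sample-size weighted mean correlation across studies
    return sum(r * n for r, n in zip(rs, ns)) / sum(ns)

# First row of the example spreadsheet: r = 0.3526 with alphas .80 and .77
print(round(disattenuate(0.352646, 0.80, 0.77), 4))   # ≈ 0.4493
```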
Computing Correlation Effects Example
OUTCOME# | x alpha reliability | y alpha reliability | Ne | Nc | N | r | r corrected | s(r) | N·r | N·(r – rmean) | N·(r – rmean)²
1 | 0.80 | 0.77 | 47 | 76 | 123 | 0.352646 | 0.351381 | 0.079277 | 43.21983 | -5.1631 | 0.229994
2 | 0.70 | 0.80 | 33 | 55 | 88 | 0.323444 | 0.32178 | 0.095995 | 28.31665 | -6.26369 | 0.466933
3 | 0.75 | 0.90 | 22 | 45 | 67 | 0.190571 | 0.18918 | 0.118621 | 12.67504 | -13.6715 | 2.827859
4 | 0.88 | 0.70 | 111 | 111 | 222 | 0.67 | 0.669165 | 0.037071 | 148.5545 | 61.13375 | 16.73286
5 | 0.77 | 0.85 | 34 | 34 | 68 | 0.169989 | 0.168757 | 0.118639 | 11.47548 | -15.2751 | 3.46904
6 | 0.90 | 0.78 | 47 | 45 | 92 | 0.177133 | 0.17619 | 0.101539 | 16.20946 | -20.0091 | 4.389591

OUTCOME# | disattenuated r | N·rdis | N·(rdis – rdismean)² | s(edis)
1 | 0.449313 | 55.26549 | 0.356773 | 0.101008
2 | 0.432221 | 38.03544 | 0.442974 | 0.128279
3 | 0.231956 | 15.54103 | 4.92834 | 0.144381
4 | 0.853659 | 189.5123 | 27.27103 | 0.047233
5 | 0.210119 | 14.28811 | 5.839757 | 0.146647
6 | 0.211412 | 19.44991 | 7.831295 | 0.12119

Summary: r(mean) = 0.394623; rdis(mean) = 0.50317; Var(rmean) = 0.001138; s(rmean) = 0.033739; s(emean) = 0.080498; s(edismean) =
EFFECT SIZE DISTRIBUTION
 Hypothesis: All effects come from the same
distribution
 What does this look like for studies with different
sample sizes?
 Funnel plot- originally used to detect bias, can show
what the confidence interval around a given mean
effect size looks like
 Note: it is NOT smooth, since CI depends on both
sample sizes AND the effect size magnitude
EFFECT SIZE DISTRIBUTION
 Each mean effect SE can be computed from SE = 1/(Σw)½
For our 4 effects: 1: 0.200525
2: 0.373633
3: 0.256502
4: 0.286355
These are used to construct a 95% confidence interval
around each effect
EFFECT SIZE DISTRIBUTION- SE of
Overall Mean
 Overall mean effect SE can be computed from SE = 1/(Σw)½
For our effect mean of 0.8054, SE = 0.1297
Thus, a 95% CI is approximately (.54, 1.07)
The funnel plot can be constructed by constructing a SE
for each sample size pair around the overall mean- this
is how the figure below was constructed in SPSS, along
with each article effect mean and its CI
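A small sketch of these standard errors and confidence intervals, using the storybook example's effects and weights; a funnel plot would be drawn from the same quantities:

```python
import math

# Per-study standard errors and 95% CIs for the storybook example:
# SE = 1/sqrt(sum of weights) for each study's mean effect.
studies = {1: (1.40, 24.87), 2: (0.61, 7.16), 3: (0.62, 15.20), 4: (-0.08, 12.20)}

for study, (g, w) in studies.items():
    se = 1 / math.sqrt(w)
    print(study, g, round(se, 4), (round(g - 1.96 * se, 2), round(g + 1.96 * se, 2)))

# Overall mean effect and its CI (the slide reports roughly (.54, 1.07)):
mean_g, sum_w = 0.8054, 59.43
se_mean = 1 / math.sqrt(sum_w)
print("overall", mean_g, round(se_mean, 4),
      (round(mean_g - 1.96 * se_mean, 2), round(mean_g + 1.96 * se_mean, 2)))
```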
EFFECT SIZE DISTRIBUTION- Statistical test
 Hypothesis: All effects come from the same
distribution: Q-test
 Q is a chi-square statistic based on the variation of the effects around the mean effect:
Q = Σ wi (gi – gmean)², summed over the k effects
Q ~ χ²(k – 1)
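A short Python sketch of the Q test, using scipy for the chi-square p-value; the eight effects and weights are those from the earlier workbook:

```python
from scipy.stats import chi2

def q_statistic(gs, ws):
    # Q = sum_i w_i (g_i - g_mean)^2 with g_mean the weighted mean effect;
    # under homogeneity Q is chi-square distributed with k - 1 df.
    g_mean = sum(w * g for g, w in zip(gs, ws)) / sum(ws)
    q = sum(w * (g - g_mean) ** 2 for g, w in zip(gs, ws))
    df = len(gs) - 1
    return q, df, chi2.sf(q, df)

# The eight effects and weights from the earlier workbook:
gs = [0.58, -0.05, 0.52, 0.02, -0.30, 0.14, 0.68, -0.02]
ws = [5.43, 10.24, 4.35, 9.69, 40.65, 29.94, 54.85, 4.00]
print(q_statistic(gs, ws))   # roughly Q = 25 on 7 df, p < .001
```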
Example Computing Q Excel file
effect | d | w | Qi | prob(Qi) | sig?
1 | 0.58 | 5.43 | 0.7151598 | 0.397736175 | no
2 | -0.05 | 10.24 | 0.7326248 | 0.392033721 | no
3 | 0.52 | 4.35 | 0.3957949 | 0.52926895 | no
4 | 0.02 | 9.69 | 0.366319 | 0.545017585 | no
5 | -0.30 | 40.65 | 10.697349 | 0.001072891 | yes
6 | 0.14 | 29.94 | 0.1686616 | 0.681304025 | no
7 | 0.68 | 54.85 | 11.727452 | 0.000615849 | yes
8 | -0.02 | 4.00 | 0.2125622 | 0.644766516 | no
mean d = 0.2154
Q = 25.015924, df = 7, prob(Q) = 0.0007539
Computational Excel file
 Open excel file: Computing Q
 Enter the effects for the 4 studies, w for each study
(you can delete the extra lines or add new ones by
inserting as needed)
 from the Computing mean effect excel file
 What Q do you get?
Q = 39.57
df=3
p<.001
Interpreting Q
 Nonsignificant Q means all effects could have come
from the same distribution with a common mean
 Significant Q means one or more effects or a linear
combination of effects came from two different (or
more) distributions
 Effect component Q-statistic gives evidence for
variation from the mean hypothesized effect
Interpreting Q- nonsignificant
 Some theorists state you should stop- incorrect.
 Homogeneity of overall distribution does not imply
homogeneity with respect to hypotheses regarding
mediators or moderators
 Example- effects that are homogeneous overall may still correlate perfectly with year of publication (ie. r = 1.0, p < .001)
Interpreting Q- significant
 Significance means there may be relationships with
hypothesized mediators or moderators
 Funnel plot and effect Q-statistics can give evidence
for nonconforming effects that may or may not have
characteristics you selected and coded for
MEDIATORS
 Mediation: effect of an intervening variable that
changes the relationship between an independent and
dependent variable, either removing it or (typically)
reducing it.
 Path model conceptualization:
Treatment → Mediator → Outcome (with a direct Treatment → Outcome path)
MEDIATORS
 Statistical treatment typically requires both paths ‘a’ and ‘b’ to be significant to qualify as a mediator. Meta-analysis seems not to have investigated path ‘a’ but has referred to continuous predictors as regressors
 Lipsey and Wilson (2001) refer to this as “Weighted Regression Analysis”
Treatment –a→ Mediator –b→ Outcome (with a direct Treatment → Outcome path)
Weighted Regression Analysis
 Model: e = b X + residual
 Regression analog: Q = Qregression + Qresidual
 Analyze as “weighted least squares” in programs such
as SPSS or SAS
 In SPSS the weight function w is a variable used as the
weighting
Weighted Regression Analysis
 Emphasis on predictor and its standard error: the
usual regression standard error is incorrect, needs to
be corrected (Hedges & Olkin, 1985):
SE’b = SEb / (MSe)½
where SEb is the standard error reported in SPSS,
and MSe is the reported regression mean square error
Weighted Regression Q-statistics
 Qregression = Sum of Squaresregression
df = 1 for single predictor
 Qresidual = Sum of Squaresresidual
df = # studies - 2
Significance tests: Each is a chi square test with
appropriate degrees of freedom
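A sketch of this weighted regression analog in Python using statsmodels WLS; the moderator values x are made up for illustration, and the SE correction and Q decomposition follow the formulas above (an approximation of the SPSS procedure shown later, not a reproduction of it):

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2

# Study effects g, their weights w (inverse variances), and a continuous
# predictor x; the x values here are hypothetical, for illustration only.
g = np.array([0.58, -0.05, 0.52, 0.02, -0.30, 0.14, 0.68, -0.02])
w = np.array([5.43, 10.24, 4.35, 9.69, 40.65, 29.94, 54.85, 4.00])
x = np.array([5.0, 6.0, 6.0, 7.0, 7.0, 8.0, 9.0, 9.0])

X = sm.add_constant(x)
fit = sm.WLS(g, X, weights=w).fit()

# Hedges & Olkin correction of the printed standard error: SE'b = SEb / sqrt(MSe)
se_b_corrected = fit.bse[1] / np.sqrt(fit.mse_resid)

# Q decomposition: Q_regression = weighted regression SS (df = 1 predictor),
# Q_residual = weighted residual SS (df = k - 2); each tested against chi-square.
q_reg, q_res = fit.ess, fit.ssr
k = len(g)
print("b =", fit.params[1], "corrected SE =", se_b_corrected)
print("Q_reg =", q_reg, "p =", chi2.sf(q_reg, 1))
print("Q_res =", q_res, "p =", chi2.sf(q_res, k - 2))
```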
Sample meta data set (24 studies; columns: Age, two additional study descriptors left unlabeled in the original screenshot, the two group Ns, the Hedges d effect, a two-level group code, and the weight w):

Study | Age | x2 | x3 | n1 | n2 | Hedges d | Group | w
1 | 9 | 8.99 | 9.05 | 31 | 19 | 1.7026 | 2 | 8.781319
2 | 7 | 12.8 | 10.03 | 24 | 26 | 0.267 | 2 | 12.369946
3 | 8 | 9.09 | 11.56 | 30 | 20 | 0.7561 | 2 | 11.22962
4 | 8 | 10.86 | 10.52 | 25 | 25 | 0.6532 | 1 | 11.867084
5 | 9 | 7.73 | 7.86 | 22 | 28 | 1.5414 | 2 | 9.530347
6 | 6 | 10.11 | 10.77 | 24 | 26 | 0.3507 | 1 | 12.291338
7 | 7 | 8.57 | 6.91 | 34 | 16 | 0.4438 | 1 | 10.651743
8 | 7 | 9.59 | 8.53 | 22 | 28 | 1.1245 | 2 | 10.659409
9 | 8 | 7.98 | 10.92 | 30 | 20 | 0.542 | 1 | 11.591384
10 | 9 | 12.69 | 8.16 | 28 | 22 | 0.6337 | 1 | 11.739213
11 | 5 | 8.61 | 10.57 | 28 | 22 | -0.5976 | 2 | 11.80079
12 | 5 | 9.34 | 7.43 | 24 | 26 | 0.3771 | 1 | 12.262378
13 | 7 | 10.39 | 10.1 | 26 | 24 | 0.7234 | 2 | 11.714913
14 | 6 | 8.66 | 9.4 | 21 | 29 | 0.2413 | 1 | 12.094229
15 | 7 | 9.16 | 9.04 | 25 | 25 | 0.6637 | 1 | 11.847643
16 | 8 | 8.18 | 6.43 | 30 | 20 | 0.9038 | 2 | 10.928737
17 | 9 | 10.04 | 9.84 | 25 | 25 | 0.4603 | 1 | 12.177485
18 | 7 | 12.33 | 11.4 | 22 | 28 | 0.3948 | 1 | 12.087879
19 | 7 | 8.83 | 10.67 | 23 | 27 | -0.1726 | 2 | 12.374215
20 | 8 | 10.88 | 8.81 | 26 | 24 | 0.4633 | 1 | 12.154409
21 | 8 | 9.5 | 8.09 | 28 | 22 | 0.8481 | 2 | 11.317137
22 | 9 | 10.42 | 10.12 | 18 | 32 | 0.7114 | 1 | 10.885366
23 | 9 | 11.82 | 7.13 | 28 | 22 | 0.5407 | 1 | 11.891682
24 | 6 | 11.69 | 8.11 | 23 | 27 | 0.4926 | 1 | 12.05664
SPSS ANALYSIS OUTPUT
ANOVA(b,c)
Model | Sum of Squares | df | Mean Square | F | Sig.
Regression | 19.166 | 1 | 19.166 | 12.096 | .002(a)
Residual | 34.858 | 22 | 1.584 | |
Total | 54.024 | 23 | | |
a. Predictors: (Constant), AGE
b. Dependent Variable: HEDGE d*
c. Weighted Least Squares Regression - Weighted by w

Coefficients(a,b)
Model | B | Std. Error | Beta | t | Sig.
(Constant) | -1.037 | .465 | | -2.230 | .036
AGE | .215 | .062 | .596 | 3.478 | .002
a. Dependent Variable: HEDGE d*
b. Weighted Least Squares Regression - Weighted by w
Example
 See SPSS “sample meta data set.sav” or the excel
version “sample meta data set regression”
 The d effect is regressed on Age
 b = 0.215, SEb = 0.062, MSe = 1.584
 Thus, SE’b = 0.062 / (1.584)½
= 0.0493
A 95% CI around b gives (0.117, 0.313) for the regression
weight of age on outcome, p<.001
Q-statistic tests
 Qregression = 19.166 with df=1, p < .001
 Qresidual = 34.858 with df=22, p = .040
 So- are the residuals homogeneous or not? Given a
large number of significance tests, one might require
the Type I error rate for such tests to be .001 or
something small
MODERATORS
 Moderators are typically considered categorical
variables for which effects differ across categories or
levels
 In a limited form, this can be considered a treatment-moderator interaction
 Moderator analysis is more general in the sense that
any parameters of a within-category analysis may
change across categories (multigroup analysis concept
in Structural Equation Modeling)
Moderator Analysis- QBetween
 Analog to ANOVA- split into Qbetween and Qwithin
 QB = wiEi2– (wiEi)2 /wi
where Ei is the mean for category i and wi is the total
weight function for Ei
 Remember that you constructed a mean effect for a
study; the weight function for that mean effect is the
sum of the weights that made up the mean: Ei =
wjgj/wj for J effects in study I
wi = wj
Moderator Analysis- QWithin
 Analog to ANOVA- split into Qbetween and Qwithin
 QW = wj(i)(Ej(i) - MeanEi)2
I
j
where MeanEi is the mean for each category i, Ej(i) is
an effect j in category i and wj(i) is the weight function
for the jth effect in category i
 This is analogous to the within-subjects term in
ANOVA
 Lipsey and Wilson do not give a very good equation for
this on p. 121- confusing
Computational Issues
 The excel file “Meta means working
COMPUTATIONS” provides a workbook to compute
such effects
 An exemplar is shown below and is included in your set of materials
 Computation of QB and QW are done from the
summary data of Hedge’s g and sample sizes
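A Python sketch of the QB/QW split as defined on the earlier formula slides (the workshop spreadsheet shown below organizes the computation differently, so its intermediate values may not line up exactly):

```python
from collections import defaultdict
from scipy.stats import chi2

def q_between_within(gs, ws, groups):
    # QB = sum_i Wi*Ei^2 - (sum_i Wi*Ei)^2 / sum_i Wi, where Ei is the weighted
    # mean effect and Wi the summed weight in moderator category i.
    # QW = sum over categories of sum_j wj*(gj - Ei)^2.
    by_group = defaultdict(list)
    for g, w, c in zip(gs, ws, groups):
        by_group[c].append((g, w))
    qw = W = WE = WE2 = 0.0
    for items in by_group.values():
        wi = sum(w for _, w in items)
        ei = sum(g * w for g, w in items) / wi
        qw += sum(w * (g - ei) ** 2 for g, w in items)
        W, WE, WE2 = W + wi, WE + wi * ei, WE2 + wi * ei ** 2
    qb = WE2 - WE ** 2 / W
    df_b, df_w = len(by_group) - 1, len(gs) - len(by_group)
    return (qb, df_b, chi2.sf(qb, df_b)), (qw, df_w, chi2.sf(qw, df_w))

# Storybook example: Hedges g, weight, and design category for each study.
gs = [1.40, 0.61, 0.62, -0.08]
ws = [24.89, 7.17, 15.21, 12.19]
design = [2, 2, 1, 1]
print(q_between_within(gs, ws, design))   # QW ≈ 6.79, as in the worksheet
```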
(Exemplar from the “Meta means working COMPUTATIONS” workbook: 7 studies in 2 design groups)

STUDY ID | Hedges's G | SE | W | Ctrl N | Trmt N | N | Design | E*W | Wtd Within GP SS
1 | 0.4409 | 0.34 | 8.79 | 18 | 18 | 36 | 1 | 3.8743 | 0.0419
2 | 0.3180 | 0.31 | 10.47 | 24 | 19 | 43 | 1 | 3.3307 | 0.0304
3 | 0.3650 | 0.40 | 6.29 | 8 | 31 | 39 | 1 | 2.2961 | 0.0003
4 | 0.3617 | 0.16 | 40.44 | 73 | 94 | 167 | 2 | 14.627 | 4.4329
5 | 2.2650 | 0.23 | 18.46 | 52 | 71 | 123 | 2 | 41.813 | 45.6314
6 | 0.2513 | 0.26 | 15.36 | 30 | 32 | 62 | 2 | 3.8614 | 2.9937
7 | -0.061 | 0.29 | 11.74 | 17 | 38 | 55 | 2 | -0.719 | 6.6756

Design 1: Sum W(group) = 25.5512, group mean = 0.3718, Wgp*Mgp = 9.5011, Wgp*Mgp² = 3.5329, (Wgp*Mgp)² = 90.2704, within-group SS = 0.0726, SumEW(gp1) = 9.5011
Design 2: Sum W(group) = 86.0028, group mean = 0.6928, Wgp*Mgp = 59.5819, Wgp*Mgp² = 41.2777, (Wgp*Mgp)² = 3550.0033, within-group SS = 59.7336, SumEW(gp2) = 59.582
Totals: ΣW = 111.5540, ΣWgp*Mgp = 69.0830, ΣWgp*Mgp² = 44.8107, Σ(Wgp*Mgp)² = 3640.2737

All computations per Lipsey & Wilson
QB = 12.1783, df = 1.0000, p(QB) = 0.0005
QW = 59.8063, df = 5.0000, p(QW) = 0.0000
Q = 71.9845
Moderator Example
 For our Storybook reading example, we can break the
effect into two design types:
1 = no baseline equivalence
2 = baseline equivalence
Wasik = 2
Coyne = 1
Justice = 2
Fielding = 1
Moderator Example
• Select “Meta means working COMPUTATIONS” excel
file
• Reduce the number of studies to 2 in Design 1 and 2 in
design 2
• Insert the Hedge’s g effects, Cntrl N, Trmt N into the
correct boxes, all other effects will be correctly
computed
Storybook Reading Design Moderator effect
STUDY ID | Hedges's G | SE | W | Ctrl N | Trmt N | N | Design | E*W | Wtd Within GP SS
3 | 0.6200 | 0.26 | 15.21 | 30 | 34 | 64 | 1 | 9.4299 | 1.4757
4 | -0.0800 | 0.29 | 12.19 | 23 | 26 | 49 | 1 | -0.976 | 1.8406
1 | 1.4000 | 0.20 | 24.89 | 61 | 63 | 124 | 2 | 34.852 | 0.7763
2 | 0.6100 | 0.37 | 7.17 | 15 | 15 | 30 | 2 | 4.3717 | 2.6966

Design 1: Sum W(group) = 27.4039, group mean = 0.3085, Wgp*Mgp = 8.4544, Wgp*Mgp² = 2.6083, (Wgp*Mgp)² = 71.4763, within-group SS = 3.3163, SumEW(gp1) = 8.4544
Design 2: Sum W(group) = 32.0611, group mean = 1.2234, Wgp*Mgp = 39.2238, Wgp*Mgp² = 47.9868, (Wgp*Mgp)² = 1538.5078, within-group SS = 3.4729, SumEW(gp2) = 39.224
Totals: ΣW = 59.4650, ΣWgp*Mgp = 47.6782, ΣWgp*Mgp² = 50.5951, Σ(Wgp*Mgp)² = 1609.9840

All computations per Lipsey & Wilson
QB = 23.5206, df = 1.0000, p(QB) = 0.0000 (QB sig., the two design means are different)
QW = 6.7893, df = 5.0000, p(QW) = 0.2368 (QW nonsig., homogeneous effects within the two design categories)
Q = 30.3098
Meta-Analysis Report-Writing
 Traditional journal approach:
-Intro, lit review, methods, results, discussion
-References: background, studies in meta analysis*
-Tables: effects, SEs, Q’s, mediators, moderators
-Figures: Cluster diagrams, funnel plots, graphs of effects
by features
 Literature review approach:
-Thematic or theory focus: what lit exists, what does it say
-Tabular summarizations of works
Current Issues
 Multi-level models: Raudenbush & Bryk analysis
in HLM6
 Structural equation modeling in meta-analysis
 Clustering of effects: cluster analysis vs. latent
class modeling
 Multiple studies by same authors- how to treat
(beyond ignoring follow-on studies), the study
dependence problem
 Multiple meta-analyses: consecutive, overlapping
Multilevel Models
 Raudenbush & Bryk HLM 6
 One effect per study
 Two level model, mediators and moderators at the
second level
 Known variance for first level (wi)
 Mixed model analysis: requires 30+ studies for
reasonable estimation, per power analysis
 Maximum likelihood estimation of effects
Multilevel Models
 Model:
Level 1: gi = γi + ei
where there is one effect g per study i
Level 2: γi = β0 + β1Wi + ui
where W is a study-level predictor such as design in our earlier example
Assumption: the variance of gi at level 1 is known (given through wi)
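The HLM analysis itself is beyond a short example, but a method-of-moments (DerSimonian-Laird-style) sketch conveys the same known-variance, two-level idea; this is not the Raudenbush & Bryk estimator:

```python
def random_effects_mean(gs, ws):
    # Level-1 (within-study) variances are known: v_i = 1/w_i.  Estimate the
    # level-2 (between-study) variance tau^2 by method of moments from Q,
    # then pool with the combined weights 1/(v_i + tau^2).
    k = len(gs)
    w_sum = sum(ws)
    g_fixed = sum(w * g for g, w in zip(gs, ws)) / w_sum
    q = sum(w * (g - g_fixed) ** 2 for g, w in zip(gs, ws))
    c = w_sum - sum(w * w for w in ws) / w_sum
    tau2 = max(0.0, (q - (k - 1)) / c)
    w_star = [1.0 / (1.0 / w + tau2) for w in ws]
    g_re = sum(w * g for g, w in zip(gs, w_star)) / sum(w_star)
    se_re = (1.0 / sum(w_star)) ** 0.5
    return g_re, se_re, tau2

# Storybook example effects and weights:
gs = [1.40, 0.61, 0.62, -0.08]
ws = [24.87, 7.16, 15.20, 12.20]
print(random_effects_mean(gs, ws))
```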
Structural Equation Modeling in Meta-Analysis
 New area- early work in progress:
 Cheung & Chan (2005, Psych Methods), (2009, Struc
Eqn Modeling)- 2-step approach using correlation
matrices (variables with different scales) or covariance
matrices (variables measured on the same
scale/scaling)
 Stage 1: create pooled correlation (covariance) matrix
 Stage 2: fit SEM model to Stage 1 result
Structural Equation Modeling in Meta-Analysis
 Pooling correlation matrices:
 Get average r:
rmean(jk) = Σi wijk rijk / Σi wijk
where j and k are the subscripts for the correlation between variables j and k, and i is the ith data set being pooled
Cheung & Chan propose transforming all r’s to Fisher Z-statistics and computing the above in Z
If using Z, then the SE for Zi is (1 – r²)/n½
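A minimal sketch of pooling a single correlation across studies via Fisher's Z, assuming simple N weights rather than Cheung & Chan's full GLS weighting:

```python
import numpy as np

def pool_correlation(rs, ns):
    # Pool one cell (j,k) of the correlation matrix across studies:
    # transform each r to Fisher's Z, take a weighted average, transform back.
    rs, ns = np.asarray(rs, float), np.asarray(ns, float)
    z = np.arctanh(rs)                      # Fisher Z transform
    z_mean = np.sum(ns * z) / np.sum(ns)    # simple N weighting
    return np.tanh(z_mean)                  # back to the r metric

# The six observed correlations from the earlier example, pooled by N:
rs = [0.3526, 0.3234, 0.1906, 0.6700, 0.1700, 0.1771]
ns = [123, 88, 67, 222, 68, 92]
print(round(pool_correlation(rs, ns), 3))
```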
Structural Equation Modeling in Meta-Analysis
 Pooling correlation matrices: for each study,
COVg(rij, rkl) = [ .5 rij rkl (r²ik + r²il + r²jk + r²jl) + rik·rjl + ril·rjk – (rij·rik·ril + rji·rjk·rjl + rki·rkj·rkl + rli·rlj·rlk) ]/n
Let Σi = covariance matrix for study i and Gi = a {0,1} matrix that selects a particular correlation for examination.
Then G = [ G1′ | G2′ | … | Gk′ ]′
and Σ = diag [ Σ1, Σ2, … Σk ]
Structural Equation Modeling in Meta-Analysis
Beretvas & Furlow (2006) recommended transformations of the variances and covariances:
SDr(trans) = log(s) + 1/(2(n – 1))
COV(ri, rj)trans = r²ij/(2(n – 1))
The transformed covariance matrices for each study are then stacked as earlier
Clustering of effects: cluster analysis vs.
latent class modeling
 Suppose Q is significant. This implies some subset of effects is not equal to some other subset
 Cluster analysis uses study-level variables to
empirically cluster the effects into either
overlapping or nonoverlapping subsets
 Latent class analysis uses mixture modeling to
group into a specified # of classes
 Neither is fully theoretically developed- existing
theory is used, not clear how well they work
Multiple studies by same authors- how to treat (beyond
ignoring follow-on studies), the study dependence problem
 Example: in the storybook telling literature, Zevenbergen, Whitehurst, & Zevenbergen (2003) was a subset of Whitehurst, Zevenbergen, Crone, Schultz, Velging, & Fischel (1999), which was a subset of Whitehurst, Arnold, Epstein, Angell, Smith, & Fischel (1994)
 Should 1999 and 2003 be excluded, or included
with adjustments to 1994?
 Problem is similar to ANOVA: omnibus vs.
contrasts
 Currently, most people exclude later subset articles
Multiple meta-analyses:
consecutive, overlapping
 The problem of consecutive meta-analyses is now
arising:
 Follow-ons typically time-limited (after last m-a)
 Some m-a’s partially overlap others: how should they be
compared/integrated/evaluated?
 Are there statistical methods, such as the correlational
approach detailed above, that might include partial
dependence?
 Can time-relatedness be a predictor? Willson (1985)
CONCLUSIONS
 Meta-analysis continues to evolve
 Focus in future on complex modeling of outcomes
(SEM, for example)
 More work on integration of qualitative studies with
meta-analysis findings