Statistical Applications for Meta-Analysis
Robert M. Bernard
Centre for the Study of Learning and Performance and CanKnow
Concordia University
December 11, 2007
Module 2, Unit 13 of NCDDR’s course for NIDRR Grantees
Developing Evidence-Based Products Using the Systematic Review Process
Two Main Purposes of a Meta-Analysis
• Estimate the population central tendency and variability of effect sizes between an intervention (treatment) condition and a control condition.
• Explore unexplained variability through the analysis of methodological and substantive coded study features.
12/6/06 2
Effect Size Extraction
Effect size extraction involves locating descriptive or other statistical information contained in studies and converting it into a standard metric (effect size) by which studies can be compared and/or combined.
What is an Effect size?
• A descriptive metric that characterizes the standardized difference (in SD units) between the mean of a treatment group (educational intervention) and the mean of a control group
• Can also be calculated from correlational data derived from pre-experimental designs or from repeated measures designs
Characteristics of Effect Sizes
• Can be positive or negative
• Interpreted as a z-score, in SD units, although individual effect sizes are not part of a z-score distribution
• Can be aggregated with other effect sizes and subjected to statistical procedures such as ANOVA and multiple regression
• Magnitude interpretation: ≤ 0.20 is a small effect size, 0.50 is a moderate effect size, and ≥ 0.80 is a large effect size (Cohen, 1992)
Zero Effect Size
[Figure: overlapping distributions of the control condition and the treatment condition; ES = 0.00]
Moderate Effect Size
[Figure: distributions of the control condition and the treatment condition; ES = 0.40]
Large Effect Size
[Figure: distributions of the control condition and the treatment condition; ES = 0.85]
ES Calculation: Descriptive Statistics
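Glass’s Δ:

$$\Delta_{Glass} = \frac{\bar{Y}_{Experimental} - \bar{Y}_{Control}}{SD_{Control}}$$

Cohen’s d:

$$d_{Cohen} = \frac{\bar{Y}_{Experimental} - \bar{Y}_{Control}}{SD_{Pooled}}$$

where

$$SD_{pooled} = \sqrt{\frac{(N_E - 1)SD_E^2 + (N_C - 1)SD_C^2}{N_{Total} - 2}}$$

Note: the quantity under the square root is the same as adding the two SSs and dividing by df_Total.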
Adjustment for Small Samples: Hedges’ g
• Cohen’s d is inaccurate for small samples (N < 20), so Hedges’ g was developed (Hedges & Olkin, 1985)

$$g_{Hedges} = \frac{\bar{Y}_{Experimental} - \bar{Y}_{Control}}{\sqrt{\dfrac{(N_E - 1)SD_E^2 + (N_C - 1)SD_C^2}{N_{Tot} - 2}}} \left(1 - \frac{3}{4(N_E + N_C) - 9}\right)$$

g = Cohen’s d times a multiplier based on sample size
Example of ES Extraction with Descriptive Statistics
Study reports:
Treatment mean = 42.8, Treatment SD = 8.6, n = 26
Control mean = 32.5, Control SD = 7.4, n = 31
Procedure:
1. Calculate SD pooled
2. Calculate d and g

$$SD_{pooled} = \sqrt{\frac{(26 - 1)8.6^2 + (31 - 1)7.4^2}{57 - 2}} = \sqrt{\frac{1849 + 1642.8}{55}} = \sqrt{\frac{3491.8}{55}} = \sqrt{63.49} = 7.97$$

$$d = \frac{42.8 - 32.5}{7.97} = \frac{10.3}{7.97} = 1.29$$

$$g = d\left(1 - \frac{3}{4(N_E + N_C) - 9}\right) = 1.29\left(1 - \frac{3}{4(26 + 31) - 9}\right) = 1.29\left(1 - \frac{3}{219}\right) = 1.27$$
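The worked example above can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides; the function names (pooled_sd, cohens_d, hedges_g) are illustrative choices, and the formulas are the ones given for SD pooled, Cohen's d and Hedges' g.

```python
import math

def pooled_sd(n_e, sd_e, n_c, sd_c):
    """Pooled SD: sum of the two sums of squares divided by total df, then square-rooted."""
    ss = (n_e - 1) * sd_e ** 2 + (n_c - 1) * sd_c ** 2
    return math.sqrt(ss / (n_e + n_c - 2))

def cohens_d(mean_e, mean_c, sd_pooled):
    """Standardized mean difference in pooled-SD units."""
    return (mean_e - mean_c) / sd_pooled

def hedges_g(d, n_e, n_c):
    """Cohen's d times the small-sample multiplier 1 - 3 / (4N - 9)."""
    return d * (1 - 3 / (4 * (n_e + n_c) - 9))

# Values reported by the example study
sd_p = pooled_sd(26, 8.6, 31, 7.4)
d = cohens_d(42.8, 32.5, sd_p)
g = hedges_g(d, 26, 31)
print(f"SD pooled = {sd_p:.2f}, d = {d:.2f}, g = {g:.2f}")
```

The printed values should agree with the hand calculation to two decimals; small differences can appear because the slide rounds intermediate results.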
Alternative Methods of ES Extraction: Exact Statistics
• Study reports: t(60) = 2.66, p < .05

$$d = \frac{2t}{\sqrt{df}} = \frac{2(2.66)}{\sqrt{60}} = \frac{5.32}{7.746} = 0.687$$
• Study reports: F(1, 60) = 7.08, p < .05
Convert F to t and apply the above equation:

$$t = \sqrt{F} = \sqrt{7.08} = 2.66; \quad df = 60$$

$$d = \frac{2t}{\sqrt{df}} = \frac{2(2.66)}{\sqrt{60}} = \frac{5.32}{7.746} = 0.687$$
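The two conversions on this slide can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes a two-group comparison, so that F has one numerator degree of freedom and its error df plays the role of the t-test df (60 in the slide's calculation).

```python
import math

def d_from_t(t, df):
    """d = 2t / sqrt(df), for an independent-groups t-test."""
    return 2 * t / math.sqrt(df)

def d_from_f(f, df_error):
    """For a one-df F, t = sqrt(F); then apply the t formula.

    F carries no sign, so the direction of the effect has to be
    taken from the group means reported in the study.
    """
    return d_from_t(math.sqrt(f), df_error)

print(round(d_from_t(2.66, 60), 3))
print(round(d_from_f(7.08, 60), 3))
```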
Alternative Methods of ES Extraction: Exact p-value
• Study reports: t(60) is significant, p = 0.013
Look up the t-value for p = 0.013: t = 2.68

$$d = t\sqrt{\frac{1}{N_E} + \frac{1}{N_C}} = 2.68\sqrt{\frac{1}{31} + \frac{1}{31}} = 2.68(0.254) = 0.681$$
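Once the t-value has been looked up, the last step is a one-line calculation. The sketch below assumes the t-value (2.68) and the group sizes (31 and 31) from the slide; the function name is illustrative, and the table lookup itself is not reproduced here.

```python
import math

def d_from_t_and_group_sizes(t, n_e, n_c):
    """d = t * sqrt(1/n_e + 1/n_c), using the two group sizes directly."""
    return t * math.sqrt(1 / n_e + 1 / n_c)

d = d_from_t_and_group_sizes(2.68, 31, 31)
print(round(d, 3))
```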
Calculating Standard Error
The standard error of g is an estimate of the standard deviation of the sampling distribution of g, that is, of the values g would take across an infinite number of samples of a given size. Smaller samples tend to have larger standard errors and larger samples have smaller standard errors.
Standard Error:

$$\hat{\sigma}_g = \sqrt{\frac{1}{n_e} + \frac{1}{n_c} + \frac{g^2}{2(n_e + n_c)}} \left(1 - \frac{3}{4(n_e + n_c) - 9}\right)$$

$$\hat{\sigma}_g = \sqrt{\frac{1}{30} + \frac{1}{30} + \frac{0.687^2}{2(30 + 30)}} \left(1 - \frac{3}{4(30 + 30) - 9}\right)$$

$$\hat{\sigma}_g = \sqrt{0.071}\,(1 - 0.0130) = (0.266)(0.987) = 0.262$$
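A short Python sketch of the standard error calculation, using the slide's values (g = 0.687, 30 participants per group). The function name is an illustrative choice; the formula is the one shown above.

```python
import math

def se_hedges_g(g, n_e, n_c):
    """Standard error of g: SE of d times the small-sample multiplier."""
    n = n_e + n_c
    se_d = math.sqrt(1 / n_e + 1 / n_c + g ** 2 / (2 * n))
    return se_d * (1 - 3 / (4 * n - 9))

se = se_hedges_g(0.687, 30, 30)
print(round(se, 3))
```

Calling the same function with larger group sizes shows the standard error shrinking, which is the point made in the text.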
Test Statistic and Confidence Interval
Z-test (null test: g = 0):

$$z_g = \frac{g}{\hat{\sigma}_g} = \frac{0.687}{0.262} = 2.62$$

Conclusion: 2.62 > 1.96 (p < 0.05); reject H_0: g = 0.
95% Confidence Interval:

$$CI_U = g + (1.96 \times \hat{\sigma}_g) = 0.687 + (1.96 \times 0.26) = 1.197$$

$$CI_L = g - (1.96 \times \hat{\sigma}_g) = 0.687 - (1.96 \times 0.26) = 0.177$$

Conclusion: The 95% confidence interval (0.177 to 1.197) does not include 0, so g is significantly different from zero.
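The z-test and the confidence interval can be reproduced with the Python standard library. This is a sketch under the normal approximation used on the slide; the function names are hypothetical, and the two-tailed p-value is obtained from math.erfc rather than from a table.

```python
import math

def z_test(g, se):
    """Return z and its two-tailed p-value under the standard normal."""
    z = g / se
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

def ci95(g, se):
    """95% confidence limits: g -/+ 1.96 standard errors."""
    return g - 1.96 * se, g + 1.96 * se

z, p = z_test(0.687, 0.262)
lower, upper = ci95(0.687, 0.26)  # the slide rounds the SE to 0.26 here
print(f"z = {z:.2f}, p = {p:.4f}, CI = [{lower:.3f}, {upper:.3f}]")
```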
Other Important Statistics
Variance:

$$\hat{\sigma}_g^2 = (\hat{\sigma}_g)^2 = (0.262)^2 = 0.069$$

The variance is the standard error squared.
Inverse Variance (w):

$$w_i = \frac{1}{\hat{\sigma}_g^2} = \frac{1}{0.069} \approx 14.54$$

(The value 14.54 is obtained from the unrounded variance, 0.06878.)
The inverse variance (w) provides a weight that is proportional to the sample size. Larger samples are more heavily weighted than small samples.
Weighted g (g*w):

$$\text{Weighted } g = (w_i)(g_i) = 14.54 \times 0.687 = 9.99$$

Weighted g is the weight (w) times the value of g. It can be + or −, depending on the sign of g.
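As a quick check, the weight and the weighted g can be computed from the rounded standard error. This is a minimal sketch with an illustrative function name; because it starts from the rounded SE of 0.262, its results differ slightly from the slide's 14.54 and 9.99.

```python
def inverse_variance_weight(se):
    """Weight of a study: the reciprocal of its variance (SE squared)."""
    return 1.0 / se ** 2

g = 0.687
se_g = 0.262
w = inverse_variance_weight(se_g)
weighted_g = w * g
print(f"w = {w:.2f}, weighted g = {weighted_g:.2f}")
```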
| Study | Hedges’ g | Standard Error | Variance | 95% Lower Limit | 95% Upper Limit | z-Value | p-Value | Weights (w_i) | Weighted g (w_i)(g_i) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 2.44 | 0.22 | 0.05 | 2.00 | 2.88 | 10.89 | 0.00 | 19.94 | 48.65 |
| 2 | 2.31 | 0.17 | 0.03 | 1.98 | 2.64 | 13.59 | 0.00 | 34.60 | 79.93 |
| 3 | 1.38 | 0.30 | 0.09 | 0.79 | 1.97 | 4.60 | 0.00 | 11.11 | 15.33 |
| 4 | 1.17 | 0.19 | 0.04 | 0.80 | 1.54 | 6.16 | 0.00 | 27.70 | 32.41 |
| 5 | 0.88 | 0.17 | 0.03 | 0.55 | 1.21 | 5.18 | 0.00 | 34.60 | 30.45 |
| 6 | 0.81 | 0.12 | 0.01 | 0.57 | 1.05 | 6.75 | 0.00 | 69.44 | 56.25 |
| 7 | 0.80 | 0.08 | 0.01 | 0.64 | 0.96 | 10.00 | 0.00 | 156.25 | 125.00 |
| 8 | 0.68 | 0.18 | 0.03 | 0.33 | 1.03 | 3.78 | 0.00 | 30.86 | 20.99 |
| 9 | 0.63 | 0.51 | 0.26 | -0.37 | 1.63 | 1.24 | 0.22 | 3.84 | 2.42 |
| 10 | 0.60 | 0.13 | 0.02 | 0.35 | 0.85 | 4.62 | 0.00 | 59.17 | 35.50 |
| 11 | 0.58 | 0.29 | 0.08 | 0.01 | 1.15 | 2.00 | 0.05 | 11.89 | 6.90 |
| 12 | 0.32 | 0.11 | 0.01 | 0.10 | 0.54 | 2.91 | 0.00 | 82.64 | 26.45 |
| 13 | 0.25 | 0.08 | 0.01 | 0.09 | 0.41 | 3.13 | 0.00 | 156.25 | 39.06 |
| 14 | 0.24 | 0.20 | 0.04 | -0.15 | 0.63 | 1.20 | 0.23 | 25.00 | 6.00 |
| 15 | 0.24 | 0.15 | 0.02 | -0.05 | 0.53 | 1.60 | 0.11 | 44.44 | 10.67 |
| 16 | 0.19 | 0.12 | 0.01 | -0.05 | 0.43 | 1.58 | 0.11 | 69.44 | 13.19 |
| 17 | 0.11 | 0.12 | 0.01 | -0.13 | 0.35 | 0.92 | 0.36 | 69.44 | 7.64 |
| 18 | 0.09 | 0.08 | 0.01 | -0.07 | 0.25 | 1.13 | 0.26 | 156.25 | 14.06 |
| 19 | 0.02 | 0.24 | 0.06 | -0.45 | 0.49 | 0.08 | 0.93 | 17.36 | 0.35 |
| 20 | 0.02 | 0.17 | 0.03 | -0.31 | 0.35 | 0.12 | 0.91 | 34.60 | 0.69 |
| 21 | 0.02 | 0.26 | 0.07 | -0.49 | 0.53 | 0.08 | 0.94 | 14.79 | 0.30 |
| 22 | -0.11 | 0.24 | 0.06 | -0.58 | 0.36 | -0.46 | 0.65 | 17.36 | -1.91 |
| 23 | -0.11 | 0.28 | 0.08 | -0.66 | 0.44 | -0.39 | 0.69 | 12.76 | -1.40 |
| 24 | -0.18 | 0.22 | 0.05 | -0.61 | 0.25 | -0.82 | 0.41 | 20.66 | -3.72 |
| 25 | -0.30 | 0.06 | 0.00 | -0.42 | -0.18 | -5.00 | 0.00 | 277.78 | -83.33 |
| Total | 0.330 | 0.03 | 0.00 | 0.28 | 0.38 | 12.62 | 0.00 | 1458.21* | 481.87* |

Average g (g+) is the sum of the weighted gs divided by the sum of the weights.

$$g_{+} = \frac{\sum (w_i)(g_i)}{\sum w_i} = \frac{481.87}{1458.21} = 0.330$$
ES Extraction Exercise
Materials:
• EXCEL SE Calculator
• 5 studies from which to extract effect sizes
Mean and Variability
[Figure: distribution of effect sizes, labelled with the mean (ES+) and the variability around it]
Note: Results from Bernard, Abrami, Lou, et al. (2004), RER.
Mean Effect Size
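$$g_{+} = \frac{\sum_{i=1}^{k} (w_i)(g_i)}{\sum_{i=1}^{k} w_i} = \frac{481.87}{1458.21} = 0.330$$

$$\hat{\sigma}_{g_{+}}^2 = \frac{1}{\sum_{i=1}^{k} \dfrac{1}{\hat{\sigma}_i^2}} = \frac{1}{\sum_{i=1}^{k} w_i} = \frac{1}{1458.21} = 0.0007$$

$$\hat{\sigma}_{g_{+}} = \sqrt{\hat{\sigma}_{g_{+}}^2} = \sqrt{\frac{1}{1458.21}} = 0.0262$$

$$z_{g_{+}} = \frac{g_{+}}{\hat{\sigma}_{g_{+}}} = \frac{0.3305}{0.02619} = 12.62$$

Conclusion: Mean g+ = 0.33, and it is significantly different from zero.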
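The mean effect size, its standard error and the z-test can be recomputed from the per-study values. The sketch below copies the 25 g values and weights from the study table; the variable and function names are illustrative, and because the table entries are rounded, the sums differ from the slide's totals in the second decimal.

```python
import math

# g and inverse-variance weight for studies 1 to 25 (rounded table values)
g = [2.44, 2.31, 1.38, 1.17, 0.88, 0.81, 0.80, 0.68, 0.63, 0.60,
     0.58, 0.32, 0.25, 0.24, 0.24, 0.19, 0.11, 0.09, 0.02, 0.02,
     0.02, -0.11, -0.11, -0.18, -0.30]
w = [19.94, 34.60, 11.11, 27.70, 34.60, 69.44, 156.25, 30.86, 3.84, 59.17,
     11.89, 82.64, 156.25, 25.00, 44.44, 69.44, 69.44, 156.25, 17.36, 34.60,
     14.79, 17.36, 12.76, 20.66, 277.78]

def mean_effect_size(gs, ws):
    """Return (g+, SE of g+, z) for a fixed-effect weighted mean."""
    sum_w = sum(ws)
    g_plus = sum(wi * gi for gi, wi in zip(gs, ws)) / sum_w
    se = math.sqrt(1 / sum_w)  # variance of g+ is 1 / sum of weights
    return g_plus, se, g_plus / se

g_plus, se_plus, z_plus = mean_effect_size(g, w)
print(f"g+ = {g_plus:.3f}, SE = {se_plus:.4f}, z = {z_plus:.2f}")
```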
Variability (Q-Statistic)
Question: How much variability surrounds g+ and is it significant? Are the effect sizes heterogeneous or homogeneous?

$$Q = \sum_{i=1}^{k} \frac{(g_i - g_{+})^2}{\hat{\sigma}_{g_i}^2}$$

$$Q_{Total} = \frac{(2.44 - 0.330)^2}{0.0502} + \frac{(2.31 - 0.330)^2}{0.0289} + \ldots + \frac{(-0.18 - 0.330)^2}{0.0484} + \frac{(-0.30 - 0.330)^2}{0.0036} = 469.54$$

(Each variance is taken as 1/w_i from the study table.)
Q-value = 469.54, df (Q) = 24, P-value < 0.001. Tested with the χ² distribution.
Conclusion: Effect sizes are heterogeneous.
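The Q statistic can be recomputed from the same 25 studies. In this sketch each term is written as w_i (g_i − g+)², which equals the squared deviation divided by the variance because w_i = 1/variance. The names are illustrative, and the critical value 36.415 (chi-square, df = 24, alpha = .05) is a standard table value hard-coded here rather than computed.

```python
# g and inverse-variance weight for studies 1 to 25 (rounded table values)
g = [2.44, 2.31, 1.38, 1.17, 0.88, 0.81, 0.80, 0.68, 0.63, 0.60,
     0.58, 0.32, 0.25, 0.24, 0.24, 0.19, 0.11, 0.09, 0.02, 0.02,
     0.02, -0.11, -0.11, -0.18, -0.30]
w = [19.94, 34.60, 11.11, 27.70, 34.60, 69.44, 156.25, 30.86, 3.84, 59.17,
     11.89, 82.64, 156.25, 25.00, 44.44, 69.44, 69.44, 156.25, 17.36, 34.60,
     14.79, 17.36, 12.76, 20.66, 277.78]

def q_statistic(gs, ws):
    """Weighted sum of squared deviations of each g from the weighted mean g+."""
    g_plus = sum(wi * gi for gi, wi in zip(gs, ws)) / sum(ws)
    return sum(wi * (gi - g_plus) ** 2 for gi, wi in zip(gs, ws))

CHI2_CRIT_DF24 = 36.415  # table value for alpha = .05, df = 24

q = q_statistic(g, w)
df = len(g) - 1
print(f"Q = {q:.2f}, df = {df}, heterogeneous: {q > CHI2_CRIT_DF24}")
```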
Homogeneity vs. Heterogeneity of Effect Size
• If homogeneity of effect size is established, then the studies in the meta-analysis can be thought of as sharing the same effect size (i.e., the mean)
• If homogeneity of effect size is violated (heterogeneity of effect size), then no single effect size is representative of the collection of studies (i.e., the “true” mean effect size remains unknown)
Statistics in Comprehensive Meta-Analysis™
Effect size and 95% confidence interval: number of studies = 25, point estimate = 0.33, standard error = 0.03, variance = 0.00, lower limit = 0.28, upper limit = 0.38
Test of null (2-tail): Z-value = 12.62, P-value = 0.00
Heterogeneity: Q-value = 469.54, df (Q) = 24, P-value = 0.00
Interpretation:
Moderate ES for all outcomes (g+ = 0.33) in favor of the intervention condition.
Homogeneity of ES is violated. Q-value is significant (i.e., there is too much variability for g+ to represent a true average in the population).
Comprehensive Meta-Analysis 2.0.027 is a trademark of BioStat®
Back to ES Calculator
1. Interpretation of Mean Effect Size
2. Interpretation of Q-Statistic
Homogeneity versus Heterogeneity of Effect Size
[Figure: two distributions of effect sizes around g+]
Distribution 1: Homogeneous. No variation left to be explained by moderators.
Distribution 2: Heterogeneous. Gray shaded area is variation left to be explained by moderators.
Examining the Study Feature “Method of ES Extraction”
[Figure: the overall effect (g+ = +0.33) divided by method of ES extraction into Exact Descriptive, Estimated Statistics, and Exact Statistics]
Tests of Levels of “Method of ES Extraction”
| Group | N of Studies | Point estimate | Standard error | Lower limit | Upper limit | Q-value | df (Q) | P-value |
|---|---|---|---|---|---|---|---|---|
| Descriptive Statistics | 15 | 0.29 | 0.03 | 0.22 | 0.35 | 402.56 | 14 | 0.00 |
| Est. Statistics | 3 | 0.21 | 0.06 | 0.09 | 0.33 | 0.97 | 2 | 0.62 |
| Exact Statistics | 7 | 0.63 | 0.06 | 0.50 | 0.75 | 37.00 | 6 | 0.00 |
| Total within | | | | | | 442.50 | 22 | 0.00 |
| Total between | | | | | | 27.04 | 2 | 0.00 |
| Overall | 25 | 0.33 | 0.03 | 0.28 | 0.38 | 469.54 | 24 | 0.00 |

Interpretation:
Small to moderate ESs for all categories in favor of the intervention condition.
Homogeneity of ES is violated. The total within-group Q-value is significant, as are the Q-values for two of the three categories (i.e., “Method of ES Extraction” does not explain enough variability to reach homogeneity).
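The table rests on the partition Q total = Q within + Q between (here 469.54 = 442.50 + 27.04). The sketch below illustrates that partition on a small made-up data set; the effect sizes, weights and group labels are hypothetical and are not the studies from the slides, and the function names are illustrative.

```python
def weighted_mean(gs, ws):
    return sum(wi * gi for gi, wi in zip(gs, ws)) / sum(ws)

def q_statistic(gs, ws):
    """Weighted sum of squared deviations from the weighted mean."""
    m = weighted_mean(gs, ws)
    return sum(wi * (gi - m) ** 2 for gi, wi in zip(gs, ws))

def partition_q(groups):
    """groups maps a label to (list of g, list of w). Returns (total, within, between)."""
    all_g = [gi for gs, _ in groups.values() for gi in gs]
    all_w = [wi for _, ws in groups.values() for wi in ws]
    q_total = q_statistic(all_g, all_w)
    q_within = sum(q_statistic(gs, ws) for gs, ws in groups.values())
    return q_total, q_within, q_total - q_within

# Hypothetical data: two levels of a coded study feature
groups = {
    "level A": ([0.2, 0.4], [10.0, 10.0]),
    "level B": ([0.6, 0.8], [10.0, 10.0]),
}
q_total, q_within, q_between = partition_q(groups)
print(q_total, q_within, q_between)
```

Q between is tested on (number of groups − 1) degrees of freedom and each within-group Q on (group size − 1), which is how the df column of the table is built.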
Meta-Regression
Seeks to determine if “Method of ES Extraction” predicts effect size.

| Source | Q | df | p-value |
|---|---|---|---|
| Model | 15.50 | 1 | 0.00 |
| Residual | 454.04 | 23 | 0.00 |
| Total | 469.54 | 24 | 0.00 |

Conclusion: “Method of ES Extraction” is a significant predictor of ES, but ES is still heterogeneous.
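One way to see where the Model and Residual rows come from is a weighted least-squares regression of g on a single predictor, with the inverse-variance weights as regression weights. The sketch below is an assumption about the general technique (a fixed-effect meta-regression with one predictor), not a reproduction of the software output; the data are hypothetical and the function names are illustrative.

```python
def weighted_mean(values, weights):
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)

def meta_regression_q(x, g, w):
    """Return (slope, Q model, Q residual, Q total) for one predictor."""
    x_bar = weighted_mean(x, w)
    g_bar = weighted_mean(g, w)
    sxx = sum(wi * (xi - x_bar) ** 2 for xi, wi in zip(x, w))
    sxy = sum(wi * (xi - x_bar) * (gi - g_bar) for xi, gi, wi in zip(x, g, w))
    slope = sxy / sxx
    q_total = sum(wi * (gi - g_bar) ** 2 for gi, wi in zip(g, w))
    q_model = slope ** 2 * sxx  # variability explained by the predictor (df = 1)
    return slope, q_model, q_total - q_model, q_total

# Hypothetical data: x is a numeric code for the study feature
x = [0, 0, 1, 1]
g = [0.2, 0.4, 0.6, 0.8]
w = [10.0, 10.0, 10.0, 10.0]
slope, q_model, q_resid, q_total = meta_regression_q(x, g, w)
print(slope, q_model, q_resid, q_total)
```

A significant Q model with a still-significant Q residual is the pattern reported on the slide: the predictor matters, but heterogeneity remains.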
Sensitivity Analysis
• Tests the robustness of the findings
• Asks the question: Will these results stand up when potentially distorting or deceptive elements, such as outliers, are removed?
• Particularly important to examine the robustness of the effect sizes of study features, as these are usually based on smaller numbers of outcomes
Sensitivity Analysis: Low Standard Error Samples
g+ recomputed with each study removed in turn:

| Study | Point | SE | z-Value | p-Value |
|---|---|---|---|---|
| 1 | 0.30 | 0.03 | 11.42 | 0.00 |
| 2 | 0.28 | 0.03 | 10.65 | 0.00 |
| 3 | 0.32 | 0.03 | 12.26 | 0.00 |
| 4 | 0.31 | 0.03 | 11.88 | 0.00 |
| 5 | 0.32 | 0.03 | 11.96 | 0.00 |
| 6 | 0.31 | 0.03 | 11.42 | 0.00 |
| 7 | 0.27 | 0.03 | 9.89 | 0.00 |
| 8 | 0.32 | 0.03 | 12.20 | 0.00 |
| 9 | 0.33 | 0.03 | 12.57 | 0.00 |
| 10 | 0.32 | 0.03 | 11.93 | 0.00 |
| 11 | 0.33 | 0.03 | 12.49 | 0.00 |
| 12 | 0.33 | 0.03 | 12.28 | 0.00 |
| 13 | 0.34 | 0.03 | 12.27 | 0.00 |
| 14 | 0.33 | 0.03 | 12.57 | 0.00 |
| 15 | 0.33 | 0.03 | 12.53 | 0.00 |
| 16 | 0.34 | 0.03 | 12.58 | 0.00 |
| 17 | 0.34 | 0.03 | 12.73 | 0.00 |
| 18 | 0.36 | 0.03 | 12.96 | 0.00 |
| 19 | 0.33 | 0.03 | 12.69 | 0.00 |
| 20 | 0.34 | 0.03 | 12.75 | 0.00 |
| 21 | 0.33 | 0.03 | 12.68 | 0.00 |
| 22 | 0.34 | 0.03 | 12.74 | 0.00 |
| 23 | 0.33 | 0.03 | 12.71 | 0.00 |
| 24 | 0.34 | 0.03 | 12.81 | 0.00 |
| 25 | 0.48 | 0.03 | 16.45 | 0.00 |
| Total | 0.33 | 0.03 | 12.62 | 0.00 |
Sensitivity Analysis of CT Data
[Figure: chart for Studies 1 to 25; vertical axis from 0.00 to 0.60]
Studies with High Weighted g

| Study | g | g+ | g+ with study removed | Difference | (w) | (g)(w) | % Influence* |
|---|---|---|---|---|---|---|---|
| Study 7 | 0.80 | 0.330 | 0.27 | -0.06 | 156.25 | 125.00 | 25.9 |
| Study 13 | 0.25 | 0.330 | 0.34 | +0.01 | 156.25 | 39.06 | 8.1 |
| Study 18 | 0.09 | 0.330 | 0.36 | +0.03 | 156.25 | 14.06 | 2.9 |
| Study 25 | -0.30 | 0.330 | 0.48 | +0.15 | 277.78 | -83.33 | 17.3 |
| Totals | | | | | 746.53 | | 54.2 |

*% Influence = (g)(w)/481.87 × 100, reported as an absolute value
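The "g+ with study removed" figures can be reproduced with a leave-one-out loop over the 25 studies. This is a sketch with illustrative names, using the rounded g values and weights from the study table, so results agree with the slides to about two decimals.

```python
# g and inverse-variance weight for studies 1 to 25 (rounded table values)
g = [2.44, 2.31, 1.38, 1.17, 0.88, 0.81, 0.80, 0.68, 0.63, 0.60,
     0.58, 0.32, 0.25, 0.24, 0.24, 0.19, 0.11, 0.09, 0.02, 0.02,
     0.02, -0.11, -0.11, -0.18, -0.30]
w = [19.94, 34.60, 11.11, 27.70, 34.60, 69.44, 156.25, 30.86, 3.84, 59.17,
     11.89, 82.64, 156.25, 25.00, 44.44, 69.44, 69.44, 156.25, 17.36, 34.60,
     14.79, 17.36, 12.76, 20.66, 277.78]

def leave_one_out(gs, ws):
    """Weighted mean g+ recomputed with each study dropped in turn."""
    sum_w = sum(ws)
    sum_wg = sum(wi * gi for gi, wi in zip(gs, ws))
    return [(sum_wg - wi * gi) / (sum_w - wi) for gi, wi in zip(gs, ws)]

loo = leave_one_out(g, w)
g_plus = sum(wi * gi for gi, wi in zip(g, w)) / sum(w)
for study in (7, 13, 18, 25):
    removed = loo[study - 1]
    print(f"Study {study}: g+ without it = {removed:.2f}, "
          f"difference = {removed - g_plus:+.2f}")
```

Study 25 stands out because it combines the largest weight with a negative effect size, so removing it raises g+ the most.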
Steps in Controlling for Study Quality
• Step one: Are the effect sizes homogeneous?
• Step two: Does study quality explain the heterogeneity?
• Step three: Which qualities of studies matter?
• Step four: How do we deal with the differences?
Controlling Study Quality Using Dummy Coding in Meta-Regression
| Categories of Study Quality | Dummy 1 | Dummy 2 | Dummy 3 | Dummy 4 |
|---|---|---|---|---|
| 1 | 0 | 0 | 0 | 0 |
| 2 | 1 | 0 | 0 | 0 |
| 3 | 0 | 1 | 0 | 0 |
| 4 | 0 | 0 | 1 | 0 |
| 5 | 0 | 0 | 0 | 1 |
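Dummy coding of this kind can be generated programmatically. The sketch below assumes five study-quality categories with category 1 as the reference (all dummies zero), which is one reading of the slide's table; the function name and the choice of reference category are assumptions made for illustration.

```python
def dummy_code(category, n_categories=5, reference=1):
    """Return the k - 1 dummy variables for one category.

    The reference category gets all zeros; every other category
    gets a single 1 in its own column.
    """
    if not 1 <= category <= n_categories:
        raise ValueError("category out of range")
    return [1 if category == level else 0
            for level in range(1, n_categories + 1)
            if level != reference]

for category in range(1, 6):
    print(category, dummy_code(category))
```

The four dummy columns can then be entered together as predictors in the meta-regression, so that each coefficient compares one quality category with the reference category.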
Adjusting Effect Sizes
| Categories | g+ Before Adjustment | g+ After Adjustment | Adjusted Heterogeneity Q Within | df | p |
|---|---|---|---|---|---|
| 1 | -0.185 | -0.185 | 2.243 | 3 | 0.524 |
| 2 | -0.218 | -0.218 | 3.302 | 3 | 0.347 |
| 3 | 0.683 | -0.065 | 3.252 | 3 | 0.354 |
| 4 | 0.565 | -0.183 | 4.953 | 3 | 0.175 |
| 5 | 0.390 | -0.358 | 1.985 | 3 | 0.576 |
| Total | 0.247 | -0.202 | 15.734 | 15 | 0.400 |
Selected References
Bernard, R. M., Abrami, P. C., Lou, Y., Borokhovski, E., Wade, A., Wozney, L., Wallet, P. A., Fiset, M., & Huang, B. (2004). How does distance education compare to classroom instruction? A meta-analysis of the empirical literature. Review of Educational Research, 74(3), 379-439.
Glass, G. V., McGaw, B., & Smith, M. L. (1981). Meta-analysis in social research. Beverly Hills, CA: Sage.
Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis. Orlando, FL: Academic Press.
Hedges, L. V., Shymansky, J. A., & Woodworth, G. (1989). A practical guide to modern methods of meta-analysis. [ERIC Document Reproduction Service No. ED 309 952].