Statistical Applications for Meta-Analysis Robert M. Bernard Centre for the Study of Learning and Performance and CanKnow Concordia University December 11, 2007 Module 2, Unit 13 of.


Statistical Applications for Meta-Analysis

Robert M. Bernard

Centre for the Study of Learning and Performance and CanKnow

Concordia University

December 11, 2007

Module 2, Unit 13 of NCDDR’s course for NIDRR Grantees

Developing Evidence-Based Products Using the Systematic Review Process

Two Main Purposes of a Meta-Analysis

• Estimate the population central tendency and variability of effect sizes between an intervention (treatment) condition and a control condition.

• Explore unexplained variability through the analysis of methodological and substantive coded study features.

12/6/06 2

Effect Size Extraction

Effect size extraction involves locating descriptive or other statistical information contained in studies and converting it into a standard metric (effect size) by which studies can be compared and/or combined.

What Is an Effect Size?

• A descriptive metric that characterizes the standardized difference (in SD units) between the mean of a treatment group (educational intervention) and the mean of a control group

• Can also be calculated from correlational data derived from pre-experimental designs or from repeated measures designs

Characteristics of Effect Sizes

• Can be positive or negative

• Interpreted as a z-score, in SD units, although individual effect sizes are not part of a z-score distribution

• Can be aggregated with other effect sizes and subjected to statistical procedures such as ANOVA and multiple regression

• Magnitude interpretation: ≤ 0.20 is a small effect size, 0.50 is a moderate effect size, and ≥ 0.80 is a large effect size (Cohen, 1992)

Zero Effect Size

[Figure: overlapping distributions for the control condition and the treatment condition; ES = 0.00]

Moderate Effect Size

[Figure: distributions for the control condition and the treatment condition; ES = 0.40]

Large Effect Size

[Figure: distributions for the control condition and the treatment condition; ES = 0.85]

ES Calculation: Descriptive Statistics

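Glass’s Δ:

$\Delta_{Glass} = \dfrac{\bar{Y}_{Experimental} - \bar{Y}_{Control}}{SD_{Control}}$

Cohen’s d:

$d_{Cohen} = \dfrac{\bar{Y}_{Experimental} - \bar{Y}_{Control}}{SD_{Pooled}}$

Pooled standard deviation:

$SD_{Pooled} = \sqrt{\dfrac{(N_E - 1)SD_E^2 + (N_C - 1)SD_C^2}{N_{Total} - 2}}$

Note: the quantity under the square root is the same as adding the two SSs and dividing by df Total.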
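The three formulas above can be sketched in a few lines of Python. This is a minimal illustration rather than the course's ES calculator; the function names are chosen here for readability, and the numbers in the usage lines are the summary statistics from the worked example in this unit (treatment: mean 42.8, SD 8.6, n 26; control: mean 32.5, SD 7.4, n 31).

```python
import math

def pooled_sd(n_e, sd_e, n_c, sd_c):
    """Pooled SD: add the two sums of squares, divide by total df, take the root."""
    ss_e = (n_e - 1) * sd_e ** 2
    ss_c = (n_c - 1) * sd_c ** 2
    return math.sqrt((ss_e + ss_c) / (n_e + n_c - 2))

def glass_delta(mean_e, mean_c, sd_c):
    """Glass's delta: mean difference in control-group SD units."""
    return (mean_e - mean_c) / sd_c

def cohen_d(mean_e, mean_c, n_e, sd_e, n_c, sd_c):
    """Cohen's d: mean difference in pooled SD units."""
    return (mean_e - mean_c) / pooled_sd(n_e, sd_e, n_c, sd_c)

sd_p = pooled_sd(26, 8.6, 31, 7.4)
d = cohen_d(42.8, 32.5, 26, 8.6, 31, 7.4)
delta = glass_delta(42.8, 32.5, 7.4)
print(round(sd_p, 2), round(d, 2), round(delta, 2))
```

Glass’s Δ and Cohen’s d differ only in the denominator, so they diverge when the two groups have noticeably different standard deviations.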

Adjustment for Small Samples: Hedges’ g

• Cohen’s d is inaccurate for small samples (N < 20), so Hedges’ g was developed (Hedges & Olkin, 1985)

$g_{Hedges} = \dfrac{\bar{Y}_{Experimental} - \bar{Y}_{Control}}{\sqrt{\dfrac{(N_E - 1)SD_E^2 + (N_C - 1)SD_C^2}{N_{Tot} - 2}}} \left(1 - \dfrac{3}{4(N_E + N_C) - 9}\right)$

g = Cohen’s d times a multiplier based on sample size
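A short Python sketch of the small-sample multiplier may help show how quickly it approaches 1 as samples grow. The function names are illustrative assumptions, not part of any named package.

```python
def hedges_correction(n_e, n_c):
    """Small-sample multiplier: 1 - 3 / (4(N_E + N_C) - 9)."""
    return 1 - 3 / (4 * (n_e + n_c) - 9)

def hedges_g(d, n_e, n_c):
    """Hedges' g is Cohen's d times the small-sample multiplier."""
    return d * hedges_correction(n_e, n_c)

# The multiplier shrinks d noticeably only when the groups are small.
for n in (5, 10, 50, 200):
    print(n, round(hedges_correction(n, n), 4))
```

Because the multiplier is always below 1, g is always slightly closer to zero than d, and the two are nearly identical for large samples.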

Example of ES Extraction with Descriptive Statistics

Study reports:

Treatment mean = 42.8, Treatment SD = 8.6, n = 26

Control mean = 32.5, Control SD = 7.4, n = 31

Procedure:

Calculate SD pooled

Calculate d and g

$SD_{pooled} = \sqrt{\dfrac{(26 - 1)8.6^2 + (31 - 1)7.4^2}{57 - 2}} = \sqrt{\dfrac{1849 + 1642.8}{55}} = \sqrt{\dfrac{3491.8}{55}} = \sqrt{63.49} = 7.97$

$d = \dfrac{42.8 - 32.5}{7.97} = \dfrac{10.3}{7.97} = 1.29$

$g = d\left(1 - \dfrac{3}{4(N_E + N_C) - 9}\right) = 1.29\left(1 - \dfrac{3}{4(26 + 31) - 9}\right) = 1.29\left(1 - \dfrac{3}{219}\right) = 1.27$

Alternative Methods of ES Extraction: Exact Statistics

• Study reports: t(60) = 2.66, p < .05

$d = \dfrac{2t}{\sqrt{df}} = \dfrac{2(2.66)}{\sqrt{60}} = \dfrac{5.32}{7.746} = 0.687$

• Study reports: F(1, 61) = 7.08, p < .05

Convert F to t and apply the above equation:

$t = \sqrt{F} = \sqrt{7.08} = 2.66; \quad df = 60$

$d = \dfrac{2t}{\sqrt{df}} = \dfrac{2(2.66)}{\sqrt{60}} = \dfrac{5.32}{7.746} = 0.687$
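The two conversions on this slide can be sketched in Python as follows. The function names are illustrative. Note that an F ratio carries no sign, so the direction of the effect has to be taken from the study's reported means; the `sign` parameter is an assumption added here to make that explicit.

```python
import math

def d_from_t(t, df):
    """d = 2t / sqrt(df), for a two-group comparison."""
    return 2 * t / math.sqrt(df)

def d_from_f(f, df_error, sign=1):
    """For F with 1 numerator df, t = sqrt(F); sign (+1 or -1) comes from the means."""
    return sign * d_from_t(math.sqrt(f), df_error)

print(round(d_from_t(2.66, 60), 3))
print(round(d_from_f(7.08, 60), 3))
```

The conversion from F applies only when the numerator degrees of freedom equal 1, that is, when exactly two groups are compared.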

Alternative Methods of ES Extraction: Exact p-value

• Study reports: t(60) is sig., p = 0.013

Look up t-value for p = 0.013: t = 2.68

$d = t\sqrt{\dfrac{1}{N_E} + \dfrac{1}{N_C}}$

$d = 2.68\sqrt{\dfrac{1}{31} + \dfrac{1}{31}} = 2.68(0.254) = 0.681$

Calculating Standard Error

The standard error of g is an estimate of the standard deviation of the sampling distribution of g, that is, of the values g would take across an infinite number of samples all with a given sample size. Smaller samples tend to have larger standard errors and larger samples have smaller standard errors.

Standard Error:

$\hat{\sigma}_g = \sqrt{\dfrac{1}{n_e} + \dfrac{1}{n_c} + \dfrac{g^2}{2(n_e + n_c)}} \left(1 - \dfrac{3}{4(n_e + n_c) - 9}\right)$

$\hat{\sigma}_g = \sqrt{\dfrac{1}{30} + \dfrac{1}{30} + \dfrac{0.687^2}{2(30 + 30)}} \left(1 - \dfrac{3}{4(30 + 30) - 9}\right)$

$\hat{\sigma}_g = \sqrt{0.071}\,(1 - 0.0130)$

$\hat{\sigma}_g = (0.266)(0.987)$

$\hat{\sigma}_g = 0.262$
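The standard error formula translates directly into Python. This is a minimal sketch with an illustrative function name; the usage lines reuse the slide's values (g = 0.687, 30 participants per group) and then vary the group size to show the relationship between sample size and standard error.

```python
import math

def se_g(g, n_e, n_c):
    """Standard error of Hedges' g, including the small-sample multiplier."""
    n = n_e + n_c
    uncorrected = math.sqrt(1 / n_e + 1 / n_c + g ** 2 / (2 * n))
    return uncorrected * (1 - 3 / (4 * n - 9))

print(round(se_g(0.687, 30, 30), 3))

# Same effect size, different group sizes: SE falls as n rises.
for n in (10, 30, 100):
    print(n, round(se_g(0.687, n, n), 3))
```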

Test Statistic and Confidence Interval

Z-test (null test: g = 0):

$z_g = \dfrac{g}{\hat{\sigma}_g} = \dfrac{0.687}{0.262} = 2.62$

Conclusion: 2.62 > 1.96 (p < 0.05); reject H0: g = 0.

95% Confidence Interval:

$CI_U = g + (1.96 \times \hat{\sigma}_g) = 0.687 + (1.96 \times 0.26) = 1.197$

$CI_L = g - (1.96 \times \hat{\sigma}_g) = 0.687 - (1.96 \times 0.26) = 0.177$

Conclusion: Confidence interval does not cross 0 (g falls within the 95% confidence interval).
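As a rough sketch, the z-test and confidence interval can be computed together in Python. The function name and return layout are assumptions made for illustration; the two-tailed p-value uses the standard normal distribution via `math.erfc`, which is one common way to obtain it without a statistics package.

```python
import math

def z_test_and_ci(g, se, z_crit=1.96):
    """Return (z, two-tailed p, lower limit, upper limit) for an effect size g."""
    z = g / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed normal p-value
    return z, p, g - z_crit * se, g + z_crit * se

z, p, lower, upper = z_test_and_ci(0.687, 0.262)
print(round(z, 2), p < 0.05, lower > 0)
```

The two conclusions on the slide are equivalent: a 95% interval that excludes zero corresponds to |z| > 1.96.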

Other Important Statistics

Variance:

$\hat{\sigma}_g^2 = (\hat{\sigma}_g)^2 = (0.262)^2 = 0.069$

The variance is the standard error squared.

Inverse Variance (w):

$w_i = \dfrac{1}{\hat{\sigma}_{g_i}^2} = \dfrac{1}{0.069} = 14.54$

The inverse variance (w) provides a weight that is proportional to the sample size. Larger samples are more heavily weighted than small samples.

Weighted g (g*w):

$\text{Weighted } g = (w_i)(g_i) = 14.54 \times 0.687 = 9.99$

Weighted g is the weight (w) times the value of g. It can be + or -, depending on the sign of g.

Study   Hedges’ g   SE     Variance   95% Lower   95% Upper   z-Value   p-Value   Weight (w)   Weighted g (w)(g)
1         2.44      0.22   0.05         2.00        2.88       10.89     0.00        19.94         48.65
2         2.31      0.17   0.03         1.98        2.64       13.59     0.00        34.60         79.93
3         1.38      0.30   0.09         0.79        1.97        4.60     0.00        11.11         15.33
4         1.17      0.19   0.04         0.80        1.54        6.16     0.00        27.70         32.41
5         0.88      0.17   0.03         0.55        1.21        5.18     0.00        34.60         30.45
6         0.81      0.12   0.01         0.57        1.05        6.75     0.00        69.44         56.25
7         0.80      0.08   0.01         0.64        0.96       10.00     0.00       156.25        125.00
8         0.68      0.18   0.03         0.33        1.03        3.78     0.00        30.86         20.99
9         0.63      0.51   0.26        -0.37        1.63        1.24     0.22         3.84          2.42
10        0.60      0.13   0.02         0.35        0.85        4.62     0.00        59.17         35.50
11        0.58      0.29   0.08         0.01        1.15        2.00     0.05        11.89          6.90
12        0.32      0.11   0.01         0.10        0.54        2.91     0.00        82.64         26.45
13        0.25      0.08   0.01         0.09        0.41        3.13     0.00       156.25         39.06
14        0.24      0.20   0.04        -0.15        0.63        1.20     0.23        25.00          6.00
15        0.24      0.15   0.02        -0.05        0.53        1.60     0.11        44.44         10.67
16        0.19      0.12   0.01        -0.05        0.43        1.58     0.11        69.44         13.19
17        0.11      0.12   0.01        -0.13        0.35        0.92     0.36        69.44          7.64
18        0.09      0.08   0.01        -0.07        0.25        1.13     0.26       156.25         14.06
19        0.02      0.24   0.06        -0.45        0.49        0.08     0.93        17.36          0.35
20        0.02      0.17   0.03        -0.31        0.35        0.12     0.91        34.60          0.69
21        0.02      0.26   0.07        -0.49        0.53        0.08     0.94        14.79          0.30
22       -0.11      0.24   0.06        -0.58        0.36       -0.46     0.65        17.36         -1.91
23       -0.11      0.28   0.08        -0.66        0.44       -0.39     0.69        12.76         -1.40
24       -0.18      0.22   0.05        -0.61        0.25       -0.82     0.41        20.66         -3.72
25       -0.30      0.06   0.00        -0.42       -0.18       -5.00     0.00       277.78        -83.33
Total     0.330     0.03   0.00         0.28        0.38       12.62     0.00      1458.21*       481.87*

* Column sums.

Average g (g+) is the sum of the weighted gs divided by the sum of the weights.

$g_+ = \dfrac{\sum (w_i)(g_i)}{\sum w_i} = \dfrac{481.87}{1458.21} = 0.330$

ES Extraction Exercise

Materials:

EXCEL SE Calculator

5 studies from which to extract effect sizes

Mean and Variability

[Figure: a distribution of effect sizes, marking the mean (ES+) and the variability around it]

Note: Results from Bernard, Abrami, Lou, et al. (2004), RER.

Mean Effect Size

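$g_+ = \dfrac{\sum_{i=1}^{k} (w_i)(g_i)}{\sum_{i=1}^{k} w_i}$

$\hat{\sigma}_{g_+}^2 = \dfrac{1}{\sum_{i=1}^{k} \dfrac{1}{\hat{\sigma}_i^2}} = \dfrac{1}{\sum_{i=1}^{k} w_i}$

$\hat{\sigma}_{g_+} = \sqrt{\hat{\sigma}_{g_+}^2} = \sqrt{\dfrac{1}{\sum_{i=1}^{k} w_i}}$

$z_{g_+} = \dfrac{g_+}{\hat{\sigma}_{g_+}}$

With the totals from the 25 studies:

$g_+ = \dfrac{481.87}{1458.21} = 0.330$

$\hat{\sigma}_{g_+}^2 = \dfrac{1}{1458.21} = 0.0007$

$\hat{\sigma}_{g_+} = \sqrt{0.0007} = 0.0265$

$z_{g_+} = \dfrac{0.330}{0.0265} = 12.62$ (computed from the unrounded values)

Conclusion: Mean g+ = 0.33 and it is significant.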
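The mean effect size calculation can be sketched generically in Python. The function name is an illustrative assumption, and the two-study input in the usage lines is hypothetical data chosen so the arithmetic is easy to follow by hand; it is not taken from the 25 studies.

```python
import math

def weighted_mean_effect(g_values, weights):
    """Return (g_plus, variance, standard error, z) for inverse-variance weights."""
    sum_w = sum(weights)
    g_plus = sum(w * g for g, w in zip(g_values, weights)) / sum_w
    variance = 1 / sum_w          # variance of the weighted mean
    se = math.sqrt(variance)
    return g_plus, variance, se, g_plus / se

# Hypothetical two-study example: the study with weight 30 pulls the
# mean toward its own effect size of 0.4.
g_plus, variance, se, z = weighted_mean_effect([0.2, 0.4], [10, 30])
print(round(g_plus, 3), round(variance, 3), round(se, 3), round(z, 2))
```

Because the variance of g+ is the reciprocal of the summed weights, adding studies can only shrink the standard error, which is why a modest mean such as 0.33 can still produce a very large z.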

Variability (Q-Statistic)

Question: How much variability surrounds g+ and is it significant? Are the effect sizes heterogeneous or homogeneous?

$Q = \sum_{i=1}^{k} \dfrac{(g_i - g_+)^2}{\hat{\sigma}_{g_i}^2} = \sum_{i=1}^{k} w_i (g_i - g_+)^2$

$Q_{Total} = 19.94(2.44 - 0.330)^2 + 34.60(2.31 - 0.330)^2 + \dots + 20.66(-0.18 - 0.330)^2 + 277.78(-0.30 - 0.330)^2 = 469.54$

Q-value: 469.54

df (Q): 24

P-value: 0.000

Tested with the $\chi^2$ distribution.

Conclusion: Effect sizes are heterogeneous.
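The Q computation can be reproduced in Python from the g values and weights of the 25 studies tabulated in this unit. This is a sketch under two stated assumptions: the inputs are the rounded table values, so the result agrees with the slide only up to rounding, and the significance check uses the tabled chi-square critical value for df = 24 at α = .05 (36.415) instead of computing an exact p-value.

```python
# Hedges' g and inverse-variance weights for the 25 studies (rounded table values).
G = [2.44, 2.31, 1.38, 1.17, 0.88, 0.81, 0.80, 0.68, 0.63, 0.60, 0.58, 0.32, 0.25,
     0.24, 0.24, 0.19, 0.11, 0.09, 0.02, 0.02, 0.02, -0.11, -0.11, -0.18, -0.30]
W = [19.94, 34.60, 11.11, 27.70, 34.60, 69.44, 156.25, 30.86, 3.84, 59.17, 11.89,
     82.64, 156.25, 25.00, 44.44, 69.44, 69.44, 156.25, 17.36, 34.60, 14.79, 17.36,
     12.76, 20.66, 277.78]

CHI2_CRIT_DF24 = 36.415  # tabled critical value: chi-square, df = 24, alpha = .05

def q_statistic(g_values, weights):
    """Q = sum of w_i * (g_i - g_plus)^2, with g_plus the weighted mean."""
    g_plus = sum(w * g for g, w in zip(g_values, weights)) / sum(weights)
    return sum(w * (g - g_plus) ** 2 for g, w in zip(g_values, weights))

q_total = q_statistic(G, W)
df = len(G) - 1
heterogeneous = q_total > CHI2_CRIT_DF24
print(round(q_total, 1), df, heterogeneous)
```

A handful of studies account for most of Q: large deviations from g+ (Studies 1 and 2) and very large weights (Study 25) both inflate the weighted squared deviations.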

Homogeneity vs. Heterogeneity of Effect Size

• If homogeneity of effect size is established, then the studies in the meta-analysis can be thought of as sharing the same effect size (i.e., the mean)

• If homogeneity of effect size is violated (heterogeneity of effect size), then no single effect size is representative of the collection of studies (i.e., the “true” mean effect size remains unknown)

Statistics in Comprehensive Meta-Analysis™

Effect size and 95% confidence interval

Number of Studies   Point estimate   Standard error   Variance   Lower limit   Upper limit
25                  0.33             0.03             0.00       0.28          0.38

Test of null (2-Tail)

Z-value   P-value
12.62     0.00

Heterogeneity

Q-value   df (Q)   P-value
469.54    24       0.00

Interpretation: Moderate ES for all outcomes (g+ = 0.33) in favor of the intervention condition.

Homogeneity of ES is violated. Q-value is significant (i.e., there is too much variability for g+ to represent a true average in the population).

Comprehensive Meta-Analysis 2.0.027 is a trademark of BioStat®

Back to ES Calculator

1. Interpretation of Mean Effect Size

2. Interpretation of Q-Statistic

Homogeneity versus Heterogeneity of Effect Size

[Figure: two distributions of effect sizes around g+]

Distribution 1: Homogeneous. No variation left to be explained by moderators.

Distribution 2: Heterogeneous. Gray shaded area is variation left to be explained by moderators.

Examining the Study Feature “Method of ES Extraction”

[Figure: the overall effect (g+ = +0.33) broken down by the three levels of the study feature: Exact Descriptive, Estimated Statistics, and Exact Statistics]

Tests of Levels of “Method of ES Extraction”

Effect size and 95% confidence interval, with heterogeneity, by group:

Group                    N of Studies   Point estimate   Standard error   Lower limit   Upper limit   Q-value   df (Q)   P-value
Descriptive Statistics   15             0.29             0.03             0.22          0.35          402.56    14       0.00
Est. Statistics          3              0.21             0.06             0.09          0.33            0.97     2       0.62
Exact Statistics         7              0.63             0.06             0.50          0.75           37.00     6       0.00
Total within                                                                                          442.50    22       0.00
Total between                                                                                          27.04     2       0.00
Overall                  25             0.33             0.03             0.28          0.38          469.54    24       0.00

Interpretation: Small to moderate ESs for all categories in favor of the intervention condition.

Homogeneity of ES is violated. The total within-group Q-value is significant, as are the Q-values for Descriptive Statistics and Exact Statistics; only Est. Statistics is homogeneous (p = 0.62). “Method of ES Extraction” therefore does not explain enough variability to reach homogeneity.

Meta-Regression

Seeks to determine if “Method of ES Extraction” predicts effect size.

           Q        df   p-value
Model      15.50    1    0.00
Residual   454.04   23   0.00
Total      469.54   24   0.00

Conclusion: “Method of ES Extraction” is a significant predictor of ES, but ES is still heterogeneous.

Sensitivity Analysis

• Tests the robustness of the findings

• Asks the question: Will these results stand up when potentially distorting or deceptive elements, such as outliers, are removed?

• Particularly important to examine the robustness of the effect sizes of study features, as these are usually based on smaller numbers of outcomes

Sensitivity Analysis: Low Standard Error Samples

g+ recomputed with each study removed in turn:

Study removed   Point estimate   SE     z-Value   p-Value
1               0.30             0.03   11.42     0.00
2               0.28             0.03   10.65     0.00
3               0.32             0.03   12.26     0.00
4               0.31             0.03   11.88     0.00
5               0.32             0.03   11.96     0.00
6               0.31             0.03   11.42     0.00
7               0.27             0.03    9.89     0.00
8               0.32             0.03   12.20     0.00
9               0.33             0.03   12.57     0.00
10              0.32             0.03   11.93     0.00
11              0.33             0.03   12.49     0.00
12              0.33             0.03   12.28     0.00
13              0.34             0.03   12.27     0.00
14              0.33             0.03   12.57     0.00
15              0.33             0.03   12.53     0.00
16              0.34             0.03   12.58     0.00
17              0.34             0.03   12.73     0.00
18              0.36             0.03   12.96     0.00
19              0.33             0.03   12.69     0.00
20              0.34             0.03   12.75     0.00
21              0.33             0.03   12.68     0.00
22              0.34             0.03   12.74     0.00
23              0.33             0.03   12.71     0.00
24              0.33             0.03   12.81     0.00
25              0.48             0.03   16.45     0.00
Total (none)    0.33             0.03   12.62     0.00

Sensitivity Analysis of CT Data

[Figure: plot for Studies 1 to 25 (horizontal axis), with a vertical axis running from 0.00 to 0.60]

Studies with High Weighted g+

Study      g       g+      g+ with study removed   Difference   (w)      (g)(w)   % Influence*
Study 7     0.80   0.330   0.27                    -0.06        156.25   125.00   25.9
Study 13    0.25   0.330   0.34                    +0.01        156.25    39.06    8.1
Study 18    0.09   0.330   0.36                    +0.03        156.25    14.06    2.9
Study 25   -0.30   0.330   0.48                    +0.15        277.78   -83.33   17.41
Totals                                                          746.53            54.31

* % Influence = (g)(w) / 481.87 × 100
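Removing one study from a weighted mean only requires subtracting its contribution from the two running totals. The Python sketch below applies that idea to the four high-weight studies, using the totals reported earlier in this unit (sum of weights 1458.21, sum of weighted gs 481.87) and each study's g and weight from the 25-study table. The function and variable names are illustrative assumptions.

```python
SUM_W = 1458.21    # sum of weights over all 25 studies
SUM_WG = 481.87    # sum of weighted g over all 25 studies

# (g, w) for the four high-weight studies, from the 25-study table.
HIGH_WEIGHT = {
    "Study 7": (0.80, 156.25),
    "Study 13": (0.25, 156.25),
    "Study 18": (0.09, 156.25),
    "Study 25": (-0.30, 277.78),
}

def g_plus_without(g, w, sum_w=SUM_W, sum_wg=SUM_WG):
    """Weighted mean effect size after dropping one study with effect g and weight w."""
    return (sum_wg - w * g) / (sum_w - w)

g_plus_all = SUM_WG / SUM_W
for name, (g, w) in HIGH_WEIGHT.items():
    removed = g_plus_without(g, w)
    print(name, round(removed, 2), round(removed - g_plus_all, 2))
```

Study 25 moves the mean the most because it combines the largest weight with an effect in the opposite direction from the rest of the collection.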

Steps in Controlling for Study Quality

• Step one: Are the effect sizes homogeneous?

• Step two: Does study quality explain the heterogeneity?

• Step three: Which qualities of studies matter?

• Step four: How do we deal with the differences?

12/6/06 34

Controlling Study Quality Using Dummy Coding in Meta-Regression

Categories of Study Quality   Dummy 1   Dummy 2   Dummy 3   Dummy 4
1                             0         0         0         0
2                             1         0         0         0
3                             0         1         0         0
4                             0         0         1         0
5                             0         0         0         1
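The coding scheme in the table can be generated with a small Python helper. This is a sketch: the function name and parameters are assumptions, and category 1 is treated as the reference level because it is the row coded 0 on every dummy variable.

```python
def dummy_code(category, n_categories=5, reference=1):
    """Return the dummy variables for one category (reference level is all zeros)."""
    if not 1 <= category <= n_categories:
        raise ValueError("category out of range")
    levels = [c for c in range(1, n_categories + 1) if c != reference]
    return [1 if category == level else 0 for level in levels]

for category in range(1, 6):
    print(category, dummy_code(category))
```

Five categories need only four dummy variables; in a regression, each dummy's coefficient is then read as the difference between that category and the reference category.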

Adjusting Effect Sizes

Categories   g+ Before Adjustment   g+ After Adjustment   Adjusted Heterogeneity: Q Within   df   p
1            -0.185                 -0.185                 2.243                              3    0.524
2            -0.218                 -0.218                 3.302                              3    0.347
3             0.683                 -0.065                 3.252                              3    0.354
4             0.565                 -0.183                 4.953                              3    0.175
5             0.390                 -0.358                 1.985                              3    0.576
Total         0.247                 -0.202                15.734                             15    0.400

Selected References

Bernard, R. M., Abrami, P. C., Lou, Y., Borokhovski, E., Wade, A., Wozney, L., Wallet, P. A., Fiset, M., & Huang, B. (2004). How does distance education compare to classroom instruction? A meta-analysis of the empirical literature. Review of Educational Research, 74(3), 379-439.

Glass, G. V., McGaw, B., & Smith, M. L. (1981). Meta-analysis in social research. Beverly Hills, CA: Sage.

Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis. Orlando, FL: Academic Press.

Hedges, L. V., Shymansky, J. A., & Woodworth, G. (1989). A practical guide to modern methods of meta-analysis. [ERIC Document Reproduction Service No. ED 309 952].