Transcript Document

C2 Training: May 9 – 10, 2011
Data Analysis and Interpretation: Computing effect sizes
The Campbell Collaboration
www.campbellcollaboration.org
A brief introduction to effect sizes
Meta-analysis expresses the results of each study using a quantitative
index of effect size (ES).
ESs are measures of the strength or magnitude of a relationship of
interest.
ESs have the advantage of being comparable (i.e., they estimate the
same thing) across all of the studies and therefore can be
summarized across studies in the meta-analysis.
Also, they are relatively independent of sample size.
C2 Training Materials – Oslo – May 2011
Effect Size Basics
• Effect sizes can be expressed in many different metrics
– d, r, odds ratio, risk ratio, etc.
• So be sure to be specific about the metric!
• Effect sizes can be unstandardized or standardized
– Unstandardized = expressed in measurement units
– Standardized = expressed in standardized measurement units
Unstandardized Effect Sizes
• Examples
– 5 point gain in IQ scores
– 22% reduction in repeat offending
– €600 savings per person
• Unstandardized effect sizes are helpful in communicating
intervention impacts
– But in many systematic reviews they are not usable, since not all
studies operationalize the dependent variable in the same way
Standardized Effect Sizes
• Some standardized effect sizes are relatively easy to
interpret
– Correlation coefficient
– Risk ratio
• Others are not
– Standardized mean difference (d)
– Odds ratio, logged odds ratio
Types of effect size
Most reviews use effect sizes from one of three families of effect sizes:
• the d family, including the standardized mean difference,
• the r family, including the correlation coefficient, and
• the odds ratio (OR) family, including proportions and other measures
for categorical data.
Effect size computation
• Compute a measure of the “effect” of each study as our
outcome
• Range of effect sizes:
– Differences between two groups on a continuous measure
– Relationship between two continuous measures
– Differences between two groups on frequency or incidence
Types of effect sizes
• Standardized mean difference
• Correlation Coefficient
• Odds Ratios
Standardized mean difference
• Used when we are interested in two-group comparisons
using means
• Groups could be two experimental groups, or in an
observational study, two groups of interest such as boys
versus girls.
Notation for study-level statistics
Group 1             Group 2
$X_{11}$            $X_{21}$
$X_{12}$            $X_{22}$
...                 ...
$X_{1 n_{G1}}$      $X_{2 n_{G2}}$
n is sample size
Notation for study-level statistics
Group means: $\bar{X}_{G1}$, $\bar{X}_{G2}$
Group sample sizes: $n_{G1}$, $n_{G2}$
Total sample size: $N = n_{G1} + n_{G2}$
Group standard deviations: $s_{G1}$, $s_{G2}$
Standardized mean difference
$$ES_{sm} = \frac{\bar{X}_{G1} - \bar{X}_{G2}}{s_p}$$
where $s_p$ is the pooled sample standard deviation
Pooled sample standard deviation
$$s_p = \sqrt{\frac{(n_{G1} - 1)s_{G1}^2 + (n_{G2} - 1)s_{G2}^2}{(n_{G1} - 1) + (n_{G2} - 1)}}$$
Correction to ESsm
$$ES'_{sm} = \left[1 - \frac{3}{4N - 9}\right] ES_{sm}$$
where N is the total sample size
Standard error of standardized mean difference
$$SE_{sm} = \sqrt{\frac{n_{G1} + n_{G2}}{n_{G1} n_{G2}} + \frac{(ES'_{sm})^2}{2(n_{G1} + n_{G2})}}$$
Example
• Table 1 from:
Henggeler, S. W., Melton, G. B., & Smith, L. A. (1992). Family
preservation using multisystemic therapy: An effective
alternative to incarcerating serious juvenile offenders.
Journal of Consulting and Clinical Psychology, 60(6), 953–961.
Note: Text of paper (p. 954) indicates that MST n = 43, usual
services n = 41.
Computing pooled sd
$$s_p = \sqrt{\frac{(43 - 1)(13.9)^2 + (41 - 1)(19.1)^2}{(43 - 1) + (41 - 1)}} = 16.6$$
Computing ESsm
$$ES_{sm} = \frac{5.8 - 16.2}{16.6} = -0.63$$
Computing unbiased ESsm
$$ES'_{sm} = \left[1 - \frac{3}{4(43 + 41) - 9}\right] \times (-0.63) = (0.99) \times (-0.63) = -0.62$$
Computing SEsm
$$SE_{sm} = \sqrt{\frac{43 + 41}{43 \times 41} + \frac{(-0.62)^2}{2(43 + 41)}} = 0.22$$
95% Confidence interval for ES’sm
$$-0.62 \pm 1.96(0.22) = [-1.06, -0.18]$$
The 95% confidence interval for the standardized mean
difference in weeks of incarceration ranges from -1.06 sds to
-0.18 sds. Given that the pooled sd of weeks is 16.6, the juveniles
in MST were incarcerated on average between 1.06*16.6 = 17.6 and
0.18*16.6 = 3.0 fewer weeks than juveniles in the standard
treatment. In weeks, the confidence interval is [-17.6, -3.0].
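The weeks-of-incarceration example can be reproduced with a short Python sketch. It is an illustration added for checking the arithmetic, not part of the original slides, and the function names are assumptions. The sketch carries full precision rather than rounding the pooled sd to 16.6 first, so intermediate values may differ from the slides in the last digit.

```python
import math


def pooled_sd(n1, s1, n2, s2):
    """Pooled sample standard deviation of two groups."""
    numerator = (n1 - 1) * s1**2 + (n2 - 1) * s2**2
    return math.sqrt(numerator / ((n1 - 1) + (n2 - 1)))


def corrected_smd(mean1, mean2, sp, n_total):
    """Standardized mean difference with the 1 - 3/(4N - 9) correction."""
    es = (mean1 - mean2) / sp
    return (1 - 3 / (4 * n_total - 9)) * es


def se_smd(n1, n2, es):
    """Standard error of a standardized mean difference."""
    return math.sqrt((n1 + n2) / (n1 * n2) + es**2 / (2 * (n1 + n2)))


def ci95(estimate, se):
    """Normal-theory 95% confidence interval."""
    return estimate - 1.96 * se, estimate + 1.96 * se


# Weeks of incarceration: MST (group 1) vs. usual services (group 2).
n1, mean1, sd1 = 43, 5.8, 13.9
n2, mean2, sd2 = 41, 16.2, 19.1

sp = pooled_sd(n1, sd1, n2, sd2)
es = corrected_smd(mean1, mean2, sp, n1 + n2)
se = se_smd(n1, n2, es)
low, high = ci95(es, se)

print(f"sp = {sp:.1f}, ES' = {es:.2f}, SE = {se:.2f}")
print(f"95% CI in sd units: [{low:.2f}, {high:.2f}]")
# Multiply by the pooled sd to express the interval in weeks.
print(f"95% CI in weeks:    [{low * sp:.1f}, {high * sp:.1f}]")
```

Run as written, the two intervals should come out as [-1.06, -0.18] in sd units and [-17.6, -3.0] in weeks, matching the slide.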
Practice computations
• Compute effect size for number of arrests
• Compute effect size with bias correction
• Compute 95% confidence interval for effect size
• Interpret the effect size
Pooled sd for arrests
$$s_p = \sqrt{\frac{(43 - 1)(1.34)^2 + (41 - 1)(1.55)^2}{(43 - 1) + (41 - 1)}} = \sqrt{2.09} = 1.45$$
ESsm for arrests
$$ES_{sm} = \frac{0.87 - 1.52}{1.45} = \frac{-0.65}{1.45} = -0.45$$
Computing unbiased ESsm
$$ES'_{sm} = \left[1 - \frac{3}{4(43 + 41) - 9}\right] \times (-0.45) = (0.99) \times (-0.45) = -0.44$$
Computing SEsm
$$SE_{sm} = \sqrt{\frac{43 + 41}{43 \times 41} + \frac{(-0.44)^2}{2(43 + 41)}} = 0.22$$
95% Confidence interval for ES’sm
$$-0.44 \pm 1.96(0.22) = [-0.87, -0.01]$$
The 95% confidence interval for the standardized mean
difference in number of arrests is from -0.87 sds to -0.01 sds.
Given that the pooled sd of arrests is 1.45, the juveniles in MST
had on average between 0.87*1.45 = 1.26 and 0.01*1.45 = 0.01 fewer
arrests than juveniles in the standard treatment. In arrests, the
confidence interval is [-1.26, -0.01].
Computing standardized mean differences
The first steps in computing d effect sizes involve assessing
what data are available and what’s missing. You will look for:
• Sample size and unit information
• Means and SDs or SEs for treatment and control groups
• ANOVA tables
• F or t tests in text, or
• Tables of counts
Sample sizes
Regardless of exactly what you compute you will need to get
sample sizes (to correct for bias and compute variances).
Sample sizes can vary within studies so check initial reports of
n against
(1) n for each test or outcome or
(2) df associated with each test
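Check (2) can be sketched in a few lines of Python. The sketch assumes a between-groups test (an independent-samples t-test or a one-way F-test), where the error df equals the total N minus the number of groups. The function name and the df values are hypothetical illustrations, not taken from any study.

```python
def implied_total_n(df_error, n_groups=2):
    """Total N implied by the error df of a between-groups t- or F-test."""
    return df_error + n_groups


# Hypothetical case: the text reports 43 + 41 = 84 participants.
reported_n = 43 + 41

# A test reported as t(82) is consistent with the full sample.
print(implied_total_n(82) == reported_n)  # True

# A test reported as t(78) implies 4 cases are missing for that outcome.
print(reported_n - implied_total_n(78))  # 4
```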
Standardized Mean Differences
• Means, standard deviations, and sample sizes provide the most
direct method
• Without individual group sample sizes (n1 and n2), assume
equal group n’s
• Can compute standardized mean differences from t-statistic
and from one-way F-statistic
ESsm from t-tests
$$ES_{sm} = t \sqrt{\frac{n_{G1} + n_{G2}}{n_{G1} n_{G2}}}$$
Standardized mean difference from t-test
$$ES_{sm} = \sqrt{\frac{282 + 270}{282 \times 270}} \times 0.46 = 0.039$$
Standardized mean difference from means and sds
$$ES_{sm} = \frac{203.24 - 202.3}{24.14} = 0.039$$
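The t-test conversion can be checked with a short Python sketch (an added illustration; the function names are assumptions). The second function covers the equal-n fallback mentioned earlier: with n1 = n2 = N/2, the conversion reduces to 2t divided by the square root of N.

```python
import math


def smd_from_t(t, n1, n2):
    """Standardized mean difference from an independent-samples t statistic."""
    return t * math.sqrt((n1 + n2) / (n1 * n2))


def smd_from_t_equal_n(t, n_total):
    """Same conversion when only total N is known and equal groups are assumed."""
    return 2 * t / math.sqrt(n_total)


es_t = smd_from_t(0.46, 282, 270)
es_means = (203.24 - 202.3) / 24.14

print(round(es_t, 3), round(es_means, 3))  # 0.039 0.039
print(round(smd_from_t_equal_n(0.46, 282 + 270), 3))  # 0.039
```

Both routes give the same effect size here because the t statistic was computed from the same means and standard deviations.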
ESsm from F-tests (one-way)
$$ES_{sm} = \sqrt{F_{between} \, \frac{n_{G1} + n_{G2}}{n_{G1} n_{G2}}}$$
Note that you have to decide the
direction of the effect given the
results.
Standardized mean difference from F-test
$$ES_{sm} = -\sqrt{\frac{43 + 41}{43 \times 41} \times 3.94} = -0.43$$
Note that we choose a negative effect size since
the number of arrests is less for the MST group
than for the control group
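A minimal Python sketch of the F-test conversion follows. For a one-way F-test on two groups, F equals the square of t, so the sign is lost and has to be supplied by the reviewer. The function name and the `sign` parameter are assumptions made for this illustration.

```python
import math


def smd_from_f(f_between, n1, n2, sign=1):
    """Standardized mean difference from a two-group one-way F statistic.

    sign must be +1 or -1 and is chosen from the direction of the group means.
    """
    if sign not in (1, -1):
        raise ValueError("sign must be +1 or -1")
    return sign * math.sqrt(f_between * (n1 + n2) / (n1 * n2))


# Arrests: the MST group has fewer arrests, so the effect is taken as negative.
es = smd_from_f(3.94, 43, 41, sign=-1)
print(round(es, 2))  # -0.43
```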
From means and sds from before
$$ES_{sm} = \frac{0.87 - 1.52}{1.45} = -0.45$$
Correlational data
1      $X_{11}$      $X_{12}$
...    ...           ...
n      $X_{n1}$      $X_{n2}$
Correlation data
$$ES_r = r$$
$$ES_{Zr} = 0.5 \log_e\left[\frac{1 + ES_r}{1 - ES_r}\right]$$
Standard error of z-transform
$$SE_{Zr} = \frac{1}{\sqrt{n - 3}}$$
Example
$$ES_r = 0.39$$
$$ES_{Zr} = 0.5 \log_e\left[\frac{1 + 0.39}{1 - 0.39}\right] = 0.41$$
Standard error of z-transform
$$SE_{Zr} = \frac{1}{\sqrt{100 - 3}} = 0.10$$
95% confidence interval for z
$$0.41 \pm 1.96 \times 0.10 = [0.21, 0.61]$$
To translate back to r-metric
$$r = \frac{e^{2 ES_{Zr}} - 1}{e^{2 ES_{Zr}} + 1}$$
Confidence interval in r-metric
[0.21, 0.54]
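The whole correlation workflow (transform to z, build the interval, transform back to r) can be sketched in Python as below. This is an added illustration using the example values r = 0.39 and n = 100; the function names are assumptions.

```python
import math


def r_to_z(r):
    """z-transform of a correlation: 0.5 * ln((1 + r) / (1 - r))."""
    return 0.5 * math.log((1 + r) / (1 - r))


def se_z(n):
    """Standard error of the z-transformed correlation."""
    return 1 / math.sqrt(n - 3)


def z_to_r(z):
    """Back-transform from the z metric to the r metric."""
    return (math.exp(2 * z) - 1) / (math.exp(2 * z) + 1)


z = r_to_z(0.39)
se = se_z(100)
z_low, z_high = z - 1.96 * se, z + 1.96 * se
r_low, r_high = z_to_r(z_low), z_to_r(z_high)

print(f"z = {z:.2f}, SE = {se:.2f}")
print(f"95% CI for z: [{z_low:.2f}, {z_high:.2f}]")
print(f"95% CI for r: [{r_low:.2f}, {r_high:.2f}]")
```

The interval is symmetric around z but not around r, which is why the upper limit in the r metric (0.54) sits closer to 0.39 than the lower limit (0.21) does.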
Outcomes of one study
Drummond et al. (1990)    Success    Failure    TOTAL
Treatment                 5          14         19
Comparison                6          12         18
TOTAL                     11         26         37
Odds of improving, ΩTrt
$$\Omega_{Trt} = \frac{\mathrm{Prob}(\text{Success} \mid \text{Treatment})}{\mathrm{Prob}(\text{Failure} \mid \text{Treatment})} = \frac{\mathrm{Prob}(S \mid Trt)}{1 - \mathrm{Prob}(S \mid Trt)}$$
Odds of improving, ΩTrt
Estimate $\Omega_{Trt}$ by $OE$
$$OE = \frac{\#\,\text{successes} / \text{total}\ \#\,\text{trt}}{\#\,\text{failures} / \text{total}\ \#\,\text{trt}} = \frac{5/19}{14/19} = \frac{5}{14}$$
Odds of improving, ΩCntl
Estimate $\Omega_{Cntl}$ by $OE'$
$$OE' = \frac{\#\,\text{successes} / \text{total}\ \#\,\text{cntl}}{\#\,\text{failures} / \text{total}\ \#\,\text{cntl}} = \frac{6/18}{12/18} = \frac{6}{12}$$
Odds ratio, ω
$\omega = \Omega_{Trt} / \Omega_{Cntl}$, estimated by
$$o = \frac{OE}{OE'} = \frac{\#\,\text{trt successes} / \#\,\text{trt failures}}{\#\,\text{cntl successes} / \#\,\text{cntl failures}} = \frac{\#\,\text{trt s} \times \#\,\text{cntl f}}{\#\,\text{trt f} \times \#\,\text{cntl s}}$$
Example
$$\text{Odds ratio, } o = \frac{5 \times 12}{6 \times 14} = \frac{60}{84} = 0.71$$
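The odds and the odds ratio for the Drummond et al. (1990) table can be sketched in Python as follows. This is an added illustration; the function names are assumptions. It shows that dividing the two group odds gives the same result as the cross-product form.

```python
def odds(successes, failures):
    """Estimated odds of success within one group."""
    return successes / failures


def odds_ratio(trt_s, trt_f, cntl_s, cntl_f):
    """Cross-product odds ratio: (trt s * cntl f) / (trt f * cntl s)."""
    return (trt_s * cntl_f) / (trt_f * cntl_s)


oe_trt = odds(5, 14)    # treatment: 5 successes, 14 failures
oe_cntl = odds(6, 12)   # comparison: 6 successes, 12 failures
o = odds_ratio(5, 14, 6, 12)

print(round(oe_trt, 2), round(oe_cntl, 2), round(o, 2))  # 0.36 0.5 0.71
```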
Outcomes of one study
Frequencies    Success    Failure
Treatment      a          b
Comparison     c          d
Odds ratio, o or ESOR
$$ES_{OR} = \frac{ad}{bc}$$
Interpretation of ESOR
• ESOR = 1, Treatment & Control equally effective
• ESOR > 1, Treatment successes more likely than Control
successes
• 0 < ESOR < 1, Treatment successes less likely than Control
successes
ESLOR , log-odds ratio
$$ES_{LOR} = \log_e\left(\frac{ad}{bc}\right)$$
Standard Error of ESLOR
$$SE_{LOR} = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}}$$
Interpretation of ESLOR
• ESLOR = 0, No difference between Treatment and Control
• ESLOR > 0, Treatment successes more likely than control
successes
• ESLOR < 0, Treatment successes less likely than control
successes
Information for a 2 x 2 table
• MST n = 92
• IT (Control) n = 84
• 26.1% of MST group re-arrested
• 71.4% of IT group re-arrested
2 x 2 Table
        Not arrested        Re-arrested
MST     92 - 24 = 68        26.1% of 92 = 24
IT      84 - 60 = 24        71.4% of 84 = 60
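Rebuilding the cell counts from a group size and a reported percentage can be sketched in Python as below. This is an added illustration; the function name is an assumption, and rounding to whole persons is a choice made here to match the table above.

```python
def counts_from_percent(n, percent_event):
    """Return (non-events, events) for a group of size n.

    percent_event is the reported percentage with the event,
    rounded here to the nearest whole person.
    """
    events = round(n * percent_event / 100)
    return n - events, events


mst_not_arrested, mst_rearrested = counts_from_percent(92, 26.1)
it_not_arrested, it_rearrested = counts_from_percent(84, 71.4)

print(mst_not_arrested, mst_rearrested)  # 68 24
print(it_not_arrested, it_rearrested)    # 24 60
```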
Log-odds ratio
$$ES_{LOR} = \log_e\left(\frac{68 \times 60}{24 \times 24}\right) = 1.96$$
SE of log-odds ratio
$$SE_{LOR} = \sqrt{\frac{1}{68} + \frac{1}{24} + \frac{1}{24} + \frac{1}{60}} = 0.34$$
95% Confidence interval
$$1.96 \pm 1.96 \times 0.34 = [1.29, 2.62]$$
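The log-odds ratio, its standard error, and the confidence interval for the re-arrest table can be reproduced with the Python sketch below. It is an added illustration with assumed function names. The cells follow the a, b, c, d layout used earlier, with "not arrested" in the first column.

```python
import math


def log_odds_ratio(a, b, c, d):
    """Natural log of the cross-product odds ratio ad / bc."""
    return math.log((a * d) / (b * c))


def se_log_odds_ratio(a, b, c, d):
    """Standard error of the log-odds ratio."""
    return math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)


# a, b = MST not arrested, MST re-arrested; c, d = IT not arrested, IT re-arrested
a, b, c, d = 68, 24, 24, 60

lor = log_odds_ratio(a, b, c, d)
se = se_log_odds_ratio(a, b, c, d)
low, high = lor - 1.96 * se, lor + 1.96 * se

print(f"LOR = {lor:.2f}, SE = {se:.2f}, 95% CI = [{low:.2f}, {high:.2f}]")
```

Because the whole interval lies above 0, the MST group is more likely than the IT group to avoid re-arrest in this table.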
2 x 2 Table
                                In home placement        Out of home placement
MST                             90.6% of 59 = 53.45      9.4% of 59 = 5.55
Usual child welfare services    58.1% of 37 = 21.5       41.9% of 37 = 15.5
Log-odds ratio
$$ES_{OR} = \frac{53.45 \times 15.5}{5.55 \times 21.5} = \frac{828.48}{119.325} = 6.94$$
$$ES_{LOR} = \log_e(6.94) = 1.94$$
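When cell counts are reconstructed from percentages they need not be whole numbers, and the same formulas still apply. The Python sketch below is an added illustration with assumed variable names. It keeps the odds ratio and its natural log separate, since the ratio 828.48 / 119.325 is the odds ratio and the log-odds ratio is the log of that value.

```python
import math

# Cell values as given in the placement table (fractional because they
# are reconstructed from percentages).
a, b = 53.45, 5.55   # MST: in home, out of home
c, d = 21.5, 15.5    # usual services: in home, out of home

odds_ratio = (a * d) / (b * c)
log_odds_ratio = math.log(odds_ratio)

print(round(odds_ratio, 2), round(log_odds_ratio, 2))  # 6.94 1.94
```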