
Funded through the ESRC’s Researcher
Development Initiative
Session 2.3 – Publication bias
Prof. Herb Marsh
Ms. Alison O’Mara
Dr. Lars-Erik Malmberg
Department of Education,
University of Oxford
Session 2.3 – Publication bias
[Flowchart of the meta-analysis process: establish research question; define relevant studies; develop code materials; locate and collate studies; pilot coding, then coding; data entry and effect size calculation; main analyses; supplementary analyses]
 A type of selection bias in which only published
studies are analysed.
 Rests on the assumption that published studies are more likely to report statistically significant findings.
 Defined as “a bias against negative findings on the
part of those involved in deciding whether to
publish a study” (Soeken & Sripusanapan, 2003, p.
57).
 It could lead to an overestimation of effect sizes, as
published studies tend to have higher effect sizes
than unpublished studies (Lipsey & Wilson, 1993).
 Can result in inflation of Type I error rates in the published literature.
 The debate about using only published studies:
 peer-reviewed studies are presumably of a higher quality
VERSUS
 significant findings are more likely to be published than
non-significant findings
 There is no agreed-upon solution. However, one should retrieve all studies that meet the eligibility criteria and be explicit about how publication bias was handled.
 Some methods for dealing with publication bias
have been developed (e.g., Fail-safe N, Trim and Fill
method).
 Inclusion of unpublished papers is likely to add
considerable “noise” to the analyses
 Methods typically used to find unpublished papers are ‘ad hoc’:
 the resulting selection of studies is likely to be less
representative of the unknown population of studies than
is the population of published studies, and typically will
be more homogeneous (White, 1994).
 “Whether bias is reduced or increased by including unpublished studies cannot formally be assessed as it is impossible to be certain that all unpublished studies have been located” (Smith & Egger, 1998).
 Hence, for published papers, there is a more clearly defined population of studies to which to generalise than would be the case if unpublished studies were included.
 A central goal of meta-analysis is to be inclusive.
 Meta-analyses call for a balance between
practicality and comprehensiveness (Durlak &
Lipsey, 1991).
 A compromise is for meta-analysts to report how
they dealt with publication bias
 Examination of the focus of the included studies
 Fail-safe N
 Trim & Fill
 Sensitivity analysis (Vevea & Woods, 2005)
 Imagine a meta-analysis in which gender differences in self-concept are the focus.
 If the main aim of an included study is a completely different issue, so that gender differences are incidental to its main focus, then the statistical significance of gender differences per se is unlikely to contribute to publication bias.
 Hence, the generalisability of the effect size over this aspect of the study (the degree to which gender effects are a main focus) provides a test of publication bias, and an alternative perspective against which to validate other, more generalisable approaches to this issue.
 Run an ANOVA in which effect sizes are grouped by:
1. Studies in which the research question of the meta-analysis is central to the study (e.g., gender differences in self-concept)
2. Studies in which the research question of the meta-analysis is not central to the study (e.g., gender differences in achievement)
 If there is a significant difference such that studies with (1) as the focus have a higher mean effect size, then this could suggest publication bias (but does not confirm it!). A minimal code sketch follows below.
The fail-safe N (Rosenthal, 1991) determines the
number of studies with an effect size of zero
needed to lower the observed effect size to a
specified (criterion) level.
See Orwin (1983) for formula and discussion
For example, assume that you want to test the
assumption that an effect size is at least .20.
If the observed effect size was .26 and the fail-safe
N was found to be 44, this means that 44
unpublished studies with a mean effect size of zero
would need to be included in the sample to reduce
the observed effect size of .26 to .20.
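Orwin's (1983) formula is N_fs = k(d_obs − d_crit) / d_crit, where k is the number of included studies, d_obs the observed mean effect size, and d_crit the criterion level. A minimal Python sketch; note that k = 147 is a hypothetical value chosen here so that the output matches the slide's .26 → .20 example:

```python
def orwin_failsafe_n(k, mean_es, criterion_es):
    """Orwin's (1983) fail-safe N: the number of zero-effect studies that
    would pull the observed mean effect size down to the criterion level."""
    return k * (mean_es - criterion_es) / criterion_es

# Hypothetical: 147 included studies, observed mean d = .26, criterion d = .20.
print(round(orwin_failsafe_n(k=147, mean_es=0.26, criterion_es=0.20)))  # -> 44
```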
The trim and fill procedure (Duval & Tweedie, 2000a, 2000b) calculates the effect of potential data censoring (including publication bias) on the outcome of the meta-analysis.
This nonparametric, iterative technique examines the symmetry of effect sizes plotted against the inverse of the standard error (a funnel plot; a minimal plotting sketch follows below). Ideally, the effect sizes should mirror each other on either side of the mean.
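Here is a minimal matplotlib sketch of such a funnel plot; the studies, the true effect of 0.25, and the standard errors are all invented for illustration:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

# Hypothetical studies: noisier studies (larger SE) scatter more widely.
se = rng.uniform(0.05, 0.4, size=40)   # standard errors
es = 0.25 + rng.normal(size=40) * se   # observed effect sizes around a true 0.25

plt.scatter(es, 1.0 / se)              # precision (inverse SE) on the vertical axis
plt.axvline(np.average(es, weights=1 / se**2), linestyle="--",
            label="weighted mean")
plt.xlabel("Effect size")
plt.ylabel("Precision (1 / SE)")
plt.title("Funnel plot: points should mirror about the mean")
plt.legend()
plt.show()
```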
 The trim and fill process is as follows (O’Mara, 2008; see also Duval & Tweedie, 2000a, 2000b); a simplified code sketch follows the list:
1. The overall mean is calculated in the usual manner.
2. The number of hypothetical missing studies is estimated and their corresponding non-missing effect sizes are “trimmed” from the dataset.
3. The overall mean is re-estimated (excluding the omitted effect sizes).
4. The newly calculated “trimmed” mean is used to re-estimate the number of hypothetical missing studies (as in Step Two).
5. Steps Two to Four are repeated until the signed ranks of the effect sizes do not change, and the algorithm is terminated.
6. The original dataset is “filled” with symmetric data points representing the potential omitted studies.
7. The overall mean effect size and confidence intervals are recalculated, incorporating the influence of the hypothetical missing studies (Gilbody, Song, Eastwood, & Sutton, 2000; Sutton, Duval, Tweedie, Abrams, & Jones, 2000).
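The following Python sketch compresses these steps into one loop. It is an illustrative, simplified variant (fixed-effect mean, L0 estimator only, assuming the most negative effects are the suppressed ones) rather than a faithful reimplementation of Duval and Tweedie's published algorithm, and all data are invented.

```python
import numpy as np

def trim_and_fill(es, se, max_iter=50):
    """Simplified trim-and-fill sketch: L0 estimator, fixed-effect mean,
    assuming studies are missing from the left (negative) side."""
    es, se = np.asarray(es, float), np.asarray(se, float)
    w = 1.0 / se**2                      # inverse-variance weights
    n = len(es)
    k0 = 0                               # running estimate of missing studies
    for _ in range(max_iter):
        keep = np.argsort(es)[: n - k0]  # Steps 2/4: trim the k0 largest effects
        mu = np.average(es[keep], weights=w[keep])  # Steps 1/3: (re-)estimate mean
        dev = es - mu
        ranks = np.argsort(np.argsort(np.abs(dev))) + 1  # ranks 1..n (ties ignored)
        t_n = ranks[dev > 0].sum()       # Wilcoxon rank sum of positive deviations
        # L0 estimator of the number of suppressed studies
        k0_new = int(round(float(4 * t_n - n * (n + 1)) / (2 * n - 1)))
        k0_new = min(max(0, k0_new), n - 2)
        if k0_new == k0:                 # Step 5: stop when the estimate is stable
            break
        k0 = k0_new
    if k0 > 0:                           # Step 6: mirror trimmed effects about mu
        idx = np.argsort(es)[-k0:]
        es, w = np.concatenate([es, 2 * mu - es[idx]]), np.concatenate([w, w[idx]])
    return k0, mu, np.average(es, weights=w)  # Step 7: adjusted overall mean

# Hypothetical, right-skewed effect sizes (small/negative studies "missing").
es = [0.05, 0.12, 0.18, 0.22, 0.28, 0.35, 0.48, 0.62, 0.75]
se = [0.06, 0.09, 0.10, 0.12, 0.14, 0.16, 0.20, 0.24, 0.28]
k0, mu_trimmed, mu_adjusted = trim_and_fill(es, se)
print(f"missing studies: {k0}; trimmed mean: {mu_trimmed:.3f}; "
      f"adjusted mean: {mu_adjusted:.3f}")
```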
 The L0 and R0 estimators indicate how many hypothetical, non-published studies with a negative effect size were missing from the meta-analysis (step 6 on the previous slide).
 Assumption that all suppressed studies are those with the most negative effect sizes. If this assumption is inaccurate, the “corrected” effect-size estimate will be inaccurate (Vevea & Woods, 2005).
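For reference, these estimators are usually written as follows (following Duval & Tweedie, 2000a; notation as in the sketch above):

$$R_0 = \gamma^* - 1, \qquad L_0 = \frac{4T_n - n(n+1)}{2n - 1},$$

where n is the number of studies, T_n is the Wilcoxon rank sum of the effect sizes above the trimmed mean, and γ* is the length of the rightmost run of ranks belonging to those effects.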
 The robustness of the findings to different
assumptions can be examined through sensitivity
analysis
 It can be used to assess whether study quality
affects the results, or whether publication bias is
likely
 Sensitivity analysis can take various forms. For example, one can include study quality as a predictor to see whether it has an impact on the effect sizes (see the sketch below).
 For publication bias, Vevea & Woods (2005) applied it via a priori weight functions.
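A minimal sketch of the "study quality as a predictor" form, as a weighted least-squares meta-regression. The data and the statsmodels approach are illustrative assumptions; this is not Vevea and Woods's weight-function model:

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical studies: effect size, standard error, and a 0-10 quality rating.
es = np.array([0.42, 0.35, 0.18, 0.22, 0.30, 0.15, 0.38, 0.25])
se = np.array([0.20, 0.15, 0.08, 0.10, 0.12, 0.07, 0.18, 0.09])
quality = np.array([4.0, 5.0, 9.0, 8.0, 6.0, 9.0, 3.0, 7.0])

# Inverse-variance weighted regression of effect size on study quality.
X = sm.add_constant(quality)
fit = sm.WLS(es, X, weights=1 / se**2).fit()
print(fit.params)   # intercept and quality slope
print(fit.pvalues)
# A reliable negative quality slope would suggest that lower-quality studies
# report larger effects: a signal worth probing further, not proof of bias.
```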
 “A new approach is proposed that is suitable for application to meta-analytic data sets that are too small for the application of existing methods. The model estimates parameters relevant to fixed-effects, mixed-effects or random-effects meta-analysis contingent on a hypothetical pattern of bias that is fixed independently of the data” (p. 428).
 Imposes a set of fixed weights determined a priori
and chosen to represent a specific form and
severity of biased selection
 Useful also when datasets are small
 “The author of the meta-analysis, then, is faced with a logically impossible task: to show that publication bias is not a problem for the particular data set at hand. We describe the task as logically impossible because it amounts, in essence, to an attempt at confirming a null hypothesis” (Vevea & Woods, 2005, p. 438).
 Different methods can attempt to assess and address the issue, but none is perfect.
 At least we can conclude that the Fail-safe N is not
appropriate!
 Include unpublished studies?
 Duval, S., & Tweedie, R. (2000a). A nonparametric “trim and fill” method of accounting for publication bias in meta-analysis. Journal of the American Statistical Association, 95, 89-98.
 Duval, S., & Tweedie, R. (2000b). Trim and fill: A simple funnel-plot-based method of testing and adjusting for publication bias in meta-analysis. Biometrics, 56, 455-463.
 Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis. Thousand Oaks, CA: Sage Publications.
 Orwin, R. G. (1983). A fail-safe N for effect size in meta-analysis. Journal of Educational Statistics, 8, 157-159.
 Smith, G. D., & Egger, M. (1998). Meta-analysis: Unresolved issues and future developments. BMJ, 316(7126), 221.
 Van den Noortgate, W., & Onghena, P. (2003). Multilevel meta-analysis: A comparison with traditional meta-analytical procedures. Educational and Psychological Measurement, 63, 765-790.
 Vevea, J. L., & Woods, C. M. (2005). Publication bias in research synthesis: Sensitivity analysis using a priori weight functions. Psychological Methods, 10, 428-443.