
Systematic Reviews: Methods and Procedures

George A. Wells
Editor, Cochrane Musculoskeletal Review Group
Department of Epidemiology and Community Medicine
University of Ottawa
Ottawa, Ontario, Canada

Meta-analysis:

• Meta-analysis is a statistical analysis of a collection of studies
• Meta-analysis methods focus on contrasting and comparing results from different studies in anticipation of identifying consistent patterns and sources of disagreements among these results
• Primary objective:
  – Synthetic goal (estimation of summary effect) vs
  – Analytic goal (estimation of differences)

• Systematic Review:
  – the application of scientific strategies that limit bias to the systematic assembly, critical appraisal and synthesis of all relevant studies on a specific topic
• Meta-Analysis:
  – a systematic review that employs statistical methods to combine and summarize the results of several studies

Features of narrative reviews and systematic reviews

| Feature | Narrative | Systematic |
|---|---|---|
| Question | Broad | Focused |
| Sources / search | Usually unspecified; possibly biased | Comprehensive; explicit |
| Selection | Unspecified; biased? | Criterion-based; uniformly applied |
| Appraisal | Variable | Rigorous |
| Synthesis | Usually qualitative | Quantitative |
| Inference | Sometimes evidence-based | Usually evidence-based |

Steps of a Cochrane Systematic Review

• Clearly formulated question
• Comprehensive data search
• Unbiased selection and extraction process
• Critical appraisal of data
• Synthesis of data
• Perform sensitivity and subgroup analyses if appropriate and possible
• Prepare a structured report

• What is the study objective
  – to validate results in a large population
  – to guide new studies
• Pose question in both biologic and health care terms, specifying with operational definitions
  – population
  – intervention
  – outcomes (both beneficial and harmful)

Inclusion Criteria

• Study design
• Population
• Interventions
• Outcomes

Steps of a Cochrane Systematic Review

• Need a well formulated and co-ordinated effort
• Seek guidance from a librarian
• Specify language constraints
• Requirements for comprehensiveness of search depend on the field and question to be addressed
• Possible sources include:
  – computerized bibliographic databases
  – review articles
  – abstracts
  – conference proceedings
  – dissertations
  – books
  – experts
  – granting agencies
  – trial registries
  – industry
  – journal handsearching

• Procedure:
  – usually begin with searches of bibliographic reports (citation indexes, abstract databases)
  – publications retrieved and references therein searched for more references
• Published reports (publication bias, i.e. tendency to publish statistically significant results)
  – as a step towards eliminating publication bias, need information from unpublished research
  – databases of unpublished reports:
    clinical research registries, clinical trial registries, unpublished theses, conference indexes

Steps of a Cochrane Systematic Review

Study Selection

• 2 independent reviewers select studies
• Selection of studies addressing the question posed based on a priori specification of the population, intervention, outcomes and study design
• Level of agreement: kappa
• Differences resolved by consensus
• Specify reasons for rejecting studies
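The level of agreement between the two reviewers can be summarized with Cohen's kappa, which corrects observed agreement for the agreement expected by chance. The sketch below is one minimal way to compute it; the function name `cohen_kappa` and the include/exclude decisions are hypothetical illustrations, not data from the slides.

```python
def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters' categorical decisions on the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(ratings_a)
    categories = set(ratings_a) | set(ratings_b)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's marginal proportions.
    expected = sum(
        (ratings_a.count(c) / n) * (ratings_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical include/exclude decisions for ten candidate studies.
reviewer_1 = ["in", "in", "out", "out", "in", "out", "out", "in", "out", "out"]
reviewer_2 = ["in", "out", "out", "out", "in", "out", "in", "in", "out", "out"]
kappa = cohen_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.3f}")
```

Here the reviewers agree on 8 of 10 studies, but because chance agreement is already 0.52, kappa is noticeably lower than the raw agreement.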

Data Extraction

• 2 independent reviewers extract data using predetermined forms
  – Patient characteristics
  – Study design and methods
  – Study results
  – Methodologic quality
• Level of agreement: kappa
• Differences resolved by consensus

Data Extraction ….

• Be explicit, unbiased and reproducible
• Include all relevant measures of benefit and harm of the intervention
• Contact investigators of the studies for clarification of published methods etc.
• Extract individual patient data when published data do not answer questions about: intention-to-treat analyses, time-to-event analyses, subgroups, dose-response relationships

Steps of a Cochrane Systematic Review

• Well formulated question
• Comprehensive data search
• Unbiased selection and extraction process
• Critical appraisal of data
• Synthesis of data
• Perform sensitivity and subgroup analyses if appropriate and possible
• Prepare a structured report

Description of Studies

• Size of study
• Characteristics of study patients
• Details of specific interventions used
• Details of outcomes assessed

Methodologic Quality Assessment

• Can use as:
  – threshold for inclusion
  – possible explanation for heterogeneity
• Base quality assessments on extent to which bias is minimized
• Make quality assessment scoring systems transparent and parsimonious
• Evaluate reproducibility of quality assessment
• Report quality scoring system used

Quality Assessment: Example

Studies: Adami 1995, Black 1996, Bone 1997, Chestnut 1995, McClung 1998, Hosking 1998, Liberman 1995
Criteria rated for each study: Random, Blinding, Dropouts (ratings: ++, + or -)
++ indicates that randomization was appropriate (e.g. random numbers were computer generated)

Steps of a Cochrane Systematic Review

Discrete (event) outcome (basic data):
  – Odds Ratio (OR)
  – Relative Risk (RR)
  – Risk Difference (RD)

Continuous (measured) outcome (basic data):
  – Mean Difference (MD)
  – Standardized Mean Difference (SMD)

Overall estimate for each measure: Fixed Effects or Random Effects

Effect measures: discrete data

P1 = event rate in experimental group
P2 = event rate in control group

• RD = Risk difference = P2 - P1
• RR = Relative risk = P1 / P2
• RRR = Relative risk reduction = (P2 - P1) / P2
• OR = Odds ratio = [P1 / (1 - P1)] / [P2 / (1 - P2)]
• NNT = No. needed to treat = 1 / (P2 - P1)

Example

Experimental event rate = 0.3
Control event rate = 0.4

RD = 0.4 - 0.3 = 0.1
RR = 0.3 / 0.4 = 0.75
RRR = (0.4 - 0.3) / 0.4 = 0.25
OR = (0.3/0.7) / (0.4/0.6) = 0.64
NNT = 1 / (0.4 - 0.3) = 10
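The five measures can be checked with a few lines of code. This is a minimal sketch that follows the slide's convention (RD taken as control rate minus experimental rate); the function name `effect_measures` is a hypothetical helper introduced for illustration.

```python
def effect_measures(p1, p2):
    """Effect measures from event rates: p1 = experimental, p2 = control."""
    rd = p2 - p1  # risk difference, control minus experimental as on the slide
    return {
        "RD": rd,
        "RR": p1 / p2,
        "RRR": rd / p2,
        "OR": (p1 / (1 - p1)) / (p2 / (1 - p2)),
        "NNT": 1 / rd,
    }


measures = effect_measures(0.3, 0.4)
for name, value in measures.items():
    print(f"{name} = {value:.2f}")
```

With rates of 0.3 and 0.4 this reproduces the values in the example, up to floating-point rounding.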

Discrete - Odds Ratio (OR)

| | Event | No event | Total |
|---|---|---|---|
| Experimental | a | b | $n_e$ |
| Control | c | d | $n_c$ |

$P_e = a / n_e$, $P_c = c / n_c$

Odds: (number of patients experiencing event) / (number of patients not experiencing event)

Odds ratio: (Odds in Experimental group) / (Odds in Control group)

$$OR = \frac{P_e / (1 - P_e)}{P_c / (1 - P_c)} = \frac{ad}{bc}$$

Basic Data: $a/n_e$, $c/n_c$

Discrete - Odds Ratio Example

| | Event | No event | Total |
|---|---|---|---|
| Experimental | 13 | 33 | 46 |
| Control | 7 | 31 | 38 |

$P_e = 13/46$, $P_c = 7/38$

$$OR = \frac{13 \times 31}{7 \times 33} = 1.745$$

Basic Data: 13/46, 7/38

Discrete - Relative Risk (RR)

| | Event | No event | Total |
|---|---|---|---|
| Experimental | a | b | $n_e$ |
| Control | c | d | $n_c$ |

$P_e = a / n_e$, $P_c = c / n_c$

Risk: (number of patients experiencing event) / (number of patients)

Risk ratio: (Risk in Experimental group) / (Risk in Control group)

$$RR = \frac{P_e}{P_c} = \frac{a / (a + b)}{c / (c + d)}$$

Basic Data: $a/n_e$, $c/n_c$

Discrete - Relative Risk - Example

| | Event | No event | Total |
|---|---|---|---|
| Experimental | 13 | 33 | 46 |
| Control | 7 | 31 | 38 |

$P_e = 13/46$, $P_c = 7/38$

$$RR = \frac{P_e}{P_c} = \frac{13/46}{7/38} = 1.534$$

Basic Data: 13/46, 7/38

Discrete - Risk Difference (RD)

| | Event | No event | Total |
|---|---|---|---|
| Experimental | a | b | $n_e$ |
| Control | c | d | $n_c$ |

$P_e = a / n_e$, $P_c = c / n_c$

Risk: (number of patients experiencing event) / (number of patients)

Risk difference: (Risk in Experimental group) - (Risk in Control group)

$$RD = P_e - P_c = \frac{a}{a + b} - \frac{c}{c + d}$$

Basic Data: $a/n_e$, $c/n_c$

Discrete - Risk Difference - Example

| | Event | No event | Total |
|---|---|---|---|
| Experimental | 13 | 33 | 46 |
| Control | 7 | 31 | 38 |

$P_e = 13/46$, $P_c = 7/38$

RD = $P_e - P_c$ = 13/46 - 7/38 = 0.098

Basic Data: 13/46, 7/38

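Discrete - Odds Ratio (O)

| | Event | No event | Total |
|---|---|---|---|
| Experimental | a | b | $n_e$ |
| Control | c | d | $n_c$ |

Estimator:

$$o = \frac{\hat p_e / (1 - \hat p_e)}{\hat p_c / (1 - \hat p_c)}, \qquad \hat p_e = \frac{a}{n_e}, \quad \hat p_c = \frac{c}{n_c}, \qquad L_o = \ln(o)$$

Standard Error:

$$s_{L_o} = \left[ \frac{1}{n_e \hat p_e (1 - \hat p_e)} + \frac{1}{n_c \hat p_c (1 - \hat p_c)} \right]^{1/2}$$

Confidence interval for $\ln(OR)$: $L_o \pm Z_{\alpha/2}\, s_{L_o}$

Confidence interval for OR: $\exp(L_o \pm Z_{\alpha/2}\, s_{L_o})$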

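Discrete - Relative Risk (R)

| | Event | No event | Total |
|---|---|---|---|
| Experimental | a | b | $n_e$ |
| Control | c | d | $n_c$ |

Estimator:

$$r = \hat p_e / \hat p_c, \qquad \hat p_e = \frac{a}{n_e}, \quad \hat p_c = \frac{c}{n_c}, \qquad L_r = \ln(r)$$

Standard Error:

$$s_{L_r} = \left[ \frac{1 - \hat p_e}{n_e \hat p_e} + \frac{1 - \hat p_c}{n_c \hat p_c} \right]^{1/2}$$

Confidence interval for $\ln(RR)$: $L_r \pm Z_{\alpha/2}\, s_{L_r}$

Confidence interval for RR: $\exp(L_r \pm Z_{\alpha/2}\, s_{L_r})$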

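Discrete - Risk Difference (D)

| | Event | No event | Total |
|---|---|---|---|
| Experimental | a | b | $n_e$ |
| Control | c | d | $n_c$ |

Estimator:

$$d = \hat p_e - \hat p_c, \qquad \hat p_e = \frac{a}{n_e}, \quad \hat p_c = \frac{c}{n_c}$$

Standard Error:

$$s_d = \left[ \frac{\hat p_e (1 - \hat p_e)}{n_e} + \frac{\hat p_c (1 - \hat p_c)}{n_c} \right]^{1/2}$$

Confidence interval for RD: $d \pm Z_{\alpha/2}\, s_d$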
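The three estimators and their large-sample confidence intervals can be sketched in code as follows. This is an illustration under stated assumptions rather than a reference implementation: the standard errors are the usual large-sample forms (for example, the variance of ln(OR) as 1/(n_e p_e (1-p_e)) + 1/(n_c p_c (1-p_c))), every cell of the table is assumed non-zero, and the function name `two_by_two_measures` is hypothetical. The input is the 13/46 versus 7/38 table used in the examples.

```python
import math


def two_by_two_measures(a, b, c, d, z=1.96):
    """Point estimates and confidence intervals for OR, RR and RD.

    a, b = events / non-events in the experimental group;
    c, d = events / non-events in the control group.
    Assumes every cell is non-zero (no continuity correction is applied).
    """
    n_e, n_c = a + b, c + d
    p_e, p_c = a / n_e, c / n_c

    # Odds ratio: inference on the log scale, then back-transformed.
    log_or = math.log((p_e / (1 - p_e)) / (p_c / (1 - p_c)))
    se_log_or = math.sqrt(1 / (n_e * p_e * (1 - p_e)) + 1 / (n_c * p_c * (1 - p_c)))

    # Relative risk: also handled on the log scale.
    log_rr = math.log(p_e / p_c)
    se_log_rr = math.sqrt((1 - p_e) / (n_e * p_e) + (1 - p_c) / (n_c * p_c))

    # Risk difference: inference on the original scale.
    rd = p_e - p_c
    se_rd = math.sqrt(p_e * (1 - p_e) / n_e + p_c * (1 - p_c) / n_c)

    def back_transform(log_est, se):
        return (math.exp(log_est),
                math.exp(log_est - z * se),
                math.exp(log_est + z * se))

    return {
        "OR": back_transform(log_or, se_log_or),
        "RR": back_transform(log_rr, se_log_rr),
        "RD": (rd, rd - z * se_rd, rd + z * se_rd),
    }


results = two_by_two_measures(13, 33, 7, 31)
for name, (estimate, lower, upper) in results.items():
    print(f"{name}: {estimate:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

The point estimates agree with the worked examples (OR 1.745, RR 1.534, RD 0.098). Working on the log scale for OR and RR keeps the back-transformed interval limits positive.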

When to use OR / RR / RD

| Association | OR (0, ∞) | RR (0, ∞) | RD (-1, 1) |
|---|---|---|---|
| ‘Decreased’ | <1 | <1 | <0 |
| None | 1 | 1 | 0 |
| ‘Increased’ | >1 | >1 | >0 |

OR vs RR

• Odds Ratio ≈ Relative Risk if event occurs infrequently (i.e. a and c small relative to b and d):

$$RR = \frac{a(c + d)}{(a + b)c} \approx \frac{ad}{bc} = OR$$

• Odds Ratio is further from 1 than Relative Risk if event occurs frequently (OR > RR when RR > 1)

RD vs RR

• When interpretation in terms of absolute difference is better than in relative terms (e.g. interest in absolute reduction in adverse events)

PROPERTIES OF RISK DIFFERENCE (RD), RELATIVE RISK (RR) AND ODDS RATIO (OR)

| Property | RD | RR | OR |
|---|---|---|---|
| Simple measure? | Yes | Yes | No |
| Symmetric (measure unaffected by labelling of study groups)? | Yes | No | Yes |
| Predicted event rates restricted to [0,1] if measure is assumed constant? | No | No | Yes |
| Unbiased estimate available? | Yes | No | No |
| Efficient estimation in small samples? | No | No | Yes |
| Motivating biological model available? | Yes | Yes | Yes |

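Continuous Data - Mean Difference (MD)

| | Experimental | Control |
|---|---|---|
| number | $n_e$ | $n_c$ |
| mean | $\bar x_e$ | $\bar x_c$ |
| standard deviation | $s_e$ | $s_c$ |

Mean difference (MD): $\bar x_e - \bar x_c$

$$se(\bar x_e - \bar x_c) = \sqrt{\frac{s_e^2}{n_e} + \frac{s_c^2}{n_c}}$$

$100(1 - \alpha)\%$ CI: $(\bar x_e - \bar x_c) \pm Z_{\alpha/2}\, se(\bar x_e - \bar x_c)$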

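Continuous Data - Standardized Mean Difference (SMD)

| | Experimental | Control |
|---|---|---|
| number | $n_e$ | $n_c$ |
| mean | $\bar x_e$ | $\bar x_c$ |
| standard deviation | $s_e$ | $s_c$ |

SMD:

$$d = f\, \frac{\bar x_e - \bar x_c}{s}$$

where

$$s = \sqrt{\frac{(n_e - 1) s_e^2 + (n_c - 1) s_c^2}{n_e + n_c - 2}}, \qquad f = \frac{4(n_e + n_c) - 12}{4(n_e + n_c) - 9} = 1 - \frac{3}{4(n_e + n_c) - 9}$$

$$se(d) = \left[ \frac{n_e + n_c}{n_e n_c} + \frac{d^2}{2(n_e + n_c)} \right]^{1/2}$$

$100(1 - \alpha)\%$ CI: $d \pm Z_{\alpha/2}\, se(d)$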

When to use MD / SMD

• Mean Difference: when studies have comparable outcome measures (i.e. same scale, probably same length of follow-up)
  – A meta-analysis using MDs is known as a weighted mean difference (WMD)
• Standardized Mean Difference: when studies use different outcome measurements which address the same clinical outcome (e.g. different scales)
  – Converts scale to a common scale: number of standard deviations

Example: Combining different scales for Swollen Joint Count

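| Study | Expt Mean | Expt SD | Expt N | Control Mean | Control SD | Control N | MD | SMD |
|---|---|---|---|---|---|---|---|---|
| Andersen | 6.9 | 5.2 | 12 | 19.4 | 12.2 | 12 | -12.5 | -1.287 |
| Furst | 18.0 | 11.0 | 17 | 27.0 | 15.0 | 16 | -9.0 | -0.671 |
| Pinheiro | - | - | - | - | - | - | - | - |
| Weinblatt | 20.0 | 7.75 | 15 | 23.0 | 8.0 | 16 | -3.0 | -0.371 |
| Williams | 17.0 | 12.6 | 56 | 25.0 | 13.4 | 48 | -8.0 | -0.612 |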
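The MD and SMD columns for the swollen joint count example can be recomputed from the group summaries. The sketch below assumes the SMD uses the pooled standard deviation with n_e + n_c - 2 degrees of freedom and the small-sample correction f = 1 - 3/(4(n_e + n_c) - 9); with those assumptions it reproduces the tabulated SMD values to three decimals. The standard error of the SMD uses one common large-sample approximation and is also an assumption. Function names are hypothetical.

```python
import math


def mean_difference(m_e, s_e, n_e, m_c, s_c, n_c):
    """Mean difference and its standard error."""
    md = m_e - m_c
    se = math.sqrt(s_e**2 / n_e + s_c**2 / n_c)
    return md, se


def standardized_mean_difference(m_e, s_e, n_e, m_c, s_c, n_c):
    """Bias-corrected standardized mean difference and an approximate SE."""
    pooled_sd = math.sqrt(
        ((n_e - 1) * s_e**2 + (n_c - 1) * s_c**2) / (n_e + n_c - 2)
    )
    f = 1 - 3 / (4 * (n_e + n_c) - 9)  # small-sample correction factor
    d = f * (m_e - m_c) / pooled_sd
    se = math.sqrt((n_e + n_c) / (n_e * n_c) + d**2 / (2 * (n_e + n_c)))
    return d, se


# (expt mean, expt SD, expt N, control mean, control SD, control N)
trials = {
    "Andersen": (6.9, 5.2, 12, 19.4, 12.2, 12),
    "Furst": (18.0, 11.0, 17, 27.0, 15.0, 16),
    "Weinblatt": (20.0, 7.75, 15, 23.0, 8.0, 16),
    "Williams": (17.0, 12.6, 56, 25.0, 13.4, 48),
}
md = {name: mean_difference(*row)[0] for name, row in trials.items()}
smd = {name: standardized_mean_difference(*row)[0] for name, row in trials.items()}
for name in trials:
    print(f"{name}: MD = {md[name]:.1f}, SMD = {smd[name]:.3f}")
```

Pinheiro is omitted because the table reports no data for that study.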

Sources of Variation over Studies

• “True” inter-study variation may exist (fixed/random-effects model)
• Sampling error may vary among studies (sample size)
• Characteristics may differ among studies (population, intervention)

Modelling Variation

• Parameter of interest: $\theta$ (average treatment effect)
• Number of independent studies: k
• Summary Statistic: $Y_i$ (i=1,2,…,k)
• Large sample size: asymptotic normal distribution

Fixed-effects model vs Random-effects model

Fixed-Effects Model

• Outcome $Y_i$ from study i is a sample from a distribution with mean $\theta$ (i.e. common mean across studies)
• $Y_i \sim N(\theta, s_i^2)$ and assume $E(Y_i) = \theta$

Fixed-Effects Model

Random-Effects Model

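• Outcome $Y_i$ from study i is a sample from a distribution with mean $\theta_i$ (i.e. study-specific means)
• $Y_i \sim N(\theta_i, s_i^2)$ and assume $E(Y_i) = \theta_i$
• $\theta_i$ is a realization from a distribution of ‘effects’ with mean $\theta$
• $\theta_i \sim N(\theta, \tau^2)$ (i=1,2,…,k) where
  – $\tau^2$ is the between-study variance
  – $\theta$ is the average treatment effect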

Random-Effects Model

Random Effects Model …..

Estimating Average Study Effect $\theta$

• after averaging study-specific effects, distribution of $Y_i$ is $N(\theta, s_i^2 + \tau^2)$
• $\theta$ and $\tau^2$ considered and estimated

Estimating Study-Specific Effects $\theta_i$

• distribution of $\theta_i$ given the data is $N\big(F_i \theta + (1 - F_i) Y_i,\; s_i^2 (1 - F_i)\big)$, where $F_i = s_i^2 / (s_i^2 + \tau^2)$ is the shrinkage factor for the ith study
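A short sketch may help make the shrinkage idea concrete. It assumes the usual empirical-Bayes form in which a study's own estimate Y_i is pulled towards the overall effect by a factor F_i = s_i^2 / (s_i^2 + tau^2); the symbol names `theta` and `tau2`, the function name and the numbers are all hypothetical illustrations.

```python
def shrunken_effect(y_i, s_i, theta, tau2):
    """Shrinkage factor, shrunken mean and variance for one study.

    y_i   = observed effect in study i
    s_i   = its standard error
    theta = overall (average) effect
    tau2  = between-study variance
    """
    f_i = s_i**2 / (s_i**2 + tau2)           # shrinkage factor
    mean = f_i * theta + (1 - f_i) * y_i     # pulled towards the overall effect
    variance = s_i**2 * (1 - f_i)
    return f_i, mean, variance


# Hypothetical study: observed effect 0.8 (SE 0.4), overall effect 0.3, tau^2 = 0.04.
f_i, mean_i, var_i = shrunken_effect(0.8, 0.4, 0.3, 0.04)
print(f"F = {f_i:.2f}, shrunken mean = {mean_i:.2f}, variance = {var_i:.3f}")
```

An imprecise study (large s_i relative to tau) has F_i close to 1 and is shrunk almost entirely to the overall effect, while a precise study keeps an estimate close to its own Y_i.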

Modelling Variation

• Studies are stratified and then combined to account for differences in sample size and study characteristics
• A weighted average of estimates from each study is calculated
• Question of whether a common or study-specific parameter is to be estimated remains ….

Procedure:

• perform test of homogeneity
• if no significant difference use fixed-effects model
• otherwise identify study characteristics that stratify studies into subsets with homogeneous effects or use random effects model

Fixed Effects Model

• Require from each study
  – effect estimate; and
  – standard error of effect estimate
• Combine these using a weighted average:
  pooled estimate = sum of (estimate × weight) / sum of weights, where weight = 1 / variance of estimate
• Assumes a common underlying effect behind every trial

Fixed-Effects Model: General Scheme

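| Study | Measure | Std Error | Weight |
|---|---|---|---|
| 1 | $Y_1$ | $s_1$ | $W_1$ |
| 2 | $Y_2$ | $s_2$ | $W_2$ |
| . | . | . | . |
| k | $Y_k$ | $s_k$ | $W_k$ |

$W_i = 1 / s_i^2$ (no association: $Y_i = 0$)

Overall Measure:

$$\hat\theta_{mle} = \frac{\sum_i W_i Y_i}{\sum_i W_i}, \qquad se(\hat\theta) = \frac{1}{\sqrt{\sum_i W_i}}$$

$100(1 - \alpha)\%$ CI: $\hat\theta \pm Z_{\alpha/2}\, se(\hat\theta)$

Chi-Square Tests:

$$\chi^2_{total} = \chi^2_{assoc} + \chi^2_{homog}, \qquad \text{df: } (k) = (1) + (k - 1)$$

$$\chi^2_{total} = \sum_{i=1}^{k} W_i Y_i^2 \sim \chi^2_k$$

$$\chi^2_{assoc} = \frac{\left( \sum_i W_i Y_i \right)^2}{\sum_i W_i} \sim \chi^2_1, \qquad \chi_{assoc} = \frac{\hat\theta}{se(\hat\theta)} \sim N(0,1)$$

$$\chi^2_{homog} = \sum_i W_i (Y_i - \hat\theta)^2 \sim \chi^2_{k-1} \quad \text{(Cochran's Q test)}$$

If $\chi^2_{assoc}$ is ‘large’: association. If $\chi^2_{homog}$ is ‘large’: heterogeneity.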
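The fixed-effects scheme can be sketched as inverse-variance pooling followed by the partition of the total chi-square into an association part and a homogeneity part (Cochran's Q). The function name `fixed_effect` and the three study estimates below are hypothetical, chosen so the arithmetic is easy to follow by hand.

```python
import math


def fixed_effect(estimates, std_errors, z=1.96):
    """Inverse-variance fixed-effects pooling with chi-square partition."""
    weights = [1 / s**2 for s in std_errors]
    sum_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum_w
    se = 1 / math.sqrt(sum_w)
    chi2_total = sum(w * y**2 for w, y in zip(weights, estimates))
    chi2_assoc = pooled**2 * sum_w          # same as (sum W*Y)^2 / sum W
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    return {
        "pooled": pooled,
        "se": se,
        "ci": (pooled - z * se, pooled + z * se),
        "chi2_total": chi2_total,
        "chi2_assoc": chi2_assoc,
        "q": q,                             # homogeneity statistic, k - 1 df
    }


# Hypothetical effects (e.g. log odds ratios) and standard errors from 3 studies.
fe = fixed_effect([0.1, 0.5, 0.3], [0.5, 0.5, 0.25])
print(f"pooled = {fe['pooled']:.3f}, se = {fe['se']:.3f}, Q = {fe['q']:.2f}")
```

In this toy example the weights are 4, 4 and 16, so the third study dominates the pooled estimate, and the total chi-square (2.48) splits into association (2.16) plus homogeneity (0.32).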

Features in Graphic Display

• For each trial
  – estimate (square)
  – 95% confidence interval (CI) (line)
  – size (square) indicates weight allocated
• Solid vertical line of ‘no effect’
  – if CI crosses line then effect not significant (p>0.05)
• Horizontal axis
  – arithmetic: RD, MD, SMD
  – logarithmic: OR, RR
• Diamond represents combined estimate and 95% CI
• Dashed line plotted vertically through combined estimate

Odds Ratio

Three methods for combining:
(1) Mantel-Haenszel method
(2) Peto’s method
(3) Maximum likelihood method

[Forest plots: Peto Odds Ratio; Mantel-Haenszel Odds Ratio; Relative Risk; Risk Difference; Weighted Mean Difference; Standardized Mean Difference]

Heterogeneity

• Define meaning of heterogeneity for each review
• Define a priori the important degree of heterogeneity (in large data sets trivial heterogeneity may be statistically significant)
• If heterogeneity exists examine potential sources (differences in study quality, participants, intervention specifics or outcome measurement/definition)
• If heterogeneity exists across studies, consider using random effects model
• If heterogeneity can be explained using a priori hypotheses, consider presenting results by these subgroups
• If heterogeneity cannot be explained, proceed with caution with further statistical aggregation and subgroup analysis

Heterogeneity: How to Identify it

• Common sense
  – are the patients, interventions and outcomes in each of the included studies sufficiently similar?
• Exploratory analysis of study-specific estimates
• Statistical tests

Heterogeneity: How to deal with it

Lau et al. 1997

Heterogeneity: Exploring it

• Subgroup analyses
  – subsets of trials
  – subsets of patients
  – SUBGROUPS SHOULD BE PRE-SPECIFIED TO AVOID BIAS
• Meta-regression
  – relate size of effect to characteristics of the trials

Exploring Heterogeneity: subgroup analysis

Random Effects Model

• Assume true effect estimates really vary across studies
• Two sources of variation:
  – within studies (between patients)
  – between studies (heterogeneity)
• What the software does:
  – Revise weights to take into account both components of variation:
    weight = 1 / (variance + heterogeneity)
• When heterogeneity exists we get
  – a different pooled estimate (but not necessarily) with a different interpretation
  – a wider confidence interval
  – a larger p-value

Random Effects Model

$$\hat\theta(\tau)_{mle} = \frac{\sum_i W_i(\tau)\, Y_i}{\sum_i W_i(\tau)} \qquad \text{where } W_i(\tau) = \frac{1}{s_i^2 + \tau^2}$$

If the between-study variance $\tau^2$ is unknown, three common methods of inference can be used:

• Restricted Maximum Likelihood (REML)
• Bayesian
• Method of Moments (MOM)

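Method of Moments (Random effects model)

Estimate of the between-study variance:

$$\hat\tau^2 = \max\left\{ 0,\; \frac{\chi^2_{homog} - (k - 1)}{\sum_i W_i - \sum_i W_i^2 / \sum_i W_i} \right\}$$

| Study | Measure | Weight (FE) | Weight (RE) |
|---|---|---|---|
| 1 | $Y_1$ | $W_1$ | $W_1^* = (W_1^{-1} + \hat\tau^2)^{-1}$ |
| 2 | $Y_2$ | $W_2$ | $W_2^* = (W_2^{-1} + \hat\tau^2)^{-1}$ |
| . | . | . | . |
| k | $Y_k$ | $W_k$ | $W_k^* = (W_k^{-1} + \hat\tau^2)^{-1}$ |

Overall Measure:

$$\hat\theta^* = \frac{\sum_i W_i^* Y_i}{\sum_i W_i^*}, \qquad se(\hat\theta^*) = \frac{1}{\sqrt{\sum_i W_i^*}}$$

$100(1 - \alpha)\%$ CI: $\hat\theta^* \pm Z_{\alpha/2}\, se(\hat\theta^*)$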
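The method-of-moments calculation can be sketched as follows. This follows the commonly used DerSimonian-Laird form: estimate the between-study variance from the homogeneity statistic, add it to each study's variance, and recompute the weighted mean. The function name `dersimonian_laird` and the three-study data set are hypothetical, chosen so that the between-study variance works out to exactly 0.5.

```python
import math


def dersimonian_laird(estimates, std_errors, z=1.96):
    """Random-effects pooling with a method-of-moments variance estimate."""
    k = len(estimates)
    w = [1 / s**2 for s in std_errors]                      # fixed-effects weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    denom = sum_w - sum(wi**2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / denom)                  # truncated at zero
    w_star = [1 / (1 / wi + tau2) for wi in w]              # random-effects weights
    sum_ws = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum_ws
    se = 1 / math.sqrt(sum_ws)
    return {
        "tau2": tau2,
        "pooled": pooled,
        "se": se,
        "ci": (pooled - z * se, pooled + z * se),
        "fe_share": [wi / sum_w for wi in w],
        "re_share": [wi / sum_ws for wi in w_star],
    }


# Hypothetical heterogeneous effects; the third study is the largest.
re = dersimonian_laird([-0.5, 1.5, 0.5], [0.5, 0.5, 0.25])
print(f"tau^2 = {re['tau2']:.2f}, pooled = {re['pooled']:.2f}, se = {re['se']:.3f}")
print("largest study share of weight: "
      f"FE {re['fe_share'][2]:.2f}, RE {re['re_share'][2]:.2f}")
```

In this example the largest study carries two thirds of the weight under fixed effects but only 40% under random effects, and the standard error more than doubles. When the estimated between-study variance is zero, the random-effects weights equal the fixed-effects weights.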

Effect of model choice on study weights

Larger studies receive proportionally less weight in RE model than in FE model

Fixed vs Random Effects: Discrete Data (panels: Fixed Effects, Random Effects)

Fixed vs Random Effects: Continuous Data (panels: Fixed Effects, Random Effects)

Omission of Outlier - Chestnut Study

Analysis

• Include all relevant and clinically useful measures of treatment effect
• Perform a narrative, qualitative summary when data are too sparse, of too low quality or too heterogeneous to proceed with a meta-analysis
• Specify if fixed or random effects model is used
• Describe proportion of patients used in final analysis
• Use confidence intervals
• Include a power analysis
• Consider cumulative meta-analysis (by order of publication date, baseline risk, study quality) to assess the contribution of successive studies

Steps of a Cochrane Systematic Review

Subgroup Analyses

• Pre-specify hypothesis-testing subgroup analyses and keep few in number
• Label all a posteriori subgroup analyses
• When subgroup differences are detected, interpret in light of whether they are:
  – established a priori
  – few in number
  – supported by plausible causal mechanisms
  – important (qualitative vs quantitative)
  – consistent across studies
  – statistically significant (adjusted for multiple testing)

Sensitivity Analyses

• Test robustness of results relative to key features of the studies and key assumptions and decisions
• Include tests of bias due to retrospective nature of systematic reviews (e.g. with/without studies of lower methodologic quality)
• Consider fragility of results by determining effect of small shifts in number of events between groups
• Consider cumulative meta-analysis to explore relationship between effect size and study quality, control event rates and other relevant features
• Test a reasonable range of values for missing data from studies with uncertain results
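One simple robustness check is to re-pool the data with each study omitted in turn and see how far the combined estimate moves. The sketch below does this with inverse-variance fixed-effects pooling; the function names and the three-study data set are hypothetical, and the same loop could be applied to subsets defined by methodologic quality.

```python
def pooled_fixed(estimates, std_errors):
    """Inverse-variance weighted mean of the study estimates."""
    weights = [1 / s**2 for s in std_errors]
    return sum(w * y for w, y in zip(weights, estimates)) / sum(weights)


def leave_one_out(estimates, std_errors):
    """Pooled estimate with each study omitted in turn."""
    results = []
    for i in range(len(estimates)):
        ys = estimates[:i] + estimates[i + 1:]
        ses = std_errors[:i] + std_errors[i + 1:]
        results.append(pooled_fixed(ys, ses))
    return results


# Hypothetical effects and standard errors for three studies.
effects = [0.1, 0.5, 0.3]
errors = [0.5, 0.5, 0.25]
overall = pooled_fixed(effects, errors)
omitted = leave_one_out(effects, errors)
for i, value in enumerate(omitted, start=1):
    print(f"without study {i}: {value:.2f} (all studies: {overall:.2f})")
```

If dropping a single study changes the conclusion, that study deserves a closer look before the pooled result is relied upon.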

Funnel Plot

• Scatterplot of effect estimates against sample size
• Used to detect publication bias
• If no bias, expect symmetric, inverted funnel
• If bias, expect asymmetric or skewed shape
  – Suggestion of missing small studies

Funnel Plot Example 1: Prophylaxis of NSAID induced Gastric Ulcers

[Funnel plot. Vertical axis: 0 to 700. Horizontal axis: Effect Size (RR). Intervention: H2-Blockers]

Funnel Plot Example 2: Alendronate for Postmenopausal Osteoporosis

[Funnel plot. Vertical axis: 0 to 2500. Horizontal axis: Weighted Mean Difference (WMD of % change in lumbar bone mineral density)]

Steps of a Cochrane Systematic Review

Presentation of Results

• Include a structured abstract
• Include a table of the key elements of each study
• Include summary data from which the measures are computed
• Employ informative graphic displays representing confidence intervals, group event rates, sample sizes etc.

Interpretation of Results

• Interpret results in context of current health care
• State methodologic limitations of studies and review
• Consider size of effect in studies and review, their consistency and presence of dose-response relationship
• Consider interpreting results in context of temporal cumulative meta-analysis
• Interpret results in light of other available evidence
• Make recommendations clear and practical
• Propose future research agenda (clinical and methodological requirements)

Generic Inferential Framework

Generic inferential framework

(1) Conceptually, think of a ‘generic’ effect size statistic T
(2) corresponding effect size parameter θ
(3) associated standard error SE(T), square root of variance
(4) for some effect sizes, some suitable transformation may be needed to make inference based on normal distribution theory

Generic inferential framework ...

(A) Fixed-Effects Model (FEM):

– Assume a common effect size
– Obtain average effect size as a weighted mean (unbiased)
  • Optimal weight is reciprocal of variance (inverse variance weighted method)

Generic inferential framework ...

• Variances inversely proportional to within study sample sizes
  – what is the effect of larger studies in calculating weights?
  – may also weigh by ‘quality’ index, q, scaled from 0 to 1

Generic inferential framework ...

• Average effect size has conditional variance (a function of conditional variances of each effect size, quality index, …)
  – e.g. V = 1/total weight
• Multiply the resulting standard error by appropriate critical value (1.96, 2.58, 1.645)
• Construct confidence interval and/or test statistic

Generic inferential framework ...

• Test the homogeneity assumption using a weighted sum of squared deviations of the effect sizes, Q
• If Q exceeds the critical value of chi square at k-1 d.f. (k = number of studies), then the observed between-study variance is significantly greater than what would be expected under the null hypothesis

Generic inferential framework ...

• When within-study sample sizes are very large, the Q test may reject homogeneity even when individual effect size estimates do not differ much
• One can take different courses of action when homogeneity is rejected (see next page)

Generic inferential framework ...

• Methodologic choices in dealing with ‘heterogeneous’ data

Generic inferential framework ...

(B) Random-Effects Model (REM):

– Total variability of an observed study effect size reflects within and between variance (extra variance component)
– If between-studies variance is zero, equations of REM reduce to those of FEM
– Presence of a variance component which is significantly different from zero may be indicative of REM

Generic inferential framework ...

• Once significance of variance component is established (e.g. Q test for homogeneity of effect size),
  – its magnitude should be estimated
  – variance components can be estimated in many ways!
• the most commonly used method is the so-called DerSimonian-Laird method, which is based on the method-of-moments approach
  – Compute random effects weighted mean as an estimate of the average of the random effects in the population
  – construct confidence interval and conduct hypothesis tests as before (new variance and thus new weights!!!)

Correlation Coefficient

Example: Correlation coefficient

• A measure of association more popular in cross sectional observational studies than in RCTs is Pearson’s correlation coefficient, r, given by

$$r = \frac{\sum (X_i - \bar X)(Y_i - \bar Y)}{\sqrt{\sum (X_i - \bar X)^2 \sum (Y_i - \bar Y)^2}}$$

• X and Y must be continuous (e.g. blood pressure and weight)
• r lies between -1 and 1
• not available in RevMan / MetaView at this time

Correlation coefficient (cont’d)

• Following the generic framework discussed earlier:
  – the effect size statistic is r
  – the corresponding effect size parameter is the underlying population correlation coefficient, $\rho$
  – in this case, a suitable transformation is needed to achieve approximate normality of effect size
  – inference is conducted on the scale of the transformed variable and final results are back-transformed to the original scale

Correlation coefficient (cont’d)

Assuming X and Y have a bivariate normal distribution, the Fisher’s Z transformed variable

$$Z = \frac{1}{2} \log \frac{1 + r}{1 - r}$$

has, for large samples, an approximate normal distribution with mean and variance

$$\zeta = \frac{1}{2} \log \frac{1 + \rho}{1 - \rho} \qquad \text{and} \qquad \frac{1}{n - 3}$$

Hence, the weighting factor associated with Z is W = 1/Var = n-3.

Correlation coefficient (cont’d)

• meta-analysis is carried out on Z-transformed measures and final results are transformed back to the scale of correlation using

$$r = \frac{e^{2Z} - 1}{e^{2Z} + 1}$$

Numerical Example

• Source: Fleiss J., Statistical Methods in Medical Research 1993; 2: 121 - 145.
• correlation coefficients reported by 7 independent studies in education are included in the meta-analysis
• Comparison: association between a characteristic of the teacher and the mean measure of his or her student’s achievement

| Study | n | r | Z* | W** | WZ | WZ² |
|---|---|---|---|---|---|---|
| 1 | 15 | -0.073 | -0.073 | 12 | -0.876 | 0.064 |
| 2 | 16 | 0.308 | 0.318 | 13 | 4.134 | 1.315 |
| 3 | 15 | 0.481 | 0.524 | 12 | 6.288 | 3.295 |
| 4 | 16 | 0.428 | 0.457 | 13 | 5.941 | 2.715 |
| 5 | 15 | 0.180 | 0.182 | 12 | 2.184 | 0.397 |
| 6 | 17 | 0.290 | 0.299 | 14 | 4.186 | 1.252 |
| 7 | 15 | 0.400 | 0.424 | 12 | 5.088 | 2.157 |
| Sum | | | | 88 | 26.945 | 11.195 |

* Z = Fisher’s Z-transformation of r
** W = n-3

$$Q = \sum_i W_i Z_i^2 - \frac{\left( \sum_i W_i Z_i \right)^2}{\sum_i W_i}$$

Q = 2.94 on 6 df is not statistically significant.

Results and discussions

• No evidence for heterogeneous association across studies
• Fixed effect analysis may be undertaken
• Questions:
  – Would a random effect analysis as shown earlier produce a different numerical value for the combined correlation coefficient?
  – How would the weights be modified to carry out a REM?

Results and discussions (cont’d)

• the weighted mean of Z is

$$\bar Z = \sum W_i Z_i \Big/ \sum W_i = 26.945 / 88 = 0.306$$

• the approximate standard error of the combined mean is

$$se(\bar Z) = \sqrt{1 \Big/ \sum W_i} = \sqrt{1/88} = 0.107$$

Results and discussions (cont’d)

• Test of significance is carried out using

$$z = \frac{\bar Z}{se(\bar Z)} = \frac{0.306}{0.107} = 2.86$$

  – this value exceeds the critical value 1.96 (corresponding to 5% level of significance), so we conclude that average value of Z (hence the average correlation) is statistically significant

Results and discussions (cont’d)

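• 95% confidence interval for $\zeta = \frac{1}{2} \log \frac{1 + \rho}{1 - \rho}$:

$$0.306 \pm 1.96\,(0.107), \qquad \text{i.e.} \qquad 0.096 \le \zeta \le 0.516$$

• Transforming back to the original scale, a 95% CI for the parameter of interest, $\rho$, is

$$0.096 \le \rho \le 0.474$$

  – again confirming a significant association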
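The whole numerical example can be reproduced in a few lines. The sketch below takes the seven (n, r) pairs from the table, applies Fisher's Z (available as `math.atanh`), pools with weights n - 3, and back-transforms with `math.tanh`. Because it keeps full precision instead of the three-decimal Z values shown in the table, intermediate figures may differ slightly in the last digit (for example, the test statistic comes out near 2.87 rather than 2.86).

```python
import math

# (n, r) pairs for the seven studies in the table.
studies = [(15, -0.073), (16, 0.308), (15, 0.481), (16, 0.428),
           (15, 0.180), (17, 0.290), (15, 0.400)]

z_values = [math.atanh(r) for _, r in studies]   # Fisher's Z = 0.5*log((1+r)/(1-r))
weights = [n - 3 for n, _ in studies]            # W = 1/Var(Z) = n - 3
sum_w = sum(weights)

mean_z = sum(w * z for w, z in zip(weights, z_values)) / sum_w
se_z = math.sqrt(1 / sum_w)
q = sum(w * (z - mean_z) ** 2 for w, z in zip(weights, z_values))
z_stat = mean_z / se_z

ci_z = (mean_z - 1.96 * se_z, mean_z + 1.96 * se_z)
pooled_r = math.tanh(mean_z)                     # back-transform to correlation scale
ci_r = (math.tanh(ci_z[0]), math.tanh(ci_z[1]))

print(f"mean Z = {mean_z:.3f}, se = {se_z:.3f}, Q = {q:.2f} on {len(studies) - 1} df")
print(f"test statistic = {z_stat:.2f}")
print(f"pooled r = {pooled_r:.3f}, 95% CI {ci_r[0]:.3f} to {ci_r[1]:.3f}")
```

The back-transformed pooled correlation is about 0.30, a figure the slides do not state explicitly but which follows directly from the weighted mean of Z.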

Critical Appraisal of a Systematic Review

(A) The Message

• Does the review set out to answer a precise question about patient care?
  – Should be different from an uncritical encyclopedic presentation

(B) The Validity

• Have studies been sought thoroughly:
  – Medline and other relevant bibliographic databases
  – Cochrane controlled clinical trials register
  – Foreign language literature
  – "Grey literature" (unpublished or un-indexed reports: theses, conference proceedings, internal reports, non-indexed journals, pharmaceutical industry files)
  – Reference chaining from any articles found
  – Personal approaches to experts in the field to find unpublished reports
  – Hand searches of the relevant specialized journals

Validity (cont’d)

• Have inclusion and exclusion criteria for studies been stated explicitly, taking account of the patients in the studies, the interventions used, the outcomes recorded and the methodology?

Validity (cont’d)

• Have the authors considered the homogeneity of the studies, i.e. whether the studies are sufficiently similar in their design, interventions and subjects to merit combination?
  – this is done either by eyeballing graphs like the forest plot or by applying chi-square tests (Q test)

(C) The Utility

• The various studies may have used patients of different ages or social classes, but if the treatment effects are consistent across the studies, then generalisation to other groups or populations is more justified.

Utility (cont’d)

• Be wary of sub-group analyses where the authors attempt to draw new conclusions by comparing the outcomes for patients in one study with the patients in another study
  – Be wary of "data-dredging" exercises, testing multiple hypotheses against the data, especially if the hypotheses were constructed after the study had begun data collection.

Utility (cont’d)

• One may also want to ask:
  – Were all clinically important outcomes considered?
  – Are the benefits worth the harms and costs?