Globalize the evidence – individualize Decisions


Holger Schünemann, MD, PhD
Professor
McMaster University, Hamilton, Canada
Madrid, Spain
November 26, 2008
GRADE INTRODUCTION
Content
 Background and rationale for revisiting
guideline methodology
 GRADE approach
 Quality of evidence
 Strength of recommendations
Confidence in evidence
 There always is evidence
 “When there is a question there is evidence”
 Evidence alone is never sufficient to make a
clinical decision
 Better research → greater confidence in the
evidence and decisions
Hierarchy of evidence
[Figure: hierarchy of study designs; axes labelled STUDY DESIGN and BIAS]
 Randomized Controlled Trials
 Cohort Studies and Case Control Studies
 Case Reports and Case Series, Non-systematic observations
 Expert Opinion
Can you explain the following?
 Concealment of randomization
 Blinding (who is blinded in a double-blind trial?)
 Intention to treat analysis and its correct
application
 Why do trials stopped early for benefit overestimate
treatment effects?
 P-values and confidence intervals
Reasons for grading evidence?
 People draw conclusions about the
 quality of evidence and strength of recommendations
 Systematic and explicit approaches can help
 protect against errors, resolve disagreements
 communicate information and fulfil needs
 Change practitioner behavior
 However, wide variation in approaches
GRADE working group. BMJ. 2004 & 2008
Which grading system?
Recommendation for use of oral anticoagulation
in patients with atrial fibrillation and rheumatic
mitral valve disease

Organization   Evidence   Recommendation
AHA            B          Class I
ACCP           A          1
SIGN           IV         C
A COPD guideline
Another COPD guideline
And another COPD guideline
What to do?
Limitations of existing
systems
 confuse quality of evidence with strength of
recommendations
 lack well-articulated conceptual framework
 criteria not comprehensive or transparent
 GRADE unique
   breadth, intensity of development process
   wide endorsement and use
   conceptual framework
   comprehensive, transparent criteria
 Focus on all important outcomes related to a
specific question and overall quality
Grading of Recommendations Assessment,
Development and Evaluation
GRADE
WORKING GROUP
CMAJ 2003, BMJ 2004, BMC 2004, BMC
2005, AJRCCM 2006, Chest 2006, BMJ 2008
GRADE Working Group
David Atkins, chief medical officer (a)
Dana Best, assistant professor (b)
Martin Eccles, professor (d)
Francoise Cluzeau, lecturer (x)
Yngve Falck-Ytter, associate director (e)
Signe Flottorp, researcher (f)
Gordon H Guyatt, professor (g)
Robin T Harbour, quality and information director (h)
Margaret C Haugh, methodologist (i)
David Henry, professor (j)
Suzanne Hill, senior lecturer (j)
Roman Jaeschke, clinical professor (k)
Regina Kunz, associate professor
Gillian Leng, guidelines programme director (l)
Alessandro Liberati, professor (m)
Nicola Magrini, director (n)
James Mason, professor (d)
Philippa Middleton, honorary research fellow (o)
Jacek Mrukowicz, executive director (p)
Dianne O’Connell, senior epidemiologist (q)
Andrew D Oxman, director (f)
Bob Phillips, associate fellow (r)
Holger J Schünemann, professor (g, s)
Tessa Tan-Torres Edejer, medical officer (t)
David Tovey, editor (y)
Jane Thomas, lecturer, UK
Helena Varonen, associate editor (u)
Gunn E Vist, researcher (f)
John W Williams Jr, professor (v)
Stephanie Zaza, project director (w)

Affiliations
a) Agency for Healthcare Research and Quality, USA
b) Children's National Medical Center, USA
c) Centers for Disease Control and Prevention, USA
d) University of Newcastle upon Tyne, UK
e) German Cochrane Centre, Germany
f) Norwegian Centre for Health Services, Norway
g) McMaster University, Canada
h) Scottish Intercollegiate Guidelines Network, UK
i) Fédération Nationale des Centres de Lutte Contre le Cancer, France
j) University of Newcastle, Australia
k) McMaster University, Canada
l) National Institute for Clinical Excellence, UK
m) Università di Modena e Reggio Emilia, Italy
n) Centro per la Valutazione della Efficacia della Assistenza Sanitaria, Italy
o) Australasian Cochrane Centre, Australia
p) Polish Institute for Evidence Based Medicine, Poland
q) The Cancer Council, Australia
r) Centre for Evidence-based Medicine, UK
s) National Cancer Institute, Italy
t) World Health Organisation, Switzerland
u) Finnish Medical Society Duodecim, Finland
v) Duke University Medical Center, USA
w) Centers for Disease Control and Prevention, USA
x) University of London, UK
y) BMJ Clinical Evidence, UK
GRADE Uptake
 World Health Organization
 Allergic Rhinitis and its Impact on Asthma (ARIA) guidelines
 American Thoracic Society
 British Medical Journal
 Infectious Diseases Society of America
 American College of Chest Physicians
 UpToDate
 American College of Physicians
 Cochrane Collaboration
 National Institute for Clinical Excellence (NICE)
 European Society of Thoracic Surgeons
 Clinical Evidence
 Agency for Health Care Research and Quality (AHRQ)
 Over 20 major organizations
The GRADE approach
Clear separation of 2 issues:
1) 4 categories of quality of evidence: very low, low,
moderate, or high quality?
   methodological quality of evidence
   likelihood of systematic deviation from truth
   by outcome
2) Recommendation: 2 grades - weak or strong (for
or against)?
 Quality of evidence only one factor
*www.GradeWorking-Group.org
GRADE Quality of Evidence
“Extent to which confidence in estimate of effect
adequate to support decision”
 high: considerable confidence in estimate of effect.
 moderate: further research likely to have impact on
confidence in estimate, may change estimate.
 low: further research is very likely to impact on
confidence, likely to change the estimate.
 very low: any estimate of effect is very uncertain
Determinants of quality
 RCTs start high
 observational studies start low
 5 factors lower the quality of evidence
   limitations in detailed design and execution
   inconsistency
   indirectness
   reporting bias
   imprecision
 3 factors can increase the quality of evidence
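The start-high / start-low rule and the factors that move a rating can be sketched in Python. This is an illustration only, not an official GRADE algorithm: the function name `rate_quality` and the idea of counting whole downgrade and upgrade steps are assumptions made for the example, and in practice each step is a judgment.

```python
# Quality levels ordered from lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]

# Starting level by study design, as stated on the slide.
START = {"rct": "high", "observational": "low"}


def rate_quality(design: str, downgrades: int = 0, upgrades: int = 0) -> str:
    """Return a quality level after applying downgrade and upgrade steps.

    `downgrades` and `upgrades` are hypothetical step counts chosen by the
    rater; the result is clamped to the four available levels.
    """
    if design not in START:
        raise ValueError(f"unknown design: {design!r}")
    index = LEVELS.index(START[design]) - downgrades + upgrades
    index = max(0, min(index, len(LEVELS) - 1))
    return LEVELS[index]


if __name__ == "__main__":
    # An RCT with one serious limitation, and an observational
    # study with a large magnitude of effect.
    print(rate_quality("rct", downgrades=1))
    print(rate_quality("observational", upgrades=1))
```

The clamp reflects that there is no level below very low or above high.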
Quality assessment criteria

Quality of evidence   Study design
High                  Randomised trial
Moderate
Low                   Observational study
Very low

Lower if:
 Study quality: serious limitations; very serious limitations
 Important inconsistency
 Directness: some uncertainty; major uncertainty
 Sparse or imprecise data
 High probability of reporting bias

Higher if:
 Strong association: strong, no plausible confounders; very strong, no major threats to validity
 Evidence of a dose response gradient
 All plausible confounders would have reduced the effect
Example: Design and Execution
 limitations
 inappropriate randomization
 lack of concealment
 intention to treat principle violated
 inadequate blinding
 loss to follow-up
 early stopping for benefit
Design and Execution
[Figure: example from Cates, CDSR 2008]
Overall judgment required
What can raise quality?
3 Factors
 large magnitude can upgrade one level
 very large two levels
 common criteria
 everyone used to do badly
 almost everyone does well
 Epinephrine in allergic shock
 dose response relation
(higher INR – increased bleeding)
 Residual confounding likely to reduce (increase)
observed (lack of) effect
From Evidence to
Recommendation
The clinical scenario
A 68 year old male long-term patient of yours.
He suffers from COPD but is unable to stop
smoking after over 30 years of tobacco use.
He has been taking beta-carotene
supplements for several months because
someone in the “healthy food” store
recommended it to prevent cancer.
He wants to know whether this will prevent
him from getting cancer and whether he
should use beta-carotene.
Strength of recommendation
“The strength of a recommendation reflects the
extent to which we can, across the range of
patients for whom the recommendations are
intended, be confident that desirable effects of a
management strategy outweigh undesirable
effects.”
Desirable and undesirable
effects
 Desirable effects
 Mortality
 improvement in quality of life, fewer
hospitalizations/infections
 reduction in the burden of treatment
 reduced resource expenditure
 Undesirable effects
• deleterious impact on morbidity, mortality or quality of life,
increased resource expenditure
Determinants of the strength
of recommendation
Factors that can strengthen a recommendation, with comments:
 Quality of the evidence: The higher the quality of evidence, the more likely is a strong recommendation.
 Balance between desirable and undesirable effects: The larger the difference between the desirable and undesirable consequences, the more likely a strong recommendation is warranted. The smaller the net benefit and the lower the certainty for that benefit, the more likely a weak recommendation is warranted.
 Values and preferences: The greater the variability in values and preferences, or uncertainty in values and preferences, the more likely a weak recommendation is warranted.
 Costs (resource allocation): The higher the costs of an intervention (that is, the more resources consumed), the less likely a strong recommendation is warranted.
Developing recommendations
Implications of
a strong recommendation
 Patients: Most people in this situation would want
the recommended course of action and only a small
proportion would not
 Clinicians: Most patients should receive the
recommended course of action
 Policy makers: The recommendation can be
adopted as a policy in most situations
Implications of
a weak recommendation
 Patients: The majority of people in this situation
would want the recommended course of action,
but many would not
 Clinicians: Be prepared to help patients to make a
decision that is consistent with their own values;
decision aids and shared decision making can help
 Policy makers: There is a need for substantial
debate and involvement of stakeholders
Exercise!
The clinical question
Population: In smokers with COPD
Intervention: does beta-carotene supplementation
Comparison: compared to no supplementation
Outcomes: reduce the risk of COPD symptoms, lung cancer and death, and improve PFTs?
Two trials
1) The Alpha-Tocopherol Beta-Carotene (ATBC)
trial randomly assigned 29,133 people to
receive beta carotene, tocopherol, both, or
placebo.
Study participants averaged 57.2 years of
age, 20.4 cigarettes per day, and 35.9 years of
smoking. They were followed up for 5 to 8
years.
RR for lung cancer = 1.16 (95% CI 1.02-1.33)
Albanes et al, JNCI, 1996
Two trials
2) The Beta-Carotene and Retinol Efficacy Trial
(CARET) evaluated high-risk current and
former smokers with a 20–pack-year history
of smoking (n = 14,254), ~ 60 years old.
The participants were randomly assigned to
receive either a combination of beta carotene
and vitamin A or placebo. Mean length of
follow up: 4 years.
RR for lung cancer = 1.28 (95% CI 1.04-1.57)
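To see how the two reported relative risks could be combined, here is a minimal sketch of a fixed-effect, inverse-variance pooling on the log scale. The slides report the two trials separately; pooling them, the helper names, and recovering each standard error from the reported 95% CI are assumptions made for illustration, not the analysis used in the talk.

```python
import math

Z_95 = 1.96  # normal quantile for a 95% confidence interval


def log_rr_and_se(rr, ci_low, ci_high):
    """Recover log(RR) and its standard error from a reported 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return math.log(rr), se


def pool_fixed_effect(trials):
    """Inverse-variance pooling of (rr, ci_low, ci_high) tuples."""
    weight_sum = 0.0
    weighted_log_rr = 0.0
    for rr, ci_low, ci_high in trials:
        log_rr, se = log_rr_and_se(rr, ci_low, ci_high)
        weight = 1.0 / se ** 2
        weight_sum += weight
        weighted_log_rr += weight * log_rr
    pooled = weighted_log_rr / weight_sum
    pooled_se = math.sqrt(1.0 / weight_sum)
    return (
        math.exp(pooled),
        math.exp(pooled - Z_95 * pooled_se),
        math.exp(pooled + Z_95 * pooled_se),
    )


# Lung cancer results as reported on the slides (ATBC, CARET).
TRIALS = [(1.16, 1.02, 1.33), (1.28, 1.04, 1.57)]
pooled_rr, pooled_low, pooled_high = pool_fixed_effect(TRIALS)
print(f"pooled RR = {pooled_rr:.2f} ({pooled_low:.2f} to {pooled_high:.2f})")
```

Both trials point towards harm, so the pooled estimate lies between the two trial estimates and its interval excludes 1.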
Determinants of the strength
of recommendation
Factors that can strengthen a recommendation, with comments for this example:
 Quality of the evidence: High
 Balance between desirable and undesirable effects: Clear balance towards harm
 Values and preferences: Little variability
 Costs (resource allocation): Lowering use of supplements will reduce resource use
Determinants of the strength
of recommendation
Table. Decisions about the strength of a recommendation

Factors that can weaken the strength of a recommendation. Example:   Decision       Explanation
Lower quality evidence                                               □ Yes  □ No
Uncertainty about the balance of benefits versus harms and burdens   □ Yes  □ No
Uncertainty or differences in values                                 □ Yes  □ No
Uncertainty about whether the net benefits are worth the costs       □ Yes  □ No

Frequent “yes” answers will increase the likelihood of a weak recommendation
Your recommendation
 Team up in groups of two to four and
formulate your recommendation for the
guideline on COPD
 I will collect your answers
Our recommendation
In patients with COPD who continue to smoke,
we recommend stopping beta-carotene
supplementation.
OR:
In patients with COPD who continue to smoke,
clinicians should stop beta-carotene
supplementation.
[Flow diagram: from health care question to recommendation]
 Health Care Question (PICO)
 Systematic reviews
 Studies (S1, S2, S3, S4, S5)
 Outcomes (OC1, OC2, OC3, OC4), grouped into critical outcomes and important outcomes
 Rate the quality of evidence for each outcome, across studies
   RCTs start high, observational studies start low
   (-) Study limitations; Imprecision; Inconsistency of results; Indirectness of evidence; Publication bias likely
   (+) Large magnitude of effect; Dose response; Plausible confounders would ↓ effect when an effect is present or ↑ effect if effect is absent
 Final rating of quality for each outcome: high, moderate, low, or very low
 Reevaluate estimate of effect for each outcome
 Rate overall quality of evidence (GRADE) (lowest quality among critical outcomes)
 Decide on the direction (for/against) and grade strength of the recommendation (strong/weak*) considering:
   Quality of the evidence
   Balance benefits/harms
   Values and preferences
 Decide if any revision of direction or strength is necessary considering: resource use (costs)
*also labeled “conditional”
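The rule that the overall quality of evidence is the lowest quality among the critical outcomes can be sketched as follows. The function name `overall_quality`, the tuple layout, and the example outcomes and ratings are hypothetical, chosen only to illustrate the rule.

```python
# Quality levels ordered from lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]


def overall_quality(outcomes):
    """Return the lowest quality rating among the critical outcomes.

    `outcomes` is a list of (name, quality, is_critical) tuples.
    Important-but-not-critical outcomes do not affect the result.
    """
    critical = [quality for _, quality, is_critical in outcomes if is_critical]
    if not critical:
        raise ValueError("at least one critical outcome is required")
    return min(critical, key=LEVELS.index)


if __name__ == "__main__":
    # Hypothetical ratings for illustration only.
    example = [
        ("death", "moderate", True),
        ("lung cancer", "high", True),
        ("pulmonary function tests", "very low", False),
    ]
    print(overall_quality(example))
```

Here the very low rating does not drag the overall rating down, because that outcome is not marked critical.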
Guideline development process
Prioritise Problems, establish panel

Systematic Review

Evidence Profile

Relative importance of outcomes

Overall quality of evidence

GRADE
Benefit – downside evaluation

Strength of recommendation

Implementation and evaluation of guidelines
Guideline development process
Prioritise Problems, establish panel

Systematic Review

Evidence Profile / Summary of Findings

Relative importance of outcomes

Overall quality of evidence

GRADE
Benefit – downside evaluation

Strength of recommendation

Implementation and evaluation of guidelines
GRADE Profiles
Summary of Findings Tables
Conclusions
 GRADE is gaining acceptance as international
standard
 Criteria for evidence assessment across questions and
outcomes
 Criteria for moving from evidence to
recommendations
 Simple, transparent, systematic
 four categories of quality of evidence
 two grades for strength of recommendations
 Transparency in decision making and judgments is
key
Assessing the quality of
evidence
1. Design and Execution
 limitations
 lack of concealment
 intention to treat principle violated
 inadequate blinding
 loss to follow-up
 early stopping for benefit
 selective outcome reporting
 Example: RCT suggests that danaparoid sodium is of
benefit in treating HIT complicated by thrombosis
 Key outcome: clinicians’ assessment of when the
thromboembolism had resolved
 Not blinded – subjective judgement
2. Inconsistency of results
(Heterogeneity)
 if inconsistency, look for explanation
 patients, intervention, outcome, methods
 unexplained inconsistency downgrade quality
 Bleeding in thrombosis-prophylaxed hospitalized
patients
 seven RCTs
 4 lower, 3 higher risk
Example: Bleeding in the
hospital
Dentali et al. Ann Int Med, 2007
 Judgment
 variation in size of effect
 overlap in confidence intervals
 statistical significance of heterogeneity
 I²
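For the last item in the list above, I² can be computed from Cochran's Q as max(0, (Q - df) / Q) × 100%, where df is the number of studies minus one. The sketch below is a minimal illustration with hypothetical helper names; it is not tied to the Dentali et al. data, which are not reproduced in these slides.

```python
def cochran_q(log_effects, standard_errors):
    """Cochran's Q: weighted squared deviations from the pooled estimate."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * y for w, y in zip(weights, log_effects)) / sum(weights)
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_effects))


def i_squared(q, n_studies):
    """Percentage of variation across studies beyond what chance would explain."""
    df = n_studies - 1
    if q <= df:
        return 0.0
    return (q - df) / q * 100.0


if __name__ == "__main__":
    # Hypothetical log relative risks and standard errors for seven trials.
    effects = [-0.4, -0.2, -0.1, 0.0, 0.3, 0.5, 0.6]
    errors = [0.2, 0.25, 0.3, 0.2, 0.25, 0.3, 0.2]
    q = cochran_q(effects, errors)
    print(f"Q = {q:.2f}, I2 = {i_squared(q, len(effects)):.0f}%")
```

I² is one input to the judgment, alongside the variation in effect sizes and the overlap of the confidence intervals.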
Heparin or vitamin K
antagonists for survival in
patients with cancer
Akl E, Barba M, Rohilla S, Terrenato I, Sperati F, Schünemann HJ. “Anticoagulation for the long term treatment of venous
thromboembolism in patients with cancer”. Cochrane Database Syst Rev. 2008 Apr 16;(2):CD006650.
Non-steroidal drug use and
risk of pancreatic cancer
Capurso G, Schünemann HJ, Terrenato I, Moretti A, Koch M, Muti P, Capurso L, Delle Fave G.
Meta-analysis: the use of non-steroidal anti-inflammatory drugs and pancreatic cancer risk for different exposure categories.
Aliment Pharmacol Ther. 2007 Oct 15;26(8):1089-99.
3. Directness of Evidence
 differences in
 populations/patients (mild versus severe COPD, older,
sicker or more co-morbidity)
 interventions (all inhaled steroids, new vs. old)
 outcomes (important vs. surrogate; long-term health-related
quality of life, short-term functional capacity,
laboratory exercise, spirometry)
 indirect comparisons
 interested in A versus B
 have A versus C and B versus C
 formoterol versus salmeterol versus tiotropium
Directness
interested in A versus B
available data A vs C, B vs C
Alendronate
Risedronate
Placebo
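When only A versus C and B versus C are available, one common way to estimate A versus B is an adjusted indirect comparison: subtract the log effects and add the variances. The sketch below assumes this approach for illustration (the slides do not name a method), and all numbers are hypothetical.

```python
import math

Z_95 = 1.96  # normal quantile for a 95% confidence interval


def indirect_comparison(log_rr_ac, se_ac, log_rr_bc, se_bc):
    """Estimate A versus B from A versus C and B versus C.

    Returns (rr_ab, ci_low, ci_high). The variances add, so the indirect
    estimate is always less precise than either direct comparison.
    """
    log_rr_ab = log_rr_ac - log_rr_bc
    se_ab = math.sqrt(se_ac ** 2 + se_bc ** 2)
    return (
        math.exp(log_rr_ab),
        math.exp(log_rr_ab - Z_95 * se_ab),
        math.exp(log_rr_ab + Z_95 * se_ab),
    )


if __name__ == "__main__":
    # Hypothetical inputs: each drug versus placebo.
    rr, low, high = indirect_comparison(math.log(0.55), 0.12, math.log(0.65), 0.15)
    print(f"A vs B: RR = {rr:.2f} ({low:.2f} to {high:.2f})")
```

The wider interval is one reason indirect comparisons lower confidence in the estimate.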
4. Publication Bias &
Imprecision
 Publication bias
 number of small studies
[Figure: I.V. Mg in acute myocardial infarction. ISIS-4 (Lancet 1995) compared with meta-analysis (Yusuf S. Circulation 1993). Publication bias]
Egger M, Smith DS. BMJ 1995;310:752-54
Funnel plot
[Figure: standard error (0 to 3) plotted against odds ratio (0.1 to 10). Symmetrical: no publication bias]
Egger M, Cochrane Colloquium Lyon 2001
Funnel plot
[Figure: standard error (0 to 3) plotted against odds ratio (0.1 to 10). Asymmetrical: publication bias?]
Egger M, Cochrane Colloquium Lyon 2001
Meta-analysis confirmed by megatrials
Egger M, Smith DS. BMJ 1995;310:752-54
Publication bias (File
Drawer Problem)
 Faster and multiple publication of “positive”
trials
 Fewer and slower publication of “negative”
trials
5. Imprecision
 small sample size
 small number of events
 wide confidence intervals
 uncertainty about magnitude of effect
 how to decide if CI too wide?
 grade down one level?
 grade down two levels?
 extent to which confidence in estimate of effect
adequate to support decision
Example: Bleeding in the
hospital
Dentali et al. Ann Int Med, 2007
Offer all effective
treatments?
 atrial fibrillation at risk of stroke
 warfarin increases serious GI bleeding
 3% per year
 1,000 patients, 1 fewer stroke
 30 more bleeds for each stroke prevented
 1,000 patients, 100 fewer strokes
 3 strokes prevented for each bleed
 where is your threshold?
 how many strokes in 100 with 3% bleeding?
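The arithmetic behind the two scenarios above can be written out as a short sketch. The function names are hypothetical; the inputs (1,000 patients, 3% serious bleeding per year, 1 or 100 strokes prevented) are the figures from the slide.

```python
def extra_bleeds(n_patients, annual_bleed_risk):
    """Expected additional serious bleeds per year in the treated group."""
    return n_patients * annual_bleed_risk


def bleeds_per_stroke_prevented(n_patients, annual_bleed_risk, strokes_prevented):
    """How many bleeds are caused for each stroke that is prevented."""
    return extra_bleeds(n_patients, annual_bleed_risk) / strokes_prevented


# Low-risk scenario: 1 stroke prevented per 1,000 patients.
low_risk = bleeds_per_stroke_prevented(1000, 0.03, 1)

# High-risk scenario: 100 strokes prevented per 1,000 patients.
high_risk = bleeds_per_stroke_prevented(1000, 0.03, 100)
strokes_per_bleed = 1 / high_risk

print(f"low risk: {low_risk:.0f} bleeds per stroke prevented")
print(f"high risk: {strokes_per_bleed:.1f} strokes prevented per bleed")
```

The same 3% bleeding risk gives about 30 bleeds per stroke prevented in one group and roughly 3 strokes prevented per bleed in the other, which is why the threshold depends on baseline risk and on how patients value each outcome.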
What can raise quality?
1. large magnitude can upgrade (RRR 50%)
 very large two levels (RRR 80%)
 common criteria
 everyone used to do badly
 almost everyone does well
 oral anticoagulation for mechanical heart valves
 insulin for diabetic ketoacidosis
 hip replacement for severe osteoarthritis
2. dose response relation
(higher INR – increased bleeding)
3. all plausible confounding may be working to reduce the
demonstrated effect or increase the effect if no effect was
observed
All plausible confounding
would result in an underestimate of
the treatment effect
 Higher death rates in private for-profit versus
private not-for-profit hospitals
 patients in the not-for-profit hospitals likely sicker
than those in the for-profit hospitals
 for-profit hospitals are likely to admit a larger
proportion of well-insured patients than not-for-profit
hospitals (and thus have more resources
with a spill-over effect)
All plausible biases
would result in an overestimate of
effect
 Hypoglycaemic drug phenformin causes
lactic acidosis
 The related agent metformin is under
suspicion for the same toxicity.
 Large observational studies have failed to
demonstrate an association
 Clinicians would be more alert to lactic acidosis in
the presence of the agent
Hands-on exercise
 Work in small groups
 Select someone to report back to the whole group
 Watch the time
 Read the following instructions
1. Familiarize yourself with the review that you
have been given (if you are not yet familiar
with it): read the abstract
2. Identify the clinical question in the PICO
(Population, Intervention, Comparison,
Outcome) format. Work in your small group.
Hands-on exercise cont’d
3. Select up to 7 important outcomes for this
comparison (consider the suggestions)
 transfer them to a blank evidence profile or
use GRADEpro (see worksheet 3 on page 6 of
this handout or go to ). Work in your small
group or in pairs.
4. Assess the quality of evidence for an
outcome according to the GRADE approach
5. Move from evidence to recommendations
using the evidence profile
Holger Schünemann, MD, PhD
Professor
GRADE FROM EVIDENCE TO
RECOMMENDATIONS
Evidence based clinical
decisions
Patient values
and preferences
Clinical state and
circumstances
Expertise
Research evidence
Equal for all
Haynes et al. 2002
The GRADE approach
Separation of 2 issues:
1) 4 categories of quality of evidence: very low, low,
moderate, or high quality?
   methodological quality of evidence
   likelihood of bias
   by outcome and across outcomes
2) Recommendation: 2 grades - weak or strong (for
or against)?
 Quality of evidence only one factor
www.GradeWorkingGroup.org
Grades of recommendation:
Strength of recommendations
Strong recommendations
 high quality methods with large precise effect
 benefits much greater than downsides, or downsides
much greater than benefits
 we recommend
 Grade 1
Weak/conditional recommendations
 lower quality methods
 benefits not clearly greater or smaller than downsides
 values and preferences uncertain or very variable
 we suggest
 Grade 2
Case scenario
A 13 year old girl who lives in rural Indonesia presented
with flu symptoms and developed severe respiratory
distress over the course of the last 2 days. She
required intubation. The history reveals that she
shares her living quarters with her parents and her
three siblings. At night the family’s chicken stock
shares this room too and several chickens had died
unexpectedly a few days before the girl fell sick.
Relevant clinical question?
Example from a not so common disease
Clinical question:
Population: Avian Flu/influenza A (H5N1) patients
Intervention: Oseltamivir (or Zanamivir)
Comparison: No pharmacological intervention
Outcomes: Mortality, hospitalizations, resource use, adverse outcomes, antimicrobial resistance
Schunemann et al. The Lancet ID, 2007
Methods – WHO Rapid Advice Guidelines
for management of Avian Flu
 Applied findings of a recent systematic evaluation of
guideline development for WHO/ACHR
 Group composition (including panel of 13 voting
members):
   clinicians who treated influenza A(H5N1) patients
   infectious disease experts
   basic scientists
   public health officers
   methodologists
 Independent scientific reviewers:
 Identified systematic reviews, recent RCTs, case series,
animal studies related to H5N1 infection
Evidence Profile
Oseltamivir for treatment of H5N1 infection
Columns: Quality assessment (No of studies (Ref), Design, Limitations, Consistency, Directness, Other considerations); Summary of findings (No of patients: Oseltamivir, Placebo; Effect: Relative (95% CI), Absolute (95% CI)); Quality; Importance

Healthy adults:
 Mortality: 0 studies
 Hospitalisation (Hospitalisations from influenza – influenza cases only): 5 studies (TJ 06); randomised trial; no limitations; one trial only; directness: major uncertainty (-2); imprecise or sparse data (-1); OR 0.22 (0.02 to 2.16); quality: very low
 Duration of hospitalization: 0 studies
 LRTI (Pneumonia - influenza cases only): 5 studies (TJ 06); randomised trial; no limitations; one trial only; directness: major uncertainty (-2); imprecise or sparse data (-1); oseltamivir 2/982 (0.2%) vs placebo 9/662 (1.4%); RR 0.149 (0.03 to 0.69); quality: very low
 Duration of disease (Time to alleviation of symptoms/median time to resolution of symptoms – influenza cases only): 5 studies (TJ 06) (DT 03); randomised trials; no limitations; important inconsistency (-1); directness: major uncertainty (-2); HR 1.30 (1.13 to 1.50); quality: very low
 Viral shedding (Mean nasal titre of excreted virus at 24h): 2 studies (TJ 06); randomised trials; no limitations; directness: major uncertainty (-2); other considerations: none; WMD -0.73 (-0.99 to -0.47); quality: low
 Outbreak control: 0 studies
 Resistance: 0 studies
 Serious adverse effects (Mention of significant or serious adverse effects): 0 studies
 Minor adverse effects (number and seriousness of adverse effects): 3 studies (TJ 06); randomised trials; no limitations; directness: some uncertainty (-1); imprecise or sparse data (-1); OR range (0.56 to 1.80); quality: low
 Cost of drugs: 0 studies
Oseltamivir for Girl with
Avian Flu
Summary of findings:
 No clinical trial of oseltamivir for treatment of
H5N1 patients.
 4 systematic reviews and health technology
assessments (HTA) reporting on 5 studies of
oseltamivir in seasonal influenza.
   Hospitalization: OR 0.22 (0.02 – 2.16)
   Pneumonia: OR 0.15 (0.03 – 0.69)
 3 published case series.
 Many in vitro and animal studies.
 No alternative that is more promising at present.
 Cost: ~ Euro 40$ per treatment course
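The pneumonia estimate can be checked against the event counts given in the evidence profile (2/982 with oseltamivir, 9/662 with placebo). The sketch below uses the standard log-scale approximation for the confidence interval of a risk ratio; whether the source review used exactly this method is an assumption, and the function name is hypothetical.

```python
import math

Z_95 = 1.96  # normal quantile for a 95% confidence interval


def relative_risk(events_treated, n_treated, events_control, n_control):
    """Risk ratio with an approximate 95% CI computed on the log scale."""
    rr = (events_treated / n_treated) / (events_control / n_control)
    se_log_rr = math.sqrt(
        1 / events_treated - 1 / n_treated + 1 / events_control - 1 / n_control
    )
    low = math.exp(math.log(rr) - Z_95 * se_log_rr)
    high = math.exp(math.log(rr) + Z_95 * se_log_rr)
    return rr, low, high


# Pneumonia (LRTI) counts from the evidence profile.
rr, low, high = relative_risk(2, 982, 9, 662)
print(f"RR = {rr:.3f} (95% CI {low:.2f} to {high:.2f})")
```

With only 11 events in total the interval is wide, which is what the "imprecise or sparse data" downgrade in the profile reflects.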
Example: Oseltamivir for
Avian Flu
Recommendation: In patients with confirmed or
strongly suspected infection with avian influenza A
(H5N1) virus, clinicians should administer
oseltamivir treatment as soon as possible (?????
recommendation, very low quality evidence).
Schunemann et al. The Lancet ID, 2007
Example: Oseltamivir for
Avian Flu
Recommendation: In patients with confirmed or
strongly suspected infection with avian influenza A
(H5N1) virus, clinicians should administer
oseltamivir treatment as soon as possible (strong
recommendation, very low quality evidence).
Values and Preferences
Remarks: This recommendation places a high
value on the prevention of death in an illness
with a high case fatality. It places relatively low
values on adverse reactions, the development
of resistance and costs of treatment.
Schunemann et al. The Lancet ID, 2007
Other explanations
Remarks: Despite the lack of controlled treatment
data for H5N1, this is a strong recommendation, in
part, because there is a lack of known effective
alternative pharmacological interventions at this
time.
The panel voted on whether this recommendation
should be strong or weak and there was one
abstention and one dissenting vote.
ACCP: Acute coronary syndrome
Would a recommendation ignoring V & P be
possible?
For all patients presenting with NSTE ACS, without
a clear allergy to aspirin, we recommend
immediate aspirin, 75 to 325 mg po, and then daily,
75 to 162 mg po (strong recommendation, high
quality evidence).
Value sensitive
recommendation
 Idiopathic deep venous thrombosis (DVT) is a potentially life-threatening
condition
 Patients usually receive blood thinners for one year
 Continuing therapy will reduce a patient's absolute risk for
recurrent DVT by 7% per year for several years
 The burdens include:
 taking a blood thinner daily
 keeping dietary intake of vitamin K constant
 blood tests to monitor the intensity of anticoagulation
 living with increased risk of minor and major bleeding
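One way to put the 7% per year absolute risk reduction next to the burdens listed above is the number needed to treat, which is the reciprocal of the absolute risk reduction. The slide does not state this figure; the sketch and its function name are added here for illustration.

```python
def number_needed_to_treat(absolute_risk_reduction):
    """Patients who must be treated for one of them to avoid the event."""
    if not 0 < absolute_risk_reduction <= 1:
        raise ValueError("absolute risk reduction must be in (0, 1]")
    return 1 / absolute_risk_reduction


# 7% absolute risk reduction per year, as stated on the slide.
nnt = number_needed_to_treat(0.07)
print(f"about {nnt:.0f} patients treated for a year per recurrent DVT avoided")
```

Every treated patient carries the daily burden and bleeding risk, while about one in fourteen avoids a recurrent DVT each year, which is why the decision depends on individual values.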
Value sensitive
recommendations
→ Patients who are very averse to a recurrent DVT would
consider the benefits of avoiding DVT worth the downsides of
taking warfarin. Other patients are likely to consider the benefit
not worth the harms and burden.
For patients with idiopathic DVT, without elevated bleeding risk,
we suggest long term warfarin therapy (weak recommendation,
high quality evidence).
Values and preferences: this recommendation ascribes a low
value to bleeding complications and burden from therapy and a
high value to avoiding DVTs
Quality assessment criteria

Quality of evidence   Study design
High                  Randomised trial
Moderate
Low                   Observational study
Very low              Any other evidence

Lower if:
 Study quality: serious limitations; very serious limitations
 Important inconsistency
 Directness: some uncertainty; major uncertainty
 Sparse or imprecise data
 High probability of reporting bias

Higher if:
 Strong association: strong, no plausible confounders; very strong, no major threats to validity
 Evidence of a dose response gradient
 All plausible confounders would have reduced the effect
Strength of recommendation
 “The strength of a recommendation reflects
the extent to which we can, across the range
of patients for whom the recommendations
are intended, be confident that desirable
effects of a management strategy outweigh
undesirable effects.”
 Strong or weak
Quality of evidence &
strength of recommendation
 Linked, but one does not automatically determine the other
 Other factors beyond the quality of evidence
influence our confidence that adherence to a
recommendation causes more benefit than harm
 Systems/approaches failed to make this explicit
 GRADE separates quality of evidence from
strength of recommendation
Implications of
a strong recommendation
 Patients: Most people in this situation would want
the recommended course of action and only a small
proportion would not
 Clinicians: Most patients should receive the
recommended course of action
 Policy makers: The recommendation can be
adopted as a policy in most situations
Implications of
a weak recommendation
 Patients: The majority of people in this situation
would want the recommended course of action,
but many would not
 Clinicians: Be prepared to help patients to make a
decision that is consistent with their own values;
decision aids and shared decision making can help
 Policy makers: There is a need for substantial
debate and involvement of stakeholders
Respiratory disease guidelines ?
Factors determining strength
of recommendation
Factors that can strengthen a recommendation, with comments:
 Quality of the evidence: The higher the quality of evidence, the more likely is a strong recommendation.
 Balance between desirable and undesirable effects: The larger the difference between the desirable and undesirable consequences, the more likely a strong recommendation is warranted. The smaller the net benefit and the lower the certainty for that benefit, the more likely is a weak recommendation.
 Values and preferences: The greater the variability in values and preferences, or uncertainty in values and preferences, the more likely a weak recommendation is warranted.
 Costs (resource allocation): The higher the costs of an intervention (that is, the more resources consumed), the less likely a strong recommendation is warranted.
Values & Preferences
 Patients’ perspectives, beliefs, expectations,
and goals for health and life.
 Underlying processes used in considering the
benefits, harms, costs, and inconveniences
patients will experience with each
management option and the resulting
preferences for each option.
Relative importance of
outcomes and management
approaches
 Guideline panels should be explicit about the relative
value they place on the range of relevant patient-important
outcomes. If values and preferences vary
widely, a strong recommendation becomes less likely
 Example: Patients vary widely in their view of how
aversive they find the risk of a stroke versus the risk of
a gastrointestinal bleed when deciding about oral
anticoagulation for atrial fibrillation.
Desirable and undesirable
effects
 desirable effects
 Mortality
 improvement in quality of life, fewer hospitalizations
 reduction in the burden of treatment
 reduced resource expenditure
 undesirable consequences
 deleterious impact on morbidity, mortality or quality
of life, increased resource expenditure
Conclusion
 clinicians, policy makers need summaries
 quality of evidence
 strength of recommendations
 explicit rules
 transparent, informative
 GRADE
 four categories of quality of evidence
 two grades for strength of recommendations
 transparent, systematic by and across outcomes
 applicable to diagnosis
 wide adoption
Questions for you
 Are systematic reviews for every recommendation
in your guidelines a reality/possibility?
 What about cost – how do you deal with cost and
how should we deal with it?
Content
 Study design – bias
 Levels/quality of evidence - GRADE
 Guidelines/Recommendations
Confidence in evidence
 There always is evidence
 “When there is a question there is evidence”
 Better research → greater confidence in the
evidence and decisions
 Evidence alone is never sufficient to make a
clinical decision
Evidence based clinical decisions
[Figure: model with four components: patient values and preferences, clinical state and circumstances, expertise, and research evidence. Annotation: "Equal for all". Haynes et al. 2002]
About GRADE
 Since 2000
 Researchers/guideline developers with interest
in methodology
 Aim: to develop a common, transparent and
sensible system for grading the quality of
evidence and the strength of
recommendations
 Evaluation of existing systems
GRADE Evidence Profiles
The GRADE approach
Clear separation of 2 issues:
1) 4 categories of quality of evidence: very low, low,
moderate, or high quality?
 methodological quality of evidence
 likelihood of bias
 by outcome and across outcomes
2) Recommendation: 2 grades - weak or strong (for
or against)?
 Quality of evidence only one factor
www.GradeWorkingGroup.org
Determinants of quality
 RCTs start high
 observational studies start low
 what can lower quality?
1. detailed design and execution
2. inconsistency
3. indirectness
4. reporting bias
5. imprecision
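The starting levels and lowering factors above can be sketched in a few lines of Python. This is an illustrative sketch only, not the logic of any GRADE tool: the function name `rate_quality`, the design labels, and the dictionary of downgrades are assumptions made for the example, and only the lowering factors listed on this slide are modelled.

```python
# Illustrative sketch of the downgrading logic described above.
# Names (rate_quality, design labels) are hypothetical, not part of any GRADE tool.

LEVELS = ["very low", "low", "moderate", "high"]

# Factors listed on the slide that can lower quality
DOWNGRADE_FACTORS = {
    "design_and_execution",
    "inconsistency",
    "indirectness",
    "reporting_bias",
    "imprecision",
}


def rate_quality(design: str, downgrades: dict[str, int]) -> str:
    """Return the quality category after applying downgrades.

    design: "rct" (starts high) or "observational" (starts low).
    downgrades: levels to subtract per factor, e.g. {"indirectness": 2}.
    """
    start = {"rct": 3, "observational": 1}[design]
    for factor, levels in downgrades.items():
        if factor not in DOWNGRADE_FACTORS:
            raise ValueError(f"unknown factor: {factor}")
        if levels not in (1, 2):
            raise ValueError("each factor lowers quality by 1 or 2 levels")
    # Quality cannot fall below "very low"
    index = max(0, start - sum(downgrades.values()))
    return LEVELS[index]


print(rate_quality("rct", {}))                                     # high
print(rate_quality("rct", {"indirectness": 2}))                    # low
print(rate_quality("rct", {"indirectness": 2, "imprecision": 1}))  # very low
print(rate_quality("observational", {}))                           # low
```

The judgment about whether a factor applies, and by how many levels, stays with the people grading the evidence; the code only does the bookkeeping.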
The GRADE approach
Separation of 2 issues:
1) 4 categories of quality of evidence: very low, low,
moderate, or high quality?
 methodological quality of evidence
 likelihood of bias
 by outcome and across outcomes
2) Recommendation: 2 grades - weak or strong (for
or against)?
 Quality of evidence only one factor
www.GradeWorkingGroup.org
Grades of recommendation:
Strength of recommendations
Strong recommendations
 high quality methods with large precise effect
 benefits much greater than downsides, or downsides
much greater than benefits
 we recommend
 Grade 1
Weak/conditional recommendations
 lower quality methods
 benefits not clearly greater or smaller than downsides
 values and preferences uncertain or very variable
 we suggest
 Grade 2
Case scenario
A 13-year-old girl who lives in rural Indonesia presented
with flu symptoms and developed severe respiratory
distress over the course of the last 2 days. She
required intubation. The history reveals that she
shares her living quarters with her parents and her
three siblings. At night the family’s chicken stock
shares this room too, and several chickens had died
unexpectedly a few days before the girl fell sick.
Relevant clinical question?
Example from a not so common disease
Clinical question:
Population: Avian Flu/influenza A (H5N1) patients
Intervention: Oseltamivir (or Zanamivir)
Comparison: No pharmacological intervention
Outcomes: Mortality, hospitalizations, resource use, adverse outcomes, antimicrobial resistance
Schunemann et al. The Lancet ID, 2007
Methods – WHO Rapid Advice Guidelines
for management of Avian Flu
 Applied findings of a recent systematic evaluation of
guideline development for WHO/ACHR
 Group composition (including panel of 13 voting
members):
 clinicians who treated influenza A(H5N1) patients
 infectious disease experts
 basic scientists
 public health officers
 methodologists
 Independent scientific reviewers:
 Identified systematic reviews, recent RCTs, case series,
animal studies related to H5N1 infection
Evidence Profile
Oseltamivir for treatment of H5N1 infection
Columns: Quality assessment (No of studies (Ref), Design, Limitations, Consistency, Directness, Other considerations); Summary of findings (No of patients: Oseltamivir, Placebo; Effect: Relative (95% CI), Absolute (95% CI)); Quality; Importance
Healthy adults:
 Mortality: 0 studies; importance 9
 Hospitalisation (hospitalisations from influenza, influenza cases only): 5 (TJ 06); randomised trial; no limitations; consistency: one trial only; directness: major uncertainty (-2); other considerations: imprecise or sparse data (-1); OR 0.22 (0.02 to 2.16); quality: very low; importance 6
 Duration of hospitalization: 0 studies; importance 7
 LRTI (pneumonia, influenza cases only): 5 (TJ 06); randomised trial; no limitations; consistency: one trial only; directness: major uncertainty (-2); other considerations: imprecise or sparse data (-1); oseltamivir 2/982 (0.2%) vs placebo 9/662 (1.4%); RR 0.149 (0.03 to 0.69); quality: very low; importance 8
 Duration of disease (time to alleviation of symptoms/median time to resolution of symptoms, influenza cases only): 5 (TJ 06) (DT 03); randomised trials; no limitations; consistency: important inconsistency (-1); directness: major uncertainty (-2); HR 1.30 (1.13 to 1.50); quality: very low; importance 5
 Viral shedding (mean nasal titre of excreted virus at 24h): 2 (TJ 06); randomised trials; no limitations; directness: major uncertainty (-2); other considerations: none; WMD -0.73 (-0.99 to -0.47); quality: low; importance 4
 Outbreak control: 0 studies; importance 4
 Resistance: 0 studies; importance 7
 Serious adverse effects (mention of significant or serious adverse effects): 0 studies; importance 7
 Minor adverse effects (number and seriousness of adverse effects): 3 (TJ 06); randomised trials; no limitations; directness: some uncertainty (-1); other considerations: imprecise or sparse data (-1); OR range (0.56 to 1.80); quality: low
 Cost of drugs: 0 studies; importance 4
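As a check on the pneumonia (LRTI) entry in the evidence profile, the relative risk and its 95% confidence interval can be recomputed from the event counts 2/982 and 9/662. The sketch below uses the standard log-scale (Wald) interval for a risk ratio; whether the underlying review used exactly this method is an assumption, and the function name `risk_ratio` is illustrative.

```python
import math


def risk_ratio(events_t: int, n_t: int, events_c: int, n_c: int, z: float = 1.96):
    """Risk ratio with a Wald confidence interval computed on the log scale."""
    rr = (events_t / n_t) / (events_c / n_c)
    # Standard error of log(RR)
    se = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper


# Counts from the LRTI entry: oseltamivir 2/982, placebo 9/662
rr, lower, upper = risk_ratio(2, 982, 9, 662)
print(f"RR {rr:.2f} ({lower:.2f} to {upper:.2f})")  # RR 0.15 (0.03 to 0.69)
```

With these counts the result agrees with the interval shown in the profile (0.03 to 0.69). The interval is wide because only 11 events occurred in total, which is why the outcome is marked as imprecise.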
Oseltamivir for Girl with
Avian Flu
Summary of findings:
 No clinical trial of oseltamivir for treatment of
H5N1 patients.
 4 systematic reviews and health technology
assessments (HTA) reporting on 5 studies of
oseltamivir in seasonal influenza.
 Hospitalization: OR 0.22 (0.02 – 2.16)
 Pneumonia: OR 0.15 (0.03 - 0.69)
 3 published case series.
 Many in vitro and animal studies.
 No alternative that is more promising at present.
Cost: ~ Euro 40$ per treatment course
Determinants of the strength
of recommendation
Factors that can strengthen a recommendation, each with a comment:
 Quality of the evidence: The higher the quality of evidence, the more likely is a strong recommendation.
 Balance between desirable and undesirable effects: The larger the difference between the desirable and undesirable consequences, the more likely a strong recommendation is warranted. The smaller the net benefit and the lower the certainty for that benefit, the more likely a weak recommendation is warranted.
 Values and preferences: The greater the variability in values and preferences, or uncertainty in values and preferences, the more likely a weak recommendation is warranted.
 Costs (resource allocation): The higher the costs of an intervention (that is, the more resources consumed), the less likely a strong recommendation is warranted.
Example: Oseltamivir for
Avian Flu
Recommendation: In patients with confirmed or
strongly suspected infection with avian influenza A
(H5N1) virus, clinicians should administer
oseltamivir treatment as soon as possible (?????
recommendation, very low quality evidence).
Schunemann et al. The Lancet ID, 2007
Example: Oseltamivir for
Avian Flu
Recommendation: In patients with confirmed or
strongly suspected infection with avian influenza A
(H5N1) virus, clinicians should administer
oseltamivir treatment as soon as possible (strong
recommendation, very low quality evidence).
Values and Preferences
Remarks: This recommendation places a high
value on the prevention of death in an illness
with a high case fatality. It places relatively low
values on adverse reactions, the development
of resistance and costs of treatment.
Schunemann et al. The Lancet ID, 2007
Other explanations
Remarks: Despite the lack of controlled treatment
data for H5N1, this is a strong recommendation, in
part, because there is a lack of known effective
alternative pharmacological interventions at this
time.
The panel voted on whether this recommendation
should be strong or weak and there was one
abstention and one dissenting vote.
ACCP: Acute coronary syndrome
Would a recommendation ignoring V & P be
possible?
For all patients presenting with NSTE ACS, without
a clear allergy to aspirin, we recommend
immediate aspirin, 75 to 325 mg po, and then daily,
75 to 162 mg po (strong recommendation, high
quality evidence).
Value sensitive
recommendation
 Idiopathic deep venous thrombosis (DVT) is a potentially
life-threatening condition
 Patients usually receive blood thinners for one year
 Continuing therapy will reduce a patient's absolute risk for
recurrent DVT by 7% per year for several years
 The burdens include:
 taking a blood thinner daily
 keeping dietary intake of vitamin K constant
 blood tests to monitor the intensity of anticoagulation
 living with increased risk of minor and major bleeding
Value sensitive
recommendations
→ Patients who are very averse to a recurrent DVT would
consider the benefits of avoiding DVT worth the downsides of
taking warfarin. Other patients are likely to consider the benefit
not worth the harms and burden.
 For patients with idiopathic DVT, without elevated bleeding
risk, we suggest long term warfarin therapy (weak
recommendation, high quality evidence).
Values and preferences: this recommendation ascribes a low
value to bleeding complications and burden from therapy and a
high value to avoiding DVTs
Background to GRADEpro
People faced with a health care decision need summaries,
including information about the quality of evidence
 Detailed summaries: providing guidance for large
audiences, e.g. guideline panels – detailed profiles
 Usable shorter summaries: users of systematic reviews,
e.g. clinicians
How do I create a summary of
findings table or evidence profile
 GRADEpro – software to create SoF
 Import data from RevMan 5 into GRADEpro
 Create table – author makes suggestions
about information to present and GRADEs
the evidence
 Export table from GRADEpro and import into
RevMan 5
Creating a new GRADEpro
file
Profile groups
Profiles
Profiles: Questions
Importing a RevMan 5 file
of a systematic review
Imported data from RevMan 5 file:
• outcomes
• meta-analyses results
• bibliographic information
Managing outcomes to include a
maximum of 7
Entering/editing information for dichotomous
outcomes
Entering/editing information to grade the quality of the
evidence
2. Consistency of results
 Look for explanation for inconsistency
 patients, intervention, comparator, outcome, methods
 Judgment
 variation in size of effect
 overlap in confidence intervals
 statistical significance of heterogeneity
 I²
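The heterogeneity statistics named above can be computed from study-level effect estimates. The sketch below is a minimal illustration of Cochran's Q and I² under a fixed-effect (inverse-variance) weighting; the function name `heterogeneity` and the input data are hypothetical and are not taken from any study in this talk.

```python
def heterogeneity(effects: list[float], variances: list[float]) -> tuple[float, float]:
    """Return Cochran's Q and I² (in percent) for study-level effect estimates."""
    # Inverse-variance weights
    weights = [1 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    # Q: weighted squared deviations of each study from the pooled estimate
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I²: share of total variation beyond what chance alone would produce
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2


# Hypothetical log odds ratios and their variances from three studies
effects = [0.1, 0.3, 0.5]
variances = [0.01, 0.01, 0.01]
q, i2 = heterogeneity(effects, variances)
print(f"Q = {q:.1f}, I2 = {i2:.0f}%")  # Q = 8.0, I2 = 75%
```

A high I² is a prompt to look for an explanation (patients, intervention, comparator, outcome, methods), not an automatic reason to downgrade.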
3. Directness of Evidence
 indirect comparisons
 interested in A versus B
 have A versus C and B versus C
 differences in
 patients
 interventions
 outcomes
Directness of Evidence
Table 5. Sources of likely indirectness of evidence
Source of indirectness: Indirect comparison
Question of interest: Early emergency department systemic corticosteroids to treat acute exacerbations in adult patients with asthma
Example: Both oral and intravenous routes are effective but there is no direct comparison of these two routes of administration in adults.
Source of indirectness: Differences in populations
Question of interest: Anti-leukotrienes plus inhaled glucocorticosteroids vs. inhaled glucocorticosteroids alone to prevent asthma exacerbations and nighttime symptoms in patients with chronic asthma and allergic rhinitis.
Example: Trials that measured asthma exacerbations and nighttime symptoms did not include patients with allergic rhinitis.
Source of indirectness: Differences in intervention
Question of interest: Avoidance of pet allergens in non-allergic infants or preschool children to prevent development of allergy.
Example: Available studies used multifaceted interventions directed at multiple potential risk factors in addition to pet avoidance.
Source of indirectness: Differences in outcomes of interest
Question of interest: Intranasal glucocorticosteroids vs. oral H1-antihistamines in children with seasonal allergic rhinitis.
Example: In the available study parents were rating the symptoms and quality of life of their teenage children, instead of the children themselves.
4. Reporting Bias
 reporting bias
 reporting of studies
 publication bias
 number of small studies
 reporting of outcomes
Example: a systematic review of topical treatments for seasonal allergic
conjunctivitis showed that patients using topical sodium cromoglycate
were more likely to perceive benefit than those using placebo. However,
only small trials reported clinically and statistically significant benefits of
active treatment, while a larger trial showed a much smaller and a
statistically not significant effect (Owen 2004 [53]). These findings suggest
that smaller studies demonstrating smaller effects might not have been
published.
5. Imprecision
 small sample size
 small number of events
 wide confidence intervals
 uncertainty about magnitude of effect
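Two of the signals above, a wide confidence interval and a small number of events, can be checked mechanically. The sketch below is an illustration under stated assumptions, not a GRADE rule: the function name `imprecision_signals` and the `min_events` threshold are hypothetical, and the final judgment about imprecision remains with the panel.

```python
def imprecision_signals(ci_lower: float, ci_upper: float,
                        total_events: int, min_events: int = 300) -> list[str]:
    """List reasons to consider downgrading a ratio estimate (OR, RR, HR) for imprecision.

    min_events is a hypothetical threshold chosen for illustration.
    """
    signals = []
    # For ratio measures, a value of 1 means "no effect"
    if ci_lower <= 1.0 <= ci_upper:
        signals.append("confidence interval includes no effect")
    if total_events < min_events:
        signals.append("small number of events")
    return signals


# Pneumonia in the oseltamivir example: interval 0.03 to 0.69, 2 + 9 events
print(imprecision_signals(0.03, 0.69, total_events=11))
# Hypothetical estimate: many events, but the interval spans 1
print(imprecision_signals(0.80, 1.25, total_events=500))
```

The first call flags only the small number of events; the second flags only the interval that includes no effect.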
Disclosure
In the past three years, Dr. Schünemann received no
personal payments for service from the pharmaceutical
industry. His research group received research grants and
- until April 2008 - fees and/or honoraria that were
deposited into research accounts from Chiesi Foundation
and Lilly, as lecture fees related to research methodology.
He is documents editor for the American Thoracic Society.
Institutions or organizations that he is affiliated with likely
receive funding from for-profit sponsors that are
supporting infrastructure and research that may serve his
work.