Methodology for Guideline Development for the 7th ACCP


Grading evidence and
recommendations
The GRADE approach
Holger Schünemann, MD, PhD
for the GRADE Working Group
Professional good intentions and
plausible theories are insufficient
for selecting policies and
practices for protecting,
promoting and restoring health.
Iain Chalmers
How can we judge the
extent of our confidence
that adherence to a
recommendation will do
more good than harm?
GRADE
Grades of Recommendation
Assessment, Development and
Evaluation
What do you know about GRADE?
• Have prepared a guideline
• Read the BMJ paper
• Have prepared a systematic review and a summary of findings table
• Have attended a GRADE meeting, workshop or talk
About GRADE
• Began as informal working group in 2000
• Researchers/guideline developers with interest in methodology
• Aim: to develop a common system for grading the quality of evidence and the strength of recommendations that is sensible and to explore the range of interventions and contexts for which it might be useful*
• 13 meetings (~10 – 35 attendees)
• Evaluation of existing systems and reliability*
• Workshops at Cochrane Colloquia, WHO and GIN since 2000
*GRADE Working Group. CMAJ 2003, BMJ 2004, BMC 2004, BMC 2005
GRADE Working Group
David Atkins, chief medical officer (a)
Dana Best, assistant professor (b)
Peter A Briss, chief (c)
Martin Eccles, professor (d)
Yngve Falck-Ytter, associate director (e)
Signe Flottorp, researcher (f)
Gordon H Guyatt, professor (g)
Robin T Harbour, quality and information director (h)
Margaret C Haugh, methodologist (i)
David Henry, professor (j)
Suzanne Hill, senior lecturer (j)
Roman Jaeschke, clinical professor (k)
Gillian Leng, guidelines programme director (l)
Alessandro Liberati, professor (m)
Nicola Magrini, director (n)
James Mason, professor (d)
Philippa Middleton, honorary research fellow (o)
Jacek Mrukowicz, executive director (p)
Dianne O’Connell, senior epidemiologist (q)
Andrew D Oxman, director (f)
Bob Phillips, associate fellow (r)
Holger J Schünemann, associate professor (g, s)
Tessa Tan-Torres Edejer, medical officer/scientist (t)
Helena Varonen, associate editor (u)
Gunn E Vist, researcher (f)
John W Williams Jr, associate professor (v)
Stephanie Zaza, project director (w)

a) Agency for Healthcare Research and Quality, USA
b) Children's National Medical Center, USA
c) Centers for Disease Control and Prevention, USA
d) University of Newcastle upon Tyne, UK
e) German Cochrane Centre, Germany
f) Norwegian Centre for Health Services, Norway
g) McMaster University, Canada
h) Scottish Intercollegiate Guidelines Network, UK
i) Fédération Nationale des Centres de Lutte Contre le Cancer, France
j) University of Newcastle, Australia
k) McMaster University, Canada
l) National Institute for Clinical Excellence, UK
m) Università di Modena e Reggio Emilia, Italy
n) Centro per la Valutazione della Efficacia della Assistenza Sanitaria, Italy
o) Australasian Cochrane Centre, Australia
p) Polish Institute for Evidence Based Medicine, Poland
q) The Cancer Council, Australia
r) Centre for Evidence-based Medicine, UK
s) National Cancer Institute, Italy
t) World Health Organisation, Switzerland
u) Finnish Medical Society Duodecim, Finland
v) Duke University Medical Center, USA
w) Centers for Disease Control and Prevention, USA
Why guidelines?
Guideline users look for different things
• just tell me what to do (recommendation)
• what to do, and on strong or weak grounds
– recommendation and grade
• recommend, grade, evidence summary, values
– systematic review, value statement
• evidence from individual studies
When to make a recommendation?
– never
• patient values differ
• just lay out benefits and risks
– when evidence strong enough
• when very weak, too uncertain
– clinicians need guidance
• intense study demands decision
Why bother about grading?
• People draw conclusions about the
– quality of evidence
– strength of recommendations
• Systematic and explicit approaches can help
– protect against errors
– resolve disagreements
– facilitate critical appraisal
– communicate information
• However, there is wide variation in currently used approaches
Who is confused?
Evidence    Recommendation          Organization
II-2        B                       USPSTF
C+          1                       ACCP
Strong      Strongly recommended    GCPS
Still not confused?
Recommendation for use of oral anticoagulation
in patients with atrial fibrillation and rheumatic
mitral valve disease
Evidence    Recommendation    Organization
B           Class I           AHA
C+          1                 ACCP
IV          C                 SIGN
Guidelines development process
Quality of evidence
The extent to which one can be confident that an estimate of
effect or association is correct.
It depends on the:
– study design (e.g. RCT, cohort study)
– study quality/limitations (protection against bias; e.g.
concealment of allocation, blinding, follow-up)
– consistency of results
– directness of the evidence including the
• populations (those of interest versus similar; for example, older, sicker or more co-morbidity)
• interventions (those of interest versus similar; for example, drugs within the same class)
• outcomes (important versus surrogate outcomes)
• comparison (A - C versus A - B & C - B)
Quality of evidence
The quality of the evidence (i.e. our confidence) may also be REDUCED when there is:
• Sparse or imprecise data
• Reporting bias
The quality of the evidence (i.e. our confidence) may be INCREASED when there is:
• A strong association
• A dose response relationship
• All plausible confounders would have reduced the observed effect
• All plausible biases would have increased the observed lack of effect
Quality assessment criteria
Quality of evidence    Study design
High                   Randomised trial
Moderate
Low                    Observational study
Very low               Any other evidence

Lower if:
Study quality:
-1 Serious limitations
-2 Very serious limitations
-1 Important inconsistency
Directness:
-1 Some uncertainty
-2 Major uncertainty
-1 Sparse or imprecise data
-1 High probability of reporting bias

Higher if:
Strong association:
+1 Strong, no plausible confounders
+2 Very strong, no major threats to validity
+1 Evidence of a dose response gradient
+1 All plausible confounders would have reduced the effect
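The criteria table above can be read as simple bookkeeping: start at the level implied by the study design, then move down or up by the listed points. The sketch below is only an illustration of that reading, not an official GRADE tool. The function name grade_quality, the dictionary keys and the clamping at "very low" and "high" are assumptions made for this example, and in practice each step is a judgement rather than arithmetic.

```python
# Quality levels ordered from lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]

# Starting level (index into LEVELS) by study design, as in the table.
START = {
    "randomised trial": 3,     # starts at High
    "observational study": 1,  # starts at Low
    "any other evidence": 0,   # starts at Very low
}


def grade_quality(design, downgrades=(), upgrades=()):
    """Return a quality label for one outcome.

    downgrades and upgrades are sequences of non-negative integers, e.g.
    1 for "serious limitations" or 2 for "very strong association".
    The result is clamped to the four available levels (an assumption).
    """
    score = START[design] - sum(downgrades) + sum(upgrades)
    score = max(0, min(score, len(LEVELS) - 1))
    return LEVELS[score]


# Randomised trials with serious limitations (-1) and sparse data (-1).
print(grade_quality("randomised trial", downgrades=[1, 1]))
# Observational studies showing a very strong association (+2).
print(grade_quality("observational study", upgrades=[2]))
```

Keeping the adjustments as explicit lists makes each judgement visible, which is the point of an explicit, sequential approach.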
Categories of quality
• High: Further research is very unlikely to change our confidence in the estimate of effect.
• Moderate: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
• Low: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
• Very low: Any estimate of effect is very uncertain.
Judgements about the overall quality of evidence
• Most systems not explicit
• Options:
– strongest outcome
– primary outcome
– benefits
– weighted
– separate grades for benefits and harms
– no overall grade
– weakest outcome
• Based on lowest of all the critical outcomes
• Beyond the scope of a systematic review
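The rule "based on lowest of all the critical outcomes" can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name overall_quality, the (quality, is_critical) tuple layout and the outcome names are hypothetical, and the example ratings are invented purely to show the mechanics.

```python
# Quality levels ordered from lowest to highest.
ORDER = ["very low", "low", "moderate", "high"]


def overall_quality(outcomes):
    """Return the lowest quality rating among the critical outcomes.

    outcomes maps an outcome name to a (quality, is_critical) tuple.
    Outcomes that are not critical do not affect the overall grade.
    """
    critical = [quality for quality, is_critical in outcomes.values() if is_critical]
    if not critical:
        raise ValueError("at least one critical outcome is required")
    return min(critical, key=ORDER.index)


# Hypothetical ratings, for illustration only.
example = {
    "outcome A": ("high", True),
    "outcome B": ("moderate", True),
    "outcome C": ("very low", False),  # not critical, so ignored
}
print(overall_quality(example))  # moderate
```

Deciding which outcomes count as critical is itself a judgement made by the guideline panel, which is one reason the overall grade is beyond the scope of a systematic review.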
Strength of recommendation
The extent to which one can be confident that adherence to a recommendation will do more good than harm.
• trade-offs (the relative value attached to the expected benefits, harms and costs)
• quality of the evidence
• translation of the evidence into practice in a specific setting
• uncertainty about baseline risk
Values and preferences
Where would you prefer to live?
← Option 1
(pink card)
Option 2 →
(green card)
You are hiking.
Which of the following animals
would you prefer to encounter?
← Option 1
(pink card)
Option 2 →
(green card)
You are buying an ice cream.
Which flavor do you prefer?
Strawberry
← Option 1
(pink card)
Chocolate
Option 2 →
(green card)
You are buying a new car.
Which one would you buy?
Red Ferrari
Option 2 →
(green card)
← Option 1
(pink card)
Yellow fox
Judgements about the balance between benefits and harms
• Before considering cost and making a recommendation
• For a specified setting, taking into account issues of translation into practice
Clarity of the trade-offs between benefits and the harms
• the estimated size of the effect for each main outcome
• the precision of these estimates
• the relative value attached to the expected benefits and harms
• important factors that could be expected to modify the size of the expected effects in specific settings; e.g. proximity to a hospital
Balance between benefits and harm
• Net benefits: The intervention does more good than harm.
• Trade-offs: There are important trade-offs between the benefits and harms.
• Uncertain net benefits: It is not clear whether the intervention does more good than harm.
• Not net benefits: The intervention does not do more good than harm.
Judgements about
recommendations
This should include considerations of costs; i.e. “Is the net gain (benefits-harms) worth the costs?”
• Do it
• Probably do it
• No recommendation
• Probably don’t do it
• Don’t do it
Will GRADE lead to change?
Should healthy asymptomatic postmenopausal women have been given oestrogen + progestin for prevention in 1992?
• Quality of evidence across studies for
– CHD
– Hip fracture
– Colorectal cancer
– Breast cancer
– Stroke
– Thrombosis
– Gall bladder disease
• Quality of evidence across critical outcomes
• Balance between benefits and harms
• Recommendations
Evidence profile: Quality assessment
Oestrogen + progestin for prevention in 1992
(before WHI and HERS)
Oestrogen + progestin versus usual care
Oestrogen + progestin for
prevention after WHI and HERS
Further developments
• Diagnostic tests
• Complexity
• Costs
• (Equity)
• Empirical evaluations
• GRADE Profiler
GRADE profiler (GRADEpro)
Empirical evaluations
• Critical appraisal of other systems
• Pilot test + sensibility
• “Case law” + practical experience
• Guidance for judgements
– Single studies
– Sparse data or imprecise data
• Agreement
• Validity?
• Comparisons with other systems
• Alternative presentations
Comparison of GRADE and other systems
• Explicit definitions
• Explicit, sequential judgements
• Components of quality
• Overall quality
• Relative importance of outcomes
• Balance between health benefits and harms
• Balance between incremental health benefits and costs
• Consideration of equity
• Evidence profiles
• International collaboration
• Software
• Consistent judgements?
• Communication?
Who is interested in GRADE
• WHO
• American Endocrine Society
• American College of Chest Physicians (ACCP)
• Italian National Cancer Institute
• Clinical Evidence
• Norwegian Centre for Health Services
• UpToDate
• Close relationship with Cochrane Collaboration
• American Society of Clinical Oncology (ASCO)
• Urology Associations
• American Thoracic Society
Case scenario and clinical question
• A 70-year-old man with a history of hypertension presents to the ED with right upper and lower extremity weakness and slurred speech for approximately two hours. A head CT shows no signs of intracranial bleeding. Workup for contraindications to intravenous fibrinolysis (rTPA is used in your hospital) is negative.
• In elderly men with acute stroke and treated hypertension, does thrombolytic therapy administered within 3 hours compared to no thrombolysis reduce death?
Questions?
Taking account of costs
• Include important (disaggregated) costs in evidence summaries and balance sheets when relevant
– May be useful to aggregate and value (in monetary terms)
– Always include disaggregated resource utilisation
– Note when important information is missing
– Published cost-effectiveness analyses are rarely helpful
• Assess the quality of the evidence for important costs (consumption of resources) as for other effects (Were quantities measured reliably?)
• If costs are critical to a decision, low quality evidence can lower the overall quality of evidence
• Costs are negotiable (the value of resources)
• There are many possible criteria for making a recommendation
Should activated protein C be given
to patients in severe sepsis?
An example with costs
GRADE evidence profile: Activated Protein C for sepsis
• Name: Jaeschke and Schünemann
• Date: September 2004
• Question: Should APC be used for severe sepsis?
• Setting: ICU in Paris
• Baseline risk: Severe sepsis or septic shock > 24 h
• References:
– Effectiveness: Bernard 2001. Efficacy and safety of recombinant human activated protein C for severe sepsis. NEJM 2001;344:699 and Manns 2002. An economic evaluation of activated protein C treatment for severe sepsis. NEJM 2002;347:993.
– Cost-effectiveness: Manns 2002. An economic evaluation of activated protein C treatment for severe sepsis. NEJM 2002;347:993.
Possible criteria for making a recommendation
• Treatment effect
• Adverse effects
• Cost
• Cost-effectiveness
• Equity
• Seriousness of the problem
• Administrative restrictions
Quality assessment
Summary of findings