Methodology for Guideline Development for the 7th ACCP


Grading evidence and recommendations
1 February 2005
Professional good intentions and plausible theories are insufficient for selecting policies and practices for protecting, promoting and restoring health.
Iain Chalmers
How can we judge the extent of our confidence that adherence to a recommendation will do more good than harm?
GRADE
Grades of Recommendation, Assessment, Development and Evaluation
GRADE Working Group
David Atkins, chief medical officera
Dana Best, assistant professorb
Peter A Briss, chiefc
Martin Eccles, professord
Yngve Falck-Ytter, associate directore
Signe Flottorp, researcherf
Gordon H Guyatt, professorg
Robin T Harbour, quality and information directorh
Margaret C Haugh, methodologisti
David Henry, professorj
Suzanne Hill, senior lecturerj
Roman Jaeschke, clinical professork
Gillian Leng, guidelines programme directorl
Alessandro Liberati, professorm
Nicola Magrini, directorn
James Mason, professord
Philippa Middleton, honorary research fellowo
Jacek Mrukowicz, executive directorp
Dianne O’Connell, senior epidemiologistq
Andrew D Oxman, directorf
Bob Phillips, associate fellowr
Holger J Schünemann, associate professorg,s
Tessa Tan-Torres Edejer, medical officer/scientistt
Helena Varonen, associate editoru
Gunn E Vist, researcherf
John W Williams Jr, associate professorv
Stephanie Zaza, project directorw
a) Agency for Healthcare Research and Quality, USA
b) Children's National Medical Center, USA
c) Centers for Disease Control and Prevention, USA
d) University of Newcastle upon Tyne, UK
e) German Cochrane Centre, Germany
f) Norwegian Centre for Health Services, Norway
g) McMaster University, Canada
h) Scottish Intercollegiate Guidelines Network, UK
i) Fédération Nationale des Centres de Lutte Contre le
Cancer, France
j) University of Newcastle, Australia
k) McMaster University, Canada
l) National Institute for Clinical Excellence, UK
m) Università di Modena e Reggio Emilia, Italy
n) Centro per la Valutazione della Efficacia della Assistenza
Sanitaria, Italy
o) Australasian Cochrane Centre, Australia
p) Polish Institute for Evidence Based Medicine, Poland
q) The Cancer Council, Australia
r) Centre for Evidence-based Medicine, UK
s) University of Buffalo, USA
t) World Health Organisation, Switzerland
u) Finnish Medical Society Duodecim, Finland
v) Duke University Medical Center, USA
w) Centers for Disease Control and Prevention, USA
Opinions do not necessarily represent those of the institutions with which the members of the GRADE Working Group are affiliated.
What do you know about GRADE?
• Have prepared a guideline
• Read the BMJ paper
• Have prepared a systematic review and a summary of findings table
• Have attended a GRADE meeting, workshop or talk
Why bother about grading?
• People draw conclusions about the
– quality of evidence
– strength of recommendations
• Systematic and explicit approaches can help
– protect against errors
– resolve disagreements
– facilitate critical appraisal
– communicate information
• However, there is wide variation in currently used approaches
Who is confused?
Organization   Evidence   Recommendation
USPSTF         II-2       B
ACCP           C+         1
GCPS           Strong     Strongly recommended
Still not confused?
Recommendation for use of oral anticoagulation in patients with atrial fibrillation and rheumatic mitral valve disease

Organization   Evidence   Recommendation
AHA            B          Class I
ACCP           C+         1
SIGN           IV         C
Guidelines development process
Quality of evidence
The extent to which one can be confident that an estimate of effect or association is correct. It depends on the:
– study design (e.g. RCT, cohort study)
– study quality/limitations (protection against bias; e.g. concealment of allocation, blinding, follow-up)
– consistency of results
– directness of the evidence, including the
  – populations (those of interest versus similar; for example, older, sicker or with more co-morbidity)
  – interventions (those of interest versus similar; for example, drugs within the same class)
  – outcomes (important versus surrogate outcomes)
  – comparison (A versus C directly, versus A versus B and C versus B)
Quality of evidence
The quality of the evidence (i.e. our confidence) may also be REDUCED when there is:
• Sparse or imprecise data
• Reporting bias
The quality of the evidence (i.e. our confidence) may be INCREASED when there is:
• A strong association
• A dose-response relationship
• All plausible confounders would have reduced the observed effect
• All plausible biases would have increased the observed lack of effect
Quality assessment criteria

Quality of evidence, by study design:
• High (randomised trial)
• Moderate
• Low (observational study)
• Very low (any other evidence)

Lower if:
• Study quality: -1 serious limitations, -2 very serious limitations
• -1 important inconsistency
• Directness: -1 some uncertainty, -2 major uncertainty
• -1 sparse or imprecise data
• -1 high probability of reporting bias

Higher if:
• Strong association: +1 strong, no plausible confounders; +2 very strong, no major threats to validity
• +1 evidence of a dose-response gradient
• +1 all plausible confounders would have reduced the effect
Categories of quality
• High: Further research is very unlikely to change our confidence in the estimate of effect.
• Moderate: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
• Low: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
• Very low: Any estimate of effect is very uncertain.
Judgements about the overall quality of evidence
• Most systems are not explicit
• Options:
– strongest outcome
– primary outcome
– benefits
– weighted
– separate grades for benefits and harms
– no overall grade
– weakest outcome
• Based on the lowest of all the critical outcomes (see the sketch below)
• Beyond the scope of a systematic review
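A minimal sketch of that "lowest of the critical outcomes" rule, assuming per-outcome quality ratings are already in hand. The outcome names echo the hormone-therapy example later in this talk, but the ratings and the critical/not-critical flags below are hypothetical, for illustration only.

```python
# Illustrative only: overall quality = lowest quality among the critical outcomes.
ORDER = {"Very low": 0, "Low": 1, "Moderate": 2, "High": 3}

def overall_quality(outcomes):
    """outcomes: iterable of (name, quality, is_critical) tuples."""
    critical = [quality for _, quality, is_critical in outcomes if is_critical]
    return min(critical, key=ORDER.get) if critical else None

example = [
    ("CHD", "High", True),                    # hypothetical rating, for illustration
    ("Hip fracture", "Moderate", True),       # hypothetical rating
    ("Gall bladder disease", "Low", False),   # important but not critical here (assumed)
]
print(overall_quality(example))  # -> "Moderate"
```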
Strength of recommendation
The extent to which one can be confident that adherence to a recommendation will do more good than harm. It depends on the:
• trade-offs (the relative value attached to the expected benefits, harms and costs)
• quality of the evidence
• translation of the evidence into practice in a specific setting
• uncertainty about baseline risk
Judgements about the balance between benefits and harms
• Before considering cost and making a recommendation
• For a specified setting, taking into account issues of translation into practice
Clarity of the trade-offs between benefits and harms
• the estimated size of the effect for each main outcome
• the precision of these estimates
• the relative value attached to the expected benefits and harms
• important factors that could be expected to modify the size of the expected effects in specific settings (e.g. proximity to a hospital)
Balance between benefits and harms
• Net benefits: The intervention does more good than harm.
• Trade-offs: There are important trade-offs between the benefits and harms.
• Uncertain net benefits: It is not clear whether the intervention does more good than harm.
• No net benefits: The intervention does not do more good than harm.
Judgements about recommendations
This should include considerations of costs; i.e. "Is the net gain (benefits minus harms) worth the costs?"
• Do it
• Probably do it
• No recommendation
• Probably don't do it
• Don't do it
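Purely to make the link between the preceding judgement categories and these recommendation categories concrete, here is an illustrative Python mapping. The pairing and the cost adjustment are assumptions for demonstration; the slides do not prescribe a mechanical rule, and panels weigh the quality of the evidence, values and costs together.

```python
# Purely illustrative: the pairing below is an assumed mapping from the
# balance-of-benefits judgement to a recommendation category, not a GRADE rule.

RECOMMENDATION = {
    "net benefits": "Do it",
    "trade-offs": "Probably do it",
    "uncertain net benefits": "No recommendation",
    "no net benefits": "Don't do it",
}

def recommend(balance: str, net_gain_worth_the_costs: bool = True) -> str:
    """Soften a positive recommendation when the net gain is not judged worth the costs."""
    recommendation = RECOMMENDATION[balance]
    if recommendation == "Do it" and not net_gain_worth_the_costs:
        recommendation = "Probably do it"
    return recommendation

print(recommend("net benefits"))                                  # -> "Do it"
print(recommend("net benefits", net_gain_worth_the_costs=False))  # -> "Probably do it"
```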
Will GRADE lead to change?
Should healthy asymptomatic postmenopausal women have been given oestrogen + progestin for prevention in 1992?
• Quality of evidence across studies for
– CHD
– Hip fracture
– Colorectal cancer
– Breast cancer
– Stroke
– Thrombosis
– Gall bladder disease
• Quality of evidence across critical outcomes
• Balance between benefits and harms
• Recommendations
Evidence profile: Quality assessment
Oestrogen + progestin for prevention in 1992 (before WHI and HERS)
Oestrogen + progestin versus usual care
Oestrogen + progestin for prevention after WHI and HERS
Further developments
• Diagnostic tests
• Complexity
• Costs
• (Equity)
• Empirical evaluations
GRADE for diagnostic tests
Quality of evidence: High, Moderate, Low, Very low

Study design:
• Cross-sectional (or cohort) studies of patients with diagnostic uncertainty, with direct comparison
• Anything else

Lower if:
• Study limitations (including representativeness of the population, choice of gold standard, incomplete performance of tests, and independence of test interpretation): -1 serious limitations, -2 very serious limitations
• -1 important inconsistency
• Directness: -1 some uncertainty, -2 major uncertainty
• -1 sparse or imprecise data
• -1 high probability of reporting bias
GRADE Profiler
Taking account of costs
• Include important (disaggregated) costs in evidence summaries and balance sheets when relevant
– May be useful to aggregate and value (in monetary terms)
– Always include disaggregated resource utilisation
– Note when important information is missing
– Published cost-effectiveness analyses are rarely helpful
• Assess the quality of the evidence for important costs (consumption of resources) as for other effects (were quantities measured reliably?)
• If costs are critical to a decision, low-quality evidence can lower the overall quality of evidence
• Costs are negotiable (the value of resources)
• There are many possible criteria for making a recommendation
Should activated protein C be given to patients with severe sepsis? An example with costs
GRADE evidence profile: activated protein C for sepsis
Name: Jaeschke and Schünemann
Date: September 2004
Question: Should APC be used for severe sepsis?
Setting: ICU in Copenhagen
Baseline risk: severe sepsis or septic shock > 24 h
References:
– Effectiveness: Bernard 2001. Efficacy and safety of recombinant human activated protein C for severe sepsis. NEJM 2001;344:699; and Manns 2002. An economic evaluation of activated protein C treatment for severe sepsis. NEJM 2002;347:993.
– Cost-effectiveness: Manns 2002. An economic evaluation of activated protein C treatment for severe sepsis. NEJM 2002;347:993.
Possible criteria for making a recommendation
• Treatment effect
• Adverse effects
• Cost
• Cost-effectiveness
• Equity
• Seriousness of the problem
• Administrative restrictions
Quality assessment
Summary of findings
Empirical evaluations
• Critical appraisal of other systems
• Pilot test + sensibility
• "Case law" + practical experience
• Guidance for judgements
– single studies
– sparse or imprecise data
• Agreement
• Validity?
• Comparisons with other systems
• Alternative presentations
Comparison of GRADE and other systems
• Explicit definitions
• Explicit, sequential judgements
• Components of quality
• Overall quality
• Relative importance of outcomes
• Balance between health benefits and harms
• Balance between incremental health benefits and costs
• Consideration of equity
• Evidence profiles
• International collaboration
• Consistent judgements?
• Communication?
We will serve the public more responsibly and ethically when research designed to reduce the likelihood that we will be misled by bias and the play of chance has become an expected element of professional and policy-making practice, not an optional add-on.
Iain Chalmers
A prerequisite
Practitioners and policy makers must make much clearer that they need rigorous evaluative research to help ensure that they do more good than harm.
Iain Chalmers