Methodology for Guideline Development for the 7th ACCP


Grading evidence and recommendations
Workshop W-069
Congress Hall ABEF
Oct 6 2004
Professional good intentions and plausible theories are insufficient for selecting policies and practices for protecting, promoting and restoring health.
Iain Chalmers
How can we judge the extent of our confidence that adherence to a recommendation will do more good than harm?
GRADE
Grades of Recommendation Assessment, Development and Evaluation
GRADE Working Group
David Atkins, chief medical officer (a)
Dana Best, assistant professor (b)
Peter A Briss, chief (c)
Martin Eccles, professor (d)
Yngve Falck-Ytter, associate director (e)
Signe Flottorp, researcher (f)
Gordon H Guyatt, professor (g)
Robin T Harbour, quality and information director (h)
Margaret C Haugh, methodologist (i)
David Henry, professor (j)
Suzanne Hill, senior lecturer (j)
Roman Jaeschke, clinical professor (k)
Gillian Leng, guidelines programme director (l)
Alessandro Liberati, professor (m)
Nicola Magrini, director (n)
James Mason, professor (d)
Philippa Middleton, honorary research fellow (o)
Jacek Mrukowicz, executive director (p)
Dianne O’Connell, senior epidemiologist (q)
Andrew D Oxman, director (f)
Bob Phillips, associate fellow (r)
Holger J Schünemann, associate professor (g, s)
Tessa Tan-Torres Edejer, medical officer/scientist (t)
Helena Varonen, associate editor (u)
Gunn E Vist, researcher (f)
John W Williams Jr, associate professor (v)
Stephanie Zaza, project director (w)
a) Agency for Healthcare Research and Quality, USA
b) Children's National Medical Center, USA
c) Centers for Disease Control and Prevention, USA
d) University of Newcastle upon Tyne, UK
e) German Cochrane Centre, Germany
f) Norwegian Centre for Health Services, Norway
g) McMaster University, Canada
h) Scottish Intercollegiate Guidelines Network, UK
i) Fédération Nationale des Centres de Lutte Contre le Cancer, France
j) University of Newcastle, Australia
k) McMaster University, Canada
l) National Institute for Clinical Excellence, UK
m) Università di Modena e Reggio Emilia, Italy
n) Centro per la Valutazione della Efficacia della Assistenza Sanitaria, Italy
o) Australasian Cochrane Centre, Australia
p) Polish Institute for Evidence Based Medicine, Poland
q) The Cancer Council, Australia
r) Centre for Evidence-based Medicine, UK
s) University of Buffalo, USA
t) World Health Organisation, Switzerland
u) Finnish Medical Society Duodecim, Finland
v) Duke University Medical Center, USA
w) Centers for Disease Control and Prevention, USA
Opinions do not necessarily represent those of the institutions with which the members of the GRADE Working Group are affiliated.
What do you know about GRADE?
– Have prepared a guideline
– Read the BMJ paper
– Have prepared a systematic review and a summary of findings table
– Have attended a GRADE meeting, workshop or talk
Why bother about grading?
People draw conclusions about the
– quality of evidence
– strength of recommendations
Systematic and explicit approaches can help
– protect against errors
– resolve disagreements
– facilitate critical appraisal
– communicate information
However, there is wide variation in currently used approaches
Who is confused?
Organization   Evidence   Recommendation
USPSTF         II-2       B
ACCP           C+         1
GCPS           Strong     Strongly recommended
Still not confused?
Recommendation for use of oral anticoagulation in patients with atrial fibrillation and rheumatic mitral valve disease
Organization   Evidence   Recommendation
AHA            B          Class I
ACCP           C+         1
SIGN           IV         C
Guidelines development process
Quality of evidence
The extent to which one can be confident that an estimate of effect or association is correct.
It depends on the:
– study design (e.g. RCT, cohort study)
– study quality/limitations (protection against bias; e.g. concealment of allocation, blinding, follow-up)
– consistency of results
– directness of the evidence, including the
    – populations (those of interest versus similar; for example, older, sicker or more co-morbidity)
    – interventions (those of interest versus similar; for example, drugs within the same class)
    – outcomes (important versus surrogate outcomes)
    – comparison (A - C versus A - B & C - B)
Quality of evidence
The quality of the evidence (i.e. our confidence) may also be REDUCED when there is:
– Sparse or imprecise data
– Reporting bias
The quality of the evidence (i.e. our confidence) may be INCREASED when:
– There is a strong association
– There is a dose response relationship
– All plausible confounders would have reduced the observed effect
– All plausible biases would have increased the observed lack of effect
Quality assessment criteria
Quality of evidence, by study design:
– High: randomised trial
– Moderate
– Low: observational study
– Very low: any other evidence
Lower if:
– Study quality: serious limitations (-1), very serious limitations (-2)
– Important inconsistency (-1)
– Directness: some uncertainty (-1), major uncertainty (-2)
– Sparse or imprecise data (-1)
– High probability of reporting bias (-1)
Higher if:
– Strong association: strong, no plausible confounders (+1); very strong, no major threats to validity (+2)
– Evidence of a dose response gradient (+1)
– All plausible confounders would have reduced the effect (+1)
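The quality assessment criteria above can be read as a starting level set by study design, moved down or up by the listed points. The sketch below illustrates that reading only. GRADE ratings are judgements rather than pure arithmetic, and the function name `grade_quality`, its parameters, and the clamping of the result to the four-level scale are assumptions made for this example, not part of the slides.

```python
# Illustrative sketch only: names and the clamping rule are assumptions.
LEVELS = ["very low", "low", "moderate", "high"]

# Starting level by study design, as listed in the criteria above.
STARTING_LEVEL = {
    "randomised trial": "high",
    "observational study": "low",
    "any other evidence": "very low",
}


def grade_quality(design, downgrades=(), upgrades=()):
    """Return a quality level after applying point adjustments.

    downgrades and upgrades are iterables of positive integers (1 or 2),
    one entry per criterion judged to apply.
    """
    score = LEVELS.index(STARTING_LEVEL[design])
    score -= sum(downgrades)
    score += sum(upgrades)
    # Assumption: the result is clamped to the four-level scale.
    score = max(0, min(len(LEVELS) - 1, score))
    return LEVELS[score]


# A randomised trial with serious limitations (-1).
print(grade_quality("randomised trial", downgrades=[1]))
# An observational study with a very strong association (+2).
print(grade_quality("observational study", upgrades=[2]))
```

Passing each adjustment separately keeps the judgement behind every point visible, which matches the aim of making grading explicit.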
Categories of quality
High: Further research is very unlikely to change our confidence in the estimate of effect.
Moderate: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
Low: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
Very low: Any estimate of effect is very uncertain.
Judgements about the overall quality of evidence
Most systems just use evidence about the primary benefit/outcome
But what about other outcomes (downsides)?
Options:
– ignore all but the primary outcome
– base it on the evidence for benefits
– some blended approach
– separate grades for benefits and harms
– weakest of any outcome
Based on lowest of all the critical outcomes
Beyond the scope of a systematic review
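The rule stated above, that overall quality is based on the lowest quality among the critical outcomes, can be sketched as follows. The function name `overall_quality` and the outcome labels are hypothetical and chosen only for illustration. Deciding which outcomes are critical remains a judgement made outside the code.

```python
# Illustrative sketch only: names and example data are hypothetical.
ORDER = ["very low", "low", "moderate", "high"]


def overall_quality(quality_by_outcome, critical_outcomes):
    """Return the lowest quality level among the critical outcomes."""
    grades = [quality_by_outcome[name] for name in critical_outcomes]
    if not grades:
        raise ValueError("at least one critical outcome is required")
    # The lowest level is the one that appears earliest in ORDER.
    return min(grades, key=ORDER.index)


# Made-up grades for three placeholder outcomes.
example = {"outcome A": "high", "outcome B": "moderate", "outcome C": "low"}

# Only outcomes A and B are treated as critical here, so C is ignored.
print(overall_quality(example, ["outcome A", "outcome B"]))
```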
Strength of recommendation
The extent to which one can be confident that adherence to a recommendation will do more good than harm.
– trade-offs (the relative value attached to the expected benefits, harms and costs)
– quality of the evidence
– translation of the evidence into practice in a specific setting
– uncertainty about baseline risk
Judgements about the balance between benefits and harms
Before considering cost and making a recommendation
For a specified setting, taking into account issues of translation into practice
Clarity of the trade-offs between benefits and the harms
– the estimated size of the effect for each main outcome
– the precision of these estimates
– the relative value attached to the expected benefits and harms
– important factors that could be expected to modify the size of the expected effects in specific settings; e.g. proximity to a hospital
Balance between benefits and harm
Net benefits: The intervention does more good than harm.
Trade-offs: There are important trade-offs between the benefits and harms.
Uncertain net benefits: It is not clear whether the intervention does more good than harm.
Not net benefits: The intervention does not do more good than harm.
Judgements about recommendations
This should include considerations of costs; i.e. “Is the net gain (benefits-harms) worth the costs?”
 Do it
 Probably do it
No recommendation
 Probably don’t do it
 Don’t do it
Will GRADE lead to change?
Should healthy asymptomatic postmenopausal women have been given oestrogen + progestin for prevention in 1992?
Quality of evidence across studies for
– CHD
– Hip fracture
– Colorectal cancer
– Breast cancer
– Stroke
– Thrombosis
– Gall bladder disease
Quality of evidence across critical outcomes
Balance between benefits and harms
Recommendations
Evidence profile: Quality assessment
Oestrogen + progestin for prevention before WHI and HERS
Oestrogen + progestin versus usual care
Oestrogen + progestin for prevention after WHI and HERS
GRADE for diagnostic tests
Quality of evidence: High, Moderate, Low, Very low
Study design:
– Cross-sectional (or cohort) studies of patients with diagnostic uncertainty with direct comparison
– Anything else
Lower if *:
– Study limitations (including representativeness of population, choice of gold standard, incomplete performance of tests, independence of test interpretation): serious limitations (-1), very serious limitations (-2)
– Important inconsistency (-1)
– Directness: some uncertainty (-1), major uncertainty (-2)
– Sparse or imprecise data (-1)
– High probability of reporting bias (-1)
Challenges for GRADE
Operationalise all steps
Dissemination/buy in
– simple to do
– easy to understand and use
Tool and manual
GRADE profiler (GRADEpro)
Separation by outcomes
Work in groups of two
– take a pencil (and paper)
– write down the most important issues/questions you have about GRADE
The 10 burning questions/issues about GRADE
1. Different experts
2. Prospective studies
3. Valuing benefits and harms – decide about trade-offs
4. Low quality evidence leading to strong rec’s
5. How can one introduce/disseminate one single/uniform system
6. Empirical evidence for GRADE – how should we obtain it
7. Mechanisms for balancing benefits and cost
8. Reliability?
9. Other than RCT evidence
10. Decisions about quality of evidence/limitations of study design, guidance about magnitude of effect
The 10 burning questions/issues about GRADE
1. Other than RCT evidence
2. What type of cost and resources
3. Who judges the importance of outcomes
4. How can one evaluate whether all outcomes are reported?
5. Decisions about quality of evidence/limitations of study design
Small group sessions
– find a group
– select a spokesperson
– take 30 minutes to complete the task
– be prepared to criticise
Summary
What is good about GRADE?
What is most challenging?
– Takes too long
– Relative importance is difficult to work out
– Difficult - much more time needed
What do we need to do next?
– More time