Transcript Slide 1
RATING QUALITY OF EVIDENCE AND
STRENGTH OF RECOMMENDATIONS IN GI
USING THE GRADE FRAMEWORK
AGA Clinical Practice & Quality Management Committee Teleconference
17 Oct 2008
Yngve Falck-Ytter, M.D.
Assistant Professor of Medicine
Case Western Reserve University, Cleveland
Division of Gastroenterology, Case and VA Medical Center
Disclosure
In the past 4 years, Dr. Falck-Ytter received no
personal payments for services from industry. His
research group received research grants from
Valeant and Roche that were deposited into nonprofit research accounts. He is a member of the
GRADE working group which has received funding
from various governmental entities in the US and
Europe. Some of the GRADE work he has done is
supported in part by grant # 1 R13 HS016880-01
from the Agency for Healthcare Research and
Quality.
Content
Background and rationale for revisiting
guideline methodology
The GRADE approach
Grading the quality of evidence and strength
of recommendations in GI
Hierarchy of evidence
Randomized Controlled Trials
Cohort Studies and Case-Control Studies
Case Reports and Case Series, Non-systematic observations
Expert Opinion
[Figure axes: BIAS, STUDY DESIGN]
Reasons for grading evidence?
People draw conclusions about the
quality of evidence and strength of recommendations
Systematic and explicit approaches can help
protect against errors, resolve disagreements,
communicate information, and fulfill needs
Change practitioner behavior
However, wide variation in approaches
GRADE working group. BMJ. 2004 & 2008
Grading used in GI CPGs
AASLD
I RCTs
II-1 Controlled trials (no randomization)
II-2 Cohort or case-control analytical studies
II-3 Multiple time series, dramatic uncontr. experiments
III Opinion of respected authorities, descrip. epidemiology
AGA
I Syst. review of RCTs
II 1+ properly desig. RCT, n↑, clinical setting
III Publ., well-desig. trials, pre-post, cohort, time series, case-control studies
IV Non-exp. studies >1 center/group, opinion respected authorities, clinical evidence, descr. studies, expert consensus comm.
ACG
I RCTs, well designed, n↑ for suff. stat. power
II 1 large well-designed clinical trial (+/- rand.), cohort or case-control studies or well-designed meta-analysis
III Clinical experience, descr. studies, expert comm.
IV Not rated
ASGE
A. Prospect. controlled trials
B. Observational studies
C. Expert opinion
Limitations of existing systems
Confuse quality of evidence with strength of
recommendations
Lack well-articulated conceptual framework
Criteria not comprehensive or transparent
GRADE unique
breadth, intensity of development process
wide endorsement and use
conceptual framework
comprehensive, transparent criteria
Focus on all important outcomes related to a
specific question and overall quality
GRADE Working Group
David Atkins, chief medical officer (a)
Dana Best, assistant professor (b)
Martin Eccles, professor (d)
Francoise Cluzeau, lecturer (x)
Yngve Falck-Ytter, associate director (e)
Signe Flottorp, researcher (f)
Gordon H Guyatt, professor (g)
Robin T Harbour, quality and information director (h)
Margaret C Haugh, methodologist (i)
David Henry, professor (j)
Suzanne Hill, senior lecturer (j)
Roman Jaeschke, clinical professor (k)
Regina Kunz, Associate Professor
Gillian Leng, guidelines programme director (l)
Alessandro Liberati, professor (m)
Nicola Magrini, director (n)
James Mason, professor (d)
Philippa Middleton, honorary research fellow (o)
Jacek Mrukowicz, executive director (p)
Dianne O’Connell, senior epidemiologist (q)
Andrew D Oxman, director (f)
Bob Phillips, associate fellow (r)
Holger J Schünemann, professor (g, s)
Tessa Tan-Torres Edejer, medical officer (t)
David Tovey, Editor (y)
Jane Thomas, Lecturer, UK
Helena Varonen, associate editor (u)
Gunn E Vist, researcher (f)
John W Williams Jr, professor (v)
Stephanie Zaza, project director (w)
a) Agency for Healthcare Research and Quality, USA
b) Children's National Medical Center, USA
c) Centers for Disease Control and Prevention, USA
d) University of Newcastle upon Tyne, UK
e) German Cochrane Centre, Germany
f) Norwegian Centre for Health Services, Norway
g) McMaster University, Canada
h) Scottish Intercollegiate Guidelines Network, UK
i) Fédération Nationale des Centres de Lutte Contre le Cancer, France
j) University of Newcastle, Australia
k) McMaster University, Canada
l) National Institute for Clinical Excellence, UK
m) Università di Modena e Reggio Emilia, Italy
n) Centro per la Valutazione della Efficacia della Assistenza Sanitaria, Italy
o) Australasian Cochrane Centre, Australia
p) Polish Institute for Evidence Based Medicine, Poland
q) The Cancer Council, Australia
r) Centre for Evidence-based Medicine, UK
s) National Cancer Institute, Italy
t) World Health Organisation, Switzerland
u) Finnish Medical Society Duodecim, Finland
v) Duke University Medical Center, USA
w) Centers for Disease Control and Prevention, USA
x) University of London, UK
y) BMJ Clinical Evidence, UK
GRADE uptake
GRADE: Quality of evidence
The extent to which our confidence in an
estimate of the treatment effect is adequate to
support a particular recommendation.
Although the degree of confidence is a
continuum, we suggest using four categories:
High
Moderate
Low
Very low
Quality of evidence across studies
Outcome #1: Quality: High
Outcome #2: Quality: Moderate
Outcome #3: Quality: Low
III
V
II
IB
Determinants of quality
RCTs start high
Observational studies start low
What lowers quality of evidence? 5 factors:
Detailed design and execution
Inconsistency of results
Indirectness of evidence
Imprecision
Publication bias
What is the study design?
1. Design and execution
Study limitations (risk of bias)
Lack of allocation concealment
No true intention to treat principle
Inadequate blinding
Loss to follow-up
Early stopping for benefit
2. Consistency of results
Look for explanation for inconsistency
patients, intervention, comparator, outcome, methods
Judgment
variation in size of effect
overlap in confidence intervals
statistical significance of heterogeneity
I²
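As a rough illustration of the I² statistic listed above, the sketch below computes Cochran's Q and I² from per-study effect estimates and their standard errors, using inverse-variance weights. The function name and the sample numbers are hypothetical and do not come from the talk; effects are assumed to be on an additive scale such as the log relative risk.

```python
def i_squared(effects, std_errors):
    """Return (Q, df, I2 in percent) for a set of study effect estimates.

    Assumption: effects are on an additive scale (e.g. log relative risk)
    and each study is weighted by the inverse of its variance.
    """
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need at least two studies with matching standard errors")
    weights = [1.0 / se ** 2 for se in std_errors]
    # Inverse-variance (fixed-effect) pooled estimate.
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I2 is the share of variation beyond what chance alone would explain.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2


# Hypothetical example: two studies with clearly different effects.
q, df, i2 = i_squared([0.0, 1.0], [0.5, 0.5])
print(f"Q={q:.2f}, df={df}, I2={i2:.0f}%")
```

A high I² is only one input; the slide also lists variation in effect size and overlap of confidence intervals as part of the judgment.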
Heterogeneity
Pagliaro L et al. Ann Intern Med 1992;117:59-70
3. Directness of Evidence
Indirect comparisons
Interested in head-to-head comparison
Drug A versus drug B
Infliximab versus adalimumab in Crohn’s disease
Differences in
patients (early cirrhosis vs end-stage cirrhosis)
interventions (CRC screening: flex. sig. vs colonoscopy)
outcomes (non-steroidal safety: ulcer on endoscopy vs
symptomatic ulcer complications)
4. Imprecision
Small sample size
small number of events
wide confidence intervals
uncertainty about magnitude of effect
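To make the link between few events and wide confidence intervals concrete, here is a minimal sketch that computes a relative risk with an approximate 95% confidence interval on the log scale. The function name and the trial counts are invented for illustration and are not data from the talk.

```python
import math


def relative_risk_ci(events_t, n_t, events_c, n_c, z=1.96):
    """Return (RR, lower, upper) using the standard log-scale approximation.

    Assumption: both arms have at least one event, so no continuity
    correction is needed.
    """
    if min(events_t, events_c) <= 0:
        raise ValueError("this sketch requires at least one event per arm")
    rr = (events_t / n_t) / (events_c / n_c)
    # Standard error of log(RR) for two independent binomial proportions.
    se_log = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Hypothetical trials with the same RR but very different numbers of events.
small = relative_risk_ci(5, 50, 10, 50)
large = relative_risk_ci(500, 5000, 1000, 5000)
print("small trial: RR=%.2f (%.2f to %.2f)" % small)
print("large trial: RR=%.2f (%.2f to %.2f)" % large)
```

With these made-up counts, the small trial's interval crosses 1 while the large trial's does not, although both point estimates are the same.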
5. Reporting Bias
(Publication Bias)
Reporting of studies
publication bias
number of small studies
Reporting of outcomes
Quality assessment criteria
Quality of evidence: High (4), Moderate (3), Low (2), Very low (1)
Study design: Randomized trial (starts High), Observational study (starts Low)
Lower if…
Study limitations (design and execution)
Inconsistency
Indirectness
Imprecision
Publication bias
Higher if…
What can raise the quality of evidence?
BMJ 2003;327:1459–61
Quality assessment criteria
Quality of evidence: High (4), Moderate (3), Low (2), Very low (1)
Study design: Randomized trial (starts High), Observational study (starts Low)
Lower if…
Study limitations
Inconsistency
Indirectness
Imprecision
Publication bias
Higher if…
Large effect (e.g., RR 0.5)
Very large effect (e.g., RR 0.2)
Evidence of dose-response gradient
All plausible confounding would reduce a demonstrated effect
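The criteria above can be read as a simple scoring procedure: start from the study design, subtract levels for each "lower if" factor, and add levels for each "higher if" factor. The sketch below is one possible reading, not the GRADE working group's own implementation. The function name, the factor keys, the additive arithmetic, and the number of levels moved per factor are all assumptions; in practice each step is a judgment.

```python
# Hypothetical sketch: names, keys, and additive scoring are assumptions.
LEVELS = {4: "High", 3: "Moderate", 2: "Low", 1: "Very low"}
START = {"randomized trial": 4, "observational study": 2}
LOWER_IF = {"study limitations", "inconsistency", "indirectness",
            "imprecision", "publication bias"}
HIGHER_IF = {"large effect", "dose-response gradient", "plausible confounding"}


def rate_quality(study_design, lower=None, higher=None):
    """Rate one outcome; lower/higher map a factor name to levels moved."""
    lower = lower or {}
    higher = higher or {}
    unknown = (set(lower) - LOWER_IF) | (set(higher) - HIGHER_IF)
    if unknown:
        raise ValueError(f"unknown factors: {sorted(unknown)}")
    score = START[study_design] - sum(lower.values()) + sum(higher.values())
    # Clamp to the four categories used on the slide.
    return LEVELS[max(1, min(4, score))]


# Randomized trials with serious imprecision and inconsistency.
print(rate_quality("randomized trial", lower={"imprecision": 1, "inconsistency": 1}))
# Observational studies showing a very large effect (assumed to move two levels).
print(rate_quality("observational study", higher={"large effect": 2}))
```

The point of the sketch is that every move away from the starting level is named explicitly, so the reasoning behind a rating stays visible.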
Categories of quality
High
Further research is very unlikely to change our
confidence in the estimate of effect
Moderate
Further research is likely to have an important impact on
our confidence in the estimate of effect and may change
the estimate
Low
Further research is very likely to have an important
impact on our confidence in the estimate of effect and is
likely to change the estimate
Very low
Any estimate of effect is very uncertain
Judgments about the
overall quality of evidence
Most systems not explicit
Options:
Benefits
Primary outcome
Highest
Lowest
Beyond the scope of a systematic review
GRADE: Based on lowest of all the critical
outcomes
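The GRADE rule stated above, taking the lowest quality among the critical outcomes, can be sketched in a few lines. The function name, the tuple layout, and the example outcomes are hypothetical.

```python
# Quality categories ordered from lowest to highest.
ORDER = ["Very low", "Low", "Moderate", "High"]


def overall_quality(outcomes):
    """outcomes: iterable of (name, importance, quality) tuples.

    Only outcomes whose importance is "critical" drive the overall rating.
    """
    critical = [quality for _, importance, quality in outcomes
                if importance == "critical"]
    if not critical:
        raise ValueError("at least one critical outcome is required")
    return min(critical, key=ORDER.index)


# Hypothetical outcomes for one clinical question.
outcomes = [
    ("mortality", "critical", "High"),
    ("major bleeding", "critical", "Low"),
    ("nausea", "important", "Very low"),
]
print(overall_quality(outcomes))
```

Here the non-critical outcome is ignored even though it has the lowest rating, which matches the rule on the slide.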
GRADE evidence profile
Strength of recommendation
“The strength of a recommendation reflects the
extent to which we can, across the range of patients
for whom the recommendations are intended, be
confident that desirable effects of a management
strategy outweigh undesirable effects.”
Although the strength of recommendation is a
continuum, we suggest using two categories:
“Strong” and “Weak”
Desirable and undesirable effects
Desirable effects
Mortality reduction
Improvement in quality of life, fewer
hospitalizations/infections
Reduction in the burden of treatment
Reduced resource expenditure
Undesirable effects
Deleterious impact on morbidity, mortality or quality of
life, increased resource expenditure
Determinants of the strength of recommendation
Factors that can weaken the strength of a recommendation, each followed by its explanation:
Lower quality evidence
The higher the quality of evidence, the more likely a strong recommendation is warranted.
Uncertainty about the balance of benefits versus harms and burdens
The larger the difference between the desirable and undesirable consequences, the more likely a strong recommendation is warranted. The smaller the net benefit and the lower the certainty for that benefit, the more likely a weak recommendation is warranted.
Uncertainty or differences in values
The greater the variability in values and preferences, or uncertainty in values and preferences, the more likely a weak recommendation is warranted.
Uncertainty about whether the net benefits are worth the costs
The higher the costs of an intervention, that is, the more resources consumed, the less likely a strong recommendation is warranted.
Developing recommendations
Implications of a
strong recommendation
Patients: Most people in this situation would want
the recommended course of action and only a small
proportion would not
Clinicians: Most patients should receive the
recommended course of action
Policy makers: The recommendation can be
adopted as a policy in most situations
Implications of a
weak recommendation
Patients: The majority of people in this situation
would want the recommended course of action,
but many would not
Clinicians: Be prepared to help patients make a
decision that is consistent with their own
values (decision aids and shared decision making)
Policy makers: There is a need for substantial
debate and involvement of stakeholders
Where GRADE fits in
Prioritize problems, establish panel
Systematic review
Searches, selection of studies, data collection and analysis
Prepare evidence profile:
Quality of evidence for each outcome and summary of findings
Assess overall quality of evidence
Decide direction and strength of recommendation
Draft guideline
Consult with stakeholders and/or external peer reviewers
Disseminate guideline
Implement the guideline and evaluate
GRADE
Systematic review
P, I, C, O
Assess the relative importance of outcomes
Outcome: Critical
Outcome: Critical
Outcome: Important
Outcome: Not important
Summary of findings & estimate of effect for each outcome
Quality of evidence for each outcome: High, Moderate, Low, Very low
RCT start high, obs. data start low
Grade down
1. Risk of bias
2. Inconsistency
3. Indirectness
4. Imprecision
5. Publication bias
Grade up
1. Large effect
2. Dose response
3. Confounders
Guideline development
Rate overall quality of evidence across outcomes based on lowest quality of critical outcomes
Formulate recommendations:
• For or against (direction)
• Strong or weak (strength)
By considering:
Quality of evidence
Balance benefits/harms
Values and preferences
Revise if necessary by considering:
Resource use (cost)
• “We recommend using…”
• “We suggest using…”
• “We recommend against using…”
• “We suggest against using…”
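The four phrasings above follow directly from the two decisions on the slide, direction (for or against) and strength (strong or weak). A minimal sketch of that mapping, with a hypothetical function name and argument values:

```python
def recommendation_wording(direction, strength):
    """Map direction ("for"/"against") and strength ("strong"/"weak") to wording."""
    verbs = {"strong": "recommend", "weak": "suggest"}
    tails = {"for": "using", "against": "against using"}
    if strength not in verbs or direction not in tails:
        raise ValueError("direction must be for/against, strength strong/weak")
    return f"We {verbs[strength]} {tails[direction]}…"


for strength in ("strong", "weak"):
    for direction in ("for", "against"):
        print(recommendation_wording(direction, strength))
```

Keeping "recommend" for strong and "suggest" for weak recommendations lets readers infer the strength from the wording alone.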
Conclusions
GRADE is gaining acceptance as an international
standard
Criteria for evidence assessment across
questions and outcomes
Criteria for moving from evidence to
recommendations
Simple, transparent, systematic
Transparency in decision making and judgments
is key