Transcript Slide 1

GRADING EVIDENCE AND RECOMMENDATIONS:
STARTING WITH GRADE BASICS VS.
UTILIZING THE FULL FRAMEWORK
AHRQ Annual Meeting 2010:
“Better Care, Better Health: Delivering on Quality for All Americans”
September 28, 2010
Yngve Falck-Ytter, M.D.
Associate Professor of Medicine
Case Western Reserve University, Cleveland, Ohio
Holger Schünemann, M.D., Ph.D.
Chair, Department of Clinical Epidemiology & Biostatistics
Michael Gent Chair in Healthcare Research
McMaster University, Hamilton, Canada
1
Disclosures
In the past 5 years, Dr. Falck-Ytter received no
personal payments for services from industry. His
research group received research grants from Three
Rivers, Valeant and Roche that were deposited into
non-profit research accounts. He is a member of the
GRADE working group which has received funding
from various governmental entities in the US and
Europe, such as the AHRQ. Some of the GRADE work
he has done is supported in part by grant # 1 R13
HS016880-01 from the Agency for Healthcare
Research and Quality (AHRQ).
2
Content
Part 1
• A 7-minute version of GRADE
Part 2
• Rapid interactive exchange contrasting GRADE basic vs. the full GRADE approach
  • Advantages of a structured approach
  • Asking good clinical questions
  • Systematic review vs. ad hoc approaches
  • Grading the quality of evidence
  • How to determine the strength of recommendations
3
Question to the audience
Decisions in your medical practice
are based on:
A. Training, experience and knowledge of respected
colleagues
B. Patient preferences
C. Convincing evidence (non-experimental) from case
reports, case series, disease mechanism
D. RCTs, systematic reviews of RCTs and meta-analyses
E. All of the above
4
Evidence-based clinical decisions
Patient values and
preferences
Clinical
circumstances
Expertise
Research evidence
Haynes et al. 2002
5
A real world example…
P: In patients with acute hepatitis C …
I : Should anti-viral treatment be used …
C: Compared to no treatment …
O: To achieve viral clearance?
Organization: evidence; recommendation
AASLD (2009): B; Class I
VA (2006): II-1; “Should be initiated…”
SIGN (2006): 1+; A
AGA (2006): -/-; “Most authorities…”
AWMF (2004): -/-; B “It works…”
6
Question to the audience
By now…
A. …you are thoroughly confused
B. …you send her to a doctor because treatment is
recommended
C. …you send her to a doctor but she can expect that,
according to guidelines, she will not be treated
D. …you look at the evidence yourself because past
experience tells you that guidelines don’t help
7
GRADE is outcome-centric
Outcome #1
Quality: High
Outcome #2
Quality: Moderate
Outcome #3
Quality: Low
III
V
II
IB
Old system
GRADE
GRADE process overview (diagram)
Question: P, I, C, O
Outcomes rated by importance: Critical, Critical, Important, Less
Systematic review: Summary of findings & estimate of effect for each outcome
Quality of evidence for each outcome: High, Moderate, Low, Very low
• RCT start high, obs. data start low
• Grade down: 1. Risk of bias; 2. Inconsistency; 3. Indirectness; 4. Imprecision; 5. Publication bias
• Grade up: 1. Large effect; 2. Dose response; 3. Confounders
Guideline development
Rate overall quality of evidence across outcomes based on lowest quality of critical outcomes
Formulate recommendations:
• For or against (direction)
• Strong or weak (strength)
By considering:
• Quality of evidence
• Balance benefits/harms
• Values and preferences
Revise if necessary by considering:
• Resource use (cost)
Wording:
• “We recommend using…”
• “We suggest using…”
• “We recommend against using…”
• “We suggest against using…”
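As an illustrative sketch (not part of the original slides), the four wordings above follow from the two choices the diagram names: direction (for or against) and strength (strong or weak). The function name and argument values below are hypothetical.

```python
def recommendation_phrase(direction: str, strength: str) -> str:
    """Map direction ("for"/"against") and strength ("strong"/"weak")
    to the wording shown on the slide. Names are illustrative only."""
    verbs = {"strong": "recommend", "weak": "suggest"}
    if direction not in ("for", "against"):
        raise ValueError("direction must be 'for' or 'against'")
    if strength not in verbs:
        raise ValueError("strength must be 'strong' or 'weak'")
    against = " against" if direction == "against" else ""
    return f"We {verbs[strength]}{against} using…"


if __name__ == "__main__":
    for direction in ("for", "against"):
        for strength in ("strong", "weak"):
            print(recommendation_phrase(direction, strength))
```

Strong recommendations use "recommend" and weak ones use "suggest", so a reader can tell the strength from the verb alone.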
9
Question to the audience
Which question follows a well structured
clinical PICO format:
A. What is the evidence that food allergens cause
eosinophilic esophagitis?
B. Is it known what the evidence is that aspirin can
prevent progression of dysplasia to cancer in
Barrett’s esophagus?
C. In patients undergoing hip replacement, does
warfarin compared to aspirin reduce venous
thromboembolism, pulmonary embolism and
mortality?
10
That’s an excellent question
• Translating informal clinical questions into
specific PICO questions = central to GRADE
• Even if an organization has limited resources,
taking care of this step actually saves resources:
  • Helps limit your scope
  • Specifies the search strategy more clearly
  • Guides data extraction
  • Helps with formulating recommendations
11
Taking it to the next level
Informal question: Whether to use thromboprophylaxis for VTE prophylaxis (drugs)
PICO question
Population: Patients undergoing THR
Intervention(s): Any drug (ASA, LDUH, LMWH, fondaparinux, direct thrombin inhibitors)
Comparator(s): No anticoagulation
Outcome(s): Asymptomatic DVT (surrogate for symptomatic VTE); symptomatic DVT; non-fatal PE; fatal PE; bleeding (operative site vs. non-operative site); readmission; reoperation; total mortality
Method: RCT, obs. studies
12
Importance of outcomes
Deciding on the importance of outcomes on decision making:
1, 2, 3: Less important
4, 5, 6: Important
7, 8, 9: Critically important
P: In patients after hip replacement…
I: Should warfarin rather than…
C: Aspirin be given…
O: To reduce symptomatic venous thromboembolism and mortality?
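A minimal sketch of the 1 to 9 importance scale above, assuming each outcome receives a single integer rating. The function name is hypothetical, and how a panel combines individual ratings is not specified on the slide.

```python
def importance_category(rating: int) -> str:
    """Translate a 1-9 importance rating into the three bands on the slide."""
    if not isinstance(rating, int) or not 1 <= rating <= 9:
        raise ValueError("rating must be an integer from 1 to 9")
    if rating <= 3:
        return "Less important"
    if rating <= 6:
        return "Important"
    return "Critically important"


if __name__ == "__main__":
    for rating in (2, 5, 8):
        print(rating, "->", importance_category(rating))
```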
13
Question to the audience
Deciding on the importance of outcomes on decision making:
1, 2, 3: Less important
4, 5, 6: Important
7, 8, 9: Critically important
Please rate outcome: Dying from pulmonary embolism
A. (1, 2, 3): Less important for decision making
B. (4, 5, 6): Important for decision making
C. (7, 8, 9): Critically important for decision making
14
Question to the audience
Deciding on the importance of outcomes on decision making:
1, 2, 3: Less important
4, 5, 6: Important
7, 8, 9: Critically important
Asymptomatic deep vein thrombosis in the calf (e.g., as
seen on mandatory venography at end of study)
A. (1, 2, 3): Less important for decision making
B. (4, 5, 6): Important for decision making
C. (7, 8, 9): Critically important for decision making
15
Question to the audience
Deciding on the importance of outcomes on decision making:
1, 2, 3: Less important
4, 5, 6: Important
7, 8, 9: Critically important
Stomach ulcer bleeding requiring endoscopy
A. (1, 2, 3): Less important for decision making
B. (4, 5, 6): Important for decision making
C. (7, 8, 9): Critically important for decision making
16
Question to the audience
Deciding on the importance of outcomes on decision making:
1, 2, 3: Less important
4, 5, 6: Important
7, 8, 9: Critically important
Regular blood work and dose adjustments
A. (1, 2, 3): Less important for decision making
B. (4, 5, 6): Important for decision making
C. (7, 8, 9): Critically important for decision making
17
Rating the importance of
outcomes
• Train the content experts to identify the
outcomes that are critical for decision making
• Rating is done before, during and after the
evidence review
• The rating may change in light of new
information
18
[Diagram repeated from slide 9: GRADE process overview, from the PICO question and rating of outcome importance, through the systematic review and grading the quality of evidence down or up for each outcome, to rating the overall quality of evidence and formulating recommendations.]
19
Taking it to the next level
• Early involvement of consumers in the
guideline development process
• Selecting systematic reviews that are known
to make an effort to include consumer views
(e.g., Cochrane etc.)
• Can be used to identify research gaps
20
Evidence review stage
What format of evidence do you use?
Resource scale: $$$ to $
Using mainly systematic reviews (SR)
• Have the resources: do it in-house or outsource
• Don’t have the resources: ready to use SR (search for SR, update SR)
Utilize the full GRADE framework (± evidence profiles)
Mainly using single study data
• Not ready to use SR: ad hoc reviews
Use GRADE without evidence profiles
21
Question to the audience
Select the best answer: You can find high
quality systematic reviews for “free” here:
A. AHRQ
B. The Cochrane Library
C. Canadian Agency for Drugs and Technologies in
Health (CADTH)
D. National Institute for Clinical Excellence (NICE), UK
E. All of the above
22
Taking it to the next level
• What to look for when selecting evidence
review centers
• Commissioning systematic reviews: Making
sure the center understands GRADE
requirements
  • What SR methodology they use
  • What databases they can search
  • What software they use
  • How they document their work
23
Question to the audience
GRADE rating evidence: The quality of
evidence may need downgrading if:
A. The outcome is reduction of elevated pressure in
the eye (IOP) instead of loss of vision
B. There are large losses to follow-up
C. Some trials show benefits, others report
harms
D. The confidence interval is wide and there are few
events
E. All of the above
24
Quality of evidence: beyond risk of bias
Definition: The extent to which our confidence in an estimate of the
treatment effect is adequate to support a particular recommendation
Five factors:
• Methodological limitations
• Inconsistency of results
• Indirectness of evidence
• Imprecision of results
• Publication bias
Risk of bias:
• Allocation concealment
• Blinding
• Intention-to-treat
• Follow-up
• Stopped early
Sources of indirectness:
• Indirect comparisons
• Patients
• Interventions
• Comparators
• Outcomes
25
Quality assessment criteria
Study design and starting quality of evidence:
• Randomized trials: High
• Observational studies: Low
Quality of evidence levels: High, Moderate, Low, Very low
Lower if…
• Study limitations (design and execution)
• Inconsistency
• Indirectness
• Imprecision
• Publication bias
Higher if…
• What can raise the quality of evidence?
26
Question to the audience
A systematic review of observational studies
showed a relationship between front sleeping
position (versus back position) and sudden
infant death syndrome (SIDS): OR 2.93 (1.15,
7.47). Rate the quality of evidence for the
outcome SIDS:
A. High
B. Moderate
C. Low
D. Very low
27
Question to the audience
You review all colonoscopies for average risk
screening in your health system and document the
percentage of patients who developed a
perforation after the procedure (evidence of free
air on imaging). No comparison group without
colonoscopy is available. Rate the quality of
evidence for the outcome perforation:
A. High
B. Moderate
C. Low
D. Very low
28
Question to the audience
Several RCTs have shown the effectiveness of
natalizumab to induce remission in Crohn’s
disease. Study/post-marketing data showed 31
cases of potentially lethal progressive multifocal
leukoencephalopathy (PML, JC virus related).
Rate the quality of evidence for PML:
A. High
B. Moderate
C. Low
D. Very low
29
Quality assessment criteria
Study design and starting quality of evidence:
• Randomized trials: High
• Observational studies: Low
Quality of evidence levels: High, Moderate, Low, Very low
Lower if…
• Study limitations (design and execution)
• Inconsistency
• Indirectness
• Imprecision
• Publication bias
Higher if…
• Large effect (e.g., RR 0.5)
• Very large effect (e.g., RR 0.2)
• Evidence of dose-response gradient
• All plausible confounding would reduce a demonstrated effect
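The criteria above can be sketched as a small calculation. This is an illustration under stated assumptions, not the GRADE working group's tooling: the function and factor names are hypothetical, each concern is assumed to move the rating by a whole number of levels chosen by the reviewer, and the result is clamped to the four levels on the slide.

```python
LEVELS = ["Very low", "Low", "Moderate", "High"]

# Factor names taken from the slide; the dictionary interface is an assumption.
DOWNGRADE_FACTORS = {"study limitations", "inconsistency", "indirectness",
                     "imprecision", "publication bias"}
UPGRADE_FACTORS = {"large effect", "dose-response", "confounding"}


def rate_quality(design, down=None, up=None):
    """Rate the quality of evidence for one outcome.

    design: "rct" (starts High) or "observational" (starts Low).
    down / up: dicts mapping a factor name to the number of levels to move.
    """
    down = down or {}
    up = up or {}
    unknown = (set(down) - DOWNGRADE_FACTORS) | (set(up) - UPGRADE_FACTORS)
    if unknown:
        raise ValueError(f"unknown factors: {sorted(unknown)}")
    start = {"rct": 3, "observational": 1}[design]
    index = start - sum(down.values()) + sum(up.values())
    # Clamp so the rating never leaves the Very low .. High range.
    return LEVELS[max(0, min(index, len(LEVELS) - 1))]


if __name__ == "__main__":
    print(rate_quality("rct", down={"imprecision": 1}))
    print(rate_quality("observational", up={"large effect": 1}))
```

Deciding whether a concern is serious enough to move the rating remains a judgment for the reviewers; the code only records the arithmetic once those judgments are made.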
30
“Categories” of quality (1)
High
Further research is very unlikely to change our
confidence in the estimate of effect
Moderate
Further research is likely to have an important impact on
our confidence in the estimate of effect and may change
the estimate
Low
Further research is very likely to have an important
impact on our confidence in the estimate of effect and is
likely to change the estimate
Very low
Any estimate of effect is very uncertain














31
Conceptualizing quality (2)
High
We are very confident that the true effect lies close to
that of the estimate of the effect.
Moderate
We are moderately confident in the estimate of effect:
The true effect is likely to be close to the estimate of
effect, but there is a possibility that it is substantially different.
Low
Our confidence in the effect is limited: The true effect
may be substantially different from the estimate of the
effect.
Very low
We have very little confidence in the effect estimate:
The true effect is likely to be substantially different from
the estimate of effect.














32
Taking it to the next level
• Advantages of systematically assessing
quality of evidence
• Downgrading and upgrading “on-the-fly” can
introduce errors
Study / year: REMOBILIZE 2009
Treatment and no outcome (%):
• dabigatran 220 mg QD: 269/862 (31.2%)
• dabigatran 150 mg QD: 232/877 (26.5%)
• enoxaparin 30 mg BID: 239/876 (27.3%)
Allocation concealment: Yes (IVRS) (blocks of 6)
Blinding: Patients: Y; Caregivers: Y; Data coll: PY; Adjudic: Y; Data analysts: ?
Analysis: ITT: no
Comments: Low dose ASA and stocking allowed, but not pneumatic devices
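The "No outcome (%)" figures in the REMOBILIZE row above can be re-derived from the counts, which is one quick check when filling in such a table. This sketch assumes the three fractions correspond to the three treatment arms in the order listed and that each denominator is the number of patients in that arm; the function name is hypothetical.

```python
# Counts copied from the REMOBILIZE row; the pairing of counts to arms
# by order is an assumption.
ARMS = {
    "dabigatran 220 mg QD": (269, 862),
    "dabigatran 150 mg QD": (232, 877),
    "enoxaparin 30 mg BID": (239, 876),
}


def percent_without_outcome(missing: int, total: int) -> float:
    """Percentage of patients with no outcome assessment, to one decimal."""
    return round(100 * missing / total, 1)


if __name__ == "__main__":
    for arm, (missing, total) in ARMS.items():
        pct = percent_without_outcome(missing, total)
        print(f"{arm}: {missing}/{total} ({pct}%)")
```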
33
GRADE evidence profile
34
Question to the audience
PICO: Should children with otitis media be
treated with antibiotics?
Rate the overall quality of evidence for this
clinical question by evaluating all critical
outcomes (use the evidence profile):
A. High
B. Moderate
C. Low
D. Very low
35
Outcome
Critical
Outcome
Important
Outcome
Important
Outcome
Less
Overall quality of evidence
Critical
Grade down or up
P
I
C
O
Outcome
Formulate recommendations:
• For or against (direction)
• Strong or weak (strength)
By considering:
• Quality of evidence
• Balance benefits/harms
• Values and preferences
Revise if necessary by considering:
• Resource use (cost)
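The rule stated earlier in the deck, that the overall quality of evidence is based on the lowest quality among the critical outcomes, can be sketched as follows. The tuple layout and function name are assumptions made for illustration, and the example importance labels are hypothetical.

```python
LEVELS = ["Very low", "Low", "Moderate", "High"]


def overall_quality(outcomes):
    """Return the lowest quality rating among critical outcomes.

    outcomes: iterable of (name, importance, quality) tuples, where
    importance is "Critical", "Important" or "Less" and quality is one
    of LEVELS. Non-critical outcomes do not affect the overall rating.
    """
    critical = [quality for _, importance, quality in outcomes
                if importance == "Critical"]
    if not critical:
        raise ValueError("at least one critical outcome is required")
    return min(critical, key=LEVELS.index)


if __name__ == "__main__":
    example = [
        ("Outcome #1", "Critical", "High"),
        ("Outcome #2", "Critical", "Moderate"),
        ("Outcome #3", "Important", "Low"),
    ]
    print(overall_quality(example))
```

In this example the low-quality outcome is only "Important", so it does not pull the overall rating below the weaker of the two critical outcomes.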
36
Question to the audience
PICO: Should children with otitis media be
treated with antibiotics?
Rate the overall strength of recommendation:
A. “We recommend early antibiotics in children with
acute otitis media”
B. “We suggest early antibiotics…”
C. “We suggest against using antibiotics initially…”
D. “We recommend against using antibiotics initially…”
37
Strength of recommendation
“The strength of a recommendation reflects the
extent to which we can,
across the range of patients for whom the
recommendations are intended,
be confident that desirable effects of a management
strategy outweigh undesirable effects.”
4 determinants of the strength
of recommendation
Factors that can weaken the strength of a recommendation, each with its explanation:
• Lower quality evidence: The higher the quality of evidence, the more
likely is a strong recommendation.
• Uncertainty about the balance of benefits versus harms and burdens:
The larger the difference between the desirable and undesirable
consequences, the more likely a strong recommendation is warranted. The
smaller the net benefit and the lower the certainty for that benefit,
the more likely a weak recommendation is warranted.
• Uncertainty or differences in patients’ values: The greater the
variability in values and preferences, or uncertainty in values and
preferences, the more likely a weak recommendation is warranted.
• Uncertainty about whether the net benefits are worth the costs: The
higher the costs of an intervention – that is, the more resources
consumed – the less likely a strong recommendation is warranted.
39
Implications of a
strong recommendation
• Patients: Most people in this situation would want
the recommended course of action and only a small
proportion would not
• Clinicians: Most patients should receive the
recommended course of action
• Policy makers: The recommendation can be
adopted as a policy in most situations
40
Implications of a
weak recommendation
• Patients: The majority of people in this situation
would want the recommended course of action,
but many would not
• Clinicians: Be prepared to help patients make a
decision that is consistent with their own values;
decision aids and shared decision making can help
• Policy makers: There is a need for substantial
debate and involvement of stakeholders
41
Taking it to the next level
• Explicit separation of quality of evidence from
making recommendations
• Correctly balancing the benefits against the
undesirable effects
• Special challenges: resource use
• Increasing transparency in the process of
making recommendations
42
Question to the audience
Should patients with chronic hepatitis C be
treated with interferon/ribavirin combination?
There is high quality evidence for benefits and
high quality evidence for harms.
Rate the overall strength of recommendation:
A. “We recommend treatment of chronic hepatitis C”
B. “We suggest treatment…”
C. “We suggest against treating patients…”
D. “We recommend against treating patients…”
43
Patient values & preferences
• In the absence of evidence, guideline panels
have to function as surrogates to estimate
values and preferences (V&P)
• Consumer involvement can help
• Attaching V&P statements to guideline
recommendations increases transparency
44
Taking it to the next level
• Systematically searching the literature for
studies of values and preferences
• Systematic reviews of V&P
• Querying the guideline panel to rate health
utilities of outcomes using case scenarios
45
Question to the audience
Please select the most appropriate
answer. The reason you attended this
session:
A. Just interested in the topic
B. Have been involved in narrative evidence reviews,
but have not used any formal grading system
C. Have used a grading system but not GRADE
D. Using or considered using GRADE
46
Question to the audience
Please select the most appropriate
answer. Selecting a system to rate the
quality of evidence and strength of
recommendations, such as GRADE:
A. Appears too expensive to implement
B. Appears valuable, but still requires substantial
upfront expense
C. Appears to have some upfront cost but long-term
savings
D. I use GRADE – it has been paying off for me
47
Basic dimensions
Guideline work aligns along 3 basic dimensions
• High quality vs. low quality
• Fast vs. slow
• Expensive vs. cheap
48
Ideal vs. practical ad hoc GRADE approaches
Stage: Ideal
• Elements: Systematic review; GRADE eTables; Qual. of evidence; Strength of rec.
• Advantage: Follows highest standards; methodolog. most rigorous; easily maintainable; fully transparent process
• Comment: Access to methodologist; access to evidence centers; initially more resource intensive, long-term savings
Stage: Intermediary
• Elements: Ad hoc review; GRADE eTables; Qual. of evidence; Strength of rec.
• Advantage: Still retaining major advantages of the “ideal approach”
• Comment: Risk of bias higher; access methodologist rec.; only minimal addl. cost
Stage: Initiation
• Elements: Ad hoc review; GRADE eTables; Qual. of evidence; Strength of rec.
• Advantage: Option to fully “upgrade” to an “ideal approach”; foundation of a methodologically sound system
• Comment: Risk of bias higher; access methodologist prn; no additional cost
49
Sources of funding
• Funders may have an agenda
• Industry – tricky
• Foundations
• Public – AHRQ, criteria
  • EHC program fit (3: available, relevance for public payer, priority condition)
  • Importance (7: e.g., public interest etc.)
  • No duplication
  • Feasibility
  • Impact (6: e.g., addresses inequity)
50
Taking it to the next level
• Long term planning
• Create a high quality guideline product
• Attract high quality guideline panel
  • Unconflicted methodologist (editor)
  • Content expert (deputy editor)
  • Content expert authors
  • Health economists
51
Taking it to the next level
• GRADE evidence profiles
  • Condensed and standardized summary of evidence
  • Are increasingly already created as part of a
    systematic review (e.g., Cochrane reviews)
  • Flexible presentation (e.g., as summary of findings
    tables)
  • Initial investment
  • Long-term value
  • GRADEpro software (tie-in with RevMan)
  • Avoids duplication of efforts across the globe
52
Vision
1. Globalize the evidence, localize
recommendations
2. Focus on questions that are important to
patients and clinicians
3. Undertake collaborative evidence reviews
4. Use a common metric to assess the quality of
evidence and strength of recommendations
5. Examine collaborative models for funding
Schünemann 2009
53
GRADE uptake
54
Conclusion
Gaining acceptance as international standard
because GRADE adds value:
1. Criteria for evidence assessment across a range
of questions and outcomes
2. Sensible, systematic, fostering transparency
3. Balance between simplicity and methodological
rigor