Webcast-Slides-Nunnally-Grade

Mark E. Nunnally, MD, FCCM
Co-Director, Critical Care Fellowship and
Associate Professor in the Department of
Anesthesia and Critical Care
University of Chicago Medical Center
Chicago, Illinois
GRADE Methodology Expert
Contributing Author, “Surviving Sepsis
Campaign: International Guidelines for
Management of Sepsis and Septic Shock: 2012”
Making GRADE work: a how-to
for guidelines authors
Mark E. Nunnally, MD, FCCM
Associate Professor
Department of Anesthesia & Critical Care
The University of Chicago
Course objectives I
• Translate evidence into graded
recommendations.
• Identify the features that reduce or increase
the quality of evidence.
Course objectives II
• Appraise clinical data to determine quality of
evidence.
• Integrate quality of evidence for an
intervention with costs, the balance between
desirable and undesirable effects and values
to determine the strength of a
recommendation.
Contents
• GRADE- why?
• Transparency and Certainty
• The Guidelines process: a methodologist’s
perspective
• GRADE- components
• Summary
Conflict of interest.
I am a GRADE advisor for the Surviving Sepsis
Campaign
Conflict of interest.
I am also only a consultant. YOU are the
experts.
WHY GRADE?
Many guidelines, little
standardization
Some inform…
Some restrict…
All claim to be evidence-based…
…how can we be certain a
guideline is supported by the
evidence?
…how can we be certain its
recommendations will hold
over time?
…how relevant is the
recommendation to the
things that matter to me?
Should we rate evidence?
• ‘Quality’ is a diluted term
• Quality is a continuum
• Decisions are always somewhat arbitrary
• ‘Experts’ and clinicians don’t always share
the same view
– This is one reason evidence and
recommendations should be separate.
Should we rate evidence?
• You need some reference
• Simplicity
• Transparency
• Vividness
Grading of Recommendations
Assessment, Development and
Evaluation
• www.gradeworkinggroup.org
• International consensus document
• Template for systematic reviews,
recommendations
TRANSPARENCY AND CERTAINTY
QOE- Philosophical Bent
• We are going to make recommendations that
we (or others) will subsequently change.
– GRADE lets us:
• try to define how likely that is
• communicate our certainty in any effect
• translate findings to clinical realities, by accounting for
the costs, tradeoffs and effort behind following a
recommendation
Example- Glycemic Control
• 2001: Van den Berghe publishes sentinel
article: NEJM 2001, 345
• 2003-2008: Guidelines, protocols, quality
metrics proposed
• 2009: NICE-SUGAR
• 2009-present: Re-write or retire
Be Explicit
• What are the data?
• What are their limitations?
• How easy is it to do something?
• How confident are you in recommending?
The guidelines process: a
methodologist’s perspective
Getting from evidence to guidelines
Evidence Hierarchy
• Experience
• Reports
• Observational Studies
• RCTs
• Meta-analyses
Guidelines Hierarchy
• Clinical biases
• Experience-based
tendencies
• Cost analyses
• Decision analyses
• Formal Guidelines
Not all guidelines are created equal
• Formulate question (PICO)
• Select outcomes and rate importance of outcomes
– Outcome 1: Critical
– Outcome 2: Critical
– Outcome 3: Important
– Outcome 4: Not important
• Systematic review of evidence (outcomes across studies)
• Evidence Profile (GRADEpro)
– Pooled estimate of effect for each outcome
– Quality of evidence for each outcome: High | Moderate | Low | Very low
• Rating the quality of evidence
– Start: RCT (high), observational (low)
– Rate down: 1. risk of bias, 2. inconsistency, 3. indirectness, 4. imprecision, 5. publication bias
– Rate up: 1. large effect, 2. dose-response, 3. antagonistic bias
• Guideline panel: rate overall quality of evidence across outcomes
• Formulate recommendations
– For or against an action
– Strong or weak (strength)
• Strong or weak:
– Quality of evidence
– Balance benefits/downsides
– Values and preferences
– Resource use (cost)
• Wording
– “We recommend…” | “Clinicians should…”
– “We suggest…” | “Clinicians might…”
– unambiguous
– clear implications for action
– transparent (values & preferences statement)
• Question: PICO
• Evidence: Summarize
• Judge: QOE, SOR
THE QUESTION
PICO
• Population
– Ventilated patients, APACHE scores
• Intervention
– Medicine, therapy, education, systems intervention
• Comparison
– High (how high) versus low (how low) tidal volume
• Outcome
– FBI: mortality (at what follow-up), LOS, VAP
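A PICO question can be held in a small record so that every recommendation stays tied to an explicit question. The sketch below is only an illustration: the class name `PicoQuestion`, its fields and the `as_question` helper are assumptions made for this example, not part of GRADE or GRADEpro.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PicoQuestion:
    """Hypothetical container for the four PICO elements."""
    population: str
    intervention: str
    comparison: str
    outcomes: List[str] = field(default_factory=list)

    def as_question(self) -> str:
        # Render the four elements as one explicit clinical question.
        return (
            f"In {self.population}, does {self.intervention} "
            f"compared with {self.comparison} affect "
            f"{', '.join(self.outcomes)}?"
        )


# Example loosely based on the slide: tidal volume in ventilated patients.
question = PicoQuestion(
    population="ventilated patients",
    intervention="low tidal volume",
    comparison="high tidal volume",
    outcomes=["mortality", "LOS", "VAP"],
)
print(question.as_question())
```

Writing the question down this way forces the "how high, how low, at what follow-up" details to be stated rather than implied.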
Rating outcomes
• 7-9: critical [death, disability or both]
• 4-6: important [skin breakdown, sepsis]
• 1-3: limited [ileus, ICU stay]
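The 1-9 importance scale above amounts to a simple lookup. A minimal sketch, assuming integer scores; the function name `classify_outcome` and the example ratings are made up for illustration.

```python
def classify_outcome(score: int) -> str:
    """Map a 1-9 importance rating to the category shown on the slide."""
    if not 1 <= score <= 9:
        raise ValueError("importance score must be between 1 and 9")
    if score >= 7:
        return "critical"
    if score >= 4:
        return "important"
    return "limited"


# Hypothetical panel ratings, for illustration only.
ratings = {"mortality": 9, "skin breakdown": 5, "ileus": 2}
categories = {name: classify_outcome(score) for name, score in ratings.items()}
print(categories)
```

The scores themselves remain a panel judgment; the code only records which bucket each outcome falls into.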
THE EVIDENCE
Collect evidence
• Be thorough
– Use explicit search strategies
– Decide on published v unpublished data
• Consider gray literature in some cases
– Proceedings papers
– Abstracts
– Clinicaltrials.gov
– ALWAYS consider comparator
Assembling
Evidence is
Hard
Data have to be summarized
to inform
GRADE pragmatic approach
• Get a good meta-analysis (MA)
• If no MA, identify main studies
• If possible, do your own MA
• If no MA, describe main studies/results
– Be explicit (inclusion/exclusion, flaws)
• Keep the link between recommendation and
evidence
Meta-Analysis- the Good and the Bad
• Good
– One-stop synthesis
– Exploration of variability
– Improve power
– Ideally- data shown as sum and parts
• Bad
– Important detail lost
– Heterogeneity
– N-omegalic significance
– A stew is the sum of its ingredients
Don’t GRADE everything
• No plausible alternative
– Surveying for infection, resuscitating shock,
practicing quality improvement
• Recommend to consider
– As opposed to not considering?
• Statements lacking specificity
– Intervention, Comparison, relevant Outcomes
(good and bad)
JUDGING
Judge Evidence and Recommendation
• Unique to GRADE
• Related, but distinct
• Recommendation must take clinical realities
into account
– Costs
– Burdens
– Benefits/risks
– Values
Recommendations
Have 2 Components:
• Strength
• Direction
GRADE COMPONENTS
Entering the GRADE meat-grinder
• RCT- High quality
• Observational study- Low quality
• Expert report- Very Low quality
Grade Down
• Study limitations
• Inconsistency
• Indirectness
• Imprecision
• Publication Bias
Grade Down
• Study limitations
– Allocation concealment
– Blinding
– Loss to follow-up
– No intent-to-treat
– Stopping early
– Failure to report outcomes
• Inconsistency
• Indirectness
• Imprecision
• Publication Bias
Study Limitations/Risk of Bias
• Bias definition: 1. Unequal distribution of risk
factors (confounders) across study groups. 2.
Factors that systematically change study
effects to result in a directional change in the
signal.
Risk of Bias
• GRADE treats bias by individual outcomes
– Pain scores- strong effect if unblinded
– Mortality- effect of blinding less clear
– Loss to follow-up for different outcome windows
• With multiple studies and different risks of
bias, quality should be judged by the relative
contribution of studies to the confidence in
the effect.
Risk of Bias
• Blinding
– Patient, clinician, data assessor
• Concealment of allocation
• Intention-to-treat principle
– Absence negates the balance from randomization
Risk of Bias
• Stopping Early for Benefit, especially if trials
have < 500 events
– Bassler D, et al. JAMA 2010;303(12):1180-7
• Selective outcome reporting
– Only positive outcomes, composite results only, or
lack of pre-specified outcomes
• Loss to follow-up
– Significance relates to # of events
Risk of bias- Observational Studies
• Prognosis can differ
• Groups can have multiple differences:
– Time
– Place
– Population
– Co-morbidity
This is why observational studies typically enter as
“Low” quality of evidence
Grade Down
• Study limitations
• Inconsistency
– Widely differing estimates of treatment effect
– Heterogeneity not explained
– Differences: populations, interventions, outcomes
• Indirectness
• Imprecision
• Publication Bias
Inconsistency
• Definition: 1. Heterogeneity. 2. Lack of
similarity of point estimates or confidence
intervals. 3. Variable findings unexplained by a
priori hypotheses. 4. Subgroup effects that
cannot be sufficiently explained.
Inconsistency
• Generally, effects are looked at in relative
terms, rather than absolute
– Subgroups may have different baseline rates, but
similar relative effects
Inconsistency
• Inconsistency can come from study diversity:
– Populations
– Interventions
– Outcomes
– Study methods
• Credible inconsistency may lead to split
recommendations
Basic assessments of inconsistency
• Point estimates vary widely
• Little or no CI overlap
• Test of heterogeneity shows a low p value
– χ² (P ≤ 0.10 may be sufficient)
• I² is large:
– <40%: low
– 30-60%: moderate
– 50-90%: substantial
– 75-100%: considerable
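The χ² heterogeneity statistic (Cochran's Q) and I² can be computed from per-study effects and standard errors. The sketch below assumes inverse-variance weights on the log relative risk scale and uses the usual definition I² = max(0, (Q − df)/Q) × 100%. The function name `heterogeneity` and the four trials are made up for illustration.

```python
import math


def heterogeneity(log_effects, standard_errors):
    """Return Cochran's Q, its degrees of freedom, and I² as a percentage."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    # Fixed-effect (inverse-variance) pooled estimate.
    pooled = sum(w * y for w, y in zip(weights, log_effects)) / sum(weights)
    # Q: weighted squared deviations of each study from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_effects))
    df = len(log_effects) - 1
    i_squared = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i_squared


# Made-up log relative risks and standard errors for four hypothetical trials.
log_rr = [math.log(0.70), math.log(0.85), math.log(1.10), math.log(0.60)]
se = [0.10, 0.12, 0.15, 0.20]
q, df, i2 = heterogeneity(log_rr, se)
print(f"Q = {q:.2f} on {df} df, I² = {i2:.0f}%")
```

The P value for Q comes from a χ² distribution with `df` degrees of freedom, which the standard library does not provide. As the next slide stresses, a large I² only matters if the variability would change a clinical decision.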
Context
• It is only significant inconsistency if the
variability would influence a clinical decision
– If point estimates and CIs favor treatment over
costs/burdens/side effects, no need to downgrade
Inconsistency
• Example:
• Low-dose steroids in sepsis:
– 6 studies, 3 high baseline mortality, 3 low, with
difference in effect:
• Patel GP. Am J Respir Crit Care Med 2012;185:133-139
Grade Down
• Study limitations
• Inconsistency
• Indirectness
– If a>>b and c>b, is a>c?
– Differences from intervention and outcome of interest: population, intervention, comparator
• Imprecision
• Publication Bias
Indirectness
• Definition: 1. Evidence does not directly
compare to the clinical question of interest. 2.
Differing patients, interventions, comparisons
or outcomes in available studies necessitate
extrapolation of evidence to question being
addressed.
Indirectness
• Examples:
– Animal studies: downgrade 1 or 2 levels, in
general, but consider the relevance of the data
(toxicity v therapeutic benefit)
– If drug A>B and B>C, is A>C?
– Low-fat diet: US versus French population
• Setting, co-”interventions,” genetics
– Surrogate outcomes: Blood pressure control
versus cardiovascular events
Indirectness
• Example:
– H2RA and PPI: C. difficile infection: observational
study not direct to critically ill patients, but with
interesting effect: Very Low QOE
• Leonard J et al. Am J Gastroenterol 2007;102: 2047
Grade Down
• Study limitations
• Inconsistency
• Indirectness
• Imprecision
– Few patients, outcomes
– Wide confidence intervals
• Publication Bias
Imprecision
• Definition: 1. High impact of random error on
evidence quality. 2. Wide range of results to
be expected from repetitive study. 3. Wide
range in which the truth likely lies.
Imprecision
• Driven by # of events and by degree of effect
• 95% confidence intervals may encompass
harm and benefit
– Taken in the context of the recommendation
• More important: 95% CIs embrace absolute
values that reduce our confidence in a
recommendation
Relative Effect
Absolute Effect
Use absolute effects
Toxicity
Imprecision
• Example:
– NE v Vasopressin: Mortality CI wide, spanned RR =
1.
• for ventricular arrhythmias, RR 0.47 (0.38, 0.58), but 21
events ⇒ FRAGILE
– H2RA and pneumonia: unable to exclude harm
– Negative factors may require tighter CIs:
• Side effects/toxicity
• Burdens/costs
Grade Down
• Study limitations
• Inconsistency
• Indirectness
• Imprecision
• Publication Bias
– Few trials
– Industry funding
– Asymmetric funnel plot
Publication Bias
• Definition: 1. Studies with statistically
significant results more likely to be counted
than negative studies. 2. Smaller, high-effect
studies disproportionately impact published
literature. 3. Published commercially-funded
studies are more likely to be positive.
Publication Bias
• Publication: + Studies > – Studies (RR 1.78)
– Hopewell S, The Cochrane Database of Systematic
Reviews, 2007.
• – Studies: delayed, obscure publication
• + studies: duplicate publication
• Small studies, industry sponsor ⇒
↑publication bias
Publication Bias
• How to detect? It’s more difficult than one
might think.
– Look for:
• Small trials
• Conflicts in authors/study sponsors
• Duplications
• Abstracts, grey literature with negative findings
• Unpublished data
– Ideally, we would trend MAs over time
Publication Bias
Pooled Estimate
Selective Publication
Greater Study Limitations
More Restrictive/Responsive Population
Publication Bias- Testing
• Tests of asymmetry
• Imputing missing information
• Repeated MA over time
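One commonly used test of funnel-plot asymmetry is an Egger-style regression: regress each study's standardized effect (effect / SE) on its precision (1 / SE) and check whether the intercept sits far from zero. The slides do not name a specific test, so treat this as one possible sketch. The function name `egger_regression` and the data are made up, and a formal test would also need the standard error of the intercept, which is omitted here.

```python
def egger_regression(effects, standard_errors):
    """Ordinary least squares of (effect / SE) on (1 / SE).

    Returns (intercept, slope). An intercept far from zero hints at
    small-study effects; the slope estimates the underlying effect.
    """
    x = [1.0 / se for se in standard_errors]                   # precision
    y = [e / se for e, se in zip(effects, standard_errors)]    # standardized effect
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up log relative risks: the smaller (less precise) studies show larger effects.
effects = [-0.10, -0.15, -0.30, -0.50, -0.70]
ses = [0.05, 0.10, 0.20, 0.30, 0.40]
intercept, slope = egger_regression(effects, ses)
print(f"intercept = {intercept:.2f}, slope = {slope:.3f}")
```

With few trials such a regression has little power, which is one reason detection is "more difficult than one might think."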
Publication Bias- Addressing the Problem
• Thorough research
– Gray Literature
– FDA submissions
– Abstracts, proceedings
– Author Contact
• Clinicaltrials.gov
– N.B: only for RCTs, not observational studies
Grade Up
• Large magnitude of
effect
• Dose response
gradient
• Bias likely to blunt
results
Grade Up
• Large magnitude of effect
– Stronger signals signal stronger evidence
• Dose response gradient
• Bias likely to blunt results
Grade Up
• Large magnitude of effect
• Dose response gradient
– Signal pattern consistent with physiologic model
• Bias likely to blunt results
Grade Up
• Large magnitude of effect
• Dose response gradient
• Bias likely to blunt results
– Some studies run up against mitigating factors that work against them.
Moving Up- Examples
• Very strong, consistent association; no plausible
confounders, up 2 grades
– insulin in diabetic ketoacidosis
– antibiotics in septic shock
• Strong, consistent association with no plausible
confounders up 1 grade
How to get GRADEpro on your computer?
• Cochrane IMS website
• cc-ims.net/revman/gradepro/download
• http://www.ccims.net/revman/gradepro/download
• Google ‘gradepro’
GRADE output: Summary of Findings
GRADE output: Evidence Profile
Question: Should longer term (7 day) low dose (up to 300 mg/day of hydrocortisone) glucocorticosteroids be
used in severe sepsis and septic shock?
Settings: ICU
Bibliography: Annane 2009
Quality assessment and summary of findings, by outcome
Intervention: longer term (7 day) low dose (up to 300 mg/day of hydrocortisone) glucocorticosteroids; comparator: control
Mortality, 28 days
• No of studies: 12
• Design: randomised trials
• Limitations: no serious limitations
• Inconsistency: serious [1]
• Indirectness: no serious indirectness
• Imprecision: no serious imprecision
• Other considerations: none
• No of patients: 236/629 (37.5%) glucocorticosteroids; 264/599 (44.1%) control
• Relative effect (95% CI): RR 0.84 (0.72 to 0.97)
• Absolute effect: 71 fewer per 1000 (from 13 fewer to 123 fewer)
• Quality: MODERATE
• Importance: CRITICAL [2]
GI bleeding
• No of studies: 3
• Design: randomised trials
• Limitations: no serious limitations
• Inconsistency: no serious inconsistency [3]
• Indirectness: no serious indirectness
• Imprecision: serious [4]
• Other considerations: none
• No of patients: 65/827 (7.9%) glucocorticosteroids; 56/767 (7.3%) control
• Relative effect (95% CI): RR 1.12 (0.81 to 1.53)
• Absolute effect: 9 more per 1000 (from 14 fewer to 39 more)
• Quality: MODERATE
• Importance: IMPORTANT
Superinfections
• No of studies: 4 [5]
• Design: randomised trials
• Limitations: no serious limitations
• Inconsistency: no serious inconsistency [6]
• Indirectness: no serious indirectness
• Imprecision: no serious imprecision [7]
• Other considerations: none
• No of patients: 184/983 (18.7%) glucocorticosteroids; 170/934 (18.2%) control
• Relative effect (95% CI): RR 1.01 (0.82 to 1.25)
• Absolute effect: 2 more per 1000 (from 33 fewer to 46 more)
• Quality: HIGH
• Importance: IMPORTANT
Footnotes
[1] Meta-regression examining the effect of severity of illness (baseline mortality) on efficacy suggested an effect: p value 0.04
using fixed effect and 0.06 using random effect model. JAMA 2009; 302:1643-1645.
[2] Reported for all trials
[3] I² = 0
[4] RR up to 1.53
[5] need to check
[6] I² = 8%
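The absolute effects in the evidence profile follow from the control-group risk and the relative risk: risk difference per 1000 = 1000 × control risk × (RR − 1). A minimal sketch using the mortality figures from the profile (control 264/599, RR 0.84, 95% CI 0.72 to 0.97); the function name `absolute_effect_per_1000` is an assumption made for this example.

```python
def absolute_effect_per_1000(control_events, control_total, rr):
    """Risk difference per 1000 patients implied by a relative risk.

    Negative values mean fewer events per 1000, positive values mean more.
    """
    baseline_risk = control_events / control_total
    return round(1000 * baseline_risk * (rr - 1))


# Mortality, 28 days: control 264/599, RR 0.84 (0.72 to 0.97).
point = absolute_effect_per_1000(264, 599, 0.84)   # -71, i.e. 71 fewer per 1000
lower = absolute_effect_per_1000(264, 599, 0.72)   # -123, i.e. 123 fewer per 1000
upper = absolute_effect_per_1000(264, 599, 0.97)   # -13, i.e. 13 fewer per 1000
print(point, lower, upper)
```

The same calculation reproduces the GI bleeding and superinfection rows. Because the result scales with baseline risk, the same RR can mean very different absolute benefit in different populations.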
Final QOE
• High: A, ++++, ↑↑↑↑
• Moderate: B, +++-, ↑↑↑
• Low: C, ++--, ↑↑
• Very Low: D, +---, ↑
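The path from study design to a final letter is bookkeeping around a set of judgments: start at the design's level, move down for each serious concern, move up for each strengthening factor, and stay within the four-level scale. The sketch below only does the bookkeeping. The names `rate_quality`, `levels_down` and `levels_up` are assumptions made for this example, and how many levels to move for a given concern remains the panel's call.

```python
LEVELS = ["Very Low", "Low", "Moderate", "High"]
LETTERS = {"High": "A", "Moderate": "B", "Low": "C", "Very Low": "D"}
# Starting levels by design, as on the "meat-grinder" slide.
START = {"rct": "High", "observational": "Low", "expert report": "Very Low"}


def rate_quality(design, levels_down=0, levels_up=0):
    """Start from the design's level, then move down/up, clamped to the scale."""
    index = LEVELS.index(START[design]) - levels_down + levels_up
    index = max(0, min(index, len(LEVELS) - 1))
    level = LEVELS[index]
    return level, LETTERS[level]


# RCTs rated down one level (e.g. serious inconsistency, as in the mortality row).
print(rate_quality("rct", levels_down=1))
# Observational data rated up two levels for a very strong, consistent association.
print(rate_quality("observational", levels_up=2))
```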
Alternate QOE interpretation
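• High- Further research very unlikely to change
confidence in the estimate of effect
• Moderate- Further research likely to have an important impact on confidence
• Low- Further research very likely to have an important impact on confidence
• Very Low- Any estimate of effect is uncertain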
Separate QOE and Strength of
Recommendation
GRADE’s defining feature
• Evidence: high or low quality?
• likelihood estimates are true and adequate
• Recommendation: weak or strong?
• confidence that following recommendation will cause
more good than harm
Factors- STRONG vs WEAK
• Balance good & bad
• QOE
• Uncertainty
– values
– preferences
• Cost
Factors- STRONG vs WEAK
• Balance good & bad
– GI Bleed v C. difficile
– Early antibiotics v inappropriate antibiotics
• QOE
• Uncertainty
– values
– preferences
• Cost
Factors- STRONG vs WEAK
• Balance good & bad
• QOE
– A or B can support STRONG
– C or D should usually be WEAK
• Uncertainty
– values
– preferences
• Cost
Factors- STRONG vs WEAK
• Balance good & bad
• QOE
• Uncertainty
– values
– preferences
– Cancer remission v quality of life
– Delirium v pain control
• Cost
Factors- STRONG vs WEAK
• Balance good & bad
• QOE
• Uncertainty
– values
– preferences
• Cost
– $/QALY
– Allocating limited resources
– Burdens for patients and providers
STRONG to stakeholders
• Patient: most people would want it
• Clinician: most should receive, uniform
behavior
• Policymaker: adopt as policy, use as quality
indicator
WEAK to stakeholders
• Patient: many people would not want it
• Clinician: help patient make a balanced
decision
– decision aid might be needed
• Policymaker: debate
Final Strength of Recommendations
STRONG:
– do it, or don’t do it
– “We recommend”
– GRADE 1
WEAK:
– probably do it, or probably don’t
– “We suggest”
– GRADE 2
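Strength, direction and quality can then be combined into standard wording. The sketch below pairs the strength number (1 or 2) with the quality letter (A to D), as in "GRADE 1B"; that pairing, the function name `phrase_recommendation` and the placeholder action are assumptions made for this example.

```python
# Wording and grade number for each strength, as on the slide above.
WORDING = {"strong": ("We recommend", 1), "weak": ("We suggest", 2)}


def phrase_recommendation(strength, direction, action, quality_letter):
    """Build recommendation text from strength, direction and QOE letter."""
    if direction not in ("for", "against"):
        raise ValueError("direction must be 'for' or 'against'")
    if quality_letter not in ("A", "B", "C", "D"):
        raise ValueError("quality_letter must be one of A, B, C, D")
    verb, grade_number = WORDING[strength]
    negation = " against" if direction == "against" else ""
    return f"{verb}{negation} {action} (GRADE {grade_number}{quality_letter})"


# Placeholder action text; not an actual guideline statement.
print(phrase_recommendation("weak", "for", "using intervention X", "C"))
print(phrase_recommendation("strong", "against", "using intervention X", "B"))
```

Keeping the wording mechanical leaves the real work where it belongs: in the panel's judgment of strength and direction.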
Useful Resources
• BMJ: GRADE series
– GRADE Introduction:
• BMJ 2008;336;924-926
– Overview of Quality of Evidence:
• BMJ 2008;336;995-998
– Translating Evidence to Recommendations:
• BMJ 2008;336;1049-1051
– How to handle disagreements in guidelines
panels: BMJ 2008;337:a744
Useful Resources II
• Journal of Clinical Epidemiology
– GRADE Guidelines Series: 1-9. 2011
– April, 2011 (64(4)): 1-4
• Intro, framing the question and outcomes, rating
quality of evidence, risk of bias
– December, 2011 (64(12)): 5-9
• Publication bias, imprecision, inconsistency,
indirectness, rating up