Transcript Slide 1
GRADING EVIDENCE AND RECOMMENDATIONS: STARTING WITH GRADE BASICS VS. UTILIZING THE FULL FRAMEWORK
AHRQ Annual Meeting 2010: “Better Care, Better Health: Delivering on Quality for All Americans”
September 28, 2010
Yngve Falck-Ytter, M.D., Associate Professor of Medicine, Case Western Reserve University, Cleveland, Ohio
Holger Schünemann, M.D., Ph.D., Chair, Department of Clinical Epidemiology & Biostatistics, Michael Gent Chair in Healthcare Research, McMaster University, Hamilton, Canada

Slide 2
Disclosures
In the past 5 years, Dr. Falck-Ytter received no personal payments for services from industry. His research group received research grants from Three Rivers, Valeant and Roche that were deposited into non-profit research accounts. He is a member of the GRADE working group, which has received funding from various governmental entities in the US and Europe, such as the AHRQ. Some of the GRADE work he has done is supported in part by grant # 1 R13 HS016880-01 from the Agency for Healthcare Research and Quality (AHRQ).

Slide 3
Content
Part 1: A 7-minute version of GRADE
Part 2: Rapid interactive exchange contrasting GRADE basic vs. the full GRADE approach
• Advantages of a structured approach
• Asking good clinical questions
• Systematic review vs. ad hoc approaches
• Grading the quality of evidence
• How to determine the strength of recommendations

Slide 4
Question to the audience
Decisions in your medical practice are based on:
A. Training, experience and knowledge of respected colleagues
B. Patient preferences
C. Convincing evidence (non-experimental) from case reports, case series, disease mechanism
D. RCTs, systematic reviews of RCTs and meta-analyses
E. All of the above

Slide 5
Evidence-based clinical decisions
• Patient values and preferences
• Clinical circumstances
• Expertise
• Research evidence
Haynes et al. 2002

Slide 6
A real world example…
P: In patients with acute hepatitis C …
I: Should anti-viral treatment be used …
C: Compared to no treatment …
O: To achieve viral clearance?
Evidence    Recommendation    Organization
B    Class I    AASLD (2009)
II-1    “Should be initiated…”    VA (2006)
1+    A    SIGN (2006)
-/-    “Most authorities…”    AGA (2006)
-/- B    “It works…”    AWMF (2004)

Slide 7
Question to the audience
By now…
A. …you are thoroughly confused
B. …you send her to a doctor because treatment is recommended
C. …you send her to a doctor but she can expect that, according to guidelines, she will not be treated
D. …you look at the evidence yourself because past experience tells you that guidelines don’t help

Slide 8
GRADE is outcome-centric
Old system: III, V, II, IB
GRADE:
• Outcome #1 Quality: High
• Outcome #2 Quality: Moderate
• Outcome #3 Quality: Low

Slide 9
Systematic review
P I C O
Outcomes rated by importance: critical, critical, important, less important
Summary of findings & estimate of effect for each outcome
Quality for each outcome: High, Moderate, Low, Very low
RCT start high, obs. data start low
Grade down:
1. Risk of bias
2. Inconsistency
3. Indirectness
4. Imprecision
5. Publication bias
Grade up:
1. Large effect
2. Dose response
3. Confounders
Guideline development
Rate overall quality of evidence across outcomes based on lowest quality of critical outcomes
Formulate recommendations:
• For or against (direction)
• Strong or weak (strength)
By considering:
• Quality of evidence
• Balance benefits/harms
• Values and preferences
Revise if necessary by considering:
• Resource use (cost)
“We recommend using…”
“We suggest using…”
“We recommend against using…”
“We suggest against using…”

Slide 10
Question to the audience
Which question follows a well structured clinical PICO format:
A. What is the evidence that food allergens cause eosinophilic esophagitis?
B. Is it known what the evidence is that aspirin can prevent progression of dysplasia to cancer in Barrett’s esophagus?
C. In patients undergoing hip replacement, does warfarin compared to aspirin reduce venous thromboembolism, pulmonary embolism and mortality?
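As an illustration of what "well structured" means here, option C can be split into its four PICO parts. The sketch below is only an assumption about how one might record a question; the class name `PicoQuestion`, the method `missing_parts`, and the second, incomplete example are hypothetical and not part of GRADE or of the presentation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PicoQuestion:
    """A clinical question split into its four PICO parts (hypothetical helper)."""
    population: str = ""
    intervention: str = ""
    comparator: str = ""
    outcomes: List[str] = field(default_factory=list)

    def missing_parts(self) -> List[str]:
        """Return the names of the PICO parts that were left empty."""
        parts = {
            "population": self.population,
            "intervention": self.intervention,
            "comparator": self.comparator,
            "outcomes": self.outcomes,
        }
        return [name for name, value in parts.items() if not value]


# Option C from the slide, split into its PICO parts.
option_c = PicoQuestion(
    population="patients undergoing hip replacement",
    intervention="warfarin",
    comparator="aspirin",
    outcomes=["venous thromboembolism", "pulmonary embolism", "mortality"],
)

# Hypothetical informal question that names no comparator and no outcomes.
informal = PicoQuestion(population="adults with condition X", intervention="drug Y")

print(option_c.missing_parts())
print(informal.missing_parts())
```

A question with no missing parts is not automatically a good question, but an empty comparator or outcome list is an early sign that the scope is still informal.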
Slide 11
That’s an excellent question
Translating informal clinical questions into specific PICO questions = central to GRADE
Even if an organization has limited resources, taking care of this step actually saves resources:
• Helps limit your scope
• Specifies the search strategy more clearly
• Guides data extraction
• Helps with formulating recommendations

Slide 12
Taking it to the next level
Informal question: Whether to use thromboprophylaxis for VTE prophylaxis (drugs)
PICO question:
Population: Patients undergoing THR
Intervention(s): Any drug (ASA, LDUH, LMWH, fondaparinux, direct thrombin inhibitors)
Comparator(s): No anticoagulation
Outcome(s): Asymptomatic DVT (surrogate for symptomatic VTE); symptomatic DVT; non-fatal PE; fatal PE; bleeding (operative site vs. non-operative site); readmission; reoperation; total mortality
Method: RCT, obs. studies

Slide 13
Importance of outcomes
Deciding on the importance of outcomes on decision making:
1 2 3: Less important
4 5 6: Important
7 8 9: Critically important
P: In patients after hip replacement…
I: Should warfarin rather than…
C: Aspirin be given…
O: To reduce symptomatic venous thromboembolism and mortality?

Slide 14
Question to the audience
Deciding on the importance of outcomes on decision making:
1 2 3: Less important; 4 5 6: Important; 7 8 9: Critically important
Please rate outcome: Dying from pulmonary embolism
A. (1, 2, 3): Less important for decision making
B. (4, 5, 6): Important for decision making
C. (7, 8, 9): Critically important for decision making

Slide 15
Question to the audience
Deciding on the importance of outcomes on decision making:
1 2 3: Less important; 4 5 6: Important; 7 8 9: Critically important
Asymptomatic deep vein thrombosis in the calf (e.g., as seen on mandatory venography at end of study)
A. (1, 2, 3): Less important for decision making
B. (4, 5, 6): Important for decision making
C. (7, 8, 9): Critically important for decision making

Slide 16
Question to the audience
Deciding on the importance of outcomes on decision making:
1 2 3: Less important; 4 5 6: Important; 7 8 9: Critically important
Stomach ulcer bleeding requiring endoscopy
A. (1, 2, 3): Less important for decision making
B. (4, 5, 6): Important for decision making
C. (7, 8, 9): Critically important for decision making

Slide 17
Question to the audience
Deciding on the importance of outcomes on decision making:
1 2 3: Less important; 4 5 6: Important; 7 8 9: Critically important
Regular blood work and dose adjustments
A. (1, 2, 3): Less important for decision making
B. (4, 5, 6): Important for decision making
C. (7, 8, 9): Critically important for decision making

Slide 18
Rating the importance of outcomes
• Train content experts so that the outcomes critical for decision making are identified
• Rating is done before, during and after the evidence review
• The rating may change in light of new information

Slide 19
GRADE process diagram (repeated from slide 9)

Slide 20
Taking it to the next level
• Early involvement of consumers in the guideline development process
• Selecting systematic reviews that are known to make an effort to include consumer views (e.g., Cochrane etc.)
• Can be used to identify research gaps

Slide 21
Evidence review stage
What format of evidence do you use? (flowchart, cost scale from $$$ to $)
Using mainly systematic reviews (SR)
Have the resources
Do it in-house
Outsource
Mainly using single study data
Don’t have the resources
Ready to use SR
Search for SR
Update SR
Use GRADE without evidence profiles
Ad hoc reviews
Utilize the full GRADE framework (± evidence profiles)
Not ready to use SR

Slide 22
Question to the audience
Select the best answer: You can find high quality systematic reviews for “free” here:
A. AHRQ
B. The Cochrane Library
C. Canadian Agency for Drugs and Technologies in Health (CADTH)
D. National Institute for Clinical Excellence (NICE), UK
E. All of the above

Slide 23
Taking it to the next level
What to look for when selecting evidence review centers
Commissioning systematic reviews:
• Making sure the center understands GRADE requirements
• What SR methodology they use
• What databases they can search
• What software they use
• How they document their work

Slide 24
Question to the audience
GRADE rating evidence: The quality of evidence may need downgrading if:
A. The outcome is reduction of elevated pressure in the eye (IOP) instead of loss of vision
B. There are large losses to follow-up
C. Some trials show benefits, others report harms
D. The confidence interval is wide and there are few events
E. All of the above

Slide 25
Quality of evidence: beyond risk of bias
Definition: The extent to which our confidence in an estimate of the treatment effect is adequate to support a particular recommendation
Methodological limitations
Risk of bias:
• Allocation concealment
• Blinding
• Intention-to-treat
• Follow-up
• Stopped early
Inconsistency of results
Indirectness of evidence
Imprecision of results
Publication bias
Sources of indirectness:
• Indirect comparisons
• Patients
• Interventions
• Comparators
• Outcomes

Slide 26
Quality assessment criteria
Study design and starting quality of evidence: Randomized trials: High; Observational studies: Low
Quality of evidence levels: High, Moderate, Low, Very low
Lower if…
• Study limitations (design and execution)
• Inconsistency
• Indirectness
• Imprecision
• Publication bias
Higher if…
• What can raise the quality of evidence?

Slide 27
Question to the audience
A systematic review of observational studies showed a relationship between front sleeping position (versus back position) and sudden infant death syndrome (SIDS): OR 2.93 (1.15, 7.47). Rate the quality of evidence for the outcome SIDS:
A. High
B. Moderate
C. Low
D. Very low

Slide 28
Question to the audience
You review all colonoscopies for average risk screening in your health system and document the percentage of patients who developed a perforation after the procedure (evidence of free air on imaging). No comparison group without colonoscopy is available. Rate the quality of evidence for the outcome perforation:
A. High
B. Moderate
C. Low
D. Very low

Slide 29
Question to the audience
Several RCTs have shown the effectiveness of natalizumab to induce remission in Crohn’s disease. Study/post-marketing data showed 31 cases of potentially lethal progressive multifocal leukoencephalopathy (PML, JC virus related). Rate the quality of evidence for PML:
A. High
B. Moderate
C. Low
D. Very low

Slide 30
Quality assessment criteria
Study design and starting quality of evidence: Randomized trials: High; Observational studies: Low
Quality of evidence levels: High, Moderate, Low, Very low
Lower if…
• Study limitations (design and execution)
• Inconsistency
• Indirectness
• Imprecision
• Publication bias
Higher if…
• Large effect (e.g., RR 0.5)
• Very large effect (e.g., RR 0.2)
• Evidence of dose-response gradient
• All plausible confounding would reduce a demonstrated effect

Slide 31
“Categories” of quality (1)
High: Further research is very unlikely to change our confidence in the estimate of effect
Moderate: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate
Low: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate
Very low: Any estimate of effect is very uncertain

Slide 32
Conceptualizing quality (2)
High: We are very confident that the true effect lies close to that of the estimate of the effect.
Moderate: We are moderately confident in the estimate of effect: The true effect is likely to be close to the estimate of effect, but there is a possibility that it is substantially different.
Low: Our confidence in the effect estimate is limited: The true effect may be substantially different from the estimate of the effect.
Very low: We have very little confidence in the effect estimate: The true effect is likely to be substantially different from the estimate of effect.
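The rating logic described in the slides above (randomized trials start high, observational data start low, five factors can lower the rating, three can raise it, and the overall rating follows the lowest-rated critical outcome) can be sketched in a few lines. This is a deliberately simplified illustration: GRADE ratings are structured judgments, not arithmetic, and the names `Quality`, `rate_outcome` and `overall_quality`, as well as the example inputs, are assumptions made for this sketch.

```python
from enum import IntEnum


class Quality(IntEnum):
    VERY_LOW = 1
    LOW = 2
    MODERATE = 3
    HIGH = 4


# Factor names follow the slides; the dictionary-based interface is an assumption.
DOWNGRADE_FACTORS = {"risk_of_bias", "inconsistency", "indirectness",
                     "imprecision", "publication_bias"}
UPGRADE_FACTORS = {"large_effect", "dose_response", "confounders"}


def rate_outcome(design, downgrades=None, upgrades=None):
    """Rate one outcome: start by study design, then move levels down and up."""
    downgrades = downgrades or {}
    upgrades = upgrades or {}
    if design not in ("rct", "observational"):
        raise ValueError("design must be 'rct' or 'observational'")
    unknown = (set(downgrades) - DOWNGRADE_FACTORS) | (set(upgrades) - UPGRADE_FACTORS)
    if unknown:
        raise ValueError(f"unknown factor(s): {sorted(unknown)}")
    start = Quality.HIGH if design == "rct" else Quality.LOW
    level = start - sum(downgrades.values()) + sum(upgrades.values())
    # Clamp to the four available levels.
    return Quality(max(Quality.VERY_LOW, min(Quality.HIGH, level)))


def overall_quality(outcomes):
    """Overall quality is the lowest quality among critical outcomes (rated 7 to 9)."""
    critical = [quality for _, importance, quality in outcomes if importance >= 7]
    if not critical:
        raise ValueError("no critical outcome (importance 7-9) supplied")
    return min(critical)


# Hypothetical inputs, for illustration only.
rct_imprecise = rate_outcome("rct", downgrades={"imprecision": 1})
obs_large_effect = rate_outcome("observational", upgrades={"large_effect": 1})
overall = overall_quality([
    ("outcome 1", 9, Quality.HIGH),
    ("outcome 2", 8, Quality.LOW),
    ("outcome 3", 5, Quality.VERY_LOW),  # important but not critical, so ignored
])
print(rct_imprecise.name, obs_large_effect.name, overall.name)
```

Each value in `downgrades` or `upgrades` is the number of levels moved (1 for serious, 2 for very serious concerns, in this sketch); deciding that number is exactly the judgment the slides discuss.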
Slide 33
Taking it to the next level
Advantages of systematically assessing quality of evidence
Downgrading and upgrading “on-the-fly” can introduce errors
Study / year: REMOBILIZE 2009
Treatment: dabigatran 220 mg QD; dabigatran 150 mg QD; enoxaparin 30 mg BID
Allocation concealment: Yes (IVRS) (blocks of 6)
Blinding: Patients: Y; Caregivers: Y; Data coll: PY; Adjudic: Y; Data analysts: ?
No outcome (%): 269/862 (31.2%); 232/877 (26.5%); 239/876 (27.3%)
Analysis: ITT: no
Comments: Low dose ASA and stocking allowed, but not pneumatic devices

Slide 34
GRADE evidence profile

Slide 35
Question to the audience
PICO: Should children with otitis media be treated with antibiotics?
Rate the overall quality of evidence for this clinical question by evaluating all critical outcomes (use the evidence profile):
A. High
B. Moderate
C. Low
D. Very low

Slide 36
P I C O
Outcomes rated by importance (critical, important, less important)
Grade down or up for each outcome
Overall quality of evidence
Formulate recommendations:
• For or against (direction)
• Strong or weak (strength)
By considering:
• Quality of evidence
• Balance benefits/harms
• Values and preferences
Revise if necessary by considering:
• Resource use (cost)

Slide 37
Question to the audience
PICO: Should children with otitis media be treated with antibiotics?
Rate the overall strength of recommendation:
A. “We recommend early antibiotics in children with acute otitis media”
B. “We suggest early antibiotics…”
C. “We suggest against using antibiotics initially…”
D. “We recommend against using antibiotics initially…”

Slide 38
Strength of recommendation
“The strength of a recommendation reflects the extent to which we can, across the range of patients for whom the recommendations are intended, be confident that desirable effects of a management strategy outweigh undesirable effects.”

Slide 39
4 determinants of the strength of recommendation
Factors that can weaken the strength of a recommendation, with explanation:
• Lower quality evidence: The higher the quality of evidence, the more likely a strong recommendation is.
• Uncertainty about the balance of benefits versus harms and burdens: The larger the difference between the desirable and undesirable consequences, the more likely a strong recommendation is warranted. The smaller the net benefit and the lower the certainty for that benefit, the more likely a weak recommendation is warranted.
• Uncertainty or differences in patients’ values: The greater the variability in values and preferences, or the uncertainty in values and preferences, the more likely a weak recommendation is warranted.
• Uncertainty about whether the net benefits are worth the costs: The higher the costs of an intervention (that is, the more resources consumed), the less likely a strong recommendation is warranted.
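The answer options and the standard phrasings in the slides pair "recommend" with strong and "suggest" with weak recommendations, each either for or against. A minimal sketch of that wording convention follows; the function name `recommendation_phrase` and its string arguments are assumptions for illustration, and the sketch covers only the wording, not the judgment about strength.

```python
def recommendation_phrase(direction: str, strength: str) -> str:
    """Map direction and strength to the standard GRADE wording from the slides."""
    if direction not in ("for", "against"):
        raise ValueError("direction must be 'for' or 'against'")
    if strength not in ("strong", "weak"):
        raise ValueError("strength must be 'strong' or 'weak'")
    verb = "recommend" if strength == "strong" else "suggest"
    prefix = "against " if direction == "against" else ""
    return f"We {verb} {prefix}using…"


for direction in ("for", "against"):
    for strength in ("strong", "weak"):
        print(recommendation_phrase(direction, strength))
```

Keeping the wording fixed in this way means a reader can infer both direction and strength from the sentence alone.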
Slide 40
Implications of a strong recommendation
Patients: Most people in this situation would want the recommended course of action and only a small proportion would not
Clinicians: Most patients should receive the recommended course of action
Policy makers: The recommendation can be adopted as a policy in most situations

Slide 41
Implications of a weak recommendation
Patients: The majority of people in this situation would want the recommended course of action, but many would not
Clinicians: Be prepared to help patients to make a decision that is consistent with their own values (decision aids and shared decision making)
Policy makers: There is a need for substantial debate and involvement of stakeholders

Slide 42
Taking it to the next level
• Explicit separation of quality of evidence from making recommendations
• Correctly balancing the benefits against the undesirable effects
• Special challenges: resource use
• Increasing transparency in the process of making recommendations

Slide 43
Question to the audience
Should patients with chronic hepatitis C be treated with interferon/ribavirin combination? There is high quality evidence for benefits and high quality evidence for harms.
Rate the overall strength of recommendation:
A. “We recommend treatment of chronic hepatitis C”
B. “We suggest treatment…”
C. “We suggest against treating patients…”
D. “We recommend against treating patients…”

Slide 44
Patient values & preferences
• In the absence of evidence, guideline panels have to function as surrogates to estimate values and preferences (V&P)
• Consumer involvement can help
• Attaching V&P statements to guideline recommendations increases transparency

Slide 45
Taking it to the next level
• Systematically searching the literature for studies of values and preferences
• Systematic reviews of V&P
• Querying the guideline panel to rate health utilities of outcomes using case scenarios

Slide 46
Question to the audience
Please select the most appropriate answer. The reason you attended this session:
A. Just interested in the topic
B. Have been involved in narrative evidence reviews, but have not used any formal grading system
C. Have used a grading system but not GRADE
D. Using or considered using GRADE

Slide 47
Question to the audience
Please select the most appropriate answer. Selecting a system to rate the quality of evidence and strength of recommendations, such as GRADE:
A. Appears too expensive to implement
B. Appears valuable, but still requires substantial upfront expense
C. Appears to have some upfront cost but long-term savings
D. I use GRADE – it has been paying off for me

Slide 48
Basic dimensions
Guideline work aligns along 3 basic dimensions:
• High quality vs. low quality
• Fast vs. slow
• Expensive vs. cheap

Slide 49
Ideal vs. practical ad hoc GRADE approaches
Stage: Ideal
Elements: Systematic review; GRADE eTables; Qual. of evidence; Strength of rec.
Advantage: Follows highest standards; Methodolog. most rigorous; Easily maintainable; Fully transparent process
Comment: Access to methodologist; Access to evidence centers; Initially more resource intensive, long-term savings
Stage: Intermediary
Elements: Ad hoc review; GRADE eTables; Qual. of evidence; Strength of rec.
Advantage: Still retaining major advantages of the “ideal approach”
Comment: Risk of bias higher; Access methodologist rec.; Only minimal addl. cost
Stage: Initiation
Elements: Ad hoc review; GRADE eTables; Qual. of evidence; Strength of rec.
Advantage: Option to fully “upgrade” to an “ideal approach”; Foundation of a methodologically sound system
Comment: Risk of bias higher; Access methodologist prn; No additional cost

Slide 50
Sources of funding
Funders may have an agenda
• Industry – tricky
• Foundations
• Public – AHRQ, criteria:
  • EHC program fit (3: available, relevance for public payer, priority condition)
  • Importance (7: e.g., public interest etc.)
  • No duplication
  • Feasibility
  • Impact (6: e.g., addresses inequity)

Slide 51
Taking it to the next level
Long term planning
• Create a high quality guideline product
• Attract high quality guideline panel
  • Unconflicted methodologist (editor)
  • Content expert (deputy editor)
  • Content expert authors
  • Health economists

Slide 52
Taking it to the next level
GRADE evidence profiles
• Condensed and standardized summary of evidence
• Are increasingly already created as part of a systematic review (e.g., Cochrane reviews)
• Flexible presentation (e.g., as summary of findings tables)
• Initial investment
• Long-term value
• GRADEpro software (tie-in with RevMan)
• Avoids duplication of efforts across the globe

Slide 53
Vision
1. Globalize the evidence, localize recommendations
2. Focus on questions that are important to patients and clinicians
3. Undertake collaborative evidence reviews
4. Use a common metric to assess the quality of evidence and strength of recommendations
5. Examine collaborative models for funding
Schünemann 2009

Slide 54
GRADE uptake

Slide 55
Conclusion
Gaining acceptance as international standard because GRADE adds value:
1. Criteria for evidence assessment across a range of questions and outcomes
2. Sensible, systematic, fostering transparency
3. Balance between simplicity and methodological rigor