Merck Project 10 Predictive Modeling of Drug

Characterizing an Optimal Predictive
Modeling Framework for Prediction of
Adverse Drug Events
Jon Duke, MD MS, Xiaochun Li PhD,
Zuoyi Zhang PhD
EDM Forum
June 7th 2014
Project Goal/Background
• The ability to identify patients at increased risk of an
adverse drug event at the time of prescribing may
improve patient safety
• The optimal methods for performing such predictive
modeling using routinely collected clinical data are
unknown
• The goal of this study is to assess the feasibility and
optimal methods for predicting risk of ADEs of differing
prevalence levels using observational data
Drug Safety Alert Override Rate
49% - 96%
Methods
• We selected three drug-outcome pairs of
varying frequency
• ACE-I and Hyperkalemia (Frequent)
• SSRIs/SNRIs and Hyponatremia (Infrequent)
• Statins and Rhabdomyolysis (Rare)
• We defined each outcome phenotype based
on diagnoses (ICD-9) and labs (LOINC)
Methods
• Applied a new user design for cohort selection
– Drug exposure initiation must be ≥ 1 year after
entering the observation period
• Source dataset: 2.2 million patients in OMOP
CDM format
– Drugs, diagnoses, labs, procedures, demographics
• Applied three modeling methods to each drug-outcome pair
– Multiple logistic regression (LR)
– Classification and regression trees (CART)
– Random forest
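The new user criterion above can be sketched as a simple date filter. This is a minimal illustration, not the study's code: the record layout, field names, and the 365-day default are assumptions made for the example.

```python
from datetime import date, timedelta

def is_new_user(observation_start: date, first_exposure: date,
                washout_days: int = 365) -> bool:
    """Return True if the first exposure occurs at least `washout_days`
    after the patient entered the observation period."""
    return (first_exposure - observation_start) >= timedelta(days=washout_days)

# Hypothetical patient records (illustrative only).
patients = [
    {"id": 1, "observation_start": date(2005, 1, 1), "first_exposure": date(2006, 3, 1)},
    {"id": 2, "observation_start": date(2005, 1, 1), "first_exposure": date(2005, 6, 1)},
]

# Keep only patients who satisfy the new user criterion.
cohort = [p["id"] for p in patients
          if is_new_user(p["observation_start"], p["first_exposure"])]
```

Here patient 2 is excluded because exposure starts less than a year after observation begins.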
Performance Evaluation and
Comparators
• Used a 2:1 random training-test split
• Modeled for event at 30, 90, and 365 days
• In addition to measuring performance using the
target drug-outcome pairs, also applied the
derived models to predicting risk of the outcome
in similar cohorts without exposure
– Comparator Cohorts
• Hyperkalemia – Amlodipine
• Hyponatremia – Bupropion
• Rhabdomyolysis – Niacin
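A rough sketch of the evaluation setup described above: a 2:1 random split and one binary label per time horizon. The function names, the fixed seed, and the `days_to_event` input are hypothetical and only illustrate the idea.

```python
import random

def split_two_to_one(ids, seed=0):
    """Shuffle patient ids and return (training, test) lists in a 2:1 ratio."""
    ids = list(ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    cut = (2 * len(ids)) // 3
    return ids[:cut], ids[cut:]

def event_within(days_to_event, horizon):
    """Label is True if the outcome occurred within `horizon` days of drug start.
    `days_to_event` is None when no outcome was observed."""
    return days_to_event is not None and days_to_event <= horizon

train, test = split_two_to_one(range(9))

# A patient whose outcome occurred 45 days after starting the drug
# is a non-event at 30 days but an event at 90 and 365 days.
labels = {h: event_within(45, h) for h in (30, 90, 365)}
```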
Results (no surprise)
• Random Forest performs best
– Mean AUC across all outcomes: RF (80%) > LR (76%) > CART (71%)
• More common outcomes perform better
AUC 0.86
AUC 0.7
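For readers less familiar with AUC, the sketch below computes it as the probability that a randomly chosen patient with the event receives a higher risk score than one without (ties count half). The scores and labels are made up for illustration and are not study data.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Illustrative risk scores: 3 of the 4 positive-negative pairs are ranked correctly.
example_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

The pairwise form is quadratic in cohort size, so it is only meant to show the definition, not to scale.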
Results (surprise)
• Models performed equally well on the
comparator groups in terms of predicting the
likelihood of a given outcome
– Hyperkalemia with Amlodipine AUC 84%
– Hyponatremia with Bupropion AUC 82%
– Rhabdomyolysis with Niacin AUC 76%
Results
• Despite reasonable AUCs, PPVs remain
mediocre for best performing models
– Hyperkalemia with ACE 35%
– Hyponatremia with SSRI 28%
– Rhabdomyolysis with Statin 2%
• Skewing towards better specificity (setting the
threshold to flag only the highest-risk 10%)
– Slight PPV improvements to 38%, 37%, and 4%, respectively
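Why PPV stays low for rare outcomes even with a reasonable AUC follows from Bayes' rule. The sketch below uses illustrative sensitivity, specificity, and prevalence values; they are assumptions for the example, not figures from this study.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value: the share of flagged patients who truly have the event."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Same hypothetical classifier, two hypothetical event rates.
ppv_common = ppv(sensitivity=0.8, specificity=0.9, prevalence=0.10)   # about 0.47
ppv_rare = ppv(sensitivity=0.8, specificity=0.9, prevalence=0.001)    # under 0.01
```

With the classifier held fixed, the drop in prevalence alone drives PPV from roughly one in two to fewer than one in a hundred, which is the pattern seen in moving from the frequent to the rare outcome.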
Discussion
• Putting our findings in the context of developing
real-world CDS systems based on individual risk
– Random forest performed best, but logistic regression was close
and may be more appealing for displaying risk factors
– Developing individual models for each drug-outcome
pair may prove unnecessary. One model for each
outcome may be sufficient
– Even with risk-based alerting, poor specificity / alert
fatigue will remain a problem for rare adverse events
Conclusions
• Personalized risk calculation may be beneficial
in CDS, but effectiveness will be highly
dependent on implementation strategy (e.g.,
highlighting vs. suppressing alerts)
• Study limitations include the application setting,
possible unrecognized misclassification, and possibly
incomplete lists of covariates
• Further research is required on using
common models for individual outcomes
Thank You
[email protected]