Data Mining in VAERS to
Enhance Vaccine Safety
Monitoring at the FDA
Robert Ball, MD, MPH, ScM
Dale Burwen MD, MPH
M. Miles Braun, MD, MPH
Division of Epidemiology
Office of Biostatistics and Epidemiology
DIMACS, October 18, 2002
What is the Vaccine Adverse Event Reporting
System (VAERS)?
– National system for surveillance of adverse
events after vaccination initiated by
National Childhood Vaccine Injury Act
1986 and established 1990
– Jointly managed by FDA and CDC
– Reports received from health professionals,
vaccine manufacturers, and the public
Post-licensure Safety Monitoring
• How we do it
– VAERS
• Potentially rapid detection of signal of new safety
concern
• Rarely allows determination of causality
– Enhanced surveillance
• Obtain standardized information on reports
– Controlled studies of hypothesized causal
relationships raised in surveillance
– Communicate results
Uses of VAERS
• Detecting unrecognized adverse events
• Monitoring known reactions
• Identifying possible risk factors
• Vaccine lot surveillance
Limitations of VAERS
• Reported diagnoses are not verified
• Lack of consistent diagnostic criteria
• Wide range in data quality
• Underreporting
• Inadequate denominator data
• No unvaccinated control group
• Usually not possible to assess whether a
vaccine caused the reported adverse
event
Analysis of VAERS Data
• Describe characteristics and look for
patterns to detect “signals” of adverse
events plausibly linked to a vaccine
• Signals detected through analysis of
VAERS data almost always require
confirmation through a controlled study
Fundamental Problem in
Assessing Spontaneous Reports
• VAERS ~10-15K reports / year
• AERS ~20K reports / year (CBER)
• How can a sensitive system to detect
potential product problems not be
overloaded and overwhelmed by
information to which we have to respond?
“Data Mining”
• Identify events reported more commonly for one
product than others
– Proportional Reporting Ratios (PRR)
– Empirical Bayesian Geometric Mean (EBGM)
– Don’t account for medical knowledge or biases in
reporting
• EBGM algorithm implemented by Lincoln
Technologies and PPD Informatics
– VAERS Data Mining Environment (VDME)
• PRR algorithm implemented in standard packages
(e.g., SAS, Stata) on an ad hoc basis
Proportional Reporting Ratio
• Compares the adverse event profile of one vaccine to
other vaccines

Number of reports with:
                   Adverse Event Y   Other Adverse Events   Total
Vaccine X                 a                   b             (a+b)
Other vaccines            c                   d             (c+d)

PRR = [a/(a+b)] / [c/(c+d)]
• Evans has proposed using PRR ≥ 2, n ≥ 3, and
chi-square ≥ 4 as criteria for selecting pairs for
further evaluation
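The PRR and the screening criteria attributed to Evans can be sketched in a few lines of Python. This is a minimal illustration, not the SAS or Stata code used at FDA: the `Table2x2` class, the function names, and the example counts are hypothetical, and the chi-square shown is the plain Pearson statistic without continuity correction, which is an assumption (the slides do not say which variant is used).

```python
from dataclasses import dataclass


@dataclass
class Table2x2:
    a: int  # vaccine X, adverse event Y
    b: int  # vaccine X, other adverse events
    c: int  # other vaccines, adverse event Y
    d: int  # other vaccines, other adverse events


def prr(t: Table2x2) -> float:
    """PRR = [a/(a+b)] / [c/(c+d)]; undefined when c is zero."""
    return (t.a / (t.a + t.b)) / (t.c / (t.c + t.d))


def chi_square(t: Table2x2) -> float:
    """Pearson chi-square for a 2x2 table, no continuity correction (assumed)."""
    n = t.a + t.b + t.c + t.d
    numerator = n * (t.a * t.d - t.b * t.c) ** 2
    denominator = (t.a + t.b) * (t.c + t.d) * (t.a + t.c) * (t.b + t.d)
    return numerator / denominator


def meets_screening_criteria(t: Table2x2) -> bool:
    """PRR of at least 2, at least 3 reports, and chi-square of at least 4."""
    return t.a >= 3 and prr(t) >= 2 and chi_square(t) >= 4


# Hypothetical counts, for illustration only.
example = Table2x2(a=10, b=90, c=20, d=880)
print(prr(example), chi_square(example), meets_screening_criteria(example))
```

With these made-up counts, 10% of the reports for vaccine X mention event Y versus about 2.2% for other vaccines, so the pair would be selected for further evaluation. A pair with only two reports would not be, whatever its PRR.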
Background: Empirical Bayesian
Data Mining
• Similar to PRR in comparing one vaccine to
others
• Calculates observed and expected frequencies
– Observed: # of reported events/vaccine
– Expected: Based on overall frequency of the event
for all vaccines, and the total # of reports of the
vaccine of interest
• Identifies cells with very small expected counts and
accounts for the instability of small numbers
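The observed and expected counts described above can be illustrated with a short Python sketch. It covers only the first step (observed count, expected count under independence, and their ratio); the Bayesian shrinkage that produces the EBGM is not shown. The report data and the function name `observed_expected` are hypothetical, and each report is simplified to a single vaccine-event pair.

```python
from collections import Counter

# Hypothetical reports, one (vaccine, event) pair each, for illustration only.
reports = [
    ("VaxA", "fever"), ("VaxA", "fever"), ("VaxA", "rash"),
    ("VaxB", "fever"), ("VaxB", "rash"), ("VaxB", "rash"),
    ("VaxB", "rash"), ("VaxC", "fever"),
]


def observed_expected(pairs):
    """Return {(vaccine, event): (observed, expected)}.

    Expected = (overall frequency of the event across all vaccines)
               x (total number of reports for the vaccine of interest).
    """
    total = len(pairs)
    by_pair = Counter(pairs)
    by_vaccine = Counter(v for v, _ in pairs)
    by_event = Counter(e for _, e in pairs)
    return {
        (v, e): (n, by_vaccine[v] * by_event[e] / total)
        for (v, e), n in by_pair.items()
    }


table = observed_expected(reports)
for (vaccine, event), (obs, exp) in sorted(table.items()):
    print(f"{vaccine:5s} {event:6s} O={obs} E={exp:.2f} O/E={obs / exp:.2f}")
```

In this toy data the single VaxC fever report has an expected count of 0.5, so one report alone yields an observed-to-expected ratio of 2. That is the kind of small-number instability the empirical Bayesian approach is meant to dampen.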
Empirical Bayesian Data Mining
• Ranks vaccine-event combinations by Empirical
Bayesian Geometric Mean (EBGM)
• DuMouchel has proposed EBGM ≥ 2 as criterion to
select pairs for further evaluation
• Multi-item Gamma Poisson Shrinkage (MGPS)
algorithm detects multi-way combinations
– V=vaccine; S=symptom
• VS
• VSS
• VSSS
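To make the V-S, V-S-S, and V-S-S-S notation concrete, the sketch below enumerates the combinations that a single report could contribute. It only shows how the candidate item sets are formed, not the MGPS algorithm itself; the function name `item_sets` and the example report are hypothetical.

```python
from itertools import combinations


def item_sets(vaccines, symptoms, max_symptoms=3):
    """Yield (vaccine, symptom, ...) tuples with 1 to max_symptoms symptoms."""
    for vaccine in sorted(vaccines):
        for k in range(1, max_symptoms + 1):
            for combo in combinations(sorted(symptoms), k):
                yield (vaccine, *combo)


# Hypothetical report listing one vaccine and three symptoms.
report = {"vaccines": {"VaxA"}, "symptoms": {"fever", "rash", "vomiting"}}
sets = list(item_sets(report["vaccines"], report["symptoms"]))
for s in sets:
    print(s)
```

One vaccine with three symptoms already gives three V-S pairs, three V-S-S triples, and one V-S-S-S set. The number of combinations grows quickly with the number of coded terms per report.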
Rotavirus Vaccine-Intussusception
• Clinical Trials Signal
• Wild type RV & intussusception study
• FDA - licensure
• CDC - recommendations for use
• Post-marketing Surveillance (VAERS)
• Background rates
• Population-based incidence rates
• Withdrawal
Rotavirus Vaccine and Intussusception:
Signal Emergence
Vaccine Profiles
Anthrax Vaccine: 3-Dimensional
Assessment (V-S-S)
Effect of Stratification on EBGM:
Anthrax Vaccine and Selected COSTARTS
[Figure: EBGM (y-axis, 0 to 18) under crude, age-stratified, and
age-sex-stratified analyses for the terms Amnesia, Derm contact,
Hypogonad male, Nodule Subcut, Prev react, Skin dry, and Tinnitus]
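Stratification changes the result because it changes the expected count. One common way to stratify, assumed here rather than taken from the slides, is to compute the expected count under independence within each stratum and sum across strata. The reports, stratum labels, and function names below are hypothetical.

```python
from collections import defaultdict

# Hypothetical reports as (stratum, vaccine, event), for illustration only.
# VaxA is given only to adults, and tinnitus is reported only by adults.
reports = [
    ("adult", "VaxA", "tinnitus"), ("adult", "VaxA", "fever"),
    ("adult", "VaxB", "tinnitus"), ("adult", "VaxB", "fever"),
    ("child", "VaxB", "fever"), ("child", "VaxB", "fever"),
    ("child", "VaxB", "rash"), ("child", "VaxC", "fever"),
]


def expected_count(rows, vaccine, event):
    """Expected count under independence of vaccine and event within rows."""
    n_vaccine = sum(1 for _, v, _ in rows if v == vaccine)
    n_event = sum(1 for _, _, e in rows if e == event)
    return n_vaccine * n_event / len(rows)


def stratified_expected(rows, vaccine, event):
    """Sum of the within-stratum expected counts (assumed approach)."""
    strata = defaultdict(list)
    for row in rows:
        strata[row[0]].append(row)
    return sum(expected_count(s, vaccine, event) for s in strata.values())


observed = sum(1 for _, v, e in reports if (v, e) == ("VaxA", "tinnitus"))
crude = expected_count(reports, "VaxA", "tinnitus")
stratified = stratified_expected(reports, "VaxA", "tinnitus")
print(f"O={observed} crude E={crude} stratified E={stratified}")
```

In this toy data the crude observed-to-expected ratio is 2, but it falls to 1 once age is taken into account, because both the vaccine and the event are concentrated in adults. Scores for a vaccine used in a narrow population can shift in this way when the analysis is stratified.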
Selection of “Item Sets” for
Empirical Bayesian Data Mining
• The choice of “Item Sets” influences the Multi-item
Gamma Poisson Shrinkage (MGPS) algorithm
• Currently all combinations (e.g. 2D v-v, s-s, v-s where
v=vaccine; s=symptom)
• If input is restricted to only v-s combinations the
magnitude of the EBGM and rank for pairs with small
numbers are affected
• Appropriate selection of Item Sets needs systematic
evaluation
Effect of Item Set Selection on
EBGM
Challenges
• What is the best method?
– Bayesian vs. PRR vs. other?
– What are criteria for making this decision?
• How should each method be applied and
interpreted?
– What level of PRR/EBGM?
– How should statistics be interpreted?
Challenges
• Should data mining methods be used for automated
screening or as analytic tools?
– Importance of stratification suggests need for intermediate
level epi/stat sophistication in users
– Users need training to properly interpret results
• Computing resources
– Substantial effort required for data preparation
– Software needs user-friendly features to enhance end-user
control over:
• Defining data subsets of interest
• Stratification
• Combining adverse event terms
• Selecting item sets prior to data mining
Challenges
• Usual method of monitoring for signals:
– Physician review of individual reports as they arrive
– Physician review of serious reports
– Committee review of serious reports at weekly meeting
– Physician review of monthly numerical summaries of selected
vaccines
• Periodic vaccine or disease-specific surveillance summaries
• Where does data mining best fit in this process?
• How can data mining results be best communicated to
decision makers, health care providers, and the
public?
Next Steps and Future Challenges
• Continue using PRR and Empirical Bayesian
methods in routine practice
• Systematic comparison of methods
• Simulation study in collaboration with CDC
• Large size of AERS database, especially with 2-way
and 3-way interactions
– Is simpler better? e.g. PRR with chi-square
• Drug dictionary in AERS
Summary
• Automated summary of a large amount of data
• Potential for improving usual methods of
monitoring for signals
– Other methods should also be considered
• Further understanding and experience is
needed
Acknowledgments
• FDA
– Manette Niu, Phil Perucci, other CBER staff, Ana
Szarfman and other CDER staff
• CDC
– Henry Rolka, Vitali Poole, Penina Haber, John
Iskander, and other CDC staff
• Others
– Lincoln Technologies, Inc.
– PPD Informatics
– William DuMouchel
Selected References
• DuMouchel W. Bayesian data mining in large frequency tables, with an
application to the FDA spontaneous reporting system. American
Statistician 1999;53:177-190.
• Evans SJW, Waller PC, Davis S. Use of proportional reporting ratios for
signal generation from spontaneous adverse drug reaction reports.
Pharmacoepidemiol Drug Saf 2001;10:483-486.
• Niu MT, Erwin DE, Braun MM. Data mining in the US Vaccine
Adverse Event Reporting System (VAERS): early detection of
intussusception and other events after rotavirus vaccination. Vaccine
2001;19:4627-4634.