The Linked Clinical Data Project


Transcript: The Linked Clinical Data Project

SHARPn High-Throughput Phenotyping (HTP)
November 18, 2013
Electronic health record (EHR)-driven phenotyping
• EHRs are becoming increasingly prevalent within the U.S. healthcare system
• Meaningful Use is one of the major drivers
• Overarching goal: to develop high-throughput, semi-automated techniques and algorithms that operate on normalized EHR data to identify cohorts of potentially eligible subjects on the basis of disease, symptoms, or related findings, both retrospectively and prospectively
EHR-driven Phenotyping Algorithms – The Process
[Figure: phenotyping process diagram linking Data (NLP, SQL), Transform, Mappings, Phenotype Algorithm, Rules, Evaluation, and Visualization]
[eMERGE Network]
Key lessons learned from eMERGE
• Algorithm design and transportability
• Non-trivial; requires significant expert involvement
• Highly iterative process
• Time-consuming manual chart reviews
• Representation of “phenotype logic” is critical
• Standardized data access and representation
• Importance of unified vocabularies, data elements, and value sets
• Questionable reliability of ICD & CPT codes (e.g., billing the wrong code because it is easier to find)
• Natural Language Processing (NLP) plays a vital role
[Kho et al., Sci. Transl. Med. 2011; 3(79): 1-7]
Algorithm Development Process - Modified
• Standardized and structured representation of phenotype definition criteria
• Use the NQF Quality Data Model (QDM)
• Conversion of structured phenotype criteria into executable queries (semi-automatic execution)
• Use JBoss® Drools rules (DRLs); see the rule sketch below
• Standardized representation of clinical data
• Create new and re-use existing clinical element models (CEMs)
[Figure: the process diagram from the previous slide, annotated to show where QDM, Drools rules, and CEMs apply to the underlying data (NLP, SQL)]
[Welch et al., JBI 2012; 45(4):763-71]
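To make the Drools step concrete, below is a minimal, illustrative DRL sketch of the kind of executable rule a structured phenotype criterion could be converted into. The fact types (DiagnosisEvent, MedicationEvent), the value-set globals, and the simplified T2DM-style logic (a qualifying diagnosis AND a qualifying medication) are assumptions for illustration, not the actual SHARPn or eMERGE algorithm definitions.

package org.example.phenotyping;                // hypothetical package name

declare DiagnosisEvent                          // a normalized diagnosis record
    patientId : String
    code      : String                          // e.g., an ICD-9-CM code
end

declare MedicationEvent                         // a normalized medication record
    patientId : String
    code      : String                          // e.g., an RxNorm code
end

global java.util.Set  t2dmDiagnosisCodes;       // value set of qualifying diagnosis codes
global java.util.Set  t2dmMedicationCodes;      // value set of qualifying medication codes
global java.util.List eligiblePatients;         // collects identifiers of matching patients

rule "Candidate T2DM case: qualifying diagnosis AND qualifying medication"
when
    DiagnosisEvent( code memberOf t2dmDiagnosisCodes, $pid : patientId )
    MedicationEvent( patientId == $pid, code memberOf t2dmMedicationCodes )
then
    // de-duplication of repeated matches is omitted for brevity
    eligiblePatients.add( $pid );
end

Here the value sets are injected as session globals for simplicity; in the pipeline described on the slide they would come from standardized vocabularies and value sets, and the rule would run over normalized, CEM-based clinical data.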
An Evaluation of the NQF Quality Data Model for Representing Electronic
Health Record Driven Phenotyping Algorithms
William K. Thompson, Ph.D.1, Luke V. Rasmussen1, Jennifer A. Pacheco1,
Peggy L. Peissig, M.B.A.2, Joshua C. Denny, M.D.3, Abel N. Kho, M.D.1,
Aaron Miller, Ph.D.2, Jyotishman Pathak, Ph.D.4,
1Northwestern University, Chicago, IL; 2Marshfield Clinic, Marshfield, WI; 3Vanderbilt University, Nashville, TN; 4Mayo Clinic, Rochester, MN
Abstract
The development of Electronic Health Record (EHR)-based phenotype selection algorithms is a non-trivial and
highly iterative process involving domain experts and informaticians. To make it easier to port algorithms across
institutions, it is desirable to represent them using an unambiguous formal specification language. For this purpose
we evaluated the recently developed National Quality Forum (NQF) information model designed for EHR-based
quality measures: the Quality Data Model (QDM). We selected 9 phenotyping algorithms that had been previously
developed as part of the eMERGE consortium and translated them into QDM format. Our study concluded that the
QDM contains several core elements that make it a promising format for EHR-driven phenotyping algorithms for
clinical research. However, we also found areas in which the QDM could be usefully extended, such as representing
information extracted from clinical text, and the ability to handle algorithms that do not consist of Boolean
combinations of criteria.
Introduction and Motivation
Identifying subjects for clinical trials and research studies can be a time-consuming and expensive process. For this
reason, there has been much interest in using the electronic health records (EHRs) to automatically identify patients
that match clinical study eligibility criteria, making it possible to leverage existing patient data to inexpensively and
automatically generate lists of patients that possess desired phenotypic traits.1,2 Yet the development of EHR-based
phenotyping algorithms is a non-trivial and highly iterative process involving domain experts and data analysts.3 It
is therefore desirable to make it as easy as possible to re-use such algorithms across institutions in order to minimize
the degree of effort involved, as well as the potential for errors due to ambiguity or under-specification. Part of the
solution to this issue is the adoption of an unambiguous and precise formal specification language for representing
phenotyping algorithms. This step is naturally a pre-condition for achieving the long-term goal of automatically
executable phenotyping algorithm specifications. A key element required to achieve this goal is a formal
representation for modeling the algorithms that can aid portability by enforcing standard syntax on algorithm
specifications, along with providing well-defined mappings to standard semantics of data elements and value sets.
Our experience in the development of phenotyping algorithms stems from work performed as part of the electronic Medical Records and Genomics (eMERGE) consortium,4 a network of seven sites originally using data collected in the EHR as part of routine clinical care to detect phenotypes for use in genome-wide association studies.
[Thompson et al., AMIA 2012]
[Li et al., AMIA 2012]
http://phenotypeportal.org
[Endle et al., AMIA 2012]
Phenotype Modeling and Execution Architecture (pheMA): New 4-year NIH R01
Themes: Standards and Representation; User Interaction; Portability and Execution
Existing tools/initiatives leveraged:
• eMERGE Phenotype Algorithms
• Quality Data Model (QDM)
• Clinical Information Modeling Initiative (CIMI)
• Health Quality Measures Format (HQMF)
• ONC Query Health / S&I Framework
• Natural Language Processing
• i2b2 datamart
• JBoss® Drools Rule Engine
• Enterprise Data Warehouse (EDW)
Specific Aims:
• Aim 1 - Standards-based template and information model
• Aim 2 - Open-access repository and infrastructure for authoring, sharing and accessing algorithms
• Aim 3 - Translating phenotype definition criteria to executable EHR data queries
Deliverables:
• QDM - Phenotype Extensions (QDM/PX)
• Phenotype KnowledgeBase (PheKB)
• SHARPn Clinical Element Model (CEM) Repository
• Phenotype Authoring Tool
• Library of computable phenotype algorithms
• QDM/PX -> Drools Translator Engine
• Executable Algorithms
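As a hedged illustration of Aim 3, the DRL sketch below shows the kind of executable rule a QDM/PX-to-Drools translator might emit for a criterion with a temporal relationship between two data elements ("medication starts within 90 days after a qualifying diagnosis"). The fact types, value-set globals, and the 90-day window are illustrative assumptions, not actual pheMA outputs.

declare DiagnosisEvent
    patientId : String
    code      : String
    eventTime : long                               // event time as epoch milliseconds
end

declare MedicationEvent
    patientId : String
    code      : String
    eventTime : long
end

global java.util.Set  qualifyingDiagnosisCodes;    // value set for the diagnosis criterion
global java.util.Set  qualifyingMedicationCodes;   // value set for the medication criterion
global java.util.List eligiblePatients;            // collects identifiers of matching patients

rule "Medication started within 90 days after a qualifying diagnosis"
when
    DiagnosisEvent( code memberOf qualifyingDiagnosisCodes,
                    $pid : patientId, $dxTime : eventTime )
    MedicationEvent( patientId == $pid,
                     code memberOf qualifyingMedicationCodes,
                     eventTime >= $dxTime,
                     eventTime <= $dxTime + (90L * 24 * 60 * 60 * 1000) )   // 90-day window
then
    eligiblePatients.add( $pid );
end

A translator of this kind would also depend on the well-defined mappings to data elements and value sets emphasized earlier, so that QDM criteria resolve to the normalized fact model the rules run against.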
Plan for Aim 1: Evaluation of the Quality Data Model
Modeling
• One algorithm modeled by two individual MAT & QDM experts (Measure Authoring Tool; QDM & MAT extension)
Reviewing
• Measures reviewed by three individual domain experts (comparison of the two versions of the measure)
Evaluation
• Gold standards to validate and compare the created measures:
• How concise the measure is (more concise is better)
• Whether the measure is true to the algorithm
• How much existing value sets and measures are re-used
• How much time it took to implement in the MAT
• How many rules in the MAT version vs. the Word document
• Considerations:
• How experienced the person was with the MAT to start (noting, for each phenotype, experience gained as work progresses)
• How well the person knew the phenotype to start
eMERGE phenotypes:
T2DM; Resistant Hypertension; Hypothyroidism; Cataracts; Diabetic Retinopathy; PAD; Dementia; VTE; Glaucoma; Ocular Hypertension
Continuous variable phenotypes:
QRS duration from ECG; Lipids (inc. HDL); Height; RBC; WBC
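Continuous-variable phenotypes such as these rely on numeric measurements and thresholds rather than Boolean combinations of coded criteria, which relates to the non-Boolean logic the QDM evaluation above identified as an area for extension. Below is a minimal Drools sketch of how such a criterion could be expressed with an accumulate over repeated measurements; the fact types, the 120 ms threshold, and the use of a mean are illustrative assumptions, not the study definitions.

declare Patient                                 // one fact per patient
    id : String
end

declare EcgMeasurement                          // one fact per ECG measurement
    patientId     : String
    qrsDurationMs : double                      // QRS duration in milliseconds
end

global java.util.List prolongedQrsPatients;     // collects identifiers of matching patients

rule "Continuous-variable criterion: mean QRS duration above 120 ms"
when
    Patient( $pid : id )
    exists EcgMeasurement( patientId == $pid )              // require at least one measurement
    accumulate( EcgMeasurement( patientId == $pid, $d : qrsDurationMs );
                $mean : average( $d );
                $mean > 120.0 )
then
    prolongedQrsPatients.add( $pid );
end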
Plan for Aims 2 & 3: National Library of Computable Phenotyping Algorithms