SHARP High-Throughput Phenotyping Jyoti Pathak, Ph.D.

Strategic Health IT Advanced Research Projects (SHARP)
Area 4: Secondary Use of EHR Data
Project 3: High-Throughput Phenotyping

Project Lead: Jyotishman Pathak, PhD
PI: Christopher G. Chute, MD, DrPH

June 12, 2012

Electronic health records (EHRs) driven phenotyping

• Overarching goal
  • To develop high-throughput automated techniques and algorithms that operate on normalized EHR data to identify cohorts of potentially eligible subjects on the basis of disease, symptoms, or related findings

SHARPn High-Throughput Phenotyping ©2012 MFMER

Current HTP project themes

• Standardization of phenotype definitions
• Library of phenotyping algorithms
• Phenotyping workbench
• Machine learning techniques for phenotyping
• Just-in-time phenotyping

Algorithm Development Process - Modified

• Standardized and structured representation of phenotype definition criteria
  • Use the NQF Quality Data Model (QDM)
• Conversion of structured phenotype criteria into executable queries
• clinical data
  • Create new and re-use existing clinical element models (CEMs)

[Diagram labels: Phenotype Algorithm, Rules, Semi-Automatic Execution, Evaluation, Visualization, Transform, Data, Mappings, NLP, SQL]

[Welch et al. 2012] [Thompson et al., submitted 2012] [Li et al., submitted 2012]

NQF Quality Data Model (QDM)

• Standard of the National Quality Forum (NQF)
  • A structure and grammar to represent quality measures in a standardized format
• Groups of codes in a code set (ICD-9, etc.)
  • "Diagnosis, Active: steroid induced diabetes" using "steroid induced diabetes Value Set GROUPING (2.16.840.1.113883.3.464.0001.113)"
• Supports temporality & sequences
  • AND: "Procedure, Performed: eye exam" > 1 year(s) starts before or during "Measurement end date"
• Implemented as set of XML schemas
  • Links to standardized terminologies (ICD-9, ICD-10, SNOMED-CT, CPT-4, LOINC, RxNorm etc.)

116 Meaningful Use Phase I Quality Measures

Example: Diabetes & Lipid Mgmt. - I Human readable HTML

Example: Diabetes & Lipid Mgmt. - II Computable XML
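
As a rough illustration of what a computable criterion involves, the QDM pairing of a data type ("Diagnosis, Active") with a value set (a group of codes identified by an OID) can be sketched in Python. This is a minimal sketch and not the QDM XML schema: the `ValueSet` and `Criterion` classes and the event dictionary layout are assumptions made for illustration, and the member codes are placeholders because the slides do not list the contents of the value set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueSet:
    """A named group of codes, identified by an OID."""
    oid: str
    name: str
    codes: frozenset  # set of (code_system, code) pairs


@dataclass(frozen=True)
class Criterion:
    """One QDM-style element: a data type bound to a value set."""
    datatype: str
    value_set: ValueSet

    def matches(self, event):
        """True if the event has this data type and a code from the value set."""
        return (
            event["datatype"] == self.datatype
            and (event["code_system"], event["code"]) in self.value_set.codes
        )


# OID and name come from the slide; the member codes are PLACEHOLDERS,
# not real members of this value set.
STEROID_INDUCED_DIABETES = ValueSet(
    oid="2.16.840.1.113883.3.464.0001.113",
    name="steroid induced diabetes Value Set GROUPING",
    codes=frozenset({("ICD-9", "PLACEHOLDER-1"), ("SNOMED-CT", "PLACEHOLDER-2")}),
)

CRITERION = Criterion("Diagnosis, Active", STEROID_INDUCED_DIABETES)

# Hypothetical patient events in an assumed dictionary layout.
matching_event = {
    "datatype": "Diagnosis, Active",
    "code_system": "ICD-9",
    "code": "PLACEHOLDER-1",
}
other_event = {
    "datatype": "Procedure, Performed",
    "code_system": "ICD-9",
    "code": "PLACEHOLDER-1",
}
```

Here `CRITERION.matches(matching_event)` is true, while `other_event` fails because its data type differs even though the code is in the value set.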

Drools-based Phenotyping Architecture

• Clinical Element Database
• Data Access Layer
• Transformation Layer: transform physical representation → normalized logical representation (Fact Model)
• Business Logic
• Inference Engine (Drools)
• Service for Creating Output (File, Database, etc.)
• List of Diabetic Patients
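
The flow through these layers can be sketched in plain Python. This is only an illustration of the layering, not Drools and not the project's code: the row layout, the `DiagnosisFact` class, and the single rule (any ICD-9 250.xx code counts as diabetes) are simplifying assumptions.

```python
from dataclasses import dataclass

# Hypothetical rows standing in for the clinical element database.
RAW_ROWS = [
    {"pid": "001", "system": "ICD-9", "code": "250.00"},
    {"pid": "002", "system": "ICD-9", "code": "401.9"},
    {"pid": "003", "system": "ICD-9", "code": "250.02"},
]


@dataclass(frozen=True)
class DiagnosisFact:
    """Normalized logical representation of one diagnosis (a 'fact')."""
    patient_id: str
    code_system: str
    code: str


def fetch_rows():
    """Data access layer: return the physical rows."""
    return list(RAW_ROWS)


def to_facts(rows):
    """Transformation layer: physical rows -> fact model."""
    return [DiagnosisFact(r["pid"], r["system"], r["code"]) for r in rows]


def diabetes_rule(fact):
    """Business logic (simplified assumption): any ICD-9 250.xx code."""
    return fact.code_system == "ICD-9" and fact.code.startswith("250.")


def run_rules(facts, rules):
    """Stand-in for the inference engine: patients with a fact matching any rule."""
    return {f.patient_id for f in facts if any(rule(f) for rule in rules)}


def write_output(patient_ids):
    """Output service: render the patient list as text, one ID per line."""
    return "\n".join(sorted(patient_ids))


def run_pipeline():
    """Chain the layers: access -> transform -> rules -> output."""
    return write_output(run_rules(to_facts(fetch_rows()), [diabetes_rule]))
```

Keeping the rules separate from data access and transformation means the same rule set could run against a different physical store by replacing only `fetch_rows` and `to_facts`.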

Automatic translation from NQF QDM criteria to Drools

[Li et al., submitted 2012]

The “executable” Drools flow

Phenotype library and workbench - I http://phenotypeportal.org

1. Converts QDM to Drools
2. Executes rules by querying the CEM database
3. Generates summary reports

Phenotype library and workbench - II http://phenotypeportal.org

Phenotype library and workbench - III

Machine learning and HTP - I

• Machine learning and association rule mining
• Manual creation of algorithms takes time
• Let computers do the “hard work”
• Validate against expert-developed ones

[Carroll et al. 2011]

Machine learning and HTP - II

• Origins from sales data
• Items (columns): co-morbid conditions
• Transactions (rows): patients
• Itemsets: sets of co-morbid conditions
• Goal: find all itemsets (sets of conditions) that frequently co-occur in patients.
  • One of those conditions should be DM.
• Support: # of transactions the itemset appeared in
  • Support({TB, DLM, ND}) = 3
• Frequent: an itemset I is frequent if support(I) > minsup

[Table: patients 001 to 005 (rows) against conditions TB, DLM, ND, …, IEC (columns), with Y marking a condition present]
[Figure: lattice of itemsets over items A, B, C, D (AB, AC, AD, BC, BD, CD, ABD, ACD), with X marking infrequent itemsets]

[Simon et al. 2012]
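
The definitions of support and frequent itemsets can be sketched in Python as follows. This is a minimal level-wise (Apriori-style) enumeration, not necessarily the method of Simon et al. The patient data below is invented for illustration and is not the table from the slide; it is chosen only so that Support({TB, DLM, ND}) comes out as 3. The `must_contain` parameter is an assumed way to express "one of those conditions should be DM".

```python
from itertools import combinations

# Hypothetical patient -> conditions data (NOT the table from the slide).
PATIENTS = {
    "001": {"DM", "TB", "DLM", "ND"},
    "002": {"DM", "TB", "DLM", "ND"},
    "003": {"DM", "TB", "DLM", "ND"},
    "004": {"DM", "DLM"},
    "005": {"DLM", "IEC"},
}


def support(itemset, transactions):
    """Number of transactions (patients) that contain every item in itemset."""
    items = frozenset(itemset)
    return sum(1 for conditions in transactions.values() if items <= conditions)


def frequent_itemsets(transactions, minsup, must_contain=None):
    """Return {itemset: support} for all itemsets with support > minsup.

    Works level by level: a k-itemset is only counted if all of its
    (k-1)-subsets were frequent at the previous level.
    """
    result = {}
    all_items = sorted(set().union(*transactions.values()))

    # Level 1: frequent single conditions.
    previous = set()
    for item in all_items:
        count = support([item], transactions)
        if count > minsup:
            previous.add(frozenset([item]))
            result[frozenset([item])] = count

    k = 2
    while previous:
        candidates_from = sorted(set().union(*previous))
        current = set()
        for combo in combinations(candidates_from, k):
            # Prune candidates that have an infrequent (k-1)-subset.
            if all(frozenset(sub) in previous for sub in combinations(combo, k - 1)):
                count = support(combo, transactions)
                if count > minsup:
                    current.add(frozenset(combo))
                    result[frozenset(combo)] = count
        previous = current
        k += 1

    if must_contain is not None:
        result = {s: c for s, c in result.items() if must_contain in s}
    return result


# Itemsets that appear in more than 2 patients and include DM.
dm_itemsets = frequent_itemsets(PATIENTS, minsup=2, must_contain="DM")
```

With this toy data, IEC appears in only one patient, so it is infrequent and no itemset containing it is ever counted.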

Just-in-Time phenotyping - I

• Transfusion-related Acute Lung Injury (TRALI)
• Transfusion-associated Circulatory Overload (TACO)

Electronic Health Records and Phenomics

Just-in-Time phenotyping - II TRALI/TACO “sniffer”

Active Surveillance for TRALI and TACO

Of the 88 TRALI cases correctly identified by the CART algorithm, only 11 (12.5%) were reported to the blood bank by the clinical service. Of the 45 TACO cases correctly identified by the CART algorithm, only 5 (11.1%) were reported to the blood bank by the clinical service.

Publications to date (conservative)

[Bar chart: number of publications (Papers, Abstracts, Under review) by project year: Year 1 (2011), Year 2 (2012), Year 3 (2013)]

2011 Milestones

• Standardized definitions for phenotype criteria
• Rules-based environment for phenotype algorithm execution
• National library for standardized phenotype definitions (collaboration with eMERGE)
• Machine learning techniques for algorithm definitions
• Online, real-time phenotype execution
• Phenotyping algorithm authoring environment

2012 Milestones

• Machine learning techniques for algorithm definitions
• Online, real-time phenotype execution
• Collaboration with NQF, Query Health and i2b2 infrastructures
• Use cases and demonstrations
  • MU quality metrics (w/ NQF, Query Health)
  • Cohort identification (w/ eMERGE, PGRN)
  • Value analysis (w/ Mayo CSHCD, REP)
  • Clinical trial alerting (w/ Mayo Cancer Ctr./CTSA)

Project 3: Collaborators & Acknowledgments

• CDISC (Clinical Data Interchange Standards Consortium)
  • Rebecca Kush, Landen Bain
• Centerphase Solutions
  • Gary Lubin, Jeff Tarlowe
• Group Health Seattle
  • David Carrell
• Harvard University/MIT
  • Guergana Savova, Peter Szolovits
• Intermountain Healthcare/University of Utah
  • Susan Welch, Herman Post, Darin Wilcox, Peter Haug
• Mayo Clinic
  • Cory Endle, Rick Kiefer, Sahana Murthy, Gopu Shrestha, Dingcheng Li, Gyorgy Simon, Matt Durski, Craig Stancl, Kevin Peterson, Cui Tao, Lacey Hart, Erin Martin, Kent Bailey, Scott Tabor, Chris Chute