COI: Brainstorming Document
Feedback from:
• Jyotishman Pathak
• Dan Russler
• Matt Moores
• Alan Ruttenberg
• Parsa Mirhaji
• Lee Feigenbaum
• Ronan Fox
• Rachel Richesson
• Susie Stephens
• Eric Prud’hommeaux
Goal
• To demonstrate the value of semantic web (SW) specifications
in bridging the divide between clinical practice and clinical
research
– Collaborative development of a proof of concept (POC) that
demonstrates key value propositions of using SW specifications
• To get buy-in from a wide variety of stakeholders as a prelude
to acceptance and adoption of semantic web specifications
– Get buy-in for the use case
– Get wide participation and involvement of the key stakeholders in various
stages of analysis, design and development of the POC.
– Re-use existing standards, terminologies, data and information models of
existing communities to increase the probability of adoption.
Methodology
The group has worked in a consensus-driven
manner: opinions are sought from all the
stakeholders at each step, and each decision
is based on the resulting consensus.
It was realized that a critical success factor was
to incorporate the views of various communities
at each step.
Decision 1: Use Case
• Use Case Development was led by Rachel
Richesson
• A wide variety of use cases was investigated and
discussed
– Patient Recruitment
– Adverse Event Detection
– Tracking Patient Through a Clinical Trial
• Decision: Focus on Patient Recruitment
– Data was assumed to be in an EMR.
Re-use of existing Information Models
• How can we re-use EMR data for Clinical Research?
• HL7/RIM/DCM descriptions may be viewed as a “format” for Clinical
Research Data.
– Typically clinical data in healthcare delivery systems and applications is
represented or transformed into this “format”
• CDISC/SDTM description may be viewed as a “format” for Clinical Research
questions.
– Typically clinical data in clinical trials systems and applications is represented or
transformed into this “format”?
• Can we ask questions in one “format” when the data is represented in another
“format”?
• How can we implement functionality to map across these “formats”?
– The mapping module should be flexible to incorporate extensions in the “formats”
– The mapping module should be flexible to “plug and play” with multiple “formats”
Decision 2: Information Models
• Wide variety of Information Models were considered
– HL7/RIM
– CDISC/SDTM
– Detailed Clinical Models from Intermountain Healthcare
– Galen
– POMR Ontology (Chimezie)
– Eligibility Criteria Ontology (Helen)
– Healthcare Delivery Encounter-based Meta Model (Parsa Mirhaji)
• Conclusions
– No one ontology/information model is likely to fit the bill
– Align as closely as possible with existing information model and terminology
standards
– Identify gaps and inadequacies in addressing the use case at hand.
– Provide feedback to standards groups: CDISC, HL7/RIM, BRIDG
• Decision: Use CDISC/SDTM, Detailed Clinical Models, HL7/RIM as “seed”
ontologies to begin with
– Iteratively refine them as gaps and inadequacies are discovered
Demonstrate Re-use
• Re-use of data from the EMR for Clinical Research
• Re-use of existing vocabularies, e.g., NCI Thesaurus, Snomed, MedDRA
• Re-use of pre-existing information models e.g., HL7/RIM/DCM, SDTM
• Identify and Re-use software components that can be used to enable a wide
range of use cases
– Patient Recruitment
– Adverse Drug Event Detection
– Tracking a Patient through a Clinical Trial
• Develop the POC based on an implementation of these re-usable
components
– Components that implement mapping
– Components that implement data retrieval
– Components that implement wrappers/transformations
– Components that implement checking for eligibility criteria, adverse events and
other clinical events of significance.
Decision 3
• Decided to implement POC on a real world data set as
opposed to a synthetically created data set. This
raised the following issues:
– What would be an appropriate “seed” Information
Model/Ontology to describe healthcare data based on
current state?
– What are appropriate terminologies (e.g., Snomed, LOINC,
RxNorm) that need to be considered to capture coded
information in healthcare data based on current state?
• Parsa Mirhaji provided the data and his feedback was
crucial in identifying the appropriate “seed”
model/terminology
Decision 4
• Based on discussions with W3C folks such as Ralph
Swick, Karen Myers, Steve Bratt, Eric P.
– W3C is interested in working with external standards bodies
such as HL7 and CDISC to express their content using
Semantic Web specifications such as RDF and OWL (Steve
Bratt at the Bio IT World Luncheon)
– Implication: W3C is content neutral
– Bron Kisler emphasized that since W3C is providing only the
languages, a collaboration would be synergistic and would
make sense
– Is it possible to develop a collaborative interest group with
involvement of HL7, CDISC and others? (conversation with
Ralph Swick)
One proposed Solution Architecture
[Architecture diagram. Components shown: Protocol Specification Interface,
Mapping Module, RDF Transformation Engine, Eligibility Checking Module,
CTMS, EMR System]
Decision 5
• Current State Assumptions
– Information Models and Vocabularies used in Clinical Trials
Context are different from those used in the Healthcare
Delivery context
• Emphasis on the mapping aspect
– Support Plug and Play of different Information Models and
Vocabularies
• Technology Choices:
– SPARQL
– N3 rules
Mapping Module
• Critical component of the key goal of this effort.
– i.e., To gain acceptance from a wide variety of stakeholders
in the healthcare and clinical trials space.
– HL7/RIM/DCM – seek alignment with healthcare standards
– CDISC/SDTM – seek alignment with clinical trials standards
– Develop Mappings across these two models
– Identify limitations and gaps across these models
• Scope:
– Focus only on those data items that are required for patient
recruitment
– Focus only on those data items that are related to diabetes
and hypertension
– To be driven in some part by “mock” diabetes and
hypertension records
Use Case Step Through
1. Clinical Trial Administrator uses the Protocol Specification Interface to
specify the eligibility criteria. The data items are specified using elements
from the SDTM model.
2. The mapping module translates the data items to the appropriate
HL7/RIM/DCM representation.
3. Appropriate queries are made to the Mediator/Gateway module.
4. The Mediator/Gateway module translates the query into the underlying
database query language. The query is executed at the database and
the results are sent to the mapping module.
5. The mapping module retranslates the data into terms from the SDTM
model.
6. The Eligibility Checking Module checks which patients satisfy the eligibility
criteria.
7. The selected patients are returned to the Clinical Trial Administrator.
Note: Some eligibility criteria may not be expressible using SPARQL queries
and may require rules, etc.
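The step-through can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the in-memory rows stand in for the EMR database, a simple numeric threshold stands in for real eligibility criteria, and all function and class names are hypothetical. Only the SystolicBP / SYSBP pairing comes from the mapping slides.

```python
from dataclasses import dataclass

# SDTM-side code to RIM/DCM-side element name (pairing taken from the mapping slides).
SDTM_TO_RIM = {"SYSBP": "SystolicBP"}
RIM_TO_SDTM = {rim: sdtm for sdtm, rim in SDTM_TO_RIM.items()}

# Mock EMR rows keyed by RIM/DCM-style item names (invented data).
EMR_ROWS = [
    {"patient": "p1", "item": "SystolicBP", "value": 150},
    {"patient": "p2", "item": "SystolicBP", "value": 118},
]


@dataclass(frozen=True)
class Criterion:
    sdtm_item: str   # step 1: criteria are stated in SDTM terms
    minimum: float   # a simple threshold stands in for richer criteria


def to_rim(criterion):
    """Step 2: translate the SDTM data item to its RIM/DCM name."""
    return SDTM_TO_RIM[criterion.sdtm_item]


def gateway_query(rim_item):
    """Steps 3-4: the mediator/gateway runs the query against the EMR."""
    return [row for row in EMR_ROWS if row["item"] == rim_item]


def to_sdtm(rows):
    """Step 5: retranslate the results into SDTM terms."""
    return [{**row, "item": RIM_TO_SDTM[row["item"]]} for row in rows]


def eligible_patients(criterion):
    """Steps 6-7: check the criterion and return the selected patients."""
    rows = to_sdtm(gateway_query(to_rim(criterion)))
    return [
        row["patient"]
        for row in rows
        if row["item"] == criterion.sdtm_item and row["value"] >= criterion.minimum
    ]
```

Calling `eligible_patients(Criterion("SYSBP", 140))` walks all seven steps against the mock rows.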
Next Steps: Narrow Scope for
Implementation
• Choose a protocol for implementation #8 (second one)
• Limit Scope to Medications, Lab Tests and Vital Signs
• Develop Clinical Trials Ontology and Clinical Practice Ontology
– Iterative development
– Alignment with standards as closely as possible
• Implement RDF data store based on data requirements and mock patients
• Implement Mapping module using N3 rules
• Implement Eligibility checking module using SPARQL
• Try to demonstrate another use case for Adverse Drug Event Detection.
Specification of Eligibility Criteria
• Assume we will use an ontology or rule-based tool to
specify eligibility criteria
• Open to NLP/Ontology-based approaches that
translate free-text clinical protocol specifications
into a structured form
• Examples:
– Type 1 diabetes and/or history of ketoacidosis
– History of long-term therapy with insulin (>30 days) within
the last year
Eligibility Criteria Specification
• The functional requirements for this need to be
identified and spec’ed out, e.g.,
– Temporal Constraints
– Trends on clinical data and values
– …
– Out of Scope for POC.
• May want to see if the CT or HC communities have
done some work on standards for specifying eligibility
criteria.
– Out of Scope for POC
Design Choice: Eligibility Criteria as a
“layer” around Data Items
• Data Items
– Problem: Type 2 Diabetes
– History of Problem: Ketoacidosis
– History of Therapy
• Name: Insulin
• Length: X days
• Time Period: [Date1, Date2]
• Eligibility Criteria:
– Rule conditions
– Patient has Type 2 Diabetes
– Patient has History of Ketoacidosis
– Patient has History of Therapy:
• Name = Insulin
• X > 30 days
• Time Period < 1 year
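One way to picture the "layer" idea is to keep data items as plain records and express each eligibility condition as a predicate over them. The sketch below is an assumption-laden illustration, not the POC design: the class and function names are hypothetical, the conditions are combined as a conjunction, and "Time Period < 1 year" is read as "the therapy ended less than 365 days before a reference date".

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Therapy:
    name: str
    start: date
    end: date

    @property
    def length_days(self):
        return (self.end - self.start).days


@dataclass
class PatientRecord:
    """Data items only; no eligibility logic lives here."""
    problems: set
    history_of_problems: set
    therapies: list


# Eligibility criteria: a separate layer of predicates over the data items.
def has_problem(name):
    return lambda rec: name in rec.problems


def has_history_of(name):
    return lambda rec: name in rec.history_of_problems


def has_therapy(name, min_days, within_days, today):
    def check(rec):
        return any(
            t.name == name
            and t.length_days > min_days
            and (today - t.end).days < within_days
            for t in rec.therapies
        )
    return check


def is_eligible(rec, conditions):
    """Assumes all rule conditions must hold (conjunction)."""
    return all(cond(rec) for cond in conditions)


today = date(2009, 6, 1)  # arbitrary reference date for the example
conditions = [
    has_problem("Type 2 Diabetes"),
    has_history_of("Ketoacidosis"),
    has_therapy("Insulin", min_days=30, within_days=365, today=today),
]
```

Because the predicates only read the data items, the same records could be checked against a different protocol by swapping the `conditions` list.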
Mappings: Goal/Methodology
1. Characterize the various data items required for patient recruitment
(modulo scope): List of requirements on the data content Tab
   1. http://spreadsheets.google.com/ccc?key=pINNryLt_vyDiPyHj11WiDg&hl=en_US&pli=1
• For each data item do the following:
   1. Identify the RIM/DCM construct(s) that models that data item: DCM column
   under Models
   2. Identify the SDTM construct(s) that models that data item: SDTM column
   under Models
   3. Identify the terminologies that model some of the values required:
   Terminology Columns including Snomed, MedDRA and NCI Thesaurus
   4. Identify the data types and values that characterize the values of some of the
   data items: Data Types and Units columns including those for RIM and
   SDTM
   1. We will be considering various constructs of HL7/RIM, Detailed Clinical
   Models and other models in conjunction
Consider a Data Item Example:
• History of Therapy
– Name: Insulin
– Length of Therapy: 100 days
– StartDate: Date
– EndDate: Date
Mapping Methodology
1. Identify Information Model Elements
   1. Therapy =>
      1. SubstanceAdministration (HL7/RIM)
         1. effectiveTime
         2. statusCode
      2. Medication (subClass of ManufacturedMaterial, HL7/RIM)
      3. Specific type of Participation called Consumable (HL7/RIM)
   2. Insulin => Medication.Name (HL7/RIM)
2. Identify Controlled Vocabularies
   1. Medication.Name => Controlled Vocabulary RxNorm (also known as
   Terminology Binding)
3. Identify Data Types
   • Dates and Times => TS data type in HL7
4. Identify Units
   1. Included in the definition of data types … taken from the UCUM standard
Mapping Methodology (Continued)
1. Mappings between Information Model elements:
   1. SystolicBP <=> VSTEST, VSTESTCD = SYSBP
• Mappings between controlled vocabularies:
   • SystolicBP <=> “Some Snomed Concept”
   • SYSBP <=> “Some NCI Thesaurus Concept”
   • “Some Snomed Concept” <=> “Some NCI Thesaurus Concept”
• Between Data Types and Units
   • HL7:PQ <=> VSRESU
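The three kinds of mapping could be held as simple lookup tables. The sketch below is illustrative only: the concept identifiers are the slide's own placeholders rather than real codes, the `result` and `concept` field names are hypothetical, and an HL7 PQ value is simplified to a `(value, unit)` tuple.

```python
# Information-model mapping from the slide: SystolicBP <=> VSTESTCD = SYSBP.
ELEMENT_TO_VSTESTCD = {"SystolicBP": "SYSBP"}

# Vocabulary mapping; the concept identifiers are placeholders, not real codes.
SNOMED_TO_NCIT = {"Some Snomed Concept": "Some NCI Thesaurus Concept"}


def pq_to_sdtm(element, pq, concept=None):
    """Translate an HL7 PQ-style (value, unit) pair into an SDTM-style row."""
    value, unit = pq
    row = {
        "VSTESTCD": ELEMENT_TO_VSTESTCD[element],
        "result": value,   # hypothetical field name for the numeric result
        "VSRESU": unit,    # unit column as named on the slide
    }
    if concept is not None:
        row["concept"] = SNOMED_TO_NCIT[concept]  # hypothetical field name
    return row
```

For example, `pq_to_sdtm("SystolicBP", (150, "mm[Hg]"))` produces a row coded as `SYSBP` with the UCUM unit carried through unchanged.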
Design Choice: Leverage Existing
Implementations and Systems
• SHER System
– Re-use the Reasoner to compute eligibility criteria
• Semantic DB System
– Re-use the NLP parser (if available) to parse the
textual representation of the clinical trials criteria
into structured queries, rules, or another structured form
Technical: Eligibility Criteria in OWL
Patient
that (hasProblem some DiabetesType1
or hasHistory some Ketoacidosis)
and hasTherapy
(some Therapy
that hasLength all int[>30]
and hasTimePeriod all int[< 365])
Just for illustration purposes … need thorough and
detailed analysis to get it right.
Technical Design: Eligibility Criteria using Rules
IF (the_patient.hasProblem = DiabetesType1
OR the_patient.hasHistory = Ketoacidosis)
AND the_patient.hasHistory.name = Insulin
AND the_patient.hasHistory.length > 30
AND the_patient.hasHistory.timePeriod < 365
THEN
the_patient is eligible for the clinical trial
Mapping Design Issues
• Are mappings always 1-1?
• Is it always possible to get synonym mappings?
• What happens to these mappings when there
are changes in the information models?
• Are these mappings enough to enable a bidirectional flow between EMR and
clinical trials data?
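Some of these questions can be checked mechanically once the mappings exist as data. The following sketch assumes a mapping is a plain dictionary from source terms to target terms; the helper names and the second source term in the usage note are hypothetical.

```python
from collections import defaultdict


def invert(mapping):
    """Group source terms by target term to expose many-to-one mappings."""
    inverse = defaultdict(list)
    for source, target in mapping.items():
        inverse[target].append(source)
    return dict(inverse)


def is_one_to_one(mapping):
    """True when no two source terms share a target, so the mapping can be reversed."""
    return all(len(sources) == 1 for sources in invert(mapping).values())


def unmapped(terms, mapping):
    """Terms with no mapping at all, e.g. after an information model changes."""
    return sorted(t for t in terms if t not in mapping)
```

With `{"SystolicBP": "SYSBP", "SittingSystolicBP": "SYSBP"}`, `is_one_to_one` returns `False`: translating `SYSBP` back toward the EMR side is ambiguous, so that mapping alone would not support a bidirectional flow.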