Transcript Document
R T U New York State Center of Excellence in Bioinformatics & Life Sciences

Principles of Referent Tracking and its Application in Biomedical Informatics
October 20, 2009
Rochester Clinical & Translational Research Curriculum Seminar Series
Werner CEUSTERS
Center of Excellence in Bioinformatics and Life Sciences
Ontology Research Group
University at Buffalo, NY, USA

Seminar overview
1. Setting the scene: a rough description of what Referent Tracking is and why it is important
2. Review the basics of Basic Formal Ontology relevant to Referent Tracking
   • The crucial distinction between representations and what they represent
3. How to apply this
   • past and ongoing projects
   • translational data warehousing at UB

Part 1: Setting the scene
Referent Tracking: What and Why?

What is Referent Tracking?
• A paradigm under development since 2005,
  – based on Basic Formal Ontology,
  – designed to keep track of relevant portions of reality and what is believed and communicated about them,
  – enabling adequate use of realism-based ontologies, terminologies, thesauri, and vocabularies,
  – originally conceived to track particulars on the side of the patient and his environment denoted in his EHR,
  – but since then studied in and applied to a variety of domains,
  – and now evolving towards tracking absolutely everything, not only particulars, but also universals.

‘Principles for Success’
• Evolutionary change
• Radical change:
  • Principle 6: Architect Information and Workflow Systems to Accommodate Disruptive Change
    » Organizations should architect health care IT for flexibility to support disruptive change rather than to optimize today’s ideas about health care.
  • Principle 7: Archive Data for Subsequent Re-interpretation
    » Vendors of health care IT should provide the capability of recording any data collected in their measured, uninterpreted, original form, archiving them as long as possible to enable subsequent retrospective views and analyses of those data. [NOTE]
    NOTE: ‘See, for example, Werner Ceusters and Barry Smith, “Strategies for Referent Tracking in Electronic Health Records,” Journal of Biomedical Informatics 39(3):362-378, June 2006.’
William W. Stead and Herbert S. Lin, editors; Committee on Engaging the Computer Science Research Community in Health Care Informatics; National Research Council. Computational Technology for Effective Health Care: Immediate Steps and Strategic Directions (2009)

Source of all data
Reality!

Ultimate goal of Referent Tracking
A digital copy of the world

Requirements for this digital copy
• R1: A faithful representation of reality …
• R2: … of everything that is digitally registered,
  – what is generic: scientific theories
  – what is specific: what individual entities exist and how they relate
• R3: … throughout reality’s entire history,
• R4: … which is computable in order to …
  – … allow queries over the world’s past and present,
  – … make predictions,
  – … fill in gaps,
  – … identify mistakes,
  – ...

In fact … the ultimate crystal ball

The ‘binding’ wall
I don’t want a cartoon of the world

Terminologies for ‘unambiguous representation’ ???
PtID   Date        ObsCode    Narrative
5572   04/07/1990  26442006   closed fracture of shaft of femur
5572   04/07/1990  81134009   Fracture, closed, spiral
5572   12/07/1990  26442006   closed fracture of shaft of femur
5572   12/07/1990  9001224    Accident in public building (supermarket)
5572   04/07/1990  79001      Essential hypertension
0939   24/12/1991  255174002  benign polyp of biliary tract
2309   21/03/1992  26442006   closed fracture of shaft of femur
2309   21/03/1992  9001224    Accident in public building (supermarket)
47804  03/04/1993  58298795   Other lesion on other specified region
5572   17/05/1993  79001      Essential hypertension
298    22/08/1993  2909872    Closed fracture of radial head
298    22/08/1993  9001224    Accident in public building (supermarket)
5572   01/04/1997  26442006   closed fracture of shaft of femur
5572   01/04/1997  79001      Essential hypertension
0939   20/12/1998  255087006  malignant polyp of biliary tract

• If two different fracture codes are used in relation to observations made on the same day for the same patient, do they denote the same fracture?
• If the same fracture code is used for the same patient on different dates, can these codes denote the same fracture?
• Can the same fracture code used in relation to two different patients denote the same fracture?
• Can two different tumor codes used in relation to observations made on different dates for the same patient denote the same tumor?
• Do three references of ‘hypertension’ for the same patient denote three times the same disease?
• Can the same type of location code used in relation to three different events denote the same location?

How will we ever know?

The problem in a nutshell
• Generic terms used to denote specific entities do not have enough referential capacity
  – Usually enough to convey that some specific entity is denoted,
  – Not enough to be clear about which one in particular.
• For many ‘important’ entities, unique identifiers are used:
  – UPS parcels
  – Patients in hospitals
  – VINs on cars
  – …

Fundamental goals of ‘our’ Referent Tracking
1. explicit reference to the concrete individual entities relevant to the accurate description of some portion of reality, ...
Ceusters W, Smith B. Strategies for Referent Tracking in Electronic Health Records. J Biomed Inform. 2006 Jun;39(3):362-78.

Method: numbers instead of words
  – Introduce an Instance Unique Identifier (IUI) for each relevant particular (individual) entity
Ceusters W, Smith B. Strategies for Referent Tracking in Electronic Health Records. J Biomed Inform. 2006 Jun;39(3):362-78.
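
The effect of adding an IUI next to each code can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows and IUI values are copied from the talk's slide ‘Codes for ‘types’ AND identifiers for instances’, while the `Observation` class and the helpers `same_referent` and `codes_per_referent` are hypothetical names introduced here, not part of any Referent Tracking system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    pt_id: str     # patient identifier as recorded in the EHR
    date: str      # observation date, dd/mm/yyyy as on the slides
    obs_code: str  # code for the TYPE of entity (the generic term)
    iui: str       # identifier for the particular INSTANCE observed


# A few rows from the talk's example table, with the IUIs shown there.
OBSERVATIONS = [
    Observation("5572", "04/07/1990", "26442006", "IUI-001"),
    Observation("5572", "04/07/1990", "81134009", "IUI-001"),
    Observation("5572", "12/07/1990", "26442006", "IUI-001"),
    Observation("2309", "21/03/1992", "26442006", "IUI-002"),
    Observation("5572", "01/04/1997", "26442006", "IUI-012"),
    Observation("0939", "24/12/1991", "255174002", "IUI-004"),
    Observation("0939", "20/12/1998", "255087006", "IUI-004"),
]


def same_referent(a: Observation, b: Observation) -> bool:
    """Two observations denote the same particular iff their IUIs match."""
    return a.iui == b.iui


def codes_per_referent(observations):
    """Group the type codes that were applied to each particular."""
    grouped = {}
    for obs in observations:
        grouped.setdefault(obs.iui, set()).add(obs.obs_code)
    return grouped
```

With these rows, `same_referent(OBSERVATIONS[0], OBSERVATIONS[1])` holds although the two codes differ, whereas the 1990 and 1997 femur fractures of patient 5572 share a code but not an IUI. The codes alone could not settle either question.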

Codes for ‘types’ AND identifiers for instances
PtID   Date        ObsCode    IUI      Narrative
5572   04/07/1990  26442006   IUI-001  closed fracture of shaft of femur
5572   04/07/1990  81134009   IUI-001  Fracture, closed, spiral
5572   12/07/1990  26442006   IUI-001  closed fracture of shaft of femur
5572   12/07/1990  9001224    IUI-007  Accident in public building (supermarket)
5572   04/07/1990  79001      IUI-005  Essential hypertension
0939   24/12/1991  255174002  IUI-004  benign polyp of biliary tract
2309   21/03/1992  26442006   IUI-002  closed fracture of shaft of femur
2309   21/03/1992  9001224    IUI-007  Accident in public building (supermarket)
47804  03/04/1993  58298795   IUI-006  Other lesion on other specified region
5572   17/05/1993  79001      IUI-005  Essential hypertension
298    22/08/1993  2909872    IUI-003  Closed fracture of radial head
298    22/08/1993  9001224    IUI-007  Accident in public building (supermarket)
5572   01/04/1997  26442006   IUI-012  closed fracture of shaft of femur
5572   01/04/1997  79001      IUI-005  Essential hypertension
0939   20/12/1998  255087006  IUI-004  malignant polyp of biliary tract
7 distinct disorders

The shift envisioned
• From:
  – ‘this human being is a 40 year old patient with a stomach tumor’
• To (something like):
  – ‘this-1 on which depend this-2 and this-3 has this-4’, where
    • this-1  instanceOf  human being      at t1
    • this-2  instanceOf  age-of-40-years  at t2
    • this-2  qualityOf   this-1           at t2
    • this-3  instanceOf  patient-role     at t3
    • this-3  roleOf      this-1           at t3
    • this-4  instanceOf  tumor            at t4
    • this-4  partOf      this-5           at t6
    • this-5  instanceOf  stomach          at t7
    • this-5  partOf      this-1           at t8
    • …
• The four columns are, in order:
  – denotators for particulars
  – denotators for appropriate relations
  – denotators for universals or particulars
  – time periods (for continuants) when the relationships hold

Relevance: the way RT-compatible systems ought to interact with representations of generic portions of reality
[Figure: diagram; recoverable labels: ‘instance-of at t’, ‘#105’, ‘caused by’]

Yes, but …
• what are particulars?
• what are universals?
• what are denotators?
• …
the answer is in …

Part 2: Basic Formal Ontology
No (good) Referent Tracking without (good = realism-based) Ontology

Ontology
• In computer science:
  – a formal specification of a conceptualization

Not the wrong sort

No serious scholar should work with ‘concepts’

Slow penetration of the idea …

More serious scholars become convinced …
what is a concept description a description of?

The right sort of ontology can help …
• In computer science:
  – a formal specification of a conceptualization
    • leads to bad ontologies
• In philosophy:
  – a representation of reality
• In the OBO Foundry:
  – a representational artifact which is intended to represent universals and some defined classes.
    • foundation in philosophical realism

Basic axioms underlying OBO Foundry ontologies
1. There is an external reality which is ‘objectively’ the way it is;
2. That reality is accessible to us;
3. We build in our brains cognitive representations of reality;
4. We communicate with others about what is there, and what we believe there is there.
Smith B, Kusnierczyk W, Schober D, Ceusters W. Towards a Reference Terminology for Ontology Research and Development in the Biomedical Domain. Proceedings of KR-MED 2006, Biomedical Ontology in Action, November 8, 2006, Baltimore MD, USA

What is there?
The parts of BFO relevant for Referent Tracking
  – some continuant particular  instanceOf at t  some continuant universal
  – some occurrent particular  instanceOf  some occurrent universal

The importance of temporal indexing
[Figure: this-4 linked to ‘benign tumor’ and ‘malignant tumor’ by instanceOf at t1 and instanceOf at t2, and to this-1’s stomach by partOf at t1 and partOf at t2; this-1’s stomach linked to ‘stomach’ by instanceOf at t1 and instanceOf at t2]

Sorts of relations
• UtoU: isa, partOf(UU), …
• PtoU: instanceOf, lacks, denotes(PU), …
• PtoP: partOf, denotes, …
[Figure: universals U1 and U2 above particulars P1 and P2]

The essential pieces
[Figure: universals (dependent continuant, material object, history, spatial region, spacetime region, temporal region) and particulars (me, some quality, my life, my 4D STR, some spatial region, some temporal region), linked by instanceOf, ‘… at t’, participantOf at t, occupies, located-in at t, projectsOn at t and projectsOn]

Three levels of reality
[Figure: L1 (R, what is there), L2 (beliefs) and L3 (symbolizations), linked by ‘about’]

Representations in Referent Tracking
[Figure: diagram; recoverable labels: Portion of Reality, Entity, Configuration, Relation, Universal, Particular, Non-referring particular, class, Defined class, Extension, Information content ent., Representation, RT-tuple, Representational unit, Denotator, CUI, IUI, UUI, RUI; relations: represents, contains, is about, denotes, corresponds-to]

Part 3: Applications of Referent Tracking

(1) Making existing EHR systems RT compatible
• In: Teich JM, Suermondt J, Hripcsak C. (eds.), AMIA 2007 Proceedings, Biomedical and Health Informatics: From Foundations to Applications to Policy, Chicago IL, 2007.
  – Rudnicki R, Ceusters W, Manzoor S, Smith B. What Particulars are Referred to in EHR Data? A Case Study in Integrating Referent Tracking into an Electronic Health Record Application.
  – Manzoor S, Ceusters W, Rudnicki R. A Middleware Approach to Integrate Referent Tracking in EHR Systems.

Problems with prevailing EHR paradigms
• Perfect ‘semantic’ tools are useless if data captured at the source is not of high quality
• Prevailing HIT information models don’t allow data to be stored at acceptable quality level:
  – No formal distinction between disorders and diagnosis
  – Messy nature of the notions of ‘problem’ and ‘concern’
  – No unique identification of the entities about which data is stored
    • Unique IDs for data-elements cannot serve as unique IDs for the entities denoted by these data-elements

MedtuityEMR
[Figure: Patient’s Encounter Document]

(2) The ReMINE Project on Adverse Events
• Ceusters W, Capolupo M, De Moor G, Devlies J. Introducing Realist Ontology for the Representation of Adverse Events. In: Eschenbach C, Gruninger M. (eds.) Formal Ontology in Information Systems, IOS Press, Amsterdam, 2008;:237-250
• Ceusters W, Capolupo M, Smith B, De Moor G. An Evolutionary Approach to the Representation of Adverse Events. In: Medical Informatics Europe 2009, Sarajevo, Bosnia and Herzegovina, August 31, 2009. Studies in health technology and informatics 150;:537-541

[Figure: ReMINE Taxonomy, Annotated Events, Risk Manager’s Event Administration System]

Part of the ReMINE Domain Ontology

ReMINE’s RT-compatible event registration
• an incident (#1) that happened at time t2 to a patient (#2) after some intervention (#3 at t1)
• is judged at t3 to be an adverse event, thereby giving rise to a belief (#4) about #1 on the part of some person (#5, a caregiver as of time t6).
• This requires the introduction (at t4) of an entry (#6) in the adverse event database (#7, installed at t0).

Advantages
• Synchronisation of two distinct representations of the same reality:
  – taxonomies:
    • user-oriented view
    • data annotation
  – ontologies:
    • realism-based view
    • unconstrained reasoning
• Domain ontology compatible with OBO-Foundry ontologies:
  – no overlap,
  – easier to re-use.
• Not only tracking of incidents, but also:
  – how well individual clinicians and organizations manage adverse events,
  – how well one learns from past experiences.

(3)
Over the past 15 years, nearly 500 genes that contribute to inherited eye diseases have been identified. Disease-causing mutations are associated with many ocular diseases, including glaucoma, cataracts, strabismus, corneal dystrophies and a number of forms of retinal degenerations.
This remarkable new genetic information highlights the significant inroads that are being made in understanding the medical basis of human ophthalmic diseases. As a result, gene-based therapies are actively being pursued to ameliorate ophthalmic genetic diseases that were once considered untreatable.

eyeGENE™ core medical data schema
[Figure: schema diagram; recoverable table names: Patient, Clinical Encounter, Patient Clinical Finding, Patient Diagnosis, Diagnosis, Clinical Finding, Diagnosis Finding Link, Clinical Finding Unit Link, Units, Specimen, Lab Result]
Ceusters W. Providing a Realist Perspective on the eyeGENE Database System. In: Smith B. (ed.) Proceedings of the International Conference on Biomedical Ontologies (ICBO), Buffalo, NY, July 23-26, 2009;67-70.

Some recommendations (1)
• For each table, data field and associated allowed values, and each hard- or soft-coded business rule that restricts data input,
  1. assess what (type of) entity in reality would be denoted by any data instance,
     – includes any ‘value’ from ‘value sets’, external terminologies, etc.
  2. represent how these entities in reality relate to each other as well as to other ontologically relevant entities that are not explicitly addressed in the information model,
     • the domain model proper,
       – based on realism-based ontologies
  3. describe formally how the information model has to be interpreted in terms of the domain model.
     – ‘interpretation model’

Some recommendations (2)
• The (relevant parts of the) interpretation model should be part of any information exchange.
• Change user interfaces and information model only when no ‘realist interpretation’ is possible or faithful data entry cannot be achieved.
  – certain fields should not be ‘required’,
  – formatting, e.g. phone numbers, is acceptable in a user interface when it satisfies local situations (not ‘requirements’), but not for exchange,
  – ‘unknown’ and ‘null values’ are acceptable, if suitable interpretations are provided in the interpretation model, not just as text in data-dictionaries.

(4) Translational data warehousing

Today’s data generation and use
[Figure: cycle; recoverable labels: observation & measurement, data organization, model development, use, outcome, add, Δ = (instrument and study optimization), verify, further R&D, Generic beliefs, application]

Key components
• Players
• HIT
• Outcomes
[Figure: data, information, knowledge and hypotheses linked by ‘generates’ and ‘influences’; a second build of the slide adds ‘reality’ and ‘representation’ linked by ‘about’]

Current deficiencies
• At the level of first-order reality:
  – Desired outcomes different for distinct players
    • Competing interests
  – Multitude of HIT applications and paradigms
• At the level of representations:
  – Variety of formats
  – Silo formation (incompatible representations, privacy)
  – Doubtful semantics
• In their interplay:
  – Very poor provenance or history keeping
  – No formal link with what the data are about
  – Low quality

General principles of RT-enabled data warehousing (1)

General principles of RT-enabled data warehousing (2)
• Unique identifier for:
  – each data-element and combinations thereof (L3)
  – what the data-element is about (L1)
  – each generated copy of an existing data-element (L3)
  – each transaction involving data-elements (L3)
• Identifiers centrally managed in RTS
• Exclusive use of ontologies for type descriptions following OBO-Foundry principles
• Centrally managed data dictionaries, data-ownership, exchange criteria

General principles of RT-enabled data warehousing (3)
• Central inventory of ‘attributes’ but peripheral maintenance of ‘values’
• Identifiers function as pseudonyms
  – centrally known that for person IUI-1 there are values about instances of UUI-2 maintained by researcher/clinician IUI-3 for periods IUI-4, IUI-5, …
• Disclosure of what the identifiers stand for based on need and right to know
• Generation of off-line datasets for research with transaction-specific identifiers for each element

Feedback to clinical care
• Finding ‘similar’ patient cases
  – suggestions for prevention, investigation, treatment
• ‘Outbreak’ detection
• Comparing outcomes
  – related to disorders, providers, treatments, …
• Links to literature
• Clinical trial selection
• …

Summary
• Referent tracking breaks with ‘traditional’ information management
• Visionary or Folie à deux?
  – work thus far primarily theoretical
    • successful in finding problems and suggesting solutions, but not yet large scale implementations
  – a lot of redundancy and overhead
  – simple algorithms but huge search space
• It took barcodes 15 years to become accepted, thus time is in our favor.
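
The pseudonym principle from ‘General principles of RT-enabled data warehousing (3)’ can be sketched as follows. This is a minimal illustration, not a description of the actual RTS: the `CentralRegistry` class, its methods, the row field names and the `TX-…` identifier format are all assumptions made for the example. It shows one way an off-line research dataset could receive transaction-specific identifiers while the mapping back to the original identifiers stays central and is disclosed only on need and right to know.

```python
import uuid


class CentralRegistry:
    """Hypothetical stand-in for a central service that manages identifiers."""

    def __init__(self):
        # transaction id -> {export identifier -> original identifier}
        self._disclosure = {}

    def export_dataset(self, rows):
        """Copy rows for off-line research, replacing every identifier with
        one that is valid for this export transaction only."""
        transaction_id = f"TX-{uuid.uuid4()}"
        forward = {}  # original identifier -> transaction-specific identifier
        exported = []
        for row in rows:
            new_row = {}
            for field, original in row.items():
                if original not in forward:
                    forward[original] = f"{transaction_id}/{len(forward) + 1}"
                new_row[field] = forward[original]
            exported.append(new_row)
        # Keep the reverse mapping centrally; it never leaves the registry.
        self._disclosure[transaction_id] = {v: k for k, v in forward.items()}
        return transaction_id, exported

    def disclose(self, transaction_id, export_id, has_right_to_know):
        """Reveal the original identifier only with need and right to know."""
        if not has_right_to_know:
            raise PermissionError("disclosure requires need and right to know")
        return self._disclosure[transaction_id][export_id]


# Rows mirror the slide's example: person IUI-1, values about instances of
# UUI-2, maintained by IUI-3, for periods IUI-4 and IUI-5.
ROWS = [
    {"person": "IUI-1", "about": "UUI-2", "maintainer": "IUI-3", "period": "IUI-4"},
    {"person": "IUI-1", "about": "UUI-2", "maintainer": "IUI-3", "period": "IUI-5"},
]
```

Within one export the same person keeps the same pseudonym, so the dataset stays analysable; across two exports the pseudonyms differ, so separately released datasets cannot be joined without going back to the registry.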