Image Ontologies - bioontology.org


A biological ontology is:
- A machine-interpretable representation of some aspect of biological reality
  - what kinds of things exist?
  - what are the relationships between these things?
[Figure: example terms “eye disc”, “sense organ”, “eye” and “ommatidium”, connected by the relations develops from, is_a and part_of]
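
A minimal sketch of how such a fragment could be held as typed edges that a machine can traverse. The edge directions (ommatidium part_of eye, eye is_a sense organ, eye develops_from eye disc) are an assumption read off the flattened figure, and the function names are illustrative only.

```python
from collections import defaultdict

# Typed edges (subject, relation, object). Directions are assumed from the figure.
EDGES = [
    ("eye", "is_a", "sense organ"),
    ("ommatidium", "part_of", "eye"),
    ("eye", "develops_from", "eye disc"),
]

def build_index(edges):
    """Index edges by subject: subject -> list of (relation, object)."""
    index = defaultdict(list)
    for subject, relation, obj in edges:
        index[subject].append((relation, obj))
    return index

def reachable(index, start, relations):
    """Return every term reachable from `start` over the given relation types."""
    seen = set()
    stack = [start]
    while stack:
        term = stack.pop()
        for relation, obj in index.get(term, []):
            if relation in relations and obj not in seen:
                seen.add(obj)
                stack.append(obj)
    return seen

index = build_index(EDGES)
# Which terms is an ommatidium related to through part_of and is_a links?
print(sorted(reachable(index, "ommatidium", {"is_a", "part_of"})))
# prints ['eye', 'sense organ']
```

The same traversal answers the two questions on the slide: the node set says what kinds of things exist, and the edge types say how they are related.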
Following basic rules helps make better ontologies
- Ontologies must be intelligible both to humans (for annotation) and to machines (for reasoning and error-checking)
- Unintuitive rules for classification lead to entry errors (problematic links)
- Facilitate training of curators
- Overcome obstacles to alignment with other ontology and terminology systems
- Enhance harvesting of content through automatic reasoning systems
Animal disease models
Humans                       Animal models
Mutant Gene                  Mutant Gene
Mutant or missing Protein    Mutant or missing Protein
Mutant Phenotype (disease)   Mutant Phenotype (disease model)
[Figure panels: SHH-/+, SHH-/-, shh-/+, shh-/-]
Phenotype (clinical sign) = entity + attribute
P1 = eye + hypoteloric
P2 = midface + hypoplastic
P3 = kidney + hypertrophied
ZFIN (entity): eye, midface, kidney
PATO (attribute): hypoteloric, hypoplastic, hypertrophied
Phenotype (clinical sign) = entity + attribute
Entity:
- Anatomical ontology
- Cell & tissue ontology
- Developmental ontology
- Gene ontology (biological process, molecular function, cellular component)
Attribute:
- PATO (phenotype and trait ontology)
Phenotype (clinical sign) = entity + attribute
P1 = eye + hypoteloric
P2 = midface + hypoplastic
P3 = kidney + hypertrophied
Syndrome (disease) = P1 + P2 + P3 = holoprosencephaly
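
A small sketch of the “Syndrome = P1 + P2 + P3” idea, treating a syndrome as a set of (entity, attribute) pairs that can be compared with the phenotypes recorded for an animal model. The observed list below is hypothetical and only illustrates the comparison; it is not real annotation data.

```python
# Each phenotype is an (entity, attribute) pair, as on the slide.
P1 = ("eye", "hypoteloric")
P2 = ("midface", "hypoplastic")
P3 = ("kidney", "hypertrophied")

# Syndrome = P1 + P2 + P3
HOLOPROSENCEPHALY = frozenset({P1, P2, P3})

def shared_phenotypes(syndrome, observed):
    """Return the syndrome phenotypes that also appear in an observed set."""
    return syndrome & frozenset(observed)

def overlap_fraction(syndrome, observed):
    """Fraction of the syndrome's phenotypes found in the observed set."""
    return len(shared_phenotypes(syndrome, observed)) / len(syndrome)

# Hypothetical observations for an animal model (illustrative only).
model_observed = [
    ("eye", "hypoteloric"),
    ("midface", "hypoplastic"),
    ("fin", "irregular shape"),
]

print(sorted(shared_phenotypes(HOLOPROSENCEPHALY, model_observed)))
print(overlap_fraction(HOLOPROSENCEPHALY, model_observed))
```

Because both sides use the same entity and attribute vocabularies, the comparison is a plain set intersection rather than free-text matching.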
[Figure panels: Human holoprosencephaly; Zebrafish shh; Zebrafish oep]
EA model
entity          attribute              attribute (value)
fin             shape                  irregular shape
eye             color hue              blue
mesenchyme      relative thickness     thin
brain           structure              fused
retinal cells   relative orientation   disoriented
Proposed schema
Association = Genotype Phenotype Environment Assay
Phenotype = Stage* Entity Attribute Value
Entity = OBOClassID
Attribute = PATOVersion2ClassID
Monadic and relational attributes
- Monadic:
  - the quality/attribute inheres in a single entity
- Relational:
  - the quality/attribute inheres in two or more entities
  - sensitivity of an organism to a kind of drug
  - sensitivity of an eye to a wavelength of light
- Relational attributes can be turned into cross-product monadic attributes
  - e.g. sensitivityToRedLight
- Better to use relational attributes
  - avoids redundancy with existing ontologies
Incorporating relational attributes
Association = Genotype Phenotype Environment Assay
Phenotype = Stage* Entity Attribute Entity*
Entity = OBOClassID
Attribute = PATOVersion2ClassID
Example data record:
Phenotype = “organism” sensitiveTo “puromycin”
Measurable attributes
- Some attributes are inexact and implicitly relative to a wild-type or normal attribute
  - relatively short, relatively long, relatively reduced
  - easier than explicitly representing:
    - this tail length shorter-than ‘canonical mouse’ wild-type tail length
- Some attributes are determinable
  - use a measure function
    - unit, value, {time}
  - this tail has length L
    - measure(L, cm) = 2
- Keep measurements separate from (but linked to) attribute ontology
Incorporating measurements
Association = Genotype Phenotype Environment Assay
Phenotype = Stage* Entity Attribute Entity* Measurement*
Measurement = Unit Value (Time)
Entity = OBOClassID
Attribute = PATOVersion2ClassID
Example data record:
Phenotype = “gut” “acidic” Measurement = “pH” 5
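
The extended schema can be sketched as plain data classes. This is one possible encoding under stated assumptions, not the project's actual data model: the class and field names are invented for illustration, and labels stand in for the OBO and PATO class IDs the schema calls for.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Measurement:
    """Measurement = Unit Value (Time); time is optional."""
    unit: str
    value: float
    time: Optional[str] = None

@dataclass(frozen=True)
class Phenotype:
    """Phenotype = Stage* Entity Attribute Entity* Measurement*."""
    entity: str                                   # would be an OBO class ID
    attribute: str                                # would be a PATO class ID
    related_entities: Tuple[str, ...] = ()        # Entity*: relational attributes
    measurements: Tuple[Measurement, ...] = ()    # Measurement*
    stages: Tuple[str, ...] = ()                  # Stage*

def describe(phenotype):
    """Render a phenotype record as a short human-readable string."""
    parts = [phenotype.entity, phenotype.attribute, *phenotype.related_entities]
    parts += [f"[{m.unit} {m.value}]" for m in phenotype.measurements]
    return " ".join(parts)

# The two example records from the slides.
drug = Phenotype("organism", "sensitiveTo", related_entities=("puromycin",))
gut = Phenotype("gut", "acidic", measurements=(Measurement("pH", 5),))

print(describe(drug))  # prints: organism sensitiveTo puromycin
print(describe(gut))   # prints: gut acidic [pH 5]
```

Keeping `Measurement` as its own type mirrors the advice to keep measurements separate from, but linked to, the attribute ontology.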
Composite phenotype classes
- The Mammalian Phenotype ontology has composite phenotype classes
  - e.g. “reduced B cell number”
- Compose at annotation time or ontology curation time?
  - False dichotomy
- Core 2 will help map between composite class based annotation and EA annotation
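
A sketch of what a mapping between composite classes and EA annotations might look like. The decomposition of “reduced B cell number” into an entity and an attribute label is an assumption made for illustration; the names below are hypothetical, not actual ontology terms or tool APIs.

```python
# Composite class label -> (entity, attribute).
# The decomposition shown is assumed for illustration.
COMPOSITE_TO_EA = {
    "reduced B cell number": ("B cell", "reduced number"),
}

# Inverse lookup, built once from the forward table.
EA_TO_COMPOSITE = {ea: label for label, ea in COMPOSITE_TO_EA.items()}

def to_ea(label):
    """Decompose a composite class label into an (entity, attribute) pair."""
    return COMPOSITE_TO_EA[label]

def to_composite(entity, attribute):
    """Return the composite class label for an EA pair, or None if unmapped."""
    return EA_TO_COMPOSITE.get((entity, attribute))

print(to_ea("reduced B cell number"))
print(to_composite("B cell", "reduced number"))
```

With a table like this, it matters less whether composition happens at annotation time or at curation time, since either form can be converted into the other.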
Interpreting annotations
- Annotations are data records
  - typically use class IDs
  - implicitly refer to instances
- How do we map an annotation to instances?
  - Important for using annotations computationally
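
One possible mapping, sketched under the assumption that an EA annotation implies a single anonymous instance of the entity class that is part of the organism and bears the quality. The function name and the triple layout are illustrative only.

```python
def interpret(organism, entity, attribute, instance="x"):
    """Expand an EA annotation into instance-level statements.

    Assumed reading: organism has_part x, x instance_of E, x has_quality A.
    """
    return [
        (organism, "has_part", instance),
        (instance, "instance_of", entity),
        (instance, "has_quality", attribute),
    ]

# E="brain", A="fused"
for triple in interpret("this organism", "brain", "fused"):
    print(triple)
```

The class IDs in the record never appear as subjects here; they only type the implied instance `x`, which is what “implicitly refer to instances” amounts to in this reading.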
Interpreting annotations (I)
- What does an EA (or EAV) annotation mean?
  - Annotation:
    - Genotype=“FBal00123” E=“brain” A=“fused”
  - presumed implied meaning:
    - this organism has_part x, where
      x instance_of “brain”
      x has_quality “fused”
  - or in natural language:
    - “this organism has a fused brain”
- Various built-in assumptions
Interpreting annotations (II)
- What does this mean:
  - annotation:
    - Genotype=“FBal00123” E=“wing” A=“absent”
  - using same mapping as annotation I:
    - this organism has_part x, where
      x instance_of “wing”
      x has_quality “absent”
  - or in natural language:
    - this fly has a wing which is not there (!)
- What we really intend, either:
  - NOT(this organism has_part x, where x instance_of “wing”)
  - this organism has_quality “wingless”
    - “wingless” = the property of having count(has_part “wing”) = 0
- Are our computational representations intended to capture linguistic statements or reality?
Does this matter?
- Logical reasoners will compute incorrect results
  - unless explicitly provided with specific rules for certain attributes such as “absent”
- What are the consequences?
  - Basic search will be fine
    - e.g. “find all wing phenotypes”
  - But computers will not be able to reason correctly
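
A sketch of the difference between basic search and reasoning, with one explicit rule for “absent”. Handling such attributes through a lookup set is an assumption, not a described implementation, and the second annotation (“curled”) is a hypothetical record added for contrast.

```python
# Attributes that deny the existence of the entity instead of describing it.
# "absent" comes from the slides; the lookup-set approach is assumed.
EXISTENCE_NEGATING = {"absent"}

def interpret(organism, entity, attribute):
    """Return (statements, entity_exists) for one EA annotation."""
    if attribute in EXISTENCE_NEGATING:
        # NOT(organism has_part x, where x instance_of entity)
        return [("NOT", organism, "has_part", entity)], False
    statements = [
        (organism, "has_part", "x"),
        ("x", "instance_of", entity),
        ("x", "has_quality", attribute),
    ]
    return statements, True

def mentions(annotations, entity):
    """Basic search: organisms annotated with `entity`, whatever the attribute."""
    return {organism for organism, e, _ in annotations if e == entity}

def organisms_with_part(annotations, entity):
    """Reasoning: organisms for which an instance of `entity` is asserted."""
    found = set()
    for organism, e, attribute in annotations:
        _, exists = interpret(organism, e, attribute)
        if e == entity and exists:
            found.add(organism)
    return found

annotations = [
    ("fly1", "wing", "absent"),
    ("fly2", "wing", "curled"),  # hypothetical annotation, for contrast
]

print(sorted(mentions(annotations, "wing")))             # prints ['fly1', 'fly2']
print(sorted(organisms_with_part(annotations, "wing")))  # prints ['fly2']
```

Search for wing phenotypes returns both flies either way; only the question “which flies have a wing?” depends on the special rule.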
Interpreting annotations (III)
- What does this mean:
  - annotation:
    - E=“digit” A=“supernumerary”
  - using same interpretation as annotation I:
    - this organism has_part x, where
      x instance_of “digit”
      x has_quality “supernumerary”
  - or in natural language:
    - this organism has a particular finger which is supernumerary
- What we really intend:
  - this person has_quality “supernumerary finger”
  - “supernumerary finger” = the property of having count(has_part “digit”) > wild-type
- !!!
Interpreting annotations (IV)
- What does this mean:
  - annotation:
    - Gt=“mp001” E=“brown fat cell” A=“increased quantity”
  - using same mapping as annotation I:
    - this organism has_part x, where
      x instance_of “brown fat cell”
      x has_quality “increased quantity”
  - or in natural language:
    - this organism has a particular brown fat cell which is increased in quantity
- What we really intend:
  - this organism has_part population_of(“brown fat cell”) which has_quality increased size
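
Attributes such as “supernumerary” and “increased quantity” describe how many instances there are, not a property of any one instance. A minimal sketch of that reading follows, comparing count(has_part E) with a wild-type reference. Treating “increased quantity” as a count comparison is an assumption (the slide phrases it as the size of a population), and the counts used are hypothetical.

```python
# Count-based attributes compare count(has_part E) with a wild-type reference.
# The rule table is illustrative; the counts passed in below are hypothetical.
COUNT_RULES = {
    "supernumerary": lambda observed, wild_type: observed > wild_type,
    "increased quantity": lambda observed, wild_type: observed > wild_type,
}

def holds(attribute, observed, wild_type):
    """True if the count-based attribute holds for the observed count."""
    try:
        rule = COUNT_RULES[attribute]
    except KeyError:
        raise ValueError(f"no count rule for attribute {attribute!r}")
    return rule(observed, wild_type)

print(holds("supernumerary", observed=6, wild_type=5))  # prints True
print(holds("supernumerary", observed=5, wild_type=5))  # prints False
```

The quality is evaluated against the organism's whole set of digits or cells, which is why attaching it to a single instance `x` gives the wrong meaning.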
Other use cases
- spermatocyte devoid of asters
- Homeotic transformations
- increased distance between wing veins
- Some vs all
Alternate perspectives
- process vs state
  - regulatory processes:
    - acidification of midgut has_quality reduced rate
    - midgut has_quality low acidity
- development vs behavior
  - wing development has_quality abnormal
  - flight has_quality intermittent
- granularity (scale)
  - chemical vs molecular vs cell vs tissue vs anatomical part
Summary
- Define attributes in terms of instances
- Evaluate proposed new schema
  - measurement proposal
  - relational attribute proposal
- Complexity trade-off
  - create library of use cases
  - Core 2 will create tools to present user-friendly layer
- Alternate perspective annotations are useful
Before: domain knowledge is embedded in the db schema
[Figure: separate Gene table, Exon table, RNA table and Protein table]
After: domain knowledge is embedded in the ontology
[Figure: a single feature table]
Ontology driven db schema is less expensive to maintain
- The logical description and the physical database description of the biology are developed independently
- Therefore new biological knowledge will only require:
  - Ontology changes: e.g. new terms
  - GUI changes: display
  - No schema changes
  - No query changes
  - No middleware changes
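
A small SQLite sketch of the “after” design: one generic feature table whose rows are typed by ontology terms, so that a new kind of feature is added as a term and not as a table. The table layout, the root term “feature”, the term “miRNA” and the feature names are all hypothetical and chosen only to illustrate the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE term (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE is_a (child INTEGER REFERENCES term(id),
                   parent INTEGER REFERENCES term(id));
CREATE TABLE feature (id INTEGER PRIMARY KEY, name TEXT,
                      type_id INTEGER REFERENCES term(id));
""")

def term_id(name):
    """Look up the ID of an ontology term by name."""
    return conn.execute("SELECT id FROM term WHERE name = ?", (name,)).fetchone()[0]

def add_term(name, parent=None):
    """Ontology change: add a term, optionally as a child of an existing one."""
    cur = conn.execute("INSERT INTO term (name) VALUES (?)", (name,))
    if parent is not None:
        conn.execute("INSERT INTO is_a VALUES (?, ?)", (cur.lastrowid, term_id(parent)))

def add_feature(name, type_name):
    """Data capture: store a feature row typed by an ontology term."""
    conn.execute("INSERT INTO feature (name, type_id) VALUES (?, ?)",
                 (name, term_id(type_name)))

def features_of_type(type_name):
    """All features typed by `type_name` or by any of its is_a descendants."""
    rows = conn.execute("""
        WITH RECURSIVE sub(id) AS (
            SELECT id FROM term WHERE name = ?
            UNION
            SELECT is_a.child FROM is_a JOIN sub ON is_a.parent = sub.id
        )
        SELECT feature.name FROM feature JOIN sub ON feature.type_id = sub.id
        ORDER BY feature.name
    """, (type_name,)).fetchall()
    return [row[0] for row in rows]

add_term("feature")
for kind in ("gene", "exon", "RNA", "protein"):
    add_term(kind, parent="feature")
add_feature("g1", "gene")
add_feature("r1", "RNA")

# New biological knowledge arrives as a new term; schema and query stay the same.
add_term("miRNA", parent="RNA")
add_feature("m1", "miRNA")
print(features_of_type("RNA"))  # prints ['m1', 'r1']
```

The query for RNA features picks up the new subtype without being rewritten, which is the maintenance saving the slide describes.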
Step 1: Build an ontology that reflects reality
Step 2: Data capture
Database: UIDs serving as proxies for instances
Step 3: Classify data using the ontology
Ontologies must adapt over time
- Getting it right
  - It is impossible to get it right the 1st (or 2nd, or 3rd, …) time.
- What we know about biology is continually growing
- This “standard” requires versioning.
[Figure: cycle of Improve, Collaborate and Learn]
Image Ontologies
Matthew Fielding
From RadLex to RadiO
- A unified language for radiology information sources (e.g. teaching files, research data, and radiology reports).
- Will describe all the salient aspects of an imaging examination (e.g., modality, technique, visual features, anatomy, and pathology).
- Will emphasize adoption or linkage to established terminology and standards when possible, such as the ACR Index, SNOMED, the Unified Medical Language System (UMLS), the Fleischner Society Glossaries, and DICOM.
- Will be used to organize and retrieve radiology images.
Image Ontologies
C. Forbes Dewey
Experibase
- A common technology that will capture data from all of the major experimental systems generating biological data.
- Implementing it for gel electrophoresis, microarrays, fluorescence-activated cell sorting, mass spectrometry and optical microscopy.
- Coordinating with the Interoperable Informatics Infrastructure Consortium (I3C)
- Will be used to organize and interrogate these experimental data
Image Ontologies
Bill Lorensen
Image Ontologies
William Bug
Image Ontology Requirements
- Linking databases created at multiple centers concerned with human disease and associated animal models.
- BIRN Ontology Task Force (OTF) reviews different ontological reference interpretations by its audience: anatomists, clinicians, genomics, pathologists, diagnosticians, and neurologists
- Using existing ontologies, tools, and formalisms wherever possible and extending them only as necessary. Any ontology work performed by BIRN should be aligned with other efforts and provided back to the maintainers
- Developing a set of ontologies that are approved for use and a set of policies and procedures for extensions
Image Ontologies
Louis Goldberg
On Reasoning with Images
- What different approaches are available for spatial, temporal, and spatio-temporal representation and reasoning formalisms used in computer applications?
- What is the expressive power of those formalisms?
- Formalizations for commonsense reasoning about space and time.
- Formalisms for the representation of vagueness