Data and Knowledge Representation Lecture 3
Qing Zeng, Ph.D.
Last Time We Talked About
Boolean Algebra
Predicate Logic (First order logic)
Today We Will Talk About
Ontology
Major KR Schemes
Tell me what’s in this room
Tables, chairs, windows, computers, papers, pens, people, etc.
We can write ∃x ∃y (Table(x) ∧ Room(y) ∧ In(x, y))
But what is a table? What is a room?
Logic has no vocabulary of its own
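One reading of the formula on this slide is "there exist x and y such that x is a table, y is a room, and x is in y". A minimal sketch of checking that reading over a small finite domain follows. The individuals and the contents of each set are hypothetical. The point it illustrates is that the logic only supplies the quantifiers and connectives; what counts as a Table or a Room has to be supplied from outside.

```python
# Hypothetical toy domain: these individuals are illustrative, not from the lecture.
domain = ["table1", "chair1", "room101"]

# Extensions of the predicates: which individuals satisfy each one.
table = {"table1"}
room = {"room101"}
inside = {("table1", "room101"), ("chair1", "room101")}


def some_table_in_some_room(domain, table, room, inside):
    """Evaluate: exists x, exists y such that Table(x) and Room(y) and In(x, y)."""
    return any(
        x in table and y in room and (x, y) in inside
        for x in domain
        for y in domain
    )


result = some_table_in_some_room(domain, table, room, inside)
print(result)  # True
```

With an empty `table` set the same function returns False: the formula itself never says what a table is.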
Ontology Fills the Gap
Ontology is the study of existence: of all kinds of existence and all kinds of entities
It supplies the predicates of predicate logic and the labels that fill the boxes and circles of conceptual graphs
Webster’s Definition of Ontology
“1 : a branch of metaphysics concerned
with the nature and relations of being
2 : a particular theory about the nature of
being or the kinds of existents” -http://www.webster.com/cgi-bin/dictionary
My Simplified Understanding
Ontology seeks to describe entities through
classification of relations among entities
Domain ontology limits its scope to a specific domain, such as medicine
In informatics, we further limit domain ontology to what is needed by an application or a certain kind of application, such as clinical guidelines or retrieval of pathology information
Why Ontology in Biomedical Domain
Encode data
E.g.
Patient A is diabetic and HIV positive
Represent knowledge
E.g.
Blood Glucose test is a diagnostic test for
diabetes.
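The two uses above can be sketched with the same simple structure: statements about an individual patient (data) and statements about concepts (knowledge). The triple layout, the relation names `has_condition` and `diagnostic_test_for`, and the helper `relevant_tests` are assumptions made for illustration only.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    relation: str
    obj: str


# Data: statements about one individual patient.
data = [
    Triple("Patient A", "has_condition", "diabetes"),
    Triple("Patient A", "has_condition", "HIV positive"),
]

# Knowledge: statements about concepts, independent of any patient.
knowledge = [
    Triple("blood glucose test", "diagnostic_test_for", "diabetes"),
]


def relevant_tests(patient, data, knowledge):
    """Return the tests that are diagnostic for any condition the patient has."""
    conditions = {
        t.obj for t in data
        if t.subject == patient and t.relation == "has_condition"
    }
    return sorted(
        t.subject for t in knowledge
        if t.relation == "diagnostic_test_for" and t.obj in conditions
    )


print(relevant_tests("Patient A", data, knowledge))  # ['blood glucose test']
```

The join only works because both lists use the same term, "diabetes", for the same concept; that shared vocabulary is what the ontology provides.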
Sources of Ontology
Observation: provides knowledge of the physical world
Reasoning: makes sense of observation by generating a framework of abstractions called “metaphysics”
Ontology Development in Biomedical Domain
Areas that directly involve ontology
Data model
Vocabulary/terminology
Knowledge-based system
Philosopher’s Approach to Ontology
Top-down
Concerned with the entire universe
Build top-level ontology first
Long history
Lao Zi (Book of Tao)
Plato
Aristotle
Kant (1787)
Computer/Information Science’s Approach
Bottom-up
Start with a limited world or specific applications
Exception: Cyc system
Designed with computing in mind
Short history
First use of the term “ontology” in the computer science community: McCarthy, J. 1980 “Circumscription – A Form of Non-Monotonic Reasoning”, Artificial Intelligence, 5: 13, 27–39.
Problems Faced by Computer/Information Scientists
Tower of Babel
Ontologies used/developed by different groups for applications
Terminological and conceptual incompatibilities
Problems arise in system development and maintenance as well as in data/knowledge exchange
Insufficient expressive power
Example
Problem Oriented Medical Record
Weed LL. Medical records that guide and teach. 1968. MD Comput. 1993 Mar-Apr;10(2):100-14.
Where “SOAP” comes from…
The gist: organizing medical data/information by patient problem
Many EMRs have a place for a “problem list”
Example
Which of the following is a “problem”?
Cough
Anxiety
Pregnancy
Sleep disorder
Rash
Physicians cannot agree
Cited by a number of POEMRs as one of the reasons for failure
Another Example
What does “acute” mean?
Sharpness or severity, e.g. acute pain
Having a sudden onset, sharp rise, and short course, e.g. acute pancreatitis
In a data model for findings, we had severity as an attribute. Thus we needed to decide where “acute” fits in.
To Solve the Problem
Develop formalism for sharing (e.g. KIF,
CGIF)
Develop standard ontology
Develop new formalism to increase
expressive power
Ontological Categories
Making a choice of ontological categories is the first step in system design (John Sowa)
An ontological category is called a
“Class” in OO systems
“Domain” in database theory
“Type” in AI theory
“Type” or “sort” in logic
Brentano’s tree of Aristotle’s Categories
Being
  Substance
  Accident
    Property
      Inherence
      Directedness
      Containment
    Relation
CYC Ontology
Thing
Individual Object
Intangible
Represented Thing
Relationship
Event
Stuff
IntangibleObject
Collection
Contrast -> Distinction
All perceptions start with contrast
Bright – dark
Tall – short
Healthy – ill
Happy – sad
Distinction (discrete/continuous): conceptual interpretation of perceptual contrasts
Distinction -> Categories
Distinctions may be combined to generate categories. E.g.
Classify patients.
Distinctions: (insured, uninsured), (inpatient, outpatient), (infant, child, adult), (emergency, urgent, general)…
Categories: insured pediatric emergency patient, uninsured adult inpatient…
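Combining distinctions into categories amounts to taking a cross product of the distinctions. A minimal sketch, assuming the four distinctions listed on the slide; the dictionary keys (`coverage`, `setting`, `age`, `acuity`) are hypothetical names for the distinctions.

```python
from itertools import product

# Each distinction is a set of mutually contrasting values.
distinctions = {
    "coverage": ["insured", "uninsured"],
    "setting": ["inpatient", "outpatient"],
    "age": ["infant", "child", "adult"],
    "acuity": ["emergency", "urgent", "general"],
}

# Every combination of one value per distinction is a candidate category.
categories = [
    " ".join(values) + " patient"
    for values in product(*distinctions.values())
]

print(len(categories))  # 36
print(categories[0])    # insured inpatient infant emergency patient
```

Four small distinctions already yield 2 × 2 × 3 × 3 = 36 candidate categories, which is why the choice of distinctions matters so early in system design.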
Sowa’s Ontology (Peirce and Whitehead)
AXIOMS:
Physical: physical entities have a location in space and a point in time. E.g. hand, hair, computer.
Abstract: abstract entities do not have a location in space or a point in time. E.g. theorem, knowledge, story.
Sowa’s Ontology
AXIOMS:
Independent: independent entities can exist without
being dependent on the existence of another entity.
E.g. person, diary, song.
Relative: relative entities require the existence of
some other entity. E.g. joints between bones, middle
child, remission after a disease episode.
Mediating: mediating entities require the existence of (at least) two other entities and establish a new relationship among them. E.g. theory of relativity, diagnostic strategy, cardiovascular system.
Sowa’s Ontology
AXIOMS:
Continuant: has only spatial parts and no temporal parts; identity cannot depend on location in space and time. E.g. gender, alert and reminder system, medication formula.
Occurrent: has both spatial parts (participants) and temporal parts (stages); can only be identified by location in space and time. E.g. disease episode, clinical event, medication order.
Matrix of Central Categories
              Physical                     Abstract
              Continuant   Occurrent       Continuant    Occurrent
Independent   Object       Process         Schema        Script
Relative      Juncture     Participation   Description   History
Mediating     Structure    Situation       Reason        Purpose
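The matrix can be read as a lookup from three distinctions to one of twelve central categories. A minimal sketch, assuming the cell layout of Sowa's matrix (rows Independent/Relative/Mediating, columns Physical/Abstract split into Continuant/Occurrent); the function name `central_category` and the lowercase keys are hypothetical.

```python
# (dependence, nature, persistence) -> central category, assumed from Sowa's matrix.
CENTRAL_CATEGORIES = {
    ("independent", "physical", "continuant"): "Object",
    ("independent", "physical", "occurrent"): "Process",
    ("independent", "abstract", "continuant"): "Schema",
    ("independent", "abstract", "occurrent"): "Script",
    ("relative", "physical", "continuant"): "Juncture",
    ("relative", "physical", "occurrent"): "Participation",
    ("relative", "abstract", "continuant"): "Description",
    ("relative", "abstract", "occurrent"): "History",
    ("mediating", "physical", "continuant"): "Structure",
    ("mediating", "physical", "occurrent"): "Situation",
    ("mediating", "abstract", "continuant"): "Reason",
    ("mediating", "abstract", "occurrent"): "Purpose",
}


def central_category(dependence, nature, persistence):
    """Look up the category selected by one value from each distinction."""
    return CENTRAL_CATEGORIES[(dependence, nature, persistence)]


print(central_category("independent", "physical", "occurrent"))  # Process
print(central_category("mediating", "abstract", "continuant"))   # Reason
```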
Exercise
Assume you are developing an alert system
to monitor errors in laboratory information
systems. Identify some distinctions for
categorizing the errors and describe which
distinctions are in contrast with which
other distinctions.
Semantic Network
A long-standing notion: there are different pieces of knowledge about the world, and they are all linked together through certain semantics.
Basic Components
Nodes: represent concepts
Arcs: represent relations
Labels for nodes and arcs
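The three components (nodes, arcs, labels) map directly onto a labeled directed graph. A minimal sketch; the class name `SemanticNetwork`, its methods, and the example concepts are hypothetical and only illustrate the structure.

```python
from collections import defaultdict


class SemanticNetwork:
    """Nodes are concept labels; arcs are (label, target) pairs per source node."""

    def __init__(self):
        self._arcs = defaultdict(set)

    def add(self, source, label, target):
        """Add a directed, labeled arc from source to target."""
        self._arcs[source].add((label, target))

    def related(self, source, label):
        """Return the targets reachable from source over arcs with this label."""
        return sorted(t for (l, t) in self._arcs[source] if l == label)


net = SemanticNetwork()
net.add("blood glucose test", "IS A", "diagnostic test")
net.add("blood glucose test", "MEASURES", "blood glucose")
net.add("diabetes", "IS A", "disease")

print(net.related("blood glucose test", "MEASURES"))  # ['blood glucose']
```

Nothing in this structure constrains which labels may be used or what they mean, which is the "little constraint" property of plain semantic networks.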
Little Constraint
[Figure: patient, nurse, and physician nodes, with an “Interact” arc between each pair]
Little Constraint
[Figure: a web example with nodes DSG Site, Instructors’ Homepage, and Course Site, connected by “Link” arcs]
Relation
Directed or non-directed
Multiple relations between two concepts
Can have different properties
Reflexive (e.g. co-occurrence)
Transitive (e.g. causal)
Symmetric (e.g. sibling)
…
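For a relation stored as a set of (a, b) pairs, these properties can be tested mechanically. A minimal sketch using the standard set-theoretic definitions; the function names and the sample relations are hypothetical.

```python
def is_reflexive(relation, domain):
    """Every element of the domain is related to itself."""
    return all((a, a) in relation for a in domain)


def is_symmetric(relation):
    """Whenever a is related to b, b is related to a."""
    return all((b, a) in relation for (a, b) in relation)


def is_transitive(relation):
    """Whenever a -> b and b -> c are present, a -> c is present too."""
    return all(
        (a, d) in relation
        for (a, b) in relation
        for (c, d) in relation
        if b == c
    )


# Hypothetical sample relations.
sibling = {("Ann", "Bob"), ("Bob", "Ann")}
causes = {("a", "b"), ("b", "c"), ("a", "c")}

print(is_symmetric(sibling))                 # True
print(is_reflexive(sibling, {"Ann", "Bob"}))  # False
print(is_transitive(causes))                 # True
print(is_symmetric(causes))                  # False
```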
Some Often Used Relations in
Biomedical Domain
IS A
IS PART OF
CAUSE OF
MEASURES
CO-OCCURS
…………
Major Limitation
Lack of semantics
No formal semantics for the relations
E.g. does “ISA” mean subclass, member, etc.?
Possible multiple interpretations
Restricted expressiveness
E.g. cannot distinguish between instance and class
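The "ISA" ambiguity can be made concrete with Python's own type system, which keeps the two readings apart: `issubclass` for class-to-class and `isinstance` for instance-to-class. This is only an analogy, not how any particular semantic network tool works; the classes and the helper `isa_kind` are hypothetical.

```python
class Disease:
    pass


class Diabetes(Disease):
    pass


# One particular case, as opposed to the class of all diabetes cases.
case_of_patient_a = Diabetes()


def isa_kind(child, parent):
    """Say which reading of "ISA" holds between child and parent, if any."""
    if isinstance(child, type):
        return "subclass" if issubclass(child, parent) else None
    return "member" if isinstance(child, parent) else None


print(isa_kind(Diabetes, Disease))            # subclass
print(isa_kind(case_of_patient_a, Diabetes))  # member

# In a plain semantic network both facts would be drawn as the same "ISA" arc:
arcs = [
    ("Diabetes", "ISA", "Disease"),
    ("case_of_patient_a", "ISA", "Diabetes"),
]
```

Looking only at `arcs`, nothing tells a program that the first is a subclass link and the second a membership link.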
Extension
Extending expressivity (distinguish different types of concepts and relations)
Distinguish between “some” and “all”
Distinguish between “existence” and “intension”
Distinguish between “definition” and “assertion”
Add semantic rigor
Map to logic (Sowa – CG)
Frame-based Network
Distinguish instance vs. class
Hierarchical structure (superclass and
subclass)
Multiple hierarchy
Slots
Member slot
Own slot
Slot
Frame identifying information
Relationship between frames
Descriptors of requirements for frame
match
Procedural information
Default information
Restrictions and constraints
New instance information
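A minimal sketch of a frame with slots, defaults, and inheritance. It assumes one common reading of the two slot kinds: an own slot describes the frame itself and is not inherited, while a member slot is passed down to subclasses and instances. The class `Frame`, its methods, and the example frames are hypothetical.

```python
class Frame:
    def __init__(self, name, parents=(), own=None, member=None):
        self.name = name
        self.parents = tuple(parents)
        self.own = dict(own or {})        # describes this frame only
        self.member = dict(member or {})  # inherited by frames below this one

    def inherited(self, slot):
        """Find a member slot here or in ancestors (depth-first, left to right)."""
        if slot in self.member:
            return self.member[slot]
        for parent in self.parents:
            try:
                return parent.inherited(slot)
            except KeyError:
                pass
        raise KeyError(slot)

    def get(self, slot):
        """Own slots win; otherwise fall back to inherited member slots."""
        if slot in self.own:
            return self.own[slot]
        return self.inherited(slot)


lab_test = Frame(
    "LabTest",
    own={"documentation": "Any laboratory test"},
    member={"specimen": "blood", "units": "unspecified"},  # defaults
)
glucose_test = Frame("GlucoseTest", parents=[lab_test], member={"units": "mg/dL"})
result_1 = Frame("result_1", parents=[glucose_test], own={"value": 95})

print(result_1.get("specimen"))  # blood  (default inherited from LabTest)
print(result_1.get("units"))     # mg/dL  (overridden in GlucoseTest)
print(result_1.get("value"))     # 95     (own slot of the instance)
```

Asking `result_1` for `documentation` raises `KeyError`, because that slot belongs to `LabTest` itself. With several parents, the first parent that supplies a value wins, which is one simple (and debatable) way to resolve multiple inheritance.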
Strength
Help organize knowledge hierarchically
Procedural information
Support multiple inheritance
Weakness
Expressiveness (e.g. quantifier)
Inheritance
Subclassing (override slot value)
Multiple inheritance
Large complex knowledge system
Example: MED
Example: Protégé
Production Rules
Also called IF-THEN rules
Many forms:
IF condition THEN action
IF premise THEN conclusion
IF proposition p1 and proposition p2 are true THEN proposition p3 is true
Components
Rule base
Inference engine
Working memory
Inference
Modus ponens
Forward chaining
Modus tollens
Backward chaining
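The three components and forward chaining can be sketched in a few lines: the rule base is a list of (premises, conclusion) pairs, the working memory is a set of known facts, and the inference engine repeatedly applies modus ponens until nothing new can be derived. This is a naive illustration, not how production systems such as Jess are implemented; the function name and the proposition names are hypothetical.

```python
def forward_chain(rules, facts):
    """Naive forward chaining over propositional IF-THEN rules."""
    working_memory = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Modus ponens: all premises hold, so the conclusion holds.
            if conclusion not in working_memory and set(premises) <= working_memory:
                working_memory.add(conclusion)
                changed = True
    return working_memory


# Rule base: IF p1 and p2 THEN p3; IF p3 THEN p4.
rules = [
    (("p1", "p2"), "p3"),
    (("p3",), "p4"),
]

derived = forward_chain(rules, {"p1", "p2"})
print(sorted(derived))  # ['p1', 'p2', 'p3', 'p4']
```

Each pass rescans every rule, which is simple but slow on large rule bases.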
Example: MYCIN
IF the identity of the germ is not known with certainty
AND the germ is gram-negative
AND the morphology of the organism is "rod"
AND the germ is aerobic
THEN there is a strong probability (0.8) that the germ is of type Enterobacteriaceae
Example
POINT
Main Inference Control: controls the execution of the inference engine by retrieving and providing needed knowledge
Medical Knowledge Base: defines semantic relations between concepts
Jess Inference Engine: fires rules when adequate knowledge is provided
Inference Rules: define rules of relevance based on semantic relations between concepts
Example
Medical
Knowledgebase
Inference Rules
Inference Process
Inference Results
Pro and Con
Pro
Modular
Natural
Con
Not efficient
Not expressive
Exercise
The thyroid gland is located at the base of your neck in front of your
trachea (or windpipe). It has two sides and is shaped like a
butterfly.
The thyroid gland makes, stores, and releases two hormones - T4
(thyroxine) and T3 (triiodothyronine). Thyroid hormones control the
rate at which every part of your body works. This is called your
metabolism. Your metabolism controls whether you feel hot or cold
or tired or rested. When your thyroid gland is working the way it
should, your metabolism stays at a steady pace, not too fast or too slow.
If no cancer cells are found, your doctor may prescribe a thyroid
hormone to decrease the size of your nodule. Or, your doctor may
suggest surgery to remove it. If cancer cells are found, further
treatment will be needed. Thyroid cancer usually can be treated
with success.
Exercise
Which representation scheme to choose?
Reading
Sowa: Chapter 2
Sowa: Chapter 4