Relational Taxonomy Tree and BioDataBase by Huhn

Relational Taxonomy Tree and BioDataBase
by Huhn-Kie Lee
7/7/2015
1
Part I.
Relational Taxonomy Tree
2
Relational Taxonomy Tree (RTT)
• Taxonomic hierarchy
– Kingdom, phylum, class, order, family, genus, species
• A lower level inherits the properties of the higher levels:
– Inherited properties may be stored “redundantly”
• Siblings differ in some properties:
– Properties are “disparate,” so siblings need different relation schemes
3
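As a rough illustration of the two points on this slide, the sketch below declares one relation per taxon with Python's built-in sqlite3 module. The table and column names (Carnivore, Dogs, Cats, barking_sound, and so on) are assumptions modelled on the animal example used in the following slides; the talk does not give a schema definition.

```python
import sqlite3

# Assumed schema: one relation per taxon. Each relation stores only the
# attributes introduced at its own level and is keyed by speciesID.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Carnivore (
    speciesID      TEXT PRIMARY KEY,
    prey           TEXT,
    hunting_method TEXT
);
CREATE TABLE Dogs (
    speciesID     TEXT PRIMARY KEY REFERENCES Carnivore(speciesID),
    barking_sound TEXT,
    snout_size    TEXT
);
CREATE TABLE Cats (
    speciesID     TEXT PRIMARY KEY REFERENCES Carnivore(speciesID),
    meowing_sound TEXT,
    whiskers_size TEXT
);
""")

def columns(table):
    """Return the column names of a table (index 1 of each PRAGMA row)."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

# Siblings share only the key; their remaining attributes are disparate.
shared = set(columns("Dogs")) & set(columns("Cats"))
print(shared)
```

Carnivore attributes such as prey are declared once, on the parent relation, so Dogs and Cats do not repeat them.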
Relational Taxonomy Tree (RTT)
animal
  Carnivore
    dogs
      Dalmatian
      Chihuahua
    cats
      Russian cat
      Italian cat
  Herbivore
4
Relational Taxonomy Tree (RTT)
animal
  Carnivore (prey, hunting method)
    dogs
      Dalmatian
      Chihuahua
    cats
      Russian cat
      Italian cat
  Herbivore (feeding plant, chewing method)
5
Relational Taxonomy Tree (RTT)
animal
  Carnivore
    Dogs (barking sound, snout size)
      Dalmatian
      Chihuahua
    Cats (meowing sound, whiskers size)
      Russian cat
      Italian cat
  Herbivore
6
Relational Taxonomy Tree (RTT)
animal
  Carnivore
    Dogs (barking sound, snout size)
    Cats (meowing sound, whiskers size)
  Herbivore

Dogs
speciesID    bark     snout
Dalmatian    Bow-bow  3 cm
Chihuahua    Wow-wow  1 cm

Cats
speciesID    meow  whisker
Russian cat  yao   2 cm
Italian cat  mao   1 cm
7
Relational Taxonomy Tree (RTT)
animal
  Carnivore (prey, hunting method)
  Herbivore

Carnivore
speciesID    prey       hunting method
Dalmatian    groundhog  Dig out its hole
Chihuahua    rats       Bark-and-chase
Russian cat  rats       Hide-and-attack
Italian cat  squirrels  Jump-and-chase
8
Relational Taxonomy Tree (RTT)
• Vertical query:
– join a relation with its ancestor relation
– “Find the hunting method of the dog that barks ‘Bow-bow’”
(See the relations on slides 7 and 8)
• SELECT Carn.hunting_method
FROM Dogs D, Carnivore Carn
WHERE D.speciesID = Carn.speciesID AND
D.barking_sound = 'Bow-bow'
9
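The vertical query can be tried out with a minimal sketch like the one below, which loads the sample rows from the Dogs and Carnivore relations into an in-memory SQLite database. The column names barking_sound, snout_size and hunting_method and the helper names build_rtt and hunting_methods are assumptions chosen for this example.

```python
import sqlite3

def build_rtt():
    """Create in-memory Carnivore and Dogs relations holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE Carnivore (speciesID TEXT PRIMARY KEY, prey TEXT, hunting_method TEXT);
    CREATE TABLE Dogs (speciesID TEXT PRIMARY KEY, barking_sound TEXT, snout_size TEXT);
    INSERT INTO Carnivore VALUES
        ('Dalmatian',   'groundhog', 'Dig out its hole'),
        ('Chihuahua',   'rats',      'Bark-and-chase'),
        ('Russian cat', 'rats',      'Hide-and-attack'),
        ('Italian cat', 'squirrels', 'Jump-and-chase');
    INSERT INTO Dogs VALUES
        ('Dalmatian', 'Bow-bow', '3 cm'),
        ('Chihuahua', 'Wow-wow', '1 cm');
    """)
    return conn

def hunting_methods(conn, bark):
    """Vertical query: join Dogs with its ancestor relation Carnivore."""
    rows = conn.execute("""
        SELECT Carn.hunting_method
        FROM Dogs D, Carnivore Carn
        WHERE D.speciesID = Carn.speciesID
          AND D.barking_sound = ?
    """, (bark,)).fetchall()
    return [row[0] for row in rows]

print(hunting_methods(build_rtt(), "Bow-bow"))
```

The bark is passed as a bound parameter rather than spliced into the SQL text, which avoids the quoting problems of typing the string literal by hand.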
Relational Taxonomy Tree (RTT)
• Horizontal query:
– join any two relations (not necessarily at the same level)
– “Find the (barking sound, meowing sound) pairs of dogs and cats that prey on the same animal”
(See the relations on slides 7 and 8)
• SELECT D.barking_sound, C.meowing_sound
FROM Dogs D, Cats C, Carnivore Carn1, Carnivore Carn2
WHERE D.speciesID = Carn1.speciesID AND
C.speciesID = Carn2.speciesID AND
Carn1.prey = Carn2.prey
10
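A minimal sketch of the horizontal query, again on an in-memory SQLite database with the sample rows from the Dogs, Cats and Carnivore relations. Carnivore is aliased twice so that a dog's prey can be compared with a cat's prey; the column names and the helper names build_rtt and bark_meow_pairs are assumptions for this example.

```python
import sqlite3

def build_rtt():
    """Create in-memory Carnivore, Dogs and Cats relations with the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE Carnivore (speciesID TEXT PRIMARY KEY, prey TEXT, hunting_method TEXT);
    CREATE TABLE Dogs (speciesID TEXT PRIMARY KEY, barking_sound TEXT, snout_size TEXT);
    CREATE TABLE Cats (speciesID TEXT PRIMARY KEY, meowing_sound TEXT, whiskers_size TEXT);
    INSERT INTO Carnivore VALUES
        ('Dalmatian',   'groundhog', 'Dig out its hole'),
        ('Chihuahua',   'rats',      'Bark-and-chase'),
        ('Russian cat', 'rats',      'Hide-and-attack'),
        ('Italian cat', 'squirrels', 'Jump-and-chase');
    INSERT INTO Dogs VALUES ('Dalmatian', 'Bow-bow', '3 cm'), ('Chihuahua', 'Wow-wow', '1 cm');
    INSERT INTO Cats VALUES ('Russian cat', 'yao', '2 cm'), ('Italian cat', 'mao', '1 cm');
    """)
    return conn

def bark_meow_pairs(conn):
    """Horizontal query: pair dogs and cats whose Carnivore rows share a prey."""
    return conn.execute("""
        SELECT D.barking_sound, C.meowing_sound
        FROM Dogs D, Cats C, Carnivore Carn1, Carnivore Carn2
        WHERE D.speciesID = Carn1.speciesID
          AND C.speciesID = Carn2.speciesID
          AND Carn1.prey = Carn2.prey
        ORDER BY D.barking_sound, C.meowing_sound
    """).fetchall()

print(bark_meow_pairs(build_rtt()))
```

With the sample rows, only the Chihuahua and the Russian cat share a prey (rats), so a single pair is returned.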
Multiple Inheritance from same-level parents
animal
  Carnivore (prey, hunting method)
  Herbivore (feeding plant, chewing method)
Omnivore (prey, hunt, plant, chew): child of both Carnivore and Herbivore
  bear
    Black bear
    Grizzly bear
11
Multiple Inheritance from different-level parents
animal
  Carnivore (prey, hunting method)
    dogs
    Cats (meowing sound, whiskers size)
  Herbivore (feeding plant, chewing method)
Pseudo-cat (meow, whisker, plant, chew): child of both Cats and Herbivore
12
Multi-Inherit Rules
Add a taxon whose attribute set is MNABCD
[Diagram: trees before and after the addition; node labels MN, AB, CD (before) and MN, AB, CD, ABCD (after)]
13
Multi-Inherit Rules
Add a taxon whose attribute set is MNCDEF
[Diagram: trees before and after the addition; node labels MN, AB, CD, EF]
14
Multi-Inherit Rules
Add a taxon whose attribute set is MNBC
[Diagram: trees before and after the addition; node labels MN, AB, CD (before) and MN, AB, CD, BC (after)]
15
Multi-Inherit Rules
Add a taxon whose attribute set is KL
[Diagram: trees before and after the addition; node labels MN, AB, CD (before) and MN, AB, CD, KL (after)]
16
Multi-Inherit Rules
Add a taxon whose attribute set is MK
[Diagram: trees before and after the addition; node labels MN, AB, CD (before) and M, N, K, AB, CD (after)]
17
RTT is skewed
Genorg
  virus
    virus1, virus2, …
  karyote
    eukaryote
      Multi-cellular
      mono-cellular
    prokaryote
      bacteria
        Gram+ bact
          Gram+ bact1, 2, …
        Gram- bact
          Gram- bact1, 2, …
      archaea
        archaea1, archaea2, …
18
Terminal Relation
Genorg
  virus
  karyote
    eukaryote
      Multi-cellular
      mono-cellular

speciesID   size
AIDS virus  10 nm
human       1.7 m
…           …
19
Non-terminal Relation
Genorg
  virus
  karyote
    eukaryote
      Multi-cellular
      mono-cellular

Sub-taxon  Ave. size
virus      10 nm
karyote    1 m

– Save the general trend in each subtaxon.
20
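One way to picture the link between the two kinds of relation is to derive the non-terminal rows from the terminal ones. The sketch below does this by averaging, which is an assumption: the slides do not say how the "Ave. size" values are obtained. The table names GenorgTerminal and GenorgNonTerminal, the subtaxon column and the conversion of sizes to metres are likewise hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Terminal relation: one row per species (sizes converted to metres).
CREATE TABLE GenorgTerminal (speciesID TEXT PRIMARY KEY, subtaxon TEXT, size_m REAL);
INSERT INTO GenorgTerminal VALUES
    ('AIDS virus', 'virus',   1e-8),
    ('human',      'karyote', 1.7);
-- Non-terminal relation: one row per sub-taxon, derived here by averaging.
CREATE TABLE GenorgNonTerminal AS
    SELECT subtaxon, AVG(size_m) AS avg_size_m
    FROM GenorgTerminal
    GROUP BY subtaxon;
""")

def average_sizes():
    """Return {sub-taxon: average size in metres} from the non-terminal relation."""
    return dict(conn.execute("SELECT subtaxon, avg_size_m FROM GenorgNonTerminal"))

print(average_sizes())
```

With only the two sample species, each average is just that species' own size; the 1 m figure on the slide presumably reflects many more karyote species.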
Non-terminal Relation
eukaryote
  Asexual eukaryote
  Sexual eukaryote
    animal
    plant

Sub-taxon  How to mate
animal     Search-for
plant      Via carrier

– Save the common values of each subtaxon.
– A terminal relation would be redundant.
21
Part II.
BioDataBase
22
BioDataBase (BDB)
• Want to store all the information about all the living organisms on the planet
– Too much data!
– Solution: partition the database into “Domains”
– Each domain has its own database that stores the relevant biological information
• Want to find correlations between different domains’ information
23
BioDataBase (BDB)
• Consider 3 domains and their relevant info:
– Genomics: genes of each species
– Ecology: population distribution of species
– Environment: a location’s humidity, temperature
24
BioDataBase (BDB)
• Genomics:
– Species/gene is a many-to-many relation
– Hence, a (species, gene) relation
[Diagram: lion and zebra linked many-to-many with geneA and geneB]

speciesID  geneID
lion       geneA
lion       geneB
zebra      geneA
?          geneC
25
BioDataBase (BDB)
• Ecology:
– Want to store that species_A lives in location_B and that its population there is population_C
– PRIMARY KEY: (speciesID, locationID)

speciesID  locationID  population
lion       Israel      3000
zebra      Jordan      20000
tiger      China       900
26
BioDataBase (BDB)
• Environment:
– Want to store environmental factors that affect living organisms

locationID  humidity  temperature
Israel      low       85
Jordan      low       80
China       high      35
27
BioDataBase (BDB)
• Want to answer a query that spans all 3
domains:
– simply join relations from 3 domains!
– “Find genes that are common to (genomics)
all species that live in the area (ecology)
where humidity is low (environment)”
28
BioDataBase (BDB)
• “Find genes that are common to all species that live in the
area where the humidity is low” (see the relations on slides 25, 26, and 27)
(SELECT G.geneID, G.speciesID
FROM Genomics G, Ecology Eco, Environment Env
WHERE G.speciesID = Eco.speciesID AND
Eco.locationID = Env.locationID AND
Env.humidity = 'low')
DIVIDE
(SELECT Eco.speciesID
FROM Ecology Eco, Environment Env
WHERE Eco.locationID = Env.locationID AND
Env.humidity = 'low')
29
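DIVIDE is the relational-algebra division operator; standard SQL has no such keyword, so an engine like SQLite needs the division spelled out. The sketch below uses the common double NOT EXISTS formulation: keep a gene if there is no low-humidity species that lacks it. The Ecology and Environment rows follow the slides; the Genomics rows are partly illustrative (the tiger/geneC pairing in particular is hypothetical), and the function names build_bdb and common_genes are assumptions.

```python
import sqlite3

def build_bdb():
    """Create the three domain relations in one in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE Genomics (speciesID TEXT, geneID TEXT,
                           PRIMARY KEY (speciesID, geneID));
    CREATE TABLE Ecology (speciesID TEXT, locationID TEXT, population INTEGER,
                          PRIMARY KEY (speciesID, locationID));
    CREATE TABLE Environment (locationID TEXT PRIMARY KEY, humidity TEXT,
                              temperature INTEGER);
    -- Genomics rows are illustrative; ('tiger', 'geneC') is hypothetical.
    INSERT INTO Genomics VALUES
        ('lion', 'geneA'), ('lion', 'geneB'), ('zebra', 'geneA'), ('tiger', 'geneC');
    INSERT INTO Ecology VALUES
        ('lion', 'Israel', 3000), ('zebra', 'Jordan', 20000), ('tiger', 'China', 900);
    INSERT INTO Environment VALUES
        ('Israel', 'low', 85), ('Jordan', 'low', 80), ('China', 'high', 35);
    """)
    return conn

def common_genes(conn, humidity):
    """Relational division: genes held by every species living at that humidity."""
    rows = conn.execute("""
        SELECT DISTINCT G.geneID
        FROM Genomics G
        WHERE NOT EXISTS (
            -- a species in a matching location ...
            SELECT 1
            FROM Ecology Eco
            JOIN Environment Env ON Eco.locationID = Env.locationID
            WHERE Env.humidity = ?
              AND NOT EXISTS (
                  -- ... that does not carry gene G.geneID
                  SELECT 1
                  FROM Genomics G2
                  WHERE G2.geneID = G.geneID
                    AND G2.speciesID = Eco.speciesID))
        ORDER BY G.geneID
    """, (humidity,)).fetchall()
    return [row[0] for row in rows]

print(common_genes(build_bdb(), "low"))
```

One difference from a strict division is worth noting: if no species lives in a matching location, this formulation returns every gene, whereas dividing by an empty species set over a dividend restricted to those species would return nothing.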
Part III.
Conclusion & cs632 Project
30
Conclusion
• The Relational Taxonomy Tree solves
– Redundancy problem:
• different species have common attributes.
– Disparity problem:
• different species have different attributes.
• RTT and BDB can serve as the prototype for the infrastructure of the Library of Life Project.
31
Tentative Project Suggestion
• There are four of us:
– Helgi, Yoni, Shobhi, and me.
• Two of us implement a mini-Relational Taxonomy Tree
• The other two of us implement a mini-BioDataBase
• All of us implement a program that can process SQL queries on RTT & BDB
32
So what do you say?
33