The Gene Ontology Project


17/07/15
The Gene Ontology Project
An Introduction
EBI is an Outstation of the European Molecular Biology Laboratory.
There is a lot
of biological
research output.
2
17/07/15
Search on
mesoderm
development…
3
17/07/15
You get 6752
results!
How will you
ever find what
you want?
Another
example…
4
17/07/15
Microarray data
shows changed
expression of
thousands of genes.
How will you spot
the patterns?
[Figure: microarray gene tree (pearson) over time, attacked vs. control. Branch color classification: Set_LW_n3d_5p_... Gene list: all genes (14010).]
5
17/07/15
Bregje Wertheim at the Centre for Evolutionary Genomics,
Department of Biology, UCL and Eugene Schuster Group, EBI.
Scientists
work hard.
http://www.teamtechnology.co.uk/f-scientist.jpg
6
17/07/15
http://www.kilbot.com.au/wp-content/shop/careful-scientist.gif
There are
lots of
papers
to read.
7
17/07/15
more
every
week.
8
17/07/15
and
more…
9
17/07/15
more and
more
and more!
10
17/07/15
more and
more
and more!
Help!
11
17/07/15
Ontology is a way to capture
knowledge in a written and
computable form.
Computable
Computable means that the computer finds patterns
so we don’t have to.
12
17/07/15
eBay search
(keyword ‘lead’)
v.
PubMed search
(keyword ‘flower’)
Demo and practical work
13
17/07/15
The Gene Ontology
14
17/07/15
This is our
browser.
15
17/07/15
Search on
mesoderm
development.
16
17/07/15
Here is
mesoderm
development.
17
17/07/15
Definition of
mesoderm
development.
Gene products
involved in
mesoderm
development.
18
17/07/15
There are many
gene products
involved in
mesoderm
development.
But fewer gene
products than
papers.
You can read
papers describing
what is known
about them.
19
17/07/15
20
17/07/15
[Figure: microarray gene tree (pearson) over time, attacked vs. control. Branch color classification: Set_LW_n3d_5p_... Gene list: all genes (14010).]
21
17/07/15
Bregje Wertheim at the Centre for Evolutionary Genomics,
Department of Biology, UCL and Eugene Schuster Group, EBI.
See which processes are upregulated or downregulated.
[Figure: microarray gene tree (pearson) over time, attacked vs. control, with clusters labelled by process:]
Defense response
Immune response
Response to stimulus
Toll regulated genes
JAK-STAT regulated genes
Puparial adhesion
Molting cycle
hemocyanin
Amino acid catabolism
Lipid metabolism
Peptidase activity
Protein catabolism
Immune response
Immune response
Toll regulated genes
[Branch color classification: Set_LW_n3d_5p_... Gene list: all genes (14010).]
22
17/07/15
Bregje Wertheim at the Centre for Evolutionary Genomics,
Department of Biology, UCL and Eugene Schuster Group, EBI.
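One way to see which processes are up- or downregulated is to group the changed genes by their process annotation and count them. The Python sketch below does only that. The gene names, fold-change values and threshold are invented for illustration and are not taken from the experiment shown in the figure.

```python
from collections import Counter

# Hypothetical data: gene -> (log2 fold change, attacked vs. control; process label).
# Gene names and values are made up for illustration only.
expression = {
    "geneA": (2.1, "immune response"),
    "geneB": (1.7, "immune response"),
    "geneC": (1.2, "defense response"),
    "geneD": (-1.5, "lipid metabolism"),
    "geneE": (-2.0, "lipid metabolism"),
    "geneF": (0.1, "molting cycle"),
}


def summarise(expression, threshold=1.0):
    """Count up- and downregulated genes per process label."""
    up, down = Counter(), Counter()
    for log_fc, process in expression.values():
        if log_fc >= threshold:
            up[process] += 1
        elif log_fc <= -threshold:
            down[process] += 1
    return up, down


up, down = summarise(expression)
print("up:", dict(up))
print("down:", dict(down))
```

A real analysis would normally add a statistical test for over-representation; this sketch stops at counting.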
Practical work:
Search AmiGO
Did you find your favourite
gene product or process?
23
17/07/15
How does the
Gene Ontology work?
24
17/07/15
25
17/07/15
The Gene Ontology
is like a dictionary
term: transcription initiation
id: GO:0006352
definition: Processes involved
in the assembly of the RNA
polymerase complex at the
promoter region of a DNA
template resulting in the
subsequent synthesis of
RNA from that promoter.
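The dictionary idea can be sketched as a small record type in Python. The term, ID and definition are the ones on the slide; the class name `GOTerm` and its field names are assumptions made for this example, not part of any GO software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GOTerm:
    """One ontology entry: a stable ID, a name, and a definition."""
    id: str
    name: str
    definition: str


transcription_initiation = GOTerm(
    id="GO:0006352",
    name="transcription initiation",
    definition=(
        "Processes involved in the assembly of the RNA polymerase complex "
        "at the promoter region of a DNA template resulting in the "
        "subsequent synthesis of RNA from that promoter."
    ),
)

# A dictionary keyed by ID lets software look a term up unambiguously.
terms = {transcription_initiation.id: transcription_initiation}
print(terms["GO:0006352"].name)
```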
26
17/07/15
The whole system.
Clark et al., 2005
is_a
part_of
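The is_a and part_of links make the ontology a graph that a program can walk. The Python sketch below follows both kinds of link upwards from one term. The four-term fragment is a simplified toy written for this example and is not a copy of the real GO graph.

```python
# Toy ontology fragment: child -> list of (relationship, parent).
# Simplified for illustration; not a copy of the real GO graph.
RELATIONS = {
    "mitochondrial inner membrane": [("part_of", "mitochondrion")],
    "mitochondrion": [("is_a", "organelle")],
    "organelle": [("is_a", "cellular component")],
    "cellular component": [],
}


def ancestors(term, relations):
    """Return every term reachable by following is_a or part_of upwards."""
    found = []
    stack = [term]
    while stack:
        current = stack.pop()
        for _relationship, parent in relations.get(current, []):
            if parent not in found:
                found.append(parent)
                stack.append(parent)
    return found


print(ancestors("mitochondrial inner membrane", RELATIONS))
```

Because of this, a gene product annotated to a specific term can also be found by a search on any of its broader parent terms.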
27
17/07/15
An example…
Mitochondrial P450
(CC24 PR01238; MITP450CC24)
28
17/07/15
Where is it?
Mitochondrial
p450
mitochondrial inner
membrane
GO cellular component term:
GO:0005743
29
17/07/15
What does it do?
substrate + O2 = CO2 + H2O product
monooxygenase activity
GO molecular function term:
GO:0004497
30
17/07/15
Which process is this?
electron transport
http://ntri.tamuk.edu/cell/
mitochondrion/krebpic.html
31
17/07/15
GO biological process term:
GO:0006118
The whole system.
Clark et al., 2005
is_a
part_of
32
17/07/15
The Gene Ontology is for all species
and that means
we have to
*bridge*
some language barriers.
33
17/07/15
Same name, same thing?
http://www.darknessandlight.co.uk/cambridge_photographs.html
Bridge of Sighs,
Cambridge.
34
17/07/15
http://www.lockeheemstra.com/italy/bridge-of-sighs-venice.html
Ponte dei Sospiri,
Venice.
Tactition
Taction
Tactile sense
In biology…
35
17/07/15
?
Tactition
Taction
Tactile sense
perception of touch ; GO:0050975
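In software, bridging the language barrier comes down to mapping every synonym to the same term ID. A minimal Python sketch, using only the labels and the ID from the slide; the function name `resolve` is an assumption made for this example.

```python
# Synonyms from the slide, all pointing at one term ID.
SYNONYMS = {
    "tactition": "GO:0050975",
    "taction": "GO:0050975",
    "tactile sense": "GO:0050975",
    "perception of touch": "GO:0050975",
}


def resolve(label):
    """Map a free-text label to a term ID, ignoring case and outer spaces."""
    return SYNONYMS.get(label.strip().lower())


print(resolve("Taction"))
print(resolve(" Tactile sense "))
```

A label that is not in the table returns `None`, which is a prompt to ask which meaning is intended, as with "bud initiation".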
36
17/07/15
Bud initiation?
37
17/07/15
= tooth bud initiation
= reproductive bud initiation
= branch bud initiation
38
17/07/15
Demo:
Writing an ontology
The car ontology
39
17/07/15
• Demo: The gene ontology
40
17/07/15
Categorization of gene products
using GO is called annotation.
So how does that happen?
41
17/07/15
P05147
Choose your
favourite gene.
42
17/07/15
P05147
Find a paper
about it.
PMID: 2976880
43
17/07/15
P05147
PMID: 2976880
Find the GO
term describing its
function, process
or location of action.
GO:0047519
44
17/07/15
P05147
PMID: 2976880
What
evidence
do they
show?
IDA
GO:0047519
45
17/07/15
P05147
PMID: 2976880
Write these down…
P05147
GO:0047519
IDA
PMID:2976880
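The four items written down above form one annotation. A minimal Python sketch of that record follows. The class and field names are assumptions made for this example, and the tab-separated line is for illustration only, not the Consortium's submission format.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    gene_product: str  # database accession of the gene product
    go_id: str         # GO term describing function, process or location
    evidence: str      # evidence code
    reference: str     # paper supporting the annotation


annotation = Annotation(
    gene_product="P05147",
    go_id="GO:0047519",
    evidence="IDA",
    reference="PMID:2976880",
)

# Tab-separated output is an assumption made here for illustration;
# it is not the Consortium's submission format.
line = "\t".join(annotation)
print(line)
```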
46
17/07/15
Send to the GO Consortium.
47
17/07/15
Finding annotations in a paper
…for B. napus PERK1 protein (Q9ARH1)
In this study, we report the isolation and molecular characterization
of the B. napus PERK1 cDNA, that is predicted to encode a novel
receptor-like kinase. We have shown that like other plant RLKs,
the kinase domain of PERK1 has serine/threonine kinase activity.
In addition, the location of a PERK1-GFP fusion protein to the
plasma membrane supports the prediction that PERK1 is an
integral membrane protein…these kinases have been implicated in
early stages of wound response…
PubMed ID: 12374299
Function:
48
17/07/15
protein serine/threonine kinase activity
GO:0004674
Component:
integral to plasma membrane
GO:0005887
Process:
response to wounding
GO:0009611
17/07/15
Annotation details
50
17/07/15
51
17/07/15
Where to get annotations?
• Non-redundant species database
  • Contains all GO annotations for a given species + other information.
  • http://www.arabidopsis.org/
• Multispecies database - GOA
  • Contains all GO annotations.
  • http://beta.uniprot.org/
52
17/07/15
Evidence codes
53
17/07/15
IDA - inferred from direct assay
  Enzyme assays
  In vitro reconstitution (e.g. transcription)
  Immunofluorescence (for cellular component)
  Cell fractionation (for cellular component)
  Physical interaction/binding
IEP - inferred from expression pattern
  Transcript levels (e.g. Northerns, microarray data)
  Protein levels (e.g. Western blots)
IGC - inferred from genomic context
  Operon structure
  Syntenic regions
  Pathway analysis
  Genome-scale analysis of processes
54
17/07/15
IGI - inferred from genetic interaction
  "Traditional" genetic interactions such as suppressors, synthetic lethals, etc.
  Functional complementation
  Rescue experiments
  Inference about one gene drawn from the phenotype of a mutation in a different gene.
IMP - inferred from mutant phenotype
  Any gene mutation/knockout
  Overexpression/ectopic expression of wild-type or mutant genes
  Anti-sense experiments
  RNAi experiments
  Specific protein inhibitors
  Polymorphism or allelic variation
IPI - inferred from physical interaction
  2-hybrid interactions
  Co-purification
  Co-immunoprecipitation
  Ion/protein binding experiments
55
17/07/15
ISS - inferred from sequence or structural similarity
  Sequence similarity (homologue of/most closely related to)
  Recognized domains
  Structural similarity
  Southern blotting
RCA - inferred from reviewed computational analysis
  Large-scale protein-protein interaction experiments
  Microarray experiments
  Integration of large-scale datasets of several types
  Text-based computation
IEA - inferred from electronic annotation
NAS - non-traceable author statement
ND - no biological data available
TAS - traceable author statement
NR - not recorded
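The evidence codes listed above can be held in a lookup table, so that annotations can be filtered by how they were made. The Python sketch below separates electronic (IEA) annotations from the rest. The accession `EXAMPLE1` and the function name are hypothetical and serve only to illustrate the filter.

```python
EVIDENCE_CODES = {
    "IDA": "inferred from direct assay",
    "IEP": "inferred from expression pattern",
    "IGC": "inferred from genomic context",
    "IGI": "inferred from genetic interaction",
    "IMP": "inferred from mutant phenotype",
    "IPI": "inferred from physical interaction",
    "ISS": "inferred from sequence or structural similarity",
    "RCA": "inferred from reviewed computational analysis",
    "IEA": "inferred from electronic annotation",
    "NAS": "non-traceable author statement",
    "ND": "no biological data available",
    "TAS": "traceable author statement",
    "NR": "not recorded",
}


def is_electronic(code):
    """True only for IEA, the code used for electronic annotation."""
    if code not in EVIDENCE_CODES:
        raise ValueError(f"unknown evidence code: {code}")
    return code == "IEA"


# (gene product, GO ID, evidence code); 'EXAMPLE1' is a hypothetical accession.
annotations = [
    ("P05147", "GO:0047519", "IDA"),
    ("EXAMPLE1", "GO:0004674", "IEA"),
]
manual = [a for a in annotations if not is_electronic(a[2])]
print(manual)
```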
56
17/07/15
Should we trust electronic annotations?
PMID: 15960829
57
17/07/15
http://www.geneontology.org/GO.indices.shtml
58
17/07/15
59
17/07/15
ec2go mapping
!version: $Revision: 1.67 $
!date: $Date: 2008/01/21 11:29:01 $
!Mapping of GO function_ontology "enzymes" to Enzyme Commission Numbers.
!original mapping by Michael Ashburner, Cambridge.
!This version parsed from function.ontology on 2008/01/15 14:01:16
!by Daniel Barrell, EBI, Hinxton
!
EC:1 > GO:oxidoreductase activity ; GO:0016491
EC:1.1 > GO:oxidoreductase activity, acting on CH-OH group of donors ; GO:0016614
EC:1.1.1 > GO:oxidoreductase activity, acting on the CH-OH group of donors, NAD or NADP as acceptor ; GO:0016616
EC:1.1.1.1 > GO:alcohol dehydrogenase activity ; GO:0004022
EC:1.1.1.10 > GO:L-xylulose reductase activity ; GO:0050038
EC:1.1.1.100 > GO:3-oxoacyl-[acyl-carrier-protein] reductase activity ; GO:0004316
EC:1.1.1.101 > GO:acylglycerone-phosphate reductase activity ; GO:0000140
EC:1.1.1.102 > GO:3-dehydrosphinganine reductase activity ; GO:0047560
EC:1.1.1.103 > GO:L-threonine 3-dehydrogenase activity ; GO:0008743
EC:1.1.1.104 > GO:4-oxoproline reductase activity ; GO:0016617
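Lines in this layout are easy to read by program. The Python sketch below parses the `EC:... > GO:name ; GO:id` pattern seen above into a dictionary. It assumes every data line follows that pattern exactly and that comment lines start with `!`; the function name is an assumption made for this example.

```python
def parse_ec2go(lines):
    """Parse 'EC:<number> > GO:<name> ; GO:<id>' lines into a dict."""
    mapping = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("!"):
            continue  # skip blank lines and '!' comment headers
        source, target = line.split(" > ", 1)
        name, go_id = target.rsplit(" ; ", 1)
        ec_number = source[len("EC:"):]
        go_name = name[len("GO:"):]
        mapping.setdefault(ec_number, []).append((go_id, go_name))
    return mapping


sample = [
    "!Mapping of GO function_ontology \"enzymes\" to Enzyme Commission Numbers.",
    "EC:1 > GO:oxidoreductase activity ; GO:0016491",
    "EC:1.1.1.1 > GO:alcohol dehydrogenase activity ; GO:0004022",
]
ec2go = parse_ec2go(sample)
print(ec2go["1.1.1.1"])
```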
60
17/07/15
interpro2go mapping
InterPro is a database of protein families,
domains and functional sites in which identifiable
features found in known proteins can be applied
to unknown protein sequences.
!date: 2008/01/15 13:01:24
!Mapping of InterPro entries to GO
!Nicola Mulder, Hinxton
!
InterPro:IPR000003 Retinoid X receptor > GO:DNA binding ; GO:0003677
InterPro:IPR000003 Retinoid X receptor > GO:steroid binding ; GO:0005496
InterPro:IPR000003 Retinoid X receptor > GO:regulation of transcription, DNA-dependent ; GO:0006355
InterPro:IPR000003 Retinoid X receptor > GO:nucleus ; GO:0005634
InterPro:IPR000005 Helix-turn-helix, AraC type > GO:transcription factor activity ; GO:0003700
InterPro:IPR000005 Helix-turn-helix, AraC type > GO:intracellular ; GO:0005622
InterPro:IPR000006 Metallothionein, vertebrate > GO:metal ion binding ; GO:0046872
InterPro:IPR000013 Peptidase M7, snapalysin > GO:extracellular region ; GO:0005576
InterPro:IPR000014 PAS > GO:signal transducer activity ; GO:0004871
InterPro:IPR000015 Fimbrial biogenesis outer membrane usher protein > GO:transporter activity ; GO:0005215
InterPro:IPR000018 P2Y4 purinoceptor > GO:purinergic nucleotide receptor activity, G-protein coupled ; GO:0045028
InterPro:IPR000020 Anaphylatoxin/fibulin > GO:extracellular region ; GO:0005576
InterPro:IPR000021 Hok/gef cell toxic protein > GO:membrane ; GO:0016020
InterPro:IPR000022 Carboxyl transferase > GO:ligase activity ; GO:0016874
InterPro:IPR000023 Phosphofructokinase > GO:6-phosphofructokinase activity ; GO:0003872
InterPro:IPR000025 Melatonin receptor > GO:integral to membrane ; GO:0016021
InterPro:IPR000026 Guanine-specific ribonuclease N1 and T1 > GO:endoribonuclease activity ; GO:0004521
InterPro:IPR000028 Chloroperoxidase > GO:peroxidase activity ; GO:0004601
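A mapping like this is what makes electronic annotation possible: if a protein matches an InterPro entry, the mapped GO terms can be assigned to it with the IEA evidence code. The Python sketch below shows the idea under simple assumptions. The protein name `PROT1` and its domain hit are hypothetical, and the function names are invented for this example.

```python
def parse_interpro2go(lines):
    """Return {InterPro accession: [GO IDs]} from interpro2go-style lines."""
    mapping = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("!"):
            continue  # skip blank lines and '!' comment headers
        source, target = line.split(" > ", 1)
        accession = source.split()[0][len("InterPro:"):]
        go_id = target.rsplit(" ; ", 1)[1]
        mapping.setdefault(accession, []).append(go_id)
    return mapping


sample = [
    "InterPro:IPR000003 Retinoid X receptor > GO:DNA binding ; GO:0003677",
    "InterPro:IPR000003 Retinoid X receptor > GO:nucleus ; GO:0005634",
    "InterPro:IPR000028 Chloroperoxidase > GO:peroxidase activity ; GO:0004601",
]
interpro2go = parse_interpro2go(sample)


def electronic_annotations(protein, domain_hits, mapping):
    """Turn a protein's InterPro hits into (protein, GO ID, 'IEA') tuples."""
    return [
        (protein, go_id, "IEA")
        for accession in domain_hits
        for go_id in mapping.get(accession, [])
    ]


# 'PROT1' and its domain hit are hypothetical.
print(electronic_annotations("PROT1", ["IPR000003"], interpro2go))
```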
61
17/07/15
Manual annotations
appear in AmiGO.
Manual and
electronic annotations
appear in QuickGO.
62
17/07/15
Clark et al., 2005
Many species
groups annotate.
We see the
research of one
function across
all species.
63
17/07/15
Exercise:
Search for your favourite gene
and see if the annotation
is electronic or manual.
http://www.ebi.ac.uk/ego/
64
17/07/15
Submit new GO terms:
http://www.geneontology.org/
65
17/07/15
66
17/07/15
17/07/15
GO slims
Clark et al., 2005
is_a
part_of
68
17/07/15
Clark et al., 2005
is_a
part_of
69
17/07/15
Whole
genome
analysis
(J. D. Munkvold et al., 2004)
70
17/07/15
…analysis of high-throughput data according to GO
[Figure: microarray gene tree (pearson) over time, attacked vs. control, with clusters labelled by process:]
Puparial adhesion
Molting cycle
hemocyanin
Amino acid catabolism
Lipid metabolism
Peptidase activity
Protein catabolism
Immune response
Immune response
Toll regulated genes
[Branch color classification: Set_LW_n3d_5p_... Gene list: all genes (14010).]
71
17/07/15
Bregje Wertheim at the Centre for Evolutionary Genomics,
Department of Biology, UCL and Eugene Schuster Group, EBI.
Making Slims:
OBO-Edit
72
17/07/15
Reapplying slimmed ontology to annotations:
AmiGO
http://amigo.geneontology.org/
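Reapplying a slim to annotations means replacing each detailed term with the broad slim term or terms above it, then counting. The Python sketch below shows the idea on a toy ontology. The terms, the slim and the gene names are all invented for illustration, and this is not how OBO-Edit or AmiGO implement it.

```python
from collections import Counter

# Toy ontology: child -> parents. Names and structure are simplified
# for illustration and are not taken from the real GO graph.
PARENTS = {
    "response to wounding": ["response to stimulus"],
    "immune response": ["response to stimulus"],
    "lipid metabolism": ["metabolism"],
    "protein catabolism": ["metabolism"],
    "response to stimulus": [],
    "metabolism": [],
}
SLIM = {"response to stimulus", "metabolism"}  # hypothetical slim


def slim_terms(term):
    """Return the slim terms that equal `term` or are among its ancestors."""
    seen, stack, hits = set(), [term], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in SLIM:
            hits.add(current)
        stack.extend(PARENTS.get(current, []))
    return hits


# Hypothetical annotations: gene -> detailed term.
annotations = {
    "g1": "immune response",
    "g2": "response to wounding",
    "g3": "lipid metabolism",
}
counts = Counter(s for term in annotations.values() for s in slim_terms(term))
print(counts)
```

This version counts a gene under every slim term above it; other mapping choices, such as keeping only the nearest slim term, are equally possible.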
73
17/07/15
Converting IDs:
PICR
http://www.ebi.ac.uk/Tools/picr/
74
17/07/15
GOOSE
http://www.berkeleybop.org/goose
75
17/07/15
2006 Consortium Meeting,
St. Croix,
U.S. Virgin Islands, March 30 - April 3, 2006
76
17/07/15
E. coli
hub
http://www.geneontology.org
Reactome
77
17/07/15