EMBL-EBI Powerpoint Presentation
13.04.2015
Molecular Interactions and Pathways
Sandra Orchard
EMBL-EBI
EBI is an Outstation of the European Molecular Biology Laboratory.
Why is it useful to study protein-protein interactions (PPIs),
networks and pathways?
• Proteins are the workhorses of the cell, and all their activities are
controlled through interactions with other molecules.
• To understand the biology of a single protein, you have to
study its interacting partners
• Network/pathway analysis increasingly used as a tool to
annotate large data sets – proteins involved in a common
process tend to cluster and be present in the same pathway
Why are there so many issues with interaction
data?
1. Wide variety of methods for demonstrating molecular
interactions – all have their strengths and weaknesses
2. No single method accurately defines an interaction as
being a true binary interaction observed under
physiological conditions
Why do we need interaction databases?
• Issues with all interaction data – true picture can only be
built up by combining data derived using multiple
techniques, multiple laboratories
• Problematic for any bench researcher to do – issues with
data formats, molecular identifiers, sheer volume of data
• Molecular interaction databases are publicly funded to collect
this data and annotate it in a format most useful to
researchers
Why are data standards essential?
• Prior to 2003, many databases = many formats. Users
had to reformat when merging data
• File conversion inevitably leads to data loss
• Many formats compromised tool development – each tool
developed tended to be database specific
PSI-MI XML format
• Community standard for Molecular Interactions
• XML schema and detailed controlled vocabularies
• Jointly developed by major data providers:
BIND, CellZome, DIP, GSK, HPRD, Hybrigenics, IntAct, MINT, MIPS, Serono, U. Bielefeld, U.
Bordeaux, U. Cambridge, and others
• Version 1.0 published in February 2004
The HUPO PSI Molecular Interaction Format - A community standard for the representation of
protein interaction data.
Henning Hermjakob et al, Nature Biotechnology 2004, 22, 176-183.
• Version 2.5 published in October 2007
Broadening the Horizon – Level 2.5 of the HUPO-PSI Format for Molecular Interactions;
Samuel Kerrien et al. BioMed Central. 2007.
PSI-MI XML benefits
• Collecting and combining data from different sources
has become easier
• Standardized annotation through PSI-MI ontologies
• Tools from different organizations can be chained,
e.g. analysis of IntAct data in Cytoscape.
Home page
http://www.psidev.info/MI
Controlled vocabularies
www.ebi.ac.uk/ols
IMEx
• Consortium of 9 molecular interaction databases
dedicated to producing high quality, annotated data,
curated to the same standards
• Data is curated once at a single centre then exchanged
between partners
• Users need only go to a single site to obtain all data
• www.imexconsortium.org
IntAct goals & achievements
1. Publicly available repository of molecular
interactions (mainly PPIs) - ~305K binary
interactions taken from >6,200 publications
(December 2012)
2. Data is standards-compliant and available via our
website, for download at our ftp site or via PSICQUIC
http://www.ebi.ac.uk/intact
ftp://ftp.ebi.ac.uk/pub/databases/intact
www.ebi.ac.uk/Tools/webservices/psicquic/view/main.xhtml
3. Provide open-access versions of the software to
allow installation of local IntAct nodes.
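Data retrieved through PSICQUIC is commonly delivered in the tab-delimited PSI-MI TAB (MITAB) format. The slides do not show that format, so the following is only a minimal sketch of reading one MITAB 2.5 row, assuming the standard 15-column layout. The helper name `parse_mitab_line`, the sample row and every identifier in it are invented for illustration and are not real IntAct records.

```python
# Column names for the 15-column PSI-MI TAB 2.5 layout (assumed here).
MITAB25_COLUMNS = [
    "id_a", "id_b", "alt_id_a", "alt_id_b", "alias_a", "alias_b",
    "detection_method", "first_author", "publication_id",
    "taxid_a", "taxid_b", "interaction_type", "source_db",
    "interaction_id", "confidence",
]


def parse_mitab_line(line):
    """Split one MITAB 2.5 row into a dict of lists; '-' marks an empty field."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < len(MITAB25_COLUMNS):
        raise ValueError("expected at least 15 tab-separated columns")
    row = {}
    for name, value in zip(MITAB25_COLUMNS, fields):
        # Simplification: multiple values in a column are split on '|'
        # without handling '|' characters inside quoted text.
        row[name] = [] if value == "-" else value.split("|")
    return row


# Hypothetical sample row; accessions and IDs are placeholders, not real entries.
SAMPLE = "\t".join([
    "uniprotkb:ACC_A", "uniprotkb:ACC_B", "-", "-", "-", "-",
    'psi-mi:"MI:0018"(two hybrid)', "Doe et al. (2010)", "pubmed:0000000",
    "taxid:9606(human)", "taxid:9606(human)",
    'psi-mi:"MI:0915"(physical association)', 'psi-mi:"MI:0469"(IntAct)',
    "intact:EBI-0000000", "-",
])

row = parse_mitab_line(SAMPLE)
print(row["id_a"], row["id_b"], row["detection_method"])
```

Keeping every column as a list makes it straightforward to merge rows from several sources later, which is the use case the standard was designed for.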
IntAct Curation
“Lifecycle of an Interaction”
[Figure: curation workflow. Labels: Publication (full text); manual curation and annotation by a curator (experiment, interaction I, participants p1 and p2; CVs); check by a super curator (accept / reject); nightly sanity checks (reports); release to the public web site, FTP site and IMEx (MatrixDB, MINT, DIP).]
UniProt Knowledge Base
http://www.ebi.uniprot.org/
Interactions can be mapped to the canonical sequence, to splice variants,
or to post-processed chains.
Data model
• Support for detailed features,
i.e. definition of the interacting interface
[Figures: interacting domains; overlay of ranges on sequence]
How to deal with Complexes
• Some experimental protocols generate complex data,
e.g. tandem affinity purification (TAP)
• One may want to convert these complexes into sets of
binary interactions. Two algorithms are available.
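The transcript does not name the two algorithms. The two expansion models most commonly described for this purpose are the spoke model (the bait is paired with each prey) and the matrix model (every member is paired with every other member), so the sketch below assumes those. Function names and the example complex are illustrative only.

```python
from itertools import combinations


def spoke_expansion(bait, preys):
    """Spoke model: pair the bait with every prey, and nothing else."""
    return [(bait, prey) for prey in preys]


def matrix_expansion(members):
    """Matrix model: pair every member of the complex with every other member."""
    return list(combinations(members, 2))


# Hypothetical purification: bait A pulls down preys B, C and D.
bait, preys = "A", ["B", "C", "D"]
spoke = spoke_expansion(bait, preys)
matrix = matrix_expansion([bait] + preys)

print(spoke)
print(matrix)
```

For a complex of n members the spoke model yields n - 1 pairs and the matrix model n(n - 1)/2, so the matrix model adds prey-prey pairs for which there is no direct experimental evidence.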
http://www.ebi.ac.uk/intact
IntAct – Home Page
Ontology search
Interaction detail
[Screenshot callout: choice of UniProtKB or Dasty view]
Viewing Interaction Details
[Screenshot callouts: PubMed/IMEx ID; details of interaction; additional information]
Interaction Details
Visualizing - networkView
Applying a better graph layout…
Visualization
Cytoscape Plugins
Reactome is…
• A database of human biological pathways
• Extensively cross-referenced
• Tools for data analysis – Pathway Analysis, Expression Overlay, Species
Comparison, Biomart…
• Used to infer orthologous events in 20 other species
Using model organism data to build
pathways – Inferred pathway events
[Figure: pathway events supported by direct evidence (human) and indirect evidence (mouse, cow); example literature labels PMID:5555, PMID:4444, PMID:8976, PMID:1234.]
Theory - Reactions
Pathway steps = the “units” of Reactome
= events in biology
[Figure: reaction types: BINDING, DISSOCIATION, DEGRADATION, PHOSPHORYLATION, DEPHOSPHORYLATION, TRANSPORT, CLASSIC BIOCHEMICAL]
Reactions Connect into Pathways
[Figure: three chained reactions, each labelled INPUT, OUTPUT and CATALYST.]
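One way to picture how reactions connect into pathways is to link two reactions whenever an output of one is an input of the other. The following is a minimal sketch of that idea only; the `Reaction` class, the `connect` helper and the entity names are hypothetical and do not reflect Reactome's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    """A pathway step with inputs, outputs and an optional catalyst."""
    name: str
    inputs: frozenset
    outputs: frozenset
    catalyst: str = ""


def connect(reactions):
    """Return (upstream, downstream) name pairs where an output of the
    upstream reaction is also an input of the downstream reaction."""
    links = []
    for up in reactions:
        for down in reactions:
            if up is not down and up.outputs & down.inputs:
                links.append((up.name, down.name))
    return links


# Hypothetical three-step chain: A -> B -> C -> D.
reactions = [
    Reaction("r1", frozenset({"A"}), frozenset({"B"}), "cat1"),
    Reaction("r2", frozenset({"B"}), frozenset({"C"}), "cat2"),
    Reaction("r3", frozenset({"C"}), frozenset({"D"}), "cat3"),
]

links = connect(reactions)
print(links)
```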
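Species Selection
Data Expansion – Projecting to Other Species
[Figure: the reaction A + ATP → A-P + ADP, drawn with protein B, shown for three species.
Human: A, B; A + ATP → A-P + ADP
Mouse: A, B; A + ATP → A-P + ADP
Drosophila: A + ATP, B; no orthologue - protein not inferred; reaction not inferred]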
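The projection shown above can be sketched as a lookup of each participant's orthologue, with the reaction dropped when one is missing. This is a deliberately simplified all-or-nothing rule chosen for illustration; the slides do not state the exact criteria Reactome applies, and the function name, protein names and orthologue tables are hypothetical.

```python
def project_reaction(participants, orthologues):
    """Map each protein to its orthologue in the target species.

    Returns None when any participant lacks an orthologue, mirroring
    'protein not inferred, reaction not inferred'.
    """
    projected = {}
    for protein in participants:
        target = orthologues.get(protein)
        if target is None:
            return None
        projected[protein] = target
    return projected


# Hypothetical human reaction involving proteins A and B.
human_reaction = ["A", "B"]
mouse_orthologues = {"A": "A_mouse", "B": "B_mouse"}
fly_orthologues = {"A": "A_fly"}  # no orthologue recorded for B

mouse_result = project_reaction(human_reaction, mouse_orthologues)
fly_result = project_reaction(human_reaction, fly_orthologues)
print(mouse_result)
print(fly_result)
```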
The Pathway Browser
[Screenshot labels: species selector; diagram key; sidebar; zoom/move toolbar; pathway diagram panel; details panel (hidden); thumbnail]
The Details Panel
Pathway Analysis
Pathway Analysis – Overrepresentation
P-val
Reveal next level
‘Top-level’
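The slide does not say how the p-value is calculated. A hypergeometric test is one common choice for overrepresentation analysis, and the sketch below assumes it; the function name, parameter names and example numbers are illustrative only. It requires Python 3.8 or later for `math.comb`.

```python
from math import comb


def hypergeom_pvalue(N, K, n, k):
    """Probability of seeing at least k pathway members in a submitted list.

    N: identifiers in the background
    K: background identifiers annotated to the pathway
    n: identifiers in the submitted list
    k: submitted identifiers annotated to the pathway
    """
    upper = min(K, n)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


# Hypothetical numbers: 50 submitted identifiers, 5 of them in a 100-member
# pathway, against a background of 20000 identifiers.
p = hypergeom_pvalue(20000, 100, 50, 5)
print(p)
```

When many pathways are tested at once, the raw p-values would normally be corrected for multiple testing before interpretation.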
Species Comparison I
Species Comparison II
Yellow = human/rat
Blue = human only
Grey = not relevant
Black = Complex
Expression Analysis I
Expression Analysis II
Step through
Data columns
‘Hot’ = high
‘Cold’ = low
Summary
Network and pathway analysis enable the researcher to:
1. Identify clusters of proteins – these may share the same
function (stable complex), process or subcellular location
2. Identify proteins involved in the same pathway i.e. in the same
process (only works for those proteins which can be placed in
pathways)
3. Add biological meaning to a list of gene/transcript/protein
identifiers.
http://www.ebi.ac.uk/training/online/
Interactions, Pathways and Networks
Koh GC, Porras P, Aranda B, Hermjakob H, Orchard SE.
Analyzing protein-protein interaction networks.
J Proteome Res 2012; 11: 2014-31. PMID:22385417
Current IntAct support:
European Commission grants PSIMEx (FP7-HEALTH-2007-223411),
APO-SYS (FP7-HEALTH-2007-200767) and Affinomics (241481)
The development of Reactome is supported by a grant from the US National Institutes of
Health (P41 HG003751), EU grant LSHG-CT-2005-518254 "ENFIN", Ontario
Research Fund, and the EBI Industry Programme.