-Presented by:
Peter Oledzki
John Pinney
Ashwin Sivakumar
Proteomics

Proteomics has been said to be the next step from genomics

Proteomics is the study of the proteome.

The proteome is the complete complement of proteins found
in a complete genome or specific tissue.
Proteomics and genomics are inter-dependent
[Figure: diagram with the labels Genome Sequence and mRNA (Genomics); Primary Protein products, Functional protein products, Protein Fractionation, 2-D Electrophoresis, Protein Identification and Post-Translational Modification (Proteomics); Determination of gene.]
Aims of Proteomics




Detect the different proteins expressed by a
tissue, cell culture, or organism using 2-Dimensional Gel Electrophoresis
Store this information in a database
Compare expression profiles between a
healthy cell and a diseased cell
The comparison can then be used for
testing and rational drug design.
Gel Electrophoresis


Motion of charged molecules in an electric
field.
Polyacrylamide gel provides a porous matrix
– (PAGE – Polyacrylamide Gel Electrophoresis)


Sample is stained with Coomassie blue to
make it visible in the gel.
Sample is placed in wells on the gel.
1-D Gel electrophoresis


Separation in only 1 dimension: size.
Smaller molecules travel further through the gel
than large molecules, hence the separation.
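In practice the size of an unknown band is usually read off a standard curve: for marker proteins of known size, log10(molecular weight) falls roughly linearly with relative migration distance (Rf). A minimal sketch of that calibration follows. The marker values are hypothetical illustration data, not taken from any real gel.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical marker lane: (molecular weight in kDa, relative migration Rf).
markers = [(97.0, 0.10), (66.0, 0.22), (45.0, 0.36),
           (30.0, 0.52), (20.1, 0.68), (14.4, 0.82)]

slope, intercept = fit_line([rf for _, rf in markers],
                            [math.log10(mw) for mw, _ in markers])

def estimate_mw(rf):
    """Estimate molecular weight (kDa) of a band from its Rf via the fitted line."""
    return 10 ** (slope * rf + intercept)

print(round(estimate_mw(0.45), 1))
```

The negative slope is just the slide's statement in numbers: the further a band has travelled, the smaller the protein.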
1-D continued

Electric field across gel separates molecules.
– Negatively charged molecules travel towards the
positive terminal and vice-versa.
– Western blotting (protein) is not to be confused with Southern
blotting (DNA) or Northern blotting (RNA)

Proteins are treated with the denaturing detergent
SDS (sodium dodecyl sulfate) which coats the protein
with negative charges, hence SDS-PAGE.
2-D – Separation is based on size
and charge


First step is to separate based on charge or
isoelectric point, called isoelectric focusing.
Then separate based on size (SDS-PAGE).
Isoelectric Focusing




The isoelectric point is the pH at which the
net charge of the protein molecule is neutral.
Different proteins have different isoelectric
points.
Isoelectric point is found by drawing the
sample through a stable pH gradient.
The range of the gradient determines the
resolution of the separation.
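The definition above (the pH at which net charge is zero) can also be applied to a sequence on paper. A rough sketch: sum the Henderson-Hasselbalch charge of each ionisable group and bisect on pH until the net charge crosses zero. The pKa values below are approximate, tool-style values chosen for illustration (an assumption); real proteins deviate from them, and different programs use different tables.

```python
# Approximate pKa values (assumed for illustration only).
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM_PKA = 8.6
C_TERM_PKA = 3.6

def net_charge(sequence, ph):
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))    # free amino terminus
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))   # free carboxyl terminus
    for residue in sequence:
        if residue in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[residue]))
        elif residue in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[residue] - ph))
    return charge

def isoelectric_point(sequence, tolerance=0.001):
    """Bisect on pH: net charge falls monotonically as pH rises."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0

print(round(isoelectric_point("ACDKR"), 2))
```

A lysine-rich peptide comes out basic and an aspartate/glutamate-rich one acidic, which is why the two focus at opposite ends of the pH gradient.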
SDS-PAGE




Second Dimension.
Separation by size.
Run perpendicular to Isoelectric focusing.
The only unresolved proteins after the first
and second dimensions are those proteins
with the same size and same charge – rare!
2-D Proteomics Example
2D-PAGE Analysis Software


2D-PAGE technology has been in use for over 20
years, and potentially provides a vast amount of
information about a protein sample.
However, due to difficulties with data analysis, it
remains only partially exploited.
Analysis problems


It can be very difficult to compare the results of two
experiments to yield a differential expression profile:
Can be severe warping of gel due to
– uneven coolant flow
– voltage leaks
– tears in gel

Can be problems with normalisation of
– background
– spot intensity

Can be differences in sample preparations.
Current state of software



Correct identification and alignment of spots from the
two gels has generally been a process with a lot of
manual intervention - hence very slow.
The processing power available with today’s PCs
means that automated analysis is starting to become
possible.
One vendor claims that an experienced user of their
package can compare and annotate 4 gel pairs per
hour.
Automated gel matching



Gel matching, or “registration”, is the process of
aligning two images to compensate for warp.
Some packages still require the user to identify
corresponding spots to help with gel matching.
The Z3 program from Compugen has a fully automated gel matching algorithm:
– define set of small, unique rectangles.
– compute optimal local transformations for rectangles.
– Interpolate to make smooth global transformation.

Note that this makes use of spot shape, streaks,
smears and background structure, which other
programs discard.
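The three steps listed above (rectangles, local transformations, smooth interpolation) can be illustrated with a toy sketch. This is not the Z3 algorithm: here each rectangle's "local transformation" is simply the integer shift that minimises the squared pixel difference, found by brute force, and the global warp is a bilinear interpolation of those shifts. The gel images, rectangle positions and shifts are all hypothetical.

```python
def make_gel(spots, height=40, width=40):
    """Build a blank image with single-pixel 'spots' (toy data)."""
    gel = [[0.0] * width for _ in range(height)]
    for y, x in spots:
        gel[y][x] = 1.0
    return gel

def block_shift(reference, target, top, left, size, max_shift):
    """Integer (dy, dx) that best maps a reference rectangle onto the target.

    The rectangle must lie at least max_shift pixels inside the image.
    """
    best, best_error = (0, 0), float("inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            error = 0.0
            for y in range(top, top + size):
                for x in range(left, left + size):
                    diff = reference[y][x] - target[y + dy][x + dx]
                    error += diff * diff
            if error < best_error:
                best, best_error = (dy, dx), error
    return best

def bilinear(corners, y0, y1, x0, x1, y, x):
    """Interpolate between values at (y0,x0), (y0,x1), (y1,x0), (y1,x1)."""
    (v00, v01), (v10, v11) = corners
    ty = (y - y0) / (y1 - y0)
    tx = (x - x0) / (x1 - x0)
    top = v00 * (1 - tx) + v01 * tx
    bottom = v10 * (1 - tx) + v11 * tx
    return top * (1 - ty) + bottom * ty

SIZE, MAX_SHIFT = 8, 3
blocks = {  # (top, left) of each rectangle -> hypothetical spots inside it
    (8, 8): [(10, 11), (13, 9)],
    (8, 24): [(9, 27), (14, 29)],
    (24, 8): [(26, 10), (29, 13)],
    (24, 24): [(27, 30), (30, 26)],
}
true_shift = {(8, 8): (2, -1), (8, 24): (2, 1), (24, 8): (0, -1), (24, 24): (0, 1)}

reference = make_gel([s for spots in blocks.values() for s in spots])
target = make_gel([(y + true_shift[b][0], x + true_shift[b][1])
                   for b, spots in blocks.items() for y, x in spots])

# Step 2: one local shift per rectangle.
found = {b: block_shift(reference, target, b[0], b[1], SIZE, MAX_SHIFT) for b in blocks}

# Step 3: smooth warp between rectangle centres (12 and 28 on each axis).
dy_corners = ((found[(8, 8)][0], found[(8, 24)][0]), (found[(24, 8)][0], found[(24, 24)][0]))
dx_corners = ((found[(8, 8)][1], found[(8, 24)][1]), (found[(24, 8)][1], found[(24, 24)][1]))
print(bilinear(dy_corners, 12, 28, 12, 28, 20, 20), bilinear(dx_corners, 12, 28, 12, 28, 20, 20))
```

Because the comparison uses every pixel in a rectangle rather than a list of detected spots, streaks and background structure would contribute to the match as well, which is the point made in the note above.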
Spot detection

Once the gel images have been matched, the program
automatically detects spots.
Algorithms are generally based on Gaussian statistics.
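One simple reading of "Gaussian" spot detection, offered only as a sketch and not as any vendor's method: blur the image with a small Gaussian kernel to suppress single-pixel noise, then report local maxima above a threshold. The image, the blob positions and the threshold below are hypothetical.

```python
import math

def gaussian_kernel(sigma, radius):
    """Normalised 1-D Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(image, sigma=1.0, radius=2):
    """2-D Gaussian blur; pixels outside the image count as zero."""
    kernel = gaussian_kernel(sigma, radius)
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for j, wy in enumerate(kernel):
                for i, wx in enumerate(kernel):
                    yy, xx = y + j - radius, x + i - radius
                    if 0 <= yy < height and 0 <= xx < width:
                        total += wy * wx * image[yy][xx]
            out[y][x] = total
    return out

def detect_spots(image, threshold):
    """Return (y, x) of smoothed local maxima at or above the threshold."""
    smoothed = smooth(image)
    height, width = len(image), len(image[0])
    spots = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            value = smoothed[y][x]
            if value < threshold:
                continue
            neighbours = [smoothed[y + dy][x + dx]
                          for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                          if (dy, dx) != (0, 0)]
            if all(value > n for n in neighbours):
                spots.append((y, x))
    return spots

def add_blob(image, centre_y, centre_x, amplitude, sigma=1.5):
    """Add a Gaussian-shaped spot to the image in place (toy data)."""
    for y, row in enumerate(image):
        for x in range(len(row)):
            r2 = (y - centre_y) ** 2 + (x - centre_x) ** 2
            row[x] += amplitude * math.exp(-r2 / (2.0 * sigma * sigma))

image = [[0.0] * 20 for _ in range(20)]
add_blob(image, 5, 6, 1.0)
add_blob(image, 14, 12, 0.6)
image[17][3] += 0.3  # isolated noisy pixel, removed by blur plus threshold
print(detect_spots(image, 0.2))
```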
Spot Quantitation



The positions of detected spots are calibrated to give
a pI / MW pair for each protein.
A value for the expression level of the protein can be
calculated from the overall spot intensity.
Some programs do not quantitate each gel
separately, but calculate relative intensity pixel by
pixel. This may be a more accurate approach.
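A small sketch of the first two points: converting a horizontal position into a pI, and turning spot intensity into a relative expression value. It assumes a linear pH gradient and a flat background, which real gels only approximate; the image and spot positions are hypothetical.

```python
def position_to_pi(x, strip_length, ph_start, ph_end):
    """Map a horizontal position (0 .. strip_length) onto a linear pH gradient."""
    return ph_start + (ph_end - ph_start) * x / strip_length

def spot_volume(image, centre_y, centre_x, radius, background=0.0):
    """Sum background-corrected pixel values in a square window around a spot."""
    total = 0.0
    for y in range(centre_y - radius, centre_y + radius + 1):
        for x in range(centre_x - radius, centre_x + radius + 1):
            total += max(image[y][x] - background, 0.0)
    return total

def normalise(volumes):
    """Express each spot volume as a percentage of the total on the gel."""
    total = sum(volumes.values())
    return {spot: 100.0 * volume / total for spot, volume in volumes.items()}

image = [[0.1] * 10 for _ in range(10)]  # flat background of 0.1
image[3][3] = 2.1                        # hypothetical strong spot
image[6][7] = 1.1                        # hypothetical weak spot

volumes = {
    "spot_1": spot_volume(image, 3, 3, 1, background=0.1),
    "spot_2": spot_volume(image, 6, 7, 1, background=0.1),
}
relative = normalise(volumes)
print(position_to_pi(50, 100, 4.0, 7.0), relative)
```

Normalising to the total volume is one common way to make two gels with different staining intensity comparable; the pixel-by-pixel approach mentioned above avoids the per-gel step altogether.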
Differential Expression


The user can set threshold
values for the detection of
differential expression. This
helps reduce the amount of
information displayed at
once.
In this example, a protein expressed only in the
second sample is circled in red. The yellow circles show
proteins which are differentially expressed.
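A user-set threshold of this kind is often expressed as a fold change. The sketch below labels each matched spot using a hypothetical 2-fold cut-off; the spot names and intensities are invented for illustration.

```python
def classify_spots(sample_a, sample_b, fold_threshold=2.0):
    """Label each spot by comparing normalised intensities in two samples."""
    labels = {}
    for spot in sorted(set(sample_a) | set(sample_b)):
        a = sample_a.get(spot, 0.0)
        b = sample_b.get(spot, 0.0)
        if a == 0.0:
            labels[spot] = "only in B"
        elif b == 0.0:
            labels[spot] = "only in A"
        elif b / a >= fold_threshold:
            labels[spot] = "higher in B"
        elif a / b >= fold_threshold:
            labels[spot] = "lower in B"
        else:
            labels[spot] = "unchanged"
    return labels

sample_a = {"s1": 10.0, "s2": 5.0, "s3": 8.0}   # hypothetical healthy sample
sample_b = {"s1": 11.0, "s2": 15.0, "s4": 3.0}  # hypothetical diseased sample
print(classify_spots(sample_a, sample_b))
```

Raising the threshold hides more of the "unchanged" spots, which is how it reduces the amount of information on screen.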
Annotation


Some systems allow semi-automatic annotation of
spots, based on a database of proteins listing their
pI / MW values.
Proteins of interest can also be excised from the gel
and sent on to mass spectrometry for definitive
identification. The ProteomeWorks system from Bio-Rad offers such an integrated solution for 2D-PAGE
and MALDI.
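Semi-automatic annotation from a pI / MW table amounts to a tolerance search. In the sketch below the reference table, protein names and tolerances are all hypothetical; a real system would let the user confirm or reject each suggestion.

```python
# Hypothetical reference table: (name, pI, MW in kDa).
REFERENCE = [
    ("protein_A", 5.2, 42.0),
    ("protein_B", 5.3, 44.0),
    ("protein_C", 8.1, 16.5),
]

def annotate(pi, mw, reference, pi_tolerance=0.2, mw_tolerance=0.10):
    """Return candidate names whose pI and MW lie within the tolerances.

    mw_tolerance is a fraction of the reference MW (0.10 means 10 %).
    """
    return [name for name, ref_pi, ref_mw in reference
            if abs(pi - ref_pi) <= pi_tolerance
            and abs(mw - ref_mw) <= mw_tolerance * ref_mw]

print(annotate(5.25, 43.0, REFERENCE))  # ambiguous: two candidates
print(annotate(8.0, 16.0, REFERENCE))   # single candidate
```

An ambiguous result like the first one is exactly the case where the spot would be cut out and sent for mass spectrometry.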
Multi-experiment Analysis



One useful feature of modern programs is the ability
to collate data from many runs of the same
experiment.
Spots which only appear in one gel are likely to be
artifacts, and are removed from the analysis.
This is an excellent way to reduce noise and enhance
weak signals.
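The filtering rule above can be sketched as follows, assuming spots have already been matched across replicate gels and given shared identifiers (the identifiers and intensities here are hypothetical).

```python
from collections import defaultdict

def consensus_spots(gels, min_gels=2):
    """Keep spots seen in at least min_gels replicates; average their intensity."""
    observations = defaultdict(list)
    for gel in gels:
        for spot_id, intensity in gel.items():
            observations[spot_id].append(intensity)
    return {spot_id: sum(values) / len(values)
            for spot_id, values in observations.items()
            if len(values) >= min_gels}

replicates = [
    {"a": 1.0, "b": 2.0},
    {"a": 3.0, "c": 9.0},  # "c" appears in one gel only: treated as an artifact
    {"a": 2.0, "b": 4.0},
]
print(consensus_spots(replicates))
```

Averaging over replicates is what lets a weak but reproducible spot stand out from noise that differs from gel to gel.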
Links





Z3 system (Compugen) - http://www.2dgels.com/
Melanie3 (SIB) - http://us.expasy.org/melanie/
ProteomWeaver (Definiens) - http://www.proteomweaver.com/
PDQuest (Bio-Rad) - http://www.biorad.com/
Delta2d (Decodon) - http://www.decodon.com/
Introduction to the databases
•With the advent of many 2-D PAGE databases there are a
number of protein spots that are already "identified" in a few
cell lines. Combined with the aims of the experiment, these
databases may give one the opportunity to guess at the
identity of a particular protein spot and confirm or deny this
by immunoblotting. The approach of obtaining accurate
peptide masses from specifically cleaved proteins to search
protein sequence databases, known as peptide mass
fingerprinting, provides one with another opportunity to
identify a previously sequenced protein or (hopefully)
confirm that it is indeed novel.
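Peptide mass fingerprinting can be sketched in a few lines: digest each database sequence in silico, compute peptide masses, and count how many observed masses each candidate explains. The sketch assumes trypsin (cleaves after K or R, not before P), monoisotopic neutral masses, no missed cleavages and no modifications. The two sequences are hypothetical, not real database entries.

```python
# Monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056

def tryptic_peptides(sequence):
    """Cleave after K or R unless the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides

def peptide_mass(peptide):
    """Neutral monoisotopic mass: residues plus one water."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER

def fingerprint_score(observed, sequence, tolerance=0.5):
    """Number of observed masses matched by some peptide of the sequence."""
    theoretical = [peptide_mass(p) for p in tryptic_peptides(sequence)]
    return sum(any(abs(m - t) <= tolerance for t in theoretical) for m in observed)

database = {  # hypothetical sequences
    "candidate_1": "MKWVTFRPLK",
    "candidate_2": "GASPKVTCLRNDQEK",
}
# Simulated measurement: masses of candidate_2 peptides with a small error.
observed = [peptide_mass(p) + 0.01 for p in tryptic_peptides(database["candidate_2"])]
best = max(database, key=lambda name: fingerprint_score(observed, database[name]))
print(best)
```

If no database entry explains the observed masses, that is the hint that the protein may be novel.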
An animated SDS PAGE presentation
•A number of 2-D gel databases exist.
•Quantitative databases: S. cerevisiae and REF52.
•Annotative databases: E. coli and human
keratinocytes.
•An annual issue of the journal “Electrophoresis”
serves as a major index of these databases
(it has links to many of them).
•The best database is obviously one that is
regularly updated (e.g. SWISS-2DPAGE).
List of 2-D GEL DATABASES
One can find an extensive list of such databases by following
these links.
We will discuss a few of the interesting ones.
•World 2-D PAGE
•NCI-FCRF
•DEAMBULUM-Protein Databases
•Ludwig Institute of Cancer Research
•Phoretix
World 2-D PAGE: Index of 2-D PAGE Databases (ExPASy)
•Basically a set of links to various 2-D PAGE databases.
•Has a useful tool called 2-D Hunt, with which one can
search for 2-DE related sites on the web.
•Indexed by category: multi-species, mammals, yeast,
plants, bacteria, viruses and parasites, cell lines.
Swiss 2-D Page
•Basically a protein databank of 2-D PAGE and SDS-PAGE reference
maps.
•May give the exact location of the protein in the map, or the region of
the map where it is expected, provided the protein has a SWISS-PROT entry.
•Options: search by keywords, accession number, spot clicking, full
text, author, SWISS-2DPAGE spot serial number, or SRS. Most of them are
self-explanatory.
•Protein list for a particular reference map (a table that can be
downloaded). It gives the gene name, protein description, SWISS-2DPAGE reference number, SWISS-2DPAGE accession number, identification
method, and experimental molecular weights and pIs for each entry found.
•We can also find the location of a protein sequence in
all, one, or selected reference maps. If it is not found, a temporary
virtual entry is created on the ExPASy server.
SWISS 2D-PAGE (contd)
•It gives cross-references to Medline and a few other
databases.
•In addition to this textual data, SWISS-2DPAGE provides
several 2-D PAGE images showing the experimentally
determined location of the protein, as well as a theoretical
region computed from the protein sequence, indicating
where the protein might be found in the gel.
•GeneBio (Geneva Bioinformatics) offers paid subscriptions
to SWISS-2DPAGE for commercial institutions.
•Vital Statistics
-Current release (15.0) has 861 entries in 33 reference maps.
-Sources of reference maps:
-Human (liver, plasma, HepG2, RBC, lymphoma, HepG2 secreted
proteins, CSF, macrophage-like cell line, erythroleukemia cell, platelet,
kidney, promyelocytic leukemia cells, colorectal epithelia cells,
colorectal adenocarcinoma cell line (DLD-1), soluble nuclear proteins and
matrix from liver tissue)
-Mouse (liver, gastrocnemius muscle, pancreatic islet cells, brown
adipose tissue, white adipose tissue, soluble nuclear proteins, matrix from
liver tissue)
-Arabidopsis thaliana
-Dictyostelium discoideum
-Escherichia coli (for 7 pI ranges: 3.5-10, 4-5, 4.5-5.5, 5-6, 5.5-6.7, 6-9, 6-11)
-Saccharomyces cerevisiae
Swiss 2D Page(cont..)
There have been some recent additions to the database.
SDS-PAGE and 2-D PAGE maps of nuclear proteins from human HeLa
cells have been added to the growing list of reference
maps. It is still an ongoing project. Information about known
proteins found within that gel stretch has been mapped (see
below: SDS-PAGE on the right, 2-D PAGE on the left).
Swiss 2D Page(cont)
Some Useful abbreviations:
-ID line: comprises the ID, entry name, entry class and the
method (2Dgel), in that order. They follow a
specific nomenclature.
-AC line: contains the accession numbers, separated
by semicolons. It is a stable way of identifying entries
from release to release.
-DT line: specifies the date (self-explanatory!).
-DE line: gives descriptive information about the
protein. If the complete sequence was not determined, the
last line reads “Fragment”.
Some useful abbreviations(cont…)
-The IM line
The IM (Images) lines list the 2-D PAGE and SDS-PAGE images which are associated with the entry. These
may be, for example, TUMORAL LIVER, NORMAL
LIVER or just LIVER.
-RP (Reference Position) line: describes the extent of the work
carried out by the author, e.g. protein sequence, amino
acid composition, mapping on gel, characterisation and
review.
-The “O” series contains organism species (OS), taxonomy (OX) and classification (OC).
-MT (Master) line: has information about the types of maps
used (e.g. plasma, liver etc.).
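Entries built from two-letter line codes like these are easy to read with a short script. The sketch below assumes the SWISS-PROT-style layout (code in columns 1-2, data from column 6, entry terminated by "//"). The entry shown is a made-up example to illustrate the layout, not a real SWISS-2DPAGE record.

```python
EXAMPLE_ENTRY = """\
ID   EXAMPLE_HUMAN; STANDARD; 2DG.
AC   P00000;
DT   01-JAN-2000 (Rel. 00, Created)
DE   Hypothetical example protein (Fragment).
OS   Homo sapiens (Human).
MT   LIVER, PLASMA.
IM   LIVER, PLASMA.
//
"""

def parse_entry(text):
    """Group the data of each line under its two-letter line code."""
    fields = {}
    for line in text.splitlines():
        if line.startswith("//"):  # end-of-entry marker
            break
        if len(line) < 3:
            continue
        code, value = line[:2], line[5:].strip()
        fields.setdefault(code, []).append(value)
    return fields

def accession_numbers(fields):
    """Split the AC lines on semicolons."""
    numbers = []
    for line in fields.get("AC", []):
        numbers.extend(part.strip() for part in line.split(";") if part.strip())
    return numbers

entry = parse_entry(EXAMPLE_ENTRY)
print(accession_numbers(entry), entry["DE"])
```

Keeping every code as a list means multi-line fields (several DE or IM lines, for example) are preserved in order.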
Methods used for zeroing in on the identified spots.
•Total of 3398 identified spots (as of the latest
version).
•Amino acid composition has identified 5.3% of
these spots.
•Co-migration: 2.6%
•Gel-matching: 46.7%
•Immunoblotting: 20%
•Microsequencing: 15.5%
•Peptide mass fingerprinting: 26.3%
•Tandem mass spectrometry: 2.3%
Well..does it carry
a message?
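One observation that can be checked directly from the figures: the percentages add up to more than 100%. A plausible reading (an interpretation, not something stated on the slide) is that some spots were identified by more than one method. The arithmetic:

```python
TOTAL_SPOTS = 3398
METHOD_PERCENT = {
    "Amino acid composition": 5.3,
    "Co-migration": 2.6,
    "Gel matching": 46.7,
    "Immunoblotting": 20.0,
    "Microsequencing": 15.5,
    "Peptide mass fingerprinting": 26.3,
    "Tandem mass spectrometry": 2.3,
}

total_percent = sum(METHOD_PERCENT.values())
# Approximate spot counts implied by each percentage.
estimated_counts = {name: round(TOTAL_SPOTS * pct / 100)
                    for name, pct in METHOD_PERCENT.items()}

print(round(total_percent, 1))
print(estimated_counts)
```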
Browsing the Swiss 2-D Page using spot clicking
-We can get information about a known protein by clicking
on one of the “checks” in the extensive list of image maps
available.
-Clicking brings up a tailor-made image map showing the
accurate or approximate position of that protein in
all the image maps available. For obvious reasons, the best
view is obtained from the reference image map we
initially clicked.
-A hypertext link can then be used to obtain the full SWISS-PROT entry for that protein, displaying protein sequence,
domain structure, information on known post-translational
processing and modifications, and references.
Image clicking(continued…)
-From SWISS-PROT, the user can select a link to
SWISS-3DIMAGE to see the three-dimensional
structure of the protein, if it is known, or submit the
sequence to the SWISS-MODEL three-dimensional
modelling tool, or view the domain structure.
- Also, from SWISS-PROT, the user can select links to
pertinent information from DNA sequence databases
(EMBL/Genbank), chromosomal and genomic maps
(GDB Genome Database), bibliographic references and
abstracts (Medline), and databases on the association of
human proteins with diseases (OMIM Online
Mendelian Inheritance in Man).
Diagrammatic representation of Image Clicking...
[Figure: here we click on a spot in the reference map of the colorectal epithelia cell. This brings up a screen showing the pictures of the different image maps with respect to that protein.]
Diagrammatic representation (cont…)
[Figure: protein identification on the chosen reference map. The red rectangle is the expected region of the protein on the gel. Spots are the proteins identified. Dotted lines are extensions of the possible regions if the protein is acetylated, phosphorylated or glycosylated.]
Enough of Swiss 2D PAGE!!!
-On image clicking we can also calculate the
theoretical pI and molecular weight of different
sequence fragments with desired end points. One
can specify the N-terminal and C-terminal
positions in the options available on the screen. By
default it computes them for the entire sequence
available.
-SWISS-2DPAGE also has a cross-reference to
another popular 2-D gel database, the Siena 2-D gel
database.
Now bye bye! Swiss 2D PAGE.
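The fragment calculation mentioned above can be sketched for the molecular weight part: sum the average residue masses between the chosen N-terminal and C-terminal positions and add one water. This is an illustration under simple assumptions (unmodified residues, 1-based inclusive positions), not the ExPASy implementation; the function name and sequence are hypothetical.

```python
# Average residue masses in Da.
AVERAGE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01524

def fragment_mass(sequence, n_terminal=1, c_terminal=None):
    """Average mass (Da) of residues n_terminal..c_terminal (1-based, inclusive).

    With no end points given, the whole sequence is used.
    """
    if c_terminal is None:
        c_terminal = len(sequence)
    fragment = sequence[n_terminal - 1:c_terminal]
    return sum(AVERAGE_MASS[residue] for residue in fragment) + WATER

sequence = "MKWVTFGGAK"  # hypothetical sequence
print(round(fragment_mass(sequence), 2), round(fragment_mass(sequence, 3, 8), 2))
```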
Biobase/Julio Celis Database
(Very well structured & lucid!!!)
-Hosted at the Danish Centre for Human Genome Research.
-Has the distinction of constructing the first 2-D gel database (HeLa cells) in
1981 (Bravo, R., Bellatin, J. and Celis, J.E.).
-Human and mouse 2-D PAGE databases.
-Annotated 2-D gel patterns of fluids from different species can be found in the
fluid gallery.
-One can find 2-D gel immunoblots of selected proteins against various
antibodies.
-2-D gel gallery of various human cell types and fluids (includes tumors,
keratinocytes and post-translational modifications).
For preparation and labelling of human keratinocytes, please visit:
http://biosun.biobase.dk/~pdi/procedures/procedure_label.html
Biobase/Julio Celis Database(cont…)
Human 2-D PAGE Database:
The keratinocyte 2-D PAGE database, constructed using carrier
ampholytes, is the largest of its kind and currently lists 3625
cellular (2313 isoelectric focusing, IEF; 954 non-equilibrium
pH gradient electrophoresis, NEPHGE) and externalised
polypeptides (358, IEF), of which 1285 have been identified
using a combination of techniques including immunoblotting
[32], Edman degradation of internal peptides [33, 34], and mass
spectrometry [35]. (Might be outdated!)
Representation through flow charts to
follow...
(Biobase Cont..)
By clicking on each of the available reference gels, we can get
information (links to Medline, SWISS-PROT and PDB, cellular
location, knockout, method used) on the available
proteins (checked spots) on the gel.
Databases for study of skin biology
HK-IEF d’base
865 IP
HK-NEPHGE d’base
KP present in medium
IEF Database
372 IP
59 IP
Biobase(cont…)
Database for study of Bladder Cancer
TCC-NEPHGE d’base
Urine-IEF d’base
TCC-IEF database
BSCC-IEF d’base
144 IP
449 IP
197 IP
309 IP
Biobase(cont…)
Other 2D Page Databases
Human MRC-5Fibroblasts-IEF
D’base
Human MRC-5Fibroblasts-NEPHGE
D’base
262 IP
84 IP
Biobase(cont…)
Search Options:
Search by protein name, keyword, sample spot number,
relative molecular mass, pI, organelle/component.
Other options for listing proteins and viewing the gels are
quite self-explanatory.
Other utilities of the database:
Has links to
-NCBI’s Human-Mouse Homology maps through its Mouse
2D-PAGE databases.
-Interesting studies like Mouse Genome Informatics (Jackson
Laboratory) and Mouse Atlas projects.
NCI-FCRF (National Cancer Institute, Frederick Cancer Research Facility)
*Seems a very exhaustive and useful source. Lots of things
still to study.*
2D Protein Gel Databases
[Figure: WebGel, Flicker and dbEngine are maintained by the Image Processing Section, which also maintains the gel analysis software GELLAB II.]
WebGel:
WebGel is an Internet-based, interactive, qualitative and
quantitative gel database analysis system.
A WebGel database contains previously quantified gel data
generated from a stand-alone quantitative gel analysis system.
wbdemoDB: demonstration database of serum proteins in a fetal alcohol syndrome study.
melanie2DB: demonstration database of E. coli gels from the Melanie 2.3 demonstration database.
fasDB: database of serum proteins in a fetal alcohol syndrome study.
FLICKER
“Flicker is a method for comparing images from different
Internet sources on your Web browser. In the case of 2D
protein electrophoretic gel images, maps identifying proteins in
these gels are becoming increasingly available. Visually
comparing 2D sample gels against these 2D gel database maps
may suggest putative protein spot identification in many cases.
Flicker was originally developed for comparing 2D protein
gels across the Internet.”
-Part of the description on their web site.
Flicker comparing two Plasma 2D-PAGE Gels.
dbEngine
Manual
It is a simple database search engine which may be used to
quickly create a searchable database on a World Wide Web
(WWW) server. Data may be prepared from spreadsheet
programs (such as Excel, etc.) or from tables exported from
relational database systems. This Common Gateway Interface
(CGI-BIN) program is used with a WWW server such as
available commercially, or from NCSA or CERN. Its
capabilities include: 1) searching records by combinations of
terms connected with ANDs or ORs; 2) returning search results
as hypertext links to other WWW database servers; 3) mapping
lists of literature reference identifiers to the full references; 4)
creating bidirectional hypertext links between pictures and the
database.
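The first capability (searching records by terms joined with ANDs or ORs) can be illustrated with a toy in-memory version. This is not dbEngine's code: the record layout, field names and function name are hypothetical, and the rows stand in for a table exported from a spreadsheet.

```python
# Hypothetical rows, as if exported from a spreadsheet.
RECORDS = [
    {"spot": "101", "protein": "Albumin", "tissue": "plasma"},
    {"spot": "102", "protein": "Transferrin", "tissue": "plasma"},
    {"spot": "201", "protein": "Albumin", "tissue": "liver"},
]

def search(records, terms, mode="AND"):
    """Return records matching all terms (AND) or any term (OR).

    A term matches when it occurs, case-insensitively, in any field.
    """
    def matches(record, term):
        return any(term.lower() in str(value).lower() for value in record.values())

    combine = all if mode == "AND" else any
    return [record for record in records
            if combine(matches(record, term) for term in terms)]

print([r["spot"] for r in search(RECORDS, ["albumin", "plasma"], mode="AND")])
print([r["spot"] for r in search(RECORDS, ["transferrin", "liver"], mode="OR")])
```

In a web setting each returned spot number would then be rendered as a hypertext link, which is the second capability in the list.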
SIENA 2-D PAGE
-Similar to Swiss 2D PAGE when it comes to
browsing the Database.
-Point to be noted: Last update was June 2000.
-Their Gel entries include many human protein
maps.
-For further information please toy around with their
site!
PROTEOME Inc.
-Available databases are:
-Saccharomyces cerevisiae Proteome Database (YPD)
-Caenorhabditis elegans Proteome Database (WormPD)
-S. pombe (PombePD)
-Human PSD (quite interesting!):
It is a sort of survey database. It has more than 17,000
human, mouse and rat proteins. Within this, their
GPCR-PD™ (G Protein-Coupled Receptor database) has around
600 GPCRs.
As is the case with most companies (trick up their
sleeve!), their Human PSD is available only by
subscription.
HSC-2D PAGE (Harefield)
-Hosted at the Heart Science Centre at Harefield.
-Individual gel databases available:
Human Heart
Human Endothelial cell
Rat Heart
Ventricles
Dog Heart
-Well... visit their site; again, it is quite similar to SWISS-2DPAGE.
PPMDB at Sphinx
•Warning! It is no longer updated, but it could still be a
useful source.
•Aimed to create a directory of Arabidopsis plasma
membrane proteins and to obtain expression and sequence
data.
•Includes:
•a 2-D map conforming to the rules of federated 2-D databases
•a repertoire of plasma membrane proteins not present on the
map
•protein sequence data linked to EST or cDNA data
•and protein expression data according to ecotypes and plant
organs.
Sphinx(cont…)
-Available reference gels include callus 2-D gels, cationic and
anionic detergent 2-D gels, analytical plant organs, analytical
solubilization methods, analytical ecotypes and preparative.
-One can query a sequence against PPMdb sequences to find
matches.
-On clicking on sequence matching we can find detailed
information about procedures like 2-DE, running, staining and
scanning.
-We can also perform a more elaborate search on available gels
by entering options such as protein name, sub-cellular
location, accession number, hydrophobicity properties, gel
family, MW, pI etc.
Challenges/shortcomings in 2D-Gel Databases
(Food for thought!)
-Detection of very low abundance polypeptides.
-The resolution of very basic and high molecular weight
proteins.
-The lack of satisfactory quantitation procedures for the
analysis of all the proteins resolved in the gels.
-Storage of image analysis results.
-Data integrity: merging data from different sources?
-Explanatory notes on the reference maps?
Subscribe to the swiss-flash newsletter to keep
yourself updated on 2-D gel and other resources
available on ExPASy!
Try these links and then
Time to chill!!!
•http://www.2dgels.com/
•http://www.genomicglossaries.com/content/lifesciences_databasesdirectory.asp
•http://www.expasy.ch/ch2d/2d-index.html
•http://www.infobiogen.fr/services/deambulum/english/db4.html#GELS2D
•http://www.bio-mol.unisi.it/2d/2d.html
•http://biobase.dk/cgi-bin/celis
•http://www.phoretix.com/customers/wl_2d_specific_sites.htm
•http://sphinx.rug.ac.be:8080/ppmdb/index.html