Transcript Slide 1

Day 3 morning
Proteomics
Mining
Differential Expression
PROTEOMICS PROBLEMS
Protein Complexes & protein-ligand interactions
Modifications
HIGH THROUGHPUT PROTEOMICS
Analysis of all protein components; a ‘snapshot’
Useful for differential comparisons of tissues or cells
Complementary to RNA microarray
Information about disease states
Identify useful drug target or diagnostic marker
May use robotics or semi-automated procedures
(medium-high throughput)
SINGLE PROTEIN PROTEOMICS
• Peptide Mass Fingerprinting (What is it?)
• Amino Acid Modifications (What is it? Where is it?)
• Amino Acid Changes (Mutations) (What is it? Where is it?)
Protein Characterization using Mass Spectrometry
Mass Spectrometry Schematic
Vacuum 1 x 10^-5 to 10^-11
Sample in → Inlet System → Ion Source → Mass Analyzer → Detector → Data System
Inlet Systems:
• Simple vacuum lock
• HPLC
• GC
Ion Sources:
• Matrix Assisted Laser Desorption Ionization (MALDI)
• Electrospray (ESI)
• Atmospheric Pressure Chemical Ionization (APCI)
• Electron Ionization (EI)
• FAB / LSIMS
Mass Analyzers:
• Multipole (Quad, Hexa, Octa)
• Time-of-flight (TOF)
• Traps (FT-ICR and QIT)
• Magnetic Sectors
Mass Spectrum
Mass spectrometry is an analytical method used
to measure Mass-to-charge ratio of a molecule
For example:
MW = 2,000; Charge = +1; m/z = 2,000
MW = 2,000; Charge = +2; m/z = 1,000
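The arithmetic on this slide can be sketched in a few lines. The first function reproduces the slide's simplified view (mass divided by charge). The second is an assumption added for illustration: it includes the mass of the protons that carry the charge, which is how [M + zH]z+ ions are usually treated in positive-ion mode. The function names are hypothetical.

```python
PROTON_MASS = 1.007276  # Da, mass of one proton


def mz_simplified(mw, charge):
    """m/z as shown on the slide: molecular weight divided by charge."""
    return mw / charge


def mz_protonated(mw, charge):
    """m/z of an [M + zH]z+ ion, where each charge adds one proton (assumption)."""
    return (mw + charge * PROTON_MASS) / charge


for z in (1, 2):
    print(z, mz_simplified(2000.0, z), round(mz_protonated(2000.0, z), 3))
```

The same molecule therefore shows up at different positions on the m/z axis depending on how many charges it carries.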
Mass spectrometry: Ionization modes
MALDI
Sample in solid state
Mix with matrix material
Laser desorption
Salt ‘tolerant’
No pre-analysis separation
Time-of-Flight detector
Electrospray (ESI)
Liquid sample sprayed & desolvated
No matrix materials
Salt intolerant
Direct coupling to LC or HPLC
Many detectors
MALDI-TOF Mass Spectrometer
MALDI-TOF Block Diagram
MALDI-TOF Spectrum
MASS ACCURACY IS CRITICAL TO PROPER
IDENTIFICATION OF PROTEINS (PMF) AND
CHARACTERIZING MODIFICATIONS AND
ALTERATIONS IN PRIMARY STRUCTURE.
Information from a Spectrum
Angiotensin II, C50H71N13O12 (spectrum: Intensity vs. mass-to-charge (m/z), m/z 1,040 to 1,052)
Average Mass: [M + H]+ = 1047.2045 (C = 12.011)
Monoisotopic Mass: [M + H]+ = 1046.5422 (C defined as 12.000000)
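The difference between the two masses can be checked from the elemental formula. The sketch below is illustrative only: the atomic masses are standard textbook values typed in by hand, the helper names are hypothetical, and the [M + H]+ values are obtained by adding a proton mass. Depending on the atomic-mass table used, the results may differ from the slide's figures in the third or fourth decimal place.

```python
import re

# Atomic masses in Da; monoisotopic = lightest stable isotope, average = natural abundance.
MONOISOTOPIC = {"C": 12.000000, "H": 1.007825, "N": 14.003074, "O": 15.994915}
AVERAGE = {"C": 12.011, "H": 1.00794, "N": 14.0067, "O": 15.9994}
PROTON_MASS = 1.007276


def parse_formula(formula):
    """Turn a simple formula such as 'C50H71N13O12' into {'C': 50, 'H': 71, ...}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def formula_mass(formula, table):
    """Sum the atomic masses of every atom in the formula."""
    return sum(table[el] * n for el, n in parse_formula(formula).items())


ANGIOTENSIN_II = "C50H71N13O12"
mono_mh = formula_mass(ANGIOTENSIN_II, MONOISOTOPIC) + PROTON_MASS
avg_mh = formula_mass(ANGIOTENSIN_II, AVERAGE) + PROTON_MASS
print(f"monoisotopic [M+H]+ = {mono_mh:.4f}")
print(f"average      [M+H]+ = {avg_mh:.4f}")
```

The two values differ by roughly 0.7 Da for this peptide, which is why a search must state which kind of mass was measured.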
Mass accuracy; Important for Protein ID
Accuracy (ppm) = (Theoretical m/z - Measured m/z) / Theoretical m/z x 1,000,000
Theoretical (M+H)+ = 1378.8417
M = HGTVVLTALGGILK
(Three spectra: % Intensity vs. mass-to-charge (m/z), m/z 872.0 to 1714.0)
File calibration: 1378.7409 (M+H)+, 73 ppm
External calibration: 1378.8594 (M+H)+, 13 ppm
Internal calibration: 1378.8390 (M+H)+, 2 ppm
Labeled calibrant peaks: Des-arg1-Bradykinin 904.4682; Angiotensin I 1296.6853; Glu1-Fibrinopeptide B 1570.6776; Neurotensin 1672.9175
Other labeled peaks: 1277.6970, 1400.8627
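The ppm figures for the three calibration modes can be recomputed from the masses given on the slide. This is a minimal sketch: the function name is hypothetical, and the error is expressed in parts per million, i.e. the relative difference multiplied by 10^6.

```python
def ppm_error(theoretical_mz, measured_mz):
    """Signed mass error in parts per million."""
    return (theoretical_mz - measured_mz) / theoretical_mz * 1e6


THEORETICAL = 1378.8417  # (M+H)+ of HGTVVLTALGGILK, from the slide

measurements = [
    ("File calibration", 1378.7409),
    ("External calibration", 1378.8594),
    ("Internal calibration", 1378.8390),
]

for label, measured in measurements:
    error = ppm_error(THEORETICAL, measured)
    print(f"{label}: {abs(error):.0f} ppm")
```

Internal calibration gives the smallest error because the calibrant peaks are measured in the same spectrum as the analyte.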
Protein Digestion & Identification
Select spot or band
EXCISE SPOT
Reduce & Alkylate
Digest
Extract
ALL IN THE GEL!
MASS SPEC
DATABASE QUERY
ID
Concepts
Protease cleavage is specific and generates unique peptides
Every amino acid has a different molecular mass (leucine and isoleucine, which are isomers, are the exception)
Masses are ‘unique’ to protein sequence
Amino acid residue masses (Da):
Glycine 57.0
Alanine 71.1
Serine 87.1
Proline 97.1
Valine 99.1
Threonine 101.1
Cysteine 103.1
Leucine 113.2
Isoleucine 113.2
Asparagine 114.1
Aspartic acid 115.1
Glutamine 128.1
Lysine 128.2
Glutamic acid 129.1
Methionine 131.2
Histidine 137.1
Phenylalanine 147.2
Arginine 156.2
Tyrosine 163.2
Tryptophan 186.2
G-L-S-E-T-W-D-D-H-K = 1186.5 Da
K-H-D-D-W-T-E-S-L-G = 1186.5 Da
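The point of the two peptides above is that a peptide's mass depends on its composition, not on the order of its residues. The sketch below illustrates this under one assumption: the 1186.5 Da figure appears to correspond to monoisotopic residue masses plus one water, so the table here uses standard monoisotopic values rather than the rounded average values listed on the slide. The function name is hypothetical.

```python
# Monoisotopic residue masses in Da (standard values, rounded to 5 decimals).
RESIDUE_MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # Da, added once for the free N- and C-termini


def peptide_mass(sequence):
    """Neutral monoisotopic mass of a linear, unmodified peptide."""
    return sum(RESIDUE_MONO[aa] for aa in sequence) + WATER


forward = peptide_mass("GLSETWDDHK")
reverse = peptide_mass("KHDDWTESLG")
print(round(forward, 1), round(reverse, 1))
```

Because both orderings give the same mass, a single peptide mass cannot establish sequence; that requires a set of peptide masses (a fingerprint) or fragmentation data.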
Proteases Cleave Specifically
Trypsin cleaves after Lysine (K) and Arginine (R)
Chymotrypsin cleaves after aromatic amino acids
Staph aureus V8 cleaves after Aspartate (D) or Glutamate (E)
EndoLysC cleaves after Lysine (K) only
Therefore, peptide maps (m/z) can be predicted
from database sequences (in silico digestion).
This is their Mass Fingerprint.
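The cleavage rules listed above can be written as a small lookup table. This is a simplified sketch: it cuts after every listed residue with no exceptions, it assumes "aromatic amino acids" means F, W and Y, and the table and function names are hypothetical.

```python
# Residues after which each protease cleaves, as listed on the slide.
PROTEASE_RULES = {
    "Trypsin": "KR",
    "Chymotrypsin": "FWY",     # assumption: aromatic = Phe, Trp, Tyr
    "Staph aureus V8": "DE",
    "EndoLysC": "K",
}


def digest(sequence, protease):
    """Split a sequence after every residue the protease recognizes."""
    residues = PROTEASE_RULES[protease]
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        if aa in residues:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder
    return peptides


for name in PROTEASE_RULES:
    print(name, digest("GLSETWDDHK", name))
```

The same sequence yields a different peptide set with each protease, which is why the enzyme must be specified in a database search.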
Digestion with Trypsin
Robust, cheap, stable
Specificity
Cleaves on the C-terminal side of Arginine (R) and Lysine (K) ONLY
Sequence information about C terminus only
~1 out of 10 amino acids, average peptide mass 1100 Da
Amino acids favorable for MS (charged!)
Intact protein sequence
MASDFGHKILGFDSACV
MNQWSDFFIILRTHYWE
DTYRRLIPMASDFKTYH
MNGFDSAILIGRIISCFGK
PEQSADRTYIPLMKSDFV
CQELISEL
Digest fragments
-R
-LIPMASDFK
-MASDFGHK
-THYWEDTYR
-THYWEDTYRR
-RLIPMASDFK
-SDFVCQELISEL
-PEQSADRTYIPLMK
-ILGFDSACVMNQWSDFFIILR
-TYHMNGFDSAILIGRIISCFGK
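Several of the fragments above span an uncut K or R, i.e. they contain a missed cleavage. The sketch below digests the slide's example sequence with trypsin and then joins neighboring peptides to generate those partial products. It is an illustration only: it ignores exceptions such as reduced cleavage before proline, and the function names are hypothetical.

```python
import re

# Intact protein sequence from the slide, joined into one string.
SEQUENCE = (
    "MASDFGHKILGFDSACV"
    "MNQWSDFFIILRTHYWE"
    "DTYRRLIPMASDFKTYH"
    "MNGFDSAILIGRIISCFGK"
    "PEQSADRTYIPLMKSDFV"
    "CQELISEL"
)


def tryptic_peptides(sequence):
    """Complete digest: cut after every K or R."""
    return re.findall(r"[^KR]*[KR]|[^KR]+$", sequence)


def with_missed_cleavages(peptides, max_missed=1):
    """Join up to max_missed + 1 neighboring peptides to model incomplete digestion."""
    results = []
    for start in range(len(peptides)):
        for extra in range(max_missed + 1):
            end = start + extra + 1
            if end <= len(peptides):
                results.append("".join(peptides[start:end]))
    return results


complete = tryptic_peptides(SEQUENCE)
partial = with_missed_cleavages(complete, max_missed=1)
print(complete)
print(sorted(set(partial) - set(complete), key=len))
```

Allowing one missed cleavage roughly doubles the number of candidate peptides, which is the trade-off behind the "missed cleavages allowed" search setting.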
Protein Digestion & Identification; Mass Fingerprint
GKVKVGVNGFGRLIGRVTRAAFNSGKVDIVAINDPFIDLNYMVYMFQYDSTHGKFHGTVK
AENGKLVINGNPITIFQERDPKIKWGDAGAEYVVESTGVFTTMEKAGAHLQGGAKRVIISAP
SADAPMFVMGVNHEKYDNSLKIISNASCTTNCLAPLAKVIHDNFGIVEGLMTTVHAITATQKl
TVDGPSGKWRDGRGALQNIIPASTGAAKAVGKVIPELNGKLTGMAFRVPTANVSVVDLTC
RLEKPAKYDDIKKVVKQASEGPLKGILGYTEHQVVSSDFNSDTHSSTFDAGAGIALNDHFV
KLISWYDNEFGYSNRVVDLMAHMASKE
DIGEST
Experimental m/z (M+H)+
4036.8947, 3308.5642, 2595.3599, 2277.0379, 2213.1093, 1763.8023, 1719.8768,
1613.9009, 1473.7730, 1411.7903, 1201.6067, 909.4900, 873.4676, 869.5091,
829.4414, 805.4315, 795.4181, 739.3621, 694.3518, 688.3777, 685.4243,
653.3141, 518.2569, 458.3085, 375.2350, 374.2398, 361.1982, 359.1925,
347.1673, 345.2496, 260.1968, 246.1812, 204.1342
DATABASE QUERY
Predicted m/z (M+H)+
4036.8947, 3308.5642, 2595.3599, 2277.0379, 2213.1093, 1763.8023, 1719.8768,
1613.9009, 1473.7730, 1411.7903, 1201.6067, 909.4900, 873.4676, 869.5091,
829.4414, 805.4315, 795.4181, 739.3621, 694.3518, 688.3777, 685.4243,
653.3141, 518.2569, 458.3085, 375.2350, 374.2398, 361.1982, 359.1925,
347.1673, 345.2496, 260.1968, 246.1812, 204.1342
MATCH!!
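The database query step amounts to comparing the measured peak list with the predicted peptide masses of each candidate protein. The sketch below shows one simple way to do that with a ppm tolerance. The predicted values are a few of those listed above; the "measured" values are made up for illustration, and the function name is hypothetical. Real search engines score matches statistically rather than just counting them.

```python
def match_peaks(experimental, predicted, tol_ppm=100.0):
    """Pair each experimental m/z with the first predicted m/z within tol_ppm."""
    matches = []
    for exp in experimental:
        for pred in predicted:
            if abs(exp - pred) / pred * 1e6 <= tol_ppm:
                matches.append((exp, pred))
                break
    return matches


# A few predicted (M+H)+ values taken from the slide.
predicted = [1763.8023, 1613.9009, 1473.7730, 1411.7903, 909.4900]

# Hypothetical measured peaks, invented for this example.
experimental = [1763.85, 1613.95, 1500.00, 909.49]

matches = match_peaks(experimental, predicted)
print(f"{len(matches)} of {len(experimental)} peaks matched")
for exp, pred in matches:
    print(exp, "->", pred)
```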
Mass Fingerprinting: What is a Real Peak?
Look for good isotope distribution
and good peak shape.
Avoid noise ‘spikes’.
Intensity is not necessarily an
indicator of quality.
Automatic peak picking becomes
less reliable at low S/N and high mass.
Other Peaks in Your Mass Spectrum
That are NOT from Your Protein
Keratin: 3312.308, 2508.145, 2383.952, 1993.977, 1716.851, 1708.713,
1657.793, 1475.785, 1475.749, 1383.69, 1357.696, 1302.715, 1300.53,
1277.71, 1265.637, 1217.616, 1179.6, 1141.519, 1125.542, 1092.503,
1033.516, 1016.501, 1006.43, 999.445, 973.531, 910.415, 874.499, 832.489,
823.39
Tryptic autolysis: 2283.18, 2211.104, 2158.031, 1768.799, 1736.842, 1045.564,
1006.487, 906.504, 842.509
Matrix ion peaks: 666, 672, 688, 855, 861, 877, 893, 1060, 1065
Abundant proteins (e.g., albumin, ribosomes) and/or smears on the gel.
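Known contaminant masses such as these can be screened out of a peak list before searching. The sketch below is one hedged way to do it: the tryptic autolysis masses are those listed above, the keratin entry is only a subset of the list, the measured peaks are invented, and the 0.05 Da tolerance and function name are assumptions.

```python
CONTAMINANTS = {
    "Tryptic autolysis": [2283.18, 2211.104, 2158.031, 1768.799, 1736.842,
                          1045.564, 1006.487, 906.504, 842.509],
    "Keratin (subset)": [1993.977, 1716.851, 1475.785, 1383.69, 1179.6],
}


def split_contaminants(peaks, tol_da=0.05):
    """Separate peaks into those kept and those flagged as likely contaminants."""
    kept, flagged = [], []
    for mz in peaks:
        source = next(
            (name for name, masses in CONTAMINANTS.items()
             if any(abs(mz - m) <= tol_da for m in masses)),
            None,
        )
        if source is None:
            kept.append(mz)
        else:
            flagged.append((mz, source))
    return kept, flagged


# Hypothetical measured peaks, invented for this example.
peaks = [842.51, 1045.56, 1475.79, 1613.90, 2211.10]
kept, flagged = split_contaminants(peaks)
print("kept:", kept)
print("flagged:", flagged)
```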
How Do We Identify Proteins Using MS?
BioInformatics:
Search Engines
MASCOT (www.matrixscience.com)
Protein Prospector (MS-FIT;prospector.ucsf.edu/)
ProFound (prowl.rockefeller.edu/prowl-cgi/profound.exe)
X! Tandem (www.thegpm.org/TANDEM/)
What You Need to Search Database
•A list of query masses (monoisotopic preferably)
•Protease(s) or cleavage reagents used
•Database(s) you want to search (SwissProt, NR)
•Estimated mass and pI of protein spot (or band)
•Cysteine/methionine or other modifications
•Mass tolerance (100 ppm = 1000.0 ± 0.1 Da)
•On-line Search Program (MS-Fit, Profound, Mascot, etc.)
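The mass tolerance entry converts between a relative tolerance (ppm) and an absolute window (Da). A minimal sketch of that conversion, with a hypothetical function name:

```python
def tolerance_window(mz, ppm):
    """Return the (low, high) m/z window for a tolerance given in ppm."""
    delta = mz * ppm / 1e6
    return mz - delta, mz + delta


low, high = tolerance_window(1000.0, 100)
print(f"{low:.1f} to {high:.1f}")
```

Because the window is relative, the same 100 ppm setting is twice as wide in Da at m/z 2000 as it is at m/z 1000.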
Mass Fingerprinting: Mascot search
Using MALDI-TOF mass spec data
1. Database
2. Taxonomy
3. Enzyme (essential)
4. Modification (optional)
5. Protein Mass (ignore)
6. Peptide tolerance (100 ppm)
7. Data (quality over quantity)
Interpreting the Results
Probability based scoring:
•Compute the probability that the observed match between the experimental
data and mass values calculated from a candidate peptide sequence is a random
event.
•The correct match, which is not a random event, has a very low probability.
Allows standard statistical tests to be applied to the results:
Mascot score = -10 log10(P)
In a database of 500,000 entries, a 1 in 1,000 chance of getting a false positive match is a probability of:
P = 1/(1,000 x 500,000)
This corresponds to a Mascot score of 87.
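The score in this example follows directly from the formula. The sketch below only reproduces the slide's arithmetic; it is not Mascot's scoring code, and the function name is hypothetical.

```python
import math


def mascot_score(probability):
    """Score = -10 * log10(P), where P is the probability of a random match."""
    return -10 * math.log10(probability)


database_entries = 500_000
false_positive_rate = 1 / 1_000

p = false_positive_rate / database_entries  # P = 1 / (1,000 x 500,000)
print(round(mascot_score(p)))
```

A larger database lowers P for the same false positive rate, so the score needed for significance rises with database size.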
Mass Fingerprinting: Mascot search
What if you don’t find anything?
•Widen mass tolerance
•Remove taxonomic limits
•Go to a larger database
•Increase the number of missed cleavages allowed
•Increase the number of variable modifications
•Get sequence information (MS/MS)
Common contaminants
Keratin
Autolysis peaks
Albumin
Actin
IgG
Fragmentation of Peptides: MS/MS
•Can give more information/confirmation
•Identification of post-translational modifications at residue level
•Generate a series of fragments of the original peptide – ideally the bonds
you are breaking are the peptide bonds between residues (y ions
and b ions)
Fragmentation of Peptides: MS/MS
Sequence Information
Glycine 57.0
Alanine 71.1
Serine 87.1
Proline 97.1
Valine 99.1
Threonine 101.1
Cysteine 103.1
Leucine 113.2
Isoleucine 113.2
Asparagine 114.1
Aspartic acid 115.1
Glutamine 128.1
Lysine 128.2
Glutamic acid 129.1
Methionine 131.2
Histidine 137.1
Phenylalanine 147.2
Arginine 156.2
Tyrosine 163.2
Tryptophan 186.2
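Sequence information comes from the spacing between fragment peaks: consecutive b ions (or y ions) differ by one residue mass. The sketch below computes singly charged b and y ions for a peptide, assuming standard monoisotopic residue masses rather than the rounded values in the table, and assuming no modifications. The function name is hypothetical.

```python
# Monoisotopic residue masses in Da (standard values, rounded to 5 decimals).
RESIDUE_MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER, PROTON = 18.01056, 1.00728


def fragment_ions(sequence):
    """Singly charged b ions (N-terminal) and y ions (C-terminal) of a peptide."""
    masses = [RESIDUE_MONO[aa] for aa in sequence]
    n = len(masses)
    b = {f"b{i}": sum(masses[:i]) + PROTON for i in range(1, n)}
    y = {f"y{i}": sum(masses[n - i:]) + WATER + PROTON for i in range(1, n)}
    return b, y


b_ions, y_ions = fragment_ions("AVAGCAGAR")
for name, mz in list(b_ions.items()) + list(y_ions.items()):
    print(name, round(mz, 3))
```

The difference between neighboring ions in a series, for example b3 minus b2, equals the mass of the residue at that position, which is how a sequence is read from the spectrum.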
How Do We Identify Proteins Using MS/MS?
Actual MS/MS scan: precursor peptide [M+H]+ = 828.2
Get database sequences that match precursor peptide mass:
AVAGCAGAR
CVAAGAAGR
VGGACAAAR
etc.
Compare virtual spectra to real spectrum
(Figure: virtual spectra for CVAAGAAGR, AVAGCAGAR and VGGACAAAR with their b and y fragment ions labeled.)
Scoring:
Detect matches between theoretical b- and y-ions
Compute correlation
Rank hits
Peptide Score
AVAGCAGAR 2.56
CVAAGAAGR 0.57
VGGACAAAR 0.32
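The ranking idea can be sketched with a much simpler score than the correlation used on the slide. The code below counts how many theoretical b and y ions of each candidate fall within a tolerance of an observed peak (a shared peak count). This is a swapped-in, simplified technique and not the Sequest cross-correlation; the observed peak list is synthetic (generated from AVAGCAGAR), and all function names are hypothetical.

```python
# Monoisotopic residue masses (Da) for the residues used in this example only.
RESIDUE_MONO = {"A": 71.03711, "V": 99.06841, "G": 57.02146,
                "C": 103.00919, "R": 156.10111}
WATER, PROTON = 18.01056, 1.00728


def theoretical_ions(sequence):
    """Singly charged b and y ion m/z values for an unmodified peptide."""
    masses = [RESIDUE_MONO[aa] for aa in sequence]
    n = len(masses)
    b_ions = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y_ions = [sum(masses[n - i:]) + WATER + PROTON for i in range(1, n)]
    return b_ions + y_ions


def shared_peak_count(observed, sequence, tol_da=0.5):
    """Number of theoretical ions that have an observed peak within tol_da."""
    return sum(
        any(abs(ion - peak) <= tol_da for peak in observed)
        for ion in theoretical_ions(sequence)
    )


def rank_candidates(observed, candidates, tol_da=0.5):
    """Score every candidate and sort from best to worst."""
    scored = [(shared_peak_count(observed, seq, tol_da), seq) for seq in candidates]
    return sorted(scored, reverse=True)


# Synthetic "observed" spectrum: the ideal fragments of AVAGCAGAR.
observed = theoretical_ions("AVAGCAGAR")
candidates = ["AVAGCAGAR", "CVAAGAAGR", "VGGACAAAR"]

ranking = rank_candidates(observed, candidates)
for score, seq in ranking:
    print(seq, score)
```

All three candidates have the same composition and therefore the same precursor mass, so only the fragment ions can tell them apart.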
How Do We Identify Proteins Using MS/MS?
Sequest analysis of MS/MS Ion Trap data
How Do We Identify Proteins Using MS/MS?
Ion Trap MS/MS of doubly charged ions
What if you don’t find anything?
Combination of MS and MS/MS searching
Mascot allows for combination searching with MS and MS/MS data
Collision Induced Dissociation (CID) or Post Source Decay (PSD)
are used to fragment individual peptides (MALDI-TOF)
De novo sequencing; hand work
Typhoon Trio
Wash glass platen with Milli-Q water only!
Do not pry open gels. Keep closed.
Keep clean and dry.
Typhoon Trio
Set Up Window
Typhoon Trio
Scan Process
Scan at 1000 micron to adjust PMT (photomultiplier tube) voltages.
Each Laser PMT voltage (CyDye) must be adjusted individually.
PMT voltage starting point 600 volts.
We try to adjust PMT voltage to give spot volumes (ImageQuant) in
a linear response range (60,000-90,000 ‘counts’).
We try to keep responses of all lasers within ~15% of each other.
We look for ‘red pixels’ to determine oversaturation of the PMT.
Each 40-volt change in PMT voltage gives an ~2-fold change in sensitivity.
Once satisfied, we scan again at 100 micron (high resolution).
Back-down PMT voltage 10 volts for this scan.
Cannot assume that all gels will use same PMT voltages. Empirical.
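The rule of thumb above (about a 2-fold change in sensitivity per 40 volts) can be turned into a rough starting estimate for the next pre-scan. This is only a sketch under that assumption: it treats the response as exactly exponential in voltage, the default target of 75,000 counts is simply the midpoint of the 60,000 to 90,000 range, and the function name is hypothetical. The final setting still has to be found empirically, as the slide notes.

```python
import math

VOLTS_PER_DOUBLING = 40.0  # from the slide: ~2-fold sensitivity change per 40 V


def suggest_pmt_voltage(current_volts, measured_counts, target_counts=75_000):
    """Estimate the PMT voltage that would move measured_counts to target_counts."""
    doublings = math.log2(target_counts / measured_counts)
    return current_volts + VOLTS_PER_DOUBLING * doublings


# Example: a pre-scan at the 600 V starting point gave a spot volume of 40,000 counts.
print(round(suggest_pmt_voltage(600, 40_000, target_counts=80_000)))
print(round(suggest_pmt_voltage(600, 160_000, target_counts=80_000)))
```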
Typhoon Trio
Pre-Scanning
Do not pry open the gels.
Wrap gels in plastic wrap.
Store at 4 °C until scanning.
Scanning should be performed as soon as possible to reduce diffusion.
Post-Scanning
Pry open the gels.
Gel should stick to Bind-Silane coated plate.
Fix gel overnight in MeOH/Acetic acid for Deep Purple Staining.
Clean upper plate.
Eventually, the Bind-Silane coated plate is cleaned with NaOH.
Deep Purple Staining
Procedure for Backed gels (large gels):
Incubate backed gel in 500 ml Fix Solution with gentle agitation OVERNIGHT.
Pour off the Fix Solution and replace with 500 ml Wash Solution.
Agitate gently for 30 minutes.
Replace with 250 ml of Milli-Q Water.
Remove Deep Purple from freezer; allow to stand at RT for 10 minutes before opening.
Shake bottle of Deep Purple well.
Add 1.25 ml Deep Purple to the water.
Incubate with gentle agitation for 1 hour; KEEP IN DARK.
Pour off stain (do not use again).
Replace with Storage Solution.
Cover and store in a dark place until imaging on Typhoon Trio.