Mascot Example Slides - Proteomics Lab in University of

Introduction to mass spectrometry-based protein identification and
quantification
Austin Yang, Ph.D.
Aebersold R, Mann M.
Mass spectrometry-based proteomics.
Nature. 2003 Mar 13;422(6928):198-207. Review.
Mueller LN, Brusniak MY, Mani DR, Aebersold R
An assessment of software solutions for the
analysis of mass spectrometry based quantitative
proteomics data.
J Proteome Res. 2008 Jan;7(1):51-61.
The typical proteomics experiment consists
of five stages
Mass spectrometers used in proteome
research.
Monoisotopic Mass = 1155.6
Average Mass = 1156.3 (calculated)
As shown in Figure 1, the monoisotopic
mass of this compound is 1155.6. For a
given compound the monoisotopic mass is
the mass of the isotopic peak whose
elemental composition is composed of the
most abundant isotopes of those elements.
The monoisotopic mass can be calculated
using the atomic masses of the isotopes.
The average mass is the weighted average
of the isotopic masses weighted by the
isotopic abundances. The average mass
can be calculated using the atomic weights
of the elements.
www.ionsource.com
Atomic Masses and Abundances for a Subset of Naturally Occurring Biologically
Relevant Isotopes
Isotope   Atomic mass (u)        Abundance (%)
12C       12                     98.93(8)
13C       13.0033548378(10)      1.07(8)
14C       14.003241988(4)        -
1H        1.0078250321(4)        99.9885(70)
2H        2.0141017780(4)        0.0115(70)
3H        3.0160492675(11)       -
14N       14.0030740052(9)       99.632(7)
15N       15.0001088984(9)       0.368(7)
16O       15.9949146221(15)      99.757(16)
17O       16.99913150(22)        0.038(1)
18O       17.9991604(9)          0.205(14)
32S       31.97207069(12)        94.93(31)
33S       32.97145850(12)        0.76(2)
34S       33.96786683(11)        4.29(28)
36S       35.96708088(25)        0.02(1)
19F       18.99840320(7)         100
23Na      22.98976967(23)        100
39K       38.9637069(3)          93.2581(44)
40K       39.96399867(29)        0.0117(1)
41K       40.96182597(28)        6.7302(44)
31P       30.97376151(20)        100
35Cl      34.96885271(4)         75.781(4)
37Cl      36.96590260(5)         24.22(4)
55Mn      54.9380496(14)         100
54Fe      53.9396148(14)         5.845(35)
56Fe      55.9349421(15)         91.754(36)
57Fe      56.9353987(15)         2.119(10)
58Fe      57.9332805(15)         0.282(4)
63Cu      62.9296011(15)         69.17(3)
65Cu      64.9277937(19)         30.83(3)
79Br      78.9183376(20)         50.69(7)
81Br      80.916291(3)           49.31(7)
127I      126.904468(4)          100
Values in parentheses are the uncertainty in the last digits. For each element the
isotopes are listed from lightest (A) to heaviest (A+1, A+2, ...).
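The two definitions above can be checked numerically. The following is a minimal sketch, not part of the original slides: it takes masses and abundances for C, H, N, O and S from the isotope table and applies them to the cysteine residue formula C3H5NOS that appears later in the deck. The function names and the dictionary layout are illustrative assumptions.

```python
# Isotope data (mass in u, fractional abundance) taken from the isotope table.
ISOTOPES = {
    "C": [(12.0, 0.9893), (13.0033548378, 0.0107)],
    "H": [(1.0078250321, 0.999885), (2.0141017780, 0.000115)],
    "N": [(14.0030740052, 0.99632), (15.0001088984, 0.00368)],
    "O": [(15.9949146221, 0.99757), (16.99913150, 0.00038), (17.9991604, 0.00205)],
    "S": [(31.97207069, 0.9493), (32.97145850, 0.0076),
          (33.96786683, 0.0429), (35.96708088, 0.0002)],
}


def monoisotopic_mass(formula):
    """Sum the mass of the most abundant isotope of each element."""
    total = 0.0
    for element, count in formula.items():
        mass, _abundance = max(ISOTOPES[element], key=lambda iso: iso[1])
        total += mass * count
    return total


def average_mass(formula):
    """Sum the abundance-weighted atomic weight of each element."""
    total = 0.0
    for element, count in formula.items():
        atomic_weight = sum(mass * abundance for mass, abundance in ISOTOPES[element])
        total += atomic_weight * count
    return total


# Cysteine residue, C3H5NOS (formula from the alkylation slide).
cysteine = {"C": 3, "H": 5, "N": 1, "O": 1, "S": 1}
mono = monoisotopic_mass(cysteine)
avg = average_mass(cysteine)
print(f"monoisotopic = {mono:.5f}, average = {avg:.4f}")
```

The average mass comes out higher than the monoisotopic mass because every heavier isotope pulls the weighted mean upward.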
Peak Abundance, “Mass Crossover” and
Calibration
The Nobel Prize in Chemistry 2002
"for the development of methods for identification and
structure analyses of biological macromolecules"
"for their development of soft desorption ionisation
methods for mass spectrometric analyses of biological
macromolecules"
John Fenn
Koichi Tanaka
Mass Spectrometry:
A method to “weigh” molecules
A simple measurement of mass is used to confirm the identity of a molecule,
but it can be used for much more.
Other information can be inferred from a weight measurement:
• Post-translational modifications
• Molecular interactions
• Shape
• Sequence
• Physical dimensions
• etc.
Matrix-assisted Laser
Desorption/Ionization (MALDI)
Time-of-Flight (TOF) Analyzer
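[Figure: MALDI-TOF schematic. A laser strikes the sample held at high voltage; ions of mass m1, m2, m3 leave with velocities v1, v2, v3, separate in the drift region, and arrive at the detector.]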
Electrospray: Generation of aerosols and droplets
“Wings to Molecular Elephants”
Electrospray Ionization (ESI)
[Figure: ESI-MS of highly charged droplets. The spectrum shows charge states 14+ to 22+ along the mass/charge (m/z) axis, from about 500 to 1100.]
• Multiple charging
– More charges for larger molecules
• MW range > 150 kDa
• Liquid introduction of analyte
– Interface with liquid separation methods, e.g. liquid chromatography
– Tandem mass spectrometry (MS/MS) for protein sequencing
Origin of the ES Spectra of Peptides
1+: m/z = (Mr+H)
2+: m/z = (Mr+2H)/2
3+: m/z = (Mr+3H)/3
4+: m/z = (Mr+4H)/4
[Figure: ES-MS spectrum, relative intensity vs. m/z, with peaks for the 4+, 3+, 2+ and 1+ ions of the same peptide.]
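The charge-state relation m/z = (Mr + zH)/z can be tried out with a few lines of Python. This is a minimal sketch, not taken from the slides: the proton mass value, the function names and the example Mr are assumptions chosen for illustration.

```python
PROTON = 1.007276  # mass of the added proton "H" in u (assumed value)


def mz(mr, z):
    """m/z of a peptide of neutral mass mr carrying z protons."""
    return (mr + z * PROTON) / z


def neutral_mass(mz_value, z):
    """Invert the relation: recover Mr from an observed m/z and charge."""
    return mz_value * z - z * PROTON


mr = 1000.0  # hypothetical neutral peptide mass
for z in range(1, 5):
    print(f"{z}+: m/z = {mz(mr, z):.4f}")
```

Higher charge states appear at lower m/z, which is why large proteins fall inside the limited m/z range of the analyzer.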
Theoretical CID of a Tryptic Peptide
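[Figure: CID of the singly charged tryptic peptide FLGK. Parent ions fragment into daughter ions: b ions from the N-terminus (b1 = F, b2 = FL, b3 = FLG) and y ions from the C-terminus (y1 = K, y2 = GK, y3 = LGK). Some parent ions remain non-dissociated. The resulting MS/MS spectrum shows the b and y series along the m/z axis, from which the sequence F-L-G-K can be read, with the parent ion at m/z 464.29.]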
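The b and y series for FLGK can be computed directly. The sketch below is an illustration rather than the method used to draw the slide: the residue masses, water mass and proton mass are standard monoisotopic values supplied here as assumptions, and the function names are hypothetical.

```python
# Monoisotopic residue masses in u (assumed standard values, not from the slides).
RESIDUE = {"F": 147.06841, "L": 113.08406, "G": 57.02146, "K": 128.09496}
WATER = 18.010565
PROTON = 1.007276


def fragment_ions(peptide):
    """Return singly charged b and y ion m/z lists (b1..bn-1, y1..yn-1)."""
    b_ions, y_ions = [], []
    for i in range(1, len(peptide)):
        # b ion: first i residues plus a proton
        b_ions.append(sum(RESIDUE[aa] for aa in peptide[:i]) + PROTON)
        # y ion: last i residues plus water plus a proton
        y_ions.append(sum(RESIDUE[aa] for aa in peptide[-i:]) + WATER + PROTON)
    return b_ions, y_ions


def precursor_mh(peptide):
    """Singly protonated precursor m/z, [M+H]+."""
    return sum(RESIDUE[aa] for aa in peptide) + WATER + PROTON


b_ions, y_ions = fragment_ions("FLGK")
print("b:", [round(m, 2) for m in b_ions])
print("y:", [round(m, 2) for m in y_ions])
print("[M+H]+:", round(precursor_mh("FLGK"), 2))
```

The computed [M+H]+ agrees with the 464.29 shown on the slide.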
Peptide Sequencing by LC/MS/MS
Web addresses of some representative internet
resources for protein identification from mass
spectrometry data
Program            Web Address
BLAST              http://www.ebi.ac.uk/blastall/
Mascot             http://www.matrixscience.com/cgi/index.pl?page=/home.html
MassSearch         http://cbrg.inf.ethz.ch/Server/ServerBooklet/MassSearchEx.html
MOWSE              http://srs.hgmp.mrc.ac.uk/cgi-bin/mowse
PeptideSearch      http://www.narrador.emblheidelberg.de/GroupPages/PageLink/peptidesearchpage.html
Protein Prospector http://prospector.ucsf.edu/
Prowl              http://prowl.rockefeller.edu/
SEQUEST            http://fields.scripps.edu/sequest/
Data Mining through SEQUEST and
PAULA
Database                                Search Time
• Yeast ORFs (6,351 entries)            52 sec: 0.104 sec/s
• Non-redundant protein (100K entries)  3500 min:
• EST (100K entries, 3-frames)          5-10,000 min:
SEQUEST Algorithm
Step 1.
Determine the parent ion molecular mass from the experimental MS/MS spectrum.
ZSA-charge assignment
Step 2.
500 peptides with masses closest to that of the parent ion are retrieved
from a protein database. The computer generates a theoretical MS/MS spectrum
for each peptide sequence (SEQ 1, 2, 3, 4, …).
Step 3.
The experimental MS/MS spectrum is compared with each theoretical spectrum
and correlation scores are assigned.
Step 4.
Scores are ranked and protein identifications are made based on these
cross-correlation scores.
Unified Scoring Function
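The four steps can be sketched in Python. This is a toy illustration under stated assumptions, not the SEQUEST implementation: the score is a simple shared-peak count rather than SEQUEST's cross-correlation, and the database, peak list, residue masses and function names are all hypothetical.

```python
# Monoisotopic residue masses in u (assumed standard values).
RESIDUE = {"A": 71.03711, "F": 147.06841, "G": 57.02146,
           "K": 128.09496, "L": 113.08406}
WATER, PROTON = 18.010565, 1.007276


def peptide_mass(seq):
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE[aa] for aa in seq) + WATER


def theoretical_spectrum(seq):
    """Singly charged b and y ion m/z values for a peptide."""
    ions = []
    for i in range(1, len(seq)):
        ions.append(sum(RESIDUE[aa] for aa in seq[:i]) + PROTON)          # b ion
        ions.append(sum(RESIDUE[aa] for aa in seq[i:]) + WATER + PROTON)  # y ion
    return sorted(ions)


def shared_peak_score(experimental, theoretical, tol=0.5):
    """Count theoretical ions matched by an experimental peak (NOT XCorr)."""
    return sum(any(abs(e - t) <= tol for e in experimental) for t in theoretical)


def search(parent_mass, experimental, database, n_candidates=500):
    # Step 2: retrieve the candidates closest in mass to the parent ion.
    candidates = sorted(database,
                        key=lambda s: abs(peptide_mass(s) - parent_mass))[:n_candidates]
    # Step 3: score the experimental spectrum against each theoretical spectrum.
    scored = [(shared_peak_score(experimental, theoretical_spectrum(s)), s)
              for s in candidates]
    # Step 4: rank by score, best first.
    return sorted(scored, reverse=True)


# Step 1: parent ion mass from the observed m/z and an assumed charge of 1+.
parent_mass = (464.29 - PROTON) * 1
database = ["FLGK", "LFGK", "GLFK", "AAGK", "FAAK"]       # hypothetical database
experimental = [148.08, 204.13, 261.16, 317.22, 318.18]   # hypothetical peak list
ranking = search(parent_mass, experimental, database, n_candidates=3)
print(ranking)
```

The three permutations of FLGK have the same mass, so only the fragment comparison in Step 3 separates them.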
Amplification of False Positive Error Rate
from Peptide to Protein Level
[Figure: Ten peptide identifications (Peptide 1 to Peptide 10), 5 of them correct (+), mapped to seven proteins. Proteins in the sample (Prot A, Prot B) are enriched for ‘multi-hit’ proteins; proteins not in the sample are enriched for ‘single hits’.]
Peptide Level: 50% False Positives
Protein Level: 71% False Positives
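The jump from 50% to 71% follows from simple counting. In the sketch below, the assignment of peptides to proteins is an assumption chosen to match the slide's totals (5 correct peptides on 2 real proteins, 5 incorrect peptides each on a different wrong protein); the protein names for the wrong hits are hypothetical.

```python
# (protein, peptide identification is correct?) -- hypothetical assignment.
hits = (
    [("Prot A", True)] * 3
    + [("Prot B", True)] * 2
    + [(f"Wrong protein {i}", False) for i in range(1, 6)]
)

# Peptide level: fraction of incorrect peptide identifications.
peptide_fp = sum(not correct for _protein, correct in hits) / len(hits)

# Protein level: a protein is a true hit if any of its peptides is correct.
proteins = {}
for protein, correct in hits:
    proteins[protein] = proteins.get(protein, False) or correct
protein_fp = sum(not ok for ok in proteins.values()) / len(proteins)

print(f"peptide level: {peptide_fp:.0%}, protein level: {protein_fp:.0%}")
```

Correct peptides cluster on a few proteins while random matches scatter across many, so the protein-level error rate is higher than the peptide-level one.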
Quantitative Mass Spec Analysis
1. Relative Quantitation
a. SILAC and iTRAQ
b. Digestion with Oxygen-18 Water
c. Spectral Counting and Non-labeling Methodology
2. Absolute Quantitation
Trypsin Digestion with Oxygen-18 and
Oxygen-16 Water
Limitation of SILAC
Multiplexed Isobaric Tagging Technology
(iTRAQ)
Isobaric Tag = 145
Philip L. Ross, et al. Molecular & Cellular Proteomics 3:1154–1169, 2004.
Release of 114 and 117 Reporter Ions
[Figure: MS/MS spectrum BSA114_115_21 #6272, RT: 33.88, AV: 1, NL: 1.99E6, T: ITMS + c ESI d Full ms2 [email protected] [50.00-1965.00]. Intensity (0 to 100) vs. m/z (200 to 1800). Labelled peaks include 650.29, annotated “Parent Ion”.]
[Figure: MS/MS spectrum BSA114_117_31 #6446, RT: 34.78, AV: 1, NL: 1.58E5, T: ITMS + c ESI d Full ms2 [email protected] [50.00-1965.00]. Intensity (0 to 100) vs. m/z (200 to 1800). Labelled peaks include 371.56 and 927.89. Annotation: Regular CID to obtain sequence. Low mass cut-off and no reporter ion.]
[Figure: MS/MS spectrum BSA114_117_35 #6127, RT: 32.86, AV: 1, NL: 1.34E4, T: ITMS + c ESI d Full ms2 [email protected] [50.00-1965.00]. Intensity (0 to 100) vs. m/z (200 to 1800). Labelled peaks include 114.05, 145.06 and 928.57. Annotation: High Energy Collision Cell to quantify and sequence.]
PSD_117 : PSD_114 = 2:1
Loading 10 µg
9 salt cuts, online 2D-LC-MS/MS
962 proteins are quantified
[Figure: Histogram of the 117/114 ratio (x-axis, 0 to 5.25) against the number of proteins (y-axis, 0 to 350), with the expected ratio marked.]
Protein name   117/114 ratio   Num. of pep
PSD93          2.829           5
PSD95          2.021           21
PSD95-AP1      1.764           2
GABA alpha     1.365           2
GABA beta      2.087           3
NR2B           1.813           4
AMPA1          2.092           7
AMPA2          1.921           11
AMPA4          1.902           4
NR1            1.658           6
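The listed 117/114 ratios can be compared against the expected 2:1 mixing ratio. This is a minimal sketch: the pairing of protein names with ratio and peptide-count rows is assumed from the order in which they appear on the slide, and weighting each protein by its peptide count is an illustrative choice, not a method stated in the deck.

```python
# (protein, 117/114 ratio, number of peptides); pairing assumed from slide order.
rows = [
    ("PSD93", 2.829, 5), ("PSD95", 2.021, 21), ("PSD95-AP1", 1.764, 2),
    ("GABA alpha", 1.365, 2), ("GABA beta", 2.087, 3), ("NR2B", 1.813, 4),
    ("AMPA1", 2.092, 7), ("AMPA2", 1.921, 11), ("AMPA4", 1.902, 4),
    ("NR1", 1.658, 6),
]
EXPECTED_RATIO = 2.0  # PSD_117 : PSD_114 = 2:1

total_peptides = sum(n for _name, _ratio, n in rows)
simple_mean = sum(ratio for _name, ratio, _n in rows) / len(rows)
# Weight each protein by its peptide count, so well-sampled proteins count more.
weighted_mean = sum(ratio * n for _name, ratio, n in rows) / total_peptides

print(f"simple mean   = {simple_mean:.3f}")
print(f"weighted mean = {weighted_mean:.3f} (expected {EXPECTED_RATIO})")
```

Proteins quantified from only two peptides deviate most from 2.0, which is one reason to weight by peptide count.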
Absolute Quantification
Johri et al. Nature Reviews Microbiology 4, 932–942 (December 2006) | doi:10.1038/nrmicro1552
Public Web Server
http://www.matrixscience.com/search_form_select.html
Class Data Download:
http://10.90.157.112/GPLS716
Local Web Server
http://10.90.157.112/mascot
Username: GPILS
Password: GPILS
MS1 PMF (peptide mass fingerprinting)
Search Example
• Data: testms1.txt, 210 MS1 peaks
• Database: bovine
• Fixed modifications: Carboxymethyl (C)
• Variable modifications: Oxidation (M)
• Peptide Tolerance: 0.1 Da
• Monoisotopic mass
• Mass Value: Mr
Quantification Search Example
• Note: use “Save link as” to save this file to the desktop.
• Data: 18O_BSA_100fmol_1to5_01_071018.RAW.mgf
• Database: bovine
• Fixed modifications: Carbamidomethyl (C)
• Peptide Tolerance: 8 Da (required for O18 labeling)
• Fragment Tolerance: 0.2 Da
• Peptide Charge: Mr
• Quantification Method: 18O corrected multiplex
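The size of the 18O mass shift gives some context for the wide peptide tolerance. Trypsin-catalyzed labeling can exchange up to two C-terminal oxygens for 18O. The sketch below computes the resulting shifts from the isotope masses in the table. Treating the 8 Da window as a way to keep labeled and unlabeled forms within one search is an assumption, not something the slides state.

```python
O16 = 15.9949146221  # mass of 16O in u, from the isotope table
O18 = 17.9991604     # mass of 18O in u, from the isotope table

shift_per_label = O18 - O16           # one 18O incorporated
shift_two_labels = 2 * shift_per_label  # both C-terminal oxygens exchanged


def mz_shift(n_labels, z):
    """Spacing on the m/z axis between unlabeled and labeled peptide at charge z."""
    return n_labels * shift_per_label / z


print(f"one 18O:  +{shift_per_label:.4f} Da")
print(f"two 18O:  +{shift_two_labels:.4f} Da")
print(f"two 18O at 2+: +{mz_shift(2, 2):.4f} m/z")
```

Both shifts sit well inside an 8 Da window, while a 0.1 Da tolerance would exclude the labeled form.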
MS/MS Database Search Example
• Data: BSA onespectra.mgf (one spectrum)
• Database: bovine
• Fixed modifications: Carboxymethyl (C + 58.01)
• Variable modifications: Oxidation (M)
• Peptide Mass Tolerance: 0.1 Da
• Fragment Mass Tolerance: 0.1 Da
• http://www.matrixscience.com/help/fragmentation_help.html
Alkylation of Cysteine Residue
Residue             Formula     Monoisotopic mass
Cysteine            C3H5NOS     103.00918
Carboxymethyl Cys   C5H7NO3S    161.01466
Difference                      58.00548
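The residue masses and the 58.00548 difference can be reproduced from the two formulas. This is a minimal sketch using the monoisotopic masses from the isotope table; the small formula parser and the function name are illustrative assumptions and handle only simple formulas without brackets.

```python
import re

# Monoisotopic atomic masses in u, from the isotope table.
MONO = {
    "C": 12.0,
    "H": 1.0078250321,
    "N": 14.0030740052,
    "O": 15.9949146221,
    "S": 31.97207069,
}


def formula_mass(formula):
    """Monoisotopic mass of a simple formula such as 'C3H5NOS'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONO[element] * (int(count) if count else 1)
    return total


cys = formula_mass("C3H5NOS")
cm_cys = formula_mass("C5H7NO3S")
delta = cm_cys - cys

print(f"Cysteine           {cys:.5f}")
print(f"Carboxymethyl Cys  {cm_cys:.5f}")
print(f"Difference         {delta:.5f}")
```

The difference corresponds to the added C2H2O2 and is the value entered as the fixed modification Carboxymethyl (C + 58.01).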
MS2 mixture example
• Data: mixture10spectra.mgf
• Database: yeast
• Fixed modifications: Carbamidomethyl (C + 57.02)
• Variable modifications: Oxidation (M)
• Peptide Mass Tolerance: 0.1 Da
• Fragment Mass Tolerance: 0.1 Da
Homework
1. Download your datasets from the following
URL: http://10.90.157.112/GPLS716
a. Identification of phosphorylation site. Data: BIG3021307.RAW.mgf
Recommended parameters:
Database: human
Variable modification: Phospho (ST)
Fixed modification: Carbamidomethyl (C)
b. Quantification of oxygen-18/oxygen-16 digested BSA
Data: 18O_BSA_500fmol_071013.RAW.mgf
Submit your search results in PDF or HTML format to the following email address:
[email protected]. Please include the following information when you submit
your homework:
1. Your name and ID in the subject of your email
2. Search parameters
3. A short summary of your search results.
Questions: Contact Yunhu Wan, email: [email protected]
Phone number: 8-2031