PRG2007 Research Study
Advanced Quantitative Proteomics
http://www.abrf.org/prg
ABRF PRG 2007
PRG Members
• Arnold Falick (Chair) – UC Berkeley HHMI
• William Lane (EB Liaison) – Harvard University
• Kathryn Lilley (ad hoc) – University of Cambridge
• Michael MacCoss – University of Washington
• Brett Phinney – UC Davis Genome Center
• Nicholas Sherman – University of Virginia
• Susan Weintraub – Univ. Texas Health Science Center
• Ewa Witkowska – UC San Francisco
• Nathan Yates – Merck Research Laboratories
Past Research Studies
• PRG2002: Identification of Proteins in a Simple Mixture
– Task: Identify components of a 5 protein mixture
• PRG2003: Phosphorylation Site Determination
– Task: Identify 2 phosphopeptides and sites of phosphorylation
• PRG2004: Differentiation of Protein Isoforms
– Task: Discrimination of 3 closely related proteins
• PRG2005: Sequencing Unknown Peptides
– Task: De novo sequence analysis of 5 peptide mixture
• PRG2006: Quantification of Proteins from a Simple Mixture
– Task: Relative Abundance of 8 Proteins Between 2 Different
Samples
PRG2007 Study Objectives
• What methods are used in the community for
assessing differences between complex mixtures?
• How well established are quantitative
methodologies in the community?
• What is the accuracy of the quantitative data
acquired in core facilities?
• We wanted to build upon last year's study by
providing samples that were more complicated, yet
more realistic.
PRG2007 Sample Design
Sample A: 100 µg E. coli lysate + 12 total protein spikes (10 non-E. coli proteins, 2 E. coli proteins)
Sample B: 100 µg E. coli lysate + 12 total protein spikes (10 non-E. coli proteins, 2 E. coli proteins)
Sample C: 100 µg E. coli lysate + 12 total protein spikes (10 non-E. coli proteins, 2 E. coli proteins)
Samples B and C: Identical
Spikes at Different Levels and Ratios
PRG2007 Study Tasks
• Identify the proteins whose amounts were altered
between the samples
• Determine the relative amounts of the proteins
between samples
[Figure: bar chart "Proteins in PRG2007 Sample" showing the quantity (pmol) of each spiked protein in Sample A and Sample B, annotated with B/A ratios; * marks E. coli proteins.]
Proteins in PRG2007 Sample
Table 1. Proteins in PRG07 study samples

Protein [2]               Accession     M.W.    Quantity (pmol) [1]      Ratio
                          Number [3]    (kDa)        A         B         (B/A)
Carbonic anhydrase I          69      28.9      2.50      1.14       0.45
Catalase                     465      57.5      0.50      0.34       0.67
Cytochrome c                3870      13.0      2.50     11.50       4.60
Glucose oxidase              152      80.0      0.50      0.33       0.67
Glycerokinase [4]            904      54.0      2.50      0.78       0.31
Hexokinase                  2938      50.0      0.50      0.16       0.31
Horseradish peroxidase      2466      43.3      5.00     11.00       2.20
Lactoperoxidase             2648      77.5      2.50      0.78       0.31
Myoglobin                    161      16.5      0.50      5.00      10.00
Serum albumin               1213      66.6      5.00      3.33       0.67
Tryptophanase [4]           2366      51.0      5.00      1.56       0.31
Ubiquitin                   4014       8.7      5.00     23.00       4.60

[1] Sample C contained the same quantities of protein as sample B.
[2] Proteins were purchased from Sigma-Aldrich.
[3] Accession number in PRG database
[4] Added E. coli proteins
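The B/A column of Table 1 can be recomputed from the A and B quantities. The sketch below does so in Python; the names `TABLE_1`, `ratio_b_over_a`, and `check_table` are illustrative and not part of the study. Because the B quantities are rounded in the table, a few recomputed ratios differ from the tabulated ones by about 0.01, so the check uses an assumed tolerance of 0.02.

```python
# Values copied from Table 1: protein -> (Sample A pmol, Sample B pmol, tabulated B/A).
TABLE_1 = {
    "Carbonic anhydrase I": (2.50, 1.14, 0.45),
    "Catalase": (0.50, 0.34, 0.67),
    "Cytochrome c": (2.50, 11.50, 4.60),
    "Glucose oxidase": (0.50, 0.33, 0.67),
    "Glycerokinase": (2.50, 0.78, 0.31),
    "Hexokinase": (0.50, 0.16, 0.31),
    "Horseradish peroxidase": (5.00, 11.00, 2.20),
    "Lactoperoxidase": (2.50, 0.78, 0.31),
    "Myoglobin": (0.50, 5.00, 10.00),
    "Serum albumin": (5.00, 3.33, 0.67),
    "Tryptophanase": (5.00, 1.56, 0.31),
    "Ubiquitin": (5.00, 23.00, 4.60),
}


def ratio_b_over_a(a_pmol, b_pmol):
    """Anticipated mole ratio of Sample B to Sample A."""
    return b_pmol / a_pmol


def check_table(table, tolerance=0.02):
    """Return proteins whose recomputed ratio is off by more than `tolerance`.

    The tolerance is an assumption chosen to absorb rounding in the table.
    """
    return [
        name
        for name, (a, b, tabulated) in table.items()
        if abs(ratio_b_over_a(a, b) - tabulated) > tolerance
    ]


if __name__ == "__main__":
    for name, (a, b, tabulated) in TABLE_1.items():
        print(f"{name:<24} B/A = {ratio_b_over_a(a, b):5.2f} (table: {tabulated:5.2f})")
```

An empty list from `check_table(TABLE_1)` means every tabulated ratio is consistent with its quantities within the assumed tolerance.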
Protein Sequence Database
>gi|16131131|ref|NP_417708.1| putative membrane protein [Escherichia coli K12]
MKTLIRKFSRTAITVVLVILAFIAIFNAWVYYTESPWTRDARFSADVVAIAPDVSGLITQVNVHDNQLVK
KGQILFTIDQPRYQKALEEAQADVAYYQVLAQEKRQEAGRRNRLGVQAMSREEIDQANNVLQTVLHQLAK
AQATRDLAKLDLERTVIRAPADGWVTNLNVYTGEFITRGSTAVALVKQNSFYVLAYMEETKLEGVRPGYR
AEITPLGSNKVLKGTVDSVAAGVTNASSTRDDKGMATIDSNLEWVRLAQRVPVRIRLDNQQENIWPAGTT
ATVVVTGKQDRDESQDSFFRKMAHRLREFG
Was converted to:
>PRG_seq_5 ABRF_PRG2007_Protein_5
MKTLIRKFSRTAITVVLVILAFIAIFNAWVYYTESPWTRDARFSADVVAIAPDVSGLITQVNVHDNQLVK
KGQILFTIDQPRYQKALEEAQADVAYYQVLAQEKRQEAGRRNRLGVQAMSREEIDQANNVLQTVLHQLAK
AQATRDLAKLDLERTVIRAPADGWVTNLNVYTGEFITRGSTAVALVKQNSFYVLAYMEETKLEGVRPGYR
AEITPLGSNKVLKGTVDSVAAGVTNASSTRDDKGMATIDSNLEWVRLAQRVPVRIRLDNQQENIWPAGTT
ATVVVTGKQDRDESQDSFFRKMAHRLREFG
The file contains:
1) 4,346 protein sequences
2) common contaminants (e.g. keratins, trypsin)
3) an equal number of decoy sequences
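The header conversion and decoy step can be sketched in Python as follows. This is a minimal illustration, not the PRG's actual procedure: the slide does not say how entries were numbered or how decoys were generated, so sequential numbering, reversed-sequence decoys, and the `DECOY_` prefix are all assumptions, and the function names are hypothetical.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def anonymize(records):
    """Replace each header with the PRG_seq_N pattern shown on the slide.

    ASSUMPTION: entries are numbered sequentially in file order.
    """
    for index, (_, sequence) in enumerate(records, start=1):
        yield f"PRG_seq_{index} ABRF_PRG2007_Protein_{index}", sequence


def with_decoys(records):
    """Append one decoy per entry, giving an equal number of decoys.

    ASSUMPTION: decoys are reversed sequences with a DECOY_ prefix.
    """
    records = list(records)
    decoys = [(f"DECOY_{header}", sequence[::-1]) for header, sequence in records]
    return records + decoys


def to_fasta(records, width=70):
    """Serialize records back to FASTA text, wrapping sequences at `width`."""
    lines = []
    for header, sequence in records:
        lines.append(">" + header)
        lines.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "\n".join(lines) + "\n"
```

A typical call would be `to_fasta(with_decoys(anonymize(parse_fasta(text))))`, where `text` holds the original database contents.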
Samples Analyzed by 2D DIGE
Sample A
Sample B
Sample C
A pooled standard of all three samples was made and labelled with Cy5 (red). The samples
were then labelled individually with Cy3 (green) and each gel was run with a single sample
versus pooled standard.
Samples by µLC-MS (1 µg on column)
Base Peak Chromatograms
[Figure: base peak chromatograms for Sample A and Sample B; x-axis: Time (min), 25 to 185; y-axis scaled 0 to 100.]
Demographics of the Participants
Non-member: 21 (49%)
ABRF member: 22 (51%)
Demographics of the Participants
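Total Participants = 43
Quantitative Data Returned = 35
[Figure: two pie charts of participants by sector (Academia, Vendor, Other (non-profit), Government, Biotech/pharma). Slice values for the 43 total participants (unordered): 26 (60%), 8 (19%), 6 (14%), 2 (4.7%), 1 (2.3%). Slice values for the 35 who returned quantitative data (unordered): 22 (63%), 7 (20%), 4 (11%), 1 (2.9%), 1 (2.9%).]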
87 Labs Requested Samples: 49% Return Rate
PRG2007 Abbreviations
DIGE: Differential In-Gel Electrophoresis
ICPL: Isotope Coded Protein Label
iTRAQ: isobaric Tags for Relative and Absolute Quantitation
ICAT: Isotope Coded Affinity Tag
18O: Stable Oxygen Isotope Label
SRM: Selected Reaction Monitoring
Methods Used by the 35 Participants Who Returned Data
MS-Based: 26 (74%)
Gel-Based: 9 (26%)
Techniques Applied
[Figure: pie chart of techniques applied. Legend: iTRAQ, ICPL, ICAT, 18O-Labeling, Label Free, 2D Gels (nonDIGE), 2D DIGE. Slice values (unordered): 16 (44%), 6 (17%), 5 (14%), 5 (14%), 2 (5.6%), 1 (2.8%), 1 (2.8%).]
Results: True Positives vs False Positives
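The slides do not state how true and false positives were counted. A minimal sketch, assuming that a true positive is any reported protein among the 12 spikes in Table 1 and a false positive is any other reported protein, could look like this. The function name `score_submission` and the example submission are hypothetical.

```python
# The 12 spiked proteins listed in Table 1.
SPIKED = frozenset({
    "Carbonic anhydrase I", "Catalase", "Cytochrome c", "Glucose oxidase",
    "Glycerokinase", "Hexokinase", "Horseradish peroxidase", "Lactoperoxidase",
    "Myoglobin", "Serum albumin", "Tryptophanase", "Ubiquitin",
})


def score_submission(reported):
    """Count true positives, false positives, and missed spikes.

    ASSUMPTION: a protein reported as changed counts as a true positive
    only if it is one of the 12 spiked proteins.
    """
    reported = set(reported)
    return {
        "true_positives": len(reported & SPIKED),
        "false_positives": len(reported - SPIKED),
        "missed": len(SPIKED - reported),
    }


if __name__ == "__main__":
    # Hypothetical submission, for illustration only.
    print(score_submission(["Ubiquitin", "Myoglobin", "Hypothetical protein X"]))
```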
Quantitative Accuracy: Ubiquitin
A = 5 pmol, B = 23 pmol
Anticipated Mole Ratio (B/A): 4.6
[Figure: B/A ratio reported by each participant, grouped as 2D Gels, Label Free, and Stable Isotope Labeling. Color indicates method used: iTRAQ, ICPL, ICAT, 18O Labeling, Label Free, Label Free + targeted SRM, 2D-Gels (nonDIGE), 2D-DIGE.]
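The accuracy slides plot reported B/A ratios against the anticipated mole ratio and do not name an error metric. One common way to compare the two, shown here only as a sketch, is the log2 of reported over anticipated, so that a twofold overestimate and a twofold underestimate score +1 and -1. The function name and the reported values in the example are hypothetical; only the anticipated ratio of 4.6 for ubiquitin comes from the study.

```python
import math


def log2_ratio_error(reported_ratio, anticipated_ratio):
    """Signed fold error on a log2 scale; 0.0 means a perfect match."""
    if reported_ratio <= 0 or anticipated_ratio <= 0:
        raise ValueError("ratios must be positive")
    return math.log2(reported_ratio / anticipated_ratio)


if __name__ == "__main__":
    anticipated = 4.6  # ubiquitin B/A from Table 1
    # Hypothetical reported ratios, for illustration only.
    for reported in (2.3, 4.6, 9.2):
        print(reported, round(log2_ratio_error(reported, anticipated), 2))
```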
Quantitative Accuracy: Myoglobin
A = 0.5 pmol, B = 5 pmol
Anticipated Mole Ratio (B/A): 10
[Figure: B/A ratio reported by each participant, grouped as 2D Gels, Label Free, and Stable Isotope Labeling. Color indicates method used: iTRAQ, ICPL, ICAT, 18O Labeling, Label Free, Label Free + targeted SRM, 2D-Gels (nonDIGE), 2D-DIGE.]
Quantitative Accuracy: Serum Albumin
A = 5 pmol, B = 3.3 pmol
Anticipated Mole Ratio (B/A): 0.67
[Figure: B/A ratio reported by each participant, grouped as 2D Gels, Label Free, and Stable Isotope Labeling. Color indicates method used: iTRAQ, ICPL, ICAT, 18O Labeling, Label Free, Label Free + targeted SRM, 2D-Gels (nonDIGE), 2D-DIGE.]
Quantitative Accuracy: Carbonic Anhydrase I
A = 2.5 pmol, B = 1.14 pmol
Anticipated Mole Ratio (B/A): 0.45
[Figure: B/A ratio reported by each participant, grouped as 2D Gels, Label Free, and Stable Isotope Labeling. Color indicates method used: iTRAQ, ICPL, ICAT, 18O Labeling, Label Free, Label Free + targeted SRM, 2D-Gels (nonDIGE), 2D-DIGE.]
Quantitative Accuracy: Glucose Oxidase
A = 0.5 pmol, B = 0.33 pmol
Anticipated Mole Ratio (B/A): 0.67
[Figure: B/A ratio reported by each participant, grouped as 2D Gels, Label Free, and Stable Isotope Labeling. Color indicates method used: iTRAQ, ICPL, ICAT, 18O Labeling, Label Free, Label Free + targeted SRM, 2D-Gels (nonDIGE), 2D-DIGE.]
Quantitative Accuracy: Hexokinase
A = 0.5 pmol, B = 0.16 pmol
Anticipated Mole Ratio (B/A): 0.31
[Figure: B/A ratio reported by each participant, grouped as 2D Gels, Label Free, and Stable Isotope Labeling. Color indicates method used: iTRAQ, ICPL, ICAT, 18O Labeling, Label Free, Label Free + targeted SRM, 2D-Gels (nonDIGE), 2D-DIGE.]
Quantitative Accuracy: Tryptophanase*
A = 5 pmol, B = 1.56 pmol
Anticipated Mole Ratio (B/A): from 1 to 0.31
[Figure: B/A ratio reported by each participant, grouped as 2D Gels, Label Free, and Stable Isotope Labeling. Color indicates method used: iTRAQ, ICPL, ICAT, 18O Labeling, Label Free, Label Free + targeted SRM, 2D-Gels (nonDIGE), 2D-DIGE.]
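Tryptophanase is one of the two added E. coli proteins, which offers a plausible reading of the anticipated range "from 1 to 0.31": any tryptophanase already present in the lysate adds equally to both samples and pulls the observed ratio toward 1. The sketch below illustrates that reading. The endogenous amount is not given in the study, so `endogenous_pmol` is a hypothetical parameter, and this interpretation is an assumption rather than something the slides state.

```python
def expected_ratio(endogenous_pmol, spike_a_pmol=5.0, spike_b_pmol=1.56):
    """Expected B/A ratio when both samples also contain endogenous protein.

    Spike defaults are the tryptophanase quantities from Table 1.
    ASSUMPTION: the endogenous amount is identical in Samples A and B.
    """
    return (endogenous_pmol + spike_b_pmol) / (endogenous_pmol + spike_a_pmol)


if __name__ == "__main__":
    # Hypothetical endogenous amounts in pmol, for illustration only.
    for endogenous in (0.0, 1.0, 10.0, 100.0, 1000.0):
        print(f"{endogenous:8.1f} pmol endogenous -> B/A = {expected_ratio(endogenous):.3f}")
```

With no endogenous protein the ratio equals the spike ratio (1.56 / 5 = 0.312), and it approaches 1 as the endogenous amount grows.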
Biggest Challenges Reported – Summary
• Complexity of the proteolytic digest. Long calculation times at several
analytical steps
• To find the resources: spent more than $1000 on [the study] and had
one technician busy for more than a week and a scientist for 2-3 days
• Finding the time
• No automation software available - too much hands-on work.
• Sample solubilization
• The ABRF fasta database: several search algorithms had problems.
• The limited number of replicates possible, which made it difficult to
determine a reasonable error rate and therefore whether a protein
is actually differentially expressed
• The MS identification of low abundance differential spots
Selected Comments
• The study was very good for researchers new to the proteomics field.
• This was an excellent learning experience. This study highlighted my
facility's capabilities (peptide fractionation and MS) and weaknesses
(chemical labeling of proteins and peptides and quant. analysis).
• This year's study was a much more realistic sample that imitates real
proteomic samples (without the dynamic range issue from
serum/plasma samples).
• Very interesting study because it addresses a 'real world' issue which is
the relative quantitation of a small number of proteins in a very complex
mixture.
• We didn't have enough time...
• The protein amount of these samples is small and so it is difficult to
have confident results.
Selected Comments -- Continued
• More sample, more time. We would have run these in at least triplicate
as per our routine operation if we had had more sample and time.
• Make sure the solubilisation is as good as possible: I did not obtain any
useful data from the samples, probably because I was not able to
solubilise the sample completely.
• not fun!!!
• Overall peak intensity of the samples was not as high as the expected
intensity for the amount of protein specified (100 µg) in the study.
• Liked it, because we could evaluate ourselves. For regular samples
(500 µg on gel) I always am able to confidently assign most proteins.
That was not so with the concentrations here.
Would you do this sort of study again?
Yes, it was fun: N = 38
No, it was too time consuming: N = 0
No way, never again: N = 0
Other: N = 4
[Figure: bar chart of the number of responses per answer; axis from 0 to 40.]
Other Responses:
• Yes, learned a lot, but need to watch resources
• Yes, but time issue
• Maybe
• Yes, but it was not fun
Conclusions
• Quantitative proteomics experiments are complex and
require many factors for success
• A handful of participants reported excellent results
indicating that quantitative results are achievable
• Participants using similar techniques did not obtain similar
performance, which suggests that expertise is a key factor
• Head-to-head comparisons of different approaches are not
possible because of the high dependence on expertise
• Interest in this area is high and many labs appear to be
developing these capabilities
Acknowledgements
• Kevin Hakala (UTHSCSA)
• Michelle Salemi (UC Davis)
• Rich Eigenheer (UC Davis)
• Matthew Russell (University of Cambridge)
• Ekaterina Deyanova (Merck Research Laboratories)
A huge thanks to all the labs that
participated in this year’s study!