Transcript Slide 1
PRG2007 Research Study
Advanced Quantitative Proteomics
http://www.abrf.org/prg
ABRF PRG 2007

Slide 2
PRG Members
• Arnold Falick (Chair) – UC Berkeley HHMI
• William Lane (EB Liaison) – Harvard University
• Kathryn Lilley (ad hoc) – University of Cambridge
• Michael MacCoss – University of Washington
• Brett Phinney – UC Davis Genome Center
• Nicholas Sherman – University of Virginia
• Susan Weintraub – Univ. Texas Health Science Center
• Ewa Witkowska – UC San Francisco
• Nathan Yates – Merck Research Laboratories

Slide 3
Past Research Studies
• PRG2002: Identification of Proteins in a Simple Mixture
  – Task: Identify components of a 5 protein mixture
• PRG2003: Phosphorylation Site Determination
  – Task: Identify 2 phosphopeptides and sites of phosphorylation
• PRG2004: Differentiation of Protein Isoforms
  – Task: Discrimination of 3 closely related proteins
• PRG2005: Sequencing Unknown Peptides
  – Task: De novo sequence analysis of 5 peptide mixture
• PRG2006: Quantification of Proteins from a Simple Mixture
  – Task: Relative Abundance of 8 Proteins Between 2 Different Samples

Slide 4
PRG2007 Study Objectives
• What methods are used in the community for assessing differences between complex mixtures?
• How well established are quantitative methodologies in the community?
• What is the accuracy of the quantitative data acquired in core facilities?
• We wanted to build upon last year's study by providing samples that were more complicated, yet more realistic.

Slide 5
PRG2007 Sample Design
Sample A, Sample B, Sample C (each):
• 100 µg E. coli lysate
• 12 Total Protein Spikes
  – 10 Non-E. coli proteins
  – 2 E. coli proteins
Diagram labels: "Identical"; "Spikes at Different Levels and Ratios"

Slide 6
PRG2007 Study Tasks
• Identify the proteins that had altered components between the samples
• Determine the relative amounts of the proteins between samples

Slide 7
Proteins in PRG2007 Sample
[Bar chart: Quantity (pmol) of each spiked protein in Sample A and Sample B, annotated with B/A ratios; * marks E. coli proteins]

Slide 8
Proteins in PRG2007 Sample
Table 1. Proteins in PRG07 study samples

Protein²                  Accession Number³   M.W. (kDa)   A (pmol)¹   B (pmol)¹   Ratio (B/A)
Carbonic anhydrase I      69                  28.9         2.50        1.14        0.45
Catalase                  465                 57.5         0.50        0.34        0.67
Cytochrome c              3870                13.0         2.50        11.50       4.60
Glucose oxidase           152                 80.0         0.50        0.33        0.67
Glycerokinase⁴            904                 54.0         2.50        0.78        0.31
Hexokinase                2938                50.0         0.50        0.16        0.31
Horseradish peroxidase    2466                43.3         5.00        11.00       2.20
Lactoperoxidase           2648                77.5         2.50        0.78        0.31
Myoglobin                 161                 16.5         0.50        5.00        10.00
Serum albumin             1213                66.6         5.00        3.33        0.67
Tryptophanase⁴            2366                51.0         5.00        1.56        0.31
Ubiquitin                 4014                8.7          5.00        23.00       4.60

¹ Sample C contained the same quantities of protein as sample B.
² Proteins were purchased from Sigma-Aldrich.
³ Accession number in PRG database
⁴ Added E. coli proteins

Slide 9
Protein Sequence Database
>gi|16131131|ref|NP_417708.1| putative membrane protein [Escherichia coli K12]
MKTLIRKFSRTAITVVLVILAFIAIFNAWVYYTESPWTRDARFSADVVAIAPDVSGLITQVNVHDNQLVK
KGQILFTIDQPRYQKALEEAQADVAYYQVLAQEKRQEAGRRNRLGVQAMSREEIDQANNVLQTVLHQLAK
AQATRDLAKLDLERTVIRAPADGWVTNLNVYTGEFITRGSTAVALVKQNSFYVLAYMEETKLEGVRPGYR
AEITPLGSNKVLKGTVDSVAAGVTNASSTRDDKGMATIDSNLEWVRLAQRVPVRIRLDNQQENIWPAGTT
ATVVVTGKQDRDESQDSFFRKMAHRLREFG

Was converted to:

>PRG_seq_5 ABRF_PRG2007_Protein_5
MKTLIRKFSRTAITVVLVILAFIAIFNAWVYYTESPWTRDARFSADVVAIAPDVSGLITQVNVHDNQLVK
KGQILFTIDQPRYQKALEEAQADVAYYQVLAQEKRQEAGRRNRLGVQAMSREEIDQANNVLQTVLHQLAK
AQATRDLAKLDLERTVIRAPADGWVTNLNVYTGEFITRGSTAVALVKQNSFYVLAYMEETKLEGVRPGYR
AEITPLGSNKVLKGTVDSVAAGVTNASSTRDDKGMATIDSNLEWVRLAQRVPVRIRLDNQQENIWPAGTT
ATVVVTGKQDRDESQDSFFRKMAHRLREFG

The file contains:
1) 4,346 protein sequences
2) common contaminants (e.g., keratins, trypsin)
3) an equal number of decoy sequences

Slide 10
Samples Analyzed by 2D DIGE
[Gel images: Sample A, Sample B, Sample C]
A pooled standard of all three samples was made and labelled with Cy5 (red). The samples were then labelled individually with Cy3 (green), and each gel was run with a single sample versus the pooled standard.
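As a worked check of Table 1 (slide 8), the listed B/A ratios can be recomputed from the spiked amounts. The following is a minimal Python sketch, not part of the original study materials: the dictionary values are transcribed from Table 1, while the function names, the 0.02 tolerance, and the log2 fold-error measure are illustrative assumptions rather than the PRG's own scoring method.

```python
import math

# Spiked quantities in pmol, transcribed from Table 1:
# protein -> (amount in Sample A, amount in Sample B, listed B/A ratio)
TABLE_1 = {
    "Carbonic anhydrase I":   (2.50, 1.14, 0.45),
    "Catalase":               (0.50, 0.34, 0.67),
    "Cytochrome c":           (2.50, 11.50, 4.60),
    "Glucose oxidase":        (0.50, 0.33, 0.67),
    "Glycerokinase":          (2.50, 0.78, 0.31),
    "Hexokinase":             (0.50, 0.16, 0.31),
    "Horseradish peroxidase": (5.00, 11.00, 2.20),
    "Lactoperoxidase":        (2.50, 0.78, 0.31),
    "Myoglobin":              (0.50, 5.00, 10.00),
    "Serum albumin":          (5.00, 3.33, 0.67),
    "Tryptophanase":          (5.00, 1.56, 0.31),
    "Ubiquitin":              (5.00, 23.00, 4.60),
}


def recompute_ratios(table):
    """Return {protein: B/A} computed from the spiked amounts."""
    return {name: b / a for name, (a, b, _listed) in table.items()}


def ratio_mismatches(table, tol=0.02):
    """Proteins whose recomputed ratio differs from the listed one by more than tol.

    The tolerance is an assumption chosen to absorb two-decimal rounding.
    """
    computed = recompute_ratios(table)
    return [name for name, (_a, _b, listed) in table.items()
            if abs(computed[name] - listed) > tol]


def log2_fold_error(reported, anticipated):
    """Signed log2 deviation of a reported ratio from the anticipated one.

    Hypothetical accuracy measure for illustration: 0 means exact agreement,
    +1 means the reported ratio is twice the anticipated ratio.
    """
    return math.log2(reported / anticipated)


if __name__ == "__main__":
    for name, ratio in recompute_ratios(TABLE_1).items():
        print(f"{name:24s} B/A = {ratio:.2f}")
    print("mismatches:", ratio_mismatches(TABLE_1))
```

Some recomputed values differ from the listed ratio in the second decimal place (for example 0.34 / 0.50 = 0.68 against a listed 0.67), which is consistent with the B amounts themselves having been rounded for the table.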

Slide 11
Samples by µLC-MS (1 µg on column)
[Figure: Base Peak Chromatograms of Sample A and Sample B; x-axis: Time (min)]

Slide 12
Demographics of the Participants
• ABRF member: 22 (51%)
• Non-member: 21 (49%)

Slide 13
Demographics of the Participants
[Two pie charts by sector; legend: Academia, Vendor, Other (non-profit), Government, Biotech/pharma. Segment values are listed as extracted and are not matched to legend entries.]
Total Participants = 43: 26 (60%), 8 (19%), 6 (14%), 2 (4.7%), 1 (2.3%)
Quantitative Data Returned = 35: 22 (63%), 7 (20%), 4 (11%), 1 (2.9%), 1 (2.9%)
87 Labs Requested Samples: 49% Return Rate

Slide 14
PRG2007 Abbreviations
DIGE – Differential In-Gel Electrophoresis
ICPL – Isotope Coded Protein Label
iTRAQ – isobaric Tags for Relative and Absolute Quantitation
ICAT – Isotope Coded Affinity Tag
18O – Stable Oxygen Isotope Label
SRM – Selected Reaction Monitoring

Slide 15
Methods Used
35 Participants Returned
• MS-Based: 26 (74%)
• Gel-Based: 9 (26%)

Slide 16
Techniques Applied
[Pie chart; legend: iTRAQ, ICPL, ICAT, 18O-Labeling, Label Free, 2D Gels (nonDIGE), 2D DIGE. Segment values are listed as extracted and are not matched to legend entries.]
16 (44%), 6 (17%), 5 (14%), 5 (14%), 2 (5.6%), 1 (2.8%), 1 (2.8%)

Slide 17
Results: True Positives vs False Positives

Slide 18
Results: True Positives vs False Positives

Slide 19
Quantitative Accuracy: Ubiquitin
A = 5 pmol, B = 23 pmol
Anticipated Mole Ratio 4.6
[Plot: B/A Ratio, grouped as 2D Gels, Label Free, Stable Isotope Labeling. Color Indicates Method Used: iTRAQ, ICPL, ICAT, 18O Labeling, Label Free, Label Free + targeted SRM, 2D-Gels (nonDIGE), 2D-DIGE]

Slide 20
Quantitative Accuracy: Myoglobin
A = 0.5 pmol, B = 5 pmol
Anticipated Mole Ratio 10
[Plot: B/A Ratio; same method groups and color legend as slide 19]

Slide 21
Quantitative Accuracy: Serum Albumin
A = 5 pmol, B = 3.3 pmol
Anticipated Mole Ratio 0.67
[Plot: B/A Ratio; same method groups and color legend as slide 19]

Slide 22
Quantitative Accuracy: Carbonic Anhydrase I
A = 2.5 pmol, B = 1.14 pmol
Anticipated Mole Ratio 0.45
[Plot: B/A Ratio; same method groups and color legend as slide 19]

Slide 23
Quantitative Accuracy: Glucose Oxidase
A = 0.5 pmol, B = 0.33 pmol
Anticipated Mole Ratio 0.67
[Plot: B/A Ratio; same method groups and color legend as slide 19]

Slide 24
Quantitative Accuracy: Hexokinase
A = 0.5 pmol, B = 0.16 pmol
Anticipated Mole Ratio 0.31
[Plot: B/A Ratio; same method groups and color legend as slide 19]

Slide 25
Quantitative Accuracy: Tryptophanase*
A = 5 pmol, B = 1.56 pmol
Anticipated Mole Ratio from 1 to 0.31
[Plot: B/A Ratio; same method groups and color legend as slide 19]

Slide 26
Biggest Challenges Reported – Summary
• Complexity of the proteolytic digest. Long calculation times at several analytical steps
• To find the resources: spent more than $1000 on [the study] and had one technician busy for more than a week and a scientist for 2-3 days
• Finding the time
• No automation software available - too much hands-on work.
• Sample solubilization
• The ABRF fasta database: several search algorithms had problems.
• Number of replicates possible, making it difficult to determine a reasonable error rate and therefore whether a protein is actually differentially expressed
• The MS identification of low abundance differential spots

Slide 27
Selected Comments
• The study was very good for researchers new to the proteomics field.
• This was an excellent learning experience. This study highlighted my facility's capabilities (peptide fractionation and MS) and weaknesses (chemical labeling of proteins and peptides and quant. analysis).
• This year's study was a much more realistic sample that imitates real proteomic samples (without the dynamic range issue from serum/plasma samples).
• Very interesting study because it addresses a 'real world' issue which is the relative quantitation of a small number of proteins in a very complex mixture.
• We didn't have enough time...
• The protein amount of these samples is small and so it is difficult to have confident results.

Slide 28
Selected Comments -- Continued
• More sample, more time. We would have run these in at least triplicate as per our routine operation if we had had more sample and time.
• Make sure the solubilisation is as good as possible: I did not obtain any useful data from the samples, probably because I was not able to solubilise the sample completely.
• not fun!!!
• Overall peak intensity of the samples was not as high as the expected intensity for the amount of protein specified (100 µg) in the study.
• Liked it, because we could evaluate ourselves. For regular samples (500 µg on gel) I always am able to confidently assign most proteins. That was not so with the concentrations here.

Slide 29
Would you do this sort of study again?
[Bar chart; x-axis: Number of Responses]
• Yes, it was fun: N = 38
• No, it was too time consuming: N = 0
• No way, never again: N = 0
• Other: N = 4
Other Responses:
• Yes, learned a lot, but need to watch resources
• Yes, but time issue
• Maybe
• Yes, but it was not fun

Slide 30
Conclusions
• Quantitative proteomics experiments are complex and require many factors for success
• A handful of participants reported excellent results, indicating that quantitative results are achievable
• Participants using similar techniques did not obtain similar performance, which suggests that expertise is a key factor
• Head-to-head comparisons of different approaches are not possible because of the high dependence on expertise
• Interest in this area is high and many labs appear to be developing these capabilities

Slide 31
Acknowledgements
• Kevin Hakala (UTHSCSA)
• Michelle Salemi (UC Davis)
• Rich Eigenheer (UC Davis)
• Matthew Russell (University of Cambridge)
• Ekaterina Deyanova (Merck Research Laboratories)

Slide 32
Acknowledgements
A huge thanks to all the labs that participated in this year's study!