Transcript Slide 1
Day 3 morning
Proteomics

PROTEOMICS PROBLEMS
Mining
Differential Expression
Protein Complexes & protein-ligand interactions
Modifications

HIGH THROUGHPUT PROTEOMICS
Analysis of all protein components; a ‘snapshot’
Useful for differential comparisons of tissues or cells
Complementary to RNA microarray
Information about disease states
Identify useful drug target or diagnostic marker
May use robotics or semi-automated procedures (medium-high throughput)

SINGLE PROTEIN PROTEOMICS
Peptide Mass Fingerprinting (What is it?)
Amino Acid Modifications (What is it? Where is it?)
Amino Acid Changes (Mutations) (What is it? Where is it?)

Protein Characterization using Mass Spectrometry

Mass Spectrometry Schematic
Components: Sample in; Inlet System; Ion Source; Mass Analyzer; Detector; Data System
Vacuum: 1 x 10^-5 to 10^-11
Inlet Systems: Simple vacuum lock; HPLC; GC
Ion Sources: Matrix Assisted Laser Desorption Ionization (MALDI); Electrospray (ESI); Atmospheric Pressure Chemical Ionization (APCI); Electron Ionization (EI); FAB / LSIMS
Mass Analyzers: Multipole (Quad, Hexa, Octa); Time-of-flight (TOF); Traps (FT-ICR and QIT); Magnetic Sectors

Mass Spectrum
Mass spectrometry is an analytical method used to measure the mass-to-charge ratio (m/z) of a molecule.
For example: MW = 2,000; Charge = +2; m/z = 1,000
For example: MW = 2,000; Charge = +1; m/z = 2,000

Mass spectrometry: Ionization modes

MALDI
Sample in solid state
Mix with matrix material
Laser desorption
Salt ‘tolerant’
No pre-analysis separation
Time-of-Flight detector

Electrospray (ESI)
Liquid sample sprayed & desolvated
No matrix materials
Salt intolerant
Direct coupling to LC or HPLC
Many detectors

MALDI-TOF Mass Spectrometer

MALDI-TOF Block Diagram

MALDI-TOF Spectrum

MASS ACCURACY IS CRITICAL TO PROPER IDENTIFICATION OF PROTEINS (PMF) AND CHARACTERIZING MODIFICATIONS AND ALTERATIONS IN PRIMARY STRUCTURE.
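The two m/z examples on the Mass Spectrum slide can be reproduced with a short calculation. This is a minimal sketch and not part of the original slides: the function names are hypothetical, `simple_mz` follows the slide's simplified arithmetic (mass divided by charge), and `protonated_mz` additionally assumes positive-mode ions formed by adding protons (about 1.00728 Da each), which the slide leaves out.

```python
PROTON_MASS = 1.00728  # Da; assumption: each charge comes from one added proton


def simple_mz(mass, charge):
    """m/z as on the slide: neutral mass divided by the number of charges."""
    if charge <= 0:
        raise ValueError("charge must be a positive integer")
    return mass / charge


def protonated_mz(mass, charge):
    """m/z of an [M + zH]z+ ion, counting the mass of the added protons."""
    if charge <= 0:
        raise ValueError("charge must be a positive integer")
    return (mass + charge * PROTON_MASS) / charge


if __name__ == "__main__":
    # The slide's example: MW = 2,000 observed at charge +1 and +2.
    for z in (1, 2):
        print(z, simple_mz(2000.0, z), round(protonated_mz(2000.0, z), 3))
```

The same molecule therefore appears at different positions on the m/z axis depending on its charge state, which is why the charge must be known before a mass can be read off a spectrum.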
Information from a Spectrum
Angiotensin II, C50H71N13O12
Average Mass [M + H]+ = 1047.2045 (C = 12.011)
Monoisotopic Mass [M + H]+ = 1046.5422 (C defined as 12.000000)
[Spectrum: Intensity (0 to 100) vs. Mass-to-charge (m/z), 1,040 to 1,052]

Mass accuracy; Important for Protein ID
Accuracy (ppm) = (Theoretical m/z - Measured m/z) / Theoretical m/z x 1,000,000
Peptide M = HGTVVLTALGGILK; Theoretical (M+H)+ = 1378.8417
File calibration: measured (M+H)+ = 1378.7409; 73 ppm
External calibration: measured (M+H)+ = 1378.8594; 13 ppm
Internal calibration: measured (M+H)+ = 1378.8390; 2 ppm
Calibrant peaks labeled in the spectra: Des-arg1-Bradykinin 904.4682; Angiotensin I 1296.6853; Glu1-Fibrinopeptide B 1570.6776; Neurotensin 1672.9175
Other labeled peaks: 1277.6970; 1400.8627
[Spectra: % Intensity (0 to 100) vs. Mass-to-charge (m/z), 872.0 to 1714.0]

Protein Digestion & Identification
Select spot or band
EXCISE SPOT
Reduce & Alkylate
Digest
Extract
ALL IN THE GEL!
MASS SPEC
DATABASE QUERY

ID Concepts
Protease cleavage is specific and generates unique peptides
Nearly every amino acid has a different molecular mass (leucine and isoleucine are identical at 113.2)
Masses are ‘unique’ to protein sequence

Residue masses:
Glycine 57.0
Alanine 71.1
Serine 87.1
Proline 97.1
Valine 99.1
Threonine 101.1
Cysteine 103.1
Leucine 113.2
Isoleucine 113.2
Asparagine 114.2
Aspartic acid 115.1
Glutamine 128.1
Lysine 128.2
Glutamic acid 129.1
Methionine 131.2
Histidine 137.1
Phenylalanine 147.2
Arginine 156.2
Tyrosine 163.2
Tryptophan 186.2

G-L-S-E-T-W-D-D-H-K = 1186.5 Da
K-H-D-D-W-T-E-S-L-G = 1186.5 Da

Proteases Cleave Specifically
Trypsin cleaves after Lysine (K) and Arginine (R)
Chymotrypsin cleaves after aromatic amino acids
Staph aureus V8 cleaves after Aspartate (D) or Glutamate (E)
EndoLysC cleaves after Lysine (K) only
Therefore, peptide maps (m/z) can be predicted from database sequences (in silico digestion). This is their Mass Fingerprint.
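The in silico digestion described above can be sketched in a few lines. This is an illustrative example, not the method used by any particular search engine: the function names are hypothetical, the residue masses are the rounded values from the ID Concepts slide, the water mass of 18.0 Da added per peptide is an assumption at the same rounding, and the cleavage rule is the simple one stated on the slide (cut after every K or R, with no missed cleavages and no exception for a following proline).

```python
# Rounded residue masses (Da) as listed on the ID Concepts slide.
RESIDUE_MASS = {
    "G": 57.0, "A": 71.1, "S": 87.1, "P": 97.1, "V": 99.1,
    "T": 101.1, "C": 103.1, "L": 113.2, "I": 113.2, "N": 114.2,
    "D": 115.1, "Q": 128.1, "K": 128.2, "E": 129.1, "M": 131.2,
    "H": 137.1, "F": 147.2, "R": 156.2, "Y": 163.2, "W": 186.2,
}
WATER = 18.0  # Da; assumption: one water per peptide for the free termini


def tryptic_digest(sequence):
    """Split a protein sequence after every lysine (K) or arginine (R)."""
    peptides, current = [], ""
    for residue in sequence:
        current += residue
        if residue in "KR":
            peptides.append(current)
            current = ""
    if current:  # C-terminal peptide that does not end in K or R
        peptides.append(current)
    return peptides


def peptide_mass(peptide):
    """Approximate neutral peptide mass: residue masses plus one water."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER


if __name__ == "__main__":
    protein = "MASDFGHKILGFDSACVMNQWSDFFIILRTHYWEDTYRR"
    for pep in tryptic_digest(protein):
        print(pep, round(peptide_mass(pep), 1))
```

Because a peptide's mass depends only on its composition, `peptide_mass("GLSETWDDHK")` and `peptide_mass("KHDDWTESLG")` return the same value, which is the point of the two reversed sequences on the slide. The rounded table gives only approximate masses; a real search would use monoisotopic values.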
Digestion with Trypsin
Robust, cheap, stable
Specificity: cleaves on the C-terminal side of Arginine (R) and Lysine (K) ONLY
Sequence information about C terminus only
Cleaves at ~1 out of 10 amino acids; average peptide mass 1100 Da
Amino acids favorable for MS (charged!)

Intact protein sequence:
MASDFGHKILGFDSACV
MNQWSDFFIILRTHYWE
DTYRRLIPMASDFKTYH
MNGFDSAILIGRIISCFGK
PEQSADRTYIPLMKSDFV
CQELISEL

Digest fragments:
-R
-LIPMASDFK
-MASDFGHK
-THYWEDTYR
-THYWEDTYRR
-RLIPMASDFK
-SDFVCQELISEL
-PEQSADRTYIPLM
-ILGFDSACVMNQWSDFFIILR
-TYHMNGFDSAILIGRIISCFGK

Protein Digestion & Identification; Mass Fingerprint
GKVKVGVNGFGRLIGRVTRAAFNSGKVDIVAINDPFIDLNYMVYMFQYDSTHGKFHGTVK
AENGKLVINGNPITIFQERDPKIKWGDAGAEYVVESTGVFTTMEKAGAHLQGGAKRVIISAP
SADAPMFVMGVNHEKYDNSLKIISNASCTTNCLAPLAKVIHDNFGIVEGLMTTVHAITATQKl
TVDGPSGKWRDGRGALQNIIPASTGAAKAVGKVIPELNGKLTGMAFRVPTANVSVVDLTC
RLEKPAKYDDIKKVVKQASEGPLKGILGYTEHQVVSSDFNSDTHSSTFDAGAGIALNDHFV
KLISWYDNEFGYSNRVVDLMAHMASKE

DIGEST

Experimental m/z H+:
4036.8947, 3308.5642, 2595.3599, 2277.0379, 2213.1093, 1763.8023, 1719.8768, 1613.9009, 1473.7730, 1411.7903, 1201.6067, 909.4900, 873.4676, 869.5091, 829.4414, 805.4315, 795.4181, 739.3621, 694.3518, 688.3777, 685.4243

Predicted m/z H+:
653.3141, 518.2569, 458.3085, 375.2350, 374.2398, 361.1982, 359.1925, 347.1673, 345.2496, 260.1968, 246.1812, 204.1342

DATABASE QUERY
MATCH!!
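The database query step above comes down to comparing each experimental m/z value with the predicted values and keeping the pairs that agree within a mass tolerance. The following is a minimal sketch under stated assumptions, not the algorithm of any specific search engine: the function names are hypothetical, the error is expressed in parts per million (difference divided by theoretical m/z, times 10^6), and the default tolerance of 100 ppm is only an illustrative choice.

```python
def ppm_error(theoretical_mz, measured_mz):
    """Signed mass error in parts per million, relative to the theoretical m/z."""
    return (theoretical_mz - measured_mz) / theoretical_mz * 1e6


def match_peaks(measured, predicted, tolerance_ppm=100.0):
    """Return (measured, predicted, ppm_error) for every pair within tolerance."""
    matches = []
    for m in measured:
        for p in predicted:
            err = ppm_error(p, m)
            if abs(err) <= tolerance_ppm:
                matches.append((m, p, err))
    return matches


if __name__ == "__main__":
    # Hypothetical peak lists, for illustration only.
    measured = [1000.05, 1500.00]
    predicted = [1000.00, 1200.00]
    for m, p, err in match_peaks(measured, predicted):
        print(f"measured {m} matches predicted {p} ({err:.1f} ppm)")
```

Applied to the calibration example earlier in the transcript, `ppm_error(1378.8417, 1378.7409)` evaluates to about 73 ppm, in line with the value quoted for file calibration.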
Peak list (m/z, H+):
4036.8947, 3308.5642, 2595.3599, 2277.0379, 2213.1093, 1763.8023, 1719.8768, 1613.9009, 1473.7730, 1411.7903, 1201.6067, 909.4900, 873.4676, 869.5091, 829.4414, 805.4315, 795.4181, 739.3621, 694.3518, 688.3777, 685.4243, 653.3141, 518.2569, 458.3085, 375.2350, 374.2398, 361.1982, 359.1925, 347.1673, 345.2496, 260.1968, 246.1812, 204.1342

Mass Fingerprinting: What is a Real Peak?
Look for good isotope distribution and good peak shape.
Avoid noise ‘spikes’.
Intensity is not necessarily an indicator of quality.
Automatic peak picking becomes less reliable at low S/N and high mass.

Other Peaks in Your Mass Spectrum That are NOT from Your Protein
Keratin: 3312.308, 2508.145, 2383.952, 1993.977, 1716.851, 1708.713, 1657.793, 1475.785, 1475.749, 1383.69, 1357.696, 1302.715, 1300.53, 1277.71, 1265.637, 1217.616, 1179.6, 1141.519, 1125.542, 1092.503, 1033.516, 1016.501, 1006.43, 999.445, 973.531, 910.415, 874.499, 832.489, 823.39
Tryptic autolysis: 2283.18, 2211.104, 2158.031, 1768.799, 1736.842, 1045.564, 1006.487, 906.504, 842.509
Matrix ion peaks: 666, 672, 688, 855, 861, 877, 893, 1060, 1065
Abundant proteins (e.g. Albumin, Ribosomes) and/or smears on the gel.

How Do We Identify Proteins Using MS?
BioInformatics: Search Engines
MASCOT (www.matrixscience.com)
Protein Prospector (MS-FIT; prospector.ucsf.edu/)
ProFound (prowl.rockefeller.edu/prowl-cgi/profound.exe)
X! Tandem (www.thegpm.org/TANDEM/)

What You Need to Search Database
•A list of query masses (monoisotopic preferably)
•Protease(s) or cleavage reagents used
•Database(s) you want to search (SwissProt, NR)
•Estimated mass and pI of protein spot (or band)
•Cysteine/methionine or other modifications
•Mass tolerance (100 ppm = 1000.0 ± 0.1 Da)
•On-line Search Program (MS-Fit, Profound, Mascot, etc.)

Mass Fingerprinting: Mascot search
Using MALDI-TOF mass spec data
1. Database
2. Taxonomy
3. Enzyme (essential)
4. Modification (optional)
5. Protein Mass (ignore)
6. Peptide tolerance (100 ppm)
7. Data (quality over quantity)

Mass Fingerprinting: Mascot search
Interpreting the Results
Probability based scoring:
•Compute the probability that the observed match between the experimental data and mass values calculated from a candidate peptide sequence is a random event.
•The correct match, which is not a random event, has a very low probability.
Allows standard statistical tests to be applied to the results.
Mascot score is: -10Log10(P)
In a database of 500,000 entries, a 1 in 1,000 chance of getting a false positive match is a probability of P = 1/(1,000 x 500,000) = 2 x 10^-9.
Calculated, -10Log10(2 x 10^-9) is a Mascot score of about 87.

Mass Fingerprinting: Mascot search
What if you don’t find anything?
•Widen mass tolerance
•Remove taxonomic limits
•Go to a larger database
•Increase the number of missed cleavages allowed
•Increase the number of variable modifications
•Get sequence information (MS/MS)
Common contaminants: Keratin; Autolysis peaks; Albumin; Actin; IgG

Fragmentation of Peptides: MS/MS
•Can give more information/confirmation
•Identification of post-translational modifications at residue level
•Generate a series of fragments of the original peptide; ideally the bonds you are breaking are the peptide bonds between residues (y ions and b ions)

Fragmentation of Peptides: MS/MS
Sequence Information
Glycine 57.0
Alanine 71.1
Serine 87.1
Proline 97.1
Valine 99.1
Threonine 101.1
Cysteine 103.1
Leucine 113.2
Isoleucine 113.2
Asparagine 114.2
Aspartic acid 115.1
Glutamine 128.1
Lysine 128.2
Glutamic acid 129.1
Methionine 131.2
Histidine 137.1
Phenylalanine 147.2
Arginine 156.2
Tyrosine 163.2
Tryptophan 186.2

How Do We Identify Proteins Using MS/MS?
Get database sequences that match precursor peptide mass
Actual MS/MS scan
AVAGCAGAR
CVAAGAAGR
VGGACAAAR
etc….
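The b- and y-ion series mentioned above can be predicted for each candidate sequence by summing residue masses from either end of the peptide. This is a rough sketch rather than what any search program actually does: the function name is hypothetical, the residue masses are the rounded values from the Sequence Information slide, all ions are assumed singly charged, cysteine is assumed unmodified, and the water (18.015 Da) and proton (1.00728 Da) masses are standard values not given on the slides.

```python
# Rounded residue masses (Da) as listed on the Sequence Information slide.
RESIDUE_MASS = {
    "G": 57.0, "A": 71.1, "S": 87.1, "P": 97.1, "V": 99.1,
    "T": 101.1, "C": 103.1, "L": 113.2, "I": 113.2, "N": 114.2,
    "D": 115.1, "Q": 128.1, "K": 128.2, "E": 129.1, "M": 131.2,
    "H": 137.1, "F": 147.2, "R": 156.2, "Y": 163.2, "W": 186.2,
}
WATER = 18.015   # Da; assumption, not from the slides
PROTON = 1.00728  # Da; assumption, not from the slides


def fragment_ions(peptide):
    """Return (b_ions, y_ions) as lists of singly charged m/z values.

    b_ions[i - 1] is b_i (first i residues, N-terminal fragment);
    y_ions[i - 1] is y_i (last i residues, C-terminal fragment).
    """
    masses = [RESIDUE_MASS[r] for r in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(peptide)):
        b_ions.append(sum(masses[:i]) + PROTON)
        y_ions.append(sum(masses[-i:]) + WATER + PROTON)
    return b_ions, y_ions


if __name__ == "__main__":
    for candidate in ("AVAGCAGAR", "CVAAGAAGR", "VGGACAAAR"):
        b, y = fragment_ions(candidate)
        print(candidate)
        print("  b:", [round(x, 1) for x in b])
        print("  y:", [round(x, 1) for x in y])
```

The three candidates have the same composition and therefore the same precursor mass, but their fragment series differ, which is what lets MS/MS distinguish them.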
Precursor peptide [M+H]+ = 828.2
[Figure: theoretical (virtual) spectra showing the b- and y-ion series (b2 to b7, y2 to y6) for the candidates CVAAGAAGR, AVAGCAGAR, and VGGACAAAR]
Compare virtual spectra to real spectrum

Scoring
1.
2. Detect matches between theoretical b- and y- ions
3. Compute correlation
4. Rank hits

Peptide Score
AVAGCAGAR 2.56
CVAAGAAGR 0.57
VGGACAAAR 0.32

How Do We Identify Proteins Using MS/MS?
Sequest analysis of MS/MS Ion Trap data

How Do We Identify Proteins Using MS/MS?
Ion Trap MS/MS of doubly charged ions

What if you don’t find anything?
Combination of MS and MS/MS searching
Mascot allows for combination searching with MS and MS/MS data
Collision Induced Dissociation (CID) or Post Source Decay (PSD) are used to fragment individual peptides (MALDI-TOF)
DeNovo Sequencing; hand work

Typhoon Trio
Wash glass platen with Milli-Q water only!
Do not pry open gels. Keep closed. Keep clean and dry.

Typhoon Trio Set Up Window

Typhoon Trio Scan Process
Scan at 1000 micron to adjust PMT (photomultiplier tube) voltages.
Each Laser PMT voltage (CyDye) must be adjusted individually.
PMT voltage starting point: 600 volts.
We try to adjust PMT voltage to give spot volumes (ImageQuant) in a linear response range (60,000-90,000 ‘counts’).
We try to keep responses of all lasers within ~15% of each other.
We look for ‘red pixels’ to determine oversaturation of the PMT.
Each change of 40 PMT volts is an ~2-fold change in sensitivity.
Once satisfied, we scan again at 100 micron (high resolution). Back down PMT voltage 10 volts for this scan.
Cannot assume that all gels will use same PMT voltages. Empirical.

Typhoon Trio
Pre-Scanning
Do not pry open the gels.
Wrap gels in plastic wrap.
Store at 4 °C until scanning.
Scanning should be performed as soon as possible to reduce diffusion.
Post-Scanning
Pry open the gels. Gel should stick to Bind-Silane coated plate.
Fix gel overnight in MeOH/Acetic acid for Deep Purple Staining.
Clean upper plate.
Eventually, the Bind-Silane coated plate is cleaned with NaOH.

Deep Purple Staining
Procedure for Backed gels (large gels):
Incubate backed gel in 500 ml Fix Solution with gentle agitation OVERNIGHT.
Pour off the Fix Solution and replace with 500 ml Wash Solution. Agitate gently for 30 minutes.
Replace with 250 ml of Milli-Q Water.
Remove Deep Purple from freezer; allow to stand at RT for 10 minutes before opening.
Shake bottle of Deep Purple well. Add 1.25 ml Deep Purple to the water.
Incubate with gentle agitation for 1 hour; KEEP IN DARK.
Pour off stain (do not use again). Replace with Storage Solution.
Cover and store in a dark place until imaging on Typhoon Trio.
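Returning to the Typhoon Trio scan settings earlier in this transcript: the rule of thumb that a 40 volt change in PMT voltage gives roughly a 2-fold change in sensitivity can be turned into a quick estimate. This is only a sketch under the assumption that the response is exactly exponential in voltage; the function names are hypothetical, and the slides themselves stress that PMT settings are empirical and differ from gel to gel.

```python
import math


def sensitivity_fold_change(delta_volts, volts_per_doubling=40.0):
    """Estimated fold change in sensitivity for a PMT voltage change.

    Assumes every `volts_per_doubling` volts doubles the sensitivity.
    """
    return 2.0 ** (delta_volts / volts_per_doubling)


def voltage_change_for(target_fold, volts_per_doubling=40.0):
    """Estimated voltage change needed to reach a desired fold change."""
    if target_fold <= 0:
        raise ValueError("target_fold must be positive")
    return volts_per_doubling * math.log2(target_fold)


if __name__ == "__main__":
    # Backing the voltage down by 10 V for the high-resolution scan.
    print(round(sensitivity_fold_change(-10), 2))
    # Voltage change needed to halve the signal.
    print(round(voltage_change_for(0.5), 1))
```

Under this assumption, the 10 volt back-down used for the 100 micron scan corresponds to a reduction in sensitivity of roughly 16%.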