
Introduction to Protein Chemistry

Gustavo de Souza IMM, OUS October 2013

Relevance of the Proteome
«The recipe of life» vs. chocolate cake:
- Egg
- Flour
- Sugar
- Baker’s yeast
- Chocolate
Biological relevance lies in how genes are expressed and translated into proteins, not in whether genes are present or not.

Amino acid structure

AA side chains

Protein Translation

Peptide Bond

Primary Structure
> sp|F2Z333|CA233_HUMAN Fibronectin type-III domain-containing transmembrane protein C1orf233
MRAPPLLLLLAACAPPPCAAAAPTPPGWEPTPDAPWCPYKVLPEGPEAGGGRLCFRSPAR
GFRCQAPGCVLHAPAGRSLRASVLRNRSVLLQWRLAPAAARRVRAFALNCSWRGAYTRFP
CERVLLGASCRDYLLPDVHDSVLYRLCLQPLPLRAGPAAAAPETPEPAECVEFTAEPAGM
QDIVVAMTAVGGSICVMLVVICLLVAYITENLMRPALARPGLRRHP

Folding

Primary Structure - Folding
> sp|F2Z333|CA233_HUMAN Fibronectin type-III domain-containing transmembrane protein C1orf233
MRAPPLLLLLAACAPPPCAAAAPTPPGWEPTPDAPWCPYKVLPEGPEAGGGRLCFRSPAR
GFRCQAPGCVLHAPAGRSLRASVLRNRSVLLQWRLAPAAARRVRAFALNCSWRGAYTRFP
CERVLLGASCRDYLLPDVHDSVLYRLCLQPLPLRAGPAAAAPETPEPAECVEFTAEPAGM
QDIVVAMTAVGGSICVMLVVICLLVAYITENLMRPALARPGLRRHP
[Segments set apart on the slide: APPLLLLLAACAPPPCAAAAPTPPGW and IVVAMTAVGGSICVMLVVICLLVAYITENLM]

Folding
Proteins can adopt only a limited number of different protein folds.

Secondary Structure

Tertiary Structure

Quaternary Structure

Primary to Quaternary

Primary to Quaternary

What is a «protein sample» in proteomics?

RNA-binding protein modules

Take home message
1. Proteins are the functionally active molecules in a cell.
2. They possess a high degree of chemical and structural heterogeneity.
3. This heterogeneity affects how a protein sample can be analyzed.

Challenges in Protein and Proteomic Analysis

Gustavo de Souza IMM, OUS October 2013

A dangerous idea…
One gene, one protein

Homo sapiens

Complexity of Protein Samples in Eukaryotes

Complexity of Protein Samples in Eukaryotes

A less dangerous idea
One gene, some proteins (say, an average of 5 per gene)

Homo sapiens

Complexity of Protein Samples in Eukaryotes
PTMs (modifications that control conformation changes in histones)

An even less dangerous idea
One protein, 8 possible modification sites
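The multiplication behind these ideas can be sketched in a few lines. The 5 proteins per gene and the 8 modification sites are the round numbers from the slides; the figure of 20,000 protein-coding genes and the treatment of each site as independently modified or unmodified are assumptions made here for illustration only.

```python
# Back-of-the-envelope count of possible protein species.
GENES = 20_000           # assumption: approximate number of human protein-coding genes
PROTEINS_PER_GENE = 5    # from the slide: "average 5 per gene"
MODIFICATION_SITES = 8   # from the slide: 8 possible modification sites


def possible_species(genes, proteins_per_gene, sites):
    """Count species if every site is independently modified or not."""
    states_per_protein = 2 ** sites  # on/off for each site (assumption)
    return genes * proteins_per_gene * states_per_protein


total = possible_species(GENES, PROTEINS_PER_GENE, MODIFICATION_SITES)
print(f"{total:,} possible protein species")
```

This is an upper bound under those assumptions, not an estimate of what a single cell actually contains.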

Homo sapiens

An even less dangerous idea

But in reality…
One specific cell does NOT express all genes at once!
- Several transcriptomics studies indicate that the cells under study have ~14000 transcripts at a certain time

Homo sapiens

Proteome Dynamics
The genome is a relatively static element of an organism; the proteome changes according to cell type, developmental stage, response to stress, etc.
[Figure: panels A, B and C]

Proteome dynamics within the same cell
The proteome can change in response to even the slightest stimulus within a cell.

Proteome chemical heterogeneity

DNA:
- Negatively charged molecule
- Has the same physico-chemical features regardless of its nucleotide sequence, its tissue source, its donor source, the species of the donor, etc.

Amino acid structure

AA side chains

Proteome chemical heterogeneity
Membrane proteins

Proteome dynamic range
[Figure labels: Genome; Transcription/Translation]
- Individual genes are mostly present in equimolar amounts in a DNA molecule.
- Protein concentration within a cell is unique to each individual protein.
- Difference between the most and least abundant molecule = dynamic range.

Proteome dynamic range

Proteome dynamic range
The dynamic range of a proteome is estimated to be around 10^8 (in serum it is believed to be over 10^10). Geiger et al., MCP 2012
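Dynamic range is usually quoted in orders of magnitude, i.e. the base-10 logarithm of the ratio between the most and least abundant species. A minimal sketch follows; the protein names are taken from the slides, but the copy numbers are hypothetical values chosen only to show the arithmetic.

```python
import math


def dynamic_range_orders(most_abundant, least_abundant):
    """Return the dynamic range as orders of magnitude (log10 of the ratio)."""
    if most_abundant <= 0 or least_abundant <= 0:
        raise ValueError("abundances must be positive")
    return math.log10(most_abundant / least_abundant)


# Hypothetical copies per cell, for illustration only.
abundances = {"actin": 5e7, "hsp70": 2e6, "transcription factor": 50}

orders = dynamic_range_orders(max(abundances.values()), min(abundances.values()))
print(f"{orders:.1f} orders of magnitude")
```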

Proteome dynamic range
Difference between the most and least abundant proteins (protein GO classification):
- Cytoskeleton (actin, tubulin, vimentin)
- Chaperones (hsp60, hsp70, calreticulin)
- Metabolism (glycolysis, ribosomal)
- Mitochondria (respiratory chain)
- Structure
- Nucleus (histones)
- Organelles
- Signalling pathway proteins, transcription factors, etc.

Instrumentation Aebersold & Mann, Nature 2003

Instrumentation
- Instruments with different hardware generate different types of raw data.
- Different vendors developed different file formats, each needing its own library to read the file.
- This led to the development of many specific software tools, each using specific computational protocols.
- Lack of a standard routine.

Take home message
1. The proteomic composition of a cell is at least 6x more complex than its genomic composition, if only the number of entities is considered.
2. It is an ever-changing feature, limited by spatial and time constraints.
3. Chemical properties and dynamic range have a relevant impact on the success rate of identification using proteomic methods.
4. Instrumentation and analysis are not standardized.

Introduction to Mass Spectrometry Interpreting peptide/protein data

Gustavo de Souza IMM, OUS October 2013

Let's talk about… physics

[Figure: 3D quadrupole ion trap; linear quadrupole ion trap]

What is it?

- An instrument that detects the mass-to-charge ratio (m/z) of ions (or ionized molecules).
a) Ionization must generate ions in the gas phase
b) Ion detection is proportional to ion abundance in the sample
c) MS operates at extremely low pressures (vacuum)
- Any molecule is ionizable: small organic/inorganic chemicals (less than 300 Da), average-sized peptides or DNA fragments, intact proteins.

Mass Spectrometry Scheme

Inlet (LC) -> Ion Source (MALDI, ES) -> Mass Analyzer (Time-of-Flight, Quadrupole, Ion Trap) -> Detector

Ion Intensity = Ion abundance

Isotopes
Normally observed in nature.
Mass difference = 1 Da

What to expect from a mass spectrum
Avogadro's number = 6.022 x 10^23 /mol
[Figure: spectrum along the m/z axis]

Peptide mass spectrum
- Isotopes (12C, 13C, 14N, 15N)
[Figure: FTMS full MS scan (100331_Gustavo_Tuberculosis_179rif_Rep1_07, scan 2435, RT 38.32, m/z 300.00-2000.00), zoomed to m/z 1033.0-1037.5, showing isotope peaks at m/z 1034.49, 1034.99, 1035.49 and 1035.99]

Mass Spectrometry Scheme

Inlet (LC) -> Ion Source (MALDI, ES) -> Mass Analyzer (Time-of-Flight, Quadrupole, Ion Trap) -> Detector

How is a sample ionized?

- Electron ionization
- Chemical ionization
- Fast Atom/Ion Bombardment
- Field desorption
- Plasma Desorption
- Laser Desorption and MALDI
- Thermospray
- Electrospray
- Atmospheric pressure chemical ionization

Matrix Assisted Laser Desorption Ionization

Peptide spectrum on MALDI

Protein spectrum on MALDI

A little history…
1985 – First use: up to a 3 kDa peptide could be ionized
1987 – Method to ionize intact proteins (up to 34 kDa) described
  Instruments have no sequence capability
1989 – ESI is used for biomolecules (peptides)
  Sequence capability, but low sensitivity
1994 – Term «Proteome» is coined
1995 – LC-MS/MS is implemented
  «Gold standard» of proteomic analysis

A little history…
- Laborious
- Low reproducibility
- Time consuming
- Low sensitivity
- Limited amount of identifications

Electrospray Ionization

Column (75 µm)/spray tip (8 µm)
Reverse-phase C18 beads, 3 µm
No precolumn or split
Platinum wire, 2.0 kV
15 cm
Sample loading: 500 nl/min
Gradient elution: 200 nl/min
ESI: Fenn et al., Science 246:64-71, 1989.

ESI multiple charged elements
[Figure: peptides pick up positive charges (+) at amine groups (-NH2); proteins carry many positive charges]

ESI multiple charged elements
[Figure: a 1000 Da species observed at m/z 500.5 (+2), 334.0 (+3) and 250.75 (+4)]
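The relation between mass, charge and observed m/z in positive-mode ESI can be sketched as below, assuming each charge comes from one added proton. The function name and defaults are illustrative. The slide's rounded values (500.5, 334.0, 250.75) are reproduced if "1000 Da" is read as the singly protonated ion (neutral mass 999) and a nominal proton mass of 1 Da is used; that reading is an assumption, not something stated on the slide.

```python
PROTON_MASS = 1.007276  # mass of a proton in Da


def mz_of_ion(neutral_mass, charge, proton_mass=PROTON_MASS):
    """m/z of a molecule that carries `charge` extra protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * proton_mass) / charge


# Nominal arithmetic matching the slide (assumed reading: [M+H]+ = 1000).
for z in (2, 3, 4):
    print(z, mz_of_ion(999.0, z, proton_mass=1.0))

# The same species with the accurate proton mass.
for z in (2, 3, 4):
    print(z, round(mz_of_ion(999.0, z), 4))
```

The higher the charge state, the lower the m/z at which the same molecule appears, which is why large proteins still fall inside a 300-2000 m/z scan window.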

Peptides on ESI
[Figure: FTMS full MS scan (100331_Gustavo_Tuberculosis_179rif_Rep1_09, scan 3828, RT 56.72, m/z 300.00-2000.00) with two zoomed isotope clusters from the same peptide, Mr = 2297.14 Da: peaks around m/z 766.72 spaced by 0.33 Da (+3) and peaks around m/z 1149.57 spaced by 0.5 Da (+2)]
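The charge assignment in this spectrum follows from the isotope spacing: neighbouring isotope peaks differ by about 1 Da in mass, so they sit about 1/z apart on the m/z axis. A minimal sketch, using the values readable from the figure (spacings 0.5 and 0.33, peaks at m/z 1149.57 and 766.72, Mr = 2297.14 Da); the function names are illustrative and the constants are standard approximate values.

```python
PROTON = 1.007276          # mass of a proton in Da
ISOTOPE_SPACING = 1.00335  # approximate 13C - 12C mass difference in Da


def charge_from_spacing(spacing_mz):
    """Estimate the charge state from the m/z gap between isotope peaks."""
    return round(ISOTOPE_SPACING / spacing_mz)


def neutral_mass(mz, charge):
    """Neutral mass of an ion observed at `mz` with `charge` added protons."""
    return charge * (mz - PROTON)


for peak_mz, spacing in ((1149.57, 0.5), (766.72, 0.33)):
    z = charge_from_spacing(spacing)
    print(f"m/z {peak_mz}: z = {z}, M = {neutral_mass(peak_mz, z):.2f} Da")
```

Both charge states point back to the same neutral mass within rounding, which is how two peaks in different parts of the spectrum are recognized as one peptide.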

ESI of intact protein

Mass Spectrometry Scheme

Inlet (LC) -> Ion Source (MALDI, ES) -> Mass Analyzer (Time-of-Flight, Quadrupole, Ion Trap) -> Detector

How is an ion mass measured?
Time-of-flight
[Figure: time-of-flight analyzer; axis: m/z]

How is an ion mass measured?
Quadrupoles (RF)

How is an ion mass measured?
Orbitraps

Tandem Mass Spectrometry
MS: Inlet -> Ion Source -> Mass Analyzer -> Detector
MS/MS: Ion Source -> Mass Analyzer -> Mass Analyzer (collision cell) -> Mass Analyzer -> Detector

Data Dependent Acquisition
[Figure: an MS1 (or MS) scan in which the precursor at m/z 899.013 (*) is selected, followed by its MS2 (or MS/MS) spectrum]
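Data-dependent acquisition is commonly run as a "top N" loop: after each MS1 scan, the most intense precursors are picked for MS2, skipping those fragmented recently. The sketch below shows only that selection step. The function, its parameters, the 0.01 m/z exclusion window and the intensities are hypothetical; only the m/z 899.013 comes from the slide.

```python
def select_precursors(ms1_peaks, top_n=3, exclude=()):
    """Return the m/z values of the top_n most intense MS1 peaks.

    ms1_peaks: iterable of (mz, intensity) tuples.
    exclude:   m/z values already fragmented; peaks within 0.01 m/z are skipped.
    """
    def is_excluded(mz):
        return any(abs(mz - seen) <= 0.01 for seen in exclude)

    candidates = [(mz, inten) for mz, inten in ms1_peaks if not is_excluded(mz)]
    candidates.sort(key=lambda peak: peak[1], reverse=True)  # most intense first
    return [mz for mz, _ in candidates[:top_n]]


# Hypothetical MS1 scan: (m/z, intensity).
ms1_scan = [(450.210, 1.0e5), (899.013, 5.0e5), (700.402, 3.0e5), (1200.550, 5.0e4)]

print(select_precursors(ms1_scan, top_n=2))
print(select_precursors(ms1_scan, top_n=2, exclude=[899.013]))
```

Excluding already-sampled precursors lets the instrument reach less abundant peptides instead of fragmenting the same intense ion repeatedly.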

Important Parameters in MS
- Resolution
- Sensitivity
- Dynamic Range…

High resolution in MS
1. Mass accuracy
[Figure: expected vs. observed mass of a 2+ ion on the m/z axis]

High resolution in MS
1. Mass accuracy
[Figure: mass error (ppm) vs. mass (Da) for four instruments]
- Ion trap (LTQ): Av. = 65.8 ppm ± 71.5
- qTOF (QSTAR): Av. = 16.5 ppm ± 11.2
- FTICR MS (LTQ-FT, 500K): Av. = 2.1 ppm ± 1.9
- FTICR MS SIM (LTQ-FT, 50K): Av. = 0.68 ppm ± 0.47
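Mass accuracy is reported in parts per million (ppm): the difference between observed and expected mass, divided by the expected mass, times one million. A minimal sketch of that calculation; the function names, the example masses and the 5 ppm tolerance are illustrative assumptions.

```python
def ppm_error(observed, expected):
    """Signed mass error in parts per million."""
    return (observed - expected) / expected * 1e6


def within_tolerance(observed, expected, tolerance_ppm):
    """True if the observed mass matches the expected mass within tolerance."""
    return abs(ppm_error(observed, expected)) <= tolerance_ppm


# Hypothetical example: expected mass 1000.000 Da, observed 1000.002 Da.
print(f"{ppm_error(1000.002, 1000.0):.2f} ppm")
print(within_tolerance(1000.002, 1000.0, tolerance_ppm=5))
```

A tighter tolerance leaves fewer candidate peptides for each observed mass, which is why the low-ppm instruments listed above give more confident identifications.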

High resolution in MS
2. Peak separation
[Figure: 2+ and 3+ ions shown along the retention time (RT) and m/z axes]

LC-MS/MS

With all we (hopefully) learned so far
1) Use strong detergent for cell lysis and protein solubilization (SDS, Triton, NP40, Tween)
2) LysC (cuts on the C-terminal side of K) and/or Trypsin (cuts on the C-terminal side of K and R)

With all we (hopefully) learned so far
[Figure: example sequences ADFFFSTTHAAS-R-MSHHHGTYYPPH-KR-FSDDDDT and ADFFFSTTHAAS-R-MSHHHGTYYPPH-K-FSDDDT, with the Arg (R) and Lys (K) residues highlighted and marked with +]
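The cleavage rules stated for LysC and trypsin can be tried on the example sequence from the slide. This is a simplified sketch: it cuts after every K (LysC) or every K and R (trypsin) and ignores refinements such as missed cleavages or reduced cleavage before proline. The function name is illustrative.

```python
def digest(sequence, cleave_after):
    """Split a protein sequence on the C-terminal side of the given residues."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in cleave_after:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):  # keep the C-terminal remainder
        peptides.append(sequence[start:])
    return peptides


protein = "ADFFFSTTHAASRMSHHHGTYYPPHKRFSDDDDT"  # example sequence from the slide

print("Trypsin:", digest(protein, cleave_after="KR"))
print("LysC:   ", digest(protein, cleave_after="K"))
```

Every peptide except the protein's own C-terminus ends in K or R, so each one carries a basic residue that readily takes up a proton during ionization.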

With all we (hopefully) learned so far
3) Nano-LC (300 nL/min)
5) Quadrupole-Orbitrap (QExactive)

With all we (hopefully) learned so far
C18 column, 25 cm long
Mobile phase: A = 5% organic solvent in water; B = 95% organic solvent in water
[Figure: mobile phase gradient from A to B over time; label "20 s"]

With all we (hopefully) learned so far
[Figure: an MS1 (or MS) scan with the precursor at m/z 899.013, followed by its MS2 (or MS/MS) spectrum]

With all we (hopefully) learned so far
[Figure: instrument diagram with the Orbitrap and Quadrupole labeled]

With all we (hopefully) learned so far
From Michalski et al., MCP 10, 2011.

172,800

Take home message

- Mass spectrometry is used to analyze the molecular mass of molecules.
- Great diversity of hardware and principles: different forms of ionization and mass measurement.
- For protein ID, the mass of an intact peptide and the masses of its fragments are enough to provide identification.
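To make the last point concrete, the sketch below computes the two kinds of mass a search engine compares against a spectrum: the intact peptide mass and the singly charged b and y fragment ions. It assumes unmodified residues and standard monoisotopic masses, and uses the peptide FSDDDDT from the digestion example. The function names are illustrative and this is not the method of any particular search engine.

```python
# Monoisotopic residue masses in Da (unmodified amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # added once per peptide (free N- and C-terminus)
PROTON = 1.007276   # charge carrier for singly charged ions


def peptide_mass(sequence):
    """Monoisotopic mass of the neutral, intact peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def fragment_ions(sequence):
    """Singly charged b and y ion m/z values for each backbone cleavage.

    b[i] holds the first i+1 residues; y[i] holds the remaining residues,
    so b[i] and y[i] are the two halves of the same cleavage.
    """
    b_ions, y_ions = [], []
    for i in range(1, len(sequence)):
        b_ions.append(sum(RESIDUE_MASS[aa] for aa in sequence[:i]) + PROTON)
        y_ions.append(sum(RESIDUE_MASS[aa] for aa in sequence[i:]) + WATER + PROTON)
    return b_ions, y_ions


peptide = "FSDDDDT"
b, y = fragment_ions(peptide)
print(f"intact mass: {peptide_mass(peptide):.4f} Da")
print("b ions:", [round(m, 4) for m in b])
print("y ions:", [round(m, 4) for m in y])
```

Each complementary b/y pair adds up to the intact mass plus two protons, which gives a quick consistency check on an annotated MS/MS spectrum.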