Transcript Label-free

Quantitative Proteomics: Applications and Strategies

Gustavo de Souza IMM, OUS October 2013

A little history…
1985 – First use: up to a 3 kDa peptide could be ionized
1987 – Method to ionize intact proteins (up to 34 kDa) described

Instruments have no sequence capability

1989 – ESI is used for biomolecules (peptides)

Sequence capability, but low sensitivity

1994 – Term «Proteome» is coined
1995 – LC-MS/MS is implemented

«Gold standard» of proteomic analysis

2DE-based approach

2DE-based approach “I see 1000 spots, but identify 50 only.”

LC-MS

Column (75 µm)/spray tip (8 µm)
Reverse-phase C18 beads, 3 µm
No precolumn or split
Platinum wire, 2.0 kV

15 cm

Sample loading: 500 nl/min
Gradient elution: 200 nl/min
ESI

Fenn et al., Science 246:64-71, 1989.

MS-based quantitation

Inlet: LC
Ion source: MALDI, ES
Mass analyzer: Time-of-Flight, Quadrupole, Ion Trap, Quadrupole-TOF
Detector

Peak intensities can vary up to 100x between duplicate runs.

Quantitative analysis MUST be carried out in a single run.

Ion Intensity = Ion abundance

MS measures m/z

[Figure: m/z spectra of Sample 1 and Sample 2; a) unlabeled peptide, b) labeled peptide]

Isotopic Labeling

Enzymatic Labeling

Metabolic Labeling

SILAC (Ong SE et al., 2002)

[Figure: parallel cultures in media with normal AA and media with labelled AA (*), passaged X 3, with an m/z spectrum at each step]

1. Cells in normal culture media
2. Start SILAC labelling by growing cells in labelling media (labelled AA / dialyzed serum)
3. Passage cells to allow incorporation of labelled AA
4. By 5 cell doublings, cells have incorporated the labelled AA
5. Grow SILAC labelled cells to the desired number of cells for the experiment
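As a rough illustration of why several doublings are needed before the experiment, the sketch below assumes that labelling proceeds by dilution only: every doubling halves the pre-existing unlabelled protein pool, all newly made protein is labelled, and protein turnover is ignored. The function name and the dilution-only model are illustrative assumptions, not part of the original protocol.

```python
def labeled_fraction(doublings: int) -> float:
    """Fraction of protein carrying the label after a number of doublings.

    Assumption: dilution only. Each doubling halves the unlabelled pool,
    all newly synthesised protein is labelled, turnover is ignored.
    """
    if doublings < 0:
        raise ValueError("doublings must be non-negative")
    return 1.0 - 0.5 ** doublings


# Print the expected labelled fraction for 0 to 6 doublings.
for n in range(0, 7):
    print(f"{n} doublings: {labeled_fraction(n):.1%} labelled")
```

Real incorporation is usually faster than this estimate because protein turnover also replaces unlabelled protein.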

Chemical Labeling

Heavy reagent: d8 ICAT (X = deuterium)
Light reagent: d0 ICAT (X = hydrogen)
[Figure: chemical structure of the ICAT reagent]
Gygi SP et al., 1999

ICAT (Isotope-Coded Affinity Tag)

ICAT
[Figure: ICAT workflow]
All cysteines labeled with light ICAT
Analyze by LC-MS/MS (Protein A, Protein B, Protein C, Protein D, Protein E, Protein F, ...)
Quantitate relative protein levels by measuring peak ratios
Identify proteins by sequence information (MS/MS scan): NH2-EACDPLR-COOH
Thiol-specific group = binds to cysteines

ICAT
Thiol-specific group = binds to cysteines

Quantitation at MS1 level (m/z)
Labeling doubles sample complexity: the instrument has more “features” to identify, which decreases the identification rate.

iTRAQ (isobaric Tag for Relative and Absolute Quantitation)
Reacts with primary amines (peptide N-terminus and Lys side chains)
Total mass of label = 145 Da ALWAYS
Sample prep
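The constant 145 Da total can be illustrated with the 4-plex reagent. The sketch below assumes the usual nominal masses for 4-plex iTRAQ (reporter groups of 114 to 117 Da paired with balance groups of 31 to 28 Da); these numbers are background knowledge rather than something stated on the slide, and the reporter intensities are made-up values.

```python
from typing import Dict

# Assumed nominal masses (Da) for 4-plex iTRAQ: reporter -> balance group.
REPORTER_TO_BALANCE: Dict[int, int] = {114: 31, 115: 30, 116: 29, 117: 28}


def tag_mass(reporter: int) -> int:
    """Nominal mass of the whole tag: reporter plus balance group."""
    return reporter + REPORTER_TO_BALANCE[reporter]


def relative_abundance(reporter_intensities: Dict[int, float]) -> Dict[int, float]:
    """Share of each channel in the summed MS2 reporter-ion signal."""
    total = sum(reporter_intensities.values())
    if total <= 0:
        raise ValueError("no reporter signal")
    return {ch: inten / total for ch, inten in reporter_intensities.items()}


# Every channel adds the same mass, so labelled peptides co-elute and share
# one MS1 peak; the channels separate only after fragmentation (MS2).
for reporter in REPORTER_TO_BALANCE:
    print(reporter, "->", tag_mass(reporter), "Da")

# Hypothetical reporter-ion intensities read from one MS2 spectrum.
ms2_reporters = {114: 1000.0, 115: 2000.0, 116: 500.0, 117: 500.0}
print(relative_abundance(ms2_reporters))
```

Because the tags are isobaric, quantitation happens at the MS2 level and sample complexity at MS1 does not increase.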

iTRAQ

iTRAQ Multiplexing

Metabolic VS Chemical Labeling
• Metabolic labeling
  - 15N labeling
  - SILAC
  - Living cells
  - Efficient labeling
  - Simple!
• Chemical methods
  - Many… but ICAT is the prototype
  - Isolated protein sample
  - Depends on chemistry
  - Multi-step protocols
  - Require optimization

Summary Kolkman A et al., 2005

Label-free
[Figure: mobile phase gradient from A to B over time; C18 column, 25 cm long; a 20 s interval is marked]
A = 5% organic solvent in water
B = 95% organic solvent in water

Label-free Strassberger V et al., 2010

Summary


Take home message
1. Quantitation can be done gel-free
2. Labeling can be performed at protein or peptide level, during normal cell growth or in vitro
3. Quantitation can be achieved at MS1 or MS2 level
4. Method choice depends on experimental design, costs, expertise, etc.
5. In my PERSONAL OPINION, chemical labeling should be avoided at all costs unless heavy multiplexing is required

Applications

State A: Light Isotope (C6)
State B: Heavy Isotope (C6)
Mix 1:1
Optional Protein Fractionation
Digest with Trypsin
Protein Identification and Quantitation by LC-MS
Upregulated protein - Peptide ratio >1

Control vs Tumor Cell?

Control vs drug treated cell?

Control vs knock-out cell?

Applications – Cell Biology Geiger T et al., 2012

Applications – Cell Biology

Applications – Immunology Meissner et al, Science 2013

Clinical Proteomics
A. Amyloid tissue stained with Congo red; B. After LMD.

Wisniewski JR et al., 2012

Interactomics Schulze and Mann, 2004 Schulze WX et al., 2005

Signaling Pathways

Take home message 1. Anything is possible!

SILAC



Importance of Dialyzed Serum
• Non-dialyzed serum contains free (unlabeled) amino acids!

No alterations to cell phenotype
C2C12 myoblast cell line
Labeled cells behaved as expected under differentiation protocols

Why is SILAC convenient?

• Convenient - no extra step introduced to the experiment, just special medium
• Labeling is guaranteed close to 99%. All identified proteins are in principle quantifiable
• Quantitation of proteins affected by different stimuli, disruption of genes, etc.
• Quantitation of post-translational modifications (phosphorylation, etc.)
• Identification and quantitation of interaction partners

Catch 22
- SILAC → custom formulation media (without Lys and/or Arg) $$$$$$
- Labeled amino acids – Lys4, Lys6, Lys8, Arg6, Arg10. Use concentrations according to the media formula (e.g., RPMI: Lys, 40 mg/L)
***** When doing Arg labeling, pay attention to proline conversion!

(50% of tryptic peptides in a random mixture predicted to contain 1 Pro)

Proline Conversion!

Typical SILAC experiment workflow

State A: Light Isotope (C6)
State B: Heavy Isotope (C6)
Mix 1:1
Optional Protein Fractionation
Digest with Trypsin
Protein Identification and Quantitation by LC-MS
Background protein - Peptide ratio 1:1
Upregulated protein - Peptide ratio >1
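The ratio logic of this workflow can be sketched as follows. The peptide intensities are made-up values, and summarizing a protein by the median of its peptide heavy/light ratios is a common choice assumed here, not something the slide specifies.

```python
from statistics import median
from typing import Dict, List, Tuple


def protein_ratio(pairs: List[Tuple[float, float]]) -> float:
    """Median heavy/light ratio over (light, heavy) peptide intensity pairs."""
    ratios = [heavy / light for light, heavy in pairs if light > 0]
    if not ratios:
        raise ValueError("no usable peptide pairs")
    return median(ratios)


# Hypothetical (light, heavy) intensities for peptides of two proteins.
peptides: Dict[str, List[Tuple[float, float]]] = {
    "background protein": [(1.0e6, 1.1e6), (2.0e6, 1.9e6), (5.0e5, 5.0e5)],
    "upregulated protein": [(1.0e6, 3.0e6), (4.0e5, 1.4e6), (2.0e6, 6.4e6)],
}

for name, pairs in peptides.items():
    ratio = protein_ratio(pairs)
    status = "changed in State B" if ratio > 1.5 else "unchanged (about 1:1)"
    print(f"{name}: H/L = {ratio:.2f} -> {status}")
```

The 1.5-fold cutoff is only a placeholder; in practice the threshold is set from the spread of ratios in the 1:1 background.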

Additional validation criteria
* Never use labelled Arg and Lys with the same mass difference (Lys6/Arg6)
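The spacing between the light and heavy peaks of a peptide follows from the number of labelled residues and the charge state. A minimal sketch, assuming the nominal mass shifts implied by the label names (for example Lys8 = +8 Da, Arg10 = +10 Da); the function is illustrative, and EACDPLR is used only as an example sequence.

```python
from typing import Dict

# Nominal mass shift (Da) per labelled residue, taken from the label names.
NOMINAL_SHIFT: Dict[str, int] = {
    "Lys4": 4, "Lys6": 6, "Lys8": 8, "Arg6": 6, "Arg10": 10,
}


def mz_spacing(sequence: str, lys_label: str, arg_label: str, charge: int) -> float:
    """m/z distance between the light and heavy forms of one peptide."""
    if charge < 1:
        raise ValueError("charge must be at least 1")
    shift = (sequence.count("K") * NOMINAL_SHIFT[lys_label]
             + sequence.count("R") * NOMINAL_SHIFT[arg_label])
    return shift / charge


# A tryptic peptide ending in Arg, observed as a 2+ ion, labelled with Arg10.
print(mz_spacing("EACDPLR", "Lys8", "Arg10", 2))  # prints 5.0
```

The same function shows how a heavier label or a lower charge state widens the gap between the paired peaks.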


Triple SILAC
Triple Encoding SILAC allows:
- Monitoring of three cellular states simultaneously
- Study of the dynamics of signal transduction cascades, even on short time scales
Blagoev B et al., 2004

Five time point “multiplexing” profile Blagoev B et al., 2004

Quantitative phosphoproteomics in EGFR signaling

[Figure: experimental workflow]
SILAC HeLa cells stimulated with EGF in two triple-labeling experiments: 0-5-10 min and 1-5-20 min
Lysis and fractionation (cytoplasmic extract, nuclear extract) and digestion
Phosphopeptide enrichment: SCX / TiO2, 4x (10 SCX fractions + FT)
44 LC-MS runs
ID and quantitation

Blagoev B et al., 2004

MAP kinases activation

[Figure: signal progression versus EGF stimulation time (minutes) for EGFr-pY1110, ShcA-pY427, ERK1-pY204, ERK2-pY187 and EMS1-pS405]

Spatial distribution of phosphorylation dynamics

Cytosolic STAT5 translocates to the nucleus upon phosphorylation

Interactomics Schulze and Mann, 2004 Schulze WX et al., 2005

Limitations
- Expensive
- Quantitation at MS1 level → increased sample complexity
- Cells have to grow in culture. Not an option for primary cells, tissues or body fluids.
- Cell lines have to tolerate dialyzed serum.

SILAC-labeled organism Sury MD et al., 2010

Super-SILAC Geiger T et al., 2010

Spike-In SILAC Geiger T et al., 2013

Take home message
1. Arguably the best labeling strategy: easy to handle, no chemical steps, >98% incorporation → low variability
2. Successfully used in the most diverse applications
3. Cells must be stable and growing in the media
4. There are decent alternative strategies for primary cells or organisms.

Label-free


Label-free
[Figure: chromatographic peaks (intensity versus time, 10 s scale) for 500 fmol and 100 fmol of peptide]
Strassberger V et al., 2010

Label-free Kiyonami R. et al, Thermo-Finnigan application note 500, 2010.

Label-free
[Figure: scatter of replicate measurements. Ideal (low std) versus reality (late 90’s)]

Label-free Strassberger V et al., 2010

Label-free Neilson et al., Proteomics 2011

Spectral Count
[Figure: MS1 (or MS) scan with a precursor at m/z 899.013, which is selected for MS2 (or MS/MS)]

Spectral Count
[Figure: chromatogram over time, 20 s scale]
Depending on how complex the sample is at a specific retention time, the machine might be busy (i.e., doing many MS2 scans) or idle (i.e., doing few or no MS2 scans).

Limitation in Spectral Count
[Figure: MS scans and MS2 scans over time in two cases; each case yields 2 counts]
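Spectral counting itself is just a tally of identified MS2 spectra per protein, which the following sketch makes explicit. The peptide-spectrum matches, protein names and sequences are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple


def spectral_counts(psms: Iterable[Tuple[str, str]]) -> Counter:
    """Count identified MS2 spectra per protein.

    Each item is one peptide-spectrum match given as (protein, peptide).
    """
    return Counter(protein for protein, _peptide in psms)


# Hypothetical identifications from one LC-MS run.
psms = [
    ("Protein A", "EACDPLR"),
    ("Protein A", "EACDPLR"),
    ("Protein B", "LVNELTEFAK"),
    ("Protein B", "YLYEIAR"),
    ("Protein C", "AEFVEVTK"),
]

for protein, count in sorted(spectral_counts(psms).items()):
    print(protein, count)
```

The tally carries no intensity information, so two peptides with very different abundances can end up with the same count if the instrument sampled each of them twice.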

Area Under Curve measurement
[Figure: ion intensity in each MS1 scan plotted against retention time, with the MS2 scans marked; the area under the curve (AUC) is measured]
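One simple way to compute such an area is trapezoidal integration of the MS1 intensities over retention time. This is a minimal sketch with made-up data points; it is not the algorithm of any particular software package.

```python
from typing import Sequence


def area_under_curve(rt: Sequence[float], intensity: Sequence[float]) -> float:
    """Trapezoidal area of an extracted ion chromatogram.

    rt holds the retention time of each MS1 scan (ascending order) and
    intensity holds the ion intensity measured in that scan.
    """
    if len(rt) != len(intensity):
        raise ValueError("rt and intensity must have the same length")
    area = 0.0
    for i in range(1, len(rt)):
        width = rt[i] - rt[i - 1]
        area += width * (intensity[i] + intensity[i - 1]) / 2.0
    return area


# Hypothetical MS1 scans across one elution peak about 20 s wide.
rt_seconds = [0.0, 5.0, 10.0, 15.0, 20.0]
ion_intensity = [0.0, 2.0e6, 4.0e6, 2.0e6, 0.0]

print(area_under_curve(rt_seconds, ion_intensity))
```

Unlike a spectral count, the area uses every MS1 scan in which the peptide is seen, whether or not an MS2 scan was triggered.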

Importance of Resolution for label-free
[Figure: RT versus m/z maps showing 2+ and 3+ features]

Importance of Resolution for label-free
- Label-free became reliable (*)
Cox and Mann, Nature Biotechnol 26, 2008.

Area Under Curve measurement
[Figure: FTMS full MS spectrum (file 080711_Gustavo_Mtub_07, scan #1001, RT 24.80, + p NSI, m/z 300.00-2000.00, NL 3.43E6), zoomed to m/z 790-803; y-axis is relative abundance (0-100). Labeled peaks: 790.90, 792.68, 793.82, 795.17, 796.31, 797.73, 798.32, 798.83, 799.33, 799.83, 800.32, 802.13, 802.72, 803.40. Annotations: 1. Retention time, 2. Peak intensity]

Cox and Mann, Nature Biotechnol 26, 2008.
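High resolution matters because the charge state of a feature can be read directly from the spacing of its isotope peaks. The sketch below assumes that the five peak labels 798.32 to 800.32 from the spectrum on this slide belong to a single isotope cluster; the function is illustrative and is not how MaxQuant detects features.

```python
from typing import Sequence

# Mass difference between 13C and 12C in Da.
C13_C12_DELTA = 1.00335


def charge_from_isotopes(peaks_mz: Sequence[float]) -> int:
    """Estimate charge state from the mean spacing of isotope peaks."""
    peaks = sorted(peaks_mz)
    if len(peaks) < 2:
        raise ValueError("need at least two isotope peaks")
    mean_spacing = (peaks[-1] - peaks[0]) / (len(peaks) - 1)
    return round(C13_C12_DELTA / mean_spacing)


# Peak labels read from the spectrum (assumed to be one isotope cluster).
cluster = [798.32, 798.83, 799.33, 799.83, 800.32]
print(charge_from_isotopes(cluster))  # prints 2
```

A spacing of about 0.5 m/z indicates a 2+ ion, while about 0.33 m/z would indicate a 3+ ion; at low resolution these peaks merge and the features cannot be told apart.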

Regarding label-free…
Calculate individual peptide “Intensity”.
Protein Intensity = mean of peptide intensities
- LFQ normalization
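The protein-level summary described here can be sketched in a few lines. The peptide intensities are made-up values, and the LFQ normalization step is deliberately left out because the slide does not describe how it is done.

```python
from statistics import mean
from typing import Dict, List


def protein_intensity(peptide_intensities: List[float]) -> float:
    """Protein intensity as the mean of its peptide intensities."""
    if not peptide_intensities:
        raise ValueError("protein has no quantified peptides")
    return mean(peptide_intensities)


# Hypothetical peptide intensities (for example AUC values) per protein.
run: Dict[str, List[float]] = {
    "Protein A": [2.0e6, 4.0e6, 6.0e6],
    "Protein B": [5.0e5, 1.5e6],
}

for protein, intensities in run.items():
    print(protein, protein_intensity(intensities))
```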

Data without Normalization
- 7422 proteins identified
- 7105 proteins quantified (95.72%)

How was this demonstrated?

Yeast model
Ghaemmaghami S. et al., Nature 425, 2003
Huh WK. et al., Nature 425, 2003

MaxQuant and Yeast De Godoy LM. et al, 2008.

- Label-free became reliable AND showed good correlation with a well-established model

Label-free in primary cells

Higher CD4+ Higher CD8+ Pattern Recognition Receptors Pathway

Label-free in primary cells
Infection with Sendai virus (activates RIG-I PRR)
RIG-I knockout

Take home message
1. “Label-free” covers ANY method that does not use labeling
2. Area Under Curve calculations are the most appropriate
3. Reliability is heavily dependent on good instrumentation and good bioinformatics (MaxQuant)
4. Currently, almost as good as SILAC (yet slightly less accurate)

SRM / MRM


A little history…

So far, ID everything we can
[Figure: mobile phase gradient from A to B over time; C18 column, 25 cm long; a 20 s interval is marked]

Targeted analysis
In some cases, the researcher does not want the MS instrument to spend time trying to sequence as much as possible, but only to “search” for and sequence pre-determined peptides.

- Biomarker research
- Tracking specific metabolic pathways
- Tracking low-abundance proteins in challenging samples (e.g., in serum)

Plasma dynamic range Schiess R et al., 2009

Improving detection through targeting Michalski A et al., 2011

Biomarker

Discovery phase
Screening the sample gives you the following info for protein X:
- Most intense peptides (not all peptides from the same protein have the same intensity)
- Most common m/z format (+2, +3, PTM?)
- Their retention times
- Their fragmentation profiles (does the +2 fragment well?)

Biomarker

Shorter gradient = More complex MS1
As you decrease separation resolution, you increase the chance that two or more peptides with different sequences BUT very close m/z elute at the same time.

SRM (Selected Reaction Monitoring)

Different transitions from same peptide
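A transition is a pair of m/z values: the precursor selected in the first quadrupole and one fragment selected in the third. The sketch below computes a few y-ion transitions for one peptide from standard monoisotopic residue masses. The choice of EACDPLR (with unmodified cysteine), the 2+ precursor and the selected y ions are illustrative assumptions, not values taken from the slides.

```python
from typing import List, Tuple

# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def precursor_mz(sequence: str, charge: int) -> float:
    """m/z of the intact peptide at the given charge (Q1 value)."""
    mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    return (mass + charge * PROTON) / charge


def y_ion_mz(sequence: str, n: int) -> float:
    """m/z of the singly charged y ion made of the last n residues (Q3 value)."""
    if not 1 <= n < len(sequence):
        raise ValueError("n must be between 1 and len(sequence) - 1")
    return sum(RESIDUE_MASS[aa] for aa in sequence[-n:]) + WATER + PROTON


def transitions(sequence: str, charge: int,
                y_numbers: List[int]) -> List[Tuple[str, float, float]]:
    """List (label, Q1 m/z, Q3 m/z) for the requested y ions."""
    q1 = precursor_mz(sequence, charge)
    return [(f"y{n}", round(q1, 3), round(y_ion_mz(sequence, n), 3))
            for n in y_numbers]


# Several transitions that share one precursor but differ in the fragment.
for label, q1, q3 in transitions("EACDPLR", 2, [3, 4, 5]):
    print(f"{label}: Q1 {q1} -> Q3 {q3}")
```

Monitoring several transitions per peptide helps confirm that the signal comes from the intended peptide and not from a co-eluting species with a similar precursor m/z.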

Performance with synthetic peptides


Number of biomarkers discovered so far by MS: 0

Spiking synthetic labeled peptide for absolute quantitation
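The arithmetic behind this approach is a simple proportion: the endogenous (light) peptide amount equals the light/heavy signal ratio times the known amount of spiked heavy standard. A minimal sketch, assuming the light and heavy forms give equal signal per mole; the peak areas and the spiked amount are made-up values.

```python
def absolute_amount(light_area: float, heavy_area: float,
                    spiked_fmol: float) -> float:
    """Endogenous peptide amount (fmol) from a spiked heavy standard.

    Assumes the light and heavy peptides ionize and fragment equally well,
    so their peak-area ratio equals their molar ratio.
    """
    if heavy_area <= 0:
        raise ValueError("heavy standard was not detected")
    return light_area / heavy_area * spiked_fmol


# Hypothetical peak areas with 100 fmol of heavy standard spiked in.
print(absolute_amount(light_area=3.0e6, heavy_area=1.5e6, spiked_fmol=100.0))
```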

Applying SRM to a proper model
Bacterial genomic structure
- 700-6000 genes
- No alternative splicing
- Limited PTM presence

Discovery Phase

Validation on metabolic network
- It opens possibilities to study the implications of molecular function at the metabolic level.
- Generate a knockout, run a discovery phase to visualize pathways possibly altered by the KO, then target the candidate pathways for in-depth quantitation.

Take home message

- Targeted analysis: ignore the whole sample and focus on a few proteins.
- The first step is to run the regular analysis to collect acquisition features for as many peptides as possible.
- Relevant in biomarker research
- Very challenging for complex samples, very powerful for simpler organisms and for pure biology projects.