Metabolomics

EMBO Practical Course on Metabolomics Bioinformatics for Life Scientists

“Dissecting an untargeted metabolomic workflow” Oscar Yanes, PhD

Untargeted metabolomics workflow

• Sample preparation
• Experimental design
• Sample analysis by MS and NMR
• Pre-processing data analysis
• Metabolite identification
• Experimental validation
• Hypothesis

Ultimate goal of metabolomics

• List of metabolites differentially regulated
• Biomarker discovery
• Disease vs. control
• Pathway analysis
• Model construction
• Scientific literature
• Validation
• Mechanism
• Hypothesis

THE IMPORTANCE OF EXPERIMENTAL DESIGN

[Cartoon: a conversation between ME and a COLLABORATOR]

"I want to do metabolomics."

"…"

"I have many samples at -80 °C. Could you do metabolomics and find out something?"

"!!"

BASIC DIAGRAM OF A MASS SPECTROMETER

• Gas-phase: Gas chromatography
• Liquid-phase: Liquid chromatography, Capillary electrophoresis
• Solid-phase: Surface-based

Ionization methods:

• Electron ionization (EI)
• Chemical ionization (CI)
• Atmospheric pressure chemical ionization (APCI)
• Electrospray ionization (ESI)
• Laser desorption ionization (LDI)

Watch out for serum/plasma samples from biobanks!

[Figure: four panels (Glucose, Pyruvic Acid, Lactate, Choline), each plotted against Time (h) at 0, 4, 12 and 24 h]

Requisite for untargeted metabolomics

Maximize ionization efficiency over the whole mass range (e.g., m/z 80-1500)

• Number of features
• Intensity of the features
• Coverage of the metabolome
• Accurate quantification and identification of metabolites

How do we increase the number of features and their intensity?

[Figure: LC/MS data plotted on intensity, mass and time axes]

Feature: molecular entity with a unique m/z and retention time value

• Sample preparation: extraction method
• Chromatography: stationary phase, mobile phase
• Ion funnel technology
• etc.

Extraction method

Hot EtOH/ammonium acetate vs. cold acetone/MeOH

Only 45% of the metabolites are detected with acetone/MeOH.

[Figure label: MS/MS threshold]

Yanes O., et al. Anal. Chem. 2011; 83(6):2152-61

Liquid Chromatography: mobile phase

• Ammonium fluoride
• Ammonium acetate
• Formic acid

[Figure panels: ammonium fluoride, ammonium acetate]

Yanes O., et al. Anal. Chem. 2011; 83(6):2152-61

Chromatography: stationary phase

• HILIC
• RP C18/C8
• Effect of pH; ammonium salts; ion pairs (e.g. TBA)
• LC flow rate and pressure: UPLC vs. HPLC vs. nanoLC (vs. GC!)

[Figure panels: HPLC, UPLC]

PRACTICAL ASPECTS

1. Number of scans/second
   Implications in LC/MS and GC/MS: quantification (maximum intensity or integrated area)

2. Instrument resolution
   Implications: detector saturation, quantification

3. Sample amount injected
   Implications: detector saturation
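The link between scan rate and quantification can be illustrated with a small simulation. This is only a sketch under stated assumptions: the Gaussian peak shape, its height and width, and the two scan rates are hypothetical values chosen for illustration, not instrument settings from the slides. It samples the same chromatographic peak at two scan rates and compares the two quantification options named above, maximum intensity and integrated area.

```python
import math

def gaussian_peak(t, apex=30.0, sigma=1.5, height=1000.0):
    """Ideal chromatographic peak (hypothetical shape and values)."""
    return height * math.exp(-0.5 * ((t - apex) / sigma) ** 2)

def acquire(scans_per_second, start=20.0, stop=40.0, offset=0.0):
    """Sample the peak at a fixed scan rate; offset delays the first scan."""
    n_scans = round((stop - start) * scans_per_second) + 1
    times = [start + offset + i / scans_per_second for i in range(n_scans)]
    return times, [gaussian_peak(t) for t in times]

def max_intensity(intensities):
    """Quantification by the highest sampled point."""
    return max(intensities)

def integrated_area(times, intensities):
    """Quantification by trapezoidal integration of the sampled peak."""
    return sum(
        0.5 * (intensities[i] + intensities[i + 1]) * (times[i + 1] - times[i])
        for i in range(len(times) - 1)
    )

# Analytical area of the ideal Gaussian: height * sigma * sqrt(2 * pi).
TRUE_AREA = 1000.0 * 1.5 * math.sqrt(2 * math.pi)

fast_t, fast_i = acquire(scans_per_second=5.0)
slow_t, slow_i = acquire(scans_per_second=0.2, offset=2.5)

for label, t, i in (("5 scans/s", fast_t, fast_i), ("0.2 scans/s", slow_t, slow_i)):
    print(label, round(max_intensity(i), 1), round(integrated_area(t, i), 1))
```

With too few scans across the peak, the apex is missed and both the maximum intensity and the integrated area are underestimated in this toy case.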

RAW METABOLOMICS DATA

FROM RAW DATA TO METABOLITE IDs (LC/MS and GC/MS)

• RAW DATA CONVERSION
• PRE-PROCESSING
• STATISTICAL ANALYSIS
• METABOLITE IDENTIFICATIONS
• PATHWAY ANALYSIS

LC-MS WORKFLOW

LC-MS RAW DATA → PROTEOWIZARD → mzData → PRE-PROCESSING → mZRT FEATURES TABLE → STATISTICAL ANALYSIS → IDENTIFICATION

mZRT Features Table (rows: features; columns: samples M1, M2, ...; cells: intensity I of the feature in the sample)

          M1              M2              ...
mZRT1     I(mZRT1, M1)    ...             ...
mZRT2     ...             I(mZRT2, M2)    ...
mZRT3     ...             ...             ...

Feature: an individual ion with a unique mass-to-charge ratio and a unique retention time
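The idea of a features table can be sketched in a few lines. This is an illustrative sketch, not the grouping algorithm used by XCMS: peaks from different samples are assigned to the same feature when both m/z and retention time agree within tolerances. The `Peak` class, the tolerances and the peak values are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Peak:
    mz: float         # mass-to-charge ratio
    rt: float         # retention time in seconds
    intensity: float

def build_feature_table(samples, mz_tol=0.01, rt_tol=10.0):
    """Greedy grouping of peaks into mZRT features (tolerances are assumptions).

    samples maps a sample name to its list of peaks.
    Returns (features, table): features is a list of (mz, rt) reference values,
    table is a parallel list of {sample name: intensity} rows.
    """
    features, table = [], []
    for name, peaks in samples.items():
        for peak in peaks:
            for idx, (mz, rt) in enumerate(features):
                if abs(peak.mz - mz) <= mz_tol and abs(peak.rt - rt) <= rt_tol:
                    table[idx][name] = peak.intensity
                    break
            else:
                # No existing feature matches: start a new row.
                features.append((peak.mz, peak.rt))
                table.append({name: peak.intensity})
    return features, table

# Hypothetical peak lists for two samples.
samples = {
    "M1": [Peak(147.0764, 95.0, 52000.0), Peak(147.0764, 240.0, 8000.0)],
    "M2": [Peak(147.0770, 97.5, 61000.0), Peak(205.0972, 410.0, 3000.0)],
}
features, table = build_feature_table(samples)
for (mz, rt), row in zip(features, table):
    # Missing intensities are shown as 0.0.
    print(f"{mz:.4f}@{rt:.0f}s", [row.get(name, 0.0) for name in samples])
```

Note that the two M1 peaks share the same m/z but elute at different times, so they end up as two separate features.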

LC-MS WORKFLOW: RAW LC-MS DATA TO mzXML (PROTEOWIZARD)

[Nature Biotechnology, 30 (918–920) (2012)]

VENDOR           FORMATS                       CONVERTER
Agilent          MassHunter .d                 ProteoWizard
Bruker           Compass .d, YEP, BAF, FID     ProteoWizard
Thermo Fisher    RAW                           ProteoWizard
Waters           MassLynx .raw                 ProteoWizard
AB Sciex         WIFF                          ProteoWizard
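Batch conversion can be scripted. The sketch below assumes that ProteoWizard's `msconvert` command-line tool is installed and on the PATH; the file names and the output folder are hypothetical. The function only builds the command, it does not run it.

```python
from pathlib import Path

def msconvert_command(raw_file, out_dir="mzxml"):
    """Build (but do not run) an msconvert call that writes mzXML."""
    return ["msconvert", str(Path(raw_file)), "--mzXML", "-o", str(Path(out_dir))]

# Hypothetical vendor files, one per injection.
raw_files = ["sample_01.d", "sample_02.raw", "qc_01.wiff"]
commands = [msconvert_command(f) for f in raw_files]

for cmd in commands:
    print(" ".join(cmd))
    # To execute for real: subprocess.run(cmd, check=True)
```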

LC-MS WORKFLOW: XCMS PRE-PROCESSING

• http://metlin.scripps.edu/download/
• Free & open source
• Based on R
• On-line version
• Suitable for GC-MS and LC-MS

Analytical Chemistry, 78(3), 779–787, 2006
Analytical Chemistry, 84(11), 5035-5039, 2012

LC-MS WORKFLOW: XCMS PRE-PROCESSING: 1. FEATURE DETECTION

[BMC Bioinformatics, 2008 9:504]

1. Dense regions in m/z space
2. Gaussian peak shape in chromatogram
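The two criteria above can be sketched in simplified form. This is not the XCMS implementation; it is a toy version under stated assumptions. Step 1 collects centroids whose m/z stays within a ppm tolerance over consecutive scans (a "dense region"); step 2 scores how Gaussian-like the intensity trace of that region is, using a plain Pearson correlation against a Gaussian with the same centre and width. The tolerance, the minimum length and the data are hypothetical.

```python
import math

def find_rois(scans, ppm=25.0, min_scans=4):
    """scans: one list of (mz, intensity) centroids per scan.
    Returns regions as lists of (scan_index, mz, intensity)."""
    open_rois, finished = [], []
    for scan_idx, centroids in enumerate(scans):
        extended, unused = [], list(centroids)
        for roi in open_rois:
            mean_mz = sum(p[1] for p in roi) / len(roi)
            tol = mean_mz * ppm * 1e-6
            match = next((c for c in unused if abs(c[0] - mean_mz) <= tol), None)
            if match is not None:
                unused.remove(match)
                roi.append((scan_idx, match[0], match[1]))
                extended.append(roi)
            elif len(roi) >= min_scans:
                finished.append(roi)  # region ended and is long enough
        # Every unmatched centroid starts a new candidate region.
        extended.extend([(scan_idx, mz, inten)] for mz, inten in unused)
        open_rois = extended
    finished.extend(r for r in open_rois if len(r) >= min_scans)
    return finished

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def gaussian_similarity(intensities):
    """Correlation of a trace with a Gaussian of matching centre and width."""
    total = sum(intensities)
    centre = sum(i * y for i, y in enumerate(intensities)) / total
    var = sum(y * (i - centre) ** 2 for i, y in enumerate(intensities)) / total
    if var <= 0:
        return 0.0
    model = [math.exp(-0.5 * (i - centre) ** 2 / var) for i in range(len(intensities))]
    return pearson(intensities, model)

# Hypothetical data: one eluting ion near m/z 200.1 plus isolated noise centroids.
peak = [50.0, 200.0, 600.0, 1000.0, 600.0, 200.0, 50.0]
scans = [[] for _ in range(9)]
for k, inten in enumerate(peak):
    scans[k + 1].append((200.1000 + 0.0005 * (k % 2), inten))
scans[0].append((350.2, 80.0))
scans[4].append((512.3, 90.0))
scans[8].append((150.7, 70.0))

rois = find_rois(scans)
score = gaussian_similarity([p[2] for p in rois[0]])
print(len(rois), round(score, 3))
```

Isolated noise centroids never reach the minimum length, so only the eluting ion survives as a region.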

LC-MS WORKFLOW: XCMS PRE-PROCESSING: 2. RETENTION TIME CORRECTION

LC-MS WORKFLOW

• 10^3-10^4 mZRT features
• Features redundancy: IDENTIFICATION NOT FEASIBLE!
  - adducts: [M+H]+, [M+Na]+, [M+NH4]+, [M+H-H2O]+ …
  - isotopes: [M+1], [M+2], [M+3]
• Many mZRT features are noisy in nature and irrelevant to our phenomena
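A quick calculation shows where the redundancy comes from. The sketch below lists the ions a single metabolite can produce from the adducts named above, each with its first isotope peaks. It assumes singly charged positive ions and treats isotope peaks as 13C spacings; the neutral mass is the 146.0691 value used later in this talk, and the choice of two isotope peaks is arbitrary.

```python
# Monoisotopic mass shifts for singly charged positive ions (Da).
ADDUCT_SHIFTS = {
    "[M+H]+": 1.007276,
    "[M+Na]+": 22.989218,
    "[M+NH4]+": 18.033823,
    "[M+H-H2O]+": 1.007276 - 18.010565,
}
C13_SHIFT = 1.003355  # mass difference between 13C and 12C

def expected_ions(neutral_mass, n_isotopes=2):
    """Return {label: m/z} for every adduct and its first isotope peaks."""
    ions = {}
    for name, shift in ADDUCT_SHIFTS.items():
        for k in range(n_isotopes + 1):
            label = name if k == 0 else f"{name} isotope M+{k}"
            ions[label] = neutral_mass + shift + k * C13_SHIFT
    return ions

ions = expected_ions(146.0691)
for label, mz in ions.items():
    print(f"{label:28s} {mz:.4f}")
print(len(ions), "possible mZRT features from one metabolite")
```

Even this short adduct list turns one metabolite into a dozen co-eluting m/z values, which is why features must be ranked and annotated before identification.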

STATISTICAL ANALYSIS

FEATURES RANKING: features that vary according to our phenomena are retained for further identification experiments.

LC-MS WORKFLOW: FEATURES RANKING CRITERIA (I) ANALYTICAL VARIABILITY

- RANDOMIZE
- USE QCs TO CHECK ANALYTICAL VARIATION

[Figure: WORKLIST]

$$CV^{T}_{mZRT}(j) = \frac{S^{T}_{mZRT}(j)}{\bar{X}^{T}_{mZRT}(j)} \times 100$$

$$CV^{QC}_{mZRT}(j) = \frac{S^{QC}_{mZRT}(j)}{\bar{X}^{QC}_{mZRT}(j)} \times 100$$
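The two coefficients of variation above can be computed per feature as follows. This is a minimal sketch with several assumptions: S is read as the sample standard deviation and X as the mean intensity of feature j, T is read as the study samples and QC as the repeated quality-control injections, and the decision rule (keep a feature only when its QC variation is smaller than its variation across samples) is one plausible criterion that the slide does not state. All intensities are made up.

```python
from statistics import mean, stdev

def cv_percent(intensities):
    """Coefficient of variation in percent: standard deviation / mean * 100."""
    return stdev(intensities) / mean(intensities) * 100.0

def passes_analytical_variability(sample_intensities, qc_intensities):
    """Assumed rule: analytical (QC) variation must be below the variation in samples."""
    return cv_percent(qc_intensities) < cv_percent(sample_intensities)

# Hypothetical intensities of two features.
feature_a = {"samples": [100, 180, 60, 220, 140], "qc": [118, 122, 120, 121, 119]}
feature_b = {"samples": [100, 105, 98, 102], "qc": [60, 140, 100, 180]}

for name, feat in (("A", feature_a), ("B", feature_b)):
    keep = passes_analytical_variability(feat["samples"], feat["qc"])
    print(name, round(cv_percent(feat["samples"]), 1), round(cv_percent(feat["qc"]), 1), keep)
```

Feature B varies more in the QCs than in the samples, so any difference between groups could not be separated from instrument drift.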

USEFUL PLOTS IN EXPLORATORY DATA ANALYSIS

• RETINAS: Hypoxia (N=12) vs Normoxia (N=13), #mZRT=7654
• NEURONAL CELL CULTURES: KO (N=15) vs WT (N=11), #mZRT=6831

LC-MS WORKFLOW: FEATURES RANKING CRITERIA (IV) HYPOTHESIS TESTING + FDR

#features=4704

• α=0.05 (235 features significantly varied by chance, 26% out of 900)
• FDR, α=0.0074 (20 features varied by chance, 5% out of 404)
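The arithmetic behind these numbers, and one common way to control the false discovery rate, can be sketched as follows. The slide does not say which FDR procedure was used; the Benjamini-Hochberg step-up procedure below is an assumption chosen for illustration, and the ten p-values are made up.

```python
def expected_false_positives(n_features, alpha):
    """Number of features expected to pass the test by chance alone."""
    return n_features * alpha

def benjamini_hochberg(p_values, q=0.05):
    """Indices of features declared significant at FDR level q (BH step-up)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value is below its own threshold.
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])

# With 4704 features and alpha = 0.05, about 235 pass by chance,
# which is about 26% of the 900 features called significant.
by_chance = expected_false_positives(4704, 0.05)
print(round(by_chance), round(100 * by_chance / 900))

# Hypothetical p-values for ten features.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
unadjusted = [i for i, p in enumerate(p_values) if p <= 0.05]
adjusted = benjamini_hochberg(p_values, q=0.05)
print(len(unadjusted), "pass at alpha=0.05;", len(adjusted), "pass after FDR control")
```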

LC-MS WORKFLOW

• 10M data points: # mZRT=51908
• (i) analytical variability: # mZRT=38377
• (ii) features intensity: # mZRT=4704
• (iii) hypothesis testing + fold change: # mZRT=250
• Annotation, database look-up, identification experiments: 10-50 differential metabolites

Workflow for Metabolite Identification

Step 1: Select interesting features
Step 2: Search databases for accurate mass
Step 3: Filter “putative” identification list
Step 4: Compare RT and MS/MS of standards

Step 2: Search databases for accurate mass

Each feature returns many hits (HMDB, Metlin).

Common adducts: Na+, NH4+, K+, Cl-, and H2O loss

Adducts increase the number of hits returned!
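Why adducts increase the number of hits can be seen with a toy search. The sketch below is not the HMDB or Metlin search interface; it assumes a tiny in-memory table of neutral monoisotopic masses, a 10 ppm tolerance and a hypothetical measured m/z, and tries each adduct hypothesis in turn.

```python
# Mass shifts for singly charged positive ions (Da).
ADDUCTS = {
    "[M+H]+": 1.007276,
    "[M+Na]+": 22.989218,
    "[M+NH4]+": 18.033823,
    "[M+K]+": 38.963158,
    "[M+H-H2O]+": 1.007276 - 18.010565,
}

# Toy database of neutral monoisotopic masses (stand-in for a real database).
DATABASE = {
    "L-Glutamine": 146.069142,
    "Alanylglycine": 146.069142,
    "Pyroglutamic acid": 129.042593,
    "L-Lysine": 146.105528,
    "D-Glucose": 180.063388,
}

def search(mz, database, adducts, ppm=10.0):
    """Return (compound, adduct) pairs whose neutral mass fits the measured m/z."""
    hits = []
    for adduct, shift in adducts.items():
        neutral = mz - shift
        tol = neutral * ppm * 1e-6
        for name, mass in database.items():
            if abs(mass - neutral) <= tol:
                hits.append((name, adduct))
    return hits

measured_mz = 147.0764  # hypothetical feature
protonated_only = search(measured_mz, DATABASE, {"[M+H]+": ADDUCTS["[M+H]+"]})
all_adducts = search(measured_mz, DATABASE, ADDUCTS)
print(protonated_only)
print(all_adducts)
```

In this toy case the ammonium adduct of pyroglutamic acid has the same m/z as protonated glutamine, so allowing adducts adds a candidate that accurate mass alone cannot exclude.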

Step 3: Filter “putative” identification list

Eliminate:

• drugs?
• intensity in the mass spectrum
• adducts?
• matches with obviously inconsistent retention times

Example: a feature with m/z 733.56 is unlikely to be a phospholipid if it has a 1-min RT with reverse-phase chromatography.

Look for hits that implicate the same pathway and give those features priority.

Standards can be expensive; your intuition will save you money and time!

What experimental data should be required to constitute a metabolite identification?

• Accurate mass?
• Retention time?
• MS/MS data?

Unlike proteomics, no journals have requirements or guidelines for publication of metabolite identifications.

• accurate mass: “The identification of certain metabolites as their exact masses in their given biological context was strategic in the context of searching for biomarkers for CD.”

• accurate mass and retention time: “…this method enables untargeted profiling of metabolites using accurate mass-retention time (AMRT) identifiers.”

• accurate mass, retention time, and MS/MS: “Metabolites were putatively identified on the basis of accurate mass and retention time, and confirmed by comparing MS/MS data of unknowns to model compounds.”

accurate mass

Accurate mass identifications are putative

All structures have a neutral mass of 146.0691

Mass error (even if small) and adducts add more possibilities!
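The point can be checked numerically. The sketch below computes monoisotopic masses from elemental formulas; the four compounds sharing the formula C5H10N2O3 are examples chosen here for illustration (the slide's own structures are not recoverable from the transcript), and the small parser only handles simple formulas without brackets or charges.

```python
import re

# Monoisotopic masses of the most abundant isotopes (Da).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

def monoisotopic_mass(formula):
    """Monoisotopic mass of a simple formula such as 'C5H10N2O3'."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass

# Illustrative isomers: identical formula, therefore identical accurate mass.
CANDIDATES = {
    "L-Glutamine": "C5H10N2O3",
    "D-Glutamine": "C5H10N2O3",
    "Alanylglycine": "C5H10N2O3",
    "3-Ureidoisobutyric acid": "C5H10N2O3",
}

for name, formula in CANDIDATES.items():
    print(f"{name:26s} {monoisotopic_mass(formula):.4f}")

# A compound with the same nominal mass but a different formula is resolved.
lysine_mass = monoisotopic_mass("C6H14N2O2")
print(f"{'L-Lysine':26s} {lysine_mass:.4f}")
```

High mass accuracy separates lysine from the others, but no mass accuracy can separate compounds that share an elemental formula.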

accurate mass and retention time

Many structural isomers have the same retention time

[Figure: citrate and isocitrate]

Citrate and isocitrate have the same retention time but different MS/MS patterns.

accurate mass, retention time, and MS/MS

Step 4: Compare RT and MS/MS of standards

[Figure: Q-TOF MS/MS spectra of the standard (7α-hydroxy-cholesterol) and of the biological sample, both labelled at m/z 367.33; x-axis: Mass-to-Charge (m/z), 60-420]

Retention time will be available from the profiling experiment; however, obtaining MS/MS data for the feature of interest in the research sample typically requires another experiment.

Note: You only need to perform MS/MS on one research sample. Pick a sample from the group in which the feature is up-regulated!

[Figure label: Do not pick this group]
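Comparing the MS/MS spectrum of a standard with that of the sample can be made quantitative. The sketch below uses a cosine similarity on matched fragments, which is one common scoring choice and not necessarily what was used in this work; the fragment m/z values, the intensities and the 0.02 tolerance are hypothetical.

```python
import math

def cosine_similarity(spec_a, spec_b, mz_tol=0.02):
    """Cosine score between two spectra given as lists of (mz, intensity)."""
    dot, used = 0.0, set()
    for mz_a, int_a in spec_a:
        best = None
        for j, (mz_b, _) in enumerate(spec_b):
            if j in used or abs(mz_a - mz_b) > mz_tol:
                continue
            # Keep the closest unused fragment within tolerance.
            if best is None or abs(mz_a - mz_b) < abs(mz_a - spec_b[best][0]):
                best = j
        if best is not None:
            used.add(best)
            dot += int_a * spec_b[best][1]
    norm_a = math.sqrt(sum(i * i for _, i in spec_a))
    norm_b = math.sqrt(sum(i * i for _, i in spec_b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical spectra: same fragments, different relative intensities.
standard = [(84.08, 40.0), (112.11, 100.0), (129.14, 60.0)]
sample_similar = [(84.08, 38.0), (112.11, 95.0), (129.14, 62.0)]
sample_different = [(84.08, 40.0), (112.11, 10.0), (129.14, 60.0)]

score_similar = cosine_similarity(standard, sample_similar)
score_different = cosine_similarity(standard, sample_different)
print(round(score_similar, 3), round(score_different, 3))
```

Two spectra can contain the same fragment masses and still score poorly when one fragment's relative intensity differs strongly.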

What if the feature of interest is not in the database?
(or the model compound is not commercially available)

• FT-ICR MS can be used to limit chemical formulas
• MS/MS can provide structural insight (MS/MS library, bioinformatic approaches)
• NMR can provide structural details
• When a chemist is your best friend…

• Thermophile organism adapted to live at high temperatures.

• Organisms challenged with cold temperature (72 °C) and compared to high-temperature (95 °C) controls.

Feature up-regulated at cold temperature

[Figure: MS/MS spectra of the natural product and of N1-acetylthermospermine]

Identification???

Intensity of the m/z 112 fragment is significantly different. NOT A MATCH!

Chemical synthesis of the hypothesized structure is required.

The synthesized metabolite produces MS/MS data comparable to those of the natural product from Pyrococcus furiosus.

[Figure: natural product, N4-(N-acetylaminopropyl)spermidine and N1-acetylthermospermine]

Validate your metabolites!!

• Targeted metabolomics: LC and GC-Triple quadrupole MS
• Molecular biology techniques: Immunohistochemistry, Reverse Transcription-PCR, Gene expression array, Cell cultures, Animal experimentation, …

Thank you

email: [email protected]

web: www.yaneslab.com

Twitter: @yaneslab