Bioinformatics for Targeted
Metabolomics: Met and Unmet Needs
Klaus M. Weinberger
Biocrates Life Sciences AG, Innsbruck, Austria
3rd Annual Forum for SMEs
Information Workshop on European Bioinformatics Resources
Vienna, September 3 – 4, 2009
Agenda
• Why (targeted) metabolomics?
• Proof-of-concept in routine clinical diagnostics
• Technology platform
• Workflow integration & data analysis
• Issues
• Acknowledgements
BIOCRATES: “Creating Knowledge for Health”
Socrates (470-399 BC): Intelligence, Wisdom
Hippocrates (460-377 BC): Medicine, Health
Metabolomics is...
... the systematic identification and quantitation of all/
biologically relevant small molecules* in a given
compartment, cell, tissue or body fluid.
It represents the functional end-point of physiological
and pathophysiological processes depicting both
genetic predisposition and environmental influences
like nutrition, exercise or medication.
* no biopolymers (nucleic acids, polypeptides)
Why (targeted) metabolomics?
Six systems biologists examining an
elephant
Why metabolomics?
[Figure: information flow with approximate numbers of entities]
DNA (2.5·10^4) → Transcription → RNA (~10^5) → Translation → Polypeptides (~10^6) → PTM → Proteins (~10^7) → Enzymatic activity, transport etc. → Metabolites (~10^4)
Metabolites
• Functional end-point of physiology and pathophysiology
• Reasonable scale of the analytical challenge
• Direct mirror of environmental influences
- (Mal-)nutrition
- Exercise
- Medication
Metabolomics approaches
Sample cohorts
Metabolic profiling
(e.g. full scan LC-MS)
Differential pattern
information
HPLC-ToF-MS of urine samples
[Figure: mass spectrum, +TOF MS: 4.995 to 9.994 min from PR01-40-1_040092_56_1_0204029486.wiff, saturation correction applied, subtracted (12.994 to 13.994 min). x-axis: m/z, amu (100 to 1500); y-axis: intensity, counts (max. 32.0 counts). Labeled peaks include m/z 105.0298, 114.0871 and 376.1168.]
Sample: mouse urine, ID 0204029486 (3/8)
HPLC: Waters Atlantis dC18
Injection volume: 10 µl
Detection: pos. ToF-MS, m/z 100-1500
Mass accuracy: ~ 2 ppm
Data content: c. 2500 features per spectrum for statistical assessment
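The mass accuracy quoted for this spectrum (~ 2 ppm) is the relative deviation between a measured and a theoretical m/z value. A minimal sketch of that calculation follows. The function names are illustrative, and the theoretical m/z in the example is a hypothetical number chosen for illustration, not a value from the slide.

```python
def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Relative mass error in parts per million (ppm)."""
    if theoretical_mz <= 0:
        raise ValueError("theoretical m/z must be positive")
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(measured_mz: float, theoretical_mz: float,
                     tol_ppm: float = 2.0) -> bool:
    """True if the absolute mass error does not exceed tol_ppm."""
    return abs(ppm_error(measured_mz, theoretical_mz)) <= tol_ppm


if __name__ == "__main__":
    measured = 376.1168      # peak label taken from the spectrum
    theoretical = 376.1175   # hypothetical reference mass (assumption)
    print(f"{ppm_error(measured, theoretical):.2f} ppm")
    print(within_tolerance(measured, theoretical))
```

A tolerance window of this kind is one common way to match measured features against candidate masses; the 2 ppm default simply mirrors the figure quoted on the slide.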
PCA of LC/MS profiling data
Candidate drug vs. Untreated
Untreated vs. Rosiglitazone
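As a rough illustration of what a PCA of profiling data does, the sketch below mean-centres a small samples-by-features table and finds the first principal component by power iteration on the covariance matrix. This is a teaching sketch in pure Python under stated assumptions: the data are hypothetical, and real profiling data with thousands of features would normally be handled by a numerical library.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, explained_fraction, scores) for PC1.

    rows: list of samples, each a list of feature intensities.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Power iteration converges to the eigenvector with the largest
    # eigenvalue, provided the start vector is not orthogonal to it.
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    total_variance = sum(cov[i][i] for i in range(p))
    explained = eigenvalue / total_variance if total_variance else 0.0
    # Scores: projection of each centred sample onto PC1.
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centred]
    return v, explained, scores


if __name__ == "__main__":
    # Hypothetical intensities: 4 samples x 2 features.
    data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
    direction, explained, scores = first_principal_component(data)
    print(direction, explained, scores)
```

In a score plot such as the ones shown, each sample is drawn at its scores on the first two or three components, so that treatment groups separate if their metabolic patterns differ.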
Metabolomics approaches
Sample cohorts
Metabolic profiling
(e.g. full scan LC-MS)
Differential pattern
information
Identification of relevant
metabolites
Targeted metabolomics
(ID / quantitation
by SID on MS/MS)
Metabolite concentration
shifts
Functional annotation
Pathway mapping of quantitative Mx data
[Figure: urea cycle and nitric oxide synthesis. Metabolites: Asp, Cit, Argsucc, Fum, Arg, Orn, Urea, Carb-P, NO. Enzymes: ASS, ASL, ARG, OCT, NOS.]
Areas of application
• Basic research
- Functional genomics in biochemistry, physiology, cell biology, microbiology, ecology, …
• Agricultural & nutrition industry
- Plant intermediary metabolism
- Health effects of functional food products
• Biotechnology
- Optimization and monitoring of fermentation processes
• Pharmaceutical R&D
- Pathobiochemistry / characterization of disease models
- Safety / toxicology
- Efficacy / pharmacodynamics and mode-of-action
• Clinical diagnostics & theranostics
- Early diagnosis and accurate staging
- Specific monitoring of therapeutic effects
History and proof-of-concept in
clinical diagnostics
Sir Archibald Edward Garrod
• 1857, London – 1936, Cambridge
• Educated in Marlborough, Oxford, and London
• Postgraduate studies at the AKH in Vienna in 1884/85
• Publications on chemical pathology (e.g. of alkaptonuria, cystinuria, pentosuria)
• One gene – one enzyme hypothesis
• Concept of inborn errors of metabolism (Croonian lectures to the Royal College of Physicians, 1908)
Proof-of-concept in neonatology
• Newborn screening for inborn metabolic disorders
- replaced expensive monoparametric assays
- simultaneous detection of 40 - 60 metabolites (amino acids, acylcarnitines)
- simultaneous diagnosis of 20 - 30 monogenic diseases (AA metabolism, FATMO) with immediate treatment options
- total incidence > 1:2000
- unprecedented sensitivity, specificity, PPV
- co-pioneered in the mid-90s by BIOCRATES founder Bert Roscher
- > 1,300,000 newborns screened in Munich
- similar labs worldwide
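Sensitivity, specificity and positive predictive value (PPV) follow directly from the counts of true and false positives and negatives. The sketch below shows the standard definitions. The counts in the example are hypothetical and only illustrate why PPV depends so strongly on incidence; they are not results from the Munich screening program.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard screening quality measures from a 2x2 confusion table."""
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0:
        raise ValueError("each denominator must be non-zero")
    return {
        "sensitivity": tp / (tp + fn),  # affected newborns correctly flagged
        "specificity": tn / (tn + fp),  # healthy newborns correctly cleared
        "ppv": tp / (tp + fp),          # flagged newborns truly affected
    }


if __name__ == "__main__":
    # Hypothetical cohort: 100,000 newborns at an incidence of 1:2000,
    # i.e. 50 affected, of whom 49 are detected, plus 50 false positives.
    result = diagnostic_metrics(tp=49, fp=50, tn=99_900, fn=1)
    for name, value in result.items():
        print(f"{name}: {value:.4f}")
```

Even at a specificity above 99.9 % the PPV in this made-up example stays below 50 %, which is why a multiparametric panel that improves specificity matters for rare diseases.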
Lessons from newborn screening
1) Quantitative tandem mass spectrometry (stable isotope
dilution) is able to meet the most stringent quality criteria
(precision, accuracy) for routine diagnostics
2) The concept of multiparametric biomarkers improving
assay sensitivity and, particularly, specificity is valid for
many monogenic (and multifactorial) diseases
3) MS-based diagnostics can save costs despite a wider
analytical panel and improved diagnostic quality
Also true for therapeutic drug monitoring of immunosuppressants,
antidepressants, antiretrovirals...
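Quantitation by stable isotope dilution rests on a simple relation: a known amount of an isotope-labeled internal standard is added to the sample, and the analyte concentration is derived from the ratio of the two signals. A minimal single-point sketch follows. It assumes a response factor of 1 unless one is supplied, and the peak areas in the example are hypothetical.

```python
def sid_concentration(analyte_area: float, standard_area: float,
                      standard_conc: float,
                      response_factor: float = 1.0) -> float:
    """Single-point stable isotope dilution estimate.

    analyte_area, standard_area: integrated peak areas (same units).
    standard_conc: known concentration of the labeled internal standard.
    response_factor: analyte/standard response ratio (assumed 1.0 here).
    """
    if standard_area <= 0 or response_factor <= 0:
        raise ValueError("standard area and response factor must be positive")
    return analyte_area / standard_area * standard_conc / response_factor


if __name__ == "__main__":
    # Hypothetical areas; internal standard spiked at 10 µM.
    conc = sid_concentration(analyte_area=12_000, standard_area=8_000,
                             standard_conc=10.0)
    print(f"{conc} µM")
```

Because analyte and standard co-elute and ionize almost identically, the ratio is largely insensitive to losses during sample preparation and to ion suppression, which is one reason the approach can meet routine quality criteria.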
Goals in clinical diagnostics
[Figure: course from genetic predisposition through healthy and latent to ill, with the labels "Conventional diagnostics" and "Multiparametric diagnostics"]
• Early diagnosis
• Prophylaxis instead of therapy
• Subtyping / Staging
• Therapeutic drug monitoring
• Phenotypic pharmacogenomics
• Individualized (and more cost-efficient) medicine
Technology, workflow
integration & data analysis
Integrated technology platform
• Sample preparation
- Automated extraction and derivatization
- SPE
• Analytics
- Separation (LC, GC)
- Quantitation (MRM, SID)
- QA/QC
• BioInformatics
- Technical validation
- Statistical analysis
- Data visualization
- Biochemical interpretation
• LIMS/Database
• BioBank
- Clinical & experimental samples
- Diagnoses & lab data
[Figure labels: disease 1, disease 2]
Workflow overview
Staging of diabetic and non-diabetic
nephropathy by PCA-DA
MarkerView™
Identifying marker candidates: stage 3 vs.
stage 5 kidney disease (loadings)
Increasing oxidative stress in progressing CKD
• Oxidation of methionine is highly indicative of oxidative stress
• The ratio of Met-SO to Met is a quantitative measure of this biomarker
[Figure: Met-SO/Met ratio (y-axis, 0.000 to 0.030) for stage 3, stage 4 and stage 5]
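The Met-SO/Met ratio is a derived parameter computed per sample from two measured concentrations and then summarized per disease stage. A minimal sketch of that computation follows. The concentrations are hypothetical placeholders, not the study data shown in the figure.

```python
from statistics import mean


def metso_met_ratio(met_so: float, met: float) -> float:
    """Ratio of methionine sulfoxide to methionine (same units)."""
    if met <= 0:
        raise ValueError("methionine concentration must be positive")
    return met_so / met


def mean_ratio_by_stage(samples):
    """samples: iterable of (stage, met_so, met) tuples."""
    by_stage = {}
    for stage, met_so, met in samples:
        by_stage.setdefault(stage, []).append(metso_met_ratio(met_so, met))
    return {stage: mean(values) for stage, values in by_stage.items()}


if __name__ == "__main__":
    # Hypothetical concentrations in µM: (stage, Met-SO, Met).
    cohort = [
        ("stage 3", 0.25, 25.0),
        ("stage 3", 0.30, 24.0),
        ("stage 5", 0.60, 22.0),
        ("stage 5", 0.55, 21.0),
    ]
    print(mean_ratio_by_stage(cohort))
```

Using the ratio rather than Met-SO alone ties the oxidized product to its precursor pool, so differences in overall methionine levels between samples do not masquerade as oxidative stress.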
Decreasing ADMA secretion in progressing CKD
[Figure: scatter plot "Metabolite vs. eGFR, non-diabetic, w/o Stage 5"; series ADMA (U) with linear fit y = 0.4995x - 5.8957, R² = 0.7523; x-axis: eGFR (100 to 0); y-axis: metabolite (0 to 60)]
• Regression analysis to identify correlation of marker candidates with continuous (clinical) variables instead of discrete (= artificial) stages
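A linear fit of the kind reported on the slide (y = 0.4995x - 5.8957) can be obtained by ordinary least squares. The sketch below computes slope, intercept and R² in pure Python. The data points are hypothetical and only stand in for metabolite levels measured against eGFR.

```python
def linear_fit(xs, ys):
    """Ordinary least squares fit y = slope * x + intercept, with R²."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot else 1.0
    return slope, intercept, r_squared


if __name__ == "__main__":
    # Hypothetical values: eGFR and a urinary metabolite level.
    egfr = [18.0, 25.0, 37.0, 44.0, 58.0, 71.0, 86.0]
    metabolite = [4.1, 6.0, 14.5, 13.2, 24.8, 27.9, 39.0]
    slope, intercept, r2 = linear_fit(egfr, metabolite)
    print(f"y = {slope:.4f}x + {intercept:.4f}, R² = {r2:.4f}")
```

Fitting against the continuous eGFR value uses all of the information in the clinical variable, whereas binning patients into stages discards the variation within each stage.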
Orchestration of fatty acid oxidation
[Figure: pathway from membrane phospholipids (GPC, GPE, GPS, ...) via SPL2 to lysophospholipids and free fatty acids. PUFAs: LA 18:2ω6, AA 20:4ω6, DHA 22:6ω3, EPA 20:5ω3. Oxidation via LOX, COX and ROS. Products: 9-HODE, 13-HODE, 12-HETE, 15-HETE, LTB4, TXB2, PGD2, PGE2.]
Pathway visualization in KEGG
(reference pathway)
Pathway visualization in KEGG (human)
Dynamic pathway visualization in MarkerIDQ
Exploring 'metabolic shells' around metabolites
Route finding between metabolites across
pathways
Reactions vs. Reactant pairs!
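Both operations named here, exploring shells around a metabolite and finding routes between metabolites, can be expressed as breadth-first search on a graph whose edges are reactant pairs. The sketch below is an illustration under stated assumptions, not the MarkerIDQ implementation: it uses a simplified, hand-written set of substrate-product pairs from the urea cycle and nitric oxide synthesis, and the function names are made up for this example.

```python
from collections import deque

# Simplified reactant pairs (assumption: main substrate-product pairs only).
PAIRS = [
    ("Cit", "Argsucc"), ("Asp", "Argsucc"),   # ASS
    ("Argsucc", "Arg"), ("Argsucc", "Fum"),   # ASL
    ("Arg", "Orn"), ("Arg", "Urea"),          # ARG
    ("Orn", "Cit"), ("Carb-P", "Cit"),        # OCT
    ("Arg", "Cit"), ("Arg", "NO"),            # NOS
]


def build_graph(pairs):
    """Undirected adjacency map from a list of reactant pairs."""
    graph = {}
    for a, b in pairs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def metabolic_shell(graph, start, radius):
    """Map each metabolite within `radius` steps of `start` to its distance."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == radius:
            continue  # do not expand beyond the requested shell
        for neighbour in sorted(graph.get(node, ())):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    del dist[start]
    return dist


def shortest_route(graph, source, target):
    """Shortest path as a list of metabolites, or None if unreachable."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            route = []
            while node is not None:
                route.append(node)
                node = previous[node]
            return route[::-1]
        for neighbour in sorted(graph.get(node, ())):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


if __name__ == "__main__":
    g = build_graph(PAIRS)
    print(metabolic_shell(g, "Arg", 1))
    print(shortest_route(g, "Asp", "Urea"))
```

The choice of edges matters: if every compound in a reaction were linked to every other, ubiquitous cofactors such as ATP or water would connect almost everything in two steps. Restricting edges to reactant pairs keeps routes biochemically meaningful.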
Issues I: Databases
• Parallel / competing initiatives with incompatible / proprietary data formats
- KEGG
- MetaCyc, HumanCyc, etc.
- Reactome
- HMDB
- OMIM
- Lipidomics consortia
- ...
• Compartmentalization not well depicted
• Incompleteness / generic entries (phospholipids, acylcarnitines, etc.)
• Lack of curation
• Lack of publication
Issues II: Standardization and normalization
• Standardization
- Instrument vendors oppose common data formats
- What meta-data to record?
- No valid guidelines for quantitation of endogenous metabolites (FDA guidance was developed for xenobiotics)
- Nomenclature vs. analytical reality (sum signals, isomers, etc.)
• Normalization
- Absolute quantitation overcomes the need for analytical normalization
- Role of sample types (plasma, CSF, urine, tissue homogenates, cell extracts, ...)
- How can biological normalization work? Are there 'housekeeping metabolites'?
Issues III: Biostatistics
• Overfitting & correction
• Suitable clustering algorithms for multivariate data sets?
• Metabolites are not equivalent, independent variables
- Analytical validity/variability are usually not considered
- Often, groups of metabolites are synthesized or degraded by the same enzyme(s)
- Consecutive reactions within a pathway/network depend on each other (flux analysis!)
• How to incorporate this in biostatistics? Weighting? Derived parameters, ratios, etc.?
• How to exploit this in (automated) plausibility checks?
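One simple way to make the dependence between metabolites visible is to compute pairwise Pearson correlations and flag strongly correlated pairs, for instance metabolites handled by the same enzyme. The sketch below is only one possible starting point for the open questions raised here, not a recommended procedure. The metabolite names, concentrations and the 0.9 threshold are hypothetical.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def correlated_pairs(concentrations, threshold=0.9):
    """Return (name_a, name_b, r) for pairs with |r| >= threshold."""
    flagged = []
    for a, b in combinations(sorted(concentrations), 2):
        r = pearson(concentrations[a], concentrations[b])
        if abs(r) >= threshold:
            flagged.append((a, b, r))
    return flagged


if __name__ == "__main__":
    # Hypothetical concentrations of three metabolites in four samples.
    data = {
        "A": [1.0, 2.0, 3.0, 4.0],
        "B": [2.0, 4.0, 6.0, 8.1],
        "C": [5.0, 1.0, 4.0, 2.0],
    }
    for a, b, r in correlated_pairs(data):
        print(a, b, round(r, 3))
```

Pairs flagged this way could be collapsed into a ratio or another derived parameter before modeling, or checked against pathway knowledge as a plausibility test; which of these is appropriate remains the open question the slide poses.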
Summary I
• Metabolomics depicts the functional end-point of genetics and environment
• Targeted metabolomics data are analytically reproducible and allow immediate biochemical interpretation
• Proof-of-concept has been achieved in routine diagnostics of inborn errors of metabolism
• Many metabolic biomarkers are valid across species and enable translational research
• Comprehensive targeted metabolomics bridges the gap to open profiling approaches
Summary II: Success factors for biomarker development
[Figure: success factors leading from biomarker candidates to validated biomarkers]
• Diligent study design
• Well-documented biobanking
• Clinical & scientific experts
• Validated quantitative assays
• Solid multivariate biostatistics
• Biochemical plausibility & understanding
• Patent strategy and experience
• Selected partners
Acknowledgements
Analytics
Stefanie Gstrein
Sascha Dammeier
Hai Pham Tuan
Cornelia Röhring
Therese Koal
Ali Alchalabi
Verena Forcher
Ines Unterwurzacher
Stefan Urban
Doreen Kirchberg
Ralf Bogumil
Patrizia Hofer
Lisa Körner
Peter Enoh
Brad Morie
Doris Gigele
Elgar Schnegg
Admin, IT & BizDev
Anton Grones
Ingrid Sandner
Georg Debus
Wolfgang Samsinger
Patricia Aschacher
Bioinformatics
Daniel Andres
Olivier Lefèvre
Paolo Zaccaria
Florian Bichteler
Marc Breit
Manuel Gogl
Bernd Haas
Mattias Bair
Robert Eller
Hamza Ovacin
Gerd Lorünser
Yi Zao
Statistics & Biochemistry
Ingrid Osprian
Marion Beier
Vera Neubauer
Oliver Lutz
Matthias Keller
Denise Sonntag
Hans-Peter Deigner
Ulrika Lundin