Transcript Slide 1

Glycomics project overview
Knowledge Enabled Information and Services Science
Life Science Ontologies
• GlycO
  o An ontology for structure and function of glycopeptides
  o 573 classes, 113 relationships
  o Published through the National Center for Biomedical Ontology (NCBO)
• ProPreO
  o An ontology for capturing process and lifecycle information related to proteomic experiments
  o 398 classes, 32 relationships
  o 3.1 million instances
  o Published through the National Center for Biomedical Ontology (NCBO) and Open Biomedical Ontologies (OBO)
ProPreO ontology
Two aspects of glycoproteomics:
o What is it? → identification
o How much of it is there? → quantification
Heterogeneity in data generation process, instrumental parameters, formats
Need data and process provenance → ontology-mediated provenance
Hence, ProPreO models both the glycoproteomics experimental process and attendant data
ProPreO population: transformation to RDF
Scientific Data
Computational Methods
Ontology instances
ProPreO population: transformation to RDF
Scientific Data
Computational Methods
[Flowchart; only its labels survive in this transcript]
Key: Protein Path, Peptide Path
Input: Protein Data (amino-acid sequence)
Methods:
o Extract Peptide Amino-acid Sequence from Protein Amino-acid Sequence
o Determine N-glycosylation Consensus
o Calculate Chemical Mass
o Calculate Monoisotopic Mass
Intermediate outputs: Amino-acid Sequence RDF, Chemical Mass RDF, Monoisotopic Mass RDF
Final outputs: "Protein RDF", "Peptide RDF"
Properties shown: amino-acid sequence, parent protein, N-glycosylation consensus, chemical mass, monoisotopic mass
Semantic annotation of scientific/experimental data
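The methods named on this slide can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the namespace `http://example.org/propreo#` and the property names are hypothetical placeholders (the slides do not give the real ProPreO IRIs), the N-glycosylation consensus is taken to be the usual N-X-S/T sequon with X not proline, and the residue masses are the standard monoisotopic values. Chemical (average) mass would follow the same pattern with average residue masses and is left out here.

```python
import re

# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one H2O accounts for the free N- and C-termini

# Hypothetical namespace and property names, for illustration only.
NS = "http://example.org/propreo#"


def monoisotopic_mass(seq):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in seq) + WATER


def sequon_positions(seq):
    """1-based positions of Asn in N-X-S/T sequons (X is not Pro).

    A lookahead is used so that overlapping sequons are all reported.
    """
    return [m.start() + 1 for m in re.finditer(r"(?=N[^P][ST])", seq)]


def peptide_ntriples(pep_id, seq, parent_id):
    """Serialize the computed properties as N-Triples lines."""
    s = f"<{NS}{pep_id}>"
    lines = [
        f'{s} <{NS}has_amino_acid_sequence> "{seq}" .',
        f"{s} <{NS}has_parent_protein> <{NS}{parent_id}> .",
        f'{s} <{NS}has_monoisotopic_mass> "{monoisotopic_mass(seq):.5f}" .',
    ]
    for pos in sequon_positions(seq):
        lines.append(f'{s} <{NS}has_n_glycosylation_consensus_at> "{pos}" .')
    return "\n".join(lines)


print(peptide_ntriples("peptide_1", "GNGS", "protein_1"))
```

The peptide GNGS is used only because it appears later in the deck as an example moiety; any sequence over the 20 standard residues works the same way.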
ProPreO: Ontology-mediated provenance
ms/ms peaklist data
First line: parent ion m/z, parent ion abundance, parent ion charge
Remaining lines: fragment ion m/z, fragment ion abundance
830.9570 194.9604 2
580.2985 0.3592
688.3214 0.2526
779.4759 38.4939
784.3607 21.7736
1543.7476 1.3822
1544.7595 2.9977
1562.8113 37.4790
1660.7776 476.5043
Mass Spectrometry (MS) Data
ProPreO: Ontology-mediated provenance
<ms-ms_peak_list>
  <parameter instrument="micromass_QTOF_2_quadropole_time_of_flight_mass_spectrometer" mode="ms-ms"/>
  <parent_ion m-z="830.9570" abundance="194.9604" z="2"/>
  <fragment_ion m-z="580.2985" abundance="0.3592"/>
  <fragment_ion m-z="688.3214" abundance="0.2526"/>
  <fragment_ion m-z="779.4759" abundance="38.4939"/>
  <fragment_ion m-z="784.3607" abundance="21.7736"/>
  <fragment_ion m-z="1543.7476" abundance="1.3822"/>
  <fragment_ion m-z="1544.7595" abundance="2.9977"/>
  <fragment_ion m-z="1562.8113" abundance="37.4790"/>
  <fragment_ion m-z="1660.7776" abundance="476.5043"/>
</ms-ms_peak_list>
Ontological Concepts
Semantically Annotated MS Data
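One way to get from the plain peaklist to the annotated form is sketched below. It assumes the plain-text layout seen in the deck (first line: parent ion m/z, abundance, charge; every following line: fragment ion m/z and abundance) and reuses the element and attribute names from the slide. The function name `annotate_peaklist` is hypothetical, and only three fragment ions are included to keep the example short.

```python
import xml.etree.ElementTree as ET

# Assumed plain-text layout: parent ion line first, then fragment ions.
RAW_PEAKLIST = """\
830.9570 194.9604 2
580.2985 0.3592
779.4759 38.4939
1660.7776 476.5043
"""


def annotate_peaklist(raw, instrument, mode="ms-ms"):
    """Wrap a plain ms/ms peaklist in the annotated XML structure."""
    rows = [line.split() for line in raw.splitlines() if line.strip()]
    parent, fragments = rows[0], rows[1:]

    root = ET.Element("ms-ms_peak_list")
    ET.SubElement(root, "parameter", {"instrument": instrument, "mode": mode})
    ET.SubElement(
        root,
        "parent_ion",
        {"m-z": parent[0], "abundance": parent[1], "z": parent[2]},
    )
    for mz, abundance in fragments:
        ET.SubElement(root, "fragment_ion", {"m-z": mz, "abundance": abundance})
    return root


root = annotate_peaklist(
    RAW_PEAKLIST,
    "micromass_QTOF_2_quadropole_time_of_flight_mass_spectrometer",
)
print(ET.tostring(root, encoding="unicode"))
```

The values are kept as strings so that the annotated file reproduces the instrument output digit for digit.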
Semantic annotation of Scientific Data
<ms-ms_peak_list>
  <parameter instrument="micromass_QTOF_2_quadropole_time_of_flight_mass_spectrometer" mode="ms/ms"/>
  <parent_ion_mass>830.9570</parent_ion_mass>
  <total_abundance>194.9604</total_abundance>
  <z>2</z>
  <mass_spec_peak m-z="580.2985" abundance="0.3592"/>
  <mass_spec_peak m-z="688.3214" abundance="0.2526"/>
  <mass_spec_peak m-z="779.4759" abundance="38.4939"/>
  <mass_spec_peak m-z="784.3607" abundance="21.7736"/>
  <mass_spec_peak m-z="1543.7476" abundance="1.3822"/>
  <mass_spec_peak m-z="1544.7595" abundance="2.9977"/>
  <mass_spec_peak m-z="1562.8113" abundance="37.4790"/>
  <mass_spec_peak m-z="1660.7776" abundance="476.5043"/>
</ms-ms_peak_list>
Annotated ms/ms peaklist data
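Once the peaklist is annotated, it can be read back with any XML parser. The sketch below is only an illustration of that idea: it uses well-formed names (`ms-ms_peak_list`, `m-z`) because an XML name cannot contain a slash, and the helper names `peaks`, `base_peak` and `relative_abundances` are hypothetical.

```python
import xml.etree.ElementTree as ET

ANNOTATED = """\
<ms-ms_peak_list>
  <parent_ion_mass>830.9570</parent_ion_mass>
  <total_abundance>194.9604</total_abundance>
  <z>2</z>
  <mass_spec_peak m-z="580.2985" abundance="0.3592"/>
  <mass_spec_peak m-z="688.3214" abundance="0.2526"/>
  <mass_spec_peak m-z="779.4759" abundance="38.4939"/>
  <mass_spec_peak m-z="784.3607" abundance="21.7736"/>
  <mass_spec_peak m-z="1543.7476" abundance="1.3822"/>
  <mass_spec_peak m-z="1544.7595" abundance="2.9977"/>
  <mass_spec_peak m-z="1562.8113" abundance="37.4790"/>
  <mass_spec_peak m-z="1660.7776" abundance="476.5043"/>
</ms-ms_peak_list>
"""


def peaks(root):
    """Return (m/z, abundance) pairs for every annotated peak."""
    return [
        (float(p.get("m-z")), float(p.get("abundance")))
        for p in root.iter("mass_spec_peak")
    ]


def base_peak(root):
    """The most abundant peak in the list."""
    return max(peaks(root), key=lambda peak: peak[1])


def relative_abundances(root):
    """Abundances rescaled so that the base peak is 100."""
    top = base_peak(root)[1]
    return [(mz, 100.0 * abundance / top) for mz, abundance in peaks(root)]


root = ET.fromstring(ANNOTATED)
print("charge:", int(root.findtext("z")))
print("base peak:", base_peak(root))
```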
N-Glycosylation Process (NGP)
[Workflow diagram; labels in reading order, counts in brackets]
Cell Culture → (extract) → Glycoprotein Fraction
→ (proteolysis) → Glycopeptides Fraction [1]
→ (Separation technique I) → Glycopeptides Fraction [n]
→ (PNGase) → Peptide Fraction [n]
→ (Separation technique II) → Peptide Fraction [n*m]
→ (Mass spectrometry) → ms data, ms/ms data
ms data → (Data reduction) → ms peaklist
ms/ms data → (Data reduction) → ms/ms peaklist
Analysis labels: binning, N-dimensional array, Peptide identification, Peptide list, Data correlation, Signal integration, Glycopeptide identification and quantification
Semantic Web Process to incorporate provenance
[Process diagram; stages are driven by Agents, I = input, O = output]
o Biological Sample Analysis by MS/MS. O: Raw Data
o Raw Data to Standard Format. I: Raw Data. O: Standard Format Data
o Data Preprocess. I: Standard Format Data. O: Filtered Data
o DB Search (Mascot/Sequest). I: Filtered Data. O: Search Results
o Results Postprocess (ProValt). I: Search Results. O: Final Output
Storage
Semantic Annotation Applications
Biological Information
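The idea of carrying provenance through such a process can be sketched as below: every stage appends a record of what it consumed and produced, so the final output can be traced back to the raw data. The stage names and inputs/outputs are taken from the diagram labels; the `Artifact` class, the `run_step` helper and the agent names are hypothetical, and no real conversion or search is performed.

```python
from dataclasses import dataclass, field


@dataclass
class Artifact:
    """A data product plus the ordered list of steps that led to it."""
    name: str
    provenance: list = field(default_factory=list)


def run_step(step, agent, source, output_name):
    """Pretend to run one stage and record its provenance."""
    record = {
        "step": step,
        "agent": agent,          # placeholder agent name
        "input": source.name,
        "output": output_name,
    }
    return Artifact(output_name, source.provenance + [record])


# (stage, hypothetical agent, output) in the order shown on the slide.
STAGES = [
    ("Raw Data to Standard Format", "converter", "Standard Format Data"),
    ("Data Preprocess", "preprocessor", "Filtered Data"),
    ("DB Search (Mascot/Sequest)", "search engine", "Search Results"),
    ("Results Postprocess (ProValt)", "postprocessor", "Final Output"),
]


def run_pipeline(raw):
    artifact = raw
    for step, agent, output in STAGES:
        artifact = run_step(step, agent, artifact, output)
    return artifact


final = run_pipeline(Artifact("Raw Data"))
for rec in final.provenance:
    print(f'{rec["input"]} -> [{rec["step"]}] -> {rec["output"]}')
```

Each record could equally be written out as ontology instances; a plain dictionary is used here only to keep the sketch self-contained.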
Integrated Semantic Information and knowledge System (Isis)
User questions:
o Have I made an error? Is the result erroneous?
o Give me all result files from a similar organism, cell, preparation, and mass spectrometric conditions, and compare results.
SPARQL query-based User Interface
ProPreO ontology
Components: Experimental Data, Semantic Metadata Registry, Semantic Annotation Metadata File
PROTEOMECOMMONS
EXPERIMENTAL DATA: Raw, mzXML, Pkl, pSplit, MASCOT result, ProValt result
PROTEOMICS WORKFLOW: Raw2mzXML → mzXML2Pkl → Pkl2pSplit → MASCOT Search → ProValt
Semantic Biological Web Service Registry
Semantic Web Service
GLYDE-CT: GLYcan Data Exchange Based on a Connection Table Format
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE GlydeCT SYSTEM "http://glycomics.ccrc.uga.edu/GLYDE-CT/GLYDE-CT_v2.11.DTD">
<GlydeCT xmlns:GlydeCT="http://glycomics.ccrc.uga.edu/GLYDE-CT/GLYDE-CT_v2.11">
  <structure type="molecule" id="molecule_1" name="GP1">
    <part type="moiety" id="moiety_1" ref="some_file#GNGS" name="GNGS"/>
    <part type="moiety" id="moiety_2" ref="some_file#Man3" name="Man3GlcNAc2"/>
    <link from="moiety_2" to="moiety_1">
      <link from="residue_1" to="residue_2">
        <link from="C1" to="N4"/>
      </link>
    </link>
  </structure>
</GlydeCT>
[Figure: moiety_2, with residues labeled 1 to 5, attached to moiety_1]
moiety_1: Gly 1 | Asn 2 | Gly 3 | Ser 4
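The nested link elements describe the same connection at three levels (moiety, residue, atom). A minimal sketch of reading that structure with a standard XML parser follows; the XML declaration, DOCTYPE and namespace attribute are dropped so the snippet is self-contained, and the helper name `link_chain` is hypothetical.

```python
import xml.etree.ElementTree as ET

GLYDE = """\
<GlydeCT>
  <structure type="molecule" id="molecule_1" name="GP1">
    <part type="moiety" id="moiety_1" ref="some_file#GNGS" name="GNGS"/>
    <part type="moiety" id="moiety_2" ref="some_file#Man3" name="Man3GlcNAc2"/>
    <link from="moiety_2" to="moiety_1">
      <link from="residue_1" to="residue_2">
        <link from="C1" to="N4"/>
      </link>
    </link>
  </structure>
</GlydeCT>
"""


def link_chain(link):
    """Flatten nested link elements into a list of (from, to) pairs,
    outermost (moiety level) first, innermost (atom level) last."""
    chain = [(link.get("from"), link.get("to"))]
    child = link.find("link")
    if child is not None:
        chain.extend(link_chain(child))
    return chain


structure = ET.fromstring(GLYDE).find("structure")

# Map each part id to its name, e.g. moiety_1 -> GNGS.
parts = {p.get("id"): p.get("name") for p in structure.findall("part")}
print(parts)

for top in structure.findall("link"):
    print(link_chain(top))
```

Read this way, the example says that atom C1 of residue_1 in moiety_2 (Man3GlcNAc2) is connected to atom N4 of residue_2 in moiety_1 (GNGS).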
Data, ontologies, more publications at Biomedical Glycomics project web site:
http://knoesis.wright.edu/research/bioinformatics/index.html
Thank You