Transcript Slide 1
Glycomics project overview
Knowledge Enabled Information and Services Science

Life Science Ontologies
• Glyco
  o An ontology for the structure and function of glycopeptides
  o 573 classes, 113 relationships
  o Published through the National Center for Biomedical Ontology (NCBO)
• ProPreO
  o An ontology for capturing process and lifecycle information related to proteomic experiments
  o 398 classes, 32 relationships
  o 3.1 million instances
  o Published through the National Center for Biomedical Ontology (NCBO) and Open Biomedical Ontologies (OBO)

ProPreO ontology
Two aspects of glycoproteomics:
  o What is it? → identification
  o How much of it is there? → quantification
Heterogeneity in data generation process, instrumental parameters, formats
Need data and process provenance → ontology-mediated provenance
Hence, ProPreO models both the glycoproteomics experimental process and attendant data

ProPreO population: transformation to RDF
[Figure: Scientific Data, Computational Methods, Ontology instances]

ProPreO population: transformation to RDF
[Figure: Scientific Data and Computational Methods. Key: Protein Path, Peptide Path. Input: Protein Data (amino-acid sequence). Methods: Extract Peptide Amino-acid Sequence from Protein Amino-acid Sequence; Determine N-glycosylation Consensus; Calculate Chemical Mass; Calculate Monoisotopic Mass. Intermediate outputs: Amino-acid Sequence RDF, Chemical Mass RDF, Monoisotopic Mass RDF. Final outputs: "Protein RDF" and "Peptide RDF". Properties shown: amino-acid sequence, chemical mass, monoisotopic mass, n-glycosylation consensus, parent protein.]

Semantic annotation of scientific/experimental data

ProPreO: Ontology-mediated provenance
Mass Spectrometry (MS) Data: ms/ms peaklist data
  830.9570   194.9604   2    (parent ion m/z, parent ion abundance, parent ion charge)
  580.2985   0.3592          (fragment ion m/z, fragment ion abundance)
  688.3214   0.2526
  779.4759   38.4939
  784.3607   21.7736
  1543.7476  1.3822
  1544.7595  2.9977
  1562.8113  37.4790
  1660.7776  476.5043

ProPreO: Ontology-mediated provenance
Semantically Annotated MS Data (element and attribute names are Ontological Concepts)
  <ms-ms_peak_list>
    <parameter instrument="micromass_QTOF_2_quadropole_time_of_flight_mass_spectrometer" mode="ms-ms"/>
    <parent_ion m-z="830.9570" abundance="194.9604" z="2"/>
    <fragment_ion m-z="580.2985" abundance="0.3592"/>
    <fragment_ion m-z="688.3214" abundance="0.2526"/>
    <fragment_ion m-z="779.4759" abundance="38.4939"/>
    <fragment_ion m-z="784.3607" abundance="21.7736"/>
    <fragment_ion m-z="1543.7476" abundance="1.3822"/>
    <fragment_ion m-z="1544.7595" abundance="2.9977"/>
    <fragment_ion m-z="1562.8113" abundance="37.4790"/>
    <fragment_ion m-z="1660.7776" abundance="476.5043"/>
  </ms-ms_peak_list>

Semantic annotation of Scientific Data
Annotated ms/ms peaklist data
  <ms-ms_peak_list>
    <parameter instrument="micromass_QTOF_2_quadropole_time_of_flight_mass_spectrometer" mode="ms/ms"/>
    <parent_ion_mass>830.9570</parent_ion_mass>
    <total_abundance>194.9604</total_abundance>
    <z>2</z>
    <mass_spec_peak m-z="580.2985" abundance="0.3592"/>
    <mass_spec_peak m-z="688.3214" abundance="0.2526"/>
    <mass_spec_peak m-z="779.4759" abundance="38.4939"/>
    <mass_spec_peak m-z="784.3607" abundance="21.7736"/>
    <mass_spec_peak m-z="1543.7476" abundance="1.3822"/>
    <mass_spec_peak m-z="1544.7595" abundance="2.9977"/>
    <mass_spec_peak m-z="1562.8113" abundance="37.4790"/>
    <mass_spec_peak m-z="1660.7776" abundance="476.5043"/>
  </ms-ms_peak_list>

N-Glycosylation Process (NGP)
[Figure: Cell Culture → (extract) → Glycoprotein Fraction → (proteolysis) → Glycopeptides Fraction → (Separation technique I, 1 to n) → n Glycopeptides Fractions → (PNGase) → n Peptide Fractions → (Separation technique II) → n*m Peptide Fractions → (Mass spectrometry) → ms data and ms/ms data → (Data reduction) → ms peaklist and ms/ms peaklist. Downstream steps and products: binning, N-dimensional array, Peptide identification, Peptide list, Data correlation, Signal integration, Glycopeptide identification and quantification.]

Semantic Web Process to incorporate provenance
[Figure: a chain of steps, each run by an Agent with an input (I) and an output (O): Biological Sample Analysis by MS/MS (O: Raw Data) → Raw Data to Standard Format (I: Raw Data, O: Standard Format Data) → Data Preprocess (O: Filtered Data) → DB Search (Mascot/Sequest) (O: Search Results) → Results Postprocess (ProValt) (O: Final Output) → Storage. Other labels: Semantic Annotation Applications, Biological Information.]

Have I made an error? Give me all result files from a similar organism, cell, preparation, and mass spectrometric conditions, and compare results.

Integrated Semantic Information and knowledge System (Isis)
• SPARQL query-based User Interface
• ProPreO ontology
• Example query: Is the result erroneous? Give me all result files from a similar organism, cell, preparation, and mass spectrometric conditions, and compare results.
[Figure labels: Experimental Data, Semantic Annotation, Semantic Metadata Registry, Metadata File]
[Figure: PROTEOMECOMMONS. EXPERIMENTAL DATA: Raw, mzXML, Pkl, pSplit, MASCOT result, ProValt result. PROTEOMICS WORKFLOW: Raw2mzXML, mzXML2Pkl, Pkl2pSplit, MASCOT Search, ProValt.]

Semantic Biological Web Service Registry
Semantic Web Service

GLYDE-CT: GLYcan Data Exchange Based on a Connection Table Format
  <?xml version="1.0" encoding="ISO-8859-1"?>
  <!DOCTYPE GlydeCT SYSTEM "http://glycomics.ccrc.uga.edu/GLYDE-CT/GLYDE-CT_v2.11.DTD">
  <GlydeCT xmlns:GlydeCT="http://glycomics.ccrc.uga.edu/GLYDE-CT/GLYDE-CT_v2.11">
    <structure type="molecule" id="molecule_1" name="GP1">
      <part type="moiety" id="moiety_1" ref="some_file#GNGS" name="GNGS"/>
      <part type="moiety" id="moiety_2" ref="some_file#Man3" name="Man3GlcNAc2"/>
      <link from="moiety_2" to="moiety_1">
        <link from="residue_1" to="residue_2">
          <link from="C1" to="N4"/>
        </link>
      </link>
    </structure>
  </GlydeCT>
[Figure: moiety_1 is the peptide Gly 1 | Asn 2 | Gly 3 | Ser 4; moiety_2 is drawn with residues numbered 1 to 5.]

Data, ontologies, and more publications are available at the Biomedical Glycomics project web site: http://knoesis.wright.edu/research/bioinformatics/index.html

Thank You
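As an illustration of what the semantically annotated ms/ms peak list on the "ProPreO: Ontology-mediated provenance" slide makes possible, the sketch below reads that XML with Python's standard library and picks out the most abundant fragment ion. It is a minimal sketch, not the project's tooling: the function name `parse_peak_list` and the returned dictionary keys are hypothetical, the slide's curly quotes are assumed to be plain double quotes, and the element and attribute names are copied from the slide.

```python
import xml.etree.ElementTree as ET

# Peak list copied from the slide, with typographic quotes normalised
# to plain double quotes (an assumption about the original file).
PEAK_LIST_XML = """
<ms-ms_peak_list>
  <parameter instrument="micromass_QTOF_2_quadropole_time_of_flight_mass_spectrometer" mode="ms-ms"/>
  <parent_ion m-z="830.9570" abundance="194.9604" z="2"/>
  <fragment_ion m-z="580.2985" abundance="0.3592"/>
  <fragment_ion m-z="688.3214" abundance="0.2526"/>
  <fragment_ion m-z="779.4759" abundance="38.4939"/>
  <fragment_ion m-z="784.3607" abundance="21.7736"/>
  <fragment_ion m-z="1543.7476" abundance="1.3822"/>
  <fragment_ion m-z="1544.7595" abundance="2.9977"/>
  <fragment_ion m-z="1562.8113" abundance="37.4790"/>
  <fragment_ion m-z="1660.7776" abundance="476.5043"/>
</ms-ms_peak_list>
"""


def parse_peak_list(xml_text):
    """Hypothetical helper: return (instrument, parent_ion, fragments)."""
    root = ET.fromstring(xml_text.strip())

    # Provenance: which instrument produced the data.
    instrument = root.find("parameter").get("instrument")

    # The parent ion carries m/z, abundance and charge state (z).
    parent = root.find("parent_ion")
    parent_ion = {
        "mz": float(parent.get("m-z")),
        "abundance": float(parent.get("abundance")),
        "charge": int(parent.get("z")),
    }

    # Each fragment ion becomes an (m/z, abundance) pair.
    fragments = [
        (float(ion.get("m-z")), float(ion.get("abundance")))
        for ion in root.findall("fragment_ion")
    ]
    return instrument, parent_ion, fragments


instrument, parent_ion, fragments = parse_peak_list(PEAK_LIST_XML)

# The most abundant fragment ion in the list.
base_peak = max(fragments, key=lambda peak: peak[1])

print("instrument:", instrument)
print("parent ion:", parent_ion)
print("fragment ions:", len(fragments))
print("most abundant fragment (m/z, abundance):", base_peak)
```

Because the instrument and ion roles are named in the markup rather than implied by column position, a consumer can recover them without knowing how the original peak list file was laid out.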