Overview of the Pathway Tools Software and Pathway/Genome Databases
SRI International Bioinformatics

Introductions
  BRG Staff
    Peter Karp
    Pallavi Kaipa
    Mario Latendresse
    Suzanne Paley
    Markus Krummenacker
    Ingrid Keseler
    Ron Caspi
    Alex Shearer
    Carol Fulcher
  Attendees
    Where from, what genome?
    What do you hope to get out of the tutorial?

SRI International
  Private nonprofit research institute
  No permanent funding sources
  1300 staff in Menlo Park
    – Founded in 1946 as Stanford Research Institute
    – Separated from Stanford University in 1970
    – Name changed to SRI International in 1977
    – David Sarnoff Research Center acquired in 1987

SRI Organization
  [Organization chart; labels: Bioinformatics Research Group; Information and Computing Sciences; Biopharmaceuticals and Pharmaceutical Discovery; Education and Policy; Engineering Systems and Sciences; Physical Sciences]

Research in the SRI Bioinformatics Research Group
  EcoCyc
  MetaCyc
  Pathway Tools
  Pathway Holes
  BioWarehouse
  Enzyme Genomics

Outline for Tutorial
  Monday
    Introduction
    Pathway/Genome Navigator
    PathoLogic tutorial and demo
    PathoLogic lab session – Make genome input files parsable
  Tuesday
    PathoLogic tutorial
    PathoLogic lab session – Build initial version of PGDB
  Wednesday
    Pathway hole filler, operon predictor, transport inference parser
  Thursday
    Editors
    Feedback session

Tutorial Goals
  General familiarity with Pathway Tools goals and functionality
  Ability to create, edit, and navigate a new PGDB
  Create new PGDB for genome(s) you brought with you
  Familiarity with information resources available about Pathway Tools to continue your work

SRI’s Support for Pathway Tools
  NIH grant finances software development and user support
  Additional grants finance other software development
  Email us bug reports, suggestions, questions
  Comprehensive bug reports are required for us
    to fix the problem you reported
  Keep us posted regarding your progress

Administrative Details
  Please wear badges at all times
  Escort required outside this room/hallway
  Let us know when you are leaving
  Use E-Bldg Entrance
  Phone numbers to call from entrance
  Meals
  Wednesday outing possible
  Restrooms

Tutorial Format
  Questions welcome during presentations
  Lab sessions will take different amounts of time for different people
    Refine your PGDB
    Read Pathway Tools manuals
  Buddy system for some computers
  Computer logins
  Internet connectivity

Pathway/Genome Database: Integrating Genomic and Biochemical Data
  [Diagram; labels: Pathways; Reactions; Compounds; Proteins; Genes; Operons, Promoters, DNA Binding Sites; Chromosomes, Plasmids; CELL]

Terminology
  Model Organism Database (MOD) – DB describing genome and other information about an organism
  Pathway/Genome Database (PGDB) – MOD that combines information about
    Pathways, reactions, substrates
    Enzymes, transporters
    Genes, replicons
    Transcription factors, promoters, operons, DNA binding sites
  BioCyc – Collection of 205 PGDBs at BioCyc.org
    EcoCyc, AgroCyc, HumanCyc

BioCyc Collection of Pathway/Genome Databases
  Pathway/Genome Database (PGDB) – combines information about
    Pathways, reactions, substrates
    Enzymes, transporters
    Genes, replicons
    Transcription factors/sites, promoters, operons
  Tier 1: Literature-Derived PGDBs
    MetaCyc
    EcoCyc -- Escherichia coli K-12
    BioCyc Open Chemical Database
  Tier 2: Computationally-derived DBs, Some Curation -- 12 PGDBs
    HumanCyc
    Mycobacterium tuberculosis
  Tier 3: Computationally-derived DBs, No Curation -- 191 DBs

Terminology – Pathway Tools Software
  PathoLogic
    Predicts operons, metabolic network, pathway hole fillers, from genome
    Computational creation of new Pathway/Genome Databases
  Pathway/Genome Editors
    Distributed curation of
      PGDBs
    Distributed object database system, interactive editing tools
  Pathway/Genome Navigator
    WWW publishing of PGDBs
    Querying, visualization of pathways, chromosomes, operons
    Analysis operations
      Pathway visualization of gene-expression data
      Global comparisons of metabolic networks
  Bioinformatics 18:S225 2002

Pathway/Genome DBs Created by External Users
  600+ licensees -- 50 groups applying software to 100+ organisms
  Software freely available to academics; each PGDB owned by its creator
  Saccharomyces cerevisiae, SGD project, Stanford University
    pathway.yeastgenome.org/biocyc/
  TAIR, Carnegie Institution of Washington
    Arabidopsis.org:1555
  dictyBase, Northwestern University
  GrameneDB, Cold Spring Harbor Laboratory
  Planned:
    CGD (Candida albicans), Stanford University
    MGD (Mouse), Jackson Laboratory
    RGD (Rat), Medical College of Wisconsin
    WormBase (C. elegans), Caltech
  Large scale users:
    C. Medigue, Genoscope, 67 PGDBs
    G. Burger, U Montreal, 20 PGDBs
  DOE GTL contractors:
    G. Church, Harvard, Prochlorococcus marinus MED4
    Larimer/Uberbacher, ORNL, Shewanella oneidensis
    J. Keasling, UC Berkeley, Desulfovibrio vulgaris
  Fiona Brinkman, Simon Fraser Univ, Pseudomonas aeruginosa

Terminology
  “Database” = “DB” = “Knowledge Base” = “KB” = “Pathway/Genome Database” = “PGDB”

Why Create PGDBs?
  Extract more information from your genome
  Create an up-to-date computable information repository about an organism
  Perform analyses on the genome and pathway complement of the organism, e.g., analyses of omics data
  Perform comparative analyses with other organisms
  Generate a genome poster and metabolic wall chart

Sequence Project Workflow
  [Workflow diagram; labels: Raw Sequence; Phred; Phrap; GeneMark/Glimmer; BLAST, BLOCKS; Pathway Tools (PathoLogic, P/G Editors, P/G Navigator); WWW Publishing; Analyses]

MetaCyc: Metabolic Encyclopedia
  Nonredundant metabolic pathway database
  Describe a representative sample of every experimentally determined metabolic pathway
  Literature-based DB with extensive references and commentary
  Pathways, reactions, enzymes, substrates
  Jointly developed by SRI and Carnegie Institution
  Nucleic Acids Research 34:D511-D516 2006

MetaCyc Data

Family of Pathway/Genome Databases
  [Diagram; labels: EcoCyc; MetaCyc; CauloCyc; AraCyc; MtbRvCyc; HumanCyc]

Omics Viewer
  Import gene expression, proteomics, metabolomics data
  Obtain pathway-based visualizations of omics data
  Numerical spectrum of expression values mapped to a color spectrum
  Steps of overview painted with color corresponding to expression level(s) of genes that encode enzyme(s) for that step

Environment for Computational Exploration of Genomes
  Powerful ontology opens many facets of the biology to computational exploration
  Global characterization of metabolic network
  Analysis of interface between transport and metabolism
  Nutrient analysis of metabolic network

Pathway Tools Implementation Details
  Allegro Common Lisp
  Sun, Linux, Windows platforms
  Ocelot object database
  300,000+ lines of code
  Lisp-based WWW server at BioCyc.org
    Manages 205 PGDBs

The Common Lisp
Programming Environment
  Gat studied Lisp and Java implementations of 16 programs by 14 programmers (Intelligence 11:21 2000)

Survey
  Please complete survey at end of each day

PGDB(s) That You Build
  Before you leave:
    Tar up your PGDB directory and FTP it home, email it home, or copy it to flash disk
    We will create a backup copy of your PGDB directory if the directory is still there at the end of the tutorial
    Delete the PGDB directory if you don’t want us to back it up
    We will not give the backed up data to anyone else

Summary
  Pathway Tools and Pathway/Genome Databases
  Not just for pathways!
  Computational inferences
    Operons, metabolic pathways, pathway hole fillers
  Editing tools
  Analysis tools: Omics data on pathways
  Web publishing of PGDBs
  Main classes of users:
    Develop PGDB to extract more information from genome for genome paper
    Develop a model-organism DB for the organism that is updated regularly and published on the web

Information Sources
  Pathway Tools User’s Guide
    /root/aic-export/ecocyc/genopath/released/doc/manuals/userguide1.pdf
    /root/aic-export/ecocyc/genopath/released/doc/manuals/userguide2.pdf
    Pathway/Genome Navigator
    Appendix A: Guide to the Pathway Tools Schema
    PathoLogic, Editing Tools
    NOTE: Location of the aic-export directory can vary across different computers
  Pathway Tools Web Site
    http://bioinformatics.ai.sri.com/ptools/
    Publications, programming examples, etc.
  Slides from this tutorial
    http://bioinformatics.ai.sri.com/ptools/tutorial/
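
The "Before you leave" step of tarring up a PGDB directory can also be scripted. The following is only a minimal sketch in Python using the standard tarfile module; it is not part of Pathway Tools. The function name archive_pgdb and both path arguments are hypothetical, and nothing is assumed about the internal layout of a PGDB directory.

```python
import tarfile
from pathlib import Path


def archive_pgdb(pgdb_dir, out_path):
    """Pack a directory into a gzip-compressed tar archive.

    Returns the member names stored in the archive so the contents
    can be checked before the file is copied elsewhere.
    (Hypothetical helper; not a Pathway Tools function.)
    """
    src = Path(pgdb_dir)
    if not src.is_dir():
        raise NotADirectoryError(f"not a directory: {src}")
    # Store paths relative to the parent directory so the archive
    # unpacks into a single top-level directory.
    with tarfile.open(out_path, "w:gz") as tar:
        tar.add(str(src), arcname=src.name)
    # Reopen the archive and list what was actually written.
    with tarfile.open(out_path, "r:gz") as tar:
        return tar.getnames()
```

A call such as archive_pgdb("/path/to/your/pgdb-dir", "my-pgdb.tar.gz") (both paths are placeholders) would produce one file that can then be sent home by FTP or email, or copied to a flash disk.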