Overview of Microbial Pathway and Genome Databases

SRI International Bioinformatics

Overview

Survey of other databases / web sites that integrate hundreds of microbial genomes and pathway information

Most of these resources are described in publications that can be found via PubMed

Differences among the resources include:

 Genomes included

 What other information is integrated with the genome data

 Value-added computational processing applied to each genome

 Query, visualization, and analysis tools available at each site

Overall Comparison to BioCyc

Many of the other databases contain more genomes than BioCyc

 This will change in 2011 as BioCyc transitions to RefSeq as its genome source 

BioCyc Tier 1 and Tier 2 databases are more highly curated than other databases

BioCyc has more extensive query, visualization, and analysis tools than other sites

BioCyc desktop version can be installed locally, and allows editing of PGDBs

Some other sites re-annotate the genomes, which may or may not improve data quality

Microbial Genome Resources

CMR – Comprehensive Microbial Resource

Entrez

IMG – Integrated Microbial Genomes

KEGG – Kyoto Encyclopedia of Genes and Genomes

PATRIC

SEED/NMPDR

UMBBD – University of Minnesota Biocatalysis/Biodegradation Database

CMR – Comprehensive Microbial Resource

J. Craig Venter Institute

http://cmr.jcvi.org/tigr-scripts/CMR/CmrHomePage.cgi

~700 genomes

Genome data only, no pathways

Genome browser, gene pages

Many comparative operations

Will be discontinued later in 2010

Entrez Genomes

National Center for Biotechnology Information

http://www.ncbi.nlm.nih.gov/sites/genome

Web portal to GenBank genomes

Genome browser, gene pages

IMG – Integrated Microbial Genomes

Joint Genome Institute

http://img.jgi.doe.gov/cgi-bin/pub/main.cgi

1,911 microbial genomes (approximately half are draft quality)

Genome browser, gene pages

Many comparative operations

Genome context analyses available

PATRIC

Virginia Bioinformatics Institute

http://patric.vbi.vt.edu/

Genome browser, gene pages

KEGG pathways

SEED / NMPDR

Argonne National Laboratory

http://www.nmpdr.org/FIG/wiki/view.cgi

782 microbial genomes

Funding ended in 2009

Unique features:

 Systems

 Essential genes

 Comparative genomics tools

 Community annotation

UMBBD

University of Minnesota

http://umbbd.msi.umn.edu/

Database of ~150 microbial biodegradation pathways

Does not include full microbial genomes

KEGG – Kyoto Encyclopedia of Genes and Genomes

Kyoto University

http://www.genome.ad.jp/kegg/

1,382 organisms

KEGG reannotates each genome

Static reference pathway maps are colored with the genes present in each organism
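A minimal sketch of what coloring a static reference map amounts to, assuming a map is simply a fixed set of steps that are each tied to a function identifier. The map contents, the identifiers, and the `color_map` helper are hypothetical illustrations, not KEGG data or a KEGG API.

```python
# Hypothetical reference map: step name -> function identifier.
# The same map is reused unchanged for every organism.
REFERENCE_MAP = {
    "step_1": "F1",
    "step_2": "F2",
    "step_3": "F3",
    "step_4": "F4",
}


def color_map(reference_map, organism_functions):
    """Mark each step 'colored' if the organism has a gene with that function."""
    present = set(organism_functions)
    return {
        step: ("colored" if function_id in present else "blank")
        for step, function_id in reference_map.items()
    }


# Hypothetical annotation for one organism: only two functions are present.
organism_functions = ["F2", "F4"]
coloring = color_map(REFERENCE_MAP, organism_functions)

for step, state in coloring.items():
    print(step, state)
```

In this sketch the set of steps never changes from one organism to the next; only the coloring does.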

Comparison with KEGG

KEGG vs MetaCyc: Reference pathway collections

 KEGG maps are not pathways (Nuc Acids Res 34:3687 2006)

 KEGG maps contain multiple biological pathways

 Two genes chosen at random from a BioCyc pathway are more likely to be related according to genome context methods than two genes chosen at random from a KEGG pathway

 KEGG maps are composites of pathways in many organisms; they do not identify which specific pathways were elucidated in which organisms

 KEGG has no literature citations, no comments, and less enzyme detail

 KEGG assigns half as many reactions to pathways as MetaCyc
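The random-gene-pair comparison can be illustrated with a toy calculation. This is a sketch under stated assumptions, not the procedure of the cited paper: the gene names, the pathway groupings, and the list of genome-context-related pairs are all invented, and `related_pair_fraction` is a hypothetical helper. Enumerating every within-group pair gives the probability that one randomly chosen pair is related.

```python
from itertools import combinations


def related_pair_fraction(groups, related_pairs):
    """Fraction of within-group gene pairs that appear in related_pairs."""
    related = {frozenset(pair) for pair in related_pairs}
    total = 0
    hits = 0
    for genes in groups.values():
        # Every unordered pair of distinct genes inside one group.
        for gene_a, gene_b in combinations(sorted(set(genes)), 2):
            total += 1
            if frozenset((gene_a, gene_b)) in related:
                hits += 1
    return hits / total if total else 0.0


# Toy data (hypothetical): two small pathways versus one composite map
# that lumps the same four genes together.
fine_grained_pathways = {"pathway_1": ["g1", "g2"], "pathway_2": ["g3", "g4"]}
composite_map = {"map_1": ["g1", "g2", "g3", "g4"]}

# Hypothetical pairs judged related by a genome context method.
related_pairs = [("g1", "g2"), ("g3", "g4")]

fine_score = related_pair_fraction(fine_grained_pathways, related_pairs)
composite_score = related_pair_fraction(composite_map, related_pairs)
print(fine_score, composite_score)
```

With this toy data, both within-pathway pairs are related, while the composite grouping adds four unrelated cross-pathway pairs, so its fraction is lower.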

KEGG vs organism-specific PGDBs

 KEGG does not curate or customize pathway networks for each organism

 Highly curated PGDBs now exist for important organisms such as E. coli, yeast, mouse, and Arabidopsis

Comparison of Pathway Tools to KEGG

Inference tools

 KEGG does not predict presence or absence of pathways

 KEGG lacks pathway hole filler, operon predictor

Curation tools

 KEGG does not distribute curation tools

 No ability to customize pathways to the organism

 Pathway Tools schema much more comprehensive

Visualization and analysis

 KEGG does not perform automatic pathway layout

 KEGG metabolic-map diagram extremely limited

 No comparative pathway analysis