Transcript General

Overview of the Pathway Tools Software and Pathway/Genome Databases
Introductions

BRG Staff
 Peter Karp
 Pallavi Kaipa
 Mario Latendresse
 Suzanne Paley
 Markus Krummenacker
 Ingrid Keseler
 Ron Caspi
 Alex Shearer
 Carol Fulcher

Attendees
 Where from, what genome?
 What do you hope to get out of the tutorial?
SRI International
Bioinformatics
SRI International

Private nonprofit research institute

No permanent funding sources

1300 staff in Menlo Park
– Founded in 1946 as Stanford Research Institute
– Separated from Stanford University in 1970
– Name changed to SRI International in 1977
– David Sarnoff Research Center acquired in 1987
SRI Organization
Bioinformatics Research Group
Information and Computing Sciences
Biopharmaceuticals and Pharmaceutical Discovery
Education and Policy
Engineering Systems and Sciences
Physical Sciences
Research in the SRI Bioinformatics Research Group
 EcoCyc
 MetaCyc
 Pathway Tools
 Pathway Holes
 BioWarehouse
 Enzyme Genomics
Outline for Tutorial




Monday
 Introduction
 Pathway/Genome Navigator
 PathoLogic tutorial and demo
 PathoLogic lab session – Make genome input files parsable
Tuesday
 PathoLogic tutorial
 PathoLogic lab session – Build initial version of PGDB
Wednesday
 Pathway hole filler, operon predictor, transport inference parser
Thursday
 Editors
 Feedback session
Tutorial Goals
 General familiarity with Pathway Tools goals and functionality
 Ability to create, edit, and navigate a new PGDB
 Create new PGDB for genome(s) you brought with you
 Familiarity with information resources available about Pathway Tools to continue your work
SRI’s Support for Pathway Tools
 NIH grant finances software development and user support
 Additional grants finance other software development
 Email us bug reports, suggestions, questions
 Comprehensive bug reports are required for us to fix the problem you reported
 Keep us posted regarding your progress
Administrative Details



Please wear badges at all times
Escort required outside this room/hallway
Let us know when you are leaving

Use E-Bldg Entrance
Phone numbers to call from entrance

Meals

Wednesday outing possible

Restrooms

Tutorial Format
 Questions welcome during presentations
 Lab sessions will take different amounts of time for different people
 Refine your PGDB
 Read Pathway Tools manuals
 Buddy system for some computers
 Computer logins
 Internet connectivity
Pathway/Genome Database
Integrating Genomic and Biochemical Data
Pathways
Reactions
Compounds
Proteins
Genes
Operons, Promoters, DNA Binding Sites
Chromosomes, Plasmids
CELL
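
The layers listed above can be pictured as linked objects. The following is only a minimal sketch under stated assumptions: the class names, fields, and the helper `genes_of_pathway` are hypothetical and greatly simplified, and they are not the actual Pathway Tools schema or API. All names in the toy data are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical, simplified classes for illustration only.
@dataclass
class Gene:
    name: str
    replicon: str  # chromosome or plasmid that carries the gene


@dataclass
class Protein:
    name: str
    genes: List[Gene] = field(default_factory=list)


@dataclass
class Reaction:
    name: str
    substrates: List[str] = field(default_factory=list)
    enzymes: List[Protein] = field(default_factory=list)


@dataclass
class Pathway:
    name: str
    reactions: List[Reaction] = field(default_factory=list)


def genes_of_pathway(pathway: Pathway) -> List[str]:
    """Walk pathway -> reactions -> enzymes -> genes; return sorted unique gene names."""
    return sorted(
        {g.name for r in pathway.reactions for e in r.enzymes for g in e.genes}
    )


# Toy data with placeholder names (not real biology).
gene_a = Gene("geneA", replicon="chromosome")
gene_b = Gene("geneB", replicon="plasmid-1")
enzyme = Protein("enzyme-AB", genes=[gene_a, gene_b])
step = Reaction("reaction-1", substrates=["compound-X", "compound-Y"], enzymes=[enzyme])
toy = Pathway("toy-pathway", reactions=[step])

print(genes_of_pathway(toy))
```

Because each layer points to the next, a question such as "which genes support this pathway?" becomes a short traversal rather than a join across separate files.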
Terminology
 Model Organism Database (MOD) – DB describing genome and other information about an organism
 Pathway/Genome Database (PGDB) – MOD that combines information about
 Pathways, reactions, substrates
 Enzymes, transporters
 Genes, replicons
 Transcription factors, promoters, operons, DNA binding sites
 BioCyc – Collection of 205 PGDBs at BioCyc.org
 EcoCyc, AgroCyc, HumanCyc
BioCyc Collection of Pathway/Genome Databases
Pathway/Genome Database (PGDB) – combines information about
 Pathways, reactions, substrates
 Enzymes, transporters
 Genes, replicons
 Transcription factors/sites, promoters, operons
Tier 1: Literature-Derived PGDBs
 MetaCyc
 EcoCyc -- Escherichia coli K-12
 BioCyc Open Chemical Database
Tier 2: Computationally-derived DBs, Some Curation -- 12 PGDBs
 HumanCyc
 Mycobacterium tuberculosis
Tier 3: Computationally-derived DBs, No Curation -- 191 DBs
Terminology – Pathway Tools Software

PathoLogic
 Predicts operons, metabolic network, pathway hole fillers, from genome
 Computational creation of new Pathway/Genome Databases

Pathway/Genome Editors
 Distributed curation of PGDBs
 Distributed object database system, interactive editing tools

Pathway/Genome Navigator
 WWW publishing of PGDBs
 Querying, visualization of pathways, chromosomes, operons
 Analysis operations
    Pathway visualization of gene-expression data
    Global comparisons of metabolic networks
Bioinformatics 18:S225 2002
Pathway/Genome DBs Created by External Users
600+ licensees -- 50 groups applying software to 100+ organisms
Software freely available to academics; Each PGDB owned by its creator
Saccharomyces cerevisiae, SGD project, Stanford University
 pathway.yeastgenome.org/biocyc/
TAIR, Carnegie Institution of Washington
Arabidopsis.org:1555
dictyBase, Northwestern University
GrameneDB, Cold Spring Harbor Laboratory
Planned:
 CGD (Candida albicans), Stanford University
 MGD (Mouse), Jackson Laboratory
 RGD (Rat), Medical College of Wisconsin
 WormBase (C. elegans), Caltech
Large-scale users:
C. Medigue, Genoscope, 67 PGDBs
G. Burger, U Montreal, 20 PGDBs
DOE GTL contractors:
 G. Church, Harvard, Prochlorococcus marinus MED4
 Larimer/Uberbacher, ORNL, Shewanella oneidensis
 J. Keasling, UC Berkeley, Desulfovibrio vulgaris
Fiona Brinkman, Simon Fraser Univ, Pseudomonas aeruginosa

Terminology
 “Database” = “DB” = “Knowledge Base” = “KB” = “Pathway/Genome Database” = “PGDB”
Why Create PGDBs?

Extract more information from your genome

Create an up-to-date computable information repository about an organism

Perform analyses on the genome and pathway complement of the organism, e.g., analyses of omics data

Perform comparative analyses with other organisms

Generate a genome poster and metabolic wall chart
Sequence Project Workflow
[Figure: workflow diagram. Raw Sequence; Phred; Phrap; GeneMark/Glimmer; BLAST, BLOCKS; Pathway Tools (PathoLogic, P/G Editors, P/G Navigator); WWW Publishing; Analyses]
MetaCyc: Metabolic Encyclopedia
 Nonredundant metabolic pathway database
 Describe a representative sample of every experimentally determined metabolic pathway
 Literature-based DB with extensive references and commentary
 Pathways, reactions, enzymes, substrates
 Jointly developed by SRI and Carnegie Institution
Nucleic Acids Research 34:D511-D516 2006
MetaCyc Data
Family of Pathway/Genome Databases
EcoCyc
MetaCyc
CauloCyc
AraCyc
MtbRvCyc
HumanCyc
Omics Viewer
 Import gene expression, proteomics, metabolomics data
 Obtain pathway based visualizations of omics data
 Numerical spectrum of expression values mapped to a color spectrum
 Steps of overview painted with color corresponding to expression level(s) of genes that encode enzyme(s) for that step
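
The value-to-color mapping described above can be illustrated with a small sketch. This is not the Omics Viewer's actual code or color scheme. The linear blue-to-red ramp, the function names, and the choice to average the expression values of a step's genes are all assumptions made for illustration.

```python
def expression_to_rgb(value, vmin, vmax):
    """Linearly map a value in [vmin, vmax] onto a blue -> red ramp (assumed scheme)."""
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    # Clamp so that out-of-range values take the end colors.
    t = (min(max(value, vmin), vmax) - vmin) / (vmax - vmin)
    return (round(255 * t), 0, round(255 * (1 - t)))


def color_steps(step_genes, expression, vmin, vmax):
    """Color each reaction step by the mean expression of its genes (an assumption).

    step_genes: dict mapping a step name to the genes encoding its enzyme(s).
    expression: dict mapping a gene name to a numeric expression value.
    Steps with no measured genes are left uncolored (absent from the result).
    """
    colors = {}
    for step, genes in step_genes.items():
        values = [expression[g] for g in genes if g in expression]
        if values:
            colors[step] = expression_to_rgb(sum(values) / len(values), vmin, vmax)
    return colors


# Placeholder step and gene names, with made-up expression values.
steps = {"step-1": ["geneA", "geneB"], "step-2": ["geneC"]}
measured = {"geneA": 0.0, "geneB": 10.0}
print(color_steps(steps, measured, vmin=0.0, vmax=10.0))
```

When several genes encode the enzymes for one step, a viewer must choose how to combine their values. Averaging is just one option, and a maximum or a per-gene display would be equally plausible.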
Environment for Computational Exploration of Genomes
 Powerful ontology opens many facets of the biology to computational exploration
 Global characterization of metabolic network
 Analysis of interface between transport and metabolism
 Nutrient analysis of metabolic network
Pathway Tools Implementation Details
 Allegro Common Lisp
 Sun, Linux, Windows platforms
 Ocelot object database
 300,000+ lines of code
 Lisp-based WWW server at BioCyc.org
 Manages 205 PGDBs
The Common Lisp Programming Environment
 Gat studied Lisp and Java implementations of 16 programs by 14 programmers (Intelligence 11:21 2000)
Survey
 Please complete survey at end of each day
PGDB(s) That You Build
 Before you leave
 Tar up your PGDB directory and FTP it home, email it home, or copy it to flash disk
 We will create a backup copy of your PGDB directory if the directory is still there at the end of the tutorial
 Delete the PGDB directory if you don’t want us to back it up
 We will not give the backed up data to anyone else
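
One possible way to tar up a PGDB directory is sketched below using Python's standard `tarfile` module. The function name `archive_pgdb` and the example directory name are hypothetical. The real location of your PGDB directory depends on how Pathway Tools is installed on the machine you are using.

```python
import tarfile
from pathlib import Path


def archive_pgdb(pgdb_dir, out_dir="."):
    """Pack pgdb_dir into <name>.tar.gz inside out_dir and return the archive path."""
    pgdb_dir = Path(pgdb_dir)
    if not pgdb_dir.is_dir():
        raise NotADirectoryError(str(pgdb_dir))
    archive = Path(out_dir) / f"{pgdb_dir.name}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # arcname stores paths relative to the PGDB directory name,
        # so the archive unpacks into a single top-level folder.
        tar.add(pgdb_dir, arcname=pgdb_dir.name)
    return archive


if __name__ == "__main__":
    # Hypothetical directory name; substitute the path to your own PGDB.
    example = Path("mycyc")
    if example.is_dir():
        print(archive_pgdb(example))
```

Write the archive somewhere outside the PGDB directory itself, then transfer the resulting `.tar.gz` file by FTP, email, or flash disk.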
Summary
 Pathway Tools and Pathway/Genome Databases
 Not just for pathways!
 Computational inferences
    Operons, metabolic pathways, pathway hole fillers
 Editing tools
 Analysis tools: Omics data on pathways
 Web publishing of PGDBs

 Main classes of users:
 Develop PGDB to extract more information from genome for genome paper
 Develop a model-organism DB for the organism that is updated regularly and published on the web
Information Sources

Pathway Tools User’s Guide
 /root/aic-export/ecocyc/genopath/released/doc/manuals/userguide1.pdf
    Pathway/Genome Navigator
    Appendix A: Guide to the Pathway Tools Schema
 /root/aic-export/ecocyc/genopath/released/doc/manuals/userguide2.pdf
    PathoLogic, Editing Tools
NOTE: Location of the aic-export directory can vary across different computers

Pathway Tools Web Site
 http://bioinformatics.ai.sri.com/ptools/
 Publications, programming examples, etc.

Slides from this tutorial
 http://bioinformatics.ai.sri.com/ptools/tutorial/