Transcript General
Overview of the Pathway Tools Software and Pathway/Genome Databases
Introductions
BRG Staff
Peter Karp
Pallavi Kaipa
Mario Latendresse
Suzanne Paley
Markus Krummenacker
Ingrid Keseler
Ron Caspi
Alex Shearer
Carol Fulcher
Attendees
Where from, what genome?
What do you hope to get out of the tutorial?
SRI International
Bioinformatics
SRI International
Private nonprofit research institute
No permanent funding sources
1300 staff in Menlo Park
– Founded in 1946 as Stanford Research Institute
– Separated from Stanford University in 1970
– Name changed to SRI International in 1977
– David Sarnoff Research Center acquired in 1987
SRI Organization
Bioinformatics Research Group
Information and Computing Sciences
Biopharmaceuticals and Pharmaceutical Discovery
Education and Policy
Engineering Systems and Sciences
Physical Sciences
Research in the SRI Bioinformatics Research Group
EcoCyc
MetaCyc
Pathway Tools
Pathway Holes
BioWarehouse
Enzyme Genomics
Outline for Tutorial
Monday
Introduction
Pathway/Genome Navigator
PathoLogic tutorial and demo
PathoLogic lab session – Make genome input files parsable
Tuesday
PathoLogic tutorial
PathoLogic lab session – Build initial version of PGDB
Wednesday
Pathway hole filler, operon predictor, transport inference parser
Thursday
Editors
Feedback session
Tutorial Goals
General familiarity with Pathway Tools goals and functionality
Ability to create, edit, and navigate a new PGDB
Create new PGDB for genome(s) you brought with you
Familiarity with information resources available about Pathway Tools to continue your work
SRI’s Support for Pathway Tools
NIH grant finances software development and user support
Additional grants finance other software development
Email us bug reports, suggestions, questions
Comprehensive bug reports are required for us to fix the problem you reported
Keep us posted regarding your progress
Administrative Details
Please wear badges at all times
Escort required outside this room/hallway
Let us know when you are leaving
Use E-Bldg Entrance
Phone numbers to call from entrance
Meals
Wednesday outing possible
Restrooms
Tutorial Format
Questions welcome during presentations
Lab sessions will take different amounts of time for different people
Refine your PGDB
Read Pathway Tools manuals
Buddy system for some computers
Computer logins
Internet connectivity
Pathway/Genome Database
Integrating Genomic and Biochemical Data
Pathways
Reactions
Compounds
Proteins
Genes
Operons, Promoters, DNA Binding Sites
Chromosomes, Plasmids
CELL
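The integration shown on this slide can be pictured as linked object classes. The following is only a minimal sketch under stated assumptions: the class and field names are hypothetical and are not the actual Pathway Tools schema, and the example entries are placeholders rather than real biological data.

```python
from dataclasses import dataclass, field


@dataclass
class Gene:
    name: str
    replicon: str  # chromosome or plasmid that carries the gene


@dataclass
class Protein:
    name: str
    gene: Gene  # gene that encodes this protein


@dataclass
class Reaction:
    name: str
    substrates: list[str]
    products: list[str]
    enzymes: list[Protein] = field(default_factory=list)


@dataclass
class Pathway:
    name: str
    reactions: list[Reaction] = field(default_factory=list)

    def genes(self) -> list[str]:
        """Follow pathway -> reaction -> enzyme -> gene links."""
        return sorted({e.gene.name for r in self.reactions for e in r.enzymes})


# Placeholder objects (hypothetical names, not taken from any real PGDB).
gene_a = Gene("geneA", "chromosome")
gene_b = Gene("geneB", "plasmid-1")
step = Reaction("rxn-1", ["compound-X"], ["compound-Y"],
                [Protein("enzyme-B", gene_b), Protein("enzyme-A", gene_a)])
pathway = Pathway("example-pathway", [step])
print(pathway.genes())
```

Because every pathway step points to its enzymes and every enzyme points to its gene, a question such as "which genes support this pathway?" becomes a short traversal rather than a join across separate data sources.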
Terminology
Model Organism Database (MOD) – DB describing genome and other information about an organism
Pathway/Genome Database (PGDB) – MOD that combines information about
Pathways, reactions, substrates
Enzymes, transporters
Genes, replicons
Transcription factors, promoters, operons, DNA binding sites
BioCyc – Collection of 205 PGDBs at BioCyc.org
EcoCyc, AgroCyc, HumanCyc
BioCyc Collection of Pathway/Genome Databases
Pathway/Genome Database (PGDB) – combines information about
Pathways, reactions, substrates
Enzymes, transporters
Genes, replicons
Transcription factors/sites, promoters, operons
Tier 1: Literature-Derived PGDBs
MetaCyc
EcoCyc -- Escherichia coli K-12
BioCyc Open Chemical Database
Tier 2: Computationally-derived DBs, Some Curation -- 12 PGDBs
HumanCyc
Mycobacterium tuberculosis
Tier 3: Computationally-derived DBs, No Curation -- 191 DBs
Terminology – Pathway Tools Software
PathoLogic
Predicts operons, metabolic network, pathway hole fillers, from genome
Computational creation of new Pathway/Genome Databases
Pathway/Genome Editors
Distributed curation of PGDBs
Distributed object database system, interactive editing tools
Pathway/Genome Navigator
WWW publishing of PGDBs
Querying, visualization of pathways, chromosomes, operons
Analysis operations
Pathway visualization of gene-expression data
Global comparisons of metabolic networks
Bioinformatics 18:S225 2002
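To give a feel for what predicting a metabolic network from a genome involves, here is a toy sketch. It is not the PathoLogic algorithm: it assumes the annotated genome has already been reduced to a set of reaction identifiers, it uses a simple fraction-present cutoff (the `min_fraction` parameter is hypothetical), and all identifiers are placeholders. Steps of a predicted pathway that have no enzyme in the genome are reported as pathway holes.

```python
def predict_pathways(genome_reactions, reference_pathways, min_fraction=0.5):
    """Return {pathway name: list of pathway holes} for pathways passing the cutoff.

    genome_reactions:   set of reaction IDs with an enzyme annotated in the genome
    reference_pathways: dict mapping pathway name -> list of reaction IDs
    """
    predictions = {}
    for name, steps in reference_pathways.items():
        if not steps:
            continue
        # A hole is a step of the reference pathway with no enzyme in the genome.
        holes = [rxn for rxn in steps if rxn not in genome_reactions]
        fraction_present = (len(steps) - len(holes)) / len(steps)
        if fraction_present >= min_fraction:
            predictions[name] = holes
    return predictions


# Placeholder identifiers, for illustration only.
genome = {"rxn-1", "rxn-2", "rxn-4"}
reference = {
    "pathway-A": ["rxn-1", "rxn-2", "rxn-3"],
    "pathway-B": ["rxn-5", "rxn-6"],
}
print(predict_pathways(genome, reference))
```

In this example pathway-A is predicted with one hole (rxn-3), which is the kind of gap a pathway hole filler would then try to assign to a candidate gene.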
Pathway/Genome DBs Created by External Users
600+ licensees -- 50 groups applying software to 100+ organisms
Software freely available to academics; each PGDB owned by its creator
Saccharomyces cerevisiae, SGD project, Stanford University
pathway.yeastgenome.org/biocyc/
TAIR, Carnegie Institution of Washington
Arabidopsis.org:1555
dictyBase, Northwestern University
GrameneDB, Cold Spring Harbor Laboratory
Planned:
CGD (Candida albicans), Stanford University
MGD (Mouse), Jackson Laboratory
RGD (Rat), Medical College of Wisconsin
WormBase (C. elegans), Caltech
Large-scale users:
C. Medigue, Genoscope, 67 PGDBs
G. Burger, U Montreal, 20 PGDBs
DOE GTL contractors:
G. Church, Harvard, Prochlorococcus marinus MED4
Larimer/Uberbacher, ORNL, Shewanella oneidensis
J. Keasling, UC Berkeley, Desulfovibrio vulgaris
Fiona Brinkman, Simon Fraser Univ, Pseudomonas aeruginosa
Terminology
“Database” = “DB” = “Knowledge Base” = “KB” = “Pathway/Genome Database” = “PGDB”
Why Create PGDBs?
Extract more information from your genome
Create an up-to-date computable information repository about an organism
Perform analyses on the genome and pathway complement of the organism, e.g., analyses of omics data
Perform comparative analyses with other organisms
Generate a genome poster and metabolic wall chart
Sequence Project Workflow
Raw Sequence
Phred
Phrap
GeneMark/Glimmer
BLAST, BLOCKS
Pathway Tools: PathoLogic, P/G Editors, P/G Navigator
WWW Publishing
Analyses
MetaCyc: Metabolic Encyclopedia
Nonredundant metabolic pathway database
Describe a representative sample of every experimentally determined metabolic pathway
Literature-based DB with extensive references and commentary
Pathways, reactions, enzymes, substrates
Jointly developed by SRI and Carnegie Institution
Nucleic Acids Research 34:D511-D516 2006
MetaCyc Data
Family of Pathway/Genome Databases
EcoCyc
MetaCyc
CauloCyc
AraCyc
MtbRvCyc
HumanCyc
Omics Viewer
Import gene expression, proteomics, metabolomics data
Obtain pathway-based visualizations of omics data
Numerical spectrum of expression values mapped to a color spectrum
Steps of overview painted with color corresponding to expression level(s) of genes that encode enzyme(s) for that step
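The mapping from expression values to colors can be sketched as a linear interpolation between two endpoint colors. This is an illustration only, not the Omics Viewer's implementation: the blue-to-red scale, the clamping of out-of-range values, and the choice to color a step by the highest value among its genes are all assumptions, and the gene and step names are placeholders.

```python
def value_to_color(value, vmin, vmax, low=(0, 0, 255), high=(255, 0, 0)):
    """Linearly map a value in [vmin, vmax] to an RGB color between low and high."""
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    clamped = min(max(value, vmin), vmax)  # out-of-range values saturate
    t = (clamped - vmin) / (vmax - vmin)
    return tuple(round(lo + t * (hi - lo)) for lo, hi in zip(low, high))


def color_steps(step_genes, expression, vmin, vmax):
    """Color each step by the highest expression value among its genes (an assumption)."""
    colors = {}
    for step, genes in step_genes.items():
        values = [expression[g] for g in genes if g in expression]
        if values:  # steps with no measured genes stay uncolored
            colors[step] = value_to_color(max(values), vmin, vmax)
    return colors


# Placeholder data: log-ratio style values for hypothetical genes.
expression = {"geneA": -2.0, "geneB": 2.0}
step_genes = {"step-1": ["geneA", "geneB"], "step-2": ["geneC"]}
print(color_steps(step_genes, expression, vmin=-2.0, vmax=2.0))
```

Other aggregation rules for steps with several genes (mean, or one color per gene) would fit the same structure; only the `max(values)` line would change.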
Environment for Computational Exploration of Genomes
Powerful ontology opens many facets of the biology to computational exploration
Global characterization of metabolic network
Analysis of interface between transport and metabolism
Nutrient analysis of metabolic network
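One simple way to think about nutrient analysis of a metabolic network is forward propagation: start from a set of nutrients and repeatedly fire any reaction whose substrates are all available. The sketch below illustrates that general idea under stated assumptions; it is not the Pathway Tools analysis itself, it treats every reaction as one-directional, and the compound names are placeholders.

```python
def producible_compounds(nutrients, reactions):
    """Return every compound reachable from the nutrient set.

    reactions: iterable of (substrates, products) pairs, each a list of compound names.
    A reaction fires once all of its substrates are available.
    """
    available = set(nutrients)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            if set(substrates) <= available and not set(products) <= available:
                available |= set(products)
                changed = True
    return available


# Placeholder network: A -> B, B + C -> D, E -> F.
network = [(["A"], ["B"]), (["B", "C"], ["D"]), (["E"], ["F"])]
print(sorted(producible_compounds({"A", "C"}, network)))
```

With nutrients A and C, compounds B and D become producible while F does not, because nothing in the network supplies E.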
Pathway Tools Implementation Details
Allegro Common Lisp
Sun, Linux, Windows platforms
Ocelot object database
300,000+ lines of code
Lisp-based WWW server at BioCyc.org
Manages 205 PGDBs
The Common Lisp Programming Environment
Gat studied Lisp and Java implementations of 16 programs by 14 programmers (Intelligence 11:21 2000)
Survey
Please complete survey at end of each day
PGDB(s) That You Build
Before you leave
Tar up your PGDB directory and FTP it home, email it home, or copy it to flash disk
We will create a backup copy of your PGDB directory if the directory is still there at the end of the tutorial
Delete the PGDB directory if you don’t want us to back it up
We will not give the backed up data to anyone else
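For attendees who prefer a script to the tar command, the archiving step can be sketched with Python's standard tarfile module. The function name is hypothetical and the paths in the usage comment are placeholders: the actual location of your PGDB directory depends on the installation and is not given here.

```python
import tarfile
from pathlib import Path


def archive_pgdb(pgdb_dir, archive_path):
    """Write a gzip-compressed tar archive of a PGDB directory and return its path."""
    pgdb_dir = Path(pgdb_dir)
    if not pgdb_dir.is_dir():
        raise NotADirectoryError(str(pgdb_dir))
    with tarfile.open(str(archive_path), "w:gz") as tar:
        # arcname keeps only the directory name, not the full path, inside the archive.
        tar.add(str(pgdb_dir), arcname=pgdb_dir.name)
    return Path(archive_path)


# Usage (placeholder paths, adjust to your own machine):
# archive_pgdb("/path/to/your/pgdb-directory", "my-pgdb.tar.gz")
```

The resulting .tar.gz file can then be transferred by FTP, email, or flash disk as described above.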
Summary
Pathway Tools and Pathway/Genome Databases
Not just for pathways!
Computational inferences
Operons, metabolic pathways, pathway hole fillers
Editing tools
Analysis tools: Omics data on pathways
Web publishing of PGDBs
Main classes of users:
Develop PGDB to extract more information from genome for genome paper
Develop a model-organism DB for the organism that is updated regularly and published on the web
Information Sources
Pathway Tools User’s Guide
/root/aic-export/ecocyc/genopath/released/doc/manuals/userguide1.pdf
/root/aic-export/ecocyc/genopath/released/doc/manuals/userguide2.pdf
Pathway/Genome Navigator
Appendix A: Guide to the Pathway Tools Schema
PathoLogic, Editing Tools
NOTE: Location of the aic-export directory can vary across different computers
Pathway Tools Web Site
http://bioinformatics.ai.sri.com/ptools/
Publications, programming examples, etc.
Slides from this tutorial
http://bioinformatics.ai.sri.com/ptools/tutorial/