Transcript Slide 1

ShewCyc and BeoCyc:
discovery platforms for environmental
and bioenergy research
Tatiana Karpinets, Gretta Serres, and Michael
Leuze
Oak Ridge National Lab, Marine Biological Laboratory
Pathway Tools Workshop 2010
ShewCyc and Shewanella
Knowledgebase
http://shewanella-knowledgebase.org:8080/Shewanella/
Experimental
data
Computational
predictions
Analytical and
visualization tools
Biological insight
Manually curated PGDB for
Shewanella oneidensis MR-1
•Manual reannotation
•Localization prediction
•Regulon predictions (http://regprecise.lbl.gov/RegPrecise/)
•Capture information from literature, gene expression data, proteomics
Yang et al. J Biol Chem. 281:29872-85
Fur
Fnr
Crp
ArgR
Shewanella oneidensis
Metabolic Pathway Viewer
developed by Erich Baker, Baylor University, TX
http://watson.ecs.baylor.edu/4360/
Multi-Genome Annotation Solution:
Ortholog Editor in Combination with Genome Editors
Manual
Curation
Consistency
Check
Improved Individual Genome Editors
Ortholog Table Tools
Sort
Search
Table Overview
Edit View
Evidence View
Alignments View
Consistency Check
View
Download
Edit View
Alignment View
MUSCLE (3.6) multiple sequence alignment
Sfri_3956  MKIRVLISLATAFFMLNTSSAFAKDPADTAVQPLLVKPKVIIFDVNETLLDLENMRASVG
Swoo_1992  ---------------------MTLELRDTSIIKDF--PKAVIFDTDNTLYPYHYSHQQAS

Sfri_3956  KALNGREDLLPLWFSTMLHHSLVVSATGDYQTFGSIGVA---------SLQMVAEINGIA
Swoo_1992  LAVQQKAEKILGIKQSRFSDALKISKREIKERLGETASSHSRLLYFQRTIELLGLKTQIM

Sfri_3956  ITPEQAKTAILTPLRSLPAHPDVAEGLAKLKAQGYKLVTLTNSSLEGVTLQLKNANLSQY
Swoo_1992  TTLDLEQTYWRTFLTNSQLFPEMHEFLHDLRAHGIQSAVVTDLTAQIQFRKLVYFGLHEA

Sfri_3956  FDANLSIESVGVFKPHLKTYQWAIKDLGVNADEAL-MVAAHGW-DIAGADKAGLQTAFIR
Swoo_1992  FDYIVTTEEAGADKPNPLPFQLARSKLGLEKGDNLWMIGDHPVKDIQGAKKT-LGAITLQ

Sfri_3956  RQGKVLFPLAAQPDYNVL--DVNELASTLAKFN-----
Swoo_1992  KNHKDVKVLKGKEGPDILFDKYSELRELLGEISSNKGK
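A pairwise alignment like the one on this slide can be summarized by its percent identity. The following is a minimal sketch, not the knowledgebase's own code; the function name `percent_identity` and the choice to skip gapped columns are assumptions made for illustration. It uses only the first alignment block shown above.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gapped columns are ignored (an assumption of this sketch)
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# First alignment block from the slide (Sfri_3956 vs Swoo_1992).
sfri = "MKIRVLISLATAFFMLNTSSAFAKDPADTAVQPLLVKPKVIIFDVNETLLDLENMRASVG"
swoo = "-" * 21 + "MTLELRDTSIIKDF--PKAVIFDTDNTLYPYHYSHQQAS"
print(f"{percent_identity(sfri, swoo):.1f}% identity over ungapped columns")
```

Other definitions of identity (for example, dividing by the full alignment length) give different numbers, so the denominator should be stated whenever such a value is reported.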
Evidence View
Consistency Check View
Group annotation
Protein length consistency
Original annotations
Automatic
identification of
bad grouping


Domain consistency
Protein Length Consistency
Domain Consistency
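The protein length consistency check could look roughly like the sketch below. This is an illustration only: the slides do not say how the Ortholog Editor flags bad groupings, so the median-based rule, the 20% tolerance, and the locus names are all hypothetical.

```python
from statistics import median


def flag_length_outliers(group, tolerance=0.2):
    """Return loci whose protein length deviates from the group median
    by more than `tolerance` (expressed as a fraction of the median)."""
    med = median(group.values())
    return sorted(
        locus
        for locus, length in group.items()
        if abs(length - med) > tolerance * med
    )


# Hypothetical ortholog group: locus tag -> protein length in residues.
group = {"geneA": 300, "geneB": 310, "geneC": 295, "geneD": 150}
print(flag_length_outliers(group))
```

A domain consistency check could follow the same pattern, comparing each member's set of predicted domains against the set shared by most of the group.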
ShewCyc
Probing Intergenic Regions (IGs)
in S. oneidensis using microarray
Experimental data (Many Microbe Microarrays Database): CRP
mutant vs. wild-type MR-1 (various time points during the
transition from aerobic growth with lactate to anaerobic growth
with lactate and fumarate).
An Affymetrix microarray was designed to probe transcripts derived
from both genes and IGs.
Examples: IG SO0016_SO0022;
IG SO0017-SO0015
A regulatory effect of the IG transcription
Down-stream gene
IG
Up-stream gene
Subset I: IG regions with the same direction of change
in gene expression as their neighboring genes (1466)
Subset A: IG regions with directions of changes in gene
expression that are opposite to upstream genes (805)
Subset B: IG regions with directions of changes in gene
expression that are opposite to downstream genes (820)
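The three subsets can be expressed as a simple sign comparison. The sketch below is an assumption about how such a classification might be coded, not the authors' procedure; in particular it treats a zero change as "no direction", reads "neighboring genes" in Subset I as both neighbors, and allows an IG region to fall into both Subset A and Subset B.

```python
def sign(x: float) -> int:
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def classify_ig(ig_change: float, upstream_change: float, downstream_change: float) -> set:
    """Assign an intergenic region to Subset I, A and/or B from log-ratio changes."""
    s_ig = sign(ig_change)
    s_up = sign(upstream_change)
    s_down = sign(downstream_change)
    labels = set()
    if s_ig == 0:
        return labels  # no change in the IG region: not classified
    if s_ig == s_up and s_ig == s_down:
        labels.add("I")  # same direction as both neighboring genes
    if s_up != 0 and s_ig == -s_up:
        labels.add("A")  # opposite to the upstream gene
    if s_down != 0 and s_ig == -s_down:
        labels.add("B")  # opposite to the downstream gene
    return labels


# Hypothetical log2 changes (IG, upstream gene, downstream gene).
print(classify_ig(1.2, 0.8, 0.5))
print(classify_ig(1.0, -0.5, 0.7))
```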
Revealing a biological role of Intergenic Regions
transcription using Pathway Tools
IG (SO2490_SO2491)
SO2490 (HexR)
Enzymes of the
Entner–Doudoroff
(ED) pathway
HexR
SO2491 (PykA)
PykA
BioEnergy research Science Center
(BESC)
Cellulosic biomass → Breakdown into sugars → Sugar fermentation → Fuel(s)
BESC’s approach:
(1) designing plant cell walls for rapid deconstruction and
(2) engineering microbes for converting plants into biofuel in a single step
(consolidated bioprocessing)
Manually curated (NREL, UGA) pathway genome database for
Populus trichocarpa
Usage Summary
Private
Public portal
http://besckb.ornl.gov
Metabolic reconstructions for
BESC relevant
microorganisms (BeoCyc)
Integrating Experimental Data from
LIMS and external resources:
Computational predictions:
•Orthologs/Inparalogs
•Protein Domains
•Protein Localization
•Metabolic enzymes and pathways
•Carbohydrate-Active Enzymes
•and more
Genome comparison,
analysis and visualization
tools:
Genome browsers
Comparative chromosome
maps (CMAP)
Metabolic maps
Omic Viewers
and more
Microbial Phenotypes
comparison toolkit
Analysis
Framework
CAZymes Analysis Toolkit (CAT)
Novel approach based on association analysis to discover
links between CAZy families and Pfam domains
Web site: http://cricket.ornl.gov/cgi-bin/cat.cgi
Find conserved associations between CAZy families and Pfam domains
Assign carbohydrate activity to unknown protein domains
Suggest novel CAZy families
Find CAZymes among hypothetical proteins
Assign CAZy families to a sequence with high specificity and sensitivity
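The idea of association analysis between CAZy families and Pfam domains can be illustrated with support and confidence counts for rules of the form "Pfam domain implies CAZy family". This is a generic sketch, not the CAT implementation; the thresholds and the identifiers (`PF_a`, `GH_x`, and so on) are hypothetical.

```python
from collections import Counter
from itertools import product


def association_rules(proteins, min_support=2, min_confidence=0.8):
    """Find rules 'Pfam domain -> CAZy family'.

    proteins: list of (cazy_families, pfam_domains) pairs, each a set.
    Returns sorted tuples (pfam, cazy, support, confidence).
    """
    pfam_counts = Counter()
    pair_counts = Counter()
    for cazy, pfam in proteins:
        pfam_counts.update(pfam)
        pair_counts.update(product(pfam, cazy))  # every (domain, family) co-occurrence
    rules = []
    for (domain, family), support in pair_counts.items():
        confidence = support / pfam_counts[domain]
        if support >= min_support and confidence >= min_confidence:
            rules.append((domain, family, support, confidence))
    return sorted(rules)


# Hypothetical annotations: (CAZy families, Pfam domains) per protein.
proteins = [
    ({"GH_x"}, {"PF_a", "PF_b"}),
    ({"GH_x"}, {"PF_a"}),
    ({"GH_y"}, {"PF_b"}),
    (set(), {"PF_b"}),
]
print(association_rules(proteins))
```

A rule with high confidence could then be applied to a protein that carries the domain but has no CAZy assignment, which corresponds to the "find CAZymes among hypothetical proteins" use case.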
Private BeoCyc hosts a P. trichocarpa PGDB manually curated by the NREL team
1. GDP-mannose biosynthesis II,
2. GDP-L-fucose biosynthesis I (from GDP-D-mannose),
3. GDP-L-fucose biosynthesis II (from L-fucose),
4. UDP-D-galactose biosynthesis,
5. UDP-D-galacturonate biosynthesis I (from UDP-D-glucuronate),
6. UDP-D-galacturonate biosynthesis II (from D-galacturonate),
7. UDP-D-glucose biosynthesis (from sucrose),
8. UDP-D-glucuronate biosynthesis (from myo-inositol),
9. UDP-D-xylose biosynthesis (compartmentalized),
10. UDP-L-arabinose biosynthesis I (from UDP-xylose) in Endoplasmic Reticulum,
11. UDP-L-arabinose biosynthesis I (from UDP-xylose) in Cytosol,
12. UDP-L-arabinose biosynthesis I (from UDP-xylose) in Golgi lumen,
13. UDP-L-arabinose biosynthesis II (from L-arabinose) in Cytosol
BeoCyc and BESC knowledgebase
http://bobcat.ornl.gov/besc/index.jsp
Improving Populus trichocarpa
genome annotation


Poor annotation of the poplar genome (gene models and
predicted enzymes)
Poor representation of the cell wall biosynthesis and
related pathways in the reference databases (MetaCyc,
KEGG, and PlantCyc)
[Diagram: Arabidopsis genes, sequences and EC numbers; Populus genes, sequences and EC numbers; linked by Blast and ortholog search; Kyoto Encyclopedia of Genes and Genomes (genomic and molecular information); RESD & PESD; Future!!!]
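One way to read this diagram is that EC numbers known for Arabidopsis genes are carried over to Populus genes through sequence similarity. The sketch below shows that idea with a best-hit rule; it is an assumption about the workflow, the E-value cutoff and gene names are hypothetical, and a real ortholog search would normally require more than a one-directional best hit.

```python
def transfer_ec(hits, reference_ec, max_evalue=1e-10):
    """Transfer EC numbers from reference genes to query genes via the best hit.

    hits: iterable of (query_gene, reference_gene, evalue) similarity hits.
    reference_ec: dict mapping reference gene -> list of EC numbers.
    """
    best = {}
    for query, subject, evalue in hits:
        if evalue > max_evalue:
            continue  # hit too weak to support an annotation transfer
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return {
        query: reference_ec[subject]
        for query, (subject, _) in best.items()
        if subject in reference_ec
    }


# Hypothetical Populus-vs-Arabidopsis hits and placeholder EC labels.
hits = [
    ("pop_gene_1", "ath_gene_1", 1e-50),
    ("pop_gene_1", "ath_gene_2", 1e-20),
    ("pop_gene_2", "ath_gene_3", 1e-3),
]
reference_ec = {"ath_gene_1": ["EC_a"], "ath_gene_2": ["EC_b"]}
print(transfer_ec(hits, reference_ec))
```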
Integration of the metabolic reconstructions
into BESC knowledgebase
RefSeq files from the NCBI
Input files for PathoLogic
Enzyme information (KEGG, CAZy)
Pathway Genome Databases
Refine the PGDBs
Create MySQL tables
Supplement databases with additional annotations
Compare phenotypes of the organisms in terms of their
genomic and metabolic characteristics
Challenge: automatic PGDB generation for draft genomes using one
table of ORF predictions and FASTA contigs
!!! Automatically predict TUs, complexes, transporters
Replicon  Locus   Start  Stop  Product                                                                  EC1       EC2       EC3  EC4  EC5
C0        or4062  1176   709   Polyketide cyclase/dehydrase
C0        or4063  4667   1206  L-threonine ammonia-lyase (2-oxobutanoate-forming)                       4.3.1.19
C0        or4064  5611   4682  glutaryl-CoA dehydrogenase                                               1.3.99.7
C0        or4065  6384   5608  hypothetical protein
C0        or4066  7869   7120  naphthoate synthase                                                      4.1.3.36
C0        or4067  8597   7887  GntR domain protein
C0        or4068  8852   8643  adenosylcobinamide-phosphate guanylyltransferase                         2.7.7.62
C0        or4069  9812   8973  hypothetical protein
C1r       or2287  343    1230  protein of unknown function DUF6 transmembrane
C1r       or2288  1398   1679  putative lipoprotein
C1r       or2289  2852   1854  Alcohol dehydrogenase GroES domain protein
C1r       or2290  2985   3881  LysR substrate-binding
C1r       or2291  5705   3897  peptidase M24
C1r       or2292  6933   5764  Cysteine desulfurase
C1r       or2293  7743   7129  Lysine exporter protein (LYSE/YGGA)
C1r       or2294  8110   7841  3-hydroxybutyryl-CoA epimerase                                           5.1.2.3   1.1.1.35
C3r       or2604  401    222   phage tail sheath protein FI
C3r       or2605  1499   903   phosphonate metabolism protein/1,5-bisphosphokinase (PRPP-forming) PhnN
C3r       or2606  1870   2196  Arc-like DNA binding
C3r       or2607  3807   2365  glutaryl-CoA dehydrogenase                                               1.3.99.7
C3r       or2608  4128   4640  hypothetical protein
C3r       …
C3r       …
C4r       …
FASTA sequence for each contig:
>C0
ATAAAGACGAAAAGCACCGGAT
CGAACACCGCCACTTCGAAAAC
TTCGAACGTCTACGG ….
>C1r
AGTGCGGCTAGGCCGTCGATGG
AGCTAGGCCGTCGA ….
>C3r
GACGAAAAGCAGACGAAAAGCA
GACGAAAAGCAGCT….
….
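The two inputs above (one ORF table plus FASTA contigs) could be turned into per-replicon annotation files along the following lines. This is a sketch under assumptions: the attribute names (`ID`, `NAME`, `STARTBASE`, `ENDBASE`, `FUNCTION`, `PRODUCT-TYPE`, `EC`) and the `//` record terminator follow the PathoLogic file format as recalled here and should be checked against the Pathway Tools documentation; the function names are hypothetical, and the two example rows are values as read from the slide table.

```python
from collections import defaultdict


def parse_fasta(text):
    """Return {contig_id: sequence} from FASTA-formatted text."""
    contigs = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            contigs[name] = []
        elif name is not None:
            contigs[name].append(line)
    return {cid: "".join(parts) for cid, parts in contigs.items()}


def pf_records(rows):
    """Group ORF rows by replicon and render one annotation text per replicon.

    rows: iterable of (replicon, locus, start, stop, product, ec_numbers).
    """
    by_replicon = defaultdict(list)
    for replicon, locus, start, stop, product, ecs in rows:
        lines = [
            f"ID\t{locus}",
            f"NAME\t{locus}",
            f"STARTBASE\t{start}",
            f"ENDBASE\t{stop}",
            f"FUNCTION\t{product}",
            "PRODUCT-TYPE\tP",  # P = protein (assumed attribute value)
        ]
        lines += [f"EC\t{ec}" for ec in ecs]
        lines.append("//")  # record terminator
        by_replicon[replicon].append("\n".join(lines))
    return {rep: "\n".join(recs) + "\n" for rep, recs in by_replicon.items()}


rows = [
    ("C0", "or4062", 1176, 709, "Polyketide cyclase/dehydrase", []),
    ("C0", "or4064", 5611, 4682, "glutaryl-CoA dehydrogenase", ["1.3.99.7"]),
]
annotations = pf_records(rows)
print(annotations["C0"])
```

Keeping the grouping keyed by replicon makes it straightforward to pair each annotation text with the contig of the same name returned by `parse_fasta`.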
Involvement of Single-Genotype Consortia in
Degradation of Aromatic Compounds by
Rhodopseudomonas palustris
p-Coumarate, Benzoate
- anoxygenic photosynthesis
- aerobic or anaerobic respiration and fermentation
- fixation of nitrogen gas
- utilization of carbon through CO2 reduction using H2 as an electron donor
Average log2 ratio of the expression of nitrogenases with
different cofactors during growth on p-coumarate and
benzoate versus succinate
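An average log2 ratio of this kind can be computed as the mean of per-gene log2(condition / reference) values. The sketch below assumes positive, already normalized expression values; the function name and the numbers are hypothetical and are not taken from the study.

```python
from math import log2
from statistics import mean


def mean_log2_ratio(condition, reference):
    """Mean of log2(condition / reference) over paired expression values."""
    if len(condition) != len(reference):
        raise ValueError("condition and reference must have the same length")
    return mean(log2(c / r) for c, r in zip(condition, reference))


# Hypothetical intensities for the genes of one nitrogenase cluster:
# growth on an aromatic substrate vs growth on succinate.
aromatic = [4.0, 8.0]
succinate = [1.0, 2.0]
print(mean_log2_ratio(aromatic, succinate))
```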
•Transporters
•Chemotaxis operons
•Curli formation operon
Expression of R. palustris phenotypes under p-coumarate
(black columns) and benzoate (white columns) degrading
conditions, compared with growth on succinate.
p-Coumarate -
Benzoate
Structures of R. palustris consortia mediating
anaerobic growth on p-coumarate (A)
and on benzoate (B)
Putative electron donor and electron acceptor
reactions under different modes of
Rhodopseudomonas palustris growth
Changes in total nitrogen, ammonium and
dissolved nitrogen gas during benzoate
degradation as functions of OD660
Acknowledgements
ShewCyc and Shewanella Knowledgebase
PNNL: Margaret Romine
Marine Biological Laboratory: Margrethe Serres
ORNL: Denise Schmoyer, Guruprasad Kora, Mustafa Syed, Erich Baker, Hoony Park, Nagiza Samatova, and Edward Uberbacher
BeoCyc and BESC Knowledgebase
NREL: Ambarish Nag, Christopher Chang
UGA: Maor Bar-Peled
ORNL: Mustafa Syed, Hoony Park, Morrey Parang, Denise Schmoyer, and Edward C. Uberbacher