Introduction

Editing Pathway/Genome Databases
Compounds, Reactions and Pathways
Ron Caspi
SRI International
Bioinformatics

Activate Editing Mode
• Type (enable/disable-editors t) at the listener pane
Why Curation is Important!
• Curators need jobs
• “In silico” information less solid than experimental evidence
• Database curation greatly enhances the usefulness of the data
Pathway Tools Paradigms
• Separate database from user interface
  • Navigator provides one interface to the DB
  • Editors provide an alternative interface to the DB
• Reuse information whenever possible!
• A PGDB should not describe the same biological or chemical entity more than once
• A tool helps to prevent creation of duplicate reactions
Editing rules: Support Policy
• Do not modify the EcoCyc or MetaCyc datasets
• Do not alter DB schema
  • e.g. do not add or remove classes or slots
List of Editors
• Compound Editor
• Compound Structure Editors
• Reaction Editor
• Synonym Editor
• Publication Editor
• Pathway Editor and Pathway Info Editor
• Protein/Subunit structure/Enzymatic Reaction Editors
• Gene Editor
• Intron Editor (Eukaryotes only)
• Transcription Unit Editor
• Frame Editor
• Relationships Editor
• Ontology Editor
Saving Changes
• The user must save changes explicitly
  • File => Save Current DB
  • Save DB button
  • List Unsaved Changes in Current DB
  • Revert Current DB
  • Checkpoint Current DB Updates to File
  • Restore Updates from Checkpoint File
Other DB commands under the File menu
• Summarize databases
• Summarize current organism
• Refresh DB list
• Refresh All Current DBs
• Delete a DB
• Attempt to Reconnect to Database Server
Invoking the Editors
1. New Object: Use the “New” command
2. Existing Object: Right-Click on the Object Handle
Compound Editor
• Create or edit a compound
• Specify Class
• Common Name and Synonyms
• Comments, citations
• Links to other DBs
The Synonym Editor
• Lets you easily edit the synonyms and set the common name
Citations
• Citation boxes
• The CITS field
• File => Import Citations from PubMed
• Publication editor (invoke by right-clicking on a citation at bottom)
• Non-PubMed citation: enter it in a citation box in the form Smith06, then invoke the editor by clicking out of the citation box.
More Compound Editing
• Compound Structure Editors (Marvin, JME)
• Mol files
• Exporting to other DBs
• Merging
• Duplicate Frame and Edit
Reaction Editor
• Enter or edit a reaction equation
• EC number (official?)
• Check for balance
• Compound Resolver
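To make the "Check for balance" step concrete, here is a minimal Python sketch of an atom-count balance test. It is not the Reaction Editor's own implementation. The function names are hypothetical, the parser assumes simple formulas without parentheses or charges, and the neutral formulas used (ascorbic acid, water, 3-keto-L-gulonic acid) are supplied only for illustration.

```python
import re
from collections import Counter

def element_counts(formula: str) -> Counter:
    """Count atoms in a simple formula such as 'C6H8O6' (no parentheses, no charges)."""
    counts = Counter()
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return counts

def is_balanced(left: list[str], right: list[str]) -> bool:
    """Return True when both sides of the reaction carry the same atoms."""
    total_left = sum((element_counts(f) for f in left), Counter())
    total_right = sum((element_counts(f) for f in right), Counter())
    return total_left == total_right

# Illustrative neutral formulas: ascorbic acid + water = 3-keto-L-gulonic acid
print(is_balanced(["C6H8O6", "H2O"], ["C6H10O7"]))  # True
print(is_balanced(["C6H8O6"], ["C6H10O7"]))         # False: the water is missing
```

A real check would also have to account for charge and for compounds without a defined structure, which this sketch ignores.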
Pathway Info Editor
• Class (variant class)
• Common Name
• Synonyms
• Evidence code
• Citations (CIT)
• Comments
• External Links
• Hypothetical reactions
• Author credits
Pathway Editor
• Graphically create and modify pathways
• Two tools:
  • Connections Editor: add reactions one by one
  • Segment Editor: enter a linear pathway segment
Connections Editor Operations
• Two main display panes:
  • left: unconnected pathway reactions
  • right: draws connected reactions (looks like the regular Pathway display window)
• Connecting reactions:
  • select initial reaction (in either pane) ===> red and green reactions
  • select a green reaction
• Useful Commands:
  • choose main compounds for reaction
  • disconnect all reactions
• In circular pathways, specify which compound should be at the top
• Add links to other pathways, reactions, or comments
Pathway Segment Editor
• To enter a linear sequence of reactions simultaneously
• Reactions are specified by EC numbers or reaction substrates
• One segment may contain up to 7 reactions
Pathway Editor Limitations
• Complex situations can cause ambiguity:
  • link may be ignored
  • dialog box for disambiguating
  • pathway drawn in bizarre arrangement
  • Fix: try removing offending link and add links in different order
• Pathway Editor does not handle polymerization pathways easily.
Evidence Codes for Pathways
http://brg.ai.sri.com/ptools/evidence-ontology.html
• EV-COMP: Inferred from computation
  • HINF - Human inference
  • AINF - Artificial inference
• EV-AS: Author statement
  • TAS - traceable
  • NAS - non-traceable
• EV-IC: Inferred by curator
• EV-EXP: Inferred from experiment
  • IDA - inferred from direct assay
  • IPI - inferred from physical interaction
  • TAS - inferred from traceable data (review)
  • IEP - inferred from expression pattern
  • IGI - inferred from genetic interaction
  • IMP - inferred from mutant phenotype
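The two-level hierarchy above can be held in a small lookup table. The following Python sketch is only an illustration: the `EVIDENCE` dictionary and `describe` function are hypothetical names, and writing a sub-code by appending it to its parent with a hyphen (for example `EV-EXP-IDA`) is an assumption about the naming scheme rather than something stated on the slide.

```python
# Parent code -> (label, {sub-code: label}), copied from the list above.
EVIDENCE = {
    "EV-COMP": ("Inferred from computation",
                {"HINF": "human inference", "AINF": "artificial inference"}),
    "EV-AS": ("Author statement",
              {"TAS": "traceable", "NAS": "non-traceable"}),
    "EV-IC": ("Inferred by curator", {}),
    "EV-EXP": ("Inferred from experiment",
               {"IDA": "inferred from direct assay",
                "IPI": "inferred from physical interaction",
                "TAS": "inferred from traceable data (review)",
                "IEP": "inferred from expression pattern",
                "IGI": "inferred from genetic interaction",
                "IMP": "inferred from mutant phenotype"}),
}

def describe(code: str) -> str:
    """Resolve a parent code ('EV-IC') or an assumed compound code ('EV-EXP-IDA')."""
    if code in EVIDENCE:
        return EVIDENCE[code][0]
    parent, _, sub = code.rpartition("-")
    label, subcodes = EVIDENCE[parent]  # raises KeyError for unknown codes
    return f"{label}: {subcodes[sub]}"

print(describe("EV-IC"))
print(describe("EV-EXP-IDA"))
```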
Enzyme/Protein Editors
• To add an enzyme to a reaction:
  • Right-click the reaction, choose Edit => Create/Add enzyme.
  • “Choose Protein”: specify ID, or “Search by genes or create new protein”
  • => Protein subunit structure editor
  • Protein Editor
• Check the Curator Guide at http://bioinformatics.ai.sri.com/ptools/curatorsguide.pdf
Protein Editor
Enzymatic Reaction Editor
Protein Subunit Editor
Super Pathways
• Need to keep pathways within well-defined end points
• Link pathways to upstream or downstream pathways with pathway links.
• Create more complex metabolic networks using superpathways
• Example: superpathway of aromatic compound degradation (aerobic) is composed of:
  • catechol degradation II
  • mandelate degradation I
  • benzoate degradation (aerobic)
  • β-ketoadipate degradation
  • protocatechuate degradation II
  • shikimate degradation
  • quinate degradation
  • 4-hydroxymandelate degradation
  • tryptophan degradation I
Pathway Export
• Export
  • Edit => Add Pathway to File Export List
  • File => Export => Selected Pathways to File
• Import
  • File => Import => Pathways from File
Creating Links to External Databases
• Creating links from a PGDB to external databases
  • To define a new external database:
    • Tools => Ontology Browser
    • View => Browse from new root / type Databases
    • Highlight Databases
    • Frame => Create => Instance
    • Enter frame name, frame edit
    • Enter Common Name, Static-Search-URL (e.g. http://gene.pharma.com/dbquery?)
    • Enter a value for Search-Object-Class (e.g. Proteins)
• Creating links to a PGDB
  • see http://biocyc.org/linking.shtml
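As a rough illustration of what a Static-Search-URL is for, the Python sketch below builds a link by appending an object's external ID to the URL prefix. That the ID is simply appended is an assumption made here, and the function name `external_link` and the ID `P12345` are hypothetical.

```python
from urllib.parse import quote

def external_link(static_search_url: str, object_id: str) -> str:
    """Append the percent-encoded external ID to the URL prefix (assumed behavior)."""
    return static_search_url + quote(object_id, safe="")

# Hypothetical ID combined with the example prefix from the slide.
print(external_link("http://gene.pharma.com/dbquery?", "P12345"))
# http://gene.pharma.com/dbquery?P12345
```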
Constraint Checking
• General rules that constrain the valid relationships among instances
• Constraints are checked when new facts are asserted to assure that the DB remains logically consistent
• Constraints on slots:
  • Domain violation: checks to make sure the slots are in instances of the appropriate class
  • Range violation:
    • value type
    • value cardinality
  • Inverse
  • Cardinality
  • Lisp-predicate
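The domain, value-type and cardinality checks listed above can be sketched in a few lines of Python. This is a simplified illustration, not the Pathway Tools implementation: `SlotConstraint`, `violations` and the slot and class names are hypothetical, and the domain test is an exact class-name match that ignores subclasses.

```python
from dataclasses import dataclass

@dataclass
class SlotConstraint:
    slot: str
    domain: str        # class whose instances may carry the slot
    value_type: type   # required Python type of each value
    max_values: int    # upper bound on the number of values

def violations(constraint: SlotConstraint, instance_class: str, values: list) -> list[str]:
    """Return the names of the constraints that the asserted values break."""
    problems = []
    if instance_class != constraint.domain:
        problems.append("domain")
    if any(not isinstance(v, constraint.value_type) for v in values):
        problems.append("value type")
    if len(values) > constraint.max_values:
        problems.append("cardinality")
    return problems

# Hypothetical constraint: one string-valued EC number per reaction.
ec = SlotConstraint(slot="EC-NUMBER", domain="Reactions", value_type=str, max_values=1)
print(violations(ec, "Reactions", ["1.1.1.1"]))      # []
print(violations(ec, "Compounds", ["1.1.1.1", 42]))  # ['domain', 'value type', 'cardinality']
```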
Consistency Checking (correctify-kb)
• Removes newlines from names
• Converts “<” to “|” in string citations
• Checks isozyme sequence similarity
• Fixes references between polypeptides and genes
• Changes compound names to ids in a variety of slots
• Matches physiological regulators to other regulators
• Cross-references compounds to reactions
• Checks pathway predecessors/reactions/subs
• Checks reaction balancing
• Checks compound structures
• Calculates sub- and super-pathways
• Finds missing sub-pathway links
• Verifies chromosome components and positions
Update your computers!
• To install a patch:
  Tools => Instant Patch => Download and Activate All Patches
Make sure that…
You perform all exercises on the Hb. pylori database, not on your own!!!
Creating New Reactions
Don’t forget to include spaces between chemical names and terms such as “+” and “=”:
1. ascorbate + H2O = 3-keto-L-gulonate
2. 3-keto-L-gulonate + ATP = 3-keto-L-gulonate-6-phosphate + ADP
3. 3-keto-L-gulonate-6-phosphate = L-xylulose-5-phosphate + CO2
4. L-xylulose-5-phosphate = L-ribulose-5-phosphate
5. L-ribulose-5-phosphate = xylulose-5-phosphate
6. xylulose-5-phosphate = D-ribulose-5-phosphate
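One way to see why the spaces matter: compound names can themselves contain “+” (NAD+, for example), so only a “+” or “=” surrounded by spaces can be read safely as a separator. The Python sketch below illustrates this with a hypothetical `parse_equation` helper. It is not the parser used by the Reaction Editor.

```python
def parse_equation(equation: str) -> tuple[list[str], list[str]]:
    """Split 'A + B = C + D' into (substrates, products) on space-delimited separators."""
    if equation.count(" = ") != 1:
        raise ValueError("expected exactly one ' = ' separator")
    left, right = equation.split(" = ")
    substrates = [name.strip() for name in left.split(" + ")]
    products = [name.strip() for name in right.split(" + ")]
    return substrates, products

print(parse_equation("3-keto-L-gulonate + ATP = 3-keto-L-gulonate-6-phosphate + ADP"))
# (['3-keto-L-gulonate', 'ATP'], ['3-keto-L-gulonate-6-phosphate', 'ADP'])
```

Because the split is on " + " rather than "+", a name such as NAD+ survives intact as a single compound.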
Fill in the reaction frame IDs in your handout

Reaction                                                          Frame ID
ascorbate + H2O = 3-keto-L-gulonate                               XXX
3-keto-L-gulonate + ATP = 3-keto-L-gulonate 6-phosphate + ADP
3-keto-L-gulonate 6-phosphate = L-xylulose-5-phosphate + CO2
L-xylulose-5-phosphate = L-ribulose-5-phosphate
L-ribulose-5-phosphate = xylulose-5-phosphate
xylulose-5-phosphate = D-ribulose-5-phosphate
Duplicate Reaction?
• Frame ID of the new reaction to be created. This frame will NOT be created unless you choose “Keep”.
• Frame ID of the existing reaction. This reaction will NOT be transferred into your database until you click “Import”! Record this BEFORE you click “Import”.
Define a New Pathway
• Define the pathway L-ascorbate degradation to D-ribulose-5-phosphate by connecting the reactions together
• Assign class: (Pathways -> Degradation/Utilization/Assimilation -> Carboxylates, Other)
• Add the reactions, connect them, and add a link to the pathway non-oxidative branch of the pentose phosphate pathway (Generation of precursor metabolites and energy => Pentose phosphate pathways =>)
• Add a reverse link from non-oxidative branch of the pentose phosphate pathway to the new pathway
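Connecting the reactions amounts to chaining them so that a product of one step is a substrate of the next. The Python sketch below orders the six exercise reactions that way, starting from ascorbate. It only illustrates the idea, not how the Pathway Editor works internally. The helper names are hypothetical, and it assumes a strictly linear pathway in which each compound is consumed by at most one reaction.

```python
EQUATIONS = [
    "ascorbate + H2O = 3-keto-L-gulonate",
    "3-keto-L-gulonate + ATP = 3-keto-L-gulonate-6-phosphate + ADP",
    "3-keto-L-gulonate-6-phosphate = L-xylulose-5-phosphate + CO2",
    "L-xylulose-5-phosphate = L-ribulose-5-phosphate",
    "L-ribulose-5-phosphate = xylulose-5-phosphate",
    "xylulose-5-phosphate = D-ribulose-5-phosphate",
]

def split_sides(equation: str) -> tuple[list[str], list[str]]:
    """Return (substrates, products) for an equation written with spaced separators."""
    left, right = equation.split(" = ")
    return left.split(" + "), right.split(" + ")

def order_chain(equations: list[str], start: str) -> list[str]:
    """Order reactions so each step consumes a product of the previous step."""
    remaining = list(equations)
    ordered = []
    current = start
    while current is not None:
        step = next((eq for eq in remaining if current in split_sides(eq)[0]), None)
        if step is None:
            break
        remaining.remove(step)
        ordered.append(step)
        # Follow the product that some remaining reaction consumes, if any.
        current = next(
            (p for p in split_sides(step)[1]
             if any(p in split_sides(eq)[0] for eq in remaining)),
            None,
        )
    return ordered

# Feed the reactions in reverse order; the chain is recovered from ascorbate.
for equation in order_chain(list(reversed(EQUATIONS)), "ascorbate"):
    print(equation)
```

Side compounds such as ADP and CO2 are skipped automatically because no remaining reaction consumes them, which loosely mirrors choosing the main compounds of a reaction.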
Run (correctify-kb)
• Open the database Hb. pylori (HypCyc) (so 'hyp)
• Run (correctify-kb)
• Analyze output