Introduction
Editing Pathway/Genome Databases
Ron Caspi
Compounds, Reactions and Pathways
Activate Editing Mode
Type (enable/disable-editors t) at the listener pane
SRI International
Bioinformatics
Why Curation is Important!
Curators need jobs
“In silico” information less solid than experimental evidence
Database curation greatly enhances the usefulness of the data
Pathway Tools Paradigms
Separate database from user interface
Navigator provides one interface to the DB
Editors provide an alternative interface to the DB
• Reuse information whenever possible!
• A PGDB should not describe the same biological or chemical entity more than once
• A tool helps to prevent creation of duplicate reactions
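One way a tool could catch duplicate reactions is to compare equations in a canonical form that ignores the order of the compounds and the direction in which the reaction is written. The sketch below is a hypothetical Python illustration only, not how Pathway Tools implements its check; `canonical_key`, `find_duplicate`, and the frame ID `RXN-1` are names invented for this example, and stoichiometric coefficients are ignored.

```python
def canonical_key(substrates, products):
    """Return an order- and direction-independent key for a reaction."""
    left = frozenset(substrates)
    right = frozenset(products)
    # A frozenset of the two sides makes "A = B" and "B = A" compare equal.
    return frozenset([left, right])


def find_duplicate(existing, substrates, products):
    """Return the ID of an existing reaction with the same equation, or None.

    `existing` maps a (hypothetical) frame ID to a (substrates, products) pair.
    """
    key = canonical_key(substrates, products)
    for frame_id, (subs, prods) in existing.items():
        if canonical_key(subs, prods) == key:
            return frame_id
    return None


existing = {"RXN-1": (["ascorbate", "H2O"], ["3-keto-L-gulonate"])}
# Same reaction written backwards, with the compounds in a different order.
same_reversed = find_duplicate(existing, ["3-keto-L-gulonate"], ["H2O", "ascorbate"])
# A different equation (no water) is not reported as a duplicate.
different = find_duplicate(existing, ["ascorbate"], ["3-keto-L-gulonate"])
```

Reusing the existing frame when a match is found keeps each biological entity described only once in the PGDB.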
Editing rules: Support Policy
Do not modify the EcoCyc or MetaCyc datasets
Do not alter DB schema
e.g. do not add or remove classes or slots
List of Editors
Compound Editor
Compound Structure Editors
Reaction Editor
Synonym Editor
Publication Editor
Pathway Editor and Pathway Info Editor
Protein/Subunit structure/Enzymatic Reaction Editors
Gene Editor
Intron Editor (Eukaryotes only)
Transcription Unit Editor
Frame Editor
Relationships Editor
Ontology Editor
Saving Changes
The user must save changes explicitly
File => Save Current DB
Save DB button
List Unsaved Changes in Current DB
Revert Current DB
Checkpoint Current DB Updates to File
Restore Updates from Checkpoint File
Other DB commands under the File menu
Summarize databases
Summarize current organism
Refresh DB list
Refresh All Current DBs
Delete a DB
Attempt to Reconnect to Database Server
Invoking the Editors
1. New Object: Use the “New” command
2. Existing Object: Right-Click on the Object Handle
Compound Editor
Create or edit a compound
Specify Class
Common Name and Synonyms
Comments, citations
Links to other DBs
The Synonym Editor
Lets you easily edit the synonyms and set the common name
Citations
Citation boxes
The CITS field
File => Import Citations from Pubmed
Publication Editor (invoke by right-clicking on a citation at the bottom)
Non-PubMed citation: enter it in a citation box in the form Smith06, then invoke the editor by clicking out of the citation box.
More Compound Editing
Compound Structure Editors (Marvin, JME)
Mol files
Exporting to other DBs
Merging
Duplicate Frame and Edit
Reaction Editor
Enter or edit a reaction equation
EC number (official?)
Check for balance
Compound Resolver
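To make the "Check for balance" step concrete, here is a minimal Python sketch of an element-balance check. It is an illustration under stated assumptions, not the Reaction Editor's actual algorithm: the function names are invented for this example, only simple formulas without parentheses or charges are handled, and the neutral-acid formulas used for ascorbate (C6H8O6) and 3-keto-L-gulonate (C6H10O7) are assumptions made for the demonstration.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Count atoms in a simple formula such as 'C6H8O6' (no parentheses or charges)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def side_total(side):
    """Sum atom counts over (coefficient, formula) pairs on one side of the equation."""
    total = Counter()
    for coefficient, formula in side:
        for element, n in parse_formula(formula).items():
            total[element] += coefficient * n
    return total


def is_balanced(left, right):
    """True when both sides of the equation contain the same atoms."""
    return side_total(left) == side_total(right)


# ascorbate + H2O = 3-keto-L-gulonate, using assumed neutral formulas
balanced = is_balanced([(1, "C6H8O6"), (1, "H2O")], [(1, "C6H10O7")])
# Forgetting the water leaves hydrogen and oxygen unaccounted for.
unbalanced = is_balanced([(1, "C6H8O6")], [(1, "C6H10O7")])
```

A check of this kind can only run when every compound has a structure or formula in the database, which is one reason to resolve compound names to existing frames first.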
Pathway Info Editor
• Class (variant class)
• Common Name
• Synonyms
• Evidence code
• Citations (CIT)
• Comments
• External Links
• Hypothetical reactions
• Author credits
Pathway Editor
Graphically create and modify pathways
Two tools:
Connections Editor: add reactions one by one
Segment Editor: enter a linear pathway segment
Connections Editor Operations
Two main display panes:
left: unconnected pathway reactions
right: draws connected reactions (looks like the regular Pathway display window)
Connecting reactions:
select initial reaction (in either pane) ===> red and green reactions
select a green reaction
Useful Commands:
choose main compounds for reaction
disconnect all reactions
In circular pathways, specify which compound should be at the top
Add links to other pathways, reactions, or comments
Pathway Segment Editor
To enter a linear sequence of reactions simultaneously
Reactions are specified by EC numbers or reaction substrates
One segment may contain up to 7 reactions
Pathway Editor Limitations
Complex situations can cause ambiguity:
link may be ignored
dialog box for disambiguating
pathway drawn in bizarre arrangement
Fix: try removing the offending link and adding links in a different order
Pathway Editor does not handle polymerization pathways easily.
Evidence Codes for Pathways
http://brg.ai.sri.com/ptools/evidence-ontology.html
EV-COMP: Inferred from computation
  HINF - Human inference
  AINF - Artificial inference
EV-AS: Author statement
  TAS - traceable
  NAS - non-traceable
EV-IC: Inferred by curator
EV-EXP: Inferred from experiment
  IDA - inferred from direct assay
  IPI - inferred from physical interaction
  TAS - inferred from traceable data (review)
  IEP - inferred from expression pattern
  IGI - inferred from genetic interaction
  IMP - inferred from mutant phenotype
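The two-level structure of the evidence codes can be captured in a small lookup table. The Python sketch below is illustrative only: the descriptions are copied from the list above, while the `describe` helper and the convention of joining a top-level code and a sub-code with a hyphen (for example `EV-EXP-IDA`) are assumptions of this example; consult the evidence ontology page for the authoritative names.

```python
TOP_LEVEL = {
    "EV-COMP": "Inferred from computation",
    "EV-AS": "Author statement",
    "EV-IC": "Inferred by curator",
    "EV-EXP": "Inferred from experiment",
}

SUB_CODES = {
    "EV-COMP": {"HINF": "Human inference", "AINF": "Artificial inference"},
    "EV-AS": {"TAS": "traceable", "NAS": "non-traceable"},
    "EV-EXP": {
        "IDA": "inferred from direct assay",
        "IPI": "inferred from physical interaction",
        "TAS": "inferred from traceable data (review)",
        "IEP": "inferred from expression pattern",
        "IGI": "inferred from genetic interaction",
        "IMP": "inferred from mutant phenotype",
    },
}


def describe(code):
    """Return a readable description for a code, or None if it is unknown."""
    if code in TOP_LEVEL:
        return TOP_LEVEL[code]
    for parent, subs in SUB_CODES.items():
        prefix = parent + "-"  # assumed joining convention, e.g. EV-EXP-IDA
        sub = code[len(prefix):]
        if code.startswith(prefix) and sub in subs:
            return f"{TOP_LEVEL[parent]}: {subs[sub]}"
    return None


direct_assay = describe("EV-EXP-IDA")
curator = describe("EV-IC")
unknown = describe("EV-XYZ")
```

Note that TAS appears under both EV-AS and EV-EXP, so the sub-code alone is not enough to identify the evidence type.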
Enzyme/Protein Editors
To add an enzyme to a reaction:
Right click the reaction, choose Edit => Create/Add enzyme.
“Choose Protein”: specify ID, or “Search by genes or create new protein”
=> Protein subunit structure editor
Protein Editor
Check the Curator Guide at
http://bioinformatics.ai.sri.com/ptools/curatorsguide.pdf
Protein Editor
Enzymatic Reaction Editor
Protein Subunit Editor
Super Pathways
Need to keep pathways within well-defined end points
Link pathways to upstream or downstream pathways with pathway links.
Create more complex metabolic networks using superpathways
Example: superpathway of aromatic compound degradation (aerobic) is composed of:
catechol degradation II
mandelate degradation I
benzoate degradation (aerobic)
β-ketoadipate degradation
protocatechuate degradation II
shikimate degradation
quinate degradation
4-hydroxymandelate degradation
tryptophan degradation I
Pathway Export
Export
Edit => Add Pathway to File Export List
File => Export => Selected Pathways to File
Import
File => Import => Pathways from File
Creating Links to External Databases
Creating links from a PGDB to external databases
To define a new external database:
Tools => Ontology Browser
View => Browse from new root / type Databases
Highlight Databases
Frame => Create => Instance
Enter frame name, frame edit
Enter Common Name, Static-Search-URL
e.g. http://gene.pharma.com/dbquery?
Enter a value for Search-Object-Class (e.g. Proteins)
Creating links to a PGDB
see http://biocyc.org/linking.shtml
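As a rough illustration of how a Static-Search-URL might be used, the Python sketch below builds a link by appending an object's ID to the URL prefix. This is an assumption about the mechanism made for illustration, not a description of Pathway Tools internals; `external_link` and the ID `P12345` are hypothetical, and the URL is the example from the slide.

```python
from urllib.parse import quote


def external_link(static_search_url, object_id):
    """Append a URL-encoded object ID to a Static-Search-URL prefix."""
    # safe="" also encodes "/" so the ID cannot alter the URL structure.
    return static_search_url + quote(object_id, safe="")


link = external_link("http://gene.pharma.com/dbquery?", "P12345")
encoded = external_link("http://gene.pharma.com/dbquery?", "a b/c")
```

Under this assumption the prefix must end exactly where the ID belongs, which is why the example URL ends with "?".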
Constraint Checking
General rules that constrain the valid relationships among instances
Constraints are checked when new facts are asserted to ensure that the DB remains logically consistent
Constraints on slots:
Domain violation: checks that slots appear only in instances of the appropriate class
Range violation:
value type
value cardinality
Inverse
Cardinality
Lisp-predicate
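The domain, value-type, and cardinality checks can be pictured with a small Python sketch. It is a simplified illustration, not the constraint checker used by Pathway Tools: `SlotConstraint`, `check_slot`, the `Reaction` and `Compound` classes, and the slot name `EC-NUMBER` are all hypothetical, and inverse and Lisp-predicate constraints are not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SlotConstraint:
    domain: type                      # class whose instances may carry the slot
    value_type: type                  # required type of each value (range)
    max_values: Optional[int] = None  # cardinality limit, None for unlimited


def check_slot(instance, slot, values, constraint):
    """Return a list of violation messages for one slot assignment."""
    problems = []
    if not isinstance(instance, constraint.domain):
        problems.append(f"domain violation: {slot} not allowed on {type(instance).__name__}")
    for value in values:
        if not isinstance(value, constraint.value_type):
            problems.append(f"range violation: {value!r} is not a {constraint.value_type.__name__}")
    if constraint.max_values is not None and len(values) > constraint.max_values:
        problems.append(f"cardinality violation: {slot} allows at most {constraint.max_values} value(s)")
    return problems


class Reaction:
    pass


class Compound:
    pass


ec_number = SlotConstraint(domain=Reaction, value_type=str, max_values=1)
ok = check_slot(Reaction(), "EC-NUMBER", ["1.1.1.1"], ec_number)
bad = check_slot(Compound(), "EC-NUMBER", ["1.1.1.1", 42], ec_number)
```

Running the checks at assertion time, as the slide describes, means an invalid value is reported before it is stored rather than discovered later.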
Consistency Checking (correctify-kb)
Removes newlines from names
Converts “<” to “|” in string citations
Checks isozyme sequence similarity
Fixes references between polypeptides and genes
Changes compound names to ids in a variety of slots
Matches physiological regulators to other regulators
Cross-references compounds to reactions
Checks pathways predecessors/reactions/subs
Checks reaction balancing
Checks compound structures
Calculates sub- and super-pathways
Finds missing sub-pathway links
Verifies chromosome components and positions
Update your computers!
To install a patch:
Tools => Instant Patch => Download and Activate All Patches
Make sure that…
You perform all exercises on the Hb. pylori database, not on your own!!!
Creating New Reactions
Don’t forget to include spaces between chemical names and terms such as “+” and “=”:
1. ascorbate + H2O = 3-keto-L-gulonate
2. 3-keto-L-gulonate + ATP = 3-keto-L-gulonate-6-phosphate + ADP
3. 3-keto-L-gulonate-6-phosphate = L-xylulose-5-phosphate + CO2
4. L-xylulose-5-phosphate = L-ribulose-5-phosphate
5. L-ribulose-5-phosphate = xylulose-5-phosphate
6. xylulose-5-phosphate = D-ribulose-5-phosphate
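The spaces matter because chemical names can themselves contain "+" and "-" characters (for example NAD+ or 3-keto-L-gulonate), so only a separator surrounded by spaces is unambiguous. The Python sketch below shows one way to split an equation on spaced separators; it is a hypothetical illustration, not the Reaction Editor's actual parser, and `parse_equation` is a name invented here.

```python
def parse_equation(equation):
    """Split 'A + B = C + D' into (substrates, products) using spaced separators."""
    if equation.count(" = ") != 1:
        raise ValueError("expected exactly one ' = ' surrounded by spaces")
    left, right = equation.split(" = ")
    substrates = [name.strip() for name in left.split(" + ")]
    products = [name.strip() for name in right.split(" + ")]
    return substrates, products


subs, prods = parse_equation(
    "3-keto-L-gulonate + ATP = 3-keto-L-gulonate-6-phosphate + ADP"
)
# Names ending in "+" survive because only " + " is treated as a separator.
nad_subs, nad_prods = parse_equation("ethanol + NAD+ = acetaldehyde + NADH + H+")
```

Without the spaces, an input such as `ethanol+NAD+=acetaldehyde+NADH+H+` cannot be split reliably, and this sketch rejects it with a `ValueError`.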
Fill in the reaction frame IDs in your handout
Reaction                                                         Frame ID
ascorbate + H2O = 3-keto-L-gulonate                              XXX
3-keto-L-gulonate + ATP = 3-keto-L-gulonate 6-phosphate + ADP
3-keto-L-gulonate 6-phosphate = L-xylulose-5-phosphate + CO2
L-xylulose-5-phosphate = L-ribulose-5-phosphate
L-ribulose-5-phosphate = xylulose-5-phosphate
xylulose-5-phosphate = D-ribulose-5-phosphate
Duplicate Reaction?
Frame ID of the new reaction to be created. This frame will NOT be created unless you choose “Keep”
Frame ID of the existing reaction. This reaction will NOT be transferred into your database until you click “Import”!
Record this BEFORE you click “Import”
Define a New Pathway
Define the pathway L-ascorbate degradation to D-ribulose-5-phosphate by connecting the reactions together
Assign class: (Pathways -> Degradation/Utilization/Assimilation -> Carboxylates, Other)
Add the reactions, connect them, and add a link to the pathway non-oxidative branch of the pentose phosphate pathway (Generation of precursor metabolites and energy => Pentose phosphate pathways =>)
Add a reverse link from non-oxidative branch of the pentose phosphate pathway to the new pathway
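Connecting the reactions amounts to lining them up so that the main product of one is the main substrate of the next. The Python sketch below orders the six exercise reactions that way. It is only an illustration of the idea, not what the Connections Editor does internally: `order_reactions` is a hypothetical helper, and the main compounds were chosen by hand here (side compounds such as H2O, ATP, ADP, and CO2 are left out).

```python
def order_reactions(main_pairs):
    """Order (main substrate, main product) pairs into one linear chain of compounds."""
    by_substrate = {substrate: product for substrate, product in main_pairs}
    products = set(by_substrate.values())
    # The chain starts at the one substrate that no reaction produces.
    starts = [s for s in by_substrate if s not in products]
    if len(starts) != 1:
        raise ValueError("reactions do not form a single linear chain")
    chain = [starts[0]]
    # The length guard stops the walk if the reactions loop back on themselves.
    while chain[-1] in by_substrate and len(chain) <= len(main_pairs):
        chain.append(by_substrate[chain[-1]])
    if len(set(chain)) != len(main_pairs) + 1:
        raise ValueError("reactions do not form a single linear chain")
    return chain


# The six exercise reactions, deliberately listed out of order.
pairs = [
    ("L-ribulose-5-phosphate", "xylulose-5-phosphate"),
    ("ascorbate", "3-keto-L-gulonate"),
    ("3-keto-L-gulonate-6-phosphate", "L-xylulose-5-phosphate"),
    ("xylulose-5-phosphate", "D-ribulose-5-phosphate"),
    ("3-keto-L-gulonate", "3-keto-L-gulonate-6-phosphate"),
    ("L-xylulose-5-phosphate", "L-ribulose-5-phosphate"),
]
chain = order_reactions(pairs)
```

The resulting chain runs from ascorbate to D-ribulose-5-phosphate, the end point at which the link to the non-oxidative branch of the pentose phosphate pathway is attached.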
Run (correctify-kb)
Open the database Hb. pylori (HypCyc) (so 'hyp)
Run (correctify-kb)
Analyze output