Transcript Introduction
Advanced Editing of Pathway/Genome Databases
Ron Caspi SRI International Bioinformatics
General Curation
User Preferences
Create and Use Author and Organization Frames
Using the Text Editor
● Formatting: italic, bold
● Know your α, β
● Text wrapper: need two newlines to force a new paragraph
● Text wrapper: never leave empty spaces at the end of a line
● An internal link to a reaction frame will print the reaction equation
● To print an enzymatic activity name, use an internal link to the enzymatic activity frame ID, not the enzyme frame ID (important when an enzyme is multifunctional, e.g. CPLX-6934).
● When providing multiple citations, use |CITS:[PMID1][PMID2]| (rather than |CITS:[PMID1]|, |CITS:[PMID2]|).
● Special characters:
   ● Ångström: Å
   ● Degree: °
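The trailing-space rule and the combined-citation rule above are mechanical enough to check before text is pasted into a comment field. The following is a minimal Python sketch, assuming the comment text is prepared as a plain string outside Pathway Tools. The helper names (`combine_citations`, `merge_adjacent_citations`, `strip_trailing_spaces`) are hypothetical and are not part of Pathway Tools.

```python
import re

# One |CITS:[id][id]...| construct, as described in the citation bullet above.
CITS = r"\|CITS:(?:\[[^\]\|]+\])+\|"
# Two or more CITS constructs separated only by optional commas and whitespace.
RUN = re.compile(rf"{CITS}(?:\s*,?\s*{CITS})+")


def combine_citations(ids):
    """Build a single |CITS:[id1][id2]| construct from several citation IDs."""
    return "|CITS:" + "".join(f"[{i}]" for i in ids) + "|"


def merge_adjacent_citations(text):
    """Collapse runs such as |CITS:[1]|, |CITS:[2]| into |CITS:[1][2]|."""
    def merge(match):
        ids = re.findall(r"\[([^\]\|]+)\]", match.group(0))
        return combine_citations(ids)
    return RUN.sub(merge, text)


def strip_trailing_spaces(text):
    """Remove spaces and tabs at line ends; blank lines (paragraph breaks) are kept."""
    return "\n".join(line.rstrip(" \t") for line in text.split("\n"))
```

For example, `merge_adjacent_citations("as shown |CITS:[PMID1]|, |CITS:[PMID2]|.")` yields a single combined construct, and `strip_trailing_spaces` leaves the double newline between paragraphs untouched.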
Use Internal Hyperlinks
Use Variant Classes
Example: putrescine biosynthesis
Hypothetical Reactions, Excluded Enzymes
Specified in the Pathway Info Editor
● Enzymes Not Used: useful when an enzyme is associated with a reaction, but does not participate in a specific pathway. For example, a catabolic enzyme in a biosynthetic pathway (e.g. EC 2.1.3.3, ornithine carbamoyltransferase)
● Hypothetical reactions: useful when a pathway step is proposed, but has not been proven
Super Pathways
● Need to keep pathways within well-defined end points
● Link pathways to upstream or downstream pathways with pathway links
● Keep pathways simple
● Create more complex metabolic networks using superpathways
● Example: superpathway of aromatic compound degradation (aerobic) is composed of:
   ● catechol degradation II
   ● mandelate degradation I
   ● benzoate degradation (aerobic)
   ● β-ketoadipate degradation
   ● protocatechuate degradation II
   ● shikimate degradation
   ● quinate degradation
   ● 4-hydroxymandelate degradation
   ● tryptophan degradation I
Advanced Curation
Using the Frame Editor
The frame editor is powerful, but dangerous… Use it when there are no alternatives.
Examples:
● Renaming frames
● Modified proteins
● Modifying dates of author credits
● Replacing an enzyme or reaction in an enzymatic-reaction frame
● Removing mistakes from pathway frames, such as predecessor pairs that the software ignores
● Removing duplicated values from slots that should only have a single value (OFFICIAL-EC?)
● Investigating orphan enzymatic reaction frames reported by the Consistency Checker
Protein complexes
Adenosylmethionine decarboxylase is first synthesized as a proenzyme, and then self cleaves into two smaller polypeptides. Each cleavage product forms a homotetramer, and the two complexes form a heterooctamer.
A combination of editors enables creation of such multi-level complexes.
Tutorial: Creating Protein Complexes
Classes and Instances
● Instance frames describe specific objects (e.g. a specific gene)
● Class frames describe general types of biological objects (e.g. the class of all genes)
● Proteins that are substrates of MetaCyc reactions are classes
● Every compound with an “R” in its structure should be a class
Converting an existing compound instance to a class
Modification of MetaCyc classes is considered a schema change, and will be overwritten during the next update!
Only use this procedure to correct curation errors that were introduced in your PGDB!
● Open the compound editor
● Click “Convert to Class” and exit
● Rename the frame to follow class name convention (if necessary)
● Modify the common name to start with “a”
The Ontology Editor
The Ontology Editor
● Changing parent classes
● Adding parent classes
● Creating new classes to improve ontology
Tutorial: the Ontology Editor
The Consistency Checker
Consistency checking should be performed routinely (every few months), and detected problems should be addressed.
Consistency Checker – Automatic Tasks
Bad Links
MetaCyc pathways are extensively linked to other pathways. When new PGDBs are created by PathoLogic, these links are still there, even if they point to pathways that are not present in the new PGDB. These links are only removed by the Consistency Checker.
Consistency Checker – Manual Tasks
Example: create an empty |FRAME: | construct, then run the task “Check Frame References”
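Outside Pathway Tools, the same kind of malformed reference can be spotted in comment text with a simple pattern match. The following is a minimal Python sketch, assuming the comment text is available as a plain string. `find_empty_frame_links` is a hypothetical helper and does not reproduce what the “Check Frame References” task actually does.

```python
import re

# A |FRAME: | construct whose frame ID is missing (only whitespace before the closing bar).
EMPTY_FRAME = re.compile(r"\|FRAME:\s*\|")


def find_empty_frame_links(text):
    """Return the character offsets of |FRAME: | constructs that contain no frame ID."""
    return [m.start() for m in EMPTY_FRAME.finditer(text)]
```

A construct that names a frame is not reported; only the empty form is.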
Exporting Pathways Between PGDBs
● To export a pathway to a file (optional inclusion of enzymes and genes):
   ● Edit → Add Pathway to File Export List
   ● File → Export → Selected Pathways to Lisp-format File
● To import a pathway from file:
   ● File → Import → Pathways from File
● To export a pathway directly to another PGDB (both PGDBs must be installed on the same system):
   ● Edit → Export Pathway to DB
Moving Objects Between PGDBs
The following commands will import a frame from MetaCyc to EcoCyc. Both databases must be open before this will work.
● (import-compounds '(CPD-ID) (kb-of-organism 'meta) (kb-of-organism 'ecoli))
● (import-reactions '(ID-RXN) (kb-of-organism 'meta) (kb-of-organism 'ecoli))
● (import-proteins '(ID-MONOMER) (kb-of-organism 'meta) (kb-of-organism 'ecoli))
Exporting Graphics
● You can save any screen as a vector-based PostScript file by using File → Print
● The PS files are easily converted to PDF by Adobe Distiller (part of the Acrobat Pro package)
● Graphics programs like Corel Draw or Illustrator can open the PDF files and let you manipulate the graphics
● The software also generates two posters: the cellular overview and the genome poster. Those are also generated in PostScript format.
Creating Links to External Databases
● To define a new external database link:
   ● File → Create → External Database Description
   ● Enter frame name
   ● Fill fields as shown in next slide
● To edit an existing link:
   ● Right-click on a link (from a Navigator page), and select “Edit External Database Info”
● Creating links to a PGDB: see http://biocyc.org/linking.shtml
External Database Editor
Polymerization
Example: folate polyglutamylation I
The Pathway Registry
The Registry – Schema Upgrades
The Registry – Uploading Your PGDB
The process of uploading a PGDB to the Registry is largely automated. See “Publishing PGDBs in the Registry” in the User Guide for details.