Handling ligands with PRODRG
Daan van Aalten
Division of Biological Chemistry and Drug Discovery
College of Life Sciences
PRODRG - why?
• Early 1990s - no software to generate
topologies for non-macromolecular entities
• Manual topology generation is time
consuming and error prone (but instructive)
• Small molecule coordinate generators
essentially only commercially available
PRODRG - why?
• For small molecules, we need to go from
imagination/some chemical info to a correct
topology and optimised coordinates in
seconds
+ Topologies for SHELX,
REFMAC5, CNS, O,
TNT, …
PRODRG - why?
Citrate (1AJ8; 1997; 1.9 Å)
NADP+ (1DDI; 1999; 2.5 Å)
Cyclohexylamine (1PPA; 1991; 2.0 Å)
Diphosphate (1N5L; 2002; 2.3 Å)
Sulphate (1DW9; 1999; 1.7 Å)
Ethylene glycol (1JKV; 2001; 1.4 Å)
PRODRG History
• Version 1 (1995)
– Started as a DRuG PROgram in GROMOS87
– Takes PDB file and generates ‘MOLDES’
(SMILES-like 1D string) and MD topologies
• Version 2 (2004)
– Many additional input formats
– Many additional output formats, including
topologies for crystallographic software
• Version 2.5 (2005)
– Internal all-atom representation
PRODRG History
• Details covered in two publications
• Webserver (~300 runs/day) with short FAQ
PRODRG Guts
• Essentially FORTRAN (30000 lines) with
some supporting C (5000 lines)
• Compiles well on all major platforms
• Few dependencies (GROMACS for
coordinate generation)
What is PRODRG?
• Generates information about small molecules
• Input (molecular description):
– PDB file
– Molfile
– Human
• PRODRG generates:
– Atomic coordinates
– Chemical types
– Connectivity
– Bond orders / aromaticity
– Hybridisation
– Formal charges
– Atomic charges
– Force field parameters
– Hydrogen atoms
– Free torsions
– Hydrogen bonding
• Output feeds into:
– Model building & refinement
– Molecular dynamics
– DB lookups & property prediction
– Docking & analysis
– Visualisation
How does PRODRG work?
• Fixed order of steps is bad
• Input analysis is rather rude:
– Deletes hydrogens
– Ignores bond order information
1. Analysis of input
2. Initial data gathering
3. Addition of hydrogens
4. Atom reordering
5. Topology generation
6. Formal and partial charges
7. Additional molecule data
8. Output
How does PRODRG work?
• Most steps use ‘chemical pattern matching’
• Example: hydrogen generation
Add 1+sp(x)-ncon(x) hydrogens
Do nothing
Add 1 hydrogen
How does PRODRG work?
• Currently all Hs generated by 17 ‘rules’
• Chemical knowledge in data, not code
– More flexible
– Potentially user-configurable
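A minimal sketch of what "chemical knowledge in data, not code" could look like for hydrogen generation. Only the count 1+sp(x)-ncon(x) and the actions "do nothing" and "add 1 hydrogen" come from the slide; the patterns each action is attached to, the Atom class and the function name are hypothetical and are not PRODRG's actual 17 rules.

```python
from dataclasses import dataclass

@dataclass
class Atom:
    element: str  # element symbol
    sp: int       # hybridisation: 3 for sp3, 2 for sp2, 1 for sp
    ncon: int     # number of non-hydrogen connections

# Rules are plain data: (description, pattern, hydrogen count).
# The patterns are illustrative assumptions, not PRODRG's real rule set.
RULES = [
    ("carbon: add 1+sp(x)-ncon(x) hydrogens",
     lambda a: a.element == "C",
     lambda a: 1 + a.sp - a.ncon),
    ("halogen: do nothing",
     lambda a: a.element in {"F", "Cl", "Br", "I"},
     lambda a: 0),
    ("sp3 oxygen with one neighbour: add 1 hydrogen",
     lambda a: a.element == "O" and a.sp == 3 and a.ncon == 1,
     lambda a: 1),
]

def hydrogens_to_add(atom):
    """Return the hydrogen count from the first rule whose pattern matches."""
    for _description, matches, count in RULES:
        if matches(atom):
            return max(0, count(atom))
    return 0  # no rule matched: leave the atom alone

methyl = Atom("C", sp=3, ncon=1)
aromatic_ch = Atom("C", sp=2, ncon=2)
print(hydrogens_to_add(methyl), hydrogens_to_add(aromatic_ch))
```

Because the rules live in a list, adding or changing one does not touch the matching loop, which is the flexibility the slide points to.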
Limitations
• Supported atom types limited
– C,H,N,O,P,S,F,Cl,Br,I only
• Other chemical limitations
– No more than 4 connections/atom
– Standard version limited to <=300 atoms
• Ignoring hydrogens and bond types may
lead to unexpected results
• (Apolar hydrogens as second-class atoms)
• SMILES not yet implemented (but trivial)
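The chemical limits above can be checked before submitting a molecule. The following is a sketch under stated assumptions: the function name and the input representation (a list of element symbols plus index pairs for bonds) are hypothetical, and whether the 300-atom limit counts hydrogens is not stated on the slide.

```python
# Limits taken from the slide; everything else here is illustrative.
SUPPORTED_ELEMENTS = {"C", "H", "N", "O", "P", "S", "F", "Cl", "Br", "I"}
MAX_CONNECTIONS = 4
MAX_ATOMS = 300  # assumption: applied to all atoms passed in

def check_limits(elements, bonds):
    """Return a list of human-readable problems; an empty list means OK.

    elements: list of element symbols, one per atom
    bonds:    list of (i, j) atom index pairs
    """
    problems = []
    if len(elements) > MAX_ATOMS:
        problems.append(f"{len(elements)} atoms exceeds the limit of {MAX_ATOMS}")
    # Count connections per atom from the bond list.
    degree = [0] * len(elements)
    for i, j in bonds:
        degree[i] += 1
        degree[j] += 1
    for index, (element, connections) in enumerate(zip(elements, degree)):
        if element not in SUPPORTED_ELEMENTS:
            problems.append(f"atom {index}: element {element} is not supported")
        if connections > MAX_CONNECTIONS:
            problems.append(f"atom {index}: {connections} connections (max {MAX_CONNECTIONS})")
    return problems

# Sulphur hexafluoride has six connections on one atom.
print(check_limits(["S"] + ["F"] * 6, [(0, i) for i in range(1, 7)]))
```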
Basic usage: web server
• Four easy steps:
1. Go to http://davapc1.bioch.dundee.ac.uk/programs/prodrg
2. Paste input
3. Edit settings
– Chirality restraints?
– Reduced charges?
– Coordinates?
4. Run it
Success!
PRODRG inputs
• PDB coordinates
• MDL molfile
• MOLDES (SMILES-like 1D string)
• JME editor (web server)
• “TEXT” input
Text drawings
• Atoms represented by their element symbols
• Connected by bonds
– Single: - or |
– Double: = or "
– Triple: #
• Change case of symbol to invert chirality
N   C-C
|   " "
C-C-C C-O
|   | |
C=O C=C
|
O
D-Tyr
N   C-C
|   " "
c-C-C C-O
|   | |
C=O C=C
|
O
L-Tyr
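A sketch of how such a text drawing could be read into atoms and bonds, using a tyrosine laid out on a character grid. This is not PRODRG's parser: it assumes single-letter element symbols, horizontal bonds written with - or =, vertical bonds written with | or ", and a lowercase symbol as the chirality-inversion flag. The function name and return format are hypothetical.

```python
HORIZONTAL = {"-": 1, "=": 2, "#": 3}  # bond characters between left/right atoms
VERTICAL = {"|": 1, '"': 2, "#": 3}    # bond characters between upper/lower atoms

def parse_text_drawing(drawing):
    """Return (symbols, inverted, bonds) for a grid-style text drawing.

    symbols:  element symbol per atom, in reading order
    inverted: True where the symbol was lowercase (chirality inverted)
    bonds:    (atom_index, atom_index, bond_order) triples
    """
    rows = drawing.splitlines()
    position_to_atom = {}
    symbols, inverted = [], []
    for r, row in enumerate(rows):
        for c, ch in enumerate(row):
            if ch.isalpha():
                position_to_atom[(r, c)] = len(symbols)
                symbols.append(ch.upper())
                inverted.append(ch.islower())
    bonds = []
    for r, row in enumerate(rows):
        for c, ch in enumerate(row):
            left = position_to_atom.get((r, c - 1))
            right = position_to_atom.get((r, c + 1))
            up = position_to_atom.get((r - 1, c))
            down = position_to_atom.get((r + 1, c))
            if ch in HORIZONTAL and left is not None and right is not None:
                bonds.append((left, right, HORIZONTAL[ch]))
            elif ch in VERTICAL and up is not None and down is not None:
                bonds.append((up, down, VERTICAL[ch]))
    return symbols, inverted, bonds

TYR = "\n".join([
    'N   C-C',
    '|   " "',
    'c-C-C C-O',
    '|   | |',
    'C=O C=C',
    '|',
    'O',
])

symbols, inverted, bonds = parse_text_drawing(TYR)
print(len(symbols), "atoms,", len(bonds), "bonds")
```

Keeping the inversion flag separate from the element symbol means the connectivity is identical for both enantiomers, and only the flag differs.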
PRODRG outputs
• PDB (generated/minimised) coordinates (with/without
hydrogens, with proper atom names for
protein/sugars/DNA), but GIGO principle applies
• Quality control on input coordinates vs topology
• WHAT IF topology - accurate protein-ligand H-bonds
• CNS/REFMAC/TNT/SHELX topology (including
PTM amino acid building blocks)
• GROMOS/GROMACS/OPLS topologies
• Consistent topology from crystal -> publication
Helping (or kicking) PRODRG
• Additional commands/hints in input file:
– PATCH (hybridisation)
– INSHYD and DELHYD
– PATCH (chirality)
– PATCH (torsions)
– CPNAME
Hybridisation hints
PATCH <atom> <number>
• Useful if the analysis of the PDB input did not quite work
• Lets you nudge PRODRG in the right direction:
O
"
C=C-C
|
|
C-C=N
"
O
PRODRG> WARNING: multiplicity of generated molecule is not 1.
PRODRG> WARNING: bond type assignment failed at CAF .
PATCH NAG 21
Adding/removing hydrogens
INSHYD <atom>
DELHYD <atom>
• Lets you override the default protonation
• Often not actually what you want
C-C=O
  |
  O
INSHYD OAD
PRODRG> Cannot assign type to atom ' OAD'.
ERRDRG> Error in GROMOS atom names/types.
PRODRG> Drug topology not made, sorry!
PATCH OAD 3
Modifying chirality
PATCH <atom> -1
• Inverts stereocenter <atom>, useful for
PDB input
PATCH <atom> <pattern>
• ‘Absolute’ chirality
for certain classes
of molecules
N   C-C
|   " "
C-C-C C-O
|   | |
C=O C=C
|
O
PATCH CA L
L-Tyr
N   C-C
|   " "
C-C-C C-O
|   | |
C=O C=C
|
O
PATCH CA D
D-Tyr
Adding dihedral restraints
PATCH <atom> ><pattern>
• After energy minimisation (EM), pyranose rings are
often found in undesirable conformations
• PATCH statement introduces additional
dihedral restraints to fix conformation
C-C-O-C-O
| |   |
O C-C-C
  | | |
  O O O
PATCH C1 ALPHA
PATCH C2 D
PATCH C3 L
PATCH C4 D
PATCH C5 D
PATCH C1 >4C1
α-D-Glucose
Building
• PRODRG can add molecular fragments to
existing molecules:
BUILD <atom> <fragment>
L-Ala
BUILD CB PHI → L-Phe
BUILD CZ OH → L-Tyr
Building
• Allows quick alterations to existing molecules
• Preserves coordinates of root structure
• Fragment libraries contain text drawings –
easy to define:
FRAG OH
X-O
FRAG PHI
X-C-C=C
  "   |
  C-C=C
FRAG ...
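A sketch of reading such a fragment library into a lookup table, assuming each entry is a `FRAG <name>` line followed by the lines of its text drawing. The function name is hypothetical, and treating X as the attachment point is an inference from the examples, not something the slide states.

```python
def parse_fragment_library(text):
    """Map fragment name -> list of drawing lines (layout preserved)."""
    fragments = {}
    current = None
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "FRAG":
            current = parts[1]          # start a new fragment entry
            fragments[current] = []
        elif current is not None and line.strip():
            # Keep leading spaces: column positions carry the geometry.
            fragments[current].append(line.rstrip())
    return fragments

LIBRARY = "\n".join([
    'FRAG OH',
    'X-O',
    'FRAG PHI',
    'X-C-C=C',
    '  "   |',
    '  C-C=C',
])

for name, drawing in parse_fragment_library(LIBRARY).items():
    print(name)
    print("\n".join(drawing))
```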
Building
• Can also be used to generate oligopeptides
and oligosaccharides, using BUILD and
START <fragment>
START bdGLC
BUILD O4 adMAN1
BUILD O0F bdNAG1
PATCH C1 >4C1
PATCH C0B >4C1
PATCH C1B >4C1
Residues: β-D-Glc, α-D-Man, β-D-NAG
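When many oligosaccharides are needed, the command lines can be assembled programmatically. The helper below is a hypothetical convenience that only produces the text of the START, BUILD and PATCH lines shown above; how PRODRG expects them to be combined with other input is not covered here.

```python
def build_script(start, builds, patches):
    """Assemble START/BUILD/PATCH command lines as a single string.

    start:   name of the starting fragment
    builds:  list of (atom, fragment) pairs, one BUILD line each
    patches: list of (atom, value) pairs, one PATCH line each
    """
    lines = [f"START {start}"]
    lines += [f"BUILD {atom} {fragment}" for atom, fragment in builds]
    lines += [f"PATCH {atom} {value}" for atom, value in patches]
    return "\n".join(lines)

script = build_script(
    "bdGLC",
    builds=[("O4", "adMAN1"), ("O0F", "bdNAG1")],
    patches=[("C1", ">4C1"), ("C0B", ">4C1"), ("C1B", ">4C1")],
)
print(script)
```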
PRODRG IP issues
• Currently PRODRG freely accessible for
academics through webserver and binaries
• Commercial licenses (~10) have provided useful
income that contributes to (but does not cover)
PRODRG development / maintenance
• Currently no PRODRG grant funding (previously
WT senior fellowship)
Thoughts on the future:
• Make PRODRG as accessible as possible
• Release of source?
• Keen to incorporate/integrate with CCP4 but this
will require some development
PRODRG - what next
• Need to incorporate SMILES
• Make PDB input foolproof by quality control
• Move away from the united-atom-with-hydrogen-addition
model
• Link up with GUI - not only drawing but also “building”
• Link up with coot (build-place-fit ligand at pointer)
Acknowledgements
• Alexander Schüttelkopf
• PRODRG users