High Throughput Technique in Structural Bioinformatics

Application to Catalase, an Enzyme of 57 kDa Molecular Weight
By
Prof. D. VELMURUGAN
DEPARTMENT OF CRYSTALLOGRAPHY &
BIOPHYSICS
UNIVERSITY OF MADRAS
GUINDY CAMPUS
CHENNAI – 600 025
One of the main interests in the molecular biosciences is understanding
structure-function relations, and X-ray crystallography plays a major role in
this.
Ab initio solution of the crystal structures of small molecules is possible
with atomic-resolution diffraction data, usually at ~0.8 Å. Most small-molecule
crystal structures are solved using direct-methods programs.
Macromolecules have mainly been solved at resolutions lower than atomic, and
this has necessitated the determination of initial phases either by molecular
replacement (MR) or from experimental techniques such as multiple isomorphous
replacement (MIR) or multiwavelength anomalous dispersion (MAD).
During the last decade, admirable advances have taken place in the
data-collection facilities and techniques available to the macromolecular
crystallographer. To obtain better X-ray intensity data, new techniques such as
cryo-temperature data collection, halide soaking and the passing of Ar, Ne, Hg
gas have been developed.
With the above advances, more data sets are being collected to atomic
resolution. The possibility of obtaining atomic-resolution data even for
macromolecules has prompted direct-methods practitioners to extend direct
methods, in combination with other macromolecular techniques, so that they can
tackle the structure solution of macromolecules.
X-ray Crystallography has become a central tool in modern
drug and target discovery, providing important insights into
molecular interactions and biological function. The past few years
have seen many advances in the methods underlying
macromolecular crystallography such as protein production,
crystallization, cryo-crystallography and synchrotron technology.
Together these advances mean that X-ray data can be collected
extremely quickly for many different crystals and ligand-bound
complexes. The challenge is to ensure rapid and accurate
interpretation of the data to provide valuable structural information.
The High Throughput Crystallography (HTC) Consortium
offers scientists a valuable new dimension to the drug discovery
process. The HTC Consortium aims to accelerate crystallographic
structure determination by developing new science as well as
utilizing current technology to go from initial phasing through to
structure refinement and analysis while minimizing the amount of
human intervention that is required. The ability to examine in
atomistic detail the interactions between many different proteins
and ligands provides scientists unprecedented insight into the
mechanics of drug binding.
Rapid and revolutionary developments in genome
sciences, combinatorial chemistry, informatics and robotics are
having major impacts on drug discovery. Genome sequencing
projects in man and micro-organisms have provided an
unprecedented number of potential drug targets. These have
given impetus to the study of protein expression (proteomics)
and structure (structural genomics), and have allowed a clearer
description of drug targets as molecular components of disease
processes. At the same time, there is a rapidly expanding range of
screening technologies, as well as consolidations in medicinal
chemistry arising from the combinatorial approaches that were
pioneered in the 1990s. These developments have created an
environment for the emergence of new strategies for drug
discovery.
High-Throughput Crystallography is essential for
structure-based lead discovery – a strategy that combines
features of random screening and rational structure-based
design.
More than 29,000 protein structures are deposited in the Protein Data Bank
(PDB), and more than 150,000 sequence entries (SWISS-PROT) exist for which the
three-dimensional structures are not available.
In structural genomics, the aim is to determine structures as quickly as
possible in order to identify new folds, and this has opened up "High
Throughput Crystallography". Knowledge of the three-dimensional structure
(fold) helps to infer the function of the molecule.
High-throughput crystallography using automated procedures allows structures
that share a fold with already deposited ones to be eliminated quickly when
analyzing thousands of macromolecular structures whose functions are yet to be
assigned.
ACORN is a comprehensive and efficient phasing procedure involving direct
methods for the determination of protein structures when atomic-resolution data
(better than 1.2 Å) are available (Foadi et al., 2000; McAuley et al., 2001;
Yao, 2002; Foadi, 2003; Dodson & Yao, 2003).
The procedure starts from a fragment of the structure. The fragment can be a
small idealized piece of secondary structure (Rajakannan et al., 2004a, b;
Selvanayagam et al., 2004) or an experimental substructure, such as a metal or
a set of S, Se or similar atoms that can be located from anomalous scattering
measurements.
ACORN then uses a combination of approaches,
most importantly dynamic density modification, to develop
a refined set of phases. Key to the procedure is the use of
a correlation factor for the weak amplitudes as a criterion of
phase quality.
Dynamic Density Modification (DDM) is designed to modify the density ρ in
three steps:

\[
\rho' =
\begin{cases}
0 & \text{if } \rho < 0 \\
\rho\,\tanh\{0.2\,[\rho/\sigma(\rho)]^{3/2}\} & \text{if } \rho > 0
\end{cases}
\]
\[
\rho' = k n\,\sigma(\rho) \quad \text{if } \rho' > k n\,\sigma(\rho)
\]

1. It sets all negative densities to zero.
2. It modifies the positive densities according to the ratio ρ/σ(ρ), where
σ(ρ) is the standard deviation of the density.
3. It truncates the modified densities to a value of knσ(ρ), where k is a
constant given by the user (default value is 3) and n is the cycle number of
DDM.
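The three DDM steps can be sketched in a few lines of Python. This is only an illustrative sketch, not the ACORN source: the function name `ddm_modify` is hypothetical, the map is treated as a flat list of density values, σ(ρ) is assumed to be supplied by the caller, and the positive-density rule is assumed to be ρ·tanh{0.2[ρ/σ(ρ)]^(3/2)}.

```python
import math

def ddm_modify(rho, sigma, cycle, k=3.0):
    """Apply one DDM pass to a list of density values (illustrative only).

    rho   : density values of the map (hypothetical flat list)
    sigma : standard deviation of the density, assumed precomputed
    cycle : DDM cycle number n
    k     : user constant (the text gives 3 as the default)
    """
    cap = k * cycle * sigma  # truncation level k*n*sigma
    modified = []
    for r in rho:
        if r <= 0.0:
            new = 0.0  # step 1: negative densities are set to zero
        else:
            # step 2: positive densities are scaled according to rho/sigma
            new = r * math.tanh(0.2 * (r / sigma) ** 1.5)
        modified.append(min(new, cap))  # step 3: truncate at k*n*sigma
    return modified

example = ddm_modify([-1.0, 2.0, 100.0], sigma=1.0, cycle=1)
```

Because the truncation level grows with the cycle number, strong peaks are held back in early cycles and allowed to develop later.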
The reflections are divided into three groups (strong, medium and weak)
according to their normalized structure-factor (E) values.
The strong reflections (E > 1.2) are used in the phase refinement by the
dynamic density modification (DDM) and Patterson superposition (SUPP)
procedures.
Both strong (E > 1.2) and weak (E < 0.1) reflections are used in
Sayre-equation refinement (SER).
The medium reflections (0.1 < E < 1.2) are used to calculate a correlation
coefficient (CC) for each potential solution of DDM.
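The grouping by E value can be written as a small helper. This is a sketch under assumptions: the function names are hypothetical, and values exactly at 0.1 or 1.2 are placed in the medium group, which the text does not specify.

```python
def classify_reflection(e, weak_max=0.1, strong_min=1.2):
    """Return 'strong', 'medium' or 'weak' for a normalized amplitude E."""
    if e > strong_min:
        return "strong"
    if e < weak_max:
        return "weak"
    return "medium"  # boundary values fall here by assumption

def group_reflections(e_values):
    """Group reflection indices by class; input is a list of E values."""
    groups = {"strong": [], "medium": [], "weak": []}
    for index, e in enumerate(e_values):
        groups[classify_reflection(e)].append(index)
    return groups

groups = group_reflections([1.5, 0.05, 0.6, 1.2])
```

In this scheme the strong group would feed DDM and SUPP, the strong and weak groups SER, and the medium group would be held back for the CC.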
An important component of ACORN is a CC that describes the extent to which
the magnitudes of the calculated normalized structure factors (Ec) resemble the
observed normalized structure-factor amplitudes (Eo). A fragment in a
particular position and orientation in the unit cell will have an associated
set of structure factors and the CC will be expressed by

\[
CC = \frac{\langle E_c E_o \rangle - \langle E_c \rangle \langle E_o \rangle}
          {\sigma(E_c)\,\sigma(E_o)}
\]

where

\[
\sigma(E) = \left( \langle E^2 \rangle - \langle E \rangle^2 \right)^{1/2}
\]
Ec and CC values are calculated from the starting
fragment for all reflections to find the correct orientation
and position in molecular replacement (MR) or random MR
or for single random-atom searching.
In phase refinement Ec and CC values are calculated
from the modified map for medium reflections, which are
not used for computing the map, to indicate solutions of
DDM.
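The correlation coefficient between observed and calculated normalized amplitudes can be computed directly from its definition, CC = (<EcEo> - <Ec><Eo>) / (σ(Ec)σ(Eo)) with σ(E) = (<E²> - <E>²)^(1/2). The sketch below assumes two equal-length lists of amplitudes for the chosen reflection group; the function name is hypothetical.

```python
import math

def correlation_coefficient(e_obs, e_calc):
    """CC between observed (Eo) and calculated (Ec) normalized amplitudes."""
    if len(e_obs) != len(e_calc) or not e_obs:
        raise ValueError("need two non-empty lists of equal length")
    n = len(e_obs)
    mean_o = sum(e_obs) / n
    mean_c = sum(e_calc) / n
    mean_oc = sum(o * c for o, c in zip(e_obs, e_calc)) / n
    # sigma(E)^2 = <E^2> - <E>^2, clamped at zero against rounding error
    var_o = max(sum(o * o for o in e_obs) / n - mean_o ** 2, 0.0)
    var_c = max(sum(c * c for c in e_calc) / n - mean_c ** 2, 0.0)
    denom = math.sqrt(var_o * var_c)
    if denom == 0.0:
        raise ValueError("CC is undefined when either set has zero variance")
    return (mean_oc - mean_o * mean_c) / denom

cc_same = correlation_coefficient([0.5, 1.0, 1.5], [0.5, 1.0, 1.5])
```

Evaluating the CC on reflections that were not used to compute the map, as the text describes for the medium group, keeps it an independent check on the phases.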
The ACORN procedure, as implemented in CCP4, is
divided into two parts, ACORN-MR and ACORN-PHASE,
as illustrated in the flow diagram.
[Flow diagram of the ACORN procedure in CCP4.
ACORN-MR boxes: large motif; AMoRe in CCP4; PDBSET in CCP4; positioned
fragment; standard α-helices; small motif; molecular replacement or random
molecular replacement by search on CC; single random-atom search on CC; known
heavy-atom positions; known phases; structure-factor calculation; initial
phase sets.
ACORN-PHASE boxes: SUPP? (yes/no), Patterson superposition; dynamic density
modification; SER? (yes/no), Sayre-equation refinement; best set of refined
phases.]
ACORN-MR deals with finding the position of a fragment of the structure, even
a single atom, that provides an initial set of estimated phases. This set is
passed into ACORN-PHASE, where phase refinement by a number of real-space
processes is performed.
For locating a single atom, this approach randomly generates thousands of
positions in the asymmetric unit. For each random position, the calculated
normalized structure-factor values and the corresponding CC are calculated for
all reflections. The 1000 sets with the highest CCs are saved as starting
points for further calculations. In most cases, the solution is found in the
top 100 sets. This approach can be used to determine a native protein structure
from atomic-resolution (AR) data, if the structure contains at least one heavy
atom (sulphur or heavier).
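The random single-atom search amounts to scoring many trial positions and keeping the best few. The sketch below shows only that bookkeeping: `score` is a hypothetical callable standing in for the CC calculation, trial positions are drawn as fractional coordinates in the whole unit cell (an assumption, not a proper asymmetric unit), and `toy_score` is a made-up stand-in so that the example runs.

```python
import heapq
import random

def random_atom_search(score, n_trials, n_keep, seed=0):
    """Return the n_keep (score, position) pairs with the highest score.

    score : hypothetical callable mapping a fractional (x, y, z) position to
            a figure of merit such as the CC; it is not an ACORN function.
    """
    rng = random.Random(seed)
    trials = ((rng.random(), rng.random(), rng.random())
              for _ in range(n_trials))
    scored = ((score(pos), pos) for pos in trials)
    # nlargest returns the pairs sorted from best to worst
    return heapq.nlargest(n_keep, scored)

def toy_score(pos, target=(0.25, 0.25, 0.25)):
    """Made-up stand-in for the CC: higher when closer to a target point."""
    return -sum((p - t) ** 2 for p, t in zip(pos, target))

best = random_atom_search(toy_score, n_trials=2000, n_keep=10)
```

Each retained position would then serve as a starting point for phase refinement, with the CC of the medium reflections used to recognize the correct one.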
Foadi (2003) has given a detailed explanation of the reasons for the failure
of ACORN when the resolution is poorer than 1.2 Å. At atomic resolution, two
neighbouring atomic peaks are separate entities and DDM enhances both of them.
At lower resolutions, the two peaks merge into a single peak, which DDM simply
enhances, and no improvement in the phases can be expected in this situation.
The present work overcomes the above problem at lower resolution by using
fragments as seed-phasing information.
The use of ACORN in solving a 57 kDa macromolecule with atomic-resolution
(0.88 Å) and truncated (1.5 Å resolution) synchrotron data
Micrococcus lysodeikticus catalase (Murshudov et al., 2002)
[Figure: structure of catalase with numbered helices and sheets.
PDB i.d.: 1gwe; total residues: 503]
Details of the crystallographic data, helices, sheets and sets
Ab initio phasing using ACORN
ACORN was run with 5000 random single-atom trials and the 40 positions with
the highest CCs were selected.
ACORN refined the phases from the random-atom trials using DDM and led to a
solution with good agreement of CC.
In this run, 78 cycles of DDM increased the CC for the medium-E reflections
from 0.0285 to 0.5246 in 14.2 hours of CPU time.
In this ab initio case, 8 chains (482 residues) could be built automatically
with ARP/wARP (Perrakis et al., 1999) followed by REFMAC (Murshudov et al.,
1999). Manual model building was carried out for the missing residues and the
final Rw and Rf values are 14.0 and 16.2%, respectively. Superposition, using
PROFIT, of the backbone atoms of this structure on the backbone atoms of the
same structure solved by the conventional technique gives an r.m.s. deviation
of 0.143 Å.
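For reference, the r.m.s. deviation quoted after superposition is the root of the mean squared distance between matched atoms. The sketch below assumes the two backbone coordinate sets have already been superposed (for example by a fitting program) and are listed in the same atom order; it does not perform the fitting itself, and the function name is hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """r.m.s. deviation between two pre-superposed, equally ordered
    lists of (x, y, z) coordinates, in the units of the input (e.g. Å)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
shifted = [(x + 0.3, y, z) for x, y, z in reference]
deviation = rmsd(reference, shifted)
```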
Details of ACORN, ARP/wARP and REFMAC results for
ab initio case
Applications of truncated data at 1.5 Å resolution
For set 23 (minimum input), all sheets and one helix (helix 4), containing 76
residues in total, were given as input to ACORN. Here, the ACORN-PHASE option
was selected for the structure solution. The R-factor and correlation
coefficient for the medium-E reflections of the initial model were 54.2% and
0.0469, respectively. Within 56 cycles of DDM, the R-factor and correlation
coefficient reached 53.9% and 0.0771, indicating a good solution.
The phases were then fed to ARP/wARP (Perrakis et al.,
1999) followed by REFMAC (Murshudov et al., 1999). After
the initial model building by ARP/wARP, the Rw and Rf values
were 44.8 and 44.4% respectively. This initial model was
refined for ten cycles of auto building along with five cycles of
REFMAC in each auto-building cycle. Finally, ARP/wARP was
able to build 212 residues. At this stage Rw and Rf values
were 28.9 and 36.3% respectively. An iterative cycle carried
out with these output phases revealed 481 residues out of
503 residues with a connectivity index of 0.97.
Manual model building was carried out in the missing
regions as densities were clear. After the manual model
building, 20 cycles of maximum-likelihood refinement were
performed using REFMAC and solvent atoms were updated
after the refinement using ARP/wARP ‘build solvent atoms’
script. The final Rw and Rf values were 13.6 and 15.6%
respectively.
The backbone of this final model was superimposed with
the structure conventionally solved by the molecular
replacement method. The root-mean square deviation was
0.176 Å and the details are shown in Table 2. The results for
sets 1-16 and 23 are also shown in Table 2.
Figs 3a to 3q describe the final models obtained after all
the sets were used for ‘seed-phasing’ information to ACORN.
Table 2 lists the ACORN statistics and the ARP/wARP
details for all these cases. The final results obtained in each
case are also mentioned in this table.
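Table 2.
Details of ACORN phasing, ARP/wARP model building and REFMAC refinement

In every ARP/wARP run: auto-building, 10 cycles; REFMAC, 5 cycles for each
auto-building cycle; side dock after 6 cycles of auto-building.
ACORN R-factors are listed as fractions (0.542 corresponds to 54.2%); ARP/wARP
and REFMAC R-factor and Rfree values are in %. L = large E, M = medium E,
H = helices.

SET 9
Input: H 1, 4-5, 7, 10-11, 13-14 & 17-20 (151 a.a.)
ACORN starting: L R-factor 0.399, CC 0.2110; M R-factor 0.521, CC 0.1212
ACORN final (after 37 cycles of DDM): L R-factor 0.269, CC 0.6329; M R-factor 0.525, CC 0.1199
ARP/wARP initial: R-factor 43.5, Rfree 42.8
ARP/wARP final: R-factor 14.4, Rfree 18.1
ARP/wARP result: 8 chains, 475 residues, missing residues 1-7, 59, 60, 113, 114, 142, 143, 175, 176, 186, 187, 195-202, 388, 389, 503, dummy atoms 1288, connectivity index 0.96
Without dummy atoms made by ARP/wARP: R-factor 28.4, Rfree 29.4
After manual model building for missing residues and solvent building: R-factor 14.2, Rfree 15.8
r.m.s. deviation of backbone atoms superposed with those of 1gwe: 0.145 Å

SET 10
Input: H 1, 4-5, 7, 10-11, 14 & 17-20 (145 a.a.)
ACORN starting: L R-factor 0.402, CC 0.2070; M R-factor 0.523, CC 0.1153
ACORN final (after 39 cycles of DDM): L R-factor 0.269, CC 0.6330; M R-factor 0.526, CC 0.1166
ARP/wARP initial: R-factor 43.5, Rfree 42.8
ARP/wARP final: R-factor 13.6, Rfree 17.1
ARP/wARP result: 7 chains, 482 residues, missing residues 1-7, 59, 60, 113, 114, 142, 143, 174-176, 388, 389, 401, 402, 503, dummy atoms 1273, connectivity index 0.97
Without dummy atoms made by ARP/wARP: R-factor 27.3, Rfree 28.2
After manual model building for missing residues and solvent building: R-factor 13.6, Rfree 15.4
r.m.s. deviation of backbone atoms superposed with those of 1gwe: 0.191 Å

SET 11
Input: H 1, 5, 7, 10-11, 14 & 17-20 (135 a.a.)
ACORN starting: L R-factor 0.411, CC 0.1871; M R-factor 0.526, CC 0.1044
ACORN final (after 37 cycles of DDM): L R-factor 0.270, CC 0.6331; M R-factor 0.526, CC 0.1156
ARP/wARP initial: R-factor 43.6, Rfree 43.7
ARP/wARP final: R-factor 15.3, Rfree 19.3
ARP/wARP result: 8 chains, 481 residues, missing residues 1-7, 59, 60, 113, 114, 142, 143, 174, 175, 186, 187, 201, 202, 388, 389, 503, dummy atoms 1235, connectivity index 0.97
Without dummy atoms made by ARP/wARP: R-factor 27.6, Rfree 28.9
After manual model building for missing residues and solvent building: R-factor 13.9, Rfree 15.7
r.m.s. deviation of backbone atoms superposed with those of 1gwe: 0.161 Å

SET 12
Input: H 1, 5, 10-11, 14 & 17-20 (125 a.a.)
ACORN starting: L R-factor 0.417, CC 0.1708; M R-factor 0.529, CC 0.0935
ACORN final (after 41 cycles of DDM): L R-factor 0.271, CC 0.6308; M R-factor 0.528, CC 0.1108
ARP/wARP initial: R-factor 43.8, Rfree 43.2
ARP/wARP final: R-factor 14.6, Rfree 18.2
ARP/wARP result: 8 chains, 481 residues, missing residues 1-7, 59, 60, 142, 143, 186, 187, 201, 202, 331, 332, 388, 389, 401, 402, 503, dummy atoms 1203, connectivity index 0.97
Without dummy atoms made by ARP/wARP: R-factor 27.2, Rfree 28.3
After manual model building for missing residues and solvent building: R-factor 13.8, Rfree 15.6
r.m.s. deviation of backbone atoms superposed with those of 1gwe: 0.169 Å

SET 13
Input: H 1, 10-11, 14 & 17-20 (114 a.a.)
ACORN starting: L R-factor 0.423, CC 0.1513; M R-factor 0.533, CC 0.0821
ACORN final (after 42 cycles of DDM): L R-factor 0.272, CC 0.6286; M R-factor 0.529, CC 0.1085
ARP/wARP initial: R-factor 43.8, Rfree 43.8
ARP/wARP final: R-factor 14.9, Rfree 19.0
ARP/wARP result: 6 chains, 485 residues, missing residues 1-7, 59, 60, 113, 114, 142, 143, 174, 175, 388, 389, 503, dummy atoms 1200, connectivity index 0.98
Without dummy atoms made by ARP/wARP: R-factor 27.0, Rfree 28.2
After manual model building for missing residues and solvent building: R-factor 13.3, Rfree 14.9

SET 14
Input: H 1, 10-11, 14 & 18-20 (103 a.a.)
ACORN starting: L R-factor 0.424, CC 0.1418; M R-factor 0.534, CC 0.0760
ACORN final (after 46 cycles of DDM): L R-factor 0.274, CC 0.6283; M R-factor 0.531, CC 0.1022
ARP/wARP initial: R-factor 44.2, Rfree 44.2
ARP/wARP final: R-factor 15.2, Rfree 19.1
ARP/wARP result: 8 chains, 474 residues, missing residues 1-9, 59, 60, 142, 143, 174-180, 186, 187, 201, 202, 258, 259, 388, 389, 503, dummy atoms 1250, connectivity index 0.97
Without dummy atoms made by ARP/wARP: R-factor 28.0, Rfree 29.3
After manual model building for missing residues and solvent building: R-factor 15.0, Rfree 16.9

SET 15
Input: H 10-11, 14 & 18-20 (91 a.a.)
ACORN starting: L R-factor 0.432, CC 0.1225; M R-factor 0.539, CC 0.0614
ACORN final (after 55 cycles of DDM): L R-factor 0.273, CC 0.6289; M R-factor 0.534, CC 0.0961
ARP/wARP initial: R-factor 44.5, Rfree 44.5
ARP/wARP final: R-factor 14.5, Rfree 18.1
ARP/wARP result: 7 chains, 482 residues, missing residues 1-8, 59, 60, 113, 114, 142, 143, 186, 187, 201, 202, 503, dummy atoms 1224, connectivity index 0.97
Without dummy atoms made by ARP/wARP: R-factor 27.3, Rfree 28.3
After manual model building for missing residues and solvent building: R-factor 14.3, Rfree 15.8

SET 16
Input: H 11, 14 & 18-20 (79 a.a.)
ACORN starting: L R-factor 0.437, CC 0.1058; M R-factor 0.542, CC 0.0544
ACORN final (after 55 cycles of DDM): L R-factor 0.272, CC 0.6275; M R-factor 0.536, CC 0.0869
ARP/wARP, first run, initial: R-factor 45.0, Rfree 44.8
ARP/wARP, first run, final: R-factor 29.1, Rfree 35.7
ARP/wARP, first run, result: 20 chains, 310 residues, connectivity index 0.88
ARP/wARP, second run, initial: R-factor 29.1, Rfree 35.8
ARP/wARP, second run, final: R-factor 13.0, Rfree 16.6
ARP/wARP, second run, result: 8 chains, 481 residues, missing residues 1-7, 59, 60, 142, 143, 186, 187, 258, 259, 388, 389, 401, 402, 503, dummy atoms 1337, connectivity index 0.97
Without dummy atoms made by ARP/wARP: R-factor 27.2, Rfree 28.1
After manual model building for missing residues and solvent building: R-factor 14.6, Rfree 16.2

SET 23
Input: sheets (1-8) & H 4 (76 a.a.)
ACORN starting: L R-factor 0.437, CC 0.1143; M R-factor 0.542, CC 0.0469
ACORN final (after 56 cycles of DDM): L R-factor 0.273, CC 0.6211; M R-factor 0.539, CC 0.0771
ARP/wARP, first run, initial: R-factor 44.8, Rfree 44.4
ARP/wARP, first run, final: R-factor 28.9, Rfree 36.3
ARP/wARP, first run, result: 22 chains, 212 residues, connectivity index 0.79
ARP/wARP, second run, initial: R-factor 28.9, Rfree 36.3
ARP/wARP, second run, final: R-factor 14.0, Rfree 17.6
ARP/wARP, second run, result: 7 chains, 481 residues, missing residues 1-9, 59, 60, 142, 143, 174, 175, 186, 187, 388, 389, 401, 402, 503, dummy atoms 1287, connectivity index 0.97
Without dummy atoms made by ARP/wARP: R-factor 27.1, Rfree 28.2
After manual model building for missing residues and solvent building: R-factor 13.6, Rfree 15.6
r.m.s. deviation of backbone atoms superposed with those of 1gwe: 0.176 Å

r.m.s. deviations (Å) listed for sets 13-16 and 23: 0.146, 0.182, 0.146,
0.204, 0.176 [0.176 belongs to set 23, as stated in the text; the assignment
of the other four values to sets 13-16 cannot be recovered from this
transcript]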
Fig. 1. PDB i.d.: 1GWE; total: 503 residues
Fig. 2. Ab initio; auto built: 482 residues
Fig. 3a. SET 1; input: 187 residues; auto built: 476 residues
Fig. 3b. SET 2; input: 184 residues; auto built: 470 residues
Fig. 3c. SET 3; input: 181 residues; auto built: 474 residues
Fig. 3d. SET 4; input: 177 residues; auto built: 482 residues
Fig. 3e. SET 5; input: 172 residues; auto built: 477 residues
Fig. 3f. SET 6; input: 167 residues; auto built: 472 residues
Fig. 3g. SET 7; input: 162 residues; auto built: 479 residues
Fig. 3h. SET 8; input: 157 residues; auto built: 484 residues
Fig. 3i. SET 9; input: 151 residues; auto built: 475 residues
Fig. 3j. SET 10; input: 145 residues; auto built: 482 residues
Fig. 3k. SET 11; input: 135 residues; auto built: 481 residues
Fig. 3l. SET 12; input: 125 residues; auto built: 481 residues
Fig. 3m. SET 13; input: 114 residues; auto built: 485 residues
Fig. 3n. SET 14; input: 103 residues; auto built: 474 residues
Fig. 3o. SET 15; input: 91 residues; auto built: 482 residues
Fig. 3p. SET 16; input: 79 residues; auto built: 481 residues
Fig. 3q. SET 23; input: 76 residues; auto built: 481 residues
Seed phasing using Cα atoms
Only the 503 Cα atoms from the known structure were used as seed-phasing input
to ACORN with the truncated data extending to 1.3 Å resolution. A successful
model could be built with 474 amino acids (a.a.), the backbone atoms of which
had an r.m.s. deviation of 0.132 Å from the actual structure (1gwe).
To mimic the above 'seed feeding' in real situations, mean positional errors
(MPE, hereafter) of 0.1 and 0.2 Å were introduced into the above Cα atoms using
MOLEMAN (Kleywegt, 1992-2004). Successful models could be built with 483 and
481 a.a. for input fragments with MPEs of 0.1 and 0.2 Å, respectively. The
backbone atoms of these models had r.m.s. deviations of 0.169 and 0.163 Å,
respectively, from the actual structure (1gwe).
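One simple way to impose a chosen mean positional error on a set of coordinates is sketched below. This is an assumption-laden illustration and not MOLEMAN's algorithm: each atom is shifted in a random direction, and all shifts are rescaled so that the mean displacement length equals the requested MPE. The function name and the coordinate list are hypothetical.

```python
import math
import random

def add_positional_error(coords, mpe, seed=0):
    """Shift each (x, y, z) so the mean displacement length equals mpe.

    Illustrative only; the error model of the program used in the text
    is not reproduced here.
    """
    rng = random.Random(seed)
    # Gaussian components give shift directions that are uniform on a sphere
    shifts = [(rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1))
              for _ in coords]
    lengths = [math.sqrt(dx * dx + dy * dy + dz * dz)
               for dx, dy, dz in shifts]
    scale = mpe / (sum(lengths) / len(lengths))  # rescale to the target mean
    return [(x + dx * scale, y + dy * scale, z + dz * scale)
            for (x, y, z), (dx, dy, dz) in zip(coords, shifts)]

ca_atoms = [(3.8 * i, 0.0, 0.0) for i in range(50)]  # made-up Cα trace
perturbed = add_positional_error(ca_atoms, mpe=0.1)
```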
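Results of ACORN and ARP/wARP using only Cα atoms (1gwe)
Resolution 20-1.3 Å
Input in each case: 503 atoms (1 atom/a.a., ~13% of the total structure)

In every ARP/wARP run: auto-building, 10 cycles; REFMAC, 5 cycles for each
auto-building cycle; side dock after 6 cycles of auto-building.
ACORN R-factors are listed as fractions; ARP/wARP and REFMAC R-factor and
Rfree values are in %. L = large E, M = medium E.

Cα atoms alone
ACORN starting: L R-factor 0.446, CC 0.0607; M R-factor 0.551, CC 0.0368
ACORN final (after 123 cycles of DDM): L R-factor 0.254, CC 0.6542; M R-factor 0.503, CC 0.1944
ARP/wARP initial: R-factor 39.9, Rfree 40.6
ARP/wARP final: R-factor 13.3, Rfree 16.8
ARP/wARP result: 9 chains, 474 residues, missing residues 1-7, 34-40, 59, 60, 113, 114, 142, 143, 174, 175, 186, 187, 388, 389, 401, 402, 503, dummy atoms 1389, connectivity index 0.96
Without dummy atoms made by ARP/wARP: R-factor 28.2, Rfree 28.9
After manual model building for missing residues and solvent building: R-factor 15.1, Rfree 17.1
r.m.s. deviation of backbone atoms superposed with those of 1gwe: 0.132 Å

Cα atoms with a mean positional error (MPE) of 0.1 Å
ACORN starting: L R-factor 0.445, CC 0.0640; M R-factor 0.551, CC 0.0348
ACORN final (after 121 cycles of DDM): L R-factor 0.255, CC 0.6520; M R-factor 0.505, CC 0.1904
ARP/wARP initial: R-factor 40.5, Rfree 40.6
ARP/wARP final: R-factor 14.7, Rfree 18.4
ARP/wARP result: 7 chains, 483 residues, missing residues 1-7, 59, 60, 113, 114, 142, 143, 186, 187, 331, 332, 388, 389, 503, dummy atoms 1225, connectivity index 0.97
Without dummy atoms made by ARP/wARP: R-factor 26.9, Rfree 28.1
After manual model building for missing residues and solvent building: R-factor 14.2, Rfree 16.0
r.m.s. deviation of backbone atoms superposed with those of 1gwe: 0.169 Å

Cα atoms with a mean positional error (MPE) of 0.2 Å
ACORN starting: L R-factor 0.446, CC 0.0543; M R-factor 0.552, CC 0.0303
ACORN final (after 165 cycles of DDM): L R-factor 0.257, CC 0.6452; M R-factor 0.508, CC 0.1826
ARP/wARP initial: R-factor 40.7, Rfree 40.4
ARP/wARP final: R-factor 14.5, Rfree 18.2
ARP/wARP result: 8 chains, 481 residues, missing residues 1-7, 59, 60, 113, 114, 142, 143, 175, 176, 201, 202, 331, 332, 388, 389, 503, dummy atoms 1247, connectivity index 0.97
Without dummy atoms made by ARP/wARP: R-factor 27.5, Rfree 28.5
After manual model building for missing residues and solvent building: R-factor 14.4, Rfree 16.1
r.m.s. deviation of backbone atoms superposed with those of 1gwe: 0.163 Å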
PDB i.d.: 1gwe; total residues: 503
Input: Cα atoms (503); auto built: 474 residues
Input: 0.1 Å error at Cα atoms; auto built: 483 residues
Input: 0.2 Å error at Cα atoms; auto built: 481 residues
Seed phasing using 120 a.a as polyala
The first 120 a.a. from the actual structure were treated as a polyala model
and the above procedures were carried out to obtain the final model. Results
are detailed in the table that follows.
With the 120 residues as a polyala model, ARP/wARP was able to build 111
residues in 15 chains when the above procedures were followed. An iterative
cycle carried out with this output as input revealed 480 of the 503 residues
with a connectivity index of 0.98. In the case of the first 120 residues of the
polyala model with 0.1 Å MPE, ARP/wARP initially built only 6948 dummy atoms.
Two iterative cycles carried out with this as input finally built 481
residues. These two models have r.m.s. deviations of 0.176 and 0.173 Å,
respectively, from the backbone atoms of the actual structure (1gwe).
Results of ACORN and ARP/wARP using polyala model (5 atoms/a.a.) (1gwe)
Resolution 20-1.5 Å

Every ARP/wARP run used 10 cycles of auto-building and side dock after 6
cycles of auto-building; the first run used 5 cycles of REFMAC for each
auto-building cycle and the later runs used 10.
ACORN R-factors are listed as fractions; ARP/wARP and REFMAC R-factor and
Rfree values are in %. L = large E, M = medium E.

First 120 residues of polyala
ACORN starting: L R-factor 0.439, CC 0.0915; M R-factor 0.541, CC 0.0525
ACORN final (after 56 cycles of DDM): L R-factor 0.276, CC 0.6196; M R-factor 0.537, CC 0.0807
ARP/wARP, first run, initial: R-factor 45.1, Rfree 44.4
ARP/wARP, first run, final: R-factor 32.3, Rfree 42.7
ARP/wARP, first run, result: 15 chains, 111 residues, connectivity index 0.88, dummy atoms 4437
ARP/wARP, second run, initial: R-factor 32.3, Rfree 42.8
ARP/wARP, second run, final: R-factor 13.1, Rfree 16.6
ARP/wARP, second run, result: 6 chains, 480 residues, missing residues 1-9, 59, 60, 110-114, 142, 143, 174, 175, 388, 389, 503, dummy atoms 1354, connectivity index 0.98
Without dummy atoms made by ARP/wARP: R-factor 27.3, Rfree 28.0
After manual model building for missing residues and solvent building: R-factor 14.0, Rfree 15.7
r.m.s. deviation of backbone atoms superposed with those of 1gwe: 0.176 Å

First 120 residues of polyala with a mean positional error (MPE) of 0.1 Å
ACORN starting: L R-factor 0.442, CC 0.0853; M R-factor 0.541, CC 0.0499
ACORN final (after 55 cycles of DDM): L R-factor 0.274, CC 0.6195; M R-factor 0.540, CC 0.0775
ARP/wARP, first run, initial: R-factor 45.1, Rfree 44.9
ARP/wARP, first run, final: R-factor 24.5, Rfree 45.1
ARP/wARP, first run, result: 0 chains, 0 residues, connectivity index 0.00, dummy atoms 6948
ARP/wARP, second run, initial: R-factor 24.5, Rfree 45.2
ARP/wARP, second run, final: R-factor 26.8, Rfree 35.8
ARP/wARP, second run, result: 20 chains, 282 residues, dummy atoms 3271, connectivity index 0.85
ARP/wARP, third run, initial: R-factor 26.9, Rfree 35.8
ARP/wARP, third run, final: R-factor 13.6, Rfree 17.5
ARP/wARP, third run, result: 8 chains, 481 residues, missing residues 1-7, 39, 40, 59, 60, 142, 143, 174, 175, 186, 187, 388, 389, 401, 402, 503, dummy atoms 1201, connectivity index 0.97
Without dummy atoms made by ARP/wARP: R-factor 27.1, Rfree 28.0
After manual model building for missing residues and solvent building: R-factor 14.0, Rfree 15.5
r.m.s. deviation of backbone atoms superposed with those of 1gwe: 0.173 Å
PDB i.d.: 1gwe; total residues: 503
Input: first 120 a.a. as polyala model; auto built: 480 residues
Input: first 120 a.a. as polyala model after introducing an MPE of 0.1 Å; auto built: 481 residues
Stereo view of the electron-density (2Fo-Fc) map superposed with the final
model (input: polyala model for the first 120 a.a. with an MPE of 0.1 Å)
Stereo view of the final electron-density (2Fo-Fc) map starting with the
polyala model of the first 120 a.a. with an MPE of 0.1 Å
Final electron-density (2Fo-Fc) map for the polyala model
Electron-density (2Fo-Fc) map for the heme group in the polyala model
Seed phasing using Ncap, Ccap and Middle
portions of helices/sheets
Instead of feeding the entire helices or sheets (Selvanayagam et al., 2004,
found a minimum of 76 residues to be sufficient for seed phasing with 1.5 Å
truncated data to solve the three-dimensional structure of catalase), either
the Ncap/Ccap regions or the middle portions of the helices or sheets can be
fed as input for seed phasing. Successful models can be built in these cases
also. The results obtained are listed in the table that follows.
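Results of ACORN and ARP/wARP using Ncap, Ccap and middle portions of
helices/sheets (1gwe)
Resolution 20-1.5 Å
Input in each case: 76 a.a.

In every ARP/wARP run: auto-building, 10 cycles; REFMAC, 5 cycles for each
auto-building cycle; side dock after 6 cycles of auto-building.
ACORN R-factors are listed as fractions; ARP/wARP and REFMAC R-factor and
Rfree values are in %. L = large E, M = medium E.

Ncap region of helices/sheets
ACORN starting: L R-factor 0.435, CC 0.1118; M R-factor 0.539, CC 0.0569
ACORN final (after 52 cycles of DDM): L R-factor 0.275, CC 0.6270; M R-factor 0.534, CC 0.0901
ARP/wARP initial: R-factor 44.5, Rfree 44.8
ARP/wARP final: R-factor 14.9, Rfree 18.6
ARP/wARP result: 10 chains, 470 residues, missing residues 1-7, 39, 40, 59, 60, 113, 114, 142, 143, 174, 175, 176, 195-202, 331, 332, 388, 389, 401, 402, 503, dummy atoms 1308, connectivity index 0.95
Without dummy atoms made by ARP/wARP: R-factor 28.9, Rfree 29.3
After manual model building for missing residues and solvent building: R-factor 13.3, Rfree 16.2
r.m.s. deviation of backbone atoms superposed with those of 1gwe: 0.218 Å

Ccap region of helices/sheets
ACORN starting: L R-factor 0.437, CC 0.1076; M R-factor 0.542, CC 0.0553
ACORN final (after 53 cycles of DDM): L R-factor 0.271, CC 0.6307; M R-factor 0.533, CC 0.0946
ARP/wARP initial: R-factor 44.4, Rfree 44.9
ARP/wARP final: R-factor 14.5, Rfree 18.2
ARP/wARP result: 9 chains, 474 residues, missing residues 1-7, 39-40, 59, 60, 142, 143, 174, 175, 186, 187, 201, 202, 331, 332, 388, 389, 503, dummy atoms 1277, connectivity index 0.96
Without dummy atoms made by ARP/wARP: R-factor 28.3, Rfree 29.3
After manual model building for missing residues and solvent building: R-factor 14.6, Rfree 17.4
r.m.s. deviation of backbone atoms superposed with those of 1gwe: 0.183 Å

Middle region of helices/sheets
ACORN starting: L R-factor 0.434, CC 0.1319; M R-factor 0.538, CC 0.0623
ACORN final (after 51 cycles of DDM): L R-factor 0.272, CC 0.6248; M R-factor 0.533, CC 0.0926
ARP/wARP initial: R-factor 44.7, Rfree 44.5
ARP/wARP final: R-factor 15.2, Rfree 19.1
ARP/wARP result: 8 chains, 479 residues, missing residues 1-7, 60, 110-114, 142, 143, 175, 176, 186, 187, 388, 389, 503, dummy atoms 1230, connectivity index 0.97
Without dummy atoms made by ARP/wARP: R-factor 27.8, Rfree 28.7
After manual model building for missing residues and solvent building: R-factor 13.0, Rfree 15.8
r.m.s. deviation of backbone atoms superposed with those of 1gwe: 0.151 Å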
Input: Ncap region of helices/sheets (76 a.a.); auto built: 470 residues
Input: Ccap region of helices/sheets (76 a.a.); auto built: 474 residues
Input: middle region of helices/sheets (76 a.a.); auto built: 479 residues
Black shaded regions correspond to the input residues from 1gwe
Conclusion
• Based on the published work and the work being carried out by our group
(Rajakannan et al., 2004a, b), it has now become clear that very little
information (about 15% of the structure) is needed to determine the structure
of a protein using ACORN.
• Ours is the first application of ACORN using seed-phasing information to
solve a protein of larger molecular weight (57 kDa) when the resolution extends
to 1.5 Å.
• Among the multiple solutions, the correct solution can be identified in all
trials with high reliability by means of the correlation coefficient; hence
high-resolution and fairly complete diffraction data enable one to solve a
protein ab initio in a relatively short time.
• ACORN has great potential to establish itself as a program for
high-throughput structure determination.
• Currently, in order to extend the applicability of ACORN to lower
resolutions, the seed phasing has been obtained from the native structure
itself (as the structure had already been solved by traditional macromolecular
crystallographic methods). A data-mining approach to feed fragments from PDB
entries is in progress.
References
Banumathi, S., Rajakannan, V., Velmurugan, D., Dauter, Z., Dauter, M., Tsai, M.
D. & Sekar, K. (2002). Japanese Crystallographic Society Meeting, Poster,
P3-II-27, 123.
Collaborative Computational Project, Number 4 (1994). Acta Cryst. D50, 760-763.
Dodson, E. J. & Yao, J. -X. (2003). Crystallogr. Rev. 9, 67-72.
Foadi, J. (2003). Crystallogr. Rev. 9, 43-65.
Foadi, J., Woolfson, M. M., Dodson, E. J., Wilson, K. S., Yao, J. -X. & Chao-de,
Z. (2000). Acta Cryst. D56, 1137-1147.
Kleywegt, G. J. (1992-2004). Uppsala University, Uppsala, Sweden.
Unpublished program.
McAuley, K. E., Yao, J. -X., Dodson, E. J., Lehmbeck, J., Ostergaard, P. R. &
Wilson, K. S. (2001). Acta Cryst. D57, 1571-1578.
Murshudov, G. N., Lebedev, A., Vagin, A. A., Wilson, K. S. & Dodson, E. J.
(1999). Acta Cryst. D55, 247-255.
Murshudov, G. N., Grebenko, A. I., Brannigan, J. A., Antson, A. A., Barynin, V.
V., Dodson, G. G., Dauter, Z., Wilson, K. S. & Melik-Adamyan, W. R. (2002).
Acta Cryst. D58, 1972-1982.
Perrakis, A., Morris, R. M. & Lamzin, V. S. (1999). Nature Struct. Biol. 6, 458-463.
Rajakannan, V., Velmurugan, D., Yamane, T., Dauter, Z., Dauter, M., Tsai, M. D.
& Sekar, K. (2002). Japanese Crystallographic Society Meeting, Poster, P3-I-22, 84.
Rajakannan, V., Yamane, T., Shirai, T., Kobayashi, T., Ito, S. &
Velmurugan, D. (2003). International Symposium on Diffraction
Structural Biology, Tsukuba, Japan, 28-31 May 2003, Poster P-085.
Rajakannan, V., Yamane, T., Shirai, T., Kobayashi, T., Ito, S. &
Velmurugan, D. (2004a). J. Synchrotron Rad. 11, 64-67.
Rajakannan, V., Selvanayagam, S., Yamane, T., Shirai, T., Kobayashi, T.,
Ito, S. & Velmurugan, D. (2004b). J. Synchrotron Rad. 11, 358-362.
Selvanayagam, S., Velmurugan, D. & Yamane, T. (2004). Asian
Crystallographic Association Meeting (AsCA'04), Poster P0165.
Velmurugan, D., Rajakannan, V., Yamane, T., Dauter, Z. & Sekar, K.
(2002). Japanese Crystallographic Society Meeting, Poster, P3-II-26, 122.
Yao, J. -X. (2002). Acta Cryst. D58, 1941-1947.