FASTER WAY OF OBTAINING THE THREE-DIMENSIONAL
STRUCTURE OF METALLOPROTEINS/ENZYMES
By
Prof. D. VELMURUGAN
DEPARTMENT OF CRYSTALLOGRAPHY &
BIOPHYSICS
UNIVERSITY OF MADRAS
GUINDY CAMPUS
CHENNAI – 600 025
Abstract:
A recent PDB search reveals a total of
35,026 entries, with 143 structures deposited
during the first week of February 2006. The growing
number of depositions is due to advances in
methods and experimental techniques, and to
automation at every stage of macromolecular
crystallography. Many macromolecular structures
are now solved using the Single-wavelength
Anomalous Diffraction (SAD) technique rather
than the Multi-wavelength Anomalous
Diffraction (MAD) technique. With recent
developments in detector technology,
even small intensity differences
between Bijvoet pairs can be detected during
synchrotron data collection, and hence the SAD
technique is becoming more popular. This procedure
also reduces the data collection time on synchrotron
beam lines to about two-thirds, which also limits
radiation damage. The presentation covers the basics of
the MAD and SAD techniques and demonstrates
the high-throughput crystallographic application of
the SAD technique, which is essential in structural
genomics, to two macromolecules.
Introduction
A total of 35,026 protein structures are
deposited in the Protein Data Bank (PDB), and more than
200,000 sequence (SWISS-PROT) entries exist for
which three-dimensional structures are not available.
One of the main interests in the molecular
biosciences is in understanding structure-function
relations, and X-ray crystallography plays a major role in
this.
X-ray crystallography has proved a very
versatile method, with most globular
macromolecules proving to be crystallizable, and
with no limitations on the size and complexity of
the macromolecules.
In structural genomics, one is interested in
determining structures in the fastest way to
understand new folds, and this has opened up
"High Throughput Crystallography". An
understanding of the three-dimensional structure
helps to explain the function of the molecule.
The High Throughput Crystallography
Consortium was developed to refine and
extend the powerful software tools that drive
forward the development and validation of
rapid methods for X-ray structure
determination, protein model building,
refinement and structure validation.
X-ray crystallography has become a
central tool in modern drug and target
discovery, providing important insights into
molecular interactions and biological function.
The burgeoning of many structural genomics
initiatives requires that many hundreds, perhaps
thousands of macromolecular structures are
determined rapidly and reliably. Increasing attention
is thus being focused on the use of automation in all
aspects of macromolecular structure determination.
Progress is being made in the automation of
sample changing and sample characterization, and
methods that address the automation of phasing and
model building procedures have been available for
some time.
However, the automatic production of usable
experimental data to feed these later processes
remains problematic.
Direct methods are highly successful in solving
small molecular crystal structures for which data are
always available at atomic resolution (AR). These
methods fail when applied to macromolecules due to
the poor validity of the probabilistic estimates of
phase relationships in this situation where the total
number of atoms in the unit cell becomes very large.
The diffraction data available for macromolecular
crystals are not usually at an atomic resolution. For
these reasons, direct methods cannot be used as such
to solve macromolecules.
Recently there has been tremendous interest
in the use of direct methods for phase
determination of macromolecules. This surge of
interest has primarily resulted from two factors: the
ability to obtain atomic resolution data in favorable
cases, and the development of powerful phasing
methods, including traditional direct methods, the
so-called "half-baked" approach, and combinations of
direct methods with isomorphous replacement and/or
anomalous scattering.
Macromolecular crystallography has now
evolved to such an extent that structural genomics
projects aiming at rapidly solving a large number of
new structures in a short time are actively and
successfully pursued in many laboratories. This is
possible owing to the advances that have taken place in
data-collection facilities: more intense X-ray
sources, in particular dedicated synchrotron beam
lines; highly efficient two-dimensional detectors in
the form of imaging plates and, more recently,
charge-coupled devices (CCDs); and cryogenic vitrification to
alleviate the effects of radiation damage and extend
the resolution of the data accessible.
With the above advances, more data sets
are being collected at atomic resolution (AR). The
possibility of getting AR data even for
macromolecules prompted direct-methods
practitioners to attempt to extend direct
methods to macromolecular structure
determination.
Acronyms for phasing techniques
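• MR: molecular replacement
• SIR: single isomorphous replacement
• MIR: multiple isomorphous replacement
• SIRAS: single isomorphous replacement with anomalous scattering
• MIRAS: multiple isomorphous replacement with anomalous scattering
• MAD: multi-wavelength anomalous diffraction
• SAD: single-wavelength anomalous diffraction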
Anomalous scattering
In normal scattering, electrons are treated as if they were
'free' electrons and the scattering factor is defined as
f = f0
which is proportional to the electron number of the atom.
However, electrons are not really 'free', especially for atoms
with large atomic numbers. These atoms may also absorb X-rays
of specific wavelengths at and near their absorption
edges, leading to anomalous scattering.
Many heavy atoms have absorption edges within the range of
X-ray wavelengths normally used for crystallography. Absorption
edges for light atoms such as C, N and O are not near these
wavelengths, so these atoms contribute very little
to anomalous scattering.
Anomalous scattering
In the presence of anomalous scattering, the atomic scattering
factor becomes
f = f0 + f' + if''
f0 is the normal scattering factor.
It drops with resolution and is proportional to the atomic
number.
f' and f'' are the anomalous scattering factors; f' can be positive or
negative, and f'' is 90° ahead in phase relative to f0.
They are wavelength dependent but do not change with
resolution.
Anomalous scattering leads to a breakdown of Friedel's law:
I(hkl) ≠ I(-h -k -l)
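The breakdown of Friedel's law can be checked numerically. The sketch below is only an illustration under stated assumptions: a hypothetical two-atom structure (one Zn-like anomalous scatterer at the origin, one carbon at x = 0.25), and f' and f'' values that are illustrative numbers rather than tabulated ones. The names `structure_factor` and `intensity` are introduced here.

```python
import cmath
import math

def structure_factor(hkl, atoms):
    """F(hkl) = sum_j f_j * exp(2*pi*i*(h*x_j + k*y_j + l*z_j))."""
    h, k, l = hkl
    return sum(f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
               for f, (x, y, z) in atoms)

def intensity(hkl, atoms):
    """Diffracted intensity, proportional to |F(hkl)|^2."""
    return abs(structure_factor(hkl, atoms)) ** 2

# Hypothetical values (not tabulated): f0 = 30, f' = -1.5, f'' = 2.5 for the
# anomalous scatterer; f0 = 6 for carbon, treated as a purely normal scatterer.
F_PRIME, F_DOUBLE_PRIME = -1.5, 2.5
carbon = (6.0, (0.25, 0.0, 0.0))
normal = [(30.0, (0.0, 0.0, 0.0)), carbon]
anomalous = [(30.0 + F_PRIME + 1j * F_DOUBLE_PRIME, (0.0, 0.0, 0.0)), carbon]

hkl, minus_hkl = (1, 0, 0), (-1, 0, 0)
diff_normal = intensity(hkl, normal) - intensity(minus_hkl, normal)
diff_anomalous = intensity(hkl, anomalous) - intensity(minus_hkl, anomalous)
print(f"without f'': I(hkl) - I(-h-k-l) = {diff_normal:.3f}")
print(f"with f'':    I(hkl) - I(-h-k-l) = {diff_anomalous:.3f}")
```

With f'' set to zero the two intensities agree; with a non-zero f'' they differ, and that difference is the Bijvoet signal that SAD and MAD exploit.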
The main phasing technique in structural genomics (MAD)
can be improved by making use of direct methods
[Figure: MAD phasing in structural genomics]
• Structural genomics: systematic study of protein structures
based on genome sequencing, for better understanding of life
processes and improving human health
• Synchrotron radiation: high intensity and tunable wavelength
• Multi-wavelength Anomalous Diffraction
• Selenomethionine enrichment: replacing S atoms with Se atoms
Figure credit: Ethan A Merritt, ©1996-2000,
Biomolecular Structure Center at UW
MAD
The multiwavelength anomalous dispersion
(MAD) method has risen to a position of preeminence amongst experimental phasing methods
and it is now a straightforward and widely accepted
technique for producing de novo phase information
for use in macromolecular structure determination.
Advantages of MAD
• All data is collected from one crystal
– Perfect isomorphism
• Fast
• Easily interpretable electron density maps obtained
right away.
However, the MAD method needs
special equipment, such as energy-dispersive
fluorescence detectors on beam lines, and
requires careful, accurate data collection
at a number (typically three) of wavelengths.
MAD experiments therefore place great
demands on instrumentation reliability,
reproducibility and stability. The potentially
more serious problem for this technique is
radiation damage, which can severely limit the
amount of data collected from the crystal
sample.
As a result of this drawback, there has recently
been a great deal of interest in using single-wavelength
anomalous diffraction (SAD or SAS) data
in the elucidation of macromolecular structures, with
investigations showing that the SAD technique may be
applied to many diverse problems, ranging from weak
anomalous signals to highly complex substructures.
The SAD experiment is straightforward, and data can be
collected in the standard way.
In principle, the SAD method is used with data
collected close to an absorption edge of the anomalous
scatterer in the sample under investigation.
SAD
Single-wavelength anomalous diffraction (SAD)
phasing has become increasingly popular in protein
crystallography.
Two main steps:
1) obtaining the initial phases
2) improving the electron density map
calculated with the initial phases.
The essential point is to break the intrinsic phase
ambiguity.
Two kinds of phase information enable the
discrimination of phase doublets from SAD data prior
to density modification:
• From heavy atoms (expressed by the Sim distribution)
• From direct-methods phase relationships (expressed
by the Cochran distribution)
Many protein structures have been solved by the
SAD method with heavy atoms such as Se, Pt, Au, Hg,
etc. using synchrotron X-rays with wavelengths near
their absorption edge.
The use of S atoms for SAD phasing is
especially attractive as S atoms are present in almost
all proteins (as methionine or cysteine residues) and
thus neither modification such as SeMet substitution
nor heavy-atom soaking is necessary for structure
analysis.
SAD phasing relies on the presence of
'anomalously' scattering atoms that cause the violation
of Friedel's law. The differences in Bijvoet-related
intensities, the so-called anomalous differences, are
used for substructure solution and subsequent phasing.
Because these differences are expected to be only a small
fraction of the total signal for each reflection, accurate
measurements and statistical treatment of the errors
are vital for a successful structure solution.
Once experimental intensity data have been
collected and processed, in the majority of cases
structure determination using the SAD technique
proceeds via a three-step process. Firstly, the
determination of the positions of the anomalous
scatterers is carried out; phases are then developed in
order to produce electron-density maps and in the final
stage, these are interpreted using either manual or
automatic methods to produce a starting model for
refinement procedures.
Description of the program
Information on anomalous scattering is important
for the determination of protein structures. However,
the one-wavelength anomalous-scattering (OAS)
method yields two possible phase solutions for each reflection,
which is known as the problem of phase ambiguity. If a
method can be found to resolve the ambiguity, the OAS
method becomes a useful technique in protein
crystallography, since a protein structure can then be
solved either by skipping the step of heavy-atom
derivative preparation, if the protein contains suitable
anomalous scatterers, or by using only one heavy-atom
derivative, which need not be isomorphous to the native protein.
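The two-fold ambiguity can be made concrete with a small numerical sketch. It assumes the standard decomposition F(hkl) = F0 + F'' and conj(F(-h-k-l)) = F0 - F'', where F'' is the contribution of the imaginary scattering parts of the already located anomalous scatterers. The function name `phase_doublet` and the numbers are hypothetical, and this is a geometric illustration, not the procedure of any particular program.

```python
import cmath
import math

def phase_doublet(f_plus, f_minus, f_dd):
    """Return the two candidate phases (radians) of the normal part F0.

    f_plus, f_minus: measured |F(hkl)| and |F(-h-k-l)|.
    f_dd: complex F'' computed from the known anomalous-scatterer sites.
    """
    f_dd_mod, phi_dd = abs(f_dd), cmath.phase(f_dd)
    # |F+|^2 + |F-|^2 = 2|F0|^2 + 2|F''|^2
    f0_mod = math.sqrt((f_plus ** 2 + f_minus ** 2) / 2.0 - f_dd_mod ** 2)
    # |F+|^2 - |F-|^2 = 4 |F0| |F''| cos(phi0 - phi'')
    cos_delta = (f_plus ** 2 - f_minus ** 2) / (4.0 * f0_mod * f_dd_mod)
    delta = math.acos(max(-1.0, min(1.0, cos_delta)))
    # Only cos(delta) is measured, so the sign of delta stays undetermined.
    return phi_dd + delta, phi_dd - delta

# Hypothetical reflection; f0_true is what the experiment cannot observe directly.
f0_true = complex(28.5, 6.0)
f_dd = complex(0.0, 2.5)
f_plus = abs(f0_true + f_dd)
f_minus = abs(f0_true - f_dd)

candidates = phase_doublet(f_plus, f_minus, f_dd)
true_phase = cmath.phase(f0_true)
print("candidates (deg):", [round(math.degrees(c), 2) for c in candidates])
print("true phase (deg):", round(math.degrees(true_phase), 2))
```

One of the two candidates coincides with the true phase, but the measured amplitudes alone cannot say which one; that choice is the sign problem that additional phase information has to settle.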
Attempts to resolve the phase ambiguity arising
from the OAS technique by direct methods have been
made since the 1960s, and have succeeded in
deriving a large number of three-phase structure
invariants from the error-free data of a model protein
structure.
The phase problem is reduced to a sign
problem once the anomalous scatterers or the
replacing heavy atom sites are located.
OASIS is a computer program that uses
a direct-methods procedure to break the phase
ambiguity intrinsic to one-wavelength anomalous
scattering or single isomorphous replacement data.
All Friedel pairs (including centric reflections)
are evaluated. It adopts the CCP4 format and is
written in Fortran 77. The X-ray diffraction
data and the heavy-atom sites are the inputs for this
program. The resulting phase sets are then
subjected to density modification.
Flowchart of program OASIS
Enter E values and Sigma-2 relationships
→ Set P+ = 0.5
→ Calculate best phase and FOM (mh)
→ Calculate P+(h)
→ Enough cycles?
No: repeat with the updated P+(h)
Yes: Output
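The "best phase and FOM" step of the flowchart can be sketched as the probability-weighted centroid of the two candidate phases. This is a generic centroid calculation under the assumption that the candidates are phi'' + |delta| and phi'' - |delta| and that P+ is the probability of the '+' choice; the program's own formulae may include further terms (for example for experimental errors), and `best_phase_and_fom` is a name introduced here.

```python
import cmath
import math

def best_phase_and_fom(phi_dd, delta, p_plus):
    """Centroid of the two candidate phases phi'' + |delta| and phi'' - |delta|.

    p_plus is the probability that the '+' candidate is the correct one.
    Returns (best phase in radians, figure of merit between 0 and 1).
    """
    delta = abs(delta)
    centroid = (p_plus * cmath.exp(1j * (phi_dd + delta))
                + (1.0 - p_plus) * cmath.exp(1j * (phi_dd - delta)))
    return cmath.phase(centroid), abs(centroid)

phi_dd, delta = math.radians(30.0), math.radians(60.0)
start = best_phase_and_fom(phi_dd, delta, 0.5)     # no sign information yet
resolved = best_phase_and_fom(phi_dd, delta, 1.0)  # '+' candidate certain

for label, (phase, fom) in (("P+ = 0.5", start), ("P+ = 1.0", resolved)):
    print(f"{label}: best phase = {math.degrees(phase):.1f} deg, FOM = {fom:.2f}")
```

Starting from P+ = 0.5 the best phase sits midway between the two candidates with a reduced figure of merit; as the cycles push P+ towards 0 or 1, the best phase moves to one candidate and the figure of merit rises towards 1.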
The structure solution program SHELXD is
useful for locating heavy atoms or anomalous
scatterers from SIR, SAD, SIRAS or MAD data. It
uses iterative dual-space direct methods based on
phase refinement in reciprocal space and peak
picking in real space. SHELXD locates relatively
large numbers of anomalous scatterers efficiently
from MAD or SAD data. Truncation of the data at a
particular resolution in the range 3.0 - 3.5 Å can be
critical to success. The efficiency can be improved
by roughly an order of magnitude by Patterson-based
seeding instead of starting from random
phases or sites.
The program SHELXE reads the heavy-atom sites
written by SHELXD and estimates the native phases and
the corresponding weights (figures of merit).
SHELXE outputs the phases in XtalView format.
The map can be viewed with interactive graphics, and the
phases can be improved by density modification.
The phases obtained from SHELXE and OASIS
are of sufficient quality to allow automated model
building to be carried out using ARP/wARP, followed
by refinement with the program REFMAC.
Here the approach is extended to
(i) high-throughput structure elucidation with 1.7 Å
resolution anomalous scattering synchrotron data of
thermolysin (approximately 34 kDa molecular weight),
and with 2 Å and 2.1 Å truncated data sets obtained
from it, using either one zinc position or seven
heavy-atom positions;
(ii) 1.45 Å resolution anomalous scattering synchrotron
data of glucose isomerase (approximately 44 kDa molecular
weight), and 1.9 Å and 2.1 Å truncated data sets obtained
from it, using either one manganese position or eleven
heavy-atom positions;
(iii) 1.7 Å resolution anomalous scattering laboratory-source
Cu Kα data of glucose isomerase, and 2.1 Å and 2.2 Å
truncated data sets obtained from it, using either one
manganese position or eleven heavy-atom positions.
All the computations mentioned here were carried out on a
Pentium IV PC.
The flowchart of the present work is shown below.
Anomalous scattering data
→ Substructure determination using SHELXD
→ Phasing with SHELXE/OASIS
→ Automatic model building
→ Satisfactory?
No: get partial structure and iterate
Yes: Output
Thermolysin (1.7 Å Synchrotron Data)
PDB i.d. : 1FJQ
Total residues: 316
The diffraction data were collected at a
temperature of 100 K on the X9B synchrotron
beamline at the National Synchrotron Light Source
(Brookhaven National Laboratory, USA) using the
ADSC Quantum4 CCD detector. This enzyme contains
316 residues, one Zn site and four calcium ions.
The position of the anomalous scatterer in this
enzyme (Zn) was located with the direct-methods program
SHELXD. It gives three positions with a CC value of
51.52.
SHELXD output
REM TRY 38 CC 51.52 CC(weak) 32.67 TIME 127 SECS
ZN01 1 0.880539 0.549049 0.054595 1.0000 0.2
ZN02 1 0.907318 0.668282 -0.112321 0.0870 0.2
ZN03 1 0.664932 0.547234 -0.001291 0.0563 0.2
OASIS and SHELXE input peak
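Picking the topmost peak from a site list like the one above is easy to script. The sketch below assumes the columns are site name, scattering-factor number, fractional x, y, z, relative occupancy and U, as in SHELX-style atom lines; the 0.5 occupancy cut-off and the names `Site` and `parse_sites` are hypothetical choices made for illustration.

```python
from typing import NamedTuple

class Site(NamedTuple):
    name: str
    sfac: int
    x: float
    y: float
    z: float
    occupancy: float
    u_iso: float

def parse_sites(text):
    """Parse atom lines with 7 whitespace-separated fields; skip everything else."""
    sites = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 7 or fields[0].upper() == "REM":
            continue
        name, sfac, *numbers = fields
        try:
            sites.append(Site(name, int(sfac), *map(float, numbers)))
        except ValueError:
            continue  # not an atom line after all
    return sites

SHELXD_OUTPUT = """\
REM TRY 38 CC 51.52 CC(weak) 32.67 TIME 127 SECS
ZN01 1 0.880539 0.549049 0.054595 1.0000 0.2
ZN02 1 0.907318 0.668282 -0.112321 0.0870 0.2
ZN03 1 0.664932 0.547234 -0.001291 0.0563 0.2
"""

sites = parse_sites(SHELXD_OUTPUT)
# Hypothetical cut-off: keep only sites with relative occupancy of at least 0.5.
strong = [s for s in sites if s.occupancy >= 0.5]
for s in strong:
    print(s.name, s.x, s.y, s.z, s.occupancy)
```

For the thermolysin list this keeps only ZN01, in line with using a single peak as input to OASIS and SHELXE.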
OASIS was run for the topmost peak obtained from
SHELXD. The input coordinate file was written in the
heavy-atom format using the coordinate option in CCP4.
The density-modification program DM from the CCP4
suite was used for phase refinement. The program was
run under default control in the recommended mode,
which performs solvent flattening, histogram matching and
multi-resolution density modification. Automated
model building was carried out with ARP/wARP using these
modified phases. ARP/wARP started with Rw and Rf values
of 45.2 and 46.3%, respectively. The first 50 cycles of
ARP/wARP built 75 out of 316 residues in 9
chains with a connectivity index of 0.76. At this stage, the Rw
and Rf values were 28.7 and 47.9%, respectively.
An iterative cycle of ARP/wARP was carried out
with this as input, which revealed 185 out of 316
residues with a connectivity index of 0.83. A further
iterative cycle of ARP/wARP with this output as
input revealed 309 out of 316 residues in 3
chains with a connectivity index of 0.98. At this stage, the
Rw and Rf values were 16.9 and 21.8%, respectively.
Manual model building was carried out for the missing
residues, and solvent atoms were updated after the
refinement using the ARP/wARP 'build solvent atoms' script.
The final Rw and Rf values were 18.1 and 20.5%,
respectively. The backbone of this final model was
superimposed on PDB 1FJQ. The root-mean-square
deviation was 0.340 Å.
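The Rw and Rf values quoted throughout are not defined on the slides; the sketch below assumes they are the conventional working and free R factors, R = Σ| |Fo| - |Fc| | / Σ|Fo|, evaluated over the working set and the free (test) set of reflections. The function names and the four amplitudes are hypothetical.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fo| - |Fc| |) / sum(|Fo|) over the given reflections."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty amplitude lists of equal length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(abs(fo) for fo in f_obs)

def work_and_free(f_obs, f_calc, free_flags):
    """Split reflections by free flag and return (R-work, R-free)."""
    work = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, free_flags) if not free]
    test = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, free_flags) if free]
    return r_factor(*zip(*work)), r_factor(*zip(*test))

# Hypothetical amplitudes; the last reflection belongs to the free set.
f_obs = [100.0, 50.0, 80.0, 20.0]
f_calc = [90.0, 55.0, 80.0, 25.0]
free_flags = [False, False, False, True]

r_work, r_free = work_and_free(f_obs, f_calc, free_flags)
print(f"R-work = {100 * r_work:.1f} %, R-free = {100 * r_free:.1f} %")
```

Because the free reflections are never used in refinement, a free R factor that stays far above the working one, as in the early ARP/wARP cycles above, signals a model that is still incomplete.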
Details of OASIS and ARP/wARP results
PDB i.d.: 1FJQ
Total residues: 316
Input: 1 SHELXD peak to OASIS
Auto built: 309 residues
Time taken: 4 hours
[Figure: Superposition of the Cα atoms of the
current model with 1FJQ (red); N and C termini labelled]
[Figure: Final model superposed with the OASIS map and the final (2|Fo| - |Fc|) map (1σ)]
We also attempted the above case using the
SHELXD / SHELXE / ARP/wARP / REFMAC
approach. It built 310 out of 316
residues within 50 cycles of auto-building using
ARP/wARP. The time was also reduced to 2 hours with
this approach.
PDB i.d.: 1FJQ
Total residues: 316
Input: 1 SHELXD peak to SHELXE
Auto built: 310 residues
Time taken: 2 hours
r.m.s.d.: 0.307 Å
[Figure: Superposition of the Cα atoms of the
current model with 1FJQ (red); N and C termini labelled]
[Figure: Final model superposed with the SHELXE map and the final (2|Fo| - |Fc|) map (1σ)]
A data set truncated at 2 Å resolution was
prepared from the 1.7 Å data of this enzyme using the
SCALEPACK2MTZ option in CCP4, and SHELXD
gave three positions with a CC value of 54.20. The
topmost peak was given to SHELXE for phasing,
and the CC increased to 69.45. The phases
were then fed to ARP/wARP and REFMAC. Ten
cycles of auto-building, with five cycles of
REFMAC in each auto-building cycle, were
performed. Finally ARP/wARP was able to build
311 out of 316 residues in four chains. At this
stage, the Rw and Rf values without dummy atoms
were 25.5 and 27.1%, respectively.
SHELXD output for 2 Å data
ZN01 1 0.119400 0.450890 0.054199 1.0000 0.2
ZN02 1 0.139931 0.384232 -0.035821 0.1933 0.2
ZN03 1 0.093246 0.331573 -0.112710 0.0715 0.2
SHELXE input peak
Manual model building was carried out for
the missing residues, 20 cycles of
maximum-likelihood refinement were
performed using REFMAC, and solvent atoms
were updated after the refinement using the
ARP/wARP 'build solvent atoms' script. The
final Rw and Rf values were 15.4 and 21.0%,
respectively. The backbone of this final model
was superimposed on PDB 1FJQ. The root-mean-square
deviation was 0.332 Å.
Details of SHELXE and ARP/wARP results for 2 Å data
PDB i.d.: 1FJQ
Total residues: 316
Input: 1 SHELXD peak to SHELXE for 2 Å data
Auto built: 311 residues
Time taken: 2 hours
[Figure: Final model superposed with the SHELXE map and the final (2|Fo| - |Fc|) map (1σ)]
Our attempt to obtain a good model
from the data truncated at 2.1 Å resolution using one
zinc position failed. We therefore increased the
phasing power by using seven heavy-atom
positions (1 Zn + 4 Ca + 2 S) in the SHELXD / SHELXE /
ARP/wARP / REFMAC approach instead of
one zinc position. This procedure built 313
out of 316 residues in 200 cycles of auto-building
using ARP/wARP.
SHELXE input for 2.1 Å truncated data
ZN01 1 0.880127 0.549591 0.054520 1.0000 0.2
CA02 2 0.560875 0.433609 0.124032 0.1940 0.2
CA03 2 0.858528 0.615112 -0.034950 0.1858 0.2
CA04 2 0.784294 0.489799 -0.082435 0.1517 0.2
CA05 2 0.867302 0.625244 -0.066635 0.1493 0.2
S006 3 0.717163 0.460991 0.014268 0.1153 0.2
S007 3 0.906685 0.668312 -0.111915 0.0905 0.2
Details of SHELXD, SHELXE and ARP/wARP results for 2.1 Å data
PDB i.d.: 1FJQ
Total residues: 316
Input: 7 SHELXD peaks to SHELXE for 2.1 Å data
Auto built: 313 residues
Time taken: 5 hours
[Figure: Superposition of the Cα atoms of the
current model with 1FJQ (red); N and C termini labelled]
[Figure: Final model superposed with the SHELXE map and the final (2|Fo| - |Fc|) map (1σ)]
Glucose isomerase (388 a.a.) (1.45 Å synchrotron data)
PDB i.d: 1OAD (Two Molecules)
Violet – Mn and Orange - Mg
Ramagopal et al., 2003 (Acta Cryst. D59, 868-875)
have presented the SAD phasing details of
glucose isomerase. In that paper they focused on
SAD phasing with MLPHARE and DM using the
manganese position. Here we focus on SAD
phasing with SHELXE and OASIS. The enzyme
contains 388 amino acids and two metal sites, one
occupied by an Mn2+ ion and the other by Mg2+. The
data were collected at a wavelength of 0.98 Å and
belong to space group I222. The K X-ray
absorption edge of manganese lies at 1.90 Å, and at
the wavelength used in this experiment the
imaginary component (f'') of manganese is 1.3
electron units.
The strongest anomalous scattering is
provided by Mn, especially at shorter wavelengths
where the anomalous effect of sulfur is very small.
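To get a feeling for how small this signal is, the widely used estimate of the expected Bijvoet ratio, <|ΔF|>/<|F|> ≈ √2 · √(N_A/N_P) · f''/Z_eff with Z_eff ≈ 6.7 for protein atoms, can be evaluated for one Mn site. The sketch below is an order-of-magnitude estimate under stated assumptions: about 7.8 non-hydrogen atoms per residue is an assumed average, not a number from this work, and `expected_bijvoet_ratio` is a name introduced here.

```python
import math

def expected_bijvoet_ratio(n_anomalous, f_double_prime, n_protein_atoms, z_eff=6.7):
    """Estimate <|dF|>/<|F|> = sqrt(2) * sqrt(N_A / N_P) * f'' / Z_eff."""
    return (math.sqrt(2.0) * math.sqrt(n_anomalous / n_protein_atoms)
            * f_double_prime / z_eff)

ATOMS_PER_RESIDUE = 7.8  # assumed average number of non-hydrogen atoms per residue
N_RESIDUES = 388         # glucose isomerase, from the text
F_DOUBLE_PRIME_MN = 1.3  # f'' of Mn at 0.98 A, from the text

ratio = expected_bijvoet_ratio(1, F_DOUBLE_PRIME_MN, N_RESIDUES * ATOMS_PER_RESIDUE)
print(f"expected Bijvoet ratio: {100 * ratio:.2f} %")
```

Under these assumptions the estimate comes out at roughly half a percent of the average amplitude, which is why accurate measurement of the Bijvoet pairs matters so much here.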
The first step in all phasing procedures based
on the anomalous diffraction effect is the solution
of the partial structure of anomalous scatterers.
The anomalous scatterer in this
enzyme (Mn2+) was located with the direct-methods
program SHELXD, which gives three
positions with a CC value of 29.69.
SHELXD output
REM TRY 76 CC 29.69 CC(weak) 19.16 TIME 119 SECS
MN01 1 0.583054 0.133270 0.066371 1.0000 0.2
MN02 1 0.631714 0.147301 0.080120 0.2927 0.2
MN03 1 0.612625 0.175293 0.241702 0.2350 0.2
OASIS and SHELXE input peak
OASIS was run for the topmost peak
obtained from SHELXD. Density modification
using the CCP4 program DM was then applied to
the resulting phase sets. Automated model
building was carried out with ARP/wARP using
these modified phases. Finally ARP/wARP was
able to build 385 out of 388 residues in two chains
with a connectivity index of 0.99. At this stage, the
Rw and Rf values were 16.9 and 20.3%, respectively.
Manual model building was carried out for the
missing residues, and solvent atoms were updated after
the refinement using the ARP/wARP 'build solvent atoms'
script. The final Rw and Rf values were 17.5 and
19.3%, respectively. The backbone of this final model
was superimposed on the P21212 form of this enzyme.
The root-mean-square deviation was 0.170 Å.
Details of OASIS and ARP/wARP results
PDB i.d: 1OAD (first molecule)
Total residues: 388 a.a.
Input: 1 SHELXD peak to OASIS
Auto built: 385 residues
Time taken: 2 hours
[Figure: Superposition of the Cα atoms of the
current model with 1OAD (red); N and C termini labelled]
[Figure: Final model superposed with the OASIS map and the final (2|Fo| - |Fc|) map (1σ)]
We also attempted the SHELXD / SHELXE /
ARP/wARP / REFMAC approach. It built 384
out of 388 residues within 50 cycles of
auto-building.
PDB i.d: 1OAD (first molecule)
Total residues: 388 a.a
Input: 1 SHELXD peak to SHELXE
Auto built: 384 residues
Time taken: 2 hours
r.m.s.d.: 0.184 Å
[Figure: Superposition of the Cα atoms of the
current model with 1OAD (red); N and C termini labelled]
[Figure: Final model superposed with the SHELXE map and the final (2|Fo| - |Fc|) map (1σ)]
A data set truncated at 1.9 Å resolution was
prepared from the 1.45 Å data of this enzyme using the
SCALEPACK2MTZ option in CCP4, and SHELXD
gave three positions with a CC value of 31.95. The
topmost peak was given to SHELXE for phasing,
and the CC increased to 68.26. The phases
were then fed to ARP/wARP and REFMAC. Ten
cycles of auto-building, with five cycles of
REFMAC in each auto-building cycle, were
performed. Finally ARP/wARP was able to build
383 out of 388 residues and 328 water atoms in two
chains. At this stage, the Rw and Rf values were 16.8
and 21.8%, respectively.
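The resolution cut-off behind such truncated data sets can be illustrated for an orthorhombic cell, the crystal system of space group I222, where 1/d² = h²/a² + k²/b² + l²/c². This is not the CCP4 procedure used in the work, only a minimal sketch of the filter it implies; the cell edges and reflection list are hypothetical, and the function names are introduced here.

```python
import math

def d_spacing_orthorhombic(hkl, cell):
    """Resolution d (in A) of reflection hkl for an orthorhombic cell (a, b, c)."""
    h, k, l = hkl
    a, b, c = cell
    inv_d_sq = (h / a) ** 2 + (k / b) ** 2 + (l / c) ** 2
    if inv_d_sq == 0.0:
        raise ValueError("d-spacing is undefined for the (0, 0, 0) reflection")
    return 1.0 / math.sqrt(inv_d_sq)

def truncate(reflections, cell, d_min):
    """Keep only reflections whose d-spacing is at least d_min."""
    return [hkl for hkl in reflections
            if d_spacing_orthorhombic(hkl, cell) >= d_min]

# Hypothetical cell edges in A and a few reflections with h + k + l even,
# as required by an I-centred lattice.
cell = (90.0, 100.0, 110.0)
reflections = [(46, 0, 0), (0, 60, 0), (0, 0, 56), (2, 4, 6)]

truncated = truncate(reflections, cell, d_min=1.9)
for hkl in reflections:
    d = d_spacing_orthorhombic(hkl, cell)
    print(hkl, f"d = {d:.3f} A", "kept" if hkl in truncated else "dropped")
```

Lowering the cut-off from 1.45 Å to 1.9 Å or 2.1 Å in this way removes the highest-angle reflections and so tests the method on data of more typical resolution.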
The map also showed density in the
missing region, so manual model building was
carried out (using XtalView) for the missing
residues. After the manual model building, the water
atoms were checked and included where necessary, and
25 cycles of maximum-likelihood refinement were
performed using REFMAC. The final Rw and Rf
values were 16.4 and 19.5%, respectively. The
backbone of this final model was superimposed on
the P21212 form of this enzyme. The
root-mean-square deviation was 0.170 Å.
Details of SHELXE and ARP/wARP results
PDB i.d: 1OAD (first molecule)
Total residues: 388 a.a
Input: 1 SHELXD peak to SHELXE
Auto built: 383 a.a.
[Figure: Superposition of the Cα atoms of the
current model with 1OAD (red); N and C termini labelled]
[Figure: Final model superposed with the SHELXE map and the final (2|Fo| - |Fc|) map (1σ)]
Our attempt to obtain a good model
from the data truncated at 2.1 Å resolution using one
manganese position failed. We therefore increased
the phasing power by using eleven heavy-atom
positions (1 Mn + 1 Mg + 9 S) in the SHELXD / SHELXE
/ ARP/wARP / REFMAC approach instead of
one manganese position. This procedure built 384
out of 388 residues in 125 cycles of
auto-building using ARP/wARP.
SHELXE input
MN01 1 0.416443 0.366409 0.066588 1.0000 0.2
MG02 2 0.372391 0.355301 0.081381 0.3145 0.2
S003 3 0.387909 0.322624 0.241415 0.2621 0.2
S004 3 0.294594 0.311020 0.104689 0.2541 0.2
S005 3 0.432205 0.348717 0.182898 0.2446 0.2
S006 3 0.329697 0.253654 0.230498 0.2331 0.2
S007 3 0.435989 0.407997 0.027930 0.2176 0.2
S008 3 0.499664 0.441917 0.222937 0.1980 0.2
S009 3 0.634033 0.274239 0.155318 0.1604 0.2
S010 3 0.273483 0.362343 -0.044306 0.1067 0.2
S011 3 0.218407 0.479584 0.250876 0.0840 0.2
Details of SHELXD, SHELXE and ARP/wARP results for 2.1 Å data
PDB i.d: 1OAD (first molecule)
Total residues: 388 a.a
Input: 11 SHELXD peaks to SHELXE
Auto built: 384 a.a.
[Figure: Superposition of the Cα atoms of the
current model with 1OAD (red); N and C termini labelled]
[Figure: Final model superposed with the SHELXE map and the final (2|Fo| - |Fc|) map (1σ)]
SAS application to lab source Cu Kα data of glucose
isomerase: 1.7 Å resolution data and 2.2 Å
truncated data
The data were collected at a wavelength of
1.54178 Å and belong to space group I222. The
anomalous scatterer in this
enzyme (Mn2+) was located with the direct-methods
program SHELXD, which gives three
positions with a CC value of 25.64.
MN01 1 0.083282 0.633141 0.065877 1.0000 0.2
SHELXE Input Peak
The topmost peak was given to SHELXE for
phasing, which ended with a CC value of 63.78. The
phases were then fed to ARP/wARP and REFMAC.
After the initial model was refined, ten cycles of
auto-building, with five cycles of REFMAC in
each auto-building cycle, were performed. Finally
ARP/wARP was able to build 384 out of 388
residues in two chains with 384 water atoms. At this
stage, the Rw and Rf values were 17.7 and 21.0%,
respectively. Without these water atoms the Rw and Rf
values were 24.6 and 26.4%, respectively. The map
also showed density in the missing region, so
manual model building was carried out for the
missing residues.
After the manual model building, water atoms
were checked and included if necessary. 25 cycles of
maximum-likelihood refinement were performed
using REFMAC. The final Rw and Rf values were
17.9 and 19.7%, respectively. The backbone of this
final model was superimposed with the one in
P21212 space group of this enzyme. The root-mean
square deviation was 0.150 Å.
Details of SHELXE and ARP/wARP results
PDB i.d: 1OAD (first molecule)
Total residues: 388 a.a
Input: 1 SHELXD peak to SHELXE
Auto built: 384 a.a.
[Figure: Final model superposed with the SHELXE map and the final (2|Fo| - |Fc|) map (1σ)]
A data set truncated at 2.1 Å resolution was
prepared from the 1.7 Å data of this enzyme using the
SCALEPACK2MTZ option in CCP4, and SHELXD
gave three positions with a CC value of 29.23. The
topmost peak was given to SHELXE for phasing,
and the CC increased to 61.92. The phases
were then fed to ARP/wARP and REFMAC. Ten
cycles of auto-building, with five cycles of
REFMAC in each auto-building cycle, were
performed. The first 50 cycles of ARP/wARP were
able to build 328 out of 388 residues in 11 chains
with a connectivity index of 0.94. At this stage, the Rw
and Rf values were 22.2 and 33.6%, respectively.
An iterative cycle of ARP/wARP was carried out
with this as input which revealed 384 residues out of 388
residues in 2 chains with a connectivity index of 0.99 and
301 water atoms. At this stage, the Rw and Rf values are
16.3 and 20.7%, respectively. Without these water atoms
Rw and Rf values are 21.6 and 24.5%, respectively. After
the manual model building, water atoms were checked and
added if necessary. 25 cycles of maximum-likelihood
refinement were performed using REFMAC. The final Rw
and Rf values were 16.9 and 20.5%, respectively. The
backbone of this final model was superimposed with the
one in P21212 space group of this enzyme. The root-mean
square deviation was 0.195 Å.
Details of SHELXE and ARP/wARP results
PDB i.d: 1OAD (first molecule)
Total residues: 388 a.a
Input: 1 SHELXD peak to SHELXE
Auto built: 384 a.a.
[Figure: Final model superposed with the SHELXE map and the final (2|Fo| - |Fc|) map (1σ)]
Our attempt to obtain a good model
from the data truncated at 2.2 Å resolution using one
manganese position failed. We therefore increased
the phasing power by using eleven heavy-atom
positions (1 Mn + 1 Mg + 9 S) in the SHELXD / SHELXE
/ ARP/wARP / REFMAC approach instead of
one manganese position. This procedure built 384
out of 388 residues in 50 cycles of
auto-building using ARP/wARP.
SHELXE input
MN01 1 0.083519 0.133224 0.065645 1.0000 0.2
MG02 2 0.065681 0.148941 0.185435 0.3804 0.2
S003 3 0.113022 0.172218 0.241358 0.3523 0.2
S004 3 0.203300 0.191154 0.103998 0.3331 0.2
S005 3 0.166092 0.245491 0.235132 0.3165 0.2
S006 3 0.128403 0.146370 0.079333 0.3109 0.2
S007 3 0.062347 0.089096 0.030566 0.3089 0.2
S008 3 -0.133377 0.222557 0.152354 0.2929 0.2
S009 3 -0.005554 0.061157 0.217989 0.2423 0.2
S010 3 0.276901 0.044327 0.137209 0.1962 0.2
S011 3 0.227646 0.123695 -0.044698 0.1891 0.2
PDB i.d: 1OAD (first molecule)
Total residues: 388 a.a
Input: 11 SHELXD peaks to SHELXE
Auto built: 384 a.a.
[Figure: Superposition of the Cα atoms of the current
models, 1.7 Å (blue) and 2.2 Å (magenta), with
1OAD (red); N and C termini labelled]
[Figure: Final model superposed with the SHELXE map and the final (2|Fo| - |Fc|) map (1σ)]
Conclusion
The work emphasizes the applicability of the
SAS technique to solving a macromolecular structure
when the data extend to 2.2 Å resolution. Many
proteins host light metals such as calcium,
manganese, potassium etc. as cofactors or recruit
them as stabilizing agents. These metals may
provide an opportunity to bypass the preparation
of heavy-atom derivatives or the incorporation of
selenomethionine residues into native sequences
and allow de novo crystal structure determination.
The above results demonstrate that the direct
method is capable of discriminating the correct
phase in the bimodal distribution of a protein
reflection by exploiting single-wavelength
anomalous scattering diffraction data which extend
to modest resolution. The combination of SAS data
and direct methods is a powerful approach for
resolving phases for protein structure
determination; its wider adoption would result in a
major saving of synchrotron-radiation experimental
time.
This work also adds substantial evidence that
even with single-wavelength anomalous scattering
data a macromolecular structure can be solved with
the existing sophisticated programs with the
knowledge of just one anomalous scatterer. This result
also suggests that an even smaller anomalous signal
contained in the real data is sufficient for the solution
of moderately large macromolecular structures by the
SAS approach. The SAS method could therefore play
an important part in the high-throughput completely
automatic procedures currently being planned for
structural genomics initiatives.
Acknowledgements
DV acknowledges S. Selvanayagam, Senior Research
Scholar, DCB, for his substantial contribution to the above work
and thanks Prof. Z. Dauter, USA, for providing anomalous
scattering data. Funding from the Department of Science and
Technology (DST) and the Department of Biotechnology (DBT),
Govt. of India, for this work is gratefully acknowledged. Prof. T.
Yamane, Department of Biotechnology, School of Engineering,
Nagoya University, Japan, is acknowledged for providing many
data sets.
THANK YOU