Making Pipelines with Phaser

Phaser Progress Report (MR)
Airlie McCoy
CCP4 Developers’ meeting
17th-19th March 2008
Phaser for Molecular Replacement
• Results from last year
• What we are doing at the moment
• Plan…
Phaser-2.1.1
12.11.2007
• Default packing criteria relaxed to allow a small number of clashes
• Default packing distance increased to identify intercalation of helices
• Composition can be estimated from solvent content
  • Default composition corresponds to 50% solvent
• If automated MR search finds some, but not all, components, the partial solution is output
• Output of PDB hybrid-36 atom numbers and two-character chain IDs (columns 21-22) for the largest structures
• Improved SAD phasing
  • SAD phasing starting from MR partial model
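
The hybrid-36 atom numbering mentioned above can be illustrated with a minimal sketch. It assumes the commonly described scheme: plain decimal while the number fits the field, then base-36 with upper-case digits, then base-36 with lower-case digits. The function names are illustrative, negative numbers are left out, and this is not Phaser's own code.

```python
UPPER = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
LOWER = UPPER.lower()


def _base36(value, digits):
    """Encode a non-negative integer in base 36 using the given digit set."""
    if value == 0:
        return digits[0]
    out = []
    while value:
        value, rem = divmod(value, 36)
        out.append(digits[rem])
    return "".join(reversed(out))


def hy36encode(width, value):
    """Encode a non-negative integer into a fixed-width hybrid-36 field.

    Assumption: only non-negative values are handled in this sketch.
    """
    if value < 0:
        raise ValueError("negative values are not handled in this sketch")
    if value < 10 ** width:
        # Still fits as an ordinary right-justified decimal number.
        return f"{value:>{width}d}"
    value -= 10 ** width
    span = 26 * 36 ** (width - 1)    # how many values start with a letter
    offset = 10 * 36 ** (width - 1)  # makes the first digit start at 'A'/'a'
    if value < span:
        return _base36(value + offset, UPPER)
    value -= span
    if value < span:
        return _base36(value + offset, LOWER)
    raise ValueError("value too large for the field width")


# Atom serial numbers occupy a 5-character field in a PDB file.
for serial in (99999, 100000, 100001):
    print(serial, hy36encode(5, serial))
```

With a 5-character field, the first serial number that no longer fits as decimal, 100000, is written as A0000, so existing files with fewer than 100000 atoms are unchanged.
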
Distribution
• The Phaser-2.1.1 module is available through the CCP4 automated downloads page
  • Thanks to Martyn
  • Thanks to Francois for the Windows installer
• Also distributed in phenix-1.3b
• Already up to phaser-2.1.3 owing to a few bug fixes in the phaser-2.1 series
• Now working on Phaser-2.2
  • Changed from cvs to svn: it is much better!
Current Developments…
• …by the current developers
• Gabor Bunkoczi
  • Automatic detection of model symmetry
  • Ongoing project of developing MR pipeline (phenix)
• Robert Oeffner
  • Parallelization
  • Minimizer
Symmetry of MR search model
• Symmetric model
  • Multiple identical solutions
  • Waste of time
• Automatic detection of point group symmetry
  • User friendly (black box)
  • Faster and less error-prone than user defined input
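
Why a symmetric model yields multiple identical solutions can be seen in a toy example. The sketch below is a 2D illustration under stated assumptions, not Phaser's algorithm: the "model" is a hypothetical dimer whose second chain is the first rotated by 180 degrees (C2), identical chains are treated as interchangeable, and all names are invented for the example.

```python
import math


def rotate(points, degrees):
    """Rotate 2D points about the origin."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


def same_placement(a, b, tol=1e-9):
    """True if every point of a coincides with some point of b (labels ignored)."""
    return len(a) == len(b) and all(
        any(math.dist(p, q) < tol for q in b) for p in a
    )


def prune(model, angles):
    """Keep only the orientations that give a placement not seen before."""
    kept = []
    for angle in angles:
        placed = rotate(model, angle)
        if not any(same_placement(placed, rotate(model, k)) for k in kept):
            kept.append(angle)
    return kept


# Hypothetical toy dimer: chain B is chain A rotated by 180 degrees (C2).
chain_a = [(1.0, 0.2), (0.3, 0.9)]
chain_b = rotate(chain_a, 180)
model = chain_a + chain_b

print(prune(model, [10, 40, 190, 220, 95]))  # prints [10, 40, 95]
```

Orientations 190 and 220 differ from 10 and 40 only by the model's own two-fold, so they place the atoms identically and are dropped. Knowing the point group in advance lets such duplicates be removed without scoring them.
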
[Figure: model symmetry flowchart. Labels: Model (Protein, RNA/DNA, Other); chains A to F related by symmetry operations; Point group; Model symmetry; Prune solutions]
Symmetry of MR search model
• Algorithm accounts for
  • breaks in chains
  • unequal number of residues in chains
• All chains in the model must conform to the point group symmetry
  • Prevents branching of tree search
• cf. crystallographic symmetry
  • 30% of the structures in the PDB use crystallographic symmetry to build the biologically active oligomer
  • Restricts search space
Parallelization
• Uses OpenMP
  • Portable, including Linux and Windows
  • Distributed with gcc 4.2 and above
  • Uses compiler directives
    • Code can be compiled without OpenMP
• Biggest GOTCHA is the “race condition”
  • A race condition is what happens when two threads of execution attempt to modify something at the same time
  • Can be very difficult to detect and locate
    • Not always obvious, e.g. variable++
Race condition
CLOCK  THREAD 0         THREAD 1
1      load v (v=0)
2      incr v (v=1)
3      Swapped out      load v (v=0)
4                       incr v (v=1)
5                       store v (v=1)
6      store v (v=1)    Swapped out
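
The interleaving in the race-condition slide can be replayed deterministically. The sketch below is a simulation only: it treats variable++ as three separate steps (load, incr, store) on a per-thread register, and steps through a fixed schedule rather than running real threads. The names `replay`, `RACY` and `SERIAL` are made up for the illustration.

```python
def replay(schedule):
    """Run (thread, operation) steps in order against one shared variable v."""
    shared_v = 0
    register = {}  # each thread's private copy of v
    for thread, op in schedule:
        if op == "load":
            register[thread] = shared_v
        elif op == "incr":
            register[thread] += 1
        elif op == "store":
            shared_v = register[thread]
        else:
            raise ValueError(f"unknown operation: {op}")
    return shared_v


# The schedule from the slide: thread 0 is swapped out before it stores.
RACY = [
    (0, "load"), (0, "incr"),
    (1, "load"), (1, "incr"), (1, "store"),
    (0, "store"),
]

# The same six steps with each increment completed before the next begins.
SERIAL = [
    (0, "load"), (0, "incr"), (0, "store"),
    (1, "load"), (1, "incr"), (1, "store"),
]

print(replay(RACY))    # prints 1: one increment is lost
print(replay(SERIAL))  # prints 2
```

Both threads executed variable++, yet the racy schedule leaves v at 1. In OpenMP code the usual remedies are an atomic or critical directive around the update, or a reduction clause on the loop.
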
Parallelization
• Must be done at a level where the overhead of splitting the jobs onto different processors doesn’t negate the gain from doing the splitting
  • Rice function sum over reflections is unsuitable
• Phaser threaded where there is a “progress bar”
  • LLG calculation in brute force RF and TF
  • LLG rescoring of top fast RF and TF solutions
  • LLG for 500 random points in RF and TF (for Z-score)
  • Elmn calculations for model and data
  • Ensembling
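
The role of the 500 random points can be sketched as follows. The example assumes the Z-score is the number of standard deviations by which a candidate LLG exceeds the mean LLG of the random sample. The LLG values are synthetic numbers drawn for illustration, and `z_score` is a hypothetical helper, not a Phaser function.

```python
import random
import statistics


def z_score(value, background):
    """Number of standard deviations by which value exceeds the background mean."""
    mean = statistics.mean(background)
    sd = statistics.stdev(background)
    if sd == 0:
        raise ValueError("background has no spread")
    return (value - mean) / sd


# Synthetic stand-in for LLG values at 500 random orientations or positions.
rng = random.Random(0)
random_llgs = [rng.gauss(10.0, 4.0) for _ in range(500)]

candidate_llg = 50.0  # made-up LLG of a candidate solution
print(round(z_score(candidate_llg, random_llgs), 1))
```

Each of the 500 evaluations is independent of the others, which makes this loop a natural unit of work to share between threads.
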
Parallelization
• Following discussion on ccp4-dev (initiated by Kevin), the parallelization is accessed via
  • phaser --jobs 4
  • Or phaser -j 4
  • (Or keyword JOBS 4)
  • I can’t live without it now!
• But, sorry folks, we are not confident enough to release it just yet
  • Undetected race conditions?
To-do list
• MR pipeline (ongoing)
• Improvements to minimizer
• Relative overall B-factor refinement for models in an MR solution
• etc…