PISA
Protein Interfaces, Surfaces and Assemblies
http://www.ebi.ac.uk/msd-srv/prot_int/pistart.html
Eugene Krissinel
[email protected]
CCP4 & EBI-MSD
PISA is a tool for the assessment of macromolecular
interactions using data provided by protein
crystallography.
Scope of tasks addressed by PISA:
• identification and prediction of multimeric states
• analysis of structure-function relationship
• analysis and prediction of macromolecular interactions
• analysis of macromolecular complexation and crystallisation
• properties of macromolecular interfaces
• search for interface/structure/assembly homologues
• active site recognition and analysis
• macromolecular surface analysis
• other
Project started in 2004, supported by BBSRC research grant 721/B19544
PISA today
Web-service hosted by EBI-MSD at
http://www.ebi.ac.uk/msd-srv/prot_int/pistart.html
• provides PISA analysis for all PDB entries and database searches
• allows upload of PDB and mmCIF files for interactive PISA analysis
• provides XML download of multimer data, which is used in server
applications (BALBES) on molecular replacement
• works on amino acid, nucleic acid and ligand structures
• more than 140,000 external queries served since the release
• more than 1700 users
• has a command-prompt stand-alone version
PISA basics
PISA is based on chemical thermodynamics:
$\Delta G_{diss} = -\Delta G_{int} - T\Delta S > 0$
for stable structures in the standard state.
$\Delta G_{diss}$ cannot be calculated exactly. PISA uses semiempirical
models with parameters calibrated to available experimental
data on multimeric states.
Precision of free energy estimates in PISA: ±5 kcal/mol
Success rate of PQS prediction: 80-90%
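The stability criterion above can be illustrated with a minimal sketch. It assumes the relation ΔG_diss = −ΔG_int − TΔS with all terms in kcal/mol; the function names, the input values and the "borderline" band (taken from the quoted ±5 kcal/mol precision) are illustrative assumptions, not part of PISA itself.

```python
def dissociation_free_energy(dg_int, t_ds):
    """Return dG_diss = -dG_int - T*dS (all terms in kcal/mol)."""
    return -dg_int - t_ds


def classify(dg_diss, uncertainty=5.0):
    """Label an assembly by the sign of dG_diss.

    The 'borderline' band is an assumption made for this sketch: it
    simply reflects the +/-5 kcal/mol precision quoted for the estimates.
    """
    if dg_diss > uncertainty:
        return "stable"
    if dg_diss < -uncertainty:
        return "unstable"
    return "borderline"


# Hypothetical numbers, chosen only to exercise the formula.
dg_int = -25.0   # binding free energy of the interfaces, kcal/mol
t_ds = 12.0      # entropy term T*dS of dissociation, kcal/mol

dg_diss = dissociation_free_energy(dg_int, t_ds)
print(dg_diss, classify(dg_diss))
```

A positive ΔG_diss means that dissociation costs free energy, so the assembly is expected to hold together; values within the error band cannot be assigned confidently either way.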
Last year activity – nucleic acids and ligands
Extension to include protein-DNA/RNA and ligand interactions
• Derivation and calibration of interaction parameters
• Database of ligand interactions (~6000 entries parameterized on atomic level)
• Tools for database update and semi-automatic calculation of protein-ligand interactions
Core algorithm completely rewritten in order to:
• implement changes needed to support protein-DNA/RNA and ligand interactions
• optimize and speed up the calculations
Last year activity – ligand control
Control over ligand processing:
• Possibility to exclude certain ligands from processing
• Choice of ligand processing modes:
    - Automatic
    - Fix all ligands
    - Free all ligands
Last year activity – adaptation for MSD&PDB
Interface and presentation improvements at request of
PDB/MSD curation teams:
• Consistent identification of symops in PISA pages
• Adoption of PDB@RCSB symop nomenclature
• Automatic generation of REMARK 350
• Optimization of final assembly positions
• Reporting on redundant assemblies (especially when ASU contains a fractional number of assemblies >1)
PISA is now employed by both MSD and PDB@RCSB as a
mandatory processing tool for all depositions
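As an illustration of the REMARK 350 item above, the following sketch formats one symmetry operation as three BIOMT records. The column layout follows the commonly seen PDB convention and should be checked against the PDB format specification; the function name and its arguments are hypothetical and do not reflect PISA's own code.

```python
def biomt_lines(op_index, rotation, translation):
    """Format one operator (3x3 rotation, 3-vector translation)
    as three REMARK 350 BIOMT records."""
    lines = []
    for row in range(3):
        r = rotation[row]
        lines.append(
            "REMARK 350   BIOMT%d %3d%10.6f%10.6f%10.6f     %10.5f"
            % (row + 1, op_index, r[0], r[1], r[2], translation[row])
        )
    return lines


# Identity operator: the chains are kept where they are in the ASU.
identity = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]

for line in biomt_lines(1, identity, [0.0, 0.0, 0.0]):
    print(line)
```

Each operator of an assembly would contribute one such triplet, numbered consecutively by op_index.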
Last year activity - PISA database
PISA database searches by
• Multimeric state
• Symmetry number
• Space group
• Homomeric type
• Salt bridges
• Disulphide bonds
• List of ligands
• List of keywords
• Dissociation energy
• Assembly ASA
• Assembly BSA
• Percent BSA
Last year activity - standalone PISA
Command-prompt, stand-alone PISA for inclusion into CCP4
• Contains only data-processing part of “big” PISA, i.e. no database
• For technical reasons, there are code differences from “big” PISA
• Functionally identical to the corresponding parts of “big” PISA
• Mimics web-page output of “big” PISA in plain text
• Provides same XML output as “big” PISA
• Works as a local server:
    - Maintains sessions
    - Data processing separated from data retrieval
• Visualization using Rasmol
Standalone PISA example
Last year activity – tune-up and polishing
The last percent of improvement takes 99% of the effort
Literally hundreds of small problems solved on an everyday basis.
Examples:
• Inference of correct orthogonalisation codes
• Choice of margins for identification of parallel monomeric units
• Symmetry number calculations: superposition margins and unit enumeration order
• Identification and proper treatment of overlapping symmetry mates with fractional occupancy
• Unique labels for the download/visualisation data and wait pages to avoid caching on remote servers
• Catching up with EBI systems update
PISA is roughly 60,000 C++ statements, and small bugs are most probably
still there
6 releases over the year
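The "unique labels" item among the examples above refers to a common technique: giving every generated link a fresh label so that intermediate servers cannot return a stale cached copy. A minimal sketch of the idea follows, assuming the label is passed as a query parameter; the parameter name "uid" and the example URL are hypothetical, and PISA's actual scheme may differ.

```python
import uuid
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_unique_label(url, param="uid"):
    """Append a random, one-off query parameter to the URL so that
    each generated link is distinct from every previous one."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, uuid.uuid4().hex))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical download link; two calls give two different URLs.
print(with_unique_label("http://example.org/pisa/download.xml?session=abc"))
print(with_unique_label("http://example.org/pisa/download.xml?session=abc"))
```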
Future plans
PISA is systematically underperforming on FABs.
Possible reasons:
• Neglect of electrostatic interactions
• Neglect of entropy absorbance in flexible complexes
Both are very difficult problems to address
The last percent of improvement takes 99% of the effort
Future plans
Analysis of “custom” assemblies
• Allow for input without crystallographic data
• Effectively, this includes NMR entries as well
Detection of “custom” assemblies
• Allow for report on specific assemblies otherwise missed as unstable
Automatic prediction of macromolecular interactions
and assemblies by homologue search in PISA
database
Assessment of crystal “quality”
• Identification of fake PDB entries and depositions
Fake PDBs
                          2i07    2icc    2ice    2icf    2hr0
BSA                       20%     51%     24%     24%     10%
Interfaces per chain      9.56    7       6.12    8.33    3.5
Min. interfaces / chain   8       7       4       4       2
Connected crystal         Yes     Yes     Yes     Yes     Yes