PISA
Protein Interfaces, Surfaces and Assemblies
http://www.ebi.ac.uk/msd-srv/prot_int/pistart.html
Eugene Krissinel
[email protected]
CCP4 & EBI-MSD
PISA is a tool for the assessment of macromolecular
interactions using data provided by protein
crystallography.
Scope of tasks addressed by PISA:
• identification and prediction of multimeric states
• analysis of structure-function relationship
• analysis and prediction of macromolecular interactions
• analysis of macromolecular complexation and crystallisation
• properties of macromolecular interfaces
• search for interface/structure/assembly homologues
• active site recognition and analysis
• macromolecular surface analysis
• other
Project started in 2004, supported by BBSRC research grant 721/B19544
PISA today
Web-service hosted by EBI-MSD at
http://www.ebi.ac.uk/msd-srv/prot_int/pistart.html
• provides PISA analysis for all PDB entries and database searches
• allows upload of PDB and mmCIF files for interactive PISA analysis
• provides XML download of multimer data, which is used in server
applications (BALBES) on molecular replacement
• works on amino acid, nucleic acid and ligand structures
• more than 140,000 external queries served since the release
• more than 1700 users
• has a command-prompt stand-alone version
PISA basics
PISA is based on chemical thermodynamics:
$\Delta G_{diss} = -\Delta G_{int} - T\Delta S > 0$
for stable structures in the standard state.
$\Delta G_{diss}$ cannot be calculated exactly. PISA uses semiempirical
models with parameters calibrated to available experimental
data on multimeric states.
Precision of free energy estimates in PISA: ±5 kcal/mol
Success rate of PQS prediction: 80-90%
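
The stability criterion above can be illustrated with a short worked example. This is a minimal sketch, not PISA's code: it assumes the dissociation free energy is the negated interaction free energy minus the entropy term T*dS, and it uses made-up input values. The "borderline" band, which treats estimates within the quoted ±5 kcal/mol precision as undecided, is an assumption added for illustration.

```python
# Illustrative only: input numbers are invented, not PISA output.
UNCERTAINTY = 5.0  # kcal/mol, the precision of estimates quoted on the slide


def dissociation_free_energy(dg_int: float, t_ds: float) -> float:
    """Return dG_diss = -dG_int - T*dS, with all terms in kcal/mol."""
    return -dg_int - t_ds


def classify(dg_diss: float, margin: float = UNCERTAINTY) -> str:
    """Label an assembly; |dG_diss| <= margin is treated as undecided.

    The three-way labelling is a hypothetical convention, not PISA's.
    """
    if dg_diss > margin:
        return "stable"
    if dg_diss < -margin:
        return "unstable"
    return "borderline"


# Hypothetical assembly: dG_int = -30 kcal/mol, T*dS = 12 kcal/mol.
dg = dissociation_free_energy(-30.0, 12.0)
print(dg, classify(dg))
```

With these inputs the estimate is well outside the ±5 kcal/mol band, so the sign of the result can be trusted; an estimate of, say, 3 kcal/mol could not be.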
Last year activity – nucleic acids and ligands

Extension to include protein-DNA/RNA and ligand interactions
• Derivation and calibration of interaction parameters
• Database of ligand interactions (~6000 entries parameterized on atomic level)
• Tools for database update and semi-automatic calculation of protein-ligand interactions

Core algorithm completely rewritten in order to:
• implement changes needed to accommodate protein-DNA/RNA and ligand interactions
• optimize and speed up the calculations
Last year activity – ligand control

Control over ligand processing:
• Possibility to exclude certain ligands from processing
• Choice of ligand processing modes:
  - Automatic
  - Fix all ligands
  - Free all ligands
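
The ligand controls above can be pictured as a small configuration model. The sketch below is hypothetical and does not reflect PISA's actual interface: the names `LigandMode`, `plan_ligands` and `auto_rule` are invented, and the automatic decision is represented by a caller-supplied placeholder because the slides do not describe how PISA makes it.

```python
from enum import Enum
from typing import AbstractSet, Callable, Dict, Iterable


class LigandMode(Enum):
    """The three processing modes named on the slide."""
    AUTOMATIC = "automatic"
    FIX_ALL = "fix all ligands"
    FREE_ALL = "free all ligands"


def plan_ligands(
    ligands: Iterable[str],
    mode: LigandMode,
    excluded: AbstractSet[str] = frozenset(),
    auto_rule: Callable[[str], bool] = lambda name: True,
) -> Dict[str, str]:
    """Map each ligand name to 'excluded', 'fixed' or 'free'.

    auto_rule is a hypothetical stand-in for the automatic decision:
    it returns True when a ligand should be fixed.
    """
    plan = {}
    for name in ligands:
        if name in excluded:
            plan[name] = "excluded"      # user removed it from processing
        elif mode is LigandMode.FIX_ALL:
            plan[name] = "fixed"
        elif mode is LigandMode.FREE_ALL:
            plan[name] = "free"
        else:
            plan[name] = "fixed" if auto_rule(name) else "free"
    return plan


# Made-up ligand names, used only to show the call.
print(plan_ligands(["HEM", "SO4", "GOL"], LigandMode.FREE_ALL, excluded={"GOL"}))
```

Exclusion is checked first so that it overrides whichever mode is selected.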
Last year activity – adaptation for MSD&PDB

Interface and presentation improvements at the request of PDB/MSD curation teams:
• Consistent identification of symops in PISA pages
• Adoption of PDB@RCSB symop nomenclature
• Automatic generation of REMARK 350
• Optimization of final assembly positions
• Reporting on redundant assemblies (especially when ASU contains a fractional number of assemblies >1)
PISA is now employed by both MSD and PDB@RCSB as a
mandatory processing tool for all depositions
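
For readers unfamiliar with REMARK 350, the sketch below shows how biological-assembly operators are laid out in BIOMT records of a PDB file. It illustrates the record format only and is not PISA's generator: the function names and the two-fold operator are invented for the example, and a real REMARK 350 block carries further descriptive lines that are omitted here.

```python
from typing import List, Sequence, Tuple

# One operator = three rotation-matrix rows plus a translation vector.
Operator = Tuple[Sequence[Sequence[float]], Sequence[float]]


def biomt_lines(operators: List[Operator]) -> List[str]:
    """Format operators as REMARK 350 BIOMT1..BIOMT3 records."""
    lines = []
    for serial, (rot, trans) in enumerate(operators, start=1):
        for row in range(3):
            r = rot[row]
            lines.append(
                f"REMARK 350   BIOMT{row + 1}{serial:4d}"
                f"{r[0]:10.6f}{r[1]:10.6f}{r[2]:10.6f}     {trans[row]:10.5f}"
            )
    return lines


def remark_350(biomolecule: int, chains: Sequence[str],
               operators: List[Operator]) -> List[str]:
    """Prepend the biomolecule and chain header lines to the BIOMT records."""
    header = [
        f"REMARK 350 BIOMOLECULE: {biomolecule}",
        "REMARK 350 APPLY THE FOLLOWING TO CHAINS: " + ", ".join(chains),
    ]
    return header + biomt_lines(operators)


identity = ([[1, 0, 0], [0, 1, 0], [0, 0, 1]], [0, 0, 0])
two_fold = ([[-1, 0, 0], [0, -1, 0], [0, 0, 1]], [0, 0, 0])  # made-up operator
for line in remark_350(1, ["A", "B"], [identity, two_fold]):
    print(line)
```

Each operator contributes three records, one per matrix row, sharing the same serial number.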
Last year activity - PISA database

PISA database searches by
• Multimeric state
• Symmetry number
• Space group
• Homomeric type
• Salt bridges
• Disulphide bonds
• List of ligands
• List of keywords
• Dissociation energy
• Assembly ASA
• Assembly BSA
• Percent BSA
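
A search that combines several of the criteria above might look like the following. This is a hypothetical sketch: the record layout, field names, identifiers and values are all invented for illustration and do not describe the PISA database schema or its query interface.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class AssemblyRecord:
    """Hypothetical record holding a few of the searchable properties."""
    entry: str              # placeholder identifier, not a real PDB code
    multimeric_state: int   # number of monomeric units in the assembly
    space_group: str
    dg_diss: float          # dissociation energy, kcal/mol


def search(records: Iterable[AssemblyRecord],
           multimeric_state: Optional[int] = None,
           space_group: Optional[str] = None,
           min_dg_diss: Optional[float] = None) -> List[AssemblyRecord]:
    """Return records matching every criterion that is not None."""
    hits = []
    for rec in records:
        if multimeric_state is not None and rec.multimeric_state != multimeric_state:
            continue
        if space_group is not None and rec.space_group != space_group:
            continue
        if min_dg_diss is not None and rec.dg_diss < min_dg_diss:
            continue
        hits.append(rec)
    return hits


# Invented sample data.
db = [
    AssemblyRecord("ent1", 2, "P 21 21 21", 12.5),
    AssemblyRecord("ent2", 4, "P 21 21 21", 30.1),
    AssemblyRecord("ent3", 2, "C 2", 3.2),
]
print([r.entry for r in search(db, multimeric_state=2, min_dg_diss=5.0)])
```

Leaving a criterion as `None` disables it, so the same function covers single-key and combined searches.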
Last year activity - standalone PISA

Command-prompt, stand-alone PISA for inclusion into CCP4
• Contains only the data-processing part of “big” PISA, i.e. no database
• For technical reasons, there are code differences from “big” PISA
• Functionally identical to the corresponding parts of “big” PISA
• Mimics web-page output of “big” PISA in plain text
• Provides the same XML output as “big” PISA
• Works as a local server:
  - Maintains sessions
  - Data processing separated from data retrieval
• Visualization using Rasmol
Standalone PISA example
Last year activity – tune-up and polishing

Last percent of improvement takes 99% of all efforts

Literally hundreds of small problems solved on an everyday basis.
Examples:
• Inference of correct orthogonalisation codes
• Choice of margins for identification of parallel monomeric units
• Symmetry number calculations: superposition margins and unit enumeration order
• Identification and proper treatment of overlapping symmetry mates with fractional occupancy
• Unique labels for the download/visualisation data and wait pages to avoid caching on remote servers
• Catching up with EBI systems update

PISA is roughly 60,000 C++ statements and small bugs are most probably still there
6 releases over the year
Future plans

PISA is systematically underperforming on FABs.
Possible reasons:
• Neglect of electrostatic interactions
• Neglect of entropy absorbance in flexible complexes


Both are very difficult problems to address
Last percent of improvement takes 99% of efforts
Future plans

Analysis of “custom” assemblies
• Allow for input without crystallographic data
• Effectively, this includes NMR entries as well

Detection of “custom” assemblies
• Allow reporting on specific assemblies that would otherwise be missed as unstable

Automatic prediction of macromolecular interactions
and assemblies by homologue search in PISA
database

Assessment of crystal “quality”
• Identification of fake PDB entries and depositions
Fake PDBs

                          2i07    2icc    2ice    2icf    2hr0
BSA                       20%     51%     24%     24%     10%
Interfaces per chain      9.56    7       6.12    8.33    3.5
Min. interfaces / chain   8       7       4       4       2
Connected crystal         Yes     Yes     Yes     Yes     Yes
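
The per-chain interface statistics in the comparison above could be derived along the following lines. This is a minimal sketch under stated assumptions, not PISA's method: interfaces are given as pairs of chain labels, an interface between a chain and a symmetry copy of itself is counted once for that chain, and the input data are invented.

```python
from statistics import mean
from typing import Dict, Iterable, Sequence, Tuple


def interface_counts(chains: Sequence[str],
                     interfaces: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count, for every chain, the interfaces it takes part in."""
    counts = {chain: 0 for chain in chains}
    for a, b in interfaces:
        counts[a] += 1
        if b != a:          # self-interface (symmetry copy) counted once
            counts[b] += 1
    return counts


def interface_stats(chains: Sequence[str],
                    interfaces: Iterable[Tuple[str, str]]) -> Tuple[float, int]:
    """Return (mean interfaces per chain, minimum interfaces per chain)."""
    counts = interface_counts(chains, interfaces)
    return mean(counts.values()), min(counts.values())


# Invented example: two chains, three crystal interfaces.
example = [("A", "B"), ("A", "A"), ("A", "B")]
print(interface_counts(["A", "B"], example))
print(interface_stats(["A", "B"], example))
```

A low minimum flags a chain that is held in the lattice by very few contacts, which is the kind of anomaly the comparison highlights.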