Transcript Slide 1

CENTRE FOR INTEGRATIVE BIOINFORMATICS VU
Experimentally solving protein
structures and protein-protein
interactions
Lecture 21
Introduction to Bioinformatics
2007
Today’s lecture
1. Experimental techniques for determining
protein tertiary structure
2. Protein interaction and docking
i. Ribosome example
ii. Zdock method
3. Molecular motion simulated by molecular
mechanics
If you throw up a stone, it is
Physics. If it lands on your head,
it is Biophysics.
If you write a computer program,
it is Informatics. If there is a bug
in it, it is Bioinformatics
Experimentally solving protein
structures
Two basic techniques:
1. X-ray crystallography
2. Nuclear Magnetic Resonance (NMR)
techniques
1. X-ray crystallography
[Workflow: purified protein → crystallization → crystal → X-ray diffraction → phase problem → electron density → 3D structure → biological interpretation]
Protein crystals
• Regular arrays of protein molecules
• ‘Wet’: 20-80% solvent
• Few crystal contacts
• Protein crystals contain active protein
• Enzyme turnover
• Ligand binding
Examples of crystal packing
• Acetylcholinesterase: ~68% solvent
• β2-Glycoprotein I: ~90% solvent (extremely high!)
Problematic proteins (no crystallisation)
• Multiple domains (flexible)
• Similarly, floppy ends may hamper crystallization: change construct
• Membrane proteins
• Glycoproteins
[Figure labels: hydrophilic, lipid bilayer, hydrophobic, hydrophilic]
Flexible and heterogeneous!!
Experimental set-up
• Options for wavelength:
– monochromatic, polychromatic
– variable wavelength
[Figure labels: X-ray source, goniometer, liquid N2 gas stream, beam stop, detector]
Diffraction image
[Figure labels: direct beam, beam stop, water ring, diffuse scattering (from the fibre loop), reciprocal lattice (in this case hexagonal), reflections (h,k,l) with intensity I(h,k,l), increasing resolution]
The rules for diffraction: Bragg’s law
• Scattered X-rays reinforce each other
only when Bragg’s law holds:
Bragg’s law: $2\,d_{hkl}\sin\theta = n\lambda$
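Bragg's law can be turned around to ask at which angle a set of lattice planes with a given spacing will reflect. The sketch below is only an illustration: the function names are made up for this example, and the wavelength of 1.5418 Å (copper K-alpha) is an assumed value that does not come from the slides.

```python
import math

def bragg_spacing(wavelength_angstrom, theta_deg, order=1):
    """Plane spacing d (in Å) from Bragg's law 2 d sin(theta) = n lambda."""
    theta = math.radians(theta_deg)
    return order * wavelength_angstrom / (2.0 * math.sin(theta))

def bragg_angle(wavelength_angstrom, d_angstrom, order=1):
    """Angle theta (in degrees) for spacing d, or None if no reflection is possible."""
    sin_theta = order * wavelength_angstrom / (2.0 * d_angstrom)
    if sin_theta > 1.0:
        return None  # spacing too small to be resolved at this wavelength
    return math.degrees(math.asin(sin_theta))

wavelength = 1.5418  # assumed: Cu K-alpha wavelength in Å
for d in (4.0, 3.0, 2.0, 1.0):
    # smaller spacings (higher resolution) diffract to larger angles
    print(d, bragg_angle(wavelength, d))
```

Smaller d means a larger angle, which is why high-resolution reflections appear further from the direct beam on the detector.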
Phase Problem
• Determining the structure of a molecule in a
crystalline sample requires knowing both the
amplitude and the phase of the photon wave
being diffracted from the sample
• The emitted X-rays start out with dispersed
phases and the detector records only intensities,
and so the phases get lost
• Unfortunately, phases contribute more to the
informational content of an X-ray diffraction
pattern than do amplitudes. It is common to refer
to phaseless X-ray data as having "lost phases"
• Luckily, several ways to recover the lost phases
have been developed
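Why phases matter so much can be seen in a one-dimensional toy example. The sketch below is not a crystallographic calculation: the two "densities" are invented, and a plain discrete Fourier transform stands in for the structure-factor calculation. It combines the amplitudes of one density with the phases of another, and also builds a map from amplitudes alone.

```python
import cmath

def dft(values):
    """Discrete Fourier transform: a 1D stand-in for computing structure factors."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * cmath.pi * h * x / n) for x, v in enumerate(values))
            for h in range(n)]

def inverse_dft(coeffs):
    """Inverse transform: a 1D stand-in for computing an electron density map."""
    n = len(coeffs)
    return [sum(c * cmath.exp(2j * cmath.pi * h * x / n) for h, c in enumerate(coeffs)).real / n
            for x in range(n)]

def peak_position(density):
    """Index of the highest point in the map."""
    return max(range(len(density)), key=lambda x: density[x])

n = 16
density_a = [0.0] * n
density_a[2:5] = [1.0, 4.0, 1.0]    # toy "molecule A", peak at x = 3
density_b = [0.0] * n
density_b[10:13] = [2.0, 6.0, 2.0]  # toy "molecule B", peak at x = 11

f_a, f_b = dft(density_a), dft(density_b)

# Amplitudes taken from B, phases taken from A
hybrid = inverse_dft([cmath.rect(abs(fb), cmath.phase(fa)) for fa, fb in zip(f_a, f_b)])
# Amplitudes of A with every phase set to zero ("lost phases")
phaseless = inverse_dft([abs(fa) for fa in f_a])

print(peak_position(hybrid), peak_position(phaseless))
```

In this toy case the hybrid map puts its peak where molecule A sits, although every amplitude came from B, and the phaseless map puts the peak at the origin. The position information is carried by the phases.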
Building a protein model
• Find structural elements:
– α-helices, β-strands
• Fit amino-acid sequence
Effects of resolution on electron density
[Figures: the same electron density map at d = 4 Å, 3 Å, 2 Å and 1 Å]
Note: maps calculated with perfect phases
Refinement process
• Bad phases
→ poor electron density map
→ errors in the protein model
• Interpretation of the electron density map
→ improved model
→ improved phases
→ improved map
→ even better model
… iterative process of refinement
Validation
• Free R-factor (cross validation)
– Number of parameters/
observations
• Ramachandran plot
• Chemically likely (WhatCheck)
– Hydrophobic inside,
hydrophilic outside
– Binding sites of ligands,
metals, ions
– Hydrogen-bonds satisfied
– Chemistry in order
• Final B-factor (temperature) values
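The R-factor used in validation compares observed and calculated structure-factor amplitudes, R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|. The free R-factor is the same quantity computed only on reflections that were kept out of refinement. The sketch below assumes this standard definition; the amplitudes are invented, and taking every tenth reflection as the free set is a simplification (in practice a small random subset is set aside).

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|)."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator

def split_work_free(reflections, free_every=10):
    """Set aside every `free_every`-th reflection as the free (test) set."""
    work = [r for i, r in enumerate(reflections) if i % free_every != 0]
    free = [r for i, r in enumerate(reflections) if i % free_every == 0]
    return work, free

# Hypothetical (Fobs, Fcalc) amplitude pairs, for illustration only
reflections = [(100.0 + i, 100.0 + i + (3.0 if i % 2 else -3.0)) for i in range(50)]
work, free = split_work_free(reflections)

r_work = r_factor([o for o, _ in work], [c for _, c in work])
r_free = r_factor([o for o, _ in free], [c for _, c in free])
print(round(r_work, 4), round(r_free, 4))
```

A free R-factor that is much higher than the working R-factor hints that the model has been fitted to noise, which is the cross-validation idea behind the first bullet.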
2. Nuclear Magnetic Resonance (NMR)
800 MHz NMR spectrometer
Nuclear Magnetic Resonance (NMR)
• Pioneered by Richard R. Ernst, who won a Nobel Prize in chemistry in
1991, FT-NMR works by irradiating the sample, held in a static external
magnetic field, with a short square pulse of radio-frequency energy
containing all the frequencies in a given range of interest.
• The polarized magnets of the nuclei begin to spin together, creating an
observable radio-frequency (RF) signal. Because the signal decays over time,
this time-dependent pattern can be converted into a frequency-dependent
pattern of nuclear resonances using a mathematical function known as a
Fourier transformation, revealing the nuclear magnetic resonance
spectrum.
• The use of pulses of different shapes, frequencies and durations in
specifically-designed patterns or pulse sequences allows the
spectroscopist to extract many different types of information about the
molecule.
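The conversion from a decaying time signal to a spectrum can be sketched with a plain discrete Fourier transform. Everything numeric here is an assumption chosen for illustration: two resonances at 10 Hz and 25 Hz (real NMR frequencies are far higher), a decay constant of 0.5 s, and 128 points sampled over one second.

```python
import cmath
import math

def simulate_fid(frequencies_hz, t2_seconds, n_points, dwell_seconds):
    """Decaying sum of cosines: a toy free induction decay."""
    fid = []
    for k in range(n_points):
        t = k * dwell_seconds
        oscillation = sum(math.cos(2.0 * math.pi * f * t) for f in frequencies_hz)
        fid.append(math.exp(-t / t2_seconds) * oscillation)
    return fid

def magnitude_spectrum(signal):
    """Magnitude of the discrete Fourier transform for the positive frequencies."""
    n = len(signal)
    return [abs(sum(s * cmath.exp(-2j * cmath.pi * h * k / n) for k, s in enumerate(signal)))
            for h in range(n // 2)]

def pick_peaks(spectrum, threshold_fraction=0.5):
    """Bins that are local maxima and exceed a fraction of the tallest bin."""
    cutoff = threshold_fraction * max(spectrum)
    return [h for h in range(1, len(spectrum) - 1)
            if spectrum[h] > cutoff
            and spectrum[h] > spectrum[h - 1]
            and spectrum[h] > spectrum[h + 1]]

n_points, dwell = 128, 1.0 / 128           # 1 s acquisition, so 1 Hz per bin
fid = simulate_fid([10.0, 25.0], 0.5, n_points, dwell)
peaks_hz = [h / (n_points * dwell) for h in pick_peaks(magnitude_spectrum(fid))]
print(peaks_hz)
```

The time signal looks like a single decaying wiggle, but the transform separates it into one peak per resonance frequency.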
Nuclear Magnetic Resonance (NMR)
• Time intervals between pulses allow—among other things—
magnetization transfer between nuclei and, therefore, the detection of
the kinds of nuclear-nuclear interactions that allowed for the
magnetization transfer.
• Interactions that can be detected are usually classified into two kinds:
through-bond interactions and through-space interactions. The latter are
usually a consequence of the so-called nuclear Overhauser effect (NOE).
Experiments of the nuclear-Overhauser variety may establish distances
between atoms.
• These distances are subjected to a technique called Distance
Geometry which normally results in an ensemble of possible structures
that are all relatively consistent with the observed distance restraints
(NOEs).
• Richard Ernst and Kurt Wüthrich —in addition to many others—
developed 2-dimensional and multidimensional FT-NMR into a powerful
technique for the determination of the structure of biopolymers such as
proteins or even small nucleic acids.
• This is used in protein nuclear magnetic resonance spectroscopy.
Wüthrich shared the 2002 Nobel Prize in Chemistry for this work.
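How a set of distance restraints can pin down a three-dimensional arrangement is sketched below. This is not the metric-matrix embedding used in Distance Geometry proper. It is a simplified stand-in that moves atoms downhill until the model distances match the restraints. The four atoms, the starting coordinates and the 3.0 Å targets are all hypothetical.

```python
import math

def distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def violation(coords, restraints):
    """Sum of squared differences between model distances and restraint distances."""
    return sum((distance(coords[i], coords[j]) - target) ** 2
               for i, j, target in restraints)

def refine(coords, restraints, steps=2000, step_size=0.05):
    """Gradient descent on the violation score; a step is kept only if it lowers the score."""
    coords = [list(p) for p in coords]
    score = violation(coords, restraints)
    for _ in range(steps):
        gradient = [[0.0, 0.0, 0.0] for _ in coords]
        for i, j, target in restraints:
            d = distance(coords[i], coords[j])
            if d == 0.0:
                continue  # direction undefined for coincident atoms
            factor = 2.0 * (d - target) / d
            for k in range(3):
                g = factor * (coords[i][k] - coords[j][k])
                gradient[i][k] += g
                gradient[j][k] -= g
        trial = [[c - step_size * g for c, g in zip(p, gp)]
                 for p, gp in zip(coords, gradient)]
        trial_score = violation(trial, restraints)
        if trial_score < score:
            coords, score = trial, trial_score
        else:
            step_size *= 0.5  # overshoot: retry with a smaller step
    return coords, score

# Hypothetical restraints: every pair of four atoms should be 3.0 Å apart
restraints = [(i, j, 3.0) for i in range(4) for j in range(i + 1, 4)]
start = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

initial_score = violation(start, restraints)
final_coords, final_score = refine(start, restraints)
print(initial_score, final_score)
```

Starting from different coordinates would give a differently oriented (or mirrored) solution with the same distances. Real NOE data leave even more freedom, which is one reason the result is an ensemble rather than a single structure.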
2D NOESY spectrum
[Figure: 2D NOESY spectrum with peaks labelled Gly, Val, Gly, Leu, Ser, Thr, Phe, Asp, Asn, Asp]
• Peptide sequence (N-terminal NH not observed)
• Arg-Gly-Asp-Val-Asn-Ser-Leu-Phe-Asp-Thr-Gly
NMR structure determination:
hen lysozyme
• 129 residues
– ~1000 heavy atoms
– ~800 protons
• NMR data set
– 1632 distance restraints
– 110 torsion restraints
– 60 H-bond restraints
• 80 structures calculated
• 30 low energy
structures used
[Figure: total energy (axis 0 to 1.2 × 10^4) against structure number (axis 10 to 70)]
Solution Structure Ensemble
• Disorder in NMR ensemble
– lack of data ?
– or protein dynamics ?
Problems with NMR
• Protein concentration in sample needs to
be high (multimilligram samples)
• Restricted to smaller sized proteins
(although magnets get stronger)
• Uncertainties in NOEs introduced by
internal motions in molecules (preceding
slide)
X-ray and NMR
summary
• Are experimental techniques to solve
protein structures (although they both
need a lot of computation)
• Nowadays typically contain many
refinement and energy-minimisation steps
to optimise the structure (next topic)
X-ray and NMR
summary (continued)
• X-ray diffraction
– From crystallised protein sample to electron
density map
• Structure descriptors: resolution, R-factor, B-factor
• Nuclear magnetic resonance (NMR)
– Based on atomic nuclear spin
– Produces set of distances between residues
(distance restraints)
– Distances are used to build protein model using
Distance Geometry (a technique to build a
protein structure using a set of inter-residue
distances)
Protein binding and protein-protein
interactions
• Complexity:
– Multibody interaction
• Diversity:
– Various interaction types
• Specificity:
– Complementarity in shape and binding
properties
Protein-protein interactions
• Many proteins interact through
hydrophobic patches
• Hydrophobic patches often have a
hydrophilic rim
• The patch-rim combination is believed to
be important in providing binding
specificity
[Figure labels: hydrophilic, hydrophobic, very hydrophilic]
PPI Characteristics
• Universal
– Cell functionality based on protein-protein interactions
• Cyto-skeleton
• Ribosome
• RNA polymerase
• Numerous
– Yeast:
• ~6,000 proteins
• at least 3 interactions each
→ ~18,000 interactions
– Human:
• estimated ~100,000 interactions
• Network
– simplest: homodimer (two)
– common: hetero-oligomer (more)
– holistic: protein network (all)
Interface Area
• Contact area
– usually >1100 Å²
– each partner >550 Å²
• each partner loses ~800 Å² of solvent accessible surface
area
– ~20 amino acids lose ~40 Å² each
– ~100-200 J per Å²
• Average buried accessible surface area:
– 12% for dimers
– 17% for trimers
– 21% for tetramers
• 83-84% of all interfaces are flat
• Secondary structure:
– 50% α-helix
– 20% β-sheet
– 20% coil
– 10% mixed
• Less hydrophobic than core, more hydrophobic than exterior
Complexation Reaction
• $A + B \rightleftharpoons AB$
– $K_a = [AB] / ([A] \cdot [B])$
→ association
– $K_d = [A] \cdot [B] / [AB]$
→ dissociation
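The two constants are reciprocals of each other, which a few lines of code make concrete. The concentrations below are hypothetical equilibrium values in mol/L. The helper `fraction_bound` is not on the slide: it follows from the definition of Kd, since [AB]/([A]+[AB]) = [B]/(Kd+[B]).

```python
def association_constant(conc_a, conc_b, conc_ab):
    """Ka = [AB] / ([A][B]), in 1/M when concentrations are in M."""
    return conc_ab / (conc_a * conc_b)

def dissociation_constant(conc_a, conc_b, conc_ab):
    """Kd = [A][B] / [AB], in M when concentrations are in M."""
    return (conc_a * conc_b) / conc_ab

def fraction_bound(conc_b_free, kd):
    """Fraction of A present as AB at a given free [B]: [B] / (Kd + [B])."""
    return conc_b_free / (kd + conc_b_free)

# Hypothetical equilibrium concentrations (mol/L)
conc_a, conc_b, conc_ab = 1e-6, 2e-6, 4e-6

ka = association_constant(conc_a, conc_b, conc_ab)
kd = dissociation_constant(conc_a, conc_b, conc_ab)
print(ka, kd, fraction_bound(conc_b, kd))
```

When the free concentration of B equals Kd, exactly half of A is bound, which is why Kd is often read as the concentration needed for half-saturation.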
Experimental Methods for determining PPI
• 2D (poly-acrylamide) gel electrophoresis → mass
spectrometry
• Liquid chromatography
– e.g. gel permeation chromatography
• Binding study with one immobilized partner
– e.g. surface plasmon resonance
• In vivo by two-hybrid systems or FRET
• Binding constants by ultra-centrifugation, microcalorimetry or competition
• Experiments with labelled ligand
– e.g. fluorescence, radioactivity
• Role of individual amino acids by site directed
mutagenesis
• Structural studies
– e.g. NMR or X-ray
PPI Network
http://www.phy.auckland.ac.nz/staff/prw/biocomplexity/protein_network.htm
Binding vs. Localization
[Figure: interaction types arranged by binding strength (weak to strong) and by localisation (co-expressed and at same place, or different places)]
• Obligate oligomers
• Non-obligate weak transient
• Non-obligate triggered transient, e.g. GTP•PO4-
• Non-obligate permanent, e.g. antibody-antigen
• Non-obligate co-localised, e.g. in membrane
Some terminology
• Transient interactions:
– Associate and dissociate in vivo
• Weak transient:
– dynamic oligomeric equilibrium
• Strong transient:
– require a molecular trigger to shift the equilibrium
• Obligate PPI:
– protomers are not stable structures on their own (i.e. they
need to interact in complexes)
– (functionally obligate)
Analysis of 122 Homodimers
• 70 interfaces single patched
• 35 have two patches
• 17 have three or more
Interfaces
• ~30% polar
• ~70% non-polar
Interface
• Rim is water accessible
[Figure labels: rim, interface]
Interface composition
• Composition of interface essentially the same as
core
= different surface/interface areas
Some preferences
[Figure labels: prefer, avoid]
Ribosome structure
• In the nucleolus, ribosomal RNA
is transcribed, processed, and
assembled with ribosomal
proteins to produce ribosomal
subunits
• At least 40 ribosomes must be
made every second in a yeast
cell with a 90-min generation
time (Tollervey et al. 1991). On
average, this represents the
nuclear import of
3100 ribosomal proteins every
second and the export of
80 ribosomal subunits out of the
nucleus every second. Thus, a
significant fraction of nuclear
trafficking is used in the
production of ribosomes.
• Ribosomes are made of a small
and a large subunit
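The numbers quoted for yeast ribosome production hang together, as a quick check shows. The sketch only rearranges the figures given above; the variable names are made up for the example.

```python
ribosomes_per_second = 40            # from the slide
generation_time_min = 90             # from the slide
proteins_imported_per_second = 3100  # from the slide
subunits_exported_per_second = 80    # from the slide

# Ribosomes that must be made in one generation
ribosomes_per_generation = ribosomes_per_second * generation_time_min * 60

# Each ribosome has a small and a large subunit
subunits_per_ribosome = subunits_exported_per_second / ribosomes_per_second

# Implied number of ribosomal proteins imported per ribosome
proteins_per_ribosome = proteins_imported_per_second / ribosomes_per_second

print(ribosomes_per_generation, subunits_per_ribosome, proteins_per_ribosome)
```

So 40 ribosomes per second means about 216,000 per generation, two exported subunits per ribosome, and roughly 78 imported proteins per ribosome.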
Large (1) and small (2) subunit fit
together (note this figure mislabels
angstroms as nanometers)
Ribosome structure
• The ribosomal subunits of prokaryotes and eukaryotes are quite similar
but display some important differences.
• Prokaryotes have 70S ribosomes, each consisting of a (small) 30S and a
(large) 50S subunit, whereas eukaryotes have 80S ribosomes, each
consisting of a (small) 40S and a bound (large) 60S subunit.
• However, the ribosomes found in chloroplasts and mitochondria of
eukaryotes are 70S, this being but one of the observations supporting
the endosymbiotic theory.
• "S" means Svedberg units, a measure of the rate of sedimentation of a
particle in a centrifuge, where the sedimentation rate is associated with
the size of the particle. Note that Svedberg units are not additive.
• Each subunit consists of one or two very large RNA molecules (known as
ribosomal RNA or rRNA) and multiple smaller protein molecules.
Crystallographic work has shown that there are no ribosomal proteins
close to the reaction site for polypeptide synthesis. This suggests that the
protein components of ribosomes act as a scaffold that may enhance the
ability of rRNA to synthesise protein rather than directly participating in
catalysis.
• The differences between the prokaryotic and eukaryotic ribosomes are
exploited by humans since the 70S ribosomes are vulnerable to some
antibiotics that the 80S ribosomes are not. This helps pharmaceutical
companies create drugs that can destroy a bacterial infection without
harming the animal/human host's cells!
70S structure at 5.5 Å
(Noller et al. Science 2001)
70S structure
30S-50S interface
• Overall buried surface area ~8500 Å²
[Figure legend: < 37.5 Å², 37.5 Å² – 75 Å², > 75 Å²]
Protein-nucleic acid Interactions
Interactions in the Ribosome
Calculating interface areas
Given a complex AB:
1. Calculate Solvent Accessible Surface Area
(SASA) of A, of B, and of AB
2. SASA lost upon complex formation is
SASA(A)+SASA(B)-SASA(AB)
3. Interface area of A and of B is
(SASA(A)+SASA(B)-SASA(AB))/2
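The three steps reduce to simple arithmetic once the SASA values are known. The sketch below assumes the three values have already been computed by some external surface-area program; the numbers are hypothetical and in Å².

```python
def buried_sasa(sasa_a, sasa_b, sasa_ab):
    """SASA lost upon complex formation: SASA(A) + SASA(B) - SASA(AB)."""
    return sasa_a + sasa_b - sasa_ab

def interface_area(sasa_a, sasa_b, sasa_ab):
    """Interface area per partner: half of the buried SASA."""
    return buried_sasa(sasa_a, sasa_b, sasa_ab) / 2.0

# Hypothetical SASA values in Å² for the free partners and the complex
sasa_a, sasa_b, sasa_ab = 6000.0, 5000.0, 9400.0

print(buried_sasa(sasa_a, sasa_b, sasa_ab))     # total surface buried
print(interface_area(sasa_a, sasa_b, sasa_ab))  # share assigned to each partner
```

Dividing by two assigns each partner the same share, which is an averaging convention: the two sides of a real interface need not bury exactly equal areas.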
Summary protein(-protein)
interactions
• Different binding modes (transient, obligate, also
depending on (co)localisation, etc.)
• Hydrophobic patch/hydrophilic rim conferring
binding specificity
• Interfaces are physico-chemically positioned in
between surface and protein core (amino acid
composition, etc.)
• Ribosomes
– Small/large subunits, mixture of RNA and protein,
different between prokaryotic and eukaryotic cells
(exploited by administering antibiotics), ribosomal
protein complexes, protein-RNA binding