
Basics of protein structure and stability IV: Anatomy of protein structure continued
Biochem 565, Fall 2008
09/03/08
Cordes
the main-chain can hydrogen bond to itself
there are also sidechain acceptors and donors
the carbonyl oxygen: main-chain hydrogen bond acceptor
the amide nitrogen: main-chain hydrogen bond donor
[Figure: polypeptide backbone fragment with the carbonyl oxygen and the amide N-H labeled]
Hydrogen bond geometry
• A hydrogen bond is not really a covalent “bond”: there is not much orbital overlap.
• Model it as an electrostatic interaction between two dipoles consisting of the H-N bond and the O sp2 lone pair. In electrostatic theory, the optimal orientation of two such dipoles is head-to-tail. The energy of such an arrangement should decrease as the head and tail are brought together, as long as atomic van der Waals radii are not violated (then repulsive forces quickly take over).
• The “ideal” hydrogen bond in this model would have r~3.0 Å, p=180°, b=0° and g=±60°. Convince yourself of this.
• In small molecule crystals, this is approximately what is observed, though there is a lot of variation in the angles b and g. Thus the precise C=O…H angle parameters are not critical.
• Main chain-main chain hydrogen bonds found in proteins will show various deviations from this geometry, partly due to the topological constraints imposed by forming secondary structures.
• What is a “reasonable” hydrogen bond? Criteria for identifying hydrogen bonds are somewhat arbitrary and many have been used. Here are a couple of examples.
• Geometric criteria: Often H-bonds are just identified by two parameters, the O…N (acceptor-donor) distance r, and an O…H-N angle p. The angles describing the C=O…H geometry are sometimes ignored. Typical cutoffs: p > 120° and r < 3.5 Å. (Baker & Hubbard, 1984)
• Electrostatic criteria: One of the most commonly used criteria is a potential function based on a pure electrostatic model (Kabsch & Sander, 1983). Place partial charges on the C,O (+q1,-q1) and N,H (-q2,+q2) atoms and compute a binding energy as the sum of repulsive and attractive interactions between these four atoms:
E = q1*q2*(1/r(ON) + 1/r(CH) - 1/r(OH) - 1/r(CN))*f
where q1=0.42e and q2=0.20e, f is a dimensional factor (=332) to convert E to kcal/mol, and r(AB) is the interatomic distance between atoms A and B. A hydrogen bond is then identified by a binding energy less than some arbitrary cutoff, e.g. E < -0.5 kcal/mol.
• Note that the criteria defined above are only applicable when hydrogen atom positions are available. Crystal structures generally do not include hydrogens; however, their positions can be computed in many cases.
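The two kinds of criteria can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names and the example coordinates (four collinear atoms with a 3.0 Å O…N distance) are made up for the sketch, while the constants and cutoffs are the ones quoted on the slide.

```python
import math

# Constants quoted above (Kabsch & Sander, 1983): charges in units of e,
# F converts the result to kcal/mol when distances are in Angstroms.
Q1, Q2, F = 0.42, 0.20, 332.0

def kabsch_sander_energy(c, o, n, h):
    """Electrostatic H-bond energy (kcal/mol) from C, O, N, H coordinates."""
    d = math.dist
    return Q1 * Q2 * (1 / d(o, n) + 1 / d(c, h) - 1 / d(o, h) - 1 / d(c, n)) * F

def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b at the vertex."""
    u = [ai - bi for ai, bi in zip(a, b)]
    v = [ci - bi for ci, bi in zip(c, b)]
    cos_t = sum(ui * vi for ui, vi in zip(u, v)) / (math.hypot(*u) * math.hypot(*v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))

def geometric_hbond(o, n, h, r_max=3.5, p_min=120.0):
    """Distance/angle test: O...N distance r < r_max and O...H-N angle p > p_min."""
    return math.dist(o, n) < r_max and angle_deg(o, h, n) > p_min

# Hypothetical, perfectly linear C=O...H-N arrangement along the x axis.
C, O, H, N = (0.0, 0.0, 0.0), (1.23, 0.0, 0.0), (3.23, 0.0, 0.0), (4.23, 0.0, 0.0)
energy = kabsch_sander_energy(C, O, N, H)
print(round(energy, 2), energy < -0.5, geometric_hbond(O, N, H))
```

Both tests need a hydrogen position, which is why the note about computing hydrogen coordinates for crystal structures matters in practice.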
Identifying main-chain H-bonds in X-ray structures of proteins
[Figure: geometric criteria for a main-chain hydrogen bond defined without hydrogen coordinates. Labels: N…O distance ≤ 3.5 Å; angular ranges 0-20°, 90-150° and 110-180° marked on the donor (N, DD, DD') and acceptor (O, AA) atoms.]
The torsion angle N-DD-DD'-O describes the degree to which the acceptor oxygen is out of plane with the nitrogen donor.
DD, DD' = donor antecedent atoms
AA = acceptor antecedent atoms
X-ray structures of proteins do not in general include hydrogen atom coordinates. Get used to looking at pictures of proteins without the hydrogens, and having your mind fill them in.
Presta LG & Rose GD Science 240, 1632 (1988)
Secondary structure elements in proteins reflect the tendency of backbone to hydrogen bond with itself in a semi-ordered fashion when compacted
A secondary structure element is a contiguous region of a protein sequence characterized by a repeating pattern of main-chain hydrogen bonds and backbone phi/psi angles
[Figure labels: alpha-helix (local interactions); beta-strand (nonlocal interactions)]
Local backbone H-bonding: DSSP turn/helix definitions
Kabsch & Sander, 1983

3-turn:
     >      3      3      <          notation
  -N-C-C--N-C-C--N-C-C--N-C-C-       residues
   H   O  H   O  H   O  H   O
       >----------------<            H-bond

4-turn:
     >      4      4      4     <    notation
  -N-C-C--N-C-C--N-C-C--N-C-C-N-C-C  residues
   H   O  H   O  H   O  H   O H   O
       >----------------------<      H-bond

5-turn: just an elaboration of the 3- and 4-turn.

A minimal helix is two consecutive N-turns. For a minimal 4-helix from residue i to i+3:
     i          <-- residue
    >444<       and
     >444<      overlap to give
    >4444<      which defines a helix
     HHHH       from i to i+3
‘H’ is the notation for a residue in a 4-helix.
Notice that the helix does not include the residues involved in the terminal H-bonds.
Longer helices are overlapping minimal helices.
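The rule "two consecutive n-turns make a minimal helix" can be sketched as follows. This is an illustrative simplification, not the DSSP program itself: the function name, the 0-based residue numbering and the representation of hydrogen bonds as a set of (acceptor residue, donor residue) pairs are assumptions of the sketch.

```python
def assign_helix(n_residues, hbonds, n=4):
    """Mark residues belonging to an n-helix.

    hbonds is a set of (i, j) pairs meaning C=O of residue i accepts an
    H-bond from N-H of residue j (0-based indices, hypothetical input format).
    An n-turn starts at i when (i, i + n) is present; two consecutive turns
    starting at i-1 and i define a minimal helix covering i .. i+n-1.
    """
    turn = [(i, i + n) in hbonds for i in range(n_residues)]
    helix = [False] * n_residues
    for i in range(1, n_residues):
        if turn[i - 1] and turn[i]:
            for k in range(i, min(i + n, n_residues)):
                helix[k] = True
    return "".join("H" if in_helix else "-" for in_helix in helix)

# Two consecutive 4-turns starting at residues 0 and 1 of a 12-residue chain.
print(assign_helix(12, {(0, 4), (1, 5)}))
```

With only a single turn the function returns no helix at all, and longer helices arise naturally because overlapping minimal helices mark overlapping residues.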
the alpha-helix: repeating i,i+4 h-bonds
[Figure: alpha-helix with residues 1-12 numbered and a hydrogen bond marked, alongside a phi-psi plot marking the right-handed helical region]
By DSSP definitions, which of residues 1-12 are in the helix? Does this coincide with the residues in the helical region of phi-psi space?
The α-helix, with i,i+4 H-bonds, is not the only way to have local hydrogen bonding of the backbone to itself.
[Figure: α-helix, 3₁₀ helix and π helix. These are poly-Ala, so the gray balls on the outside are β-carbons from the side chains.]
The 3₁₀ helix has hydrogen bonds between residues i and i+3.
The π helix has hydrogen bonds between residues i and i+5.
For a number of reasons almost all helices in proteins are α-helices. These include backbone and side chain steric issues, van der Waals contacts, and H-bond geometry.
3₁₀ and π helices have sterically allowed conformations, but not in the most favored regions of phi-psi space.
Helix nomenclature: α-helix example
1 Hydrogen bond between C=O (residue i) and H-N (residue i+4): 1-5 helix
2 Repeating unit:
– 5 turns
– 18 residues per repeat
18₅ helix
3 Loop formed between C=O (residue 1) and H-N (residue 5):
– 13 atoms
– 3.6 residues per turn
3.6₁₃ helix (the nomenclature used for the 3₁₀ helix)
α-helices extend by approximately 1.5 Å per residue, 5.4 Å per turn.
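The numbers in the nomenclature are tied together by simple arithmetic, which a short sketch makes explicit. The function name and return layout are illustrative choices; the inputs are the values quoted on the slide.

```python
def helix_parameters(residues_per_repeat, turns_per_repeat, rise_per_residue):
    """Return (residues per turn, pitch in Angstroms, rotation per residue in degrees)."""
    residues_per_turn = residues_per_repeat / turns_per_repeat
    pitch = residues_per_turn * rise_per_residue
    rotation_per_residue = 360.0 / residues_per_turn
    return residues_per_turn, pitch, rotation_per_residue

# alpha-helix: 18 residues in 5 turns, rising about 1.5 Angstroms per residue
residues_per_turn, pitch, rotation = helix_parameters(18, 5, 1.5)
print(residues_per_turn, round(pitch, 2), round(rotation, 1))
```

So 18 residues in 5 turns gives 3.6 residues per turn, and 3.6 residues at 1.5 Å each gives the 5.4 Å rise per turn.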
Nonlocal backbone hydrogen bonding: DSSP bridge, ladder and sheet definitions
Kabsch & Sander, 1983
ladder = set of one or more consecutive bridges of identical type
sheet = set of one or more ladders connected by shared residues

parallel bridge:
            x                notation
  -N-C-C--N-C-C--N-C-C-      residues
   H   O  H   O  H   O
   [H-bonds between the two strands, drawn with \ and /, or with .]
   H   O  H   O  H   O
  -N-C-C--N-C-C--N-C-C-      residues
            x                notation

antiparallel bridge:
            X                notation
  -N-C-C--N-C-C--N-C-C-      residues
   H   O  H   O  H   O
   [H-bonds between the two strands, drawn with ! or with .]
   O   H  O   H  O   H
  -C-C-N--C-C-N--C-C-N-      residues
            X                notation
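A bridge between residues i and j can be detected from a list of backbone hydrogen bonds. The sketch below assumes the pairing rules as they are usually quoted from Kabsch & Sander (1983); those rules are not spelled out in these notes, so treat them as an assumption. The function names and the (acceptor residue, donor residue) input format are also hypothetical.

```python
def bridge_type(hbonds, i, j):
    """Classify the bridge between residues i and j, or return None.

    hbonds is a set of (a, d) pairs meaning C=O of residue a accepts an
    H-bond from N-H of residue d (hypothetical input format).
    Assumed rules (as commonly quoted from Kabsch & Sander, 1983):
      parallel:     [hb(i-1, j) and hb(j, i+1)] or [hb(j-1, i) and hb(i, j+1)]
      antiparallel: [hb(i, j) and hb(j, i)] or [hb(i-1, j+1) and hb(j-1, i+1)]
    """
    def hb(a, d):
        return (a, d) in hbonds

    if (hb(i - 1, j) and hb(j, i + 1)) or (hb(j - 1, i) and hb(i, j + 1)):
        return "parallel"
    if (hb(i, j) and hb(j, i)) or (hb(i - 1, j + 1) and hb(j - 1, i + 1)):
        return "antiparallel"
    return None

# Hypothetical H-bond lists for a pair of residues 10 and 30.
print(bridge_type({(10, 30), (30, 10)}, 10, 30))
print(bridge_type({(9, 30), (30, 11)}, 10, 30))
```

Ladders and sheets then follow by grouping: consecutive bridges of the same type form a ladder, and ladders that share residues form a sheet.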
beta strands/sheets
[Figure: beta-sheet with residues numbered between 49 and 57, alongside a phi-psi plot marking the beta-strand region]
Is this a parallel or anti-parallel sheet?
By DSSP definitions, which of residues 49-57 are in the sheet? Does this coincide with the residues in the beta-strand region of phi-psi space?
Principal types of secondary structure found in proteins
Repeating (φ,ψ) values

                           φ        ψ
α-helix (1-5)             -63°     -42°
3₁₀ helix (1-4)           -57°     -30°
Parallel β-sheet         -119°    +113°
Antiparallel β-sheet     -139°    +135°

The helix values are for right-handed helices.
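As a rough illustration of how these repeating values can be used, the sketch below finds which canonical conformation an observed (φ,ψ) pair lies closest to. This is only a nearest-neighbor lookup against the table, not how DSSP assigns secondary structure (DSSP uses hydrogen bond patterns); the names are hypothetical.

```python
import math

# Repeating (phi, psi) values from the table, in degrees.
CANONICAL = {
    "alpha-helix": (-63.0, -42.0),
    "3-10 helix": (-57.0, -30.0),
    "parallel beta-sheet": (-119.0, 113.0),
    "antiparallel beta-sheet": (-139.0, 135.0),
}

def angle_difference(a, b):
    """Smallest absolute difference between two angles in degrees (handles wrap-around)."""
    return abs((a - b + 180.0) % 360.0 - 180.0)

def nearest_canonical(phi, psi):
    """Name of the canonical conformation closest to (phi, psi)."""
    return min(
        CANONICAL,
        key=lambda name: math.hypot(
            angle_difference(phi, CANONICAL[name][0]),
            angle_difference(psi, CANONICAL[name][1]),
        ),
    )

print(nearest_canonical(-60.0, -45.0))
print(nearest_canonical(-120.0, 115.0))
```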
All the most common secondary structure conformations fit nicely within sterically allowed regions of phi-psi space
[Figure: phi-psi plot with labeled regions: antiparallel beta-sheet, parallel beta-sheet, alpha-helix]
Because of the repetitive nature of secondary structures, and particularly beta-sheets, proteins can form fibrillar structures and aggregates
[Figure: amyloid-like fibril (left) of peptide GNNQNNY from the yeast prion protein Sup35, and its atomic structure (right); labels: fibril axis, amide stacks]
in the case of this fibril the side chains also hydrogen bond to each other
Nelson et al (Eisenberg lab), Nature 435:773 (2005).
for background on “polar zippers”: Perutz et al. PNAS 91:5355 (1994)
These types of fibrils are important in Huntington’s disease etc.
Fibrillar helical structures: the leucine zipper
[Figure: GCN4 “leucine zipper” (green) bound as a dimer (two copies of the polypeptide) to target DNA; leucines (Leu) labeled]
The GCN4 dimer is formed through hydrophobic interactions between leucines (red) in the two polypeptide chains
Are main-chain H-bonds why proteins are special?
“It would seem extraordinary that no other polymer structures exist in which internal hydrogen bonding can give rise to periodically ordered conformations, but no others have been found thus far. We are therefore forced to recognize the uniqueness of this capacity in polypeptide chains, one which enables them to meet the exacting and sophisticated demands of structure and function”
--Doty P, Gratzer WB in Polyamino acids, polypeptides and proteins, pp. 111-118, 1962, University of Wisconsin Press
see also Honig B & Cohen FE Folding & Design 1, R17-20 (1996).
Globular proteins
• Keep in mind, however, that if that were all proteins could do, they would just form regular repeating structures. Instead many proteins have globular structures consisting of short secondary structure elements connected by loops and turns that are not necessarily characterized by repeating hydrogen bond structures, but which serve in part to reverse the direction of the polypeptide chain.
[Figure label: loops and turns]
β-turns in proteins: reversing the chain direction
A β-turn consists of two residues, where there is a hydrogen bond between the carbonyl of the residue preceding the turn and the amide nitrogen of the residue following the turn. There are a number of ways to configure the backbone to achieve this.
[Figure labels: turn residues; direction of polypeptide chain]
four basic tight β-turns that all yield an i,i+3 hydrogen bond
[from Wilmot CM & Thornton JM J Mol Biol 203, 221 (1988)]

β-turn type    pos i+1 φ    pos i+1 ψ    pos i+2 φ    pos i+2 ψ
I                -60          -30          -90            0
I’               +60          +30          +90            0
II               -60         +120          +80            0
II’              +60         -120          -80            0
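The table lends itself to a simple lookup: given the backbone angles of the two turn residues, find the turn type whose ideal values they match. In the sketch below the 30° tolerance is an assumption chosen for illustration, not a value from the slide, and the names are hypothetical.

```python
# Ideal (phi i+1, psi i+1, phi i+2, psi i+2) values from the table, in degrees.
IDEAL_TURNS = {
    "I": (-60.0, -30.0, -90.0, 0.0),
    "I'": (60.0, 30.0, 90.0, 0.0),
    "II": (-60.0, 120.0, 80.0, 0.0),
    "II'": (60.0, -120.0, -80.0, 0.0),
}

def angle_difference(a, b):
    """Smallest absolute difference between two angles in degrees (handles wrap-around)."""
    return abs((a - b + 180.0) % 360.0 - 180.0)

def classify_turn(phi1, psi1, phi2, psi2, tolerance=30.0):
    """Return the beta-turn type whose four ideal angles all lie within
    `tolerance` degrees of the observed ones, or None if none match.
    The default tolerance is an illustrative assumption."""
    observed = (phi1, psi1, phi2, psi2)
    for name, ideal in IDEAL_TURNS.items():
        if all(angle_difference(o, i) <= tolerance for o, i in zip(observed, ideal)):
            return name
    return None

print(classify_turn(-65.0, -25.0, -95.0, 10.0))
print(classify_turn(55.0, -115.0, -85.0, 5.0))
```

A strand-like pair of residues, for example (-60, 150) followed by (-150, 150), matches none of the four types and returns None.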
Side chain conformation
• side chains differ in their number of degrees of conformational freedom (some don’t have any, such as Ala and Gly)
• but side chains of very different size can have the same number of χ angles.
Side chain conformations: canonical staggered forms
[Figure: glutamate side chain with the χ1, χ2 and χ3 torsion angles marked, and Newman projections for χ1 of glutamate]
name of conformation (t = trans, g = gauche):
χ1 = 180°: t
χ1 = +60°: g+
χ1 = -60°: g–
Side chain angles are defined moving outward from the backbone, starting with the N atom: so the χ1 angle is N–Cα–Cβ–Cγ, the χ2 angle is Cα–Cβ–Cγ–Cδ ...
IUPAC nomenclature:
http://www.chem.qmw.ac.uk/iupac/misc/biop.html
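A side chain torsion angle such as N–Cα–Cβ–Cγ can be computed from the coordinates of its four atoms. The sketch below is a generic dihedral calculation in plain Python; the function name and the test coordinates are made up for illustration, and the sign follows the usual convention (positive when the far bond is rotated clockwise relative to the near bond, looking down the central bond).

```python
import math

def dihedral_deg(p0, p1, p2, p3):
    """Torsion angle p0-p1-p2-p3 in degrees, in the range -180 to 180."""
    def sub(a, b):
        return [x - y for x, y in zip(a, b)]

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def cross(a, b):
        return [a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0]]

    b0 = sub(p0, p1)
    b1 = sub(p2, p1)
    b2 = sub(p3, p2)
    norm = math.sqrt(dot(b1, b1))
    b1 = [x / norm for x in b1]
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = [x - dot(b0, b1) * y for x, y in zip(b0, b1)]
    w = [x - dot(b2, b1) * y for x, y in zip(b2, b1)]
    return math.degrees(math.atan2(dot(cross(b1, v), w), dot(v, w)))

# Hypothetical coordinates: cis (0 degrees), +90 degrees, and trans (180 degrees).
cis = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
plus90 = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
trans = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
print(cis, plus90, abs(trans))
```

Passing the N, Cα, Cβ and Cγ coordinates in that order gives χ1; shifting the window one atom outward gives χ2, and so on.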
Rotamers
• a particular combination of side chain torsional angles χ1, χ2, etc. for a particular residue is known as a rotamer.
• for example, for leucine, if one considers only the canonical staggered forms, there are nine (3²) possible rotamers: g+g-, g+g+, g-g-, g-g+, tg+, g+t, tg-, g-t, tt
• not all rotamers are equally likely.
• for example, valine prefers its t rotamer (picture at right)
[Figure: distribution of valine rotamers in protein structures (from Ponder & Richards, 1987), plotted against χ1 from 0 to 360°, with a Newman projection of the χ1 = 180° (trans, or t) form]
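The count of canonical rotamers is just three staggered states raised to the number of χ angles, which a short enumeration confirms. The function name is an illustrative choice.

```python
from itertools import product

STAGGERED_STATES = ("g+", "t", "g-")

def canonical_rotamers(n_chi):
    """All combinations of the three staggered states over n_chi torsion angles."""
    return ["".join(states) for states in product(STAGGERED_STATES, repeat=n_chi)]

leucine = canonical_rotamers(2)  # leucine has two chi angles
print(len(leucine), leucine)
```

A residue with four χ angles, such as lysine, would have 3⁴ = 81 canonical combinations by the same count.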
Side chain rotamers are not limited to canonical staggered forms: there are many subtly different rotamers
[Figure from Xiang & Honig, 2001]
This figure simply shows that the more structures you examine, the more different rotamers become apparent. So as databases of structure have increased, so has the richness of our understanding of side chain conformation.
How many rotamers there are also depends on how you define whether two conformations represent different rotamers:
An “x degree rotamer” in this figure means that at least one side chain angle differs by x degrees: hence classifying rotamers by a 10 degree difference standard is finer grained than classifying them by, say, a 40 degree standard