Protein Structure Determination Methods
2.2. High resolution techniques
2.2.1. X-ray crystallography
X-rays are a form of electromagnetic radiation. The wavelength commonly used
by crystallographers is 0.154 nm. This wavelength is used because it is comparable to
atomic dimensions (e.g. 0.15 nm for C-C single bonds). The major problem with all
crystallographic techniques is the initial requirement of a crystal, since scattering from
an individual molecule is far too weak to be detected (MacArthur et al., 1994). Good
crystals are grown from a protein solution under a narrow range of conditions which are
[...] science. Once a suitable crystal has been obtained, diffraction experiments will
determine the intensity of the scattered waves, but further techniques are required to
elucidate their phase. Crystals concentrate the scattering exclusively in discrete
directions to produce the diffraction pattern in accordance with the Bragg equation:
$$n\lambda = 2d\sin\theta \qquad \text{(Eq. 2.1)}$$
where n is an integer, d is the spacing between lattice planes and θ is the angle of
incidence at the wavelength λ. The intensity of the scattered radiation falls off with
increasing θ value. There is a minimum value of d (d_min) that corresponds to the
maximum observed θ. The resolution of a crystal structure is determined by the d_min
value because it defines the ability to distinguish between adjacent structural features.
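The relationship between the maximum observed angle and the resolution limit can be illustrated with a minimal sketch. It assumes the standard form of the Bragg equation, nλ = 2d sin θ, and the 0.154 nm wavelength quoted above; the function name and the example angles are hypothetical and chosen only for illustration.

```python
import math


def bragg_spacing(theta_deg, wavelength_nm=0.154, n=1):
    """Return the lattice-plane spacing d (nm) from n*lambda = 2*d*sin(theta).

    theta_deg is the angle of incidence in degrees; the default wavelength
    is the 0.154 nm value quoted in the text.
    """
    if not 0.0 < theta_deg <= 90.0:
        raise ValueError("theta must be greater than 0 and at most 90 degrees")
    return n * wavelength_nm / (2.0 * math.sin(math.radians(theta_deg)))


# Hypothetical maximum observed angles (not taken from any real data set).
# A larger theta_max gives a smaller d_min, i.e. a higher resolution.
for theta_max in (10.0, 20.0, 30.0):
    d_min = bragg_spacing(theta_max)
    print(f"theta_max = {theta_max:4.1f} deg -> d_min = {d_min:.3f} nm")
```

Because d is inversely proportional to sin θ, reflections recorded at higher angles correspond to finer structural detail.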
The basic structural unit of a crystal, or unit cell, is repeated infinitely in three
dimensions. The unit cell is characterised by the vectors a, b and c and the angles α, β,
γ that form the edges of a parallelepiped (Chang, 1981; MacArthur et al., 1994). It was
shown in 1850 by Bravais that there are only 14 unit cell types, known as the Bravais
lattices (Chang, 1981). Reconstruction of the electron density of the molecule requires knowledge of the phase as well as the amplitude.
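As an illustration of how the six cell parameters define the parallelepiped, the sketch below computes the unit-cell volume from the edge lengths and angles using the general triclinic formula. The formula is standard geometry rather than something stated in this section, and the function name and cell dimensions are hypothetical.

```python
import math


def unit_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Volume of a parallelepiped with edges a, b, c and angles alpha, beta, gamma.

    alpha is the angle between b and c, beta between a and c, and gamma
    between a and b. The volume is returned in the cube of the edge unit.
    """
    ca = math.cos(math.radians(alpha_deg))
    cb = math.cos(math.radians(beta_deg))
    cg = math.cos(math.radians(gamma_deg))
    radicand = 1.0 - ca**2 - cb**2 - cg**2 + 2.0 * ca * cb * cg
    if radicand <= 0.0:
        raise ValueError("angles do not describe a valid unit cell")
    return a * b * c * math.sqrt(radicand)


# Hypothetical cells (edge lengths in nm), for illustration only.
print(unit_cell_volume(5.0, 6.0, 7.0, 90.0, 90.0, 90.0))   # all angles 90 degrees
print(unit_cell_volume(5.0, 5.0, 8.0, 90.0, 90.0, 120.0))  # gamma = 120 degrees
```

When all three angles are 90° the expression reduces to the product abc, and for a cell with γ = 120° it reduces to abc sin γ.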
An electron density map is a contoured representation of the electron density at
various points in a crystal structure. Electron density is highest at atoms. The map may
be calculated by Fourier summation from the experimental structure amplitudes, F_obs, and
an appropriate set of phases, α(hkl):
$$\rho(xyz) = \frac{1}{V}\sum_{hkl} |F_{obs}(hkl)| \cos\left[2\pi(hx + ky + lz) - \alpha(hkl)\right] \qquad \text{(Eq. 2.2)}$$
where ρ(xyz) is the electron density at the point xyz, which are the fractional coordinates
measured from the unit cell origin; hkl are integers characteristic of a given reflection;
α(hkl) is the phase of that reflection; and V is the unit-cell volume. To solve the phase
problem, diffraction from at least two heavy-atom derivatives (for example, cobalt or
uranium salts) must be measured.
Often this has been accomplished by irradiating two [...] ions. This ‘isomorphous
replacement’ method requires that metal ions be incorporated into a crystal without
affecting its structure, which is sometimes difficult to achieve. A more recent solution
to the phase problem involves using synchrotron radiation at multiple wavelengths. This
has greatly accelerated the rate of solving crystal structures. Another method of solving
the phase problem is molecular replacement, where the phases are calculated by fitting
the known structure of a structurally homologous protein to the observed intensities. A
complete description of each reflection, its position, intensity and phase, is called a
‘structure factor’. Structure factor data files are available at the PDB for approximately
25% of the crystallographic entries. Publishing the structure factors allows other groups
to generate and examine the electron density map, or to try alternative refinement or
phasing methods. Once an electron-density map has been constructed, the molecular
structure is derived using molecular graphics software. A molecular model of the
sequence of amino acids or nucleotides, which must be known independently, is
fitted into this electron density map, and a series of refinements is performed. The
result is a set of X, Y, Z Cartesian coordinates for every non-hydrogen atom in the
molecule (MacArthur et al., 1994).
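The Fourier summation of Eq. 2.2 can be sketched in a few lines of code. The example below assumes the conventional form of the summation, in which each reflection contributes its amplitude multiplied by the cosine of 2π(hx + ky + lz) minus its phase. The reflection list is synthetic: it is generated for a single hypothetical point atom so that the expected density peak is known, and none of the names or values come from a real data set.

```python
import math


def electron_density(x, y, z, reflections, volume):
    """Evaluate rho(xyz) = (1/V) * sum |F(hkl)| * cos(2*pi*(hx+ky+lz) - alpha(hkl)).

    reflections maps (h, k, l) to (amplitude, phase_in_radians);
    x, y, z are fractional coordinates; volume is the unit-cell volume.
    """
    total = 0.0
    for (h, k, l), (amplitude, phase) in reflections.items():
        total += amplitude * math.cos(2.0 * math.pi * (h * x + k * y + l * z) - phase)
    return total / volume


def point_atom_reflections(x0, y0, z0, max_index=2, scattering=1.0):
    """Synthetic amplitudes and phases for one point atom at (x0, y0, z0)."""
    indices = range(-max_index, max_index + 1)
    return {
        (h, k, l): (scattering, 2.0 * math.pi * (h * x0 + k * y0 + l * z0))
        for h in indices for k in indices for l in indices
    }


# Hypothetical atom position and unit-cell volume, for illustration only.
atom = (0.25, 0.25, 0.25)
reflections = point_atom_reflections(*atom)
at_atom = electron_density(*atom, reflections, volume=100.0)
away = electron_density(0.75, 0.75, 0.75, reflections, volume=100.0)
print(f"density at atom: {at_atom:.3f}, away from atom: {away:.3f}")
```

With correct phases every term adds constructively at the atom position, which is why the density is highest there. If the phases were discarded the peak would no longer be recovered, which illustrates why the phase problem must be solved before a map can be calculated.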
The initial electron density map is very approximate, so the model derived from it is
refined until the best agreement is found between the observed structure amplitudes (F_obs)
and those back-calculated from the model (F_calc). An R-factor index of agreement is
calculated. It is a measure of the correctness of the derived model, summed over all
reflections:
$$R = \frac{\sum_{hkl} \big|\, |F_{obs}| - |F_{calc}| \,\big|}{\sum_{hkl} |F_{obs}|} \qquad \text{(Eq. 2.3)}$$
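As a worked illustration of the R-factor, the sketch below assumes its conventional definition: the sum over all reflections of the absolute difference between observed and calculated amplitudes, divided by the sum of the observed amplitudes. The function name and the amplitude values are hypothetical.

```python
def r_factor(f_obs, f_calc):
    """Return sum(| |F_obs| - |F_calc| |) / sum(|F_obs|) over paired reflections."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("f_obs and f_calc must be non-empty and of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    return numerator / denominator


# Hypothetical amplitudes for four reflections (arbitrary units).
observed = [120.0, 85.0, 40.0, 15.0]
before_refinement = [150.0, 60.0, 55.0, 5.0]
after_refinement = [118.0, 88.0, 37.0, 16.0]

print(f"R before refinement: {r_factor(observed, before_refinement):.3f}")
print(f"R after refinement:  {r_factor(observed, after_refinement):.3f}")
```

A value of zero corresponds to perfect agreement between the model and the data, so R falls as refinement improves the model.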
The initial model produced from an electron density map may have poor
stereochemistry, implausible non-bonded contacts, and bond lengths and angles that
show excessive deviations from ideal values, i.e. the model represents an unstable
structure possessing high potential energy. The procedure used in model refinement is to
lower the energy by adjusting the above parameters until they reach acceptable levels near
their preferred values while still maintaining the experimentally determined conditions
(MacArthur et al., 1994). A high temperature factor suggests either
disorder or thermal motion. Disorder means that the atom occupies different positions in
different molecules in the crystal, while ‘thermal motion’ refers to vibration of an atom
[...] diffraction data. If portions of a chain have high mobility or disorder, they produce low
and fairly uniform electron density, making it impossible to assign positions to atoms
in such portions. For this reason, it is not uncommon to find the termini of a protein
chain, and perhaps loops, missing from a crystallographic atomic coordinate file.
Bacterial proteins are not glycosylated and are therefore more amenable to
crystallisation, and bacterial expression may be used to produce recombinant
unglycosylated proteins. The
multidomain complement components that have been crystallised so far include C3a
(Huber et al., 1980), C-type lectin domains of mannose binding protein (Weis et al.,
1991a, b), factor D (Narayana et al., 1991a, b) and the vWF type A domain of
complement receptor type 3 (CR3) (Lee et al., 1995). Recently, the crystal structure of
two N-terminal SCR domains of CD46 (1ckl, Casasnovas et al., 1999) and the crystal
structure of human β2-glycoprotein I (β2GPI), a heavily glycosylated five SCR domain
plasma membrane-adhesion protein, have been published (1qub, Bouma et al., 1999).
Other crystal structures are available for C1r and C1s (Chapter 4) and the single domain
protein C8γ of the membrane attack complex, and the NMR structure is known for CD59.
In general, crystallisation of complement proteins is problematic because of the
high degree of glycosylation. For example, C1 inhibitor has a carbohydrate content of
26% by weight (Perkins et al., 1990). In addition, many complement proteins are
composed of domains that show interdomain mobility, which also hinders crystallisation.
Recombinant DNA techniques are available to produce proteins at high concentration
and also allow single domains to be studied. As of 17 December 2002, 15,160 crystal
structures had been deposited in the Protein Data Bank.