X-ray diffraction - Protein crystallography

1 General Introduction

1.3 Methods for characterizing protein-carbohydrate interactions

1.3.2 Protein crystallography

1.3.2.2 X-ray diffraction

X-rays are electromagnetic radiation with wavelengths ranging from 0.1 to 100 Å. They can be produced by bombarding a metal target, most often copper, with an electron beam that is generated by a heated filament and accelerated by an electric field. A high-energy electron collides with the metal target, displacing an electron from a low-lying orbital of the metal and causing an electron from a higher orbital to drop into the vacated one. This transition of an electron from a higher shell (L or M) to the K-shell results in the emission of the electron's excess energy in the form of an X-ray photon112.

When a single protein crystal is exposed to a monochromatic X-ray beam, the X-rays are scattered by the atoms in the crystal and a set of diffraction spots, called reflections, is recorded on the detector, forming the diffraction pattern. Each reflection in the diffraction pattern results from a monochromatic wave scattered constructively by all equivalent lattice points that fulfil Bragg's Law (Equation 1.1 and Figure 1.10),

$$n\lambda = 2d\sin\theta \qquad (1.1)$$

which describes the relationship between the angle of the incident beam, $\theta$, and the wavelength, $\lambda$, where $d$ is the spacing between equivalent planes in the crystal and $n$ is an integer. The minimum spacing that can be measured, i.e. the maximum resolution, is $d_{min} = \lambda / (2 \sin\theta_{max})$.

Figure 1.10. Bragg's Law defines the relationship between the angle of the incident radiation, θ, the distance between the planes of a crystal, d, and the wavelength of the incident radiation, λ. The waves of incident monochromatic radiation are reflected by the parallel, equidistant planes of a crystal. When the difference in optical path between the scattered waves is an integer multiple of the wavelength, constructive interference occurs and a diffraction spot is produced.
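As a worked illustration of Equation 1.1, the following minimal Python sketch computes the plane spacing and the maximum resolution from the wavelength and the Bragg angle. The function names, the flat-detector geometry helper and the Cu Kα wavelength constant are assumptions introduced here for illustration; they are not taken from the original text.

```python
import math

# Assumed example value: the commonly quoted Cu K-alpha wavelength in angstrom.
CU_K_ALPHA = 1.5418


def d_spacing(wavelength, theta_deg, n=1):
    """Plane spacing d (angstrom) that satisfies n * lambda = 2 * d * sin(theta)."""
    return n * wavelength / (2.0 * math.sin(math.radians(theta_deg)))


def max_resolution(wavelength, theta_max_deg):
    """d_min = lambda / (2 * sin(theta_max)): the smallest measurable spacing."""
    return d_spacing(wavelength, theta_max_deg, n=1)


def theta_max_from_geometry(detector_radius_mm, distance_mm):
    """Largest Bragg angle (degrees) reaching a flat detector normal to the beam.

    Hypothetical helper: the scattering angle is 2 * theta, so
    2 * theta_max = atan(radius / crystal-to-detector distance).
    """
    return 0.5 * math.degrees(math.atan2(detector_radius_mm, distance_mm))
```

For example, with λ = 1.5418 Å and θmax = 30°, sin θmax = 0.5 and d_min equals the wavelength itself, about 1.54 Å. A larger θmax gives a smaller d_min, that is, a higher resolution.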

The intensity of each reflection is recorded by the detector and corresponds to the intensity of this constructively scattered wave. The diffraction data contain information from all atoms in the structure and are obtained as a list of reflection intensities with hkl indices ($I_{hkl}$)64,65,114,115, defined by the crystal planes that gave rise to each reflection. At this point, the unit cell parameters and space group can be determined, and a full data collection experiment is performed. Afterwards, all the collected diffraction images are integrated, and the intensity and hkl indices of each reflection are extracted. The diffracting power of the crystal dictates the high-resolution limit and the overall quality of the data set65. This quality is assessed by calculating a series of parameters: the value of I/σ(I), which corresponds to the signal-to-noise ratio; the completeness, which is the percentage of reflections measured relative to the total number of reflections that could be measured for that crystal, and should be greater than 90%; the redundancy or multiplicity, which is the number of times the same reflection was measured; the Rmerge (Equation 1.2) and Rp.i.m. (Equation 1.3) factors, which compare the intensities measured for symmetry-equivalent reflections and should be as low as possible, since equivalent reflections must have similar intensity values; and the CC1/2 (correlation coefficient between random half data sets116), which is generally close to 1 at low resolution and falls to near zero at higher resolution as the intensities become weaker. Individually, these parameters do not define the resolution limit, but together they are used to assess the quality of a data set.
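The CC1/2 statistic described above can be sketched as follows. This is a simplified illustration, not the procedure of any particular data-processing program: the observations of each reflection are split at random into two halves, each half is averaged, and the Pearson correlation coefficient between the two sets of averages is returned. The input layout (a dictionary mapping hkl tuples to lists of intensities), the function names and the fixed random seed are assumptions made for this example.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cc_half(observations, seed=0):
    """CC1/2 sketch: correlate mean intensities of two random half data sets.

    observations: dict mapping (h, k, l) to a list of measured intensities.
    Reflections measured fewer than twice cannot be split and are skipped
    (an assumption of this sketch).
    """
    rng = random.Random(seed)
    half_a, half_b = [], []
    for intensities in observations.values():
        if len(intensities) < 2:
            continue
        shuffled = list(intensities)
        rng.shuffle(shuffled)
        mid = len(shuffled) // 2
        half_a.append(sum(shuffled[:mid]) / mid)
        half_b.append(sum(shuffled[mid:]) / (len(shuffled) - mid))
    return pearson(half_a, half_b)
```

In practice the statistic is evaluated per resolution shell, so that its fall from near 1 towards zero at high resolution can be followed; the sketch omits the binning step and does not guard against half data sets with zero variance.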

$$R_{merge} = \frac{\sum_{hkl}\sum_{i=1}^{n} |I_i(hkl) - \bar{I}(hkl)|}{\sum_{hkl}\sum_{i=1}^{n} I_i(hkl)} \qquad (1.2)$$

$$R_{p.i.m.} = \frac{\sum_{hkl}\sqrt{1/(n-1)}\,\sum_{i=1}^{n} |I_i(hkl) - \bar{I}(hkl)|}{\sum_{hkl}\sum_{i=1}^{n} I_i(hkl)} \qquad (1.3)$$
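The merging statistics of Equations 1.2 and 1.3, together with multiplicity and completeness, can be illustrated with a short Python sketch. The data layout (a dictionary mapping hkl tuples to lists of symmetry-equivalent intensity measurements) and the function names are assumptions made for this example, and reflections measured only once are left out of the R factors because the 1/(n − 1) term is undefined for them.

```python
import math


def merging_r_factors(observations):
    """Return (R_merge, R_pim) following Equations 1.2 and 1.3.

    observations: dict mapping (h, k, l) to a list of intensities I_i(hkl).
    """
    num_merge = 0.0
    num_pim = 0.0
    denominator = 0.0
    for intensities in observations.values():
        n = len(intensities)
        if n < 2:
            continue  # assumption: singly measured reflections are skipped
        mean_i = sum(intensities) / n
        abs_dev = sum(abs(i - mean_i) for i in intensities)
        num_merge += abs_dev
        num_pim += math.sqrt(1.0 / (n - 1)) * abs_dev
        denominator += sum(intensities)
    return num_merge / denominator, num_pim / denominator


def multiplicity(observations):
    """Average number of measurements per unique reflection."""
    total = sum(len(v) for v in observations.values())
    return total / len(observations)


def completeness(observations, n_possible):
    """Percentage of the theoretically measurable unique reflections observed."""
    return 100.0 * len(observations) / n_possible
```

Because of the 1/(n − 1) weighting, the contribution of a reflection to Rp.i.m. decreases as it is measured more often, whereas its contribution to Rmerge does not; for a reflection measured exactly twice the two contributions coincide.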

Nowadays, most structures are solved using synchrotron radiation64, because the X-rays from a synchrotron facility are a thousand times brighter than those from a fixed-wavelength laboratory X-ray generator. Synchrotrons have the additional advantage of providing tunable radiation, which can be set to different wavelengths of interest. Synchrotron facilities have contributed to the structural characterization of protein-carbohydrate interactions for most carbohydrate-recognising protein families117.

1.3.2.3 3D structure determination

To solve the structure of a protein or protein-ligand complex, the electron density that surrounds all the atoms of the macromolecule in the crystal must be calculated. This calculation requires both the structure factor amplitudes and the phases, as expressed by the electron density equation (Equation 1.4):

$$\rho(x, y, z) = \frac{1}{V} \sum_{hkl} |F_{hkl}| \cdot e^{2\pi i \alpha_{hkl}} \cdot e^{-2\pi i (hx + ky + lz)} \qquad (1.4)$$

where $|F_{hkl}|$, known as the amplitude of the structure factor, is obtained from the experimentally measured intensity of each reflection ($|F_{hkl}| = \sqrt{I_{hkl}}$), $\alpha_{hkl}$ is the phase angle of the scattered wave, $(x, y, z)$ is a position in the unit cell and $V$ is the volume of the unit cell.
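The Fourier synthesis of Equation 1.4 can be written directly as a sum over reflections. The sketch below is a minimal, unoptimised illustration under two stated assumptions: the phase is expressed as a fraction of a cycle, so that the phase factor is exp(2πiα) as written in Equation 1.4, and both members of every Friedel pair are present in the input, so that the imaginary parts cancel. The function name and input layout are hypothetical.

```python
import cmath
import math


def electron_density(x, y, z, reflections, volume):
    """Evaluate Equation 1.4 at the fractional position (x, y, z).

    reflections: dict mapping (h, k, l) to (amplitude, alpha), where alpha is
    the phase in cycles (assumption: 1.0 corresponds to 360 degrees).
    volume: unit cell volume V.
    """
    total = 0j
    for (h, k, l), (amplitude, alpha) in reflections.items():
        phase_term = cmath.exp(2j * math.pi * alpha)
        position_term = cmath.exp(-2j * math.pi * (h * x + k * y + l * z))
        total += amplitude * phase_term * position_term
    # With complete Friedel pairs the sum is real; rounding noise is discarded.
    return total.real / volume
```

As a small check, three reflections (0,0,0), (1,0,0) and (−1,0,0) with amplitudes 10, 4 and 4 and phases 0, 0.25 and −0.25 give a density proportional to 10 + 8 cos(2π(0.25 − x)), which peaks at x = 0.25. Real programs evaluate this sum on a grid with a fast Fourier transform rather than point by point.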

However, phase angle values cannot be obtained from the collected data set. This is known as the Phase Problem in crystallography. While the intensities, and hence the structure factor amplitudes, are measured directly in the diffraction experiment, the phase angles must be estimated correctly in order to obtain the electron density map64,65.
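The consequence of losing the phases can be illustrated with a one-dimensional toy model. In the sketch below, structure factors are computed for a hypothetical two-atom structure using the standard structure factor sum, the "measured" intensities are taken as their squared moduli, and the density is then synthesised twice: once with the true complex structure factors and once with the amplitudes alone (all phases set to zero). The atom positions, the unit scattering factors, the range of h and all names are assumptions chosen for illustration.

```python
import cmath
import math


def structure_factor(h, atoms):
    """F(h) for a 1D cell; atoms is a list of (scattering_factor, fractional_x)."""
    return sum(f * cmath.exp(2j * math.pi * h * x) for f, x in atoms)


def density(x, factors):
    """1D Fourier synthesis over a dict {h: F(h)}, for a unit cell of length 1."""
    total = sum(F * cmath.exp(-2j * math.pi * h * x) for h, F in factors.items())
    return total.real


# Hypothetical structure: two identical atoms at x = 0.2 and x = 0.5.
atoms = [(1.0, 0.2), (1.0, 0.5)]
true_factors = {h: structure_factor(h, atoms) for h in range(-5, 6)}

# The detector records intensities only: the phase of each F(h) is lost.
intensities = {h: abs(F) ** 2 for h, F in true_factors.items()}

# Amplitudes recovered from the intensities, with every phase set to zero.
zero_phase_factors = {h: math.sqrt(i) for h, i in intensities.items()}
```

With the true phases, `density(0.2, true_factors)` shows a peak at the atomic position, whereas the zero-phase synthesis has its maximum at the origin, where no atom is located. The amplitudes alone are therefore not sufficient to locate the atoms.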

Three methods are generally used to determine the phases. Single or Multiple Isomorphous Replacement (SIR/MIR) uses heavy atoms such as Au, Pt and Hg, usually soaked into the native crystals. Multiwavelength Anomalous Dispersion (MAD) and Single-wavelength Anomalous Dispersion (SAD) exploit the anomalous dispersion (or scattering) effect of specific atoms, typically selenium, using selenomethionyl proteins in which methionines are replaced by selenomethionine during protein expression. SIRAS and MIRAS are additional methods that combine isomorphous replacement and anomalous scattering. Finally, the Molecular Replacement (MR) method can be applied if the structure of the same protein or a similar one, with at least 30% amino acid sequence identity, has already been solved64,65. MR is of particular importance for protein-ligand complexes, as the phases of the unliganded structure can be combined with the diffraction intensities of the crystal with the bound ligand to calculate its electron density map65.

In document Protein-carbohydrate recognition in the biodegradation of the plant cell wall: Functional and structural studies using carbohydrate microarrays and X-ray crystallography (Page 57-60)