Normal mode analysis-interpreted THz vs NMR data: site-specific changes in protein dynamics upon ligand

Protein Dynamics

Equation 5.5 For each structure, a histogram was generated of THz frequency versus number of

5.3.3 Normal mode analysis-interpreted THz vs NMR data: site-specific changes in protein dynamics upon ligand binding

The frequencies at which above-error changes were observed in the THz difference spectra were used to provide an NMA interpretation at residue resolution, as detailed in §5.2.2.3.4. This was achieved by picking the modes with corresponding frequencies, identifying the residues whose fluctuations are most affected by those modes, and then averaging these results over modes and, for NE NMA only, over structures. Because the comparison NMR data are incomplete, only the ten most affected residues were considered. Restricting the comparison in this way gives a higher chance of successfully assessing whether the NMA-interpreted THz difference spectra generally agree with NMR, despite the incomplete data. Here the common interpretation of trends in ΔS² across the backbone is used, i.e. that they represent areas of differential flexibility, in this case upon ligand binding.³⁴
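The selection and averaging steps described above can be sketched as follows. This is a minimal illustration and not the procedure of §5.2.2.3.4: the function name, the array layout, and the use of frequency windows to stand in for the above-error regions of the difference spectrum are all assumptions made for the example.

```python
import numpy as np

def top_affected_residues(freqs, fluct, windows, n_top=10):
    """Rank residues by their averaged contribution from the selected modes.

    All inputs are hypothetical layouts chosen for this sketch:
    freqs   : (n_structures, n_modes) mode frequencies
    fluct   : (n_structures, n_modes, n_residues) contribution of each mode
              to each residue's fluctuation
    windows : list of (low, high) frequency ranges, in the units of freqs,
              where the difference spectrum changes by more than its error
    """
    freqs = np.asarray(freqs, dtype=float)
    fluct = np.asarray(fluct, dtype=float)

    per_structure = []
    for f, c in zip(freqs, fluct):
        # Pick the modes whose frequency lies inside any above-error window.
        selected = np.zeros(f.shape, dtype=bool)
        for low, high in windows:
            selected |= (f >= low) & (f <= high)
        if selected.any():
            # Average the per-residue contribution over the selected modes.
            per_structure.append(c[selected].mean(axis=0))

    if not per_structure:
        raise ValueError("no modes fall inside the given frequency windows")

    # Average over structures: one structure for SS NMA, several for NE NMA.
    per_residue = np.mean(per_structure, axis=0)

    # Indices of the n_top residues, largest averaged contribution first.
    return np.argsort(per_residue)[::-1][:n_top]
```

In this sketch the function would be called once for the apo structures and once for the holo structures, giving the 10 + 10 predictions per NMA type per ligand.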

Results are displayed in Figure 5.7a+b, using the following colour scheme. Residues with no ΔS² values are shown in red. ΔS² values that are zero within error are shown as white, whereas those above error are represented in greyscale according to magnitude (darker = greater change). For all three NMA types, i.e. SS and NE at both temperatures, the THz difference spectra were interpreted using both bound and unbound structures, because the mode-RMSF relationship differs between them whilst both contribute to the observed difference spectrum. The ten most affected residues are displayed as black boxes in Figure 5.7a+b, by ligand and type of NMA. There are 20 predictions per NMA type per ligand: 10 for apo and 10 for holo. Irrespective of the MD temperature, the NE NMA averaging results in a greater spread of predictions across the protein backbone than SS NMA, or even the first frame of the NE NMA, for which all predictions are in the C terminus.

The alignment of these THz-NMA predictions with the NMR data is summarised in Table 5.2 according to two metrics: ‘incorrect’ and ‘hit sum’. ‘Incorrect’ represents the proportion of NMA predictions (black squares) that the NMR data reveal to be incorrect, i.e. predictions for residues with ΔS² that are zero within error (white squares). This is measured out of 20, because there are 20 predictions per NMA type per ligand: 10 for apo and 10 for holo. Because there are two ΔS² columns, each white square counts as 0.5, so that when both NH and CH₃ ΔS² are zero within error the score is 0.5 + 0.5 = 1. This value should be minimal for the technique to demonstrate utility, because the NMA predictions represent only the ten most affected residues, which should not have ΔS² that are zero within error. Table 5.2 shows that these values in fact range from 1 to 6, and that the value decreases, i.e. improves, in the order SS NMA → NE NMA 110 K → NE NMA 298 K. To aid interpretation, the expected value from random placement of predictions was calculated. Only NE NMA 298 K performs better than random.

‘Hit sum’ represents the degree to which predictions are for residues with higher ΔS² values. It reports the sum of all above-error ΔS² values aligning with predictions. The higher the value, therefore, the better the prediction of residues whose site-specific dynamics are known (from NMR S² data) to change upon ligand binding. NE NMA 110 K has the highest total hit sum for both ligands, and NE NMA 298 K the lowest.
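The two metrics, and the random-placement baseline, can be expressed compactly in code. The sketch below is illustrative only: the data layout, the function names, the use of absolute ΔS² in the hit sum, the counting of a residue twice when both apo and holo predict it, and the way the random expectation is computed are all assumptions, since the text does not specify these details.

```python
def score_predictions(predicted, delta_s2, n_columns=2):
    """Return (incorrect, hit_sum) for a set of predicted residues.

    predicted : residue ids for the apo and holo predictions combined
                (assumed: a residue predicted by both is counted twice)
    delta_s2  : dict mapping residue id -> list with one entry per NMR
                column (NH, CH3); each entry is None when there is no data,
                otherwise a (value, error) tuple (hypothetical layout)
    """
    incorrect = 0.0
    hit_sum = 0.0
    for residue in predicted:
        for entry in delta_s2.get(residue, [None] * n_columns):
            if entry is None:
                continue  # no NMR data, so the prediction cannot be judged
            value, error = entry
            if abs(value) <= error:
                # Zero within error (white square): 0.5 each for two columns.
                incorrect += 1.0 / n_columns
            else:
                # Above error: add the magnitude of the change (assumed).
                hit_sum += abs(value)
    return incorrect, hit_sum


def expected_random_incorrect(delta_s2, n_residues, n_predictions=20, n_columns=2):
    """Expected 'incorrect' score if predictions were placed at random.

    Assumes every one of the n_residues positions is equally likely to be
    predicted, including residues that have no NMR data.
    """
    white = sum(
        1
        for entries in delta_s2.values()
        for entry in entries
        if entry is not None and abs(entry[0]) <= entry[1]
    )
    mean_score_per_residue = (white / n_columns) / n_residues
    return n_predictions * mean_score_per_residue
```

Because expectation is linear, the baseline is the same whether or not the ten apo (or holo) predictions are required to be distinct residues.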

Together, these metrics suggest two things about generating predictions closer to the residues that NMR shows to be most dynamically affected: that NE NMA performs better than SS NMA, and that sampling at the THz temperature of 110 K may be superior. The power of both metrics would increase with more comprehensive NMR data. The mediocre performance in this preliminary assessment highlights the need for experimental systems with more complete NMR S² data before this approach can be conclusively assessed as a reliable probe of site-specific changes in protein dynamics. Additionally, improving the NMA modelling of the system, by performing NE NMA using trajectories run on the crystal asymmetric unit and potentially by optimising hydration, could improve this agreement.

           IBMP                                            Hexanol
           r     SS    NE NMA 110 K  NE NMA 298 K          r     SS    NE NMA 110 K  NE NMA 298 K
Incorrect  3.4   6.0   4.0           3.0                   2.7   5.5   3.5           1.0
Hit sum          0.46  0.59          0.28                        1.46  2.32          0.74

Table 5.2 NMA-interpreted THz vs NMR: summary of agreement from Figure 5.7a+b. ‘Incorrect’ reports the number of known incorrect predictions, i.e. those aligning with ΔS² that are zero within error. Given the incompleteness of the NMR dataset, the ‘r’ column gives the expected value if the predictions were random. ‘Hit sum’ reports the sum of all above-error ΔS² values aligning with predictions, the relative magnitude indicating the type of NMA whose predictions best align with above-error ΔS². Out of all NMA types, NE NMA 298 K has the lowest incorrect score, and NE NMA 110 K has the highest total hit sum.

Figure 5.7a+b (next page) NMA-interpreted THz difference spectra: top 10 ΔRMSF residues upon ligand binding. Data are sorted by increasing residue number with secondary structure elements displayed. ΔS² above error are displayed in the NMR columns: NH S² and CH₃ S² are labelled as N and C respectively. Residues with no ΔS² values are shown in red. ΔS² values that are zero within error are shown as white, whereas those above error are represented in greyscale according to magnitude (darker = greater change). The other columns display the 10 most affected residues derived from the difference spectrum as black boxes (details in §5.2.2.3.4). These strips of columns are single structure (SS), NE NMA 110 K (N1) and NE NMA 298 K (N2), with apo (unbound) and holo (bound) data represented as A and H respectively. a) is for IBMP binding and b) is for hexanol binding.

Figure 5.7c+d (next page) NMA-only predictions of ΔRMSF upon ligand binding. The figure follows the same scheme as Figure 5.7a+b. Residues whose average NMA-derived RMSF changes upon ligand binding by more than the error are shown as black boxes in the relevant columns. These run in the order of single structure, NE NMA 110 K and NE NMA 298 K (labelled S, 1 and 2 respectively), firstly averaged over 1000 modes (1k), then averaged over 7000 modes (7k). c) is for IBMP binding and d) is for hexanol binding.

5.3.4 Changes in positional fluctuations upon ligand binding

In document Dynamics and thermodynamics of protein-ligand interactions (Page 151-155)