2. Materials and Methods
6.3 Investigating low resolution structure with small angle X-ray scattering
6.3.5 Modelling the 2HDLL:M100 complex
The modelling program MONSA was used to construct potential envelopes of the 2HDLL:M100 complex. MONSA is an ab initio modelling algorithm that uses iterative dummy atom modelling to find the solution that most closely matches the input SAXS data [401]. Baseline-subtracted data, the proportional volumes of each species in solution, and the dmax (maximum dimension of the particle being modelled) are the only inputs required for MONSA. The Rg may also be supplied as an additional modelling constraint.
MONSA can be used to model complexes by inputting multiple data sets and specifying which complex components are present in each data set. MONSA can also accommodate phases with different scattering densities, which here allows the DNA and protein components to be distinguished. For modelling the 2HDLL:M100 complex, data sets for both the complex and the DNA alone were input, for the reasons outlined at the end of the previous section.
The inherent flexibility within the complex means that, at higher q values, the scattering data do not tend towards 0 as quickly as they would for a highly ordered species. This affects modelling performance, because MONSA assumes that the data tend towards 0. To prevent this from affecting the models generated, the data were truncated at a q cut-off of 0.3. The signals beyond this point are dominated by short-range internal density fluctuations, which MONSA does not consider because it models each component with uniform density. As a result, truncation of the data should not affect the validity of the fits generated.
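The truncation step described above can be sketched as follows. This is a minimal illustration only, not the procedure or software actually used to prepare the MONSA input: the function name `truncate_at_q` and the three-column layout (q, intensity, error) are assumptions, and q is taken to be in the same units as the 0.3 cut-off quoted in the text.

```python
def truncate_at_q(q, intensity, error, q_max=0.3):
    """Keep only the data points with q <= q_max.

    q, intensity and error are equal-length sequences (an assumed
    three-column SAXS layout). q_max defaults to the 0.3 cut-off
    quoted in the text, in whatever units the data use.
    """
    if not (len(q) == len(intensity) == len(error)):
        raise ValueError("q, intensity and error must have the same length")

    # Filter the three columns together so that they stay aligned.
    kept = [(qi, ii, ei) for qi, ii, ei in zip(q, intensity, error) if qi <= q_max]
    if not kept:
        return [], [], []

    q_t, i_t, e_t = (list(column) for column in zip(*kept))
    return q_t, i_t, e_t
```

Filtering on the q value itself, rather than cutting at a fixed row index, keeps the cut-off consistent between the complex and DNA-only data sets even if they were recorded on slightly different q grids.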
Due to uncertainty around the dmax of the complex, as well as the Rg, these parameters were varied between modelling runs. Each combination of parameters was used for three modelling runs. The resulting ensemble of models showed varying conformations of the complex. Four representative models will be discussed, as they illustrate the variations and similarities present within the ensemble (Figure 6.14).
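The parameter sweep described above can be laid out as a simple grid, sketched below under stated assumptions. The Rg values are those listed in the caption of Figure 6.14; the dmax values are hypothetical placeholders, since the values actually tested are not given here. The function `build_runs` is illustrative only and does not call MONSA.

```python
from itertools import product


def build_runs(rg_values, dmax_values, repeats=3):
    """List every (Rg, dmax) combination, repeated `repeats` times.

    Returns one dictionary per modelling run; nothing is executed.
    """
    return [
        {"rg": rg, "dmax": dmax, "repeat": repeat}
        for rg, dmax in product(rg_values, dmax_values)
        for repeat in range(1, repeats + 1)
    ]


# Rg values from the caption of Figure 6.14.
rg_values = [35, 37, 39, 42]
# HYPOTHETICAL dmax values, used only to show the shape of the grid.
dmax_values = [100, 120]

runs = build_runs(rg_values, dmax_values)
print(len(runs))  # 4 Rg values x 2 dmax values x 3 repeats = 24 runs
```

Listing the runs explicitly before modelling makes it straightforward to record which parameter combination produced each member of the final ensemble.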
Figure 6.14: Examples of 2HDLL:M100 models generated using MONSA.
These models were generated by restricting the Rg of 2HDLL during modelling. (A) Rg of 35;
(B) Rg of 37; (C) Rg of 39; (D) Rg of 42. The parts of the model corresponding to DNA are
shown in green and the protein component is shown in gold.
In all of these models the DNA appears to have been modelled effectively as a roughly rod-like or elongated volume. The models also place the protein in close proximity to the DNA, indicative of an interaction, but also show a substantial volume of protein that does not appear to contact the DNA. This volume potentially represents the LIM:LID part of the construct that lies between the two homeodomains of 2HDLL.
There was significant variation amongst the models in terms of how much of the protein was placed in contact with the DNA. Several models showed an extended interaction interface between the DNA and the protein, suggesting that both homeodomains bind the DNA (Figure 6.14 C and D), but others showed only a small portion of protein interacting with the DNA, which could indicate single homeodomain binding (Figure 6.14 A and B). In those models where the protein is not interacting along the length of the DNA, it is interesting to note that the protein-DNA interface is primarily localised to one end of the DNA, as opposed to the centre. This may reflect a tight interaction between Lhx3HD and the AAATTA site in the
similar fit to the data (Figure 6.15). Since all of the models fit the data equally well, it cannot be determined which conformation is more likely to represent the 2HDLL:M100 complex. Indeed, it is possible that all of these conformations are represented in solution.
Figure 6.15: Assessing the fits of 2HDLL:M100 models generated. The same
models are used as in Figure 6.14. (A) Plot of fits overlaid on experimental data for 2HDLL:M100; (B) Difference plot of fits to 2HDLL:M100; (C) Plot of fits overlaid on experimental data for M100; (D) Difference plot of fits to M100. Difference plots were generated by calculating the difference between the experimental data and the fit, then dividing by the error of the fit.
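The difference-plot calculation described in the caption can be sketched as below. This is an illustration only: the function name `normalised_residuals` is hypothetical, and `error` is assumed to be a per-point uncertainty supplied alongside the experimental and fitted intensities.

```python
def normalised_residuals(i_exp, i_fit, error):
    """Return (experimental - fit) / error for each data point.

    All three inputs are equal-length sequences sampled at the same
    q values; `error` is assumed to be a non-zero per-point uncertainty.
    """
    if not (len(i_exp) == len(i_fit) == len(error)):
        raise ValueError("inputs must have the same length")
    return [(exp - fit) / err for exp, fit, err in zip(i_exp, i_fit, error)]


# Two made-up points, for illustration only (not experimental values).
print(normalised_residuals([10.0, 8.0], [9.0, 9.0], [0.5, 2.0]))
# [2.0, -0.5]
```

Dividing by the error puts every point on a common scale, so a systematic trend in the residuals across a range of q stands out more clearly than it would in a plot of raw differences.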
It is not possible to determine how Isl1HD is interacting with the DNA from the SAXS data
presented in this chapter. More data is required to gain further insight into the conformational and structural interplay between the protein and DNA components of this complex. As the 2HDLL:M100 complex has been shown here to be flexible in nature, it may not be possible to obtain one definitive structure of the complex.
6.4 Discussion
This chapter has sought to gain insight into the structure of the Lhx3:Isl1 DNA-binding module. Due to the flexible nature of the 2HDLL:M100 complex observed in SAXS, it is not yet possible to construct a definitive model of how this complex binds DNA. However, the data presented here have provided hints as to the main features of binding.
Chapter 5 showed that the Isl1 homeodomain in isolation is unable to bind DNA with high affinity or specificity. The additional SAXS data presented in this chapter show that when brought into close proximity to Lhx3HD within fusion constructs, Isl1HD can potentially bind
to DNA. These data give rise to the following question about the function of Isl1: if the protein only makes a small contribution to DNA-binding in the context of a larger complex, how does this affect its function? Several theories are plausible.