CHAPTER 4. STRUCTURAL AND FUNCTIONAL STUDY OF
4.3. Results
The DNA substrates used in our experiments (Table 4.1) were chosen based on the following considerations. As discussed in Section 4.1.3, the length of a repeat is strongly correlated with its instability: beyond a certain threshold, the longer the repeat, the more likely it is to expand. Secondary structures (slip-outs) predicted in repeat instability models are more likely to form, and are more stable, when the repeat is longer. The slip-out size and structure are also affected by the repeat length and construct [228], and could affect how the slip-outs interact with proteins. The DNA substrates chosen, (CTG)1, (CTG)5, and (CTG)56/(CAG)54, are predicted to form different structures, and were previously shown to activate different MutLα incision
activities that are MutSβ, PCNA, and RFC dependent [194]. That study shows that a slip-out of two or three repeats is sufficient for PCNA to load effectively onto DNA and activate
MutLα’s incision activity, and suggests that the slip-out size correlates positively with PCNA loading. However, for larger slip-outs (𝑛 > 3), the activity decreases. We postulate that while these larger slip-outs continue to facilitate PCNA loading, they could inhibit the functions of mismatch repair proteins. The structures of the MutSβ-DNA and MutSβ-MutLα-DNA complexes on these DNA substrates, which have not been shown, could be key to understanding the decrease in incision activity on larger slip-outs. Therefore, resolving the conformations of the MutSβ-DNA and MutSβ-MutLα-DNA complexes on these substrates is an important step towards unveiling the molecular interplay between MMR and TNR. Specifically, the (CTG)1 substrate is chosen as a positive control for functional MMR, as demonstrated in past studies [189, 231]. Conversely, the (CTG)0 substrate is chosen as a negative control for non-specific interactions. The (CTG)5 substrate is selected as a larger slip-out that is predicted to exhibit aberrant processing, based on previous studies of similarly sized slip-outs [190, 194]. Since repeat instability
occurs once n exceeds a threshold of about 30 repeats [218], the (CTG)56/(CAG)54 substrate is selected as a model heteroduplex with slip-out(s) embedded in a long repeat tract, with
important implications for modeling how repeats expand in vivo [243].
All analyses are carried out across three or more experimental replicates, with results carefully examined across the replicates. As in any experiment, inconsistencies do arise. The results are filtered and agglomerated based on the following factors: image quality, quality of sample deposition, and consistency of other experimental variables. Because the analysis is incomplete and better analysis methods are still being developed (CHAPTER 3), not enough data have been analyzed, and more experiments may be needed to resolve existing
inconsistencies in the data. Despite the author’s best effort, the analyzed data are not consistent enough to support a compelling statistical assessment. As a result, statistical analysis is omitted in this section and only the agglomerated results are shown.
4.3.1. (CTG)1
To estimate the stoichiometry of proteins in a protein complex on (CTG)1, we performed volume analysis as described in Section 3.5.3C. The volumes of protein complexes are measured and their distribution is plotted in Figure 4.8B. Since AFM volume correlates positively with molecular weight [84], the stoichiometry of proteins in a protein complex can be estimated from the volume distribution. In both nucleotide conditions (ADP or ATP), a dominant peak
representing a single MutSβ protein can be distinctly resolved at 600-800 nm³ (Figure 4.8B). The volume at the peak location is similar to that of the free proteins (Figure 4.8E), which are known to exist as single-protein heterodimers [232]. This result suggests that the majority of complexes consist of a single MutSβ protein only, a stoichiometry of 1:1. A much smaller second peak can also be resolved at 1200-1300 nm³, suggesting that a fraction of the complexes are
multi-protein complexes (Table 4.3). The volumes are independent of their position on the DNA, suggesting that complexes at the specific site (i.e. the slip-out) have the same size as those at the non-specific sites or the DNA terminus.
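The volume-based assignment described above can be illustrated with a minimal sketch. It is not the analysis pipeline used in this work: the 1000 nm³ cutoff is an assumed value placed between the two reported peaks, and the function names and example volumes are hypothetical.

```python
from typing import Iterable, List

# Assumed cutoff (nm^3) lying between the reported single-protein peak
# (600-800 nm^3) and the second peak (1200-1300 nm^3). Illustrative only.
CUTOFF_NM3 = 1000.0


def classify_complex(volume_nm3: float, cutoff_nm3: float = CUTOFF_NM3) -> str:
    """Label one AFM complex as single- or multi-protein by its volume."""
    if volume_nm3 <= 0:
        raise ValueError("volume must be positive")
    return "multi-protein" if volume_nm3 > cutoff_nm3 else "single-protein"


def multi_protein_fraction(
    volumes_nm3: Iterable[float], cutoff_nm3: float = CUTOFF_NM3
) -> float:
    """Fraction of complexes whose volume exceeds the cutoff."""
    labels: List[str] = [classify_complex(v, cutoff_nm3) for v in volumes_nm3]
    if not labels:
        raise ValueError("no volumes supplied")
    return labels.count("multi-protein") / len(labels)


if __name__ == "__main__":
    # Made-up volumes for illustration; these are not measured data.
    example_volumes = [650.0, 700.0, 720.0, 780.0, 1250.0]
    print(multi_protein_fraction(example_volumes))
```

A fixed cutoff is the simplest choice; fitting the two peaks of the volume distribution would be an alternative when the populations overlap.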
We then measure the stoichiometry of protein complexes on protein-bound DNA (i.e. free DNAs are filtered out) by counting the number of protein complexes on those DNAs (Figure 4.6 middle). In the presence of ADP, our data show that the majority of the protein-bound DNAs are bound by one protein complex (Figure 4.9B, ADP), similar to results obtained from the
homoduplex controls (Figure 4.9A, Figure E.3, ADP). In the presence of ATP, the stoichiometry increases on DNA containing (CTG)1 (Figure 4.9B, ATP), which is not observed in the
homoduplex controls (Figure 4.9A, Figure E.3, ATP). These results suggest that MutSβ loading is mainly facilitated through the specific site, and that the number of MutSβ complexes on the
(CTG)1 substrate increases in the presence of ATP, consistent with the prediction of the sliding clamp model (Section 4.1.1). Overall, about 15% of DNA-bound proteins are clustered in the vicinity of one another (Table 4.3). Here, clustering is defined as an event where individual complexes neighbor each other (Figure 4.5 green box) or proteins form a single multi-protein complex (Figure 4.8B, >1000 nm³). Protein clustering may be indicative of coordination between proteins during repair processing.
Next, protein positions are measured on protein-bound DNAs that have one protein bound (Figure 4.6 right). Because we use unblocked DNAs, we use the short-arm distance (the shortest distance to either end of the DNA) for our position measurement (Figure 4.7B), and therefore positions from one side (the side that includes the specific site) will overlay on top of those from the other side. This results in an
overlay of the specific binding distribution on top of the non-specific binding distribution (footnote 104), but it will not affect our ability to estimate the binding specificity for the specific site (see Figure 4.7C description). In the presence of ADP, the position distribution shows a preference for binding around the specific site on (CTG)1 (~39% length, Figure 4.10B, ADP), suggesting that MutSβ binds preferentially to the specific site relative to the non-specific sites. In the presence of ATP, the position distribution continues to show a similar preference for the specific site (Figure 4.10B, ATP). This result appears to be inconsistent with the sliding clamp model (Section 4.1.1), which predicts a drop in MutSβ’s binding affinity for the specific site upon ATP activation. However, it is possible that MutSβ’s binding affinity for the non-specific sites also decreases in the presence of ATP, leaving the relative preference for the specific site unchanged. We also performed experiments on a longer (CTG)1 heteroduplex (Table 4.1, single-cut column). On this substrate, the preference for the specific site is notably lower in the presence of ATP than in the presence of ADP (Figure 4.11). A specificity calculation for the ADP case (calculation not shown) gives ~1000 (i.e. the binding affinity at the specific site is ~1000-fold stronger than that at a non-specific site). Because of the inconsistencies between the results on the short and the long (CTG)1 heteroduplexes, more analysis and/or experiments are needed. On the short DNA substrates (Table 4.1, double-cut column), unexpectedly, MutSβ also seems to show some preference around the same region (~40% length) on the homoduplex DNA substrate (CTG)0 when ADP is present (Figure 4.10A, ADP). However, no preference for this region is found in the presence of ATP when examining the distribution on the (CTG)0
homoduplex (Figure 4.10A ATP). We also investigated the distribution on a different and longer homoduplex (Table 4.1, 2033bp Homoduplex), and the distribution is essentially flat (Figure E.2). These results suggest that some systematic error may be at play when measuring positions in the presence of ADP. In position analysis (Figure 4.7C), the baseline occurrence probability Pmin (i.e. the average non-specific binding probability) can be used to estimate binding specificity (Section 3.5.3B), as expressed in the specificity equation (see Figure 4.7C description). The smaller the Pmin, the higher the binding specificity (and the preference for the specific site). In our data, despite the complication on the homoduplex control, the baseline of the non-specific distribution (Figure 4.10A, B, dashed line) is lower on (CTG)1 compared to its homoduplex counterpart (CTG)0, suggesting that MutSβ still favors the specific site on (CTG)1 more than the non-specific site on (CTG)0 around the same area.
104 Since the short-arm length caps at half-length of the DNA, the short-arm length fraction caps at 50% as shown in Figure 4.10.
To characterize the conformation of the complexes, we measured DNA bending with or without MutSβ. Hairpins of 7-18 bp have been visualized by AFM before [244] (Figure 4.13 left); however, the hairpin can often be indistinguishable from natural DNA distortions and other background features. As an alternative, we measured notable kinks and their locations instead (Figure 4.13 right). Because the kinks are spotted by eye and filtered by their notability, lower-bent and unbent states are filtered out. Our preliminary data show that the kink angles on the specific site (~40°) are indistinguishable from those on the non-specific sites (Figure 4.14 left column, (CTG)1). However, since lower-bent and unbent states, which dominate the non-specific sites (unpublished data, footnote 105), are filtered out in our analysis, the bend angles observed on the specific site are likely less affected by the filter and are therefore more likely to recapitulate the native kink angles of the slip-out. We also examine the locations of sharp kinks (> 60°), and they do not show any preference for the slip-out location (Figure 4.14 right column, (CTG)1). Since we are making a big assumption (on the filter), we do not draw a conclusion on the native bend angle for (CTG)1 here and only note that it appears as 40° in this preliminary analysis.
105 Our past AFM data showed that the non-specific bending distribution on free DNAs is a half-Gaussian centered at 0°, i.e. on average, the DNA is unbent on non-specific locations.
The DNA bending with MutSβ shows a distribution with two bent states. One of the bent states (‘high bent state’) shows that MutSβ sharply bends the DNA by 100° at the slip-out in the presence of ADP (Figure 4.12B, Specific Site, ADP), consistent with the sharp bending of a 3nt slip-out seen in the crystal structure [158]. The other state (‘unbent state’) shows an unbent population around 0°, which is not seen in the crystal structure. On the non-specific sites and on the homoduplex DNA fragment, MutSβ only bends the DNA by ~50° (Figure 4.12B Non-specific Site, Figure 4.12A, ADP), similar to what we have seen with E. coli and Taq MutS [6]. In the presence of ATP, interestingly, the DNA bending distribution shows a transition towards the lower 50° bent state on the specific site, and a transition towards an unbent state around 0° on the non-specific sites, suggesting potential conformational changes at both locations (Figure 4.12B). The shift towards decreased DNA bending is consistent with the ‘unbent state’ revealed in past studies on sliding clamp formation, which is likely induced by ATP exchange and/or ATPase activity [5, 6, 83]. Our data are mostly consistent with sliding clamp formation on small slip-outs despite MutSβ’s continued preference for the specific site.
4.3.2. (CTG)5
On (CTG)5, volume analysis resolves a dominant species with a volume consistent with a single MutSβ, as well as a smaller second species composed of multi-protein complexes, in both nucleotide conditions (Figure 4.8C). We also observe that the DNA is mainly occupied by one protein complex in both nucleotide conditions (Figure 4.9C). Since increased loading of MutSβ in the presence of ATP is not observed, this result suggests that the efficiency of multiple loading
on (CTG)5 could be lower than that on (CTG)1. Because we use linear unblocked DNA, we cannot rule out the possibility of multiple loading (and sliding clamp formation) on (CTG)5 based on this result (see the discussion). Overall, the population of proteins that are clustered ranges from 12% (ADP) to 20% (ATP) (Table 4.3), which is similar to the level of protein clustering seen on (CTG)1. Further error analysis is required, however, to understand the increase in clustering in the presence of ATP.
As discussed in the (CTG)1 section, we use the short-arm length for our position
measurement on the unblocked DNA, which resolves the ambiguity between binding sites at symmetric locations (i.e. positions at 39% and 61% of the length are both measured as 39%). The position distribution shows that (CTG)5 is recognized with very high specificity in the presence of ADP, as indicated by a pronounced peak at the slip-out (~39% length, Figure 4.10C ADP). The increase in specificity compared to (CTG)1 suggests that a larger slip-out may facilitate tighter MutSβ binding. Interestingly, MutSβ’s specificity for the slip-out decreases in the presence of ATP, as seen in the drop in binding preference for the slip-out (Figure 4.10C ATP).
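The short-arm folding described above can be sketched as follows. This is a minimal illustration, not the analysis code used in this work; the function name and the nanometer-based inputs are assumptions.

```python
def short_arm_fraction(distance_from_end_nm: float, contour_length_nm: float) -> float:
    """Fold a position measured from one DNA end onto the short arm.

    Returns the distance to the nearer end as a fraction of the DNA length,
    so the result never exceeds 0.5.
    """
    if contour_length_nm <= 0:
        raise ValueError("contour length must be positive")
    if not 0 <= distance_from_end_nm <= contour_length_nm:
        raise ValueError("position must lie on the DNA")
    fraction = distance_from_end_nm / contour_length_nm
    # Positions beyond the midpoint are mirrored onto the other arm.
    return min(fraction, 1.0 - fraction)


if __name__ == "__main__":
    # Hypothetical positions on a 100 nm molecule: 39 nm and 61 nm from one end.
    for position in (39.0, 61.0):
        print(position, round(short_arm_fraction(position, 100.0), 2))
```

Because the two ends of an unblocked DNA are treated as equivalent, both example positions fold to the same short-arm fraction, which is why the specific and non-specific distributions overlay in the folded histogram.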
The DNA bending on (CTG)5 without MutSβ is similar to that of (CTG)1, albeit slightly larger at ~50°, which also shows up on the non-specific sites (Figure 4.14 left column). We attribute the increase to surface changes between AFM depositions that could alter how much the DNA kinks. Interestingly, on (CTG)5, the kinks (at least the sharp kinks) show up more on the specific site than on the non-specific sites, unlike (CTG)1 (Figure 4.14 right column). Similar results are also obtained on (CAG)5 (Figure 4.14 right column, (CAG)5). These results suggest that the size of the slip-out correlates positively with the occurrence of kinks, with larger slip-outs showing kinks more frequently than smaller slip-outs. Given the assumption we made earlier (see the (CTG)1 section) and the observation that sharp kinks appear more on the
specific site than on the non-specific sites, we think the native kink angle on (CTG)5 could be larger than that observed on (CTG)1, with a nominal value of ~50° and reaching up to 80°. Again, since this result comes from a very preliminary analysis, we do not draw any conclusion on the native kink angle of (CTG)5 here.
Previously, crystal structures showed that the kink angle of MutSβ-bound DNA increases as the number of CA dinucleotide repeats in an insertion loop increases [158]. From this result, we predict that the bend angle of MutSβ-bound DNA increases as the number of CTG repeats in the slip-out increases. However, the bend angle of the high bent state measured on MutSβ-bound (CTG)5 by AFM is similar to that measured on (CTG)1 (~100°, Figure 4.12C, Specific Site, ADP). A lower bent state around 60° can also be seen. Interestingly, the unbent population seen on (CTG)1 is missing on (CTG)5 (Figure 4.12B, Specific Site, ADP). In the presence of ATP, DNA bending at the slip-out shifts towards an unbent state around 0° (Figure 4.12C, Specific Site, ATP) instead of the 40° seen on (CTG)1 (Figure 4.12B, Specific Site, ATP). On the non-specific sites, similar to (CTG)1 and the homoduplex controls, MutSβ induces the same 50° bend in the presence of ADP (Figure 4.12C, Non-specific Site, ADP). No change in DNA bending is seen, however, when ATP is present instead of ADP (Figure 4.12C, Non-specific Site, ATP).
Taken together, these results suggest that MutSβ may not be able to convert to the same conformation on (CTG)5 as seen on (CTG)1. Although the specificity of MutSβ for the slip-out on (CTG)5 is lower in the presence of ATP, increased loading has not been observed (pending further experiments on circular or blocked DNA), and MutSβ does not induce the same
conformational change as seen on (CTG)1 with ATP present. If MutSβ cannot form the sliding clamp on (CTG)5, the drop in specificity suggests that MutSβ, while unable to induce the same
conformational change as on (CTG)1, could dissociate from the DNA directly instead of proceeding down the sliding clamp repair pathway. This suggestion may explain why (CTG)5 was found to be refractory to processing in a past study [194].
Since the frequency of protein clustering on (CTG)5 is similar to that on (CTG)1 (Table 4.3), MutSβ clustering at large TNR slip-outs may not block repair as we initially hypothesized. Interestingly, with the addition of MutLα, we see larger complexes forming in most protein-DNA complexes, which could be the SL complexes (footnote 106) (preliminary data, Figure 4.15A-B). This result suggests that MutSβ may still be able to recruit MutLα despite its seeming inability to induce the same conformational change as on (CTG)1. This signaling-incapable SL complex may ultimately block the slip-out from being processed.
4.3.3. (CTG)56/(CAG)54
Compared with the other two CTG-only slip-outs, the (CTG)56/(CAG)54 substrate is a hybrid with complementary repeats on both strands (the CTG side is two repeats longer than the CAG side), and the slip-out may have a flexible location, size, and number. The repeat is located between 33% and 47% of the DNA length (Table 4.1). The hybrid has important implications for repeat expansion diseases because the length of the repeat is in the unstable zone and is prone to expand in vivo.
106 The MutSβ-MutLα-DNA experiment was done at higher concentration with chemical crosslinking (Section 4.2.2), which could promote the formation of larger complexes that are not seen in the MutSβ-DNA experiment (carried out at low concentration). A control MutSβ-DNA experiment done under the same conditions as the MutSβ-MutLα-DNA experiment would help to understand the composition of these large complexes (whether they could also be MutSβ multimers instead of the SL complexes). However, even with the control, the exact composition of these complexes will not be clear, since AFM cannot distinguish MutLα from MutSβ. A volume analysis of these complexes could help to resolve their composition, since MutSβ and MutLα have a measurable difference in molecular weight (AFM volume). In other words, the SL complex will have a different volume from that of the MutSβ multimer, and this difference will show up in the volume distribution.
On the (CTG)56/(CAG)54 substrate, both the volume analysis measuring the oligomerization state of the complex (Figure 4.8D) and the stoichiometry of protein complexes per DNA (Figure 4.9D) resolve a dominant species in both nucleotide conditions, suggesting that MutSβ binds
(CTG)56/(CAG)54 mainly as a single-protein complex. No increase in multiple loading is observed on the linear unblocked DNA substrate in the presence of ATP, similar to (CTG)5. As with the analysis on (CTG)5, a proper assessment of sliding clamp formation and multiple loading would require further experiments on end-blocked or circular DNA. Overall, ~16% of the DNA-bound proteins are clustered on the (CTG)56/(CAG)54 substrate in both nucleotide conditions (Table 4.3), which is similar to the level of protein clustering on the other DNA substrates.
A previous structural study on slipped DNA suggests that the hybrid will predominantly form a single two-repeat slip-out at the center of the repeat tract [228]. The position distribution, however, reveals a broad distribution of strong binding preference spanning the entire repeat tract in the presence of ADP, not just its center (Figure 4.10D). This result suggests that MutSβ does recognize the slip-outs formed in the hybrid, but that the slip-outs’ locations are flexible and could migrate within the repeat tract, so that the whole repeat tract is ‘specific’ to MutSβ. Interestingly, the specificity for the repeat area stays the same even in the presence of ATP, similar to (CTG)1.
We did not analyze the native kink angle on the (CTG)56/(CAG)54 substrate since the exact location of the slip-out is unknown. On MutSβ-bound DNAs, the distribution of DNA bend angles in the specific area is broad in the presence of ADP (Figure 4.12D, Specific Site, ADP). The non-specific bending at a lower 50° is expected because most sites on the hybrid are
non-specific (footnote 107). The broad distribution of highly bent states (>90°), however, is in striking contrast to the more distinct states found on (CTG)1 and (CTG)5 (Figure 4.12B-C, Specific Site, ADP). This result suggests that the slip-outs formed within the repeat may exist in multiple
conformations, potentially with multiple sizes, thereby contributing to the breadth of the bending distribution. In the presence of ATP, the highly bent populations shift towards lower bent states (<100°), resulting in a broad distribution with a mixture of bent and unbent states (Figure 4.12D,