4.10 Structural studies and discussion
1 The structure of Csa
In order to gain a better understanding of the molecular basis of the archaeal CASCADE, the structure of recombinant Csa2 (Sso1442) from S. solfataricus was solved by X-ray crystallography by Nathanael Lintner in the group of Martin Lawrence at Montana State University. For full crystallisation conditions, data collection and refinement details, refer to Lintner et al. (2011). Phases were determined by multi-wavelength anomalous diffraction on a KAu(CN)₂-soaked crystal which diffracted to 2.0 Å resolution, and a single-wavelength dataset, also at 2.0 Å resolution, was collected from a native crystal. The protein crystallised in space group P2₁2₁2₁ with four copies of Sso1442 in the asymmetric unit. About 8% of the residues in each chain could not be modelled as the electron density was not defined. Structure coordinates were deposited in the Protein Data Bank under accession code 3PS0. The final model consists of four Csa2 chains, but the crystal packing is not thought to represent a biologically relevant quaternary structure, as it exhibits closed symmetry, which is incompatible with the results obtained by TEM (discussed below).
The Csa2 monomer consists of three domains arranged vertically, 65 Å in length (figure 4.21 A). The central domain is essentially an RNA-recognition motif (RRM), a ferredoxin-like fold comprising the
four strands of a central antiparallel β-sheet (β6, β7, β1 and β8) and helices α1 and
α8, in the βαββαβ topology characteristic of this fold. This RRM is extended
by a connecting 13-aa loop leading to helix α9, an additional fifth strand of the
central β-sheet (β9) and helix α10, which forms the C-terminus of the protein.
Helices α9 and α10 are located on either side (above and underneath respectively) of
the central β-sheet, partially covering the β-sheet surface which is responsible for
RNA binding in the typical RRM fold. Moreover, the characteristic sequence motifs containing the aromatic residues responsible for RNA binding are not conserved in the
Csa2 central β-sheet (except for Tyr141), which suggests a distinct mode of RNA
recognition.
The second and third domains of the Csa2 structure are located on opposite
sides of the central RRM-like domain, and are termed the “1-3” and “2-4” domains respectively. The former consists of residues 27-46 and 145-180, which form
insertions one and three into the RRM domain. Specifically, helix α1 followed by a
disordered loop and strands β2-β3 forming an antiparallel hairpin are inserted between
the N-terminal β1 and α2, and helix α7 followed by a disordered loop (absent from the
model) is inserted between β6 and β7. The “2-4” domain consists of insertions two
and four “below” the central RRM domain. In particular, residues 68-136 extending
from helix α2 are arranged into short helices α3, α4, α5, α6 and a protruding hairpin
protein. Residues 192-216 form the fourth insertion, which consists of an extended
connecting loop between β7 and the N-terminal half of helix α8.
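The residue ranges quoted above can be collected into a small lookup, which makes it easy to check which domain a given residue falls in. The following is a minimal sketch only: the names DOMAIN_RANGES, OTHER and domain_of are hypothetical, the ranges are copied from the text, and every residue outside the four insertions is simply labelled as belonging to the RRM or its C-terminal extension, which is an assumption made for illustration.

```python
# Insertion ranges as quoted in the text (inclusive residue numbers).
DOMAIN_RANGES = {
    "1-3": [(27, 46), (145, 180)],   # insertions one and three
    "2-4": [(68, 136), (192, 216)],  # insertions two and four
}

# Assumption: anything outside the insertions is treated as RRM or its extension.
OTHER = "RRM or C-terminal extension"


def domain_of(resnum):
    """Return the domain label for a residue number (hypothetical helper)."""
    for name, ranges in DOMAIN_RANGES.items():
        for start, end in ranges:
            if start <= resnum <= end:
                return name
    return OTHER


if __name__ == "__main__":
    # A few residue numbers taken from the ranges quoted in the text.
    for resnum in (27, 100, 160, 200):
        print(resnum, domain_of(resnum))
```

Such a table is only as good as the ranges entered into it; residues in disordered loops absent from the model are still assigned by number alone.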
Interestingly, the N-terminal ferredoxin fold of Pyrococcus furiosus Cas6 is
identified by the DALI structural comparison server as the closest structural neighbour of Csa2 (figure 4.21 B & C). The similarity is limited to the RRM domain, with an RMSD of 2.9 Å over 87 aligned residues. As described elsewhere, Cas6 is a metal-independent endoribonuclease of the RAMP superfamily with a tandem ferredoxin-like fold and a conserved catalytic triad positioned at the opposite side of the central cleft formed by
the β-sheets of the two domains (Carte et al. 2008). This similarity provides limited
information regarding the function of Csa2, as the respective domains are surrounded by different folds in each protein and the functional sequence motifs are not conserved.
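For readers unfamiliar with the RMSD figure quoted above, the quantity is the square root of the mean squared distance between paired atoms (typically Cα atoms) after superposition. The sketch below illustrates only that final arithmetic step; it assumes the two coordinate lists are already superposed and aligned pair by pair, it is not the procedure used by the DALI server, and the function name rmsd is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z)
    coordinates in Å. Assumes the structures are already superposed and that
    element i of each list belongs to the same aligned residue pair."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance for this aligned pair.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Toy coordinates, not taken from any structure.
    a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    b = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]
    print(rmsd(a, b))
```

Because the value is averaged only over the aligned pairs, an RMSD should always be read together with the number of residues it was computed over, as is done in the text.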
A closer inspection of the conserved residues within the Cas7 superfamily, and
the Csa2 family (Cas7 type I-A, TIGR02583) in particular, reveals that the majority map to two clusters on the protein surface, the first on the 1-3 domain and the second at the interface between the RRM and the 2-4 domain (figure 4.21B). The first cluster, referred to as the asparagine (Asn) cluster, consists of residues Asn16, Pro46, His160, Arg162 and Glu178; the second, referred to as the glycine (Gly) cluster, consists of residues His55, Gln58, Gly121, Gly122, Phe123 and Ser135. These clusters form solvent-exposed, basic patches on the concave surface of the protein crescent, and would be suitable candidates for mediating nucleic acid interactions.
The glycine cluster in particular is positioned on the “opposite side” of the β-sheet of
the RRM domain, reflecting the position of the PfuCas6 active site relative to the central cleft (figure 4.21 C). This supports the hypothesis that the conserved Gly cluster plays a functional role, potentially in nucleic acid recognition and binding. Furthermore, we have demonstrated that mutation of the conserved His160 to alanine abolishes the ability of Csa2 to bind crRNA. Additional conserved residues are
located in the disordered α1-β2 and α8-β8 loops (Gly22, Asn23 and Asg240
respectively). The apparent flexibility of the disordered loops and the β-hairpins in the
1-3 and 2-4 domains, along with the location of the conserved Asn and Gly clusters, suggest their involvement in the recognition and binding of the crRNA, and the subsequent recognition and correct positioning of the target DNA (ss or ds). The nature of the conserved residues is such that Csa2 is unlikely to exhibit a nuclease activity like that of Cas6, which was confirmed experimentally. With the RNA-binding surface of the RRM domain partially covered, it is difficult to suggest a path for the crRNA. It was demonstrated, however, by RNase protection experiments
(Lintner et al. 2010) that the crRNA is protected completely, indicating that it is bound
Figure 4.21: Structure of SsoCsa2
(A) Cartoon representation of the Csa2 structure, illustrating the topology and connectivity of the various secondary structure elements. Domains are coloured as follows: RNA-Recognition Motif (RRM) in violet, C-terminal extension of the RRM motif in yellow, 1-3 domain in red and 2-4 domain in orange. (B) Location of conserved residues of the Csa2 family on the Csa2 structure. (C) Cartoon representation of the structure of Cas6 from Pyrococcus furiosus. The