4.10.3 Emerging model for CRISPR-mediated interference in Archaea

This chapter describes the first identification and biochemical and structural characterisation of a CASCADE orthologue in Archaea. The native complex purified from S. solfataricus is composed of the subunits Csa2 and Cas5a (Cas7 and Cas5, respectively), orthologues of CasC and CasD, and is shown to co-purify with processed CRISPR-derived RNA. A number of transiently interacting proteins, namely Cas6, Csa5 and Csa4, also co-purify with the Csa2-Cas5a complex, indicating an accessory role. The conservation of the Cas7 and Cas5 superfamilies across CRISPR subtypes is indicative of their key role in the structure and function of CASCADE-like complexes, a hypothesis supported by the results presented here. The archaeal CASCADE demonstrates a crRNA-dependent DNA binding activity analogous to that of the E. coli CASCADE, and enables the formulation of a biochemical model for CRISPR-mediated antiviral defence in S. solfataricus, which is relevant to all the type I CRISPR subtypes harbouring CASCADE orthologues and to type III-B systems with Cmr orthologues (figure 4.24). This model does not describe the adaptation stage, in which new spacers are acquired or synthesised, as experimental information for this stage is still very limited and the process is poorly understood.

In the processing stage of CRISPR function, CRISPR loci are transcribed by the S. solfataricus transcription machinery. In this context, we have demonstrated that the leader sequence directly upstream of the CRISPR locus in S. solfataricus acts as a canonical promoter and can direct transcription of the CRISPR locus in vitro. Whether some form of transcriptional regulation takes place is a matter of ongoing research. In the bacterial E. coli system, it has been shown that transcription is repressed by H-NS and derepressed by the activator LeuO (Pul et al. 2010; Westra et al. 2010). In Thermus thermophilus the cAMP receptor protein seems to control transcription (Agari et al. 2010), while in S. solfataricus a novel Cas transcriptional regulator with a binding site for an allosteric effector molecule has been identified in the form of Csa3 (Lintner et al. 2011). In all the archaeal systems studied in vivo to date, CRISPR loci appear to be continuously transcribed and processed (Tang et al. 2002; Hale et al. 2009; Lillestol et al. 2006, 2009), in accordance with a surveillance role in the cell, although there is no information as to whether their transcription is up-regulated in response to infection.

Subsequently, processing of the CRISPR transcripts into mature crRNA repeat-spacer units is carried out by Cas6 (or equivalent processing ribonucleases such as CasE or Csy4), which recognises and cleaves at a single site within the repeat sequences. This generates a mature crRNA composed of an 8 nt 5’ psitag carrying the characteristic conserved sequence GAAA(C/G) (Kunin et al. 2007), a complete spacer sequence and a less defined 3’ handle containing the remaining repeat nucleotides. We have demonstrated that SsoCas6 is a metal-independent ribonuclease that specifically recognises and cleaves crRNA repeats at the single site indicated by the asterisk:

5’ - GAUUAAUCCCAAAAGGA*AUUGAAAG - 3’

Thus, the function of SsoCas6 is equivalent to that of the euryarchaeal Cas6 from Pyrococcus furiosus, even though the two proteins are highly diverged. Repeat sequences in S. solfataricus, and in the other subtypes associated with Cas6, are predicted to be unstructured (Kunin et al. 2007); the mode of recognition must therefore be sequence-specific, as shown for PfuCas6. Members of the Cas6 superfamily are present in CRISPR/Cas subtypes I-A, I-B, I-D, III-A and III-B, while distinct families are found in I-E (CasE) and I-F (Csy4). Conveniently, representatives of each of these three clades have been characterised, revealing the variety of molecular mechanisms utilised by these proteins to recognise and cleave their targets, and their co-evolution with the respective repeat types. The fact that Cas6 does not interact stably with the aCASCADE or the Cmr complex in either of the two systems in which it has been studied reflects this functional versatility and its ability to collaborate with multiple types of effector complexes. Further structural studies of SsoCas6 are needed to elucidate the molecular mechanism of crRNA recognition and cleavage.
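The single-site cleavage described above can be illustrated with a minimal in silico sketch. It uses the S. solfataricus repeat sequence and the cleavage position given in the text; the function name, the example transcript and its spacer sequences are hypothetical and serve only as illustration. The sketch assumes exact repeat matches and does not model any further trimming of the less defined 3’ handle.

```python
# Repeat sequence from the text; Cas6 cleaves after position 17 (the asterisk),
# so the last 8 nt of each repeat become the 5' tag of the downstream crRNA.
REPEAT = "GAUUAAUCCCAAAAGGAAUUGAAAG"
CUT = 17


def process_pre_crrna(transcript, repeat=REPEAT, cut=CUT):
    """Cut once inside every repeat and return the internal fragments.

    Each returned fragment is: 8 nt 5' tag + spacer + 3' handle
    (the first 17 nt of the next repeat). Hypothetical helper for illustration.
    """
    cut_sites = []
    pos = transcript.find(repeat)
    while pos != -1:
        cut_sites.append(pos + cut)
        pos = transcript.find(repeat, pos + len(repeat))
    # Only fragments flanked by two cut sites carry a complete spacer.
    return [transcript[a:b] for a, b in zip(cut_sites, cut_sites[1:])]


if __name__ == "__main__":
    # Made-up spacers, chosen only to keep the example short.
    pre_crrna = REPEAT + "ACGUACGUAC" + REPEAT + "UUUUCCCCGG" + REPEAT
    for crrna in process_pre_crrna(pre_crrna):
        print(crrna[:8], crrna[8:-17], crrna[-17:])
```

With the made-up transcript above, each fragment begins with the tag AUUGAAAG, which ends in the conserved GAAAG motif, followed by one complete spacer and the first 17 nt of the next repeat.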

The next step in type I systems (with the exception of type I-D, where there are no CASCADE orthologues) is the incorporation of the processed crRNA into the aCASCADE. Assembly of the complex seems to be a crRNA-dependent process, since helical aCASCADE assemblies were not observed in the absence of crRNA. The core aCASCADE complex is composed of Csa2 and Cas5a (Cas7 and Cas5, respectively), with accessory co-purifying proteins including Cas6, Csa5 and Csa4. The stoichiometry of the complex is undefined, but an excess of Csa2 over Cas5a is observed, reflecting the abundance of CasC over the other subunits in the E. coli CASCADE. Extended right-handed helical assemblies are formed by the Csa2-Cas5a-crRNA complex in vitro, but whether this represents the physiological quaternary structure of the complex is unknown. The aCASCADE can specifically recognise ssDNA complementary to the spacer sequence in the crRNA, and form a stable ternary complex containing the RNA/DNA heteroduplex. The E. coli CASCADE is also able to recognise dsDNA targets via the formation of an R-loop, whereby it displaces the non-complementary strand and enables base pairing of the crRNA with the target DNA strand. Recognition of dsDNA targets by the aCASCADE should also be investigated, but this was not carried out in the context of this thesis due to time constraints. Cas3 and the HD nuclease are presumed to be recruited to the aCASCADE-crRNA-DNA complex in order to catalyse the final degradation of the target DNA. This step has not been biochemically characterised in any of the systems studied so far, but genetic studies in E. coli demonstrated that both CASCADE and Cas3 need to be expressed to produce a resistant phenotype (Brouns et al. 2008). A bioinformatic analysis of the S. solfataricus Cas3 and HD nuclease orthologues will be presented in the subsequent chapter. Even the presence of a single spacer is shown to be sufficient to confer complete resistance to the respective extrachromosomal element, indicating that this is a rapid and effective mechanism.
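The complementarity requirement for ssDNA recognition can be stated concretely with a short sketch. It checks whether a single-stranded DNA contains a stretch that would base-pair with a crRNA spacer to form an RNA/DNA heteroduplex. The function names and sequences are hypothetical, and requiring a perfect match over the full spacer is a simplifying assumption; the tolerance of the aCASCADE for mismatches is not addressed here.

```python
# Watson-Crick partners of RNA bases on the DNA strand.
RNA_TO_DNA_PARTNER = {"A": "T", "U": "A", "G": "C", "C": "G"}


def complementary_ssdna(spacer_rna):
    """Return the ssDNA (written 5'->3') that pairs with an RNA spacer."""
    # The strands are antiparallel, so the spacer is read in reverse.
    return "".join(RNA_TO_DNA_PARTNER[base] for base in reversed(spacer_rna))


def has_target_site(ssdna, spacer_rna):
    """True if the ssDNA contains a stretch fully complementary to the spacer.

    Simplifying assumption: only perfect, full-length matches count.
    """
    return complementary_ssdna(spacer_rna) in ssdna


if __name__ == "__main__":
    spacer = "ACGUACGUAC"  # made-up spacer sequence
    target = "TTT" + complementary_ssdna(spacer) + "AAA"
    print(has_target_site(target, spacer))
    print(has_target_site("T" * 20, spacer))
```

A dsDNA target would additionally require displacement of the non-complementary strand (R-loop formation), which this sequence-level check does not represent.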

The Csa2-Cas5a complex did not recognise RNA targets, but an alternative route is potentially available in organisms that harbour type III-B Cas gene sets, such as S. solfataricus, Pyrococcus furiosus and approximately 60% of the archaea according to the latest analysis by Makarova et al. (2011). The Cmr proteins of type III-B systems have been shown to form stable multimeric complexes in P. furiosus (Hale et al. 2009) and S. solfataricus, which are able to perform crRNA-guided silencing of invader RNAs in vitro (Hale et al. 2009), as described in detail in Chapter 3. Whether this is the physiological activity of type III-B systems in vivo remains to be determined. If this is the case, the co-existence in the same genome of two systems that differentially target DNA and RNA invader elements while sharing the same pool of CRISPR spacers would provide an obvious fitness advantage and increase the efficiency of the antiviral defence. The differential processing of the crRNA that associates with the two complexes (aCASCADE and Cmr) represents a type of “labelling” of the crRNA for incorporation into one or the other system. It would be interesting to investigate whether the spacer content of the two crRNA-protein complexes differs, or whether there is a bias towards a specific CRISPR locus family. The latter would support the suggestion made by Lillestol et al. (2009) that individual CRISPR families might exhibit a preference towards specific groups of viruses or extrachromosomal elements.

Figure 4.24: Emerging model for CRISPR interference in Archaea

Stages of CRISPR processing and target interference, as deduced from the experimental data available to date. The pathway on the left, which involves the aCASCADE, would be available to all organisms harbouring type I systems (except perhaps type I-D), enabling the targeting of extrachromosomal DNA elements. The pathway on the right, which involves the Cmr complex, would be available to organisms that contain type III-B systems, either autonomous or co-existing with other CRISPR subtypes. This pathway would enable the recognition and destruction of any form of invader RNA.

Chapter 5