Single-molecule methods for quantifying transcription dynamics

2 REVIEW OF LITERATURE

3.2 Single-molecule methods for quantifying transcription dynamics

Most knowledge of transcription was gathered from biochemical and biophysical studies conducted using in vitro techniques (McClure 1985; deHaseth et al. 1998). One classic example is the study that identified the region of the DNA sequence to which RNAp binds (Ishihama 2000). Another is the study that identified DNA-protein binding interactions. Both studies used footprinting, a method based on gel electrophoresis. However, during the last decade, the development of in vivo techniques with real-time observation has made it possible to characterize and dissect transcription in the context of a living cell.

In this context, single-molecule studies have provided more detailed biological information. They also allow monitoring the spatial localization of macromolecules and other cellular components in live cells. These include RNA (Golding & Cox 2004; So et al. 2011; Muthukrishnan et al. 2012; Pitchiaya et al. 2014; Lenstra et al. 2016), proteins (Yu et al. 2006; Taniguchi et al. 2010), RNAp (Bakshi et al. 2012; Stracy et al. 2015), ribosomal subunits (Sanamrad et al. 2014), transcription factors (Leon et al. 2017), and plasmids (Reyes-Lamothe et al. 2014). This is made possible by the use of photoactivatable or photoconvertible fluorophores fused to the target molecule. To detect the fluorescent molecule accurately, the target molecule must be expressed at a low concentration, and various strategies have been employed to lower the concentration of target molecules (Pitchiaya et al. 2014; Yu et al. 2006; Santangelo et al. 2009; Huang et al. 2009). Such measurements rely on super-resolution microscopy techniques, such as photoactivated localization microscopy (PALM) and stochastic optical reconstruction microscopy (STORM), which can achieve a spatial resolution as fine as 20 nm (Xu & Liu 2018).

In recent years, several methods have been developed to probe RNA molecules with fluorescent proteins. RNA labeling can be done in two ways: direct and indirect. Direct RNA labeling uses chemically reactive functional groups or structural motifs present in the RNA, or RNA-modifying proteins, for fluorophore conjugation (Pitchiaya et al. 2014). Conversely, indirect labeling methods involve sequence-based complementary hybridization of labels carrying a fluorescent protein to multiple specific RNA motifs (Pitchiaya et al. 2014; Raj & Oudenaarden 2008; Levsky & Singer 2003). Indirect RNA labeling is more popular because it can tag and detect different endogenous RNAs, as well as exogenous RNA (Raj & Oudenaarden 2008).

One method that uses indirect labeling of RNA is fluorescence in situ hybridization (FISH). This method uses fluorescent probes that bind specifically to parts of a nucleic acid with a high degree of sequence complementarity (Raj & Oudenaarden 2008). FISH can detect multiple RNAs at the same time and measure their spatial localization. In addition, it can quantify cell-to-cell variability in endogenous RNA numbers (Raj & Oudenaarden 2008; Llopis et al. 2010), which is not possible with methods such as qPCR and RNA-Seq.
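As an illustration of what quantifying cell-to-cell variability can mean in practice, the following minimal Python sketch computes common summary statistics from per-cell RNA counts. The function name `noise_statistics` and the example counts are hypothetical, and the choice of statistics (mean, squared coefficient of variation, Fano factor) is an assumption about typical practice rather than the procedure of any study cited here.

```python
from statistics import mean, pvariance


def noise_statistics(rna_counts):
    """Summarize cell-to-cell variability of per-cell RNA counts.

    Returns the mean, the squared coefficient of variation (variance / mean^2)
    and the Fano factor (variance / mean) of the counts.
    """
    if not rna_counts:
        raise ValueError("need RNA counts from at least one cell")
    m = mean(rna_counts)
    if m == 0:
        raise ValueError("mean RNA count is zero; CV^2 and Fano factor are undefined")
    var = pvariance(rna_counts, mu=m)  # population variance across the cells
    return {"mean": m, "cv2": var / m**2, "fano": var / m}


# Hypothetical counts of one RNA species in ten fixed cells.
counts = [0, 1, 1, 2, 0, 3, 1, 0, 2, 0]
stats = noise_statistics(counts)
print(stats)
```

A Fano factor close to 1 is what a Poisson distribution of RNA numbers would give, so deviations from 1 are often used as a first indication of additional sources of variability.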

Although FISH provides single-molecule sensitivity in individual cells, it yields only a static snapshot and cannot show how RNA numbers and locations change over time. Because it involves the fixation of cells, probe hybridization to the target RNA sequence, permeabilization of the cell membrane and extensive washing of the cells to remove unbound probes (Gasnier et al. 2013), it cannot monitor individual transcription events as they occur (Huber et al. 2018), which is needed to study the in vivo dynamics of transcription in real time.

3.2.1 MS2-GFP tagging system

The MS2-GFP system is one of the most sensitive real-time single-molecule methods for studying the in vivo dynamics of transcription at the single-cell level. It was initially developed by Robert Singer and co-workers to visualize RNA in eukaryotic cells (Bertrand et al. 1998), and later modifications allowed its use in bacteria (Golding et al. 2005; Golding & Cox 2004). The method allows RNA molecules to be tracked inside live cells as soon as they appear.

The MS2-GFP system involves the expression of two components: (i) a fusion of the coat protein of the RNA bacteriophage MS2 to a fluorescent protein, which binds specifically to MS2 stem-loop sequences, and (ii) a target RNA containing tandem repeats of the MS2 stem-loop sequence. These components can be genetically engineered into a plasmid that is transformed into cells, or they can be integrated into the genome of the host cells. The two components are illustrated in Figure 3.3.

In detail, the MS2 coat protein is derived from the native bacteriophage, where it binds with high specificity to an RNA stem-loop structure of 19 to 21 nucleotides containing the initiation codon of the phage replicase gene (Bernardi & Spahr 1972). Upon binding to this unique site in the RNA genome of the phage, the coat protein represses translation of the replicase gene and guides packaging of the RNA into phage particles (Peabody 1993; Querido & Chartrand 2008). Over the years, the MS2 coat protein has been engineered into fluorescent fusion proteins that bind to any RNA carrying the specific stem-loop sequences or motifs. Such RNAs can be used to study various cellular processes in different organisms (Golding et al. 2005; Lenstra et al. 2016a; Fusco et al. 2003).

Figure 3.3: Schematic overview of the two components of the MS2-GFP system. (A) The target construct carries the coding sequence of the mRFP1 fluorescent protein followed by 96 binding sites that allow the RNA to be detected by MS2-GFP proteins. The target construct is under the control of the PBAD promoter, which is inducible by arabinose. (B) The reporter construct is responsible for the expression of MS2-GFP molecules (green balls) and is under the control of the promoter PlacO3O1. Once the target construct produces RNAs, MS2-GFP molecules bind to them, so that each RNA is visualized as a cluster of GFPs. (C) Example confocal image of E. coli cells expressing both target RNAs and reporter MS2-GFP molecules. Individual RNA molecules appear as bright spots when visualized by confocal microscopy. The background fluorescence of the cells is due to unbound MS2-GFP molecules distributed throughout the cells’ cytoplasm.

The use of the MS2-GFP system in live E. coli cells makes it possible to monitor patterns of RNA localization and to study transcription events inside the cells with single-molecule sensitivity (Golding & Cox 2004; Muthukrishnan et al. 2012; Mäkelä et al. 2013). To determine the transcription dynamics of a target gene, the reporter construct containing the MS2 coat protein fused with GFP has to be highly expressed before the target RNA is produced. The high intracellular concentration of MS2-GFP protein ensures that enough of it binds to the target RNAs containing the binding motifs as soon as they are produced. The specific binding of multiple MS2-GFP proteins to the same target RNA creates a signal that is brighter than that of the unbound MS2-GFP diffusing freely inside the cell.

When cells containing the MS2-GFP system are visualized under the confocal microscope, a target RNA bound by multiple MS2-GFP fusion proteins appears as a bright spot (see Figure 3.3 C) that moves slowly inside the cell (Golding & Cox 2004; Muthukrishnan et al. 2012; Mäkelä et al. 2013). Since the target RNA is coated by MS2-GFP proteins, it is protected from natural degradation (Fusco et al. 2003). In addition, the intensity of the fluorescent spots does not decrease during the measurement time (Tran et al. 2015; Kandavalli et al. 2016; Muthukrishnan et al. 2012).
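The contrast between a bright RNA spot and the dimmer background of unbound MS2-GFP can be illustrated with a minimal Python sketch. It is not the image analysis used in this thesis: the function name `spot_pixels`, the threshold rule (median plus `k` times the median absolute deviation) and the pixel values are all assumptions made for illustration.

```python
from statistics import median


def spot_pixels(cell_pixels, k=5.0):
    """Separate spot pixels from the unbound MS2-GFP background in one cell.

    cell_pixels: fluorescence intensities of all pixels inside the cell.
    k: hypothetical threshold factor; a pixel is assigned to a spot if its
       intensity exceeds median + k * MAD of the cell's pixels.
    Note: if the MAD is zero, every pixel above the median counts as spot.
    """
    background = median(cell_pixels)
    mad = median(abs(p - background) for p in cell_pixels)
    threshold = background + k * mad
    spot = [p for p in cell_pixels if p > threshold]
    # Background-subtracted total intensity of the spot pixels.
    total = sum(p - background for p in spot)
    return background, threshold, spot, total


# Hypothetical intensities: mostly background, two bright spot pixels.
pixels = [10, 11, 9, 10, 12, 10, 11, 9, 10, 80, 95, 10]
background, threshold, spot, total = spot_pixels(pixels)
print(background, threshold, spot, total)
```

Because the spot intensity does not decrease during the measurement, a background-subtracted total of this kind could in principle be followed over time, although a real analysis would also need cell segmentation and spot tracking.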

Apart from the MS2-GFP system, other viral proteins have been used to tag and detect target RNAs, such as the PP7 coat protein, derived from the PP7 bacteriophage (Larson et al. 2011; Lenstra et al. 2016b), and the λN peptide, derived from the λ bacteriophage (Daigle & Ellenberg 2007). All the above-mentioned systems are orthogonal to each other, meaning that, for example, the MS2 coat protein does not bind to the PP7 binding site, or vice versa (Lim & Peabody 2002). This orthogonality makes it possible to detect three different RNAs at the same time, or three different regions of a single RNA (Hocine et al. 2013).

The advantage of using the MS2-GFP system over the expression of fluorescently tagged endogenous RNA-binding proteins is that the MS2-GFP system is highly specific to RNA containing the MS2 stem-loop structure, whereas endogenous RNA-binding proteins may bind to several mRNAs and thus reflect the behavior of all of them. This system therefore provides two benefits: the detection of specific RNA molecules under the microscope and the study of the dynamics of the tagged RNA in live cells.

In this thesis, we made use of the MS2-GFP tagging system to study the intervals between consecutive RNA production events of multiple E. coli promoters, under different induction levels and conditions, in live cells. In all Publications, the target genes are inserted in a single-copy F-plasmid.

3.2.2 Engineering of Synthetic Genetic Constructs

A decade ago, it was reported that functional synthetic genomes can be constructed using advanced molecular techniques such as DNA assembly and de novo synthesis (Gibson et al. 2009). This is achieved by synthesizing multiple small DNA fragments separately and assembling them into a larger piece of DNA, which is transferred into a genome-free host cell. These advancements have contributed to the creation of synthetic organisms with engineered genomes that have pre-defined specifications and functions. DNA assembly methods have become essential tools in synthetic biology for engineering complex systems from standardized and specific genetic parts. They are also reliable, cheap and fast. Thus, most researchers use these methods instead of traditional methodologies such as molecular cloning with restriction enzymes. Several techniques have made it possible to construct genetic parts; one such technique is the Gibson Assembly® method. This method proved its value when it was used to synthesize a complete genome, which was then transferred into genome-free host cells (Gibson et al. 2010). It is a cloning method that assembles multiple DNA fragments with overlapping regions in a single reaction mixture. In detail, the Gibson Assembly® master mix consists of three components: (i) T5 exonuclease, which chews back the 5’ ends of double-stranded DNA, generating complementary single-stranded overhangs, (ii) a DNA polymerase, which fills in the gaps of the annealed sequences, and (iii) Taq ligase, which seals the remaining nicks in the DNA strands (see Figure 3.4).
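The outcome of such an assembly can be mimicked at the sequence level. The following toy Python sketch is not the simulator used in the Publications. It treats each fragment as a plain string, ignores strands and enzymes, and simply merges adjacent fragments at their shared terminal overlaps before closing the circle. The function names, the example sequences and the 6-nt overlaps are hypothetical and were chosen only to keep the example short.

```python
def shared_overlap(left, right, min_len):
    """Length of the longest suffix of `left` that equals a prefix of `right`.

    Returns 0 if no shared terminal sequence of at least `min_len` exists.
    """
    max_len = min(len(left), len(right))
    for n in range(max_len, min_len - 1, -1):
        if left[-n:] == right[:n]:
            return n
    return 0


def assemble_circular(fragments, min_len=6):
    """Join fragments in the given order into one circular sequence.

    Each shared terminal overlap is kept once. The last fragment must also
    overlap the first one so that the product can be closed into a circle.
    """
    if len(fragments) < 2:
        raise ValueError("need at least two fragments (e.g. insert and vector)")
    product = fragments[0]
    for frag in fragments[1:]:
        n = shared_overlap(product, frag, min_len)
        if n == 0:
            raise ValueError("adjacent fragments share no terminal overlap")
        product += frag[n:]
    n = shared_overlap(product, fragments[0], min_len)
    if n == 0:
        raise ValueError("last fragment does not overlap the first one")
    return product[:-n]  # drop the duplicated overlap that closes the circle


# Hypothetical toy fragments with 6-nt overlaps at each junction.
frag_a = "ATGCGTACCGTTAGC"
frag_b = "GTTAGCTTGACCAGGATC"
vector = "AGGATCCCTTGGAAATGCGT"
circular = assemble_circular([frag_a, frag_b, vector])
print(circular, len(circular))
```

The returned string is one linear representation of the circular product, starting at the first base of the first fragment.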

When performing the Gibson Assembly® method, the following points have to be considered to obtain a successful construct. When designing primers for the DNA fragments, overlapping sequences must be added such that each amplified fragment contains an overlapping region of at least 40 bp with the adjacent DNA fragment. Since these DNA fragments are assembled with a vector to form a circular product, the terminal ends of the vector must also overlap with the DNA fragments to which they will be ligated. When combining the fragments and the vector, their concentrations must be in a ratio of 3:1 (fragments to vector). The Gibson Assembly® method is then performed as a single isothermal reaction by mixing the DNA fragments, the vector and the Gibson Assembly® master mix.
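The 3:1 ratio can be turned into amounts of DNA with a short calculation. The Python sketch below assumes that the ratio is a molar ratio of each fragment to the vector, which the text does not state explicitly, and uses the common approximation of 650 g/mol per base pair of double-stranded DNA. The function names and the example lengths and masses are hypothetical.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair, a common approximation for dsDNA


def ng_to_pmol(mass_ng, length_bp):
    """Convert a mass of double-stranded DNA (ng) into picomoles of molecules."""
    return mass_ng * 1000.0 / (length_bp * AVG_BP_MASS)


def insert_mass_ng(vector_ng, vector_bp, insert_bp, ratio=3.0):
    """Mass of insert (ng) giving `ratio` insert molecules per vector molecule.

    Assumes the ratio is molar; the mass scales with the length ratio of
    insert to vector.
    """
    return ratio * vector_ng * insert_bp / vector_bp


# Hypothetical example: 100 ng of an 8000 bp vector and a 2000 bp fragment.
vector_ng, vector_bp, insert_bp = 100.0, 8000, 2000
needed_ng = insert_mass_ng(vector_ng, vector_bp, insert_bp)
molar_ratio = ng_to_pmol(needed_ng, insert_bp) / ng_to_pmol(vector_ng, vector_bp)
print(needed_ng, molar_ratio)
```

With these example values, 75 ng of the fragment corresponds to three fragment molecules per vector molecule.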

Figure 3.4: Overview of the single-step reaction of the Gibson Assembly® method. The reaction mixture consists of multiple DNA fragments with overlapping regions, together with the DNA polymerase, T5 exonuclease and ligase enzymes that are needed to join these fragments. In this picture, the two DNA fragments (coloured green and pink) are treated with T5 exonuclease at 50 °C. Next, the products are treated with Phusion polymerase and Taq ligase to fill the gaps of the final ligated DNA products. This picture is adapted and reprinted with permission from Macmillan Publishers Ltd: [Nature Methods] (Gibson et al. 2009), copyright (2009).

In this thesis, to study the in vivo production of RNA molecules using the MS2-GFP system, we constructed single-copy plasmids using the Gibson Assembly® method. In particular, in Publications II and V, using a computer-based simulator and the Gibson Assembly® method, we built DNA fragments containing multiple repeats of the MS2-GFP binding sites for the detection of individual RNAs and integrated them into single-copy F-plasmids.

3.2.3 Time-lapse microscopy

As mentioned in section 3.2.1, MS2-GFP tagging allows individual RNAs to be monitored by time-lapse fluorescence microscopy at the single-molecule level (Muthukrishnan et al. 2012; Mäkelä et al. 2013; Lloyd-Price et al. 2016; Startceva et al. 2019). To perform such experiments, cells containing the target and reporter systems are placed on a 2.5% agarose gel pad, which is sandwiched between a microscope slide and a glass cover slip. The agarose gel pad contains the nutrients required for cell growth, the inducers that activate the reporter and target systems, and the respective antibiotics (Golding et al. 2005; Muthukrishnan et al. 2012; Mäkelä et al. 2013; Lloyd-Price et al. 2016; Startceva et al. 2019). In addition, a peristaltic pump continuously supplies nutrient medium to the cells, which allows steady-state growth for many hours under the microscope. A temperature-controlled chamber is used to ensure that a specific temperature is maintained during the measurements.

The time-lapse images were acquired with a Nikon Eclipse inverted microscope (Ti-E, Nikon, Japan) with confocal laser scanning and a 100x Apo TIRF (1.49 NA, oil) objective. Confocal images were taken at specific time intervals with a Nikon C2 camera, and the respective phase-contrast images were captured with a DS-Fi2 CCD camera. GFP fluorescence was measured using a 488 nm argon ion laser (Melles-Griot) and a 515/30 nm emission detection filter. For image acquisition, we used the NIS-Elements software (Nikon).

In this thesis, we used this experimental approach to study the in vivo transcription activation kinetics and subsequent RNA production of E. coli genes. In particular, in Publication II, to investigate the transcription dynamics of multiple genes in different growth-phase conditions, cells under the microscope were continuously supplied with the respective growth-phase medium and all the inducers by a peristaltic pump, and were maintained at a specific temperature during the measurements.

In Publication III, to study how the variability in gene activation times and RNA production intervals contributes to the variability in RNA numbers between cell lineages, cells were placed under the microscope with a constant supply of fresh medium containing the inducers for the target and reporter genes. Here, the target gene was activated under the microscope to determine the time taken to produce the first RNA. Next, we captured fluorescence images every 2 min for 2 hours, and phase-contrast images every 5 min. During the two hours, we maintained a specific temperature using a temperature chamber (Bioptechs, FCS2). The tools used to analyze the images are described in chapter 4.
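To illustrate how intervals between consecutive RNA production events might be read from such a time series, the following minimal Python sketch takes the number of RNA molecules counted in one cell at each fluorescence frame and returns the times at which the count increased, along with the intervals between those times. It is not the analysis pipeline of this thesis. The function name `production_intervals` and the example counts are hypothetical, and the sketch assumes that counts never decrease within a cell, which ignores cell division.

```python
def production_intervals(rna_counts, frame_interval_min=2.0):
    """Estimate RNA production times and intervals in one cell.

    rna_counts: number of RNA molecules counted in the cell at each frame.
    frame_interval_min: time between fluorescence frames, in minutes.

    A production event is assigned to every frame in which the count is
    higher than in the previous frame. Events that occur within the same
    frame interval cannot be resolved and are counted as one.
    """
    event_times = [
        i * frame_interval_min
        for i in range(1, len(rna_counts))
        if rna_counts[i] > rna_counts[i - 1]
    ]
    intervals = [t2 - t1 for t1, t2 in zip(event_times, event_times[1:])]
    return event_times, intervals


# Hypothetical counts in one cell, imaged every 2 min.
counts = [0, 0, 1, 1, 1, 2, 2, 3, 3, 3, 3, 4]
event_times, intervals = production_intervals(counts)
print(event_times, intervals)
```

The time of the first event, measured from the start of induction, corresponds to the time taken to produce the first RNA, while the later intervals describe the subsequent RNA production.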

In document Rate-limiting Steps in Transcription Initiation are Key Regulatory Mechanisms of Escherichia coli Gene Expression Dynamics (Page 45-50)