Chapter 2 Mathematical Modelling in Synthetic Biology
3.1 Scientific background
3.1.2 Existing recombinase-based systems
Many of the earliest publications regarding both tyrosine and serine integrases focused on elucidating their structural and functional properties through comprehensive experimentation [Thorpe and Smith, 1998; Thorpe et al., 2000; Smith and Thorpe, 2002; Ghosh et al., 2003; Groth and Calos, 2004]. In light of these observations and the continued publication of increasingly detailed studies regarding the nuances underlying DNA recombination, such as the function of the synaptic complex [Rowley and Smith, 2008; McEwan et al., 2009, 2011; Bai et al., 2011; Olorunniji et al., 2012], a slew of useful applications were identified that prompted the design and implementation of many novel systems. Serine integrases have become highly prevalent as integration vectors, emerging as the preferred method of transferring DNA into other organisms such as streptomycetes [Zhang et al., 2013]. These SSRs are capable of mediating integration of entire antibiotic biosynthetic clusters into target genomes [Baltz, 2011, 2012] and are also highly compatible when interconnected within the same organism [Gregory et al., 2003].
Other applications of the serine integrases are focused on therapeutics, such as engineered bacteria that are programmed to invade cancer cells [Anderson et al., 2006] and the production in mice of two essential human blood-clotting proteins, factor VIII and factor IX, to treat haemophilia A and B respectively [Chavez et al., 2010; Olivares et al., 2001]. Additional biomedical applications include transgenic cattle capable of producing milk containing the human β-defensin-3 antimicrobial peptide, which naturally protects the surfaces of human organs and blood vessels from bacterial colonisation [Yu et al., 2013], inducible production of pluripotent stem cells from human amniotic cells and embryonic cells in mice [Ye et al., 2010], and engineering specific skin cells and partially specialised stem cells for gene therapy of skin disorders [Ortiz-Urda et al., 2003, 2002].
Sophisticated methods concerning the assembly and optimisation of complex large-scale synthetic systems are ideally suited to site-specific DNA recombination [Xu et al., 2013]. DNA assembly via serine integrases enables the construction of highly modular pathways that can be adapted without the need for repeated cloning [Zhang et al., 2008, 2011] and can also facilitate rapid metabolic pathway assembly [Colloms et al., 2014]. Genome engineering is also benefiting from the application of serine integrases; φC31 has been utilised for the specific modification of genomes in mice [Tasic et al., 2011], silkworm embryos [Yonemura et al., 2013, 2012] and zebrafish [Hu et al., 2011; Lister, 2010], thus elucidating gene function in model organisms, and also for the deletion of genetic markers in plants such as Arabidopsis (rockcress), wheat and barley [Thomson et al., 2010; Kempe et al., 2010; Kapusi et al., 2012], thus generating stable progeny void of undesired DNA.
The potential of recombinase-based circuitry to provide transistor-like behaviour in synthetic biological systems has naturally opened up numerous applications relating to genetic data storage and biocomputing [Baumgardner et al., 2009; Ham et al., 2008]. Digital data storage in particular has attracted much attention, resulting in several validated memory devices such as the aforementioned event counter that utilises tyrosine integrase-based recombination to count and record sequential pulses of inducer; by contrast, the purely transcriptional circuit tested in parallel was only able to count induction events, not to record them [Friedland et al., 2009]. Of course, transcriptional elements are necessary for realising desired outputs; however, these results indicate that their control must be mediated via DNA recombination in order to achieve memory of external stimuli. Memory modules can support efficient inducible DNA inversion in alignment with mathematical modelling simulations [Bonnet et al., 2012], store over 1 byte of digital information via layered arrangements of attachment sites specific to multiple recombinases [Yang et al., 2014] and represent integral components in the construction of a biological microprocessor [Moe-Behrens, 2013]. Consequently, the level of input-output complexity that can be realised is theoretically unbounded, since systems can be scaled up through numerous pairs of attachment sites corresponding to distinct integrases; this provides a platform for engineering the full range of Boolean logic operations in response to induction of independent serine integrases [Bonnet et al., 2013; Siuti et al., 2013].
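The combination of memory and logic described here can be illustrated with a deliberately abstract sketch. It is a toy model and not the architecture of any of the cited designs: each distinct integrase is assumed to control one DNA segment that latches permanently once recombined (that is, recombination is taken to be unidirectional), and a gate is simply a Boolean function of the latched segments. The class `IntegraseRegister`, the gate functions and the integrase names `intA` and `intB` are hypothetical and introduced purely for illustration.

```python
from itertools import product


class IntegraseRegister:
    """Toy abstraction: one DNA segment per distinct integrase, each latching once."""

    def __init__(self, integrases):
        # False = segment in its original orientation, True = recombined
        self.state = {name: False for name in integrases}

    def induce(self, name):
        # Assumption: recombination is unidirectional, so the bit never resets
        self.state[name] = True

    def read(self):
        return tuple(self.state.values())


def and_gate(register):
    # Output only if every segment has been recombined
    return all(register.read())


def or_gate(register):
    # Output if at least one segment has been recombined
    return any(register.read())


def truth_table(gate, names):
    """Evaluate a gate for every combination of present/absent inducers."""
    table = {}
    for inputs in product([False, True], repeat=len(names)):
        register = IntegraseRegister(names)
        for name, present in zip(names, inputs):
            if present:
                register.induce(name)
        table[inputs] = gate(register)
    return table


names = ["intA", "intB"]  # hypothetical integrase labels
and_table = truth_table(and_gate, names)
or_table = truth_table(or_gate, names)
n_states = 2 ** len(names)  # distinct register states for N independent segments
```

In this abstraction the number of distinguishable register states doubles with each additional orthogonal integrase, which is one way of reading the scaling argument made above; real devices are of course constrained by the number of well-characterised, mutually orthogonal integrases available.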
At the current rate of expansion, it is thought that worldwide data will require in the region of 4 × 10^13 (forty trillion) GB of digital storage by 2020 and, although cloud computing is hoped to address this demand, approximately 90 grams of DNA would be sufficient to store such an amount [O’Driscoll and Sleator, 2013].
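As a rough check on the scale of this claim, the storage density implied by the two quoted figures can be computed directly. The sketch below takes forty trillion GB (4 × 10^13 GB) and 90 grams at face value and assumes decimal prefixes (1 EB = 10^9 GB); the variable names are illustrative only.

```python
GB_PER_EB = 1e9  # assumption: decimal prefixes, 1 exabyte = 10^9 gigabytes

projected_gb = 4e13  # forty trillion GB, as quoted in the text
dna_grams = 90       # mass of DNA quoted as sufficient

# Storage density implied by the two quoted figures
implied_gb_per_gram = projected_gb / dna_grams
implied_eb_per_gram = implied_gb_per_gram / GB_PER_EB

print(f"Implied density: {implied_eb_per_gram:.0f} EB per gram of DNA")
```

The two figures together imply a density of several hundred exabytes per gram of DNA.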
Of the diverse range of recombinase-based systems, only a small fraction have been published in conjunction with mathematical modelling investigations that reveal, for example, specific reaction rates that are currently intractable experimentally, or expected dynamical behaviour via qualitative or even quantitative in silico simulations as a reference point for in vivo circuit assembly [Ringrose et al., 1998; Bonnet et al., 2012; Friedland et al., 2009]. Of these extant models, even fewer relate specifically to serine integrase recombination, which may be surprising considering the relative wealth of publications detailing the structural and functional properties of recombinases that could inform mechanistic model construction. It is worth reiterating that the repressilator and toggle switch were both designed and characterised alongside mathematical models, albeit simple ones. The behaviour of synthetic biological devices upon deployment is unlikely ever to be entirely predictable purely by virtue of experimental observations, and thus there exists a definite need for extensive modelling approaches to serine integrase reactions in order to deliver and enhance the aforementioned applications.
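To make concrete what a mechanistic model of this kind might look like, the following is a minimal sketch and not a model taken from this work or from the cited literature. It assumes a single mass-action scheme in which integrase I binds substrate DNA S reversibly to form a complex C, which is then converted irreversibly to product P with release of the integrase. The species, the rate constants `k_on`, `k_off` and `k_rec`, their values and the arbitrary units are all hypothetical. A hand-written fourth-order Runge-Kutta integrator keeps the example free of external dependencies.

```python
def derivatives(state, k_on, k_off, k_rec):
    """Mass-action rates for S + I <-> C -> P + I (hypothetical scheme)."""
    s, i, c, p = state
    bind = k_on * s * i
    unbind = k_off * c
    recombine = k_rec * c
    return (
        -bind + unbind,              # dS/dt
        -bind + unbind + recombine,  # dI/dt
        bind - unbind - recombine,   # dC/dt
        recombine,                   # dP/dt
    )


def rk4_step(state, dt, params):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(k, scale):
        return tuple(x + scale * dx for x, dx in zip(state, k))

    k1 = derivatives(state, *params)
    k2 = derivatives(shifted(k1, 0.5 * dt), *params)
    k3 = derivatives(shifted(k2, 0.5 * dt), *params)
    k4 = derivatives(shifted(k3, dt), *params)
    return tuple(
        x + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(state, t_end, dt, params):
    """Return the list of states from t = 0 to t_end in steps of dt."""
    trajectory = [state]
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt, params)
        trajectory.append(state)
    return trajectory


# Hypothetical initial amounts (S, I, C, P) and rate constants, arbitrary units
initial = (1.0, 0.5, 0.0, 0.0)
params = (1.0, 0.1, 0.5)  # k_on, k_off, k_rec
trajectory = simulate(initial, t_end=50.0, dt=0.01, params=params)
s_end, i_end, c_end, p_end = trajectory[-1]
```

Even a scheme this simple exposes quantities, such as the ratio of `k_rec` to `k_off`, that govern how quickly product accumulates and that would be difficult to read off from end-point measurements alone. A realistic model of serine integrase recombination would need to represent the known mechanistic detail, which this sketch does not attempt.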