
4.4 Data

4.4.1 Generation of realistic data

Various mathematical models have been developed to describe the molecular interactions and signal transduction processes in the central circadian clock of A. thaliana [93, 113, 115]. They are based on systems of ordinary differential equations (ODEs) that describe the chemical kinetics of transcription initiation, translation, and post-translational modification, using mass action kinetics and/or Michaelis-Menten kinetics. In principle, I could use these mathematical models together with the published values of the kinetic rate parameters to generate synthetic transcription profiles from the circadian regulatory networks published by Locke et al. [93] and Pokhilko et al. [113, 115], and then use these networks as a gold standard for my method evaluation.

However, this approach would not generate data that are sufficiently biologically realistic. The solutions of ODEs typically converge to limit cycles with regular oscillations and constant amplitude, which fail to capture the stochastic amplitude variation observed in real qRT-PCR experiments. In addition, the damping of oscillations experimentally observed in constant light conditions is not correctly modelled. The problem with ODEs is that they ignore the intrinsic fluctuations of molecular processes in the cell and therefore do not allow for molecular noise, which may have a significant impact on the behaviour of the system [62, 152].

Chemical Kinetics Described by Ordinary Differential Equations (ODEs)

mRNA Concentration Change

$$\frac{d\,\mathrm{PRR9}_{\mathrm{mRNA}}}{dt} = q_3 \cdot \mathrm{light} \cdot P_{\mathrm{protein}} + n_7 \cdot \frac{g_8^{h}}{g_8^{h} + \mathrm{TOC1}_{\mathrm{protein}}^{h}} \cdot \frac{\mathrm{LHY}_{\mathrm{protein}}^{i}}{\mathrm{LHY}_{\mathrm{protein}}^{i} + g_9^{i}} - m_{12} \cdot \mathrm{PRR9}_{\mathrm{mRNA}}$$

Protein Concentration Change

$$\frac{d\,\mathrm{PRR9}_{\mathrm{protein}}}{dt} = p_8 \cdot \mathrm{PRR9}_{\mathrm{mRNA}} - (m_{13} \cdot \mathrm{light} + m_{22} \cdot \mathrm{dark}) \cdot \mathrm{PRR9}_{\mathrm{protein}}$$

Discrete Stochastic Kinetics of Molecular Reactions

mRNA Count Update

$$\mathrm{PRR9}_{\mathrm{mRNA}} = \mathrm{PRR9}_{\mathrm{transcr}}\uparrow + \mathrm{PRR9}_{\mathrm{mRNA.degrad}}\downarrow$$

$$\mathrm{PRR9}_{\mathrm{transcr}} = \Omega \cdot \left( \frac{q_3}{\Omega} \cdot \mathrm{light} \cdot P_{\mathrm{protein}} + \frac{(g_8 \cdot \Omega)^{h}}{(g_8 \cdot \Omega)^{h} + \mathrm{TOC1}_{\mathrm{protein}}^{h}} \cdot \left( n_4 + n_7 \cdot \frac{\mathrm{LHY}_{\mathrm{protein}}^{i}}{\mathrm{LHY}_{\mathrm{protein}}^{i} + (g_9 \cdot \Omega)^{i}} \right) \right)$$

$$\mathrm{PRR9}_{\mathrm{mRNA.degrad}} = m_{12} \cdot \mathrm{PRR9}_{\mathrm{mRNA}}$$

Protein Count Update

$$\mathrm{PRR9}_{\mathrm{protein}} = \mathrm{PRR9}_{\mathrm{translate}}\uparrow + \mathrm{PRR9}_{\mathrm{protein.degrad}}\downarrow$$

$$\mathrm{PRR9}_{\mathrm{translate}} = p_8 \cdot \mathrm{PRR9}_{\mathrm{mRNA}}$$

$$\mathrm{PRR9}_{\mathrm{protein.degrad}} = (m_{13} \cdot \mathrm{light} + m_{22} \cdot \mathrm{dark}) \cdot \mathrm{PRR9}_{\mathrm{protein}}$$

Table 4.2: Ordinary differential equations (ODEs) and corresponding discrete molecular reaction kinetics for the morning gene PRR9. The symbol $\mathrm{PRR9}_{x}$ denotes the concentration of a molecular species of the morning gene PRR9, specified by the index $x$. For instance, $\mathrm{PRR9}_{\mathrm{mRNA}}$ is the concentration of mRNA transcribed from PRR9, $\mathrm{PRR9}_{\mathrm{protein}}$ is the concentration of PRR9 protein, etc. The symbol light is a binary indicator for the status of light (1 = light, 0 = darkness), dark = 1 - light, lower-case letters indicate kinetic parameters, and Ω is a volume parameter. Top panel: ODE description of chemical kinetics, with non-linear Michaelis-Menten kinetics for mRNA concentration change, and linear mass action kinetics for protein concentration change. Bottom panel: The corresponding discrete kinetic reactions, which in the limit Ω → ∞ converge to the ODE solutions. An up arrow ↑ on the right indicates an amount by which the quantity on the left is increased, a down arrow ↓ on the right indicates an amount by which the quantity on the left is decreased. The reactions occur stochastically, with propensities determined by the reaction rates. Mathematical details can be found in Wilkinson [153]. The complete set of equations for all genes in the central circadian clock of A. thaliana is available from Guerriero et al. [62].

Figure 4.1: Model network of the circadian clock in A. thaliana and network modifications. Each graph shows interactions among core circadian clock genes with different degrees of interconnectedness. Solid lines show protein-gene interactions; dashed lines show protein modifications; and the regulatory influence of light is symbolized by a sun symbol. The top left panel (`wildtype') shows the network structure published by Pokhilko et al. [114]. The remaining panels show modified network structures, corresponding to subsequent pruning of the wildtype network. This is realized by artificially preventing certain proteins (named in the panel title) from acting as transcription factors, so that they lose the regulatory function on mRNA transcription that they had in the wildtype network. The expression of the associated mRNA of these proteins is not affected. Grey boxes group sets of regulators or regulated components. Arrows symbolize activations and bars inhibitions.

For a more realistic approach, I model the individual molecular processes of transcription, translation, degradation, dimerisation etc. as individual discrete events, as shown in Tables 4.1 and 4.2. Statistical mechanics arguments then lead to a Markov jump process in continuous time whose instantaneous reaction rates are directly proportional to the number of molecules of each reacting component [152, 153]. Such dynamics can be simulated exactly using standard discrete-event simulation techniques, as illustrated in Table 4.1. For my study, I followed Guerriero et al. [62] and adopted the Bio-PEPA framework of Ciocchetta and Hillston [29] to simulate gene expression profiles for the core circadian clock of A. thaliana, using the Bio-PEPA Eclipse Plug-in. This framework is built on a stochastic process algebra implementation of chemical kinetics, and the stochastic simulations are run with the Gillespie algorithm [54]. Figure 4.2 illustrates such stochastically generated mRNA time series data using Bio-PEPA and the corresponding real data from qRT-PCR measurements for two components of the circadian clock.
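To make the simulation step concrete, the following is a minimal sketch of Gillespie's direct method in Python. It is not the Bio-PEPA model used in this study: it assumes a toy system with one mRNA and one protein species whose translation and degradation propensities have the linear form of the protein count update in Table 4.2, and it replaces the regulated transcription term by a constant. The function name and all rate values are hypothetical.

```python
import random


def gillespie_mrna_protein(k_tx, m_deg, p_tl, d_deg, omega, t_end, seed=0):
    """Gillespie's direct method for a toy mRNA/protein system (counts, not concentrations).

    Reactions and propensities (all rates are hypothetical):
      mRNA degradation    : mRNA -> 0           m_deg * mrna
      translation         : mRNA -> mRNA + P    p_tl * mrna
      protein degradation : P -> 0              d_deg * protein
      transcription       : 0 -> mRNA           omega * k_tx  (constant, unregulated)
    """
    rng = random.Random(seed)
    t, mrna, protein = 0.0, 0, 0
    trajectory = [(t, mrna, protein)]
    # Change in (mrna, protein) caused by each reaction, in the order listed above.
    changes = [(-1, 0), (0, +1), (0, -1), (+1, 0)]
    while True:
        propensities = [m_deg * mrna, p_tl * mrna, d_deg * protein, omega * k_tx]
        total = sum(propensities)
        # The waiting time to the next reaction is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_end:
            return trajectory
        # The reaction that fires is chosen with probability propensity / total.
        reaction = rng.choices(range(4), weights=propensities)[0]
        mrna += changes[reaction][0]
        protein += changes[reaction][1]
        trajectory.append((t, mrna, protein))


# Example run with a unit volume of 100 and illustrative rates.
trajectory = gillespie_mrna_protein(
    k_tx=1.0, m_deg=1.0, p_tl=0.5, d_deg=0.5, omega=100, t_end=50.0, seed=1
)
```

Each entry of `trajectory` is a tuple (time, mRNA count, protein count) recorded after a single reaction event, so the time points are irregularly spaced; sampling at fixed intervals, as in the qRT-PCR data, would require an additional interpolation step.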

In order to correctly quantify stochastic fluctuations, concentrations are represented as numbers of molecules per unit volume. This requires the unit volume size Ω to be defined, which scales the molecule amounts and kinetic laws such that a unit concentration in an ODE representation becomes a molecule count close to Ω; see Guerriero et al. [62] for more details. The size of Ω has a strong influence on the stochasticity of the system. Since larger volumes entail a more pronounced averaging effect, the stochasticity decreases with increasing values of Ω, and the solutions from the equivalent deterministic ODEs are subsumed as a limiting case for Ω → ∞. Conversely, decreasing values of Ω increase the stochasticity. Guerriero et al. [62] showed that replacing the continuous deterministic dynamics of ODEs by the discrete stochastic dynamics with an appropriate choice of Ω leads to a more accurate matching of the experimental data, including the damping of oscillations experimentally observed in constant light, better entrainment to light in several light patterns, better entrainment to changes in photoperiod, and the correct modelling of secondary peaks experimentally observed for certain photoperiods.
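As a simplified illustration of this scaling (a toy birth-death process with hypothetical rates $k$ and $m$, not the clock model itself), consider a single species with molecule count $n$ that is produced at the volume-scaled rate $\Omega k$ and degraded at rate $m \cdot n$. Its stationary distribution is Poisson, so the relative fluctuation of the concentration $x = n / \Omega$ can be written down directly:

$$n \sim \mathrm{Poisson}\!\left(\frac{\Omega k}{m}\right), \qquad \mathrm{E}[x] = \frac{k}{m}, \qquad \mathrm{Var}[x] = \frac{1}{\Omega^{2}} \cdot \frac{\Omega k}{m} = \frac{k}{m\,\Omega}, \qquad \frac{\sqrt{\mathrm{Var}[x]}}{\mathrm{E}[x]} = \sqrt{\frac{m}{k\,\Omega}} \;\longrightarrow\; 0 \quad (\Omega \to \infty).$$

For a unit concentration ($k/m = 1$) and Ω = 100, the value used for Figures 4.2 and 4.3, this gives a mean count of 100 molecules and relative fluctuations of about 10%; for Ω = 10000 they would drop to 1%.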

[Figure 4.2, four panels: qRT-PCR, LHY mRNA and qRT-PCR, TOC1 mRNA (top); Bio-PEPA, LHY mRNA and Bio-PEPA, TOC1 mRNA (bottom). x-axis: time (h); y-axis: count per cell.]

Figure 4.2: Real mRNA data in comparison to generated data. The top panels show the qRT-PCR time series data for the LHY and TOC1 mRNA of A. thaliana with two hour measurement intervals (derived from Section 4.4.2, `TiMet' [138]). The bottom panels show the corresponding synthetic measurements for the stochastically simulated data described in Section 4.4.1 with a unit volume of Ω = 100.

I simulated mRNA and protein concentration profiles over time from the circadian clock regulatory network published in Guerriero et al. [62] and Pokhilko et al. [114], shown in Figure 4.1 (top left, network `wildtype') and Figure 4.17 (middle left, network `P2010'). This involves genetic regulatory reactions for mRNA transcription, protein translation, and mRNA and protein degradation for 7 genes. Figure 4.3 shows the trajectories of the mRNA and protein measurements for 6 of the 7 components of the clock (mRNA/protein for the hypothetical component Y is not displayed) for a regular day with 12 hours of light and 12 hours of darkness. Table 4.2 shows the underlying chemical kinetic reactions for a single component in this network (PRR9), as an illustration. A full list of reactions and their corresponding mathematical descriptions is available in the supplementary material of Guerriero et al. [62].

[Figure 4.3, six panels: GI, LHY, PRR5, PRR7, PRR9, TOC1. x-axis: time (h); y-axis: count per cell.]

Figure 4.3: Synthetically generated mRNA and protein time series using Markov Jump Processes (MJP) with Bio-PEPA. Each panel shows the mRNA (solid line) and corresponding protein profiles (dashed line) for different components of the circadian clock of A. thaliana as described in Section 4.4.1. The light condition in this particular data set is a regular day with 12 hours of light and 12 hours of darkness, without any knock-outs, and a unit volume of Ω = 100. Note the long time delay between the mRNA and protein concentration of `GI' (top left panel). This is because the formation of this protein depends on the protein Zeitlupe (ZTL, not shown), which exhibits a substantial phase shift compared to the `GI' mRNA expression.

An additional advantage of this procedure is that it is straightforward to assess the effect of network structure modification on the performance of the network reconstruction methods. This is easily done by inactivating certain reactions in the gold standard network, that is, by setting the respective reaction rates to zero. Figure 4.1 shows the complete circadian regulatory network in A. thaliana, as published by Guerriero et al. [62] and Pokhilko et al. [114] (`wildtype'), and several modified sparser structures, which are used throughout my study. The exact setup of the data generation process is described in detail in Section 4.5.1.
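The idea of pruning an edge by zeroing a reaction rate can be sketched with the discrete PRR9 transcription propensity from Table 4.2. The sketch below is illustrative only: the parameter values are hypothetical rather than the published ones, the helper names are my own, and the reading that setting n7 to zero removes the LHY-dependent contribution is an assumption based on the form of the equation in Table 4.2.

```python
def prr9_transcription_propensity(state, params, omega):
    """Discrete PRR9 transcription propensity in the form given in Table 4.2."""
    light_term = params["q3"] / omega * state["light"] * state["P_protein"]
    toc1_term = (params["g8"] * omega) ** params["h"] / (
        (params["g8"] * omega) ** params["h"] + state["TOC1_protein"] ** params["h"]
    )
    lhy_term = state["LHY_protein"] ** params["i"] / (
        state["LHY_protein"] ** params["i"] + (params["g9"] * omega) ** params["i"]
    )
    return omega * (light_term + toc1_term * (params["n4"] + params["n7"] * lhy_term))


def disable_regulation(params, rate_names):
    """Return a copy of the parameters with the named reaction rates set to zero."""
    pruned = dict(params)
    for name in rate_names:
        pruned[name] = 0.0
    return pruned


# Hypothetical parameter values, for illustration only.
wildtype = {"q3": 1.0, "n4": 0.1, "n7": 0.5, "g8": 0.5, "g9": 0.3, "h": 2, "i": 2}
pruned = disable_regulation(wildtype, ["n7"])  # assumed to remove the LHY-dependent term

low_lhy = {"light": 1, "P_protein": 20, "TOC1_protein": 40, "LHY_protein": 10}
high_lhy = {"light": 1, "P_protein": 20, "TOC1_protein": 40, "LHY_protein": 500}

# In the pruned network the propensity no longer depends on the LHY protein count.
pruned_low = prr9_transcription_propensity(low_lhy, pruned, omega=100)
pruned_high = prr9_transcription_propensity(high_lhy, pruned, omega=100)
```

Because only the regulatory rate is zeroed, the reactions that produce and degrade the LHY mRNA and protein themselves are untouched, which matches the statement in the caption of Figure 4.1 that the expression of the disabled proteins is not affected.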