Chapter 6 – Causal Structure Identification Infers Consistent Networks
6.4 Applying CSI to Simulated Data
If the core network of the circadian clock were conserved across conditions, CSI would most likely recover it in every condition. Leaving these core connections unforced could then yield networks that differ very little from one another, and so reveal little about condition-dependent network changes. In that case it would be more informative to force the core connections. If the current model of the circadian clock (Pokhilko et al. 2012) were the foundation of each of the networks, then the results of running CSI on the luciferase data should closely match the results of running CSI on data simulated from the Pokhilko et al. (2012) model.
6.4.1 Luciferase Data vs. Pokhilko 2012 Simulation
To test this idea, data simulated from the Pokhilko et al. (2012) model were passed through CSI (Table 6.2). Only simulated profiles of the components previously analysed in Chapter 6.3.1 (Table 6.1) were used. Network inference on the simulated data recovered a high number of the interactions found in the Pokhilko et al. (2012) model (Figure 1.3 C). Limitations of the model meant that some genes were missing from the network, as they were not simulated at the mRNA level. Additionally, some interactions were incorrectly modeled by CSI because none of the information about proteins or protein complexes in the simulation was supplied to CSI.
Comparing Table 6.1 with Table 6.2 revealed many differences, apparent from the lack of highlighting in Table 6.2. A major difference between the two tables was how the networks behaved at high values of cMax. In the luciferase networks, the number of regulators of most genes increased with cMax. However, in the networks inferred from simulated data, all genes had reached their maximum number of regulators by cMax = 3. Additionally, although many of the connections
Table 6.2 Predicted regulators of genes within Pokhilko 2012 SaSSY model. Connections were inferred using data simulated under the same conditions as the experiment. Cells highlighted in green show interactions recovered by CSI networks inferred using luciferase data (Table 6.1) as well as those inferred using simulated data.
Gene  cMax=1      cMax=2      cMax=3      cMax=4      cMax=5
LHY   TOC1        PRR7        PRR7        PRR7        PRR7
      -           ELF3        ELF3        ELF3        ELF3
PRR9  LHY 0.8     LHY         LHY         LHY         LHY
      GI 0.2      -           -           -           -
PRR7  LHY         LHY         LHY         LHY         LHY
      -           ELF3        ELF3        ELF3        ELF3
TOC1  ELF4 0.5    GI          GI          GI          GI
      LUX 0.5     PRR9        PRR9        PRR9        PRR9
ELF4  GI          GI          GI          GI          GI
      -           PRR9 0.5    PRR9 0.3    PRR9 0.3    PRR9 0.3
      -           ELF3 0.5    ELF3 0.7    ELF3 0.7    ELF3 0.7
ELF3  LHY         LHY         LHY         LHY         LHY
      -           PRR7        PRR7        PRR7        PRR7
LUX   GI          GI          GI          GI          GI
      -           PRR9 0.5    PRR9 0.3    PRR9 0.3    PRR9 0.3
      -           ELF3 0.5    ELF3 0.7    ELF3 0.7    ELF3 0.7
      -           -           ELF4 0.2    ELF4 0.2    ELF4 0.2
GI    ELF4 0.3    ELF4 0.5    ELF4 0.5    ELF4 0.5    ELF4 0.5
      ELF3 0.3    ELF3 0.16   ELF3 0.16   ELF3 0.16   ELF3 0.16
      LUX 0.3     LUX 0.5     LUX 0.5     LUX 0.5     LUX 0.5
usually only featured when CSI was run with high values of cMax on the luciferase data. This suggested that the biological network controlling gene expression may differ substantially from the model in terms of which regulators are most important topologically. Additionally, the simulations lacked mRNA information for several key components of the clock. For example, CCA1 was completely missing from the Pokhilko et al. (2012) model, as it was assumed to be the same as LHY for modeling purposes. Similarly, although ZTL was included in the Pokhilko et al. (2012) clock, it was modeled only at the protein level. This analysis showed that forcing mRNA network interactions within CSI for the analysis of luciferase data would be unwise: the forced interactions would be informed by the Pokhilko et al. (2012) model, which was itself informed by biological information at multiple levels, and doing so would likely have required even more assumptions about the biological system. Thus, it made more sense to use CSI to fit networks to each condition set independently, without a set of prior assumptions. Had this resulted in a single core network in all conditions, then the core interaction network would have been fixed in CSI and the software rerun.
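The observation that regulator counts in Table 6.2 stop growing by cMax = 3 can be checked with a short script. The following is a minimal Python sketch, not part of the CSI software. It assumes a particular reading of the table: rows with four entries are taken to span cMax = 2 to 5, and rows with three entries to span cMax = 3 to 5. The names `TABLE_6_2` and `saturation_cmax` are illustrative only.

```python
# Regulators listed for each gene at cMax = 1..5, transcribed from Table 6.2.
# ASSUMPTION: rows with four entries cover cMax = 2-5 and rows with three
# entries cover cMax = 3-5; edge weights are ignored, only presence is counted.
TABLE_6_2 = {
    "LHY":  [["TOC1"]] + [["PRR7", "ELF3"]] * 4,
    "PRR9": [["LHY", "GI"]] + [["LHY"]] * 4,
    "PRR7": [["LHY"]] + [["LHY", "ELF3"]] * 4,
    "TOC1": [["ELF4", "LUX"]] + [["GI", "PRR9"]] * 4,
    "ELF4": [["GI"]] + [["GI", "PRR9", "ELF3"]] * 4,
    "ELF3": [["LHY"]] + [["LHY", "PRR7"]] * 4,
    "LUX":  [["GI"], ["GI", "PRR9", "ELF3"]]
            + [["GI", "PRR9", "ELF3", "ELF4"]] * 3,
    "GI":   [["ELF4", "ELF3", "LUX"]] * 5,
}


def saturation_cmax(regulators_by_cmax):
    """Return the smallest cMax from which the regulator count stays constant."""
    counts = [len(regs) for regs in regulators_by_cmax]
    final = counts[-1]
    cmax = len(counts)
    # Walk back from the largest cMax while the count still equals the final one.
    while cmax > 1 and counts[cmax - 2] == final:
        cmax -= 1
    return cmax


saturation = {gene: saturation_cmax(regs) for gene, regs in TABLE_6_2.items()}
latest = max(saturation.values())

for gene, cmax in saturation.items():
    print(f"{gene}: regulator count constant from cMax = {cmax}")
print(f"All genes saturated by cMax = {latest}")
```

Under this reading of the table, LUX is the last gene to settle, gaining ELF4 at cMax = 3, which agrees with the statement that every gene has reached its final number of regulators by that point.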
This investigation did provide some very interesting information about how CSI coped with the evening complex, a protein complex made up of ELF3, ELF4 and LUX. For instance, GI was modeled in Pokhilko et al. (2012) as being repressed by the evening complex. After running CSI on the Pokhilko et al. (2012) simulated data, all three components of the evening complex were predicted to have a role in regulating GI. However, these connections were not returned in the binary format normally produced by CSI. Instead, they remained partial even when cMax was increased to a value that would have allowed all of the connections to take a value of 1. This shows that a pSet member containing all three components, as well as the additional interactions, did not explain the gene's expression any better than the individual components did; any one of them on its own was able to describe the connection with similar effectiveness.
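One way to read the fractional values for GI is as marginal weights: the weight of a connection is the summed weight of every candidate parent set (pSet member) that contains that regulator. The following Python sketch illustrates this reading. It is an assumption about how the reported values arise, not the CSI implementation. The parent-set weights are hypothetical, and `marginal_edge_weights` is an illustrative name.

```python
from collections import defaultdict


def marginal_edge_weights(parent_set_weights):
    """Sum the normalised weight of every parent set containing each regulator."""
    total = sum(parent_set_weights.values())
    marginals = defaultdict(float)
    for parents, weight in parent_set_weights.items():
        for regulator in parents:
            marginals[regulator] += weight / total
    return dict(marginals)


# HYPOTHETICAL weights: each evening complex component explains GI equally
# well on its own, so the weight is split between three single-parent sets.
gi_interchangeable = {
    frozenset({"ELF3"}): 1.0,
    frozenset({"ELF4"}): 1.0,
    frozenset({"LUX"}): 1.0,
}

# HYPOTHETICAL contrast: the set containing all three components carries all
# of the weight, which would give the usual binary-looking output.
gi_joint = {
    frozenset({"ELF3", "ELF4", "LUX"}): 1.0,
}

split = marginal_edge_weights(gi_interchangeable)
joint = marginal_edge_weights(gi_joint)

for name, weights in (("interchangeable", split), ("joint", joint)):
    summary = ", ".join(f"{reg} {w:.2f}" for reg, w in sorted(weights.items()))
    print(f"{name}: {summary}")
```

In the interchangeable case each component receives a weight of one third, comparable to the values of about 0.3 reported for GI at cMax = 1 in Table 6.2. In the joint case every connection has a weight of 1. The values at higher cMax in Table 6.2 are not reproduced here, since they depend on parent-set weights that the table does not report.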