Modelling Larger Networks: Boolean Modelling

Although small networks may be modelled using quantitative approaches, this is generally less feasible for larger networks. Because quantitative approaches are computationally demanding and require detailed kinetic data for the network constituents, discrete modelling may be adopted for larger networks instead. Discrete modelling simplifies the modelling process by removing the need for parameters such as initial concentrations and rate constants. Analysis of discrete models is therefore qualitative rather than quantitative, relying primarily on network structure and topology (Khan et al., 2014).

Network modelling for larger networks can be applied to a variety of biological systems such as metabolic networks (Feist et al., 2007) and protein-protein interaction networks (Jeong et al., 2001). Mathematical graphs are used to formalise and represent the networks: the nodes of the graph represent biological entities such as proteins or genes, whilst the edges represent, for example, the interactions between those entities (Klipp et al., 2009). The edges of the graph may be directed or undirected. A directed edge is an ordered pair of nodes, drawn as an arrow, whilst an undirected edge is an unordered pair of nodes, drawn as a line (Klipp et al., 2009). Analysis can be undertaken on both directed and undirected graphs, though analysis of undirected graphs is limited in that it reveals only node connectivity. For analysis of gene regulatory networks, directed graphs are most suitable as they show which node is affected in any particular interaction (Liu et al., 2014).
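The distinction between directed and undirected edges can be sketched in a few lines of Python. The node names and edges below are hypothetical and chosen only for illustration; the helper functions `successors` and `neighbours` are likewise assumptions of this sketch rather than part of any cited tool.

```python
# Hypothetical edge list: each pair is (source, target).
edges = [("A", "C"), ("B", "C"), ("C", "D")]


def successors(edge_list):
    """Directed view: map each node to the nodes it acts upon."""
    out = {}
    for source, target in edge_list:
        out.setdefault(source, set()).add(target)
        out.setdefault(target, set())  # make sure sink nodes appear too
    return out


def neighbours(edge_list):
    """Undirected view: map each node to every node it is connected to."""
    adjacent = {}
    for u, v in edge_list:
        adjacent.setdefault(u, set()).add(v)
        adjacent.setdefault(v, set()).add(u)
    return adjacent


directed = successors(edges)
undirected = neighbours(edges)

# The directed view shows that C acts only on D, whereas the
# undirected view shows only that C is connected to A, B and D.
print(sorted(directed["C"]))
print(sorted(undirected["C"]))
```

The undirected view loses the information about which node is affected, which is why the directed form is preferred for regulatory networks.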

The simplest form of discrete modelling is Boolean modelling (Saadatpour and Albert, 2012). Boolean modelling utilises the principles of Boolean logic: everything is either true (1, ON) or false (0, OFF). Logical operators such as AND, OR and NOT may be used alone or in combination to modify statements/interactions. Under a Boolean model every network constituent will have a value/state of either 1 (ON) or 0 (OFF). Although this is of course not as quantitatively precise as ODE models, such logic is a good representation of certain processes such as gene regulatory networks, since many genes or proteins exhibit ON/OFF styles of function (Khan et al., 2014).
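As a minimal sketch of these ideas, the following Python fragment assigns each node a state of True (1, ON) or False (0, OFF) and defines each node's next state with AND, OR and NOT. The three-node network and its rules are hypothetical, and updating all nodes at the same time is an assumption of this sketch, not something prescribed by the cited work.

```python
# Hypothetical rules: each function maps the current state to a node's next value.
rules = {
    "X": lambda s: s["X"],                 # input node keeps its value
    "Y": lambda s: s["X"] and not s["Z"],  # X activates Y, Z inhibits Y
    "Z": lambda s: s["Y"],                 # Y activates Z
}


def step(state):
    """Apply every rule once, using the same current state for all nodes."""
    return {node: bool(rule(state)) for node, rule in rules.items()}


initial = {"X": True, "Y": False, "Z": False}
after_one_step = step(initial)
print(after_one_step)
```

With X ON and Z OFF, the rule for Y evaluates to ON after one step, while Z still reflects the earlier OFF state of Y.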

Boolean modelling utilises mathematical graph theory to represent the network. Nodes can represent biological entities such as proteins, whilst edges represent the interactions between those proteins. Interactions within a Boolean model may be represented with different formalisms such as interaction graphs and interaction hypergraphs (Klamt et al., 2006). The difference between the two is that, in an interaction hypergraph, a single edge can connect more than one upstream node to a downstream node simultaneously. A simple example is illustrated in Figure 1.20.1:

Figure 1.20.1: An interaction graph compared to an interaction hypergraph. Figure 1.20.1A represents the interaction graph whilst Figure 1.20.1B represents the interaction hypergraph. Adapted from a similar example in Klamt et al. (2006).

In the hypothetical example in Figure 1.20.1, the biological effect to be modelled is that both Protein A and Protein B are required for the activation of Protein C, which subsequently activates Protein D. The interaction hypergraph (Figure 1.20.1B) represents this more accurately, as it allows upstream nodes to act simultaneously on a downstream node. The interaction graph in Figure 1.20.1A does not capture this requirement and would allow Protein C to be activated by Protein A or Protein B alone. Thus, Klamt et al. (2006) argued that the interaction hypergraph makes for a more accurate simulation of cellular networks, given the biological reality that proteins often work in tandem with each other to exert their effects. The allowance of logical operators such as “AND”, “OR” and “NOT” further improves model simulation and analysis (Klamt et al., 2006).
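The difference between the two representations in Figure 1.20.1 can be made concrete with a short truth table. The sketch below reads the interaction graph as "A OR B activates C" and the interaction hypergraph as "A AND B activates C", following the description above; the function names are hypothetical and introduced only for this illustration.

```python
from itertools import product


def c_interaction_graph(a, b):
    """Separate edges A -> C and B -> C: either protein alone activates C."""
    return a or b


def c_interaction_hypergraph(a, b):
    """One hyperedge {A, B} -> C: both proteins are required to activate C."""
    return a and b


def d_from_c(c):
    """Protein C activates Protein D in both representations."""
    return c


# Compare the state of Protein D under both readings for every input combination.
table = []
for a, b in product([False, True], repeat=2):
    d_graph = d_from_c(c_interaction_graph(a, b))
    d_hyper = d_from_c(c_interaction_hypergraph(a, b))
    table.append((a, b, d_graph, d_hyper))

# Input combinations for which the two representations disagree.
disagreements = [(a, b) for a, b, d_graph, d_hyper in table if d_graph != d_hyper]
print(disagreements)
```

The two readings disagree exactly when only one of Protein A and Protein B is present, which is the case the interaction graph misrepresents.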

Klamt et al. (2007) introduced a MATLAB package called CellNetAnalyzer (CNA), which can be used to create and analyse Boolean models. Two types of models can be created within CNA: mass-flow (suitable for metabolic networks) and signal-flow (suitable for gene regulatory networks). CNA accepts both interaction graphs and interaction hypergraphs, as well as logical operators such as AND, OR and NOT, which allows more complex models to be constructed. CNA has been successfully applied to cancer research. For instance, a model of the TP53 protein interaction network was generated and analysed in CNA by Tian et al. (2013). Several analyses were undertaken in CNA for this model, including logical steady state analysis (LSSA), dependency matrix generation and in silico knockouts, in addition to wet laboratory verification of model predictions (Tian et al., 2013). Approaches such as LSSA are described in detail in Chapter 2 (Materials and Methods).
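The general idea behind a steady state calculation and an in silico knockout can be sketched in Python using the hypothetical A, B, C, D example from Figure 1.20.1. This is not CNA's algorithm and does not use its MATLAB interface; the function `steady_state`, the choice to start all unpinned nodes at OFF, and the update scheme are assumptions made purely for illustration.

```python
def steady_state(rules, fixed, max_iter=100):
    """Update all nodes repeatedly, holding `fixed` nodes constant,
    until the state stops changing. Returns None if no fixed point is
    reached within max_iter updates (for example, if the network oscillates)."""
    state = {node: False for node in rules}  # assumption: unpinned nodes start OFF
    state.update(fixed)
    for _ in range(max_iter):
        new_state = {
            node: fixed[node] if node in fixed else bool(rule(state))
            for node, rule in rules.items()
        }
        if new_state == state:
            return state
        state = new_state
    return None


# Hypothetical rules for the example network: A AND B -> C, C -> D.
rules = {
    "A": lambda s: s["A"],
    "B": lambda s: s["B"],
    "C": lambda s: s["A"] and s["B"],
    "D": lambda s: s["C"],
}

# Both inputs ON, no perturbation.
unperturbed = steady_state(rules, {"A": True, "B": True})

# In silico knockout of Protein C: pin C to OFF regardless of its inputs.
c_knockout = steady_state(rules, {"A": True, "B": True, "C": False})

print(unperturbed["D"], c_knockout["D"])
```

In this sketch a knockout is simply a node pinned to OFF, so comparing the two results shows which downstream nodes depend on the knocked-out node. Networks with feedback may have several steady states or none, in which case the result depends on the assumed starting state.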

The model constructed by Tian et al. (2013) showed an accuracy of up to 71%, demonstrating the power and usefulness of CNA. A semi-quantitative algorithm, the signal transduction score flow algorithm (STSFA), which superimposes microarray and/or ChIP-seq data onto a network model (Isik et al., 2012), was later applied to the TP53 model generated by Tian et al. (2013) and demonstrated improved predictive power over LSSA (Hussain et al., 2014). The TP53 model was later expanded to 260 nodes and 980 interactions, with this expanded model again showing an accuracy of up to 71% when compared to microarray data (Hussain et al., 2015).

Boolean modelling as a whole offers many advantages. The simplification of interactions (down to a simple ON or OFF as opposed to exact kinetic mechanisms) keeps computational demand low, which in turn allows for the modelling of much larger networks than is generally feasible for ODE models. Though Boolean models are not as quantitatively precise as ODE models, their ability to capture large networks, together with the possibility of semi-quantitative analysis through algorithms such as the STSFA, highlights their strengths and usefulness (Albert and Othmer, 2003).
