
5.4 Study 7: The robustness of development

5.4.1 Perturbation analysis

This section describes the methodology used to measure the robustness of the DRGN-lineage model to genetic and environmental sources of perturbation. We used 21 ensembles of DRGNs parameterised by size (N = 1, 2, 4, 8, 16, 32) and connectivity (K = 1, 2, 4, 8, 16, 32). The remaining parameters (weight scale, threshold scaling and number of cell types) were fixed for all ensembles (W = 2.0; λ = 0.8; NO = 2). Each ensemble consisted of 200 randomly generated DRGNs. For each DRGN, the unperturbed lineage was generated and stored. Four sets of perturbed lineages were then generated for each DRGN. The four sets varied the source of perturbation (genetic or environmental) and the rate of perturbation (absolute or relative). Each set consisted of 100 perturbed lineages. The robustness for each of the four sets was calculated by comparing each perturbed lineage to the corresponding unperturbed lineage. Further implementation details relating to the source of perturbation, the rate of perturbation and the lineage comparison algorithm are described below.
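The experimental design can be tallied with a short Python sketch. The text does not say which of the 36 possible (N, K) pairs make up the 21 ensembles; the sketch assumes that connectivity never exceeds size (K ≤ N), which is one reading that yields exactly 21 combinations. All names are illustrative.

```python
from itertools import product

SIZES = [1, 2, 4, 8, 16, 32]           # N, as listed in the text
CONNECTIVITIES = [1, 2, 4, 8, 16, 32]  # K, as listed in the text

# Assumption (not stated in the source): only pairs with K <= N are used.
ensembles = [(n, k) for n, k in product(SIZES, CONNECTIVITIES) if k <= n]

# Four perturbation sets: source of perturbation crossed with rate.
perturbation_sets = list(product(["genetic", "environmental"],
                                 ["absolute", "relative"]))

DRGNS_PER_ENSEMBLE = 200   # from the text
LINEAGES_PER_SET = 100     # from the text

total_perturbed_lineages = (len(ensembles) * DRGNS_PER_ENSEMBLE
                            * len(perturbation_sets) * LINEAGES_PER_SET)
```

Under this assumption the design comprises 21 ensembles and 1,680,000 perturbed lineages in total.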

Source of perturbation

A developmental system may experience perturbation from two different sources: genetic and environmental (Wagner and Altenberg, 1996). In the context of the DRGN-lineage model, these can be interpreted as structural and dynamic perturbation respectively. Genetic perturbation, or mutation, is a heritable change to an organism’s genome, here represented as the pattern of interactions between nodes in the DRGN. Environmental perturbation, by contrast, is a non-heritable and transient disturbance affecting development, here represented by the dynamic behaviour of the DRGN. Structural perturbations were implemented by modifying the connection strengths between nodes. Each perturbed DRGN was generated by adding Gaussian noise with distribution G(0, 0.1) to 20% of randomly chosen connections. This implementation of mutation corresponds to the random walk

120 The Structure and Composition of Ontogenetic Space

assumption used in population genetics models (Zeng and Cockerham, 1993). Dynamic perturbations were implemented by adding probabilistic noise to a subset of node activations. After each cell division, each node activation had a 10% chance of being modified by the addition of Gaussian noise with distribution G(0, 0.05).
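The two kinds of perturbation can be sketched as follows. This is a minimal illustration, not the original implementation. It assumes that the connection strengths and node activations are held in flat lists, that the second parameter of G(·,·) is a standard deviation, and that "20% of connections" means an exact 20% subset rather than an independent 20% chance per connection. The function and parameter names are hypothetical.

```python
import random

def perturb_structure(weights, fraction=0.2, sigma=0.1, rng=random):
    """Genetic (structural) perturbation: add Gaussian noise to a random
    subset of connection weights. Returns a new list."""
    perturbed = list(weights)
    n_changed = round(fraction * len(perturbed))  # assumed: exact subset size
    for idx in rng.sample(range(len(perturbed)), n_changed):
        perturbed[idx] += rng.gauss(0.0, sigma)   # assumed: sigma is a std. dev.
    return perturbed

def perturb_dynamics(activations, chance=0.1, sigma=0.05, rng=random):
    """Environmental (dynamic) perturbation: each node activation is
    independently modified with probability `chance`. In the study this
    step would be applied after each cell division."""
    return [a + rng.gauss(0.0, sigma) if rng.random() < chance else a
            for a in activations]
```

Passing a seeded `random.Random` instance as `rng` makes a perturbed lineage reproducible, which is convenient when the same DRGN has to be perturbed 100 times.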

Rates of perturbation

In biology, mutation rates are known to vary widely both among different species and among different regions of a single genome (Kumar and Subramanian, 2002). Similarly, the level of stochasticity in the regulatory events involved in gene expression is not known precisely (McAdams and Arkin, 1997). As a first approximation, we considered two approaches to determining rates of perturbation.

As the number of parts and interactions in a system varies, there are two possible ways of measuring the amount of perturbation applied to the system. The amount of variation may be absolute, in the sense that a fixed number of perturbations are applied regardless of the structure of the system. Alternatively, the probability of any part or interaction being perturbed may be held fixed, in which case the number of perturbations will be relative to the size or connectivity of the system. In this study, we explored the effect of both absolute and relative rates of perturbation.
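The distinction between the two rates can be made concrete with a small Python sketch. The helper names and the example values are hypothetical; the source does not state the fixed count used for the absolute rate.

```python
import random

def choose_absolute(n_items, count, rng=random):
    """Absolute rate: perturb a fixed number of items, regardless of how
    many parts or interactions the system has."""
    return sorted(rng.sample(range(n_items), min(count, n_items)))

def choose_relative(n_items, probability, rng=random):
    """Relative rate: each item is perturbed independently with a fixed
    probability, so the expected number of perturbations is
    probability * n_items."""
    return [i for i in range(n_items) if rng.random() < probability]
```

With an absolute rate the number of perturbed items stays constant as the system grows, so each item becomes less likely to be hit. With a relative rate the expected number of perturbations grows in proportion to the size or connectivity of the system.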

Measuring robustness and variability

In order to measure the effect of a perturbation, a metric was required for quantifying the difference between the unperturbed and perturbed cell lineages. Measuring the distance between tree structures is a common task in phylogenetics, for which there are a number of widely used methods (Felsenstein, 2004). In general, these methods rely on the terminal nodes of the trees (i.e., the extant species in a phylogenetic tree) being fixed, with the variation between trees being in the branching relationship that links the terminal nodes. When considering perturbations to cell lineages, however, the set of terminal nodes cannot be assumed to remain constant. Perturbations may result in the terminal cells of a lineage increasing or decreasing in number and also changing from one type to another.

For organisms with invariant patterns of development, the physical location of a cell is often closely tied to its position in the lineage (Sulston et al., 1983; Nishida, 1987; Houthoofd et al., 2003). We therefore decided to base our comparison between lineages on the similarity between the order and composition of their terminal cells³. It was important for this measure not only that sequences could be of dissimilar lengths, but also that common sub-sequences could be recognised despite being shifted in location. The degree of similarity between two phenotypes was based on the Levenshtein distance (Sankoff and Kruskal, 1983) between the unperturbed fate sequence U and the perturbed fate sequence P. Levenshtein distance is defined in terms of the minimum number of transformations required to change U into P, where the possible transformations are the insertion, deletion and substitution of cell fates. The dynamic programming algorithm used to calculate the distance between two sequences U and P was as follows:

LevenshteinDistance(U, P):
    declare int d[length(U)+1, length(P)+1]
    declare int i, j

    // first column: a prefix of U compared with the empty sequence
    for i from 0 to length(U)
        d[i, 0] = i × spacePenalty
    // first row: the empty sequence compared with a prefix of P
    for j from 0 to length(P)
        d[0, j] = j × spacePenalty

    for i from 1 to length(U)
        for j from 1 to length(P)
            // score for pairing the i-th fate of U with the j-th fate of P
            if (U[i] = P[j])
                currentValue = matchValue
            else
                currentValue = mismatchValue
            d[i, j] = minimum(
                d[i−1, j] + 1,
                d[i, j−1] + 1,
                d[i−1, j−1] + currentValue
            )

    return d[length(U), length(P)]

³ Several alternative approaches to comparing cell lineages are described and investigated in the following chapter.


matchValue and mismatchValue were the scores assigned for a correct or incorrect match at a particular position; spacePenalty was the score assigned for an insertion or deletion. The values used for matchValue, mismatchValue and spacePenalty were 1, −1 and −2 respectively. The similarity between two sequences was then defined as:

\[
\mathrm{similarity}(U, P) = \frac{\mathrm{LevenshteinDistance}(U, P)}{|U|} \tag{5.1}
\]

where |U| was the length of the unperturbed sequence U. A similarity of 1.0 indicated a perfect match between the perturbed and unperturbed sequence—the phenotype was robust to the perturbation. Any value less than 1.0 indicated an imperfect match—the perturbation resulted in a modified phenotype.
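The similarity measure can be sketched in Python. One assumption needs to be flagged: with a match score of 1 and penalties of −1 and −2, a sequence compared with itself scores |U|, and hence a similarity of 1.0, only if the recurrence takes the maximum of the three options and charges the space penalty for each insertion or deletion, as in a global alignment score. The sketch adopts that reading, which the source does not confirm and which differs from the minimum and unit costs shown in the pseudocode. The function names are illustrative.

```python
def alignment_score(u, p, match=1, mismatch=-1, space=-2):
    """Score-maximising global alignment of two fate sequences.

    Assumption: the three options are combined with max() and gaps cost
    `space`, so that identical sequences score len(u)."""
    rows, cols = len(u) + 1, len(p) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):          # prefix of u against the empty sequence
        d[i][0] = i * space
    for j in range(cols):          # empty sequence against a prefix of p
        d[0][j] = j * space
    for i in range(1, rows):
        for j in range(1, cols):
            current = match if u[i - 1] == p[j - 1] else mismatch
            d[i][j] = max(d[i - 1][j] + space,        # deletion from u
                          d[i][j - 1] + space,        # insertion into u
                          d[i - 1][j - 1] + current)  # match or substitution
    return d[-1][-1]

def similarity(u, p):
    """Alignment score normalised by the unperturbed length, as in (5.1)."""
    if len(u) == 0:
        raise ValueError("the unperturbed sequence must not be empty")
    return alignment_score(u, p) / len(u)
```

Under this reading a single substituted fate lowers the score by 2, so `similarity("ABBA", "ABAA")` is 0.5 while `similarity("ABBA", "ABBA")` is 1.0. Longer fate sequences are correspondingly less sensitive to a single change.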

For each ensemble we calculated two statistics: the percentage of perturbations that left the order and composition of terminal cell fates completely unchanged (a similarity of 1.0) and the percentage of perturbations that produced only minor changes to the terminal cell fates (a similarity greater than 0.9). The latter statistic reflects the possibility that small changes to an organism’s phenotype may have a negligible effect from the point of view of natural selection (Ohta, 2002).
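The two ensemble statistics can be computed from a list of similarity values as in the sketch below. It assumes that the "minor changes" statistic counts every perturbation with similarity greater than 0.9, including those with a similarity of exactly 1.0; the source does not say whether unchanged phenotypes are excluded. The function name is hypothetical.

```python
def robustness_statistics(similarities, minor_threshold=0.9):
    """Return (percent unchanged, percent with at most minor changes)."""
    n = len(similarities)
    if n == 0:
        raise ValueError("at least one similarity value is required")
    unchanged = sum(1 for s in similarities if s == 1.0)
    # Assumption: values of exactly 1.0 also count towards this statistic.
    minor = sum(1 for s in similarities if s > minor_threshold)
    return 100.0 * unchanged / n, 100.0 * minor / n
```

For example, the similarities `[1.0, 0.95, 0.5, 1.0]` give 50% unchanged and 75% with at most minor changes.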