Learning Networks Using the Snap Shot Score


3.1.2 Learning Networks Using the Snap Shot Score

Potential information flow networks are assessed by identifying each data channel with one network node. Every node is then assigned a score value depending on the nodes linked to it. A child-node is scored by applying the SSS to the join of all parent-channels and the child’s spike train.3 If a node does not have any parents, its score value requires the join a(1,...,n) of all channels. With the child’s spike train s, the score of the parent-less node is

SSS(a(1,...,n), s; ∆t)

if this value is non-zero, and 1 otherwise. Finally, the score of the full network is the product of all its nodes’ scores.
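The scoring rule above can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the SSS itself is not defined in this section, so it is passed in as a hypothetical callable `sss(parents, child)`, with the spike-train data and ∆t assumed to be bound inside that callable. The names `node_score` and `network_score` are likewise illustrative.

```python
from math import prod
from typing import Callable, Dict, FrozenSet, Iterable, Set

# Hypothetical interface (an assumption, not from the source): sss(parents, child)
# returns the Snap Shot Score of the join of the given channels with the
# child's spike train. Data and the time lag are assumed to live inside it.
ScoreFn = Callable[[FrozenSet[int], int], float]


def node_score(child: int, parents: Set[int], all_nodes: Iterable[int],
               sss: ScoreFn) -> float:
    """Score one node given its parent set."""
    if parents:
        # Child with parents: SSS of the join of all parent channels.
        return sss(frozenset(parents), child)
    # Parent-less node: SSS of the join of all channels,
    # replaced by 1 if that value is zero.
    value = sss(frozenset(all_nodes), child)
    return value if value != 0 else 1.0


def network_score(parent_sets: Dict[int, Set[int]], sss: ScoreFn) -> float:
    """Score of the full network: the product of all node scores."""
    all_nodes = list(parent_sets)
    return prod(node_score(child, parents, all_nodes, sss)
                for child, parents in parent_sets.items())
```

Because the network score is a plain product over nodes, each factor depends only on one node's parent set, which is what makes per-node optimisation possible.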

Learning an information flow network from data generally involves scoring many potential structures. Ideally, the highest scoring one would be found. Because of the score’s decomposability (Section E.1.1), the best scoring network can be assembled from each node’s best scoring parent configuration. Thus, full network scores need not be calculated for learning; it is sufficient to determine each node’s optimal parent configuration. In order to identify these with certainty, all 2^n possible joins for each node would have to be evaluated (Fig. 3.2).4 However, for practical dimensions (a 60-electrode array, for example) there are far too many joins for an exhaustive evaluation (Section B.2.1). To circumvent this problem, the set of information flow networks to score can be limited to those with sparse connectivity, or the number of parents per node can be capped. The number of potential child-parent relations might also be reduced by excluding connections ruled out by factual knowledge (like a large physical distance between electrodes, for example). Additionally or alternatively, search heuristics and Monte Carlo methods can be used to select promising configurations to assess. For a discussion of suitable techniques see Appendix D.

3 Recall that a link’s source node is called a parent of the destination node (child) (Section B.1). A loop-link renders a node parent and child at the same time. Such configurations will be referred to as self-exciting.
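The restricted search described above can be illustrated with a short Python sketch. It assumes a hypothetical per-node scoring callable `score(child, parents)` and represents excluded links as a set of `(parent, child)` pairs; neither convention comes from the source. The function names are illustrative only.

```python
from itertools import combinations
from math import comb


def candidate_parent_sets(child, nodes, max_parents=None, forbidden=frozenset()):
    """Yield every permitted parent set for `child`.

    `forbidden` holds (parent, child) pairs ruled out by prior knowledge,
    e.g. (i, i) to exclude self-excitation of node i.
    """
    allowed = [p for p in nodes if (p, child) not in forbidden]
    limit = len(allowed) if max_parents is None else min(max_parents, len(allowed))
    for k in range(limit + 1):
        for combo in combinations(allowed, k):
            yield frozenset(combo)


def best_parent_sets(nodes, score, max_parents=None, forbidden=frozenset()):
    """Pick each node's best scoring parent set independently.

    This relies on the score being decomposable: the best network is
    assembled from the per-node optima.
    """
    return {
        child: max(candidate_parent_sets(child, nodes, max_parents, forbidden),
                   key=lambda parents: score(child, parents))
        for child in nodes
    }


def count_parent_sets(n, max_parents):
    """Number of parent sets per node when at most `max_parents` of n are allowed."""
    return sum(comb(n, k) for k in range(max_parents + 1))
```

For 60 channels, `count_parent_sets(60, 60)` equals 2^60 candidate joins per node, whereas capping the number of parents at two leaves `count_parent_sets(60, 2)`, i.e. 1831, which shows why such limits make the search tractable.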

Incorporating Prior Knowledge

Network inference can be assisted by prior knowledge about the studied system. Here, available information will be used in order to derive a separate link-acceptance-threshold (LAT) for each network node. During network learning any parent configuration with a score value lower than the child node’s LAT will be rejected. This selection removes irrelevant links and can lead to sparser, more relevant networks.

The LAT is chosen to reflect the best explanation for the data at a particular level of complexity. The actual level of complexity is determined by the prior information at hand: knowledge about the studied system constrains the space of potential parent configurations for each node. For example, self-excitation might be excluded, or observed units might be known to have only a few interaction partners. The space of potential configurations can thus be restricted to a particular level of complexity, i.e. number of parents. Configurations at the highest permitted level of complexity determine the LAT, which is the highest score value of these configurations. This highest scoring configuration (LAT-configuration) reflects the best explanation for the data at a level of complexity that could not be limited further by using prior knowledge. In other words, the LAT-configuration represents all background information formulated in terms of the SSS. Better explanations than the LAT-configuration might exist; these are simpler configurations with scores equal to or above the LAT. Calculating the SSS for several parent configurations might reveal such superior explanations. Ultimately, we seek to find the simplest among the best scoring configurations consistent with prior knowledge. Thus, any configuration with a score value below the LAT should be omitted from result lists, as it gives a worse explanation for the data than prior knowledge, i.e. the LAT-configuration. For a demonstration of how to determine the LAT, consider the following example.

Example 4 (LAT) The spike trains shown in Fig. 3.1a are analysed using the SSS (d = 3^(-3), ∆t = 1). Assume that no prior information is available that would permit any restriction of potential parent configurations. We thus consider all possible configurations including self-excitation of units (Fig. 3.2). In order to determine the LAT for node 1, its most complex parent configurations must be found and scored. Here, only one configuration of highest complexity exists, namely that where nodes 2, 3, and node 1 itself are parents of node 1 (Fig. 3.2

[Figure 3.2. Panel labels: "all configurations excluding self-excitation"; "all configurations including self-excitation"; "best scoring network (self-excitation excluded)"; "best scoring network (self-excitation permitted)". Rounded SSS values in extraction order: 0.26 0.52 0.00 0.26 0.25 0.42 0.13 0.26 0.19 0.00 0.07 0.04 0.33 0.25 0.21 0.19 0.28 0.50 0.22 0.33 0.28 0.35 0.25 0.28]

Figure 3.2: Exhaustive evaluation of all possible parent configurations of 3 nodes (top) for the spike trains shown in Fig. 3.1a. Each row shows the configurations for one child node with the rounded SSS value above each (d = 3^(-3), ∆t = 1). All configurations excluding self-excitation are located in the left half; considering self-excitation doubles the number of configurations (left and right halves together). The best scoring parent configurations (highlighted in gray) are combined into full networks (bottom). Depending on whether or not self-excitation is considered, the top-scoring networks differ.

threshold. Similarly, for nodes 2 and 3 we find thresholds 0.19 and 0.28, respectively. The best scoring parent configurations of each node exceed each node’s LAT; they are thus all included in the combined network (Fig. 3.2, bottom). (Otherwise, i.e. if the best scoring configuration did not exceed the LAT, the corresponding links would be omitted.)
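The LAT selection described in this section can be sketched as follows. This is an illustrative reading of the procedure under stated assumptions: scores for one child node are given as a dictionary from parent sets to SSS values, prior knowledge is expressed only as a maximum number of parents, and the numbers in the usage example are invented for illustration rather than taken from Fig. 3.2.

```python
def link_acceptance_threshold(scores, max_parents):
    """LAT for one child node.

    `scores` maps frozenset(parents) -> score. The LAT is the highest score
    among the configurations at the highest permitted level of complexity.
    """
    permitted = {ps: s for ps, s in scores.items() if len(ps) <= max_parents}
    top_level = max(len(ps) for ps in permitted)
    return max(s for ps, s in permitted.items() if len(ps) == top_level)


def select_configuration(scores, max_parents):
    """Return (LAT, simplest configuration among the best scoring accepted ones)."""
    lat = link_acceptance_threshold(scores, max_parents)
    # Reject every configuration that scores below the LAT.
    accepted = {ps: s for ps, s in scores.items()
                if len(ps) <= max_parents and s >= lat}
    best = max(accepted.values())
    # Among the best scoring configurations, prefer the one with fewest parents.
    simplest = min((ps for ps, s in accepted.items() if s == best), key=len)
    return lat, simplest


# Invented scores for a single child node (not the values of Fig. 3.2).
example_scores = {
    frozenset(): 0.1,
    frozenset({2}): 0.5,
    frozenset({3}): 0.2,
    frozenset({2, 3}): 0.3,
}
lat, chosen = select_configuration(example_scores, max_parents=2)
```

In this invented case the only configuration at the highest permitted complexity is {2, 3}, so the LAT is 0.3; the configurations {} and {3} fall below it and are rejected, and {2} is returned as the simplest best scoring explanation.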

The preceding sections showed how to interpret and how to use the SSS. It has already been noted that the score favours sparse networks, but no practical demonstration of this and other features has been given yet. The next section uses simple examples to illustrate the score’s characteristics, which are investigated formally later (Chapter 4).

In document Causal pattern inference from neural spike train data (Page 52-55)