a Question of Semantics
Networks can be found in a variety of domains [Bornholdt and Schuster, 2003, Sporns et al., 2004, Borgatti et al., 2009, Lazer et al., 2009, Hidalgo et al., 2009, Bullmore and Sporns, 2009], and the same structure may have a different meaning in each of them. This is because networks are used to encode different kinds of information: train or tube networks, for example, generally represent the existence of direct physical connections between stations. But connections can also be more abstract and encode references only, as in networks representing web links from one website to another. Depending on the desired semantics of a network, different information is needed for its construction. Conversely, the kind of information that is available determines which semantics a derived network can possibly have. This section discusses how the BD scores and the SSS use given data, i.e. which information is extracted for inference. It will show that the two scores interpret the data differently, such that the resulting networks do not share the same semantics.
in the score’s formula when applying Bayes’ theorem.) However, no attempt is made here to present the SSS as the outcome of such a Bayesian concept; this is subject to future work.
5.2.1 Data Interpretation by BD Scores
As mentioned during the introduction of the BD scores (Section 2.2), these can handle any discrete data (with a finite number of states). No particular data source was considered for their design and they are based on rather general assumptions; one of them is that states of variables do not possess any semantics. The ordering of the $r_i$ states that variable $i$ can take is thus not important, and any permutation of these states results in the same score value. Therefore, the score cannot distinguish two complementary spike trains: if all spiking time-bins of one spike train are non-spiking in the other and vice versa, the two can be exchanged without affecting the score value. In other words, the score cannot tell a spike train of an extremely active unit apart from one with very sparse firing, for example. For practical applications this ambiguity is unlikely to have an effect when time-bins with high temporal resolution (1 msec) are chosen: if few or even single units are represented per channel, spike events will be rare at common spike rates; channels that are active more than half of the time will probably not exist. Hence, it is unlikely that the data contain two time series that are complementary to each other. But the aforementioned rare occurrence of spikes can cause a different kind of problem, which is considered next.
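The invariance under state permutations can be checked numerically. The following is a minimal sketch, assuming a BD-type score with a uniform Dirichlet prior (`alpha = 1`, i.e. the K2 metric) as a stand-in for the BD scores of Section 2.2; the function name `log_bd_family_score`, the toy spike trains and the assumption that parent trains are already lag-aligned with the child are illustrative choices, not taken from the text.

```python
from collections import Counter
from math import lgamma


def log_bd_family_score(child, parents, r_child=2, alpha=1.0):
    """Log BD-type score of one child given its parents.

    Assumption: uniform Dirichlet prior with `alpha` per child state
    (alpha=1 is the K2 metric). Parent trains are assumed to be shifted
    already, so that bin t of a parent precedes bin t of the child.
    """
    counts = Counter()
    for t, state in enumerate(child):
        config = tuple(p[t] for p in parents)  # joint parent state in bin t
        counts[(config, state)] += 1

    score = 0.0
    for config in {c for c, _ in counts}:  # only observed parent states
        n_ijk = [counts[(config, k)] for k in range(r_child)]
        score += lgamma(r_child * alpha) - lgamma(r_child * alpha + sum(n_ijk))
        score += sum(lgamma(alpha + n) - lgamma(alpha) for n in n_ijk)
    return score


# Hypothetical toy data: 12 time-bins, 1 = spike, 0 = no spike.
parent = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
child = [0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0]
child_flipped = [1 - s for s in child]  # complementary spike train

score = log_bd_family_score(child, [parent])
score_flipped = log_bd_family_score(child_flipped, [parent])
print(score, score_flipped)
```

Swapping the two child states only permutes the counts inside each parent configuration, so both calls sum the same terms and the printed values agree up to floating-point rounding.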
The BD score assigns high score values to networks in which parents are good predictors of the child; more precisely, to networks in which the joint parent state and the state of the child are reliably coupled. In a good network, particular firing patterns of parent nodes regularly evoke a certain response of their child. While there is nothing wrong with this criterion, practical network inference can be highly problematic when firing rates are low. In these cases spike events are clearly outnumbered by time-bins that code non-spiking. This enormous imbalance affects the assignment of scores: structures are favoured for which parent and child states are well correlated; but when most time-bins code non-spiking, a reliable coupling between the parents and their child means that specifically their non-active states match well. By comparison, occasional spikes are seen as minor disagreements, which have far less weight on the score value than the large number of matching non-spikes.⁵ While the highest score value is assigned to the structure where both non-spiking and spiking activity of the parents and the child match, it is extremely difficult to identify that structure. The reason for this is the similarity of scores for structures in which non-spiking bins match well. The score's undifferentiated assignment is, however, correct, which can be best understood in an extreme situation: consider the $m$th order Markov model

$$P\Bigl(X_t^{(i)} \Bigm| \bigl\{X_{\bar t}^{(1)}, \dots, X_{\bar t}^{(n)}\bigr\}_{\bar t \in \{t-1,\dots,t-m\}}\Bigr) = P\Bigl(X_t^{(i)} \Bigm| \mathrm{pa}\bigl(X^{(i)}\bigr)\Bigr) \tag{5.2}$$

for which the most appropriate parent set $\mathrm{pa}(X^{(i)})$ is to be determined. If $X^{(i)}$ is in fact a constant, any combination of parents predicts its state equally well! Their scores should thus be similar or even equal. Neural data with rare spike events resemble this situation: variables appear to be nearly constant, which makes it likely that several equally high scoring networks exist. When search heuristics are used to recover these networks, their undifferentiated scores constitute different local maxima or even a plateau of solutions; this can lead to unstable results, such that the connectivity of inferred networks differs substantially. This is problematic, since the interpretation of highly diverse results is generally complicated, as model averaging techniques (Section B.3.1) might fail to identify any consensus between them. Analysing spike train data with the BD scores can thus be problematic; however, methods that help to circumvent these problems exist and will be discussed later, together with their drawbacks.

⁵ This effect is due to the Γ-function's increasing growth for larger numbers. This can be easily seen from its relationship to the factorial [equation (2.13)], for which Stirling's formula [Feller, 1950, pp.52] holds for large arguments $x$:

$$x! \approx \sqrt{2\pi}\, x^{x+\frac{1}{2}}\, e^{-x}. \tag{5.1}$$

As the growth of $x!$ increases in $x$, parent-child-state combinations with high counts have a higher weight for the score value than those for which counts are small.
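The footnote's appeal to Stirling's formula can be checked numerically. The sketch below compares $\log x!$, computed through the Γ-function, with the logarithm of the approximation in equation (5.1). The counts 9 and 991 are hypothetical and stand for, say, 9 spiking and 991 non-spiking bins out of 1000; the helper names are illustrative.

```python
from math import lgamma, log, pi


def log_factorial(x):
    """log(x!) via the Gamma function, using Gamma(x + 1) = x!."""
    return lgamma(x + 1)


def log_stirling(x):
    """Logarithm of Stirling's approximation sqrt(2*pi) * x**(x + 1/2) * exp(-x)."""
    return 0.5 * log(2 * pi) + (x + 0.5) * log(x) - x


for x in (9, 99, 991):
    print(x, log_factorial(x), log_stirling(x))

# The approximation error shrinks as the argument grows.
gap_small = log_factorial(9) - log_stirling(9)
gap_large = log_factorial(991) - log_stirling(991)

# A count of 991 contributes far more to a log score than a count of 9.
weight_ratio = log_factorial(991) / log_factorial(9)
print(gap_small, gap_large, weight_ratio)
```

On the log scale used here, the term belonging to the large count exceeds the one belonging to the small count by more than two orders of magnitude, which is the imbalance the footnote describes.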
5.2.2 Data Interpretation by the SSS
The SSS has been designed with the aim of revealing excitatory relationships between neural entities. The score is designed specifically for one kind of data: spike trains. This specific adaptation makes it possible to respect the semantics of these data: unlike for the BD scores, absolute variable states matter to the SSS, which treats spiking and non-spiking events very differently. Indeed, since the SSS is designed to detect time-lagged correlation of spiking activity, it is not concerned with periods without any activity; the score of a parent configuration is determined only by the behaviour of the child after its parents have been active. Whether the child is silent or spiking at other times does not affect the score value. Networks learned with the SSS therefore do not represent close coupling of linked units, but rather uni-directional relations: a link indicates that parent spikes reliably trigger spike responses of the child. Unlike connections learned with the BD scores, revealed connections do not imply anything about the child’s activity at times when its parents are silent.
The SSS’s emphasis on periods in which putative causal units are active is motivated by the relatively low spatial density and coverage at which spike train data can currently be collected. In detail, neurons for which spike trains are recorded may show correlated activity with other recorded units; however, neurons for which no data are collected can also trigger firing in observed units. The SSS has been constructed with such spike responses in mind: child responses to hidden units do not affect the score value; the score only reflects the degree to which child spikes follow activity of observable parents. The SSS can thus be used to find plausible causal relations between observed units; these explanations of the data do not exclude the existence of external, hidden factors. The BD scores differ in this respect: while the SSS ignores uncoordinated spikes of the child, the BD scores weight activity of the child regardless of whether its parents are active or not. Any lack of synchronisation between the joint parent state and that of the child is interpreted as an indication of stochastic independence. Due to uncoordinated spikes, the child’s activity might not be fully explainable by observed units, such that links corresponding to such partial correlation might not be learned with the BD scores.
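The contrast described above can be illustrated with a small sketch. The SSS formula is not reproduced in this section, so the code uses a hypothetical stand-in, `response_rate`, which is not the SSS and shares only the one property discussed here: it looks exclusively at bins in which the parent is active. The BD side again assumes a K2-style uniform Dirichlet prior, and all data are synthetic, with the parent train taken to be already lag-aligned with the child.

```python
from collections import Counter
from math import lgamma


def log_bd_family_score(child, parents, r_child=2, alpha=1.0):
    """Log BD-type score (assumed K2-style prior, alpha=1) of a child given its parents."""
    counts = Counter()
    for t, state in enumerate(child):
        counts[(tuple(p[t] for p in parents), state)] += 1
    score = 0.0
    for config in {c for c, _ in counts}:
        n_ijk = [counts[(config, k)] for k in range(r_child)]
        score += lgamma(r_child * alpha) - lgamma(r_child * alpha + sum(n_ijk))
        score += sum(lgamma(alpha + n) - lgamma(alpha) for n in n_ijk)
    return score


def response_rate(child, parent):
    """Hypothetical stand-in (NOT the SSS): share of parent spikes answered by a child spike."""
    responses = [c for c, p in zip(child, parent) if p == 1]
    return sum(responses) / len(responses)


n_bins = 200
parent = [1 if t % 20 == 10 else 0 for t in range(n_bins)]  # 10 parent spikes
child_clean = list(parent)  # child answers every parent spike
# Add 10 child spikes driven by an unobserved unit, while the parent is silent.
child_hidden = [1 if (c == 1 or t % 20 == 5) else 0 for t, c in enumerate(child_clean)]

rate_clean = response_rate(child_clean, parent)
rate_hidden = response_rate(child_hidden, parent)
bd_clean = log_bd_family_score(child_clean, [parent])
bd_hidden = log_bd_family_score(child_clean if False else child_hidden, [parent])
print(rate_clean, rate_hidden)  # unchanged by the uncoordinated spikes
print(bd_clean, bd_hidden)      # the BD-type score drops
```

In this toy setting the stand-in statistic is identical for both child trains, whereas the BD-type score of the same link decreases once uncoordinated child spikes are present, since they count as disagreements between parent and child states.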
The previous sections have explained how the BD scores and the SSS differ, as well as how learned networks are to be understood. It has also been explained why the application of BD scores to spike trains can be problematic, and ways to address these problems will be outlined shortly. Before that, however, it should be briefly considered whether it is generally useful to infer DBNs instead of the temporally less precise networks corresponding to the SSS. This question is the subject of the next section.