
3.2 Aggregation Errors in Communicability Calculation

3.2.2 Temporal Network Partitions

The examples above highlight the errors that can be introduced when the communicability metric is applied to aggregated temporal networks. We now outline four particular partitions of time, their effect on the errors in the communicability metric, and simple algorithms to calculate these partitions. For simplicity we assume the full temporal network is known in advance; in most cases, however, these partitions are easily extended to instances where it is not, such as a real-time implementation.

Fixed-Length Interval Partition

The fixed-length interval partition divides the temporal network into intervals of equal length.

This partition is the most commonly used in the literature [58, 132, 99]. Partitioning the network in this manner has a number of advantages that make it attractive for study. One advantage is that the number of intervals can be calculated in advance, so the run-time of the algorithm can be estimated easily. The fixed-length interval partition is also the simplest, as it requires no knowledge of the temporal network beyond the times at which events occur. However, to ensure convergence of the communicability matrix one needs to calculate the spectral radius of each adjacency matrix, which is an $O(N^2)$ operation. For convergence we require $\alpha < 1 / \max_k \rho(A_k)$, where $\rho(\cdot)$ is the spectral radius. The maximum spectral radius possible for a network involving $N$ nodes is $N - 1$, meaning that for large systems the parameter $\alpha$ is potentially extremely small and the communicability metric will correlate strongly with the degree of each node.

This dependence on the data also makes it difficult to attach a physical meaning to the parameter and impossible to compare values across different temporal networks. For Google’s PageRank centrality [91], the corresponding model parameter is the perceived (and to some extent measured) probability of following a hyperlink on a web page, rather than typing in a new internet address. No such meaning can be given here.
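The fixed-length partition and the convergence bound on $\alpha$ described above can be sketched as follows. This is a minimal illustration, not the author's implementation: the function names, the `(source, target, time)` event format, and the choice of unweighted adjacency matrices are all assumptions made for the example.

```python
import numpy as np

def fixed_length_partition(events, n_nodes, width):
    """Aggregate (source, target, time) events into equal-width intervals.

    Returns one adjacency matrix per interval, covering [t_min, t_max].
    """
    times = [t for _, _, t in events]
    t_min, t_max = min(times), max(times)
    # The number of intervals follows from the time span alone.
    n_intervals = int((t_max - t_min) // width) + 1
    mats = [np.zeros((n_nodes, n_nodes)) for _ in range(n_intervals)]
    for u, v, t in events:
        k = int((t - t_min) // width)
        mats[k][u, v] = 1.0  # assumption: unweighted, repeated contacts count once
    return mats

def alpha_upper_bound(mats):
    """Return 1 / max_k rho(A_k), or infinity if every spectral radius is zero."""
    rho_max = max(np.max(np.abs(np.linalg.eigvals(A))) for A in mats)
    # Note: numerically computed eigenvalues of a nilpotent matrix may be
    # tiny but non-zero, so an exact zero test is only a rough guard.
    return np.inf if rho_max == 0 else 1.0 / rho_max
```

Any $\alpha$ strictly below the returned bound satisfies the convergence condition for every interval of the partition.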

For real-time implementation there is an added caveat: if the latest interval has an adjacency matrix with a larger spectral radius than those previously calculated, then $\alpha$ needs to be reduced to conform to the new restriction. As a result, all previous communicability scores need to be recalculated with the new parameter, which can become a computationally intensive task.
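The real-time caveat can be illustrated with a small tracker that records the largest spectral radius seen so far and reports when the bound on $\alpha$ tightens. The class name `AlphaTracker` and the `safety` margin are hypothetical, introduced only for this sketch.

```python
import numpy as np

class AlphaTracker:
    """Track the convergence bound on alpha as new intervals arrive."""

    def __init__(self, safety=0.9):
        self.rho_max = 0.0
        # Hypothetical margin keeping alpha strictly below 1 / rho_max.
        self.safety = safety

    def update(self, A):
        """Register a new interval's adjacency matrix.

        Returns True if the bound on alpha tightened, in which case any
        scores computed with the old alpha would need to be recalculated.
        """
        rho = float(np.max(np.abs(np.linalg.eigvals(A))))
        if rho > self.rho_max:
            self.rho_max = rho
            return True
        return False

    @property
    def alpha(self):
        """Largest alpha currently allowed, scaled by the safety margin."""
        return np.inf if self.rho_max == 0 else self.safety / self.rho_max
```

Each `True` returned by `update` corresponds to a full recalculation of earlier scores, which is the cost the paragraph above warns about.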

The fixed-length interval partition has the potential to carry all three types of calculation error.

Acyclic Partition

An acyclic partition of the temporal network consists of a set of intervals covering the time frame such that the adjacency matrix encoding the interactions over each interval is acyclic. This is equivalent to requiring that each adjacency matrix is nilpotent, i.e., there exists an $m$ such that $A^m = 0$. Following previous notation, the partition is formally given as the set $\{I_0, I_1, \dots, I_n\}$ such that for all $k = 0, \dots, n$ there exists an $m$ such that $A_k^m = 0$.
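The nilpotency condition can be checked directly for a single interval. The sketch below assumes a non-negative $N \times N$ adjacency matrix, for which nilpotency is equivalent to $A^N = 0$; the function name is illustrative. It multiplies 0/1 patterns rather than raw entries so that long products cannot overflow.

```python
import numpy as np

def is_nilpotent(A):
    """Return True if a non-negative N x N adjacency matrix satisfies A^N = 0."""
    n = A.shape[0]
    # Only the zero/non-zero pattern matters for non-negative matrices.
    pattern = (np.asarray(A) != 0).astype(int)
    power = np.eye(n, dtype=int)
    for _ in range(n):
        # Re-threshold after each product to keep entries in {0, 1}.
        power = ((power @ pattern) > 0).astype(int)
        if not power.any():
            return True  # some power m <= N vanished, so the graph is acyclic
    return False
```

A directed acyclic graph passes this test, while any interval containing a reciprocated edge fails it.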

The eigenvalues of a nilpotent matrix are all zero [133]. Consequently the spectral radius of each matrix is trivially zero, and the restriction placed on $\alpha$ in the calculation of the communicability matrix reduces to $\alpha$ being a positive finite constant. Free rein to choose $\alpha$ allows a physical meaning to be attached to the parameter, and allows a constant parameter to be used across data sets. In the information transfer setting, $\alpha$ can be seen as the probability of successfully passing information across an edge, provided $0 \le \alpha \le 1$.
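One way to see why any positive finite value of the parameter is admissible: assuming each interval contributes a resolvent factor $(I - \alpha A_k)^{-1}$ to the communicability matrix (an assumption here, suggested by the matrix inversion and the convergence condition mentioned in this section), nilpotency truncates the usual power series to a finite sum.

```latex
(I - \alpha A_k)^{-1}
  = \sum_{j=0}^{\infty} \alpha^{j} A_k^{j}
  = \sum_{j=0}^{m-1} \alpha^{j} A_k^{j},
  \qquad \text{since } A_k^{m} = 0 .
```

For instance, an interval containing only the path $1 \to 2 \to 3$ has $A_k^3 = 0$, so the factor is exactly $I + \alpha A_k + \alpha^2 A_k^2$ for every $\alpha$.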

The average interval length in an acyclic partition depends wholly on the underlying system being studied. For electronic instantaneous communication (Twitter, email, etc.) the most likely cause of cycle creation is reciprocated messages. This can happen on the order of tens of seconds for short messages, to hours for longer messages [134, 135].

The acyclic partition removes the error of counting infinitely many paths within a time frame; however, it does not guarantee that causality is preserved. As the partition produces variable-length intervals, the decay of paths is also uneven.

There are many ways to detect cycle formation in growing networks. Bender et al. [136] and Haeupler et al. [137] use a two-way search to maintain a topological ordering of events as well as to detect cycles. The best of these algorithms run in $O(M_k^{3/2})$ time, where $M_k \ll M$ is the number of events in a given time step. This cost is insignificant compared to the matrix inversion and multiplication involved in the calculation of communicability at each time frame.
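A greedy construction of an acyclic partition can be sketched as below. Note that this uses a plain depth-first reachability check for each new event, not the two-way search of the cited algorithms, so it is simpler but asymptotically slower. The function name and the `(source, target, time)` event format are assumptions, and self-loops are assumed absent.

```python
from collections import defaultdict

def acyclic_partition(events):
    """Greedily split time-ordered events into acyclic intervals.

    A new interval starts whenever adding the next event would close a
    directed cycle in the current interval's aggregated graph.
    Assumes no self-loops (source != target for every event).
    """
    def reaches(adj, start, goal):
        # Iterative depth-first search for a directed path start -> goal.
        stack, seen = [start], {start}
        while stack:
            node = stack.pop()
            if node == goal:
                return True
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False

    intervals, current = [], []
    adj = defaultdict(set)
    for u, v, t in sorted(events, key=lambda e: e[2]):
        # The edge u -> v closes a cycle exactly when v already reaches u.
        if current and reaches(adj, v, u):
            intervals.append(current)
            current, adj = [], defaultdict(set)
        current.append((u, v, t))
        adj[u].add(v)
    if current:
        intervals.append(current)
    return intervals
```

Because an interval is closed only when a cycle is about to form, the resulting intervals have variable length, as noted above.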

Causality-Preserving Partition

A causality-preserving partition ensures that within each interval the causal relationship of any two adjacent events, i.e., events that share at least one node, is preserved. This equates to enforcing that, within an interval, all events for which a node is a source occur after the events for which the same node is a target. This condition also prevents the formation of cycles and hence is a stricter partition than the acyclic partition. As we have explicitly removed the causality error described earlier and there are no cycles in each adjacency matrix, the only error realised with this partition is the incorrect time decay of temporal walks.
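The source-after-target rule lends itself to a single greedy pass: a new interval is started whenever a node would receive after it has already sent within the current interval. The sketch below is one possible reading of that rule, with a hypothetical function name and `(source, target, time)` event format, and it assumes there are no self-loops.

```python
def causality_preserving_partition(events):
    """Greedily split time-ordered events so that, within each interval,
    every node's incoming events precede all of its outgoing events.

    Assumes no self-loops (source != target for every event).
    """
    intervals, current = [], []
    sources = set()  # nodes that have already sent in the current interval
    for u, v, t in sorted(events, key=lambda e: e[2]):
        # v receiving now, after having sent earlier, would violate the rule.
        if v in sources:
            intervals.append(current)
            current, sources = [], set()
        current.append((u, v, t))
        sources.add(u)
    if current:
        intervals.append(current)
    return intervals
```

Any directed cycle must end with an edge into a node that has already sent, so this rule also rules out cycles, which is why the partition is stricter than the acyclic one.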

True Partition

The true partition of the temporal network is where each event occurs exclusively in its own interval. This assumes that no two events can occur at the same time. However, in real-life data temporal events are recorded at discrete times, such as every second. This makes it impossible to always guarantee that a true partition exists; we will nevertheless make the assumption that events occur at unique times or that the chance of two events occurring at the same time is negligible¹.

We will consider this partition of the network as the ‘ground-truth’ when calculating the communicability centrality.
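For completeness, the true partition can be sketched in a few lines. The function name and event format are assumptions; the check makes the unique-timestamp assumption explicit by refusing input in which two events share a time.

```python
def true_partition(events):
    """Place each (source, target, time) event in its own interval.

    Raises ValueError if two events share a timestamp, since the true
    partition assumes that event times are unique.
    """
    ordered = sorted(events, key=lambda e: e[2])
    times = [t for _, _, t in ordered]
    if len(set(times)) != len(times):
        raise ValueError("true partition requires unique event times")
    return [[event] for event in ordered]
```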