Spurious detection or overestimation [74] is not uncommon and has been reported for causality measures in [100,51,101,79,71]. These spurious values are caused by bias related to the individual dynamics, the state space reconstruction, the coupling measure and so on. The bias of an estimator is the difference between the estimator's expectation value and its theoretical value. Bias in estimation causes non-zero spurious values when there is no causal effect, and this problem is not unique to Transfer Entropy [79]. The danger is that positive bias may be misinterpreted
Figure 8.13: Figure (8.12) with log values on the y axis. [Plot: I(X,Z) versus ns; simulations with S = 100, 1000, 10000, 100000, 1000000.]
Figure 8.14: Covariance Γ(X, Z) versus number of states ns for null model. Analytical values are 0 and simulated values acquired using equation (1.2) on simulated data of varying sample size S. [Plot: simulations with S = 100, 1000, 10000, 100000.]
as weak coupling when there is actually no causal effect. Therefore there needs to be a way to indicate significance and reduce bias so that the causal measure gives zero values when there is no causal relationship. In [100, 51], correction terms that cancel out the bias-related errors are suggested. An alternative way to cope with finite sampling effects is significance testing. Surrogates have been suggested as a form of significance testing for Transfer Entropy [101,102,78,75].
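The spurious positive values described here can be reproduced with a minimal sketch. It assumes a plug-in (relative frequency) estimator of mutual information and two independent, uniformly distributed discrete variables, so the theoretical value is 0. The function names, the number of states and the sample sizes are illustrative choices and are not taken from the simulations reported in this chapter.

```python
import math
import random
from collections import Counter


def plugin_mutual_information(x, z):
    """Plug-in estimate of I(X;Z) in nats from two equally long sequences."""
    n = len(x)
    joint = Counter(zip(x, z))
    count_x = Counter(x)
    count_z = Counter(z)
    # sum over observed cells of p(a,b) * log( p(a,b) / (p(a) p(b)) )
    return sum(
        (c / n) * math.log(c * n / (count_x[a] * count_z[b]))
        for (a, b), c in joint.items()
    )


def mean_null_estimate(n_states, n_samples, n_trials, rng):
    """Average estimate over independent uniform X and Z (theoretical value 0)."""
    total = 0.0
    for _ in range(n_trials):
        x = [rng.randrange(n_states) for _ in range(n_samples)]
        z = [rng.randrange(n_states) for _ in range(n_samples)]
        total += plugin_mutual_information(x, z)
    return total / n_trials


rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
bias_by_sample_size = {
    s: mean_null_estimate(n_states=10, n_samples=s, n_trials=20, rng=rng)
    for s in (100, 1000, 10000)
}
for s, bias in bias_by_sample_size.items():
    print(f"S={s:6d}  mean estimate under the null = {bias:.5f}")
```

Since the theoretical value is 0, the printed mean is itself an estimate of the bias. It stays positive and shrinks as the sample size S grows, which is the behaviour that can be mistaken for weak coupling.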
8.3.1 Surrogates for significance testing
When compared to G-causality, it is often pointed out that significance testing is not available for Transfer Entropy. Schreiber outlined that directionality can only be concluded if the value of Transfer Entropy is 0 in one direction and nonzero in the other. However, due to bias, the value 0 is not normally obtained for Transfer Entropy in real data sets. [79] points out the importance of having a significance test for causality measures in order to avoid false directionality conclusions.
[78] claims that the only practical significance test for Transfer Entropy is probably in the form of surrogates. Surrogate data sets are synthetically generated data which ideally preserve all properties of the underlying system except the one being tested [101]. There are many different types of surrogates serving different purposes. Fourier surrogates randomize the phases in the frequency domain [90, 101]. Randomizing temporal values has been done using permutation surrogates [78], the time shift test [102] and twin surrogates [101]. Surrogates have also been used to test whether or not data sets are nonlinear [75]. Surrogates in the form of reshuffled time series are utilized in [72,18]. The idea is to break the coupling (causal link) but maintain the dynamics, in the hope that one can differentiate cause and effect from any other dynamics.
Significance testing with surrogates is usually done as a standard one-sided hypothesis test where the null hypothesis is that the two systems (time series) are independent. Attempts are made to reject the null hypothesis with a certain confidence level. A more inclusive test taking into account different directions and non-directionality is proposed in [79]. Rather than testing with surrogates separately, it has also been suggested that significance testing can be done in the form of modified information theoretic functionals [85].
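The one-sided test can be sketched as follows, under several assumptions that are not from the cited works: a plug-in Transfer Entropy estimator with history length 1 and lag 1 (a simplified stand-in for the definition in equation (4.5)), permutation surrogates of the source series, and a toy coupled system invented for the demonstration. All function and variable names are hypothetical.

```python
import math
import random
from collections import Counter


def entropy(symbols):
    """Plug-in Shannon entropy (nats) of a sequence of hashable symbols."""
    n = len(symbols)
    return -sum((c / n) * math.log(c / n) for c in Counter(symbols).values())


def transfer_entropy(source, target):
    """Plug-in T(source -> target), history length 1 and lag 1 (assumed form)."""
    nxt, past, src = target[1:], target[:-1], source[:-1]
    # I(next; src | past) written as a combination of joint entropies
    return (entropy(list(zip(nxt, past))) + entropy(list(zip(past, src)))
            - entropy(past) - entropy(list(zip(nxt, past, src))))


def surrogate_p_value(source, target, n_surrogates, rng):
    """One-sided permutation test; H0: source and target are independent."""
    observed = transfer_entropy(source, target)
    exceed = 0
    for _ in range(n_surrogates):
        shuffled = source[:]
        rng.shuffle(shuffled)  # breaks the coupling and the temporal order of the source
        if transfer_entropy(shuffled, target) >= observed:
            exceed += 1
    # add-one correction so the p-value is never exactly zero
    return observed, (exceed + 1) / (n_surrogates + 1)


rng = random.Random(1)  # arbitrary seed
n = 2000
y = [rng.randrange(4) for _ in range(n)]
# toy coupling: x copies the previous y with probability 0.8, otherwise it is random
x_coupled = [0] + [y[t] if rng.random() < 0.8 else rng.randrange(4)
                   for t in range(n - 1)]
x_null = [rng.randrange(4) for _ in range(n)]

te_coupled, p_coupled = surrogate_p_value(y, x_coupled, 99, rng)
te_null, p_null = surrogate_p_value(y, x_null, 99, rng)
print(f"coupled: T = {te_coupled:.4f}, p = {p_coupled:.2f}")
print(f"null:    T = {te_null:.4f}, p = {p_null:.2f}")
```

The null hypothesis is rejected when the p-value falls below the chosen level. With 99 surrogates the smallest attainable p-value is 0.01, so the number of surrogates bounds the significance level that can be claimed.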
8.3.2 Effective and Corrected Transfer Entropy
From Figures (8.5) and (8.6), it seems that the values are simply shifted upwards, so that if one could subtract the values related to the shift then perhaps the true values would be obtained. This is the idea behind the effective and corrected Transfer Entropy. Effective Transfer Entropy [71] between two time series is a modification of Transfer Entropy defined as the difference between the Transfer Entropy computed on the original time series and the Transfer Entropy computed on a surrogate time series in which the driving process is randomly shuffled. Therefore, in relation to our definition of Transfer Entropy in equation (4.5), the effective Transfer Entropy can be defined as
\[ ET^{(\tau)}_{YX} = T^{(\tau)}_{YX} - T^{(\tau)}_{Y_S X} \tag{8.1} \]
where $Y_S$ is the randomly shuffled surrogate of time series Y. Figures (8.15) and (8.16) display the values of effective Transfer Entropy on Case 3 of the general model and on the null model, in direct contrast to Figures (8.9) and (8.10).
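A short argument for why the subtraction can remove the bias in the null case, under an assumption that is not stated in the source: suppose Y is independent and identically distributed in time and independent of X. Shuffling Y then leaves the joint distribution of the pair of series unchanged, so the estimate on the original data and the estimate on the surrogate are identically distributed, and their expectations cancel. Writing hats for estimates computed from finite data:

```latex
\[
\mathbb{E}\big[\widehat{ET}^{(\tau)}_{YX}\big]
  = \mathbb{E}\big[\widehat{T}^{(\tau)}_{YX}\big]
  - \mathbb{E}\big[\widehat{T}^{(\tau)}_{Y_S X}\big]
  = 0 .
\]
```

When Y has temporal structure of its own, shuffling also alters its dynamics, so the cancellation need not be exact.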
Figure 8.15: Effective Transfer Entropy $ET^{(t_Z)}_{ZX}$ versus number of states ns for Case 3. Analytical values are obtained with equation (7.17) and simulated values acquired using equation (8.1) on simulated data of varying sample size S. [Plot: analytics and simulations with S = 1000, 10000, 100000.]
Figure 8.16: Effective Transfer Entropy $ET^{(t_Z)}_{ZX}$ versus number of states ns for null model. Analytical values are 0 and simulated values acquired using equation (8.1) on simulated data of varying sample size S. [Plot: simulations with S = 100, 1000, 10000.]
The corrected Transfer Entropy suggested in [82] generalizes the effective Transfer Entropy by taking the average value over M permutation surrogates instead of just one realisation, such that
\[ ET^{(\tau)}_{YX} = T^{(\tau)}_{YX} - \frac{1}{M} \sum_{i=1}^{M} T^{(\tau)}_{Y_{S_i} X} \tag{8.2} \]
where $Y_{S_i}$ is the $i$th randomly shuffled surrogate of time series Y. The reasoning behind this is that surrogates have bias of their own [101], and by taking the average over different realisations the bias and variance are reduced, producing a much more stable and smooth estimate of Transfer Entropy on the shuffled surrogate. From Figure (8.16), using a sufficient number of surrogate estimates, it may be possible to identify 0 values of Transfer Entropy on the toy model, where overestimation is only due to insufficient data to get good probabilities. However, in real data sets there are many other factors to be taken into account in terms of obtaining good probability estimates.
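The surrogate correction of equations (8.1) and (8.2) can be sketched as below. The sketch assumes a plug-in Transfer Entropy estimator with history length 1 and lag 1, which is a simplified stand-in for equation (4.5), and a null model of two independent uniform series. The function names, the sample size and the number of states are hypothetical choices for illustration and do not reproduce the simulations behind Figure (8.16).

```python
import math
import random
from collections import Counter


def entropy(symbols):
    """Plug-in Shannon entropy (nats) of a sequence of hashable symbols."""
    n = len(symbols)
    return -sum((c / n) * math.log(c / n) for c in Counter(symbols).values())


def transfer_entropy(source, target):
    """Plug-in T(source -> target), history length 1 and lag 1 (assumed form)."""
    nxt, past, src = target[1:], target[:-1], source[:-1]
    return (entropy(list(zip(nxt, past))) + entropy(list(zip(past, src)))
            - entropy(past) - entropy(list(zip(nxt, past, src))))


def corrected_transfer_entropy(source, target, n_surrogates, rng):
    """Raw estimate minus the mean estimate over shuffled-source surrogates.

    n_surrogates=1 corresponds to the single-surrogate form of equation (8.1).
    """
    raw = transfer_entropy(source, target)
    surrogate_sum = 0.0
    for _ in range(n_surrogates):
        shuffled = source[:]
        rng.shuffle(shuffled)  # one permutation surrogate of the driving series
        surrogate_sum += transfer_entropy(shuffled, target)
    return raw - surrogate_sum / n_surrogates


rng = random.Random(2)  # arbitrary seed
n, n_states = 500, 8
y = [rng.randrange(n_states) for _ in range(n)]
x = [rng.randrange(n_states) for _ in range(n)]  # independent of y: true value is 0

raw = transfer_entropy(y, x)
effective = corrected_transfer_entropy(y, x, 1, rng)
corrected = corrected_transfer_entropy(y, x, 50, rng)
print(f"raw = {raw:.4f}, effective (M=1) = {effective:.4f}, "
      f"corrected (M=50) = {corrected:.4f}")
```

In this undersampled null case the raw estimate is clearly positive, while the surrogate-corrected values fall close to 0 and may be slightly negative, as in Figure (8.16). Averaging over more surrogates reduces the scatter contributed by the surrogate term only; the fluctuation of the raw estimate itself remains.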