Detecting Long-Range Dependence using Parametric Models

Review of Detrended Fluctuation Analysis

A.3 Investigating the Prague Temperature Anomalies

A.3.2 Detecting Long-Range Dependence using Parametric Models

From a previous analysis (not shown), we find FARIMA[2, d, 2] to be the most suitable model for the Prague maximum temperature residuals, with parameters given in Table A.1. The fractional difference parameter was estimated as $\hat{d} = 0.109\,(0.017)$ for this model. Employing the Gaussian limiting distribution, we find a 95% confidence interval for the difference parameter of [0.076, 0.142]; the value d = 0, which characterises SRD processes, is thus not included. We therefore infer the underlying process to be LRD.
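The interval quoted above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming only the estimate and asymptotic standard deviation from Table A.1; the function name `gaussian_ci` is illustrative and not taken from the original analysis.

```python
from statistics import NormalDist


def gaussian_ci(estimate, std_err, level=0.95):
    """Two-sided confidence interval for an asymptotically Gaussian estimator."""
    # Standard normal quantile; about 1.96 for level = 0.95.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return estimate - z * std_err, estimate + z * std_err


# Estimate and asymptotic standard deviation of d for FARIMA[2, d, 2] (Table A.1).
d_hat, se_d = 0.109, 0.017
lower, upper = gaussian_ci(d_hat, se_d)

# d = 0 corresponds to an SRD process; check whether the interval contains it.
excludes_zero = not (lower <= 0.0 <= upper)
print(f"95% CI for d: [{lower:.3f}, {upper:.3f}]; excludes d = 0: {excludes_zero}")
```

Because the lower bound stays clearly above zero, the conclusion does not hinge on rounding of the quoted standard deviation.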

It remains to investigate whether we can correctly identify the realisation of the AR-superposition as coming from an SRD process. This is pursued in the following.

Stochastic Modelling of the AR[1] Superposition

The AR model (A.2) is constructed such that realisations generated from it mimic the characteristics of the Prague temperature anomalies. We thus do not have to account for deterministic components, such as the seasonal cycle or a trend. We start with the set of FARIMA[p, d, q] and ARMA[p, q] models compiled in a previous analysis and reduce it using the model selection strategies: the goodness-of-fit test, the HIC-based model selection and, finally, the likelihood-ratio-based model selection.

[Figure A.9, panels (a) and (b): local slope α plotted against log10 s.]

Figure A.9: Local slopes α (or H) of the fluctuation functions plotted in Figure A.8 for DFA1 (a) and DFA2 (b) of the Prague daily temperature data. The dotted lines border the $1.96\,\hat{\sigma}$ confidence regions of the short-range correlated model (A.2) (dark shadow), the dashed lines those of the long-memory model with H = 0.6 (light shadow).

Parameter   FARIMA[2, d, 2]   ARMA[5, 3]       ARMA[8, 1]
d            0.109(0.017)     –                –
a1           1.210(0.066)      1.548(0.275)     1.762(0.012)
a2          −0.332(0.042)     −0.139(0.544)    −0.879(0.012)
a3          –                 −0.679(0.368)     0.145(0.008)
a4          –                  0.314(0.116)    −0.035(0.008)
a5          –                 −0.050(0.019)     0.005(0.008)
a6          –                 –                −0.003(0.008)
a7          –                 –                 0.002(0.008)
a8          –                 –                −0.005(0.004)
b1          −0.513(0.057)     −0.741(0.275)    −0.955(0.011)
b2          −0.106(0.007)     −0.567(0.326)    –
b3          –                  0.342(0.095)    –
p-val        0.074             0.071            0.075

Table A.1: Maximum likelihood parameter estimates and asymptotic standard deviations (in parentheses) for the FARIMA[2, d, 2], ARMA[5, 3] and ARMA[8, 1] processes obtained from the Prague temperature residuals. The last line gives the p-values of the goodness-of-fit test.

Goodness-of-fit. On the basis of the goodness-of-fit test (2.4) applied to the aforementioned set of models, we reject only the FD (FARIMA[0, d, 0]) model at the 5% level of significance.

HIC Model Selection. Figure A.10 depicts the HIC values for the models considered. It is striking that in most cases the ARMA[p, q] model has a smaller HIC than the corresponding FARIMA[p, d, q] model. Investigating the fractional difference parameter for the latter models (Table A.2) reveals that, except for FARIMA[1, d, 0], FARIMA[1, d, 1] and FARIMA[2, d, 0], all estimates are compatible with d = 0 within one (or, in one case, 1.96) standard deviation. We retain these three FARIMA models, as well as the ARMA[p, q] models with orders (p, q) ∈ {(2, 1), (2, 2), (3, 1), (3, 2)}. Keeping other FARIMA[p, d, q] models is not meaningful, because their fractional difference parameter is compatible with zero and they are thus equivalent to the corresponding ARMA[p, q] processes.
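The compatibility check described here can be made explicit. The sketch below uses the estimates and asymptotic standard deviations listed in Table A.2 and sorts each model by the ratio of the estimate to its standard deviation; the thresholds 1 and 1.96 are those named in the text, while the helper name `classify` and the category labels are illustrative assumptions.

```python
# (model order, estimate of d, asymptotic standard deviation), copied from Table A.2.
estimates = [
    ("[1, d, 0]", 0.148, 0.010), ("[1, d, 1]", 0.158, 0.011),
    ("[2, d, 0]", 0.160, 0.011), ("[2, d, 1]", 0.000, 0.015),
    ("[2, d, 2]", 0.000, 0.037), ("[3, d, 1]", 0.000, 0.037),
    ("[3, d, 2]", 0.012, 0.017), ("[4, d, 1]", 0.000, 0.040),
    ("[3, d, 3]", 0.028, 0.038), ("[4, d, 2]", 0.027, 0.038),
    ("[5, d, 1]", 0.000, 0.042), ("[4, d, 3]", 0.023, 0.022),
    ("[5, d, 2]", 0.000, 0.041),
]


def classify(d_hat, sigma):
    """Label an estimate by its distance from d = 0 in standard deviations."""
    ratio = abs(d_hat) / sigma
    if ratio <= 1.0:
        return "within 1 sd"
    if ratio <= 1.96:
        return "within 1.96 sd"
    return "not compatible with d = 0"


labels = {model: classify(d_hat, sigma) for model, d_hat, sigma in estimates}
nontrivial = [m for m, lab in labels.items() if lab == "not compatible with d = 0"]
borderline = [m for m, lab in labels.items() if lab == "within 1.96 sd"]

print("non-trivial d:", nontrivial)
print("compatible with zero only within 1.96 sd:", borderline)
```

Only the three low-order models end up in the first list, which is why they are the only FARIMA candidates carried forward.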

Likelihood-Ratio Test. With a likelihood-ratio test, we investigate whether the three models with a non-trivial fractional difference parameter are admissible simplifications of FARIMA[2, d, 1], the simplest model with a trivial difference parameter. The p-values in Table A.3 (right) reveal that we can reject this hypothesis in all three cases at the 5% level (or even the 1% level) of significance. This implies that there is no evidence for an LRD process underlying this data series.

Among the ARMA[p, q] models, we find ARMA[2, 2] and ARMA[3, 1] to be admissible simplifications of ARMA[3, 2] (Table A.3, left, lines 1 and 2). Both models can be further simplified to ARMA[2, 1] (lines 3 and 4). The latter is the most suitable model out of the canon we started with for the realisation of the AR-superposition (A.2).
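The mechanics of such a nested-model comparison can be sketched as follows. This is an illustration under stated assumptions, not the procedure used for Table A.3: it relies on the standard asymptotic result that twice the log-likelihood difference of nested models is chi-square distributed with as many degrees of freedom as parameters removed, and the log-likelihood values in the usage example are hypothetical placeholders. The names `chi2_sf` and `lr_test` are likewise illustrative.

```python
import math


def chi2_sf(x, df):
    """Survival function P(X > x) of a chi-square variable with integer df >= 1."""
    if x <= 0.0:
        return 1.0
    # Closed forms for df = 1 and df = 2, then the recurrence
    # Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1).
    if df % 2 == 1:
        q, k = math.erfc(math.sqrt(x / 2.0)), 1
    else:
        q, k = math.exp(-x / 2.0), 2
    while k < df:
        q += (x / 2.0) ** (k / 2.0) * math.exp(-x / 2.0) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


def lr_test(loglik_f, loglik_g, df):
    """p-value for model g being an admissible simplification of model f.

    loglik_f: maximised log-likelihood of the more complex model f
    loglik_g: maximised log-likelihood of the nested, simpler model g
    df:       number of parameters removed when going from f to g
    """
    statistic = max(2.0 * (loglik_f - loglik_g), 0.0)
    return statistic, chi2_sf(statistic, df)


# Hypothetical log-likelihoods, for illustration only (not values from the study).
stat, p_value = lr_test(loglik_f=-69520.0, loglik_g=-69520.1, df=1)
print(f"LR statistic = {stat:.3f}, p-value = {p_value:.3f}")
```

A large p-value means the simpler model g cannot be rejected in favour of f, which is the sense in which g is called an admissible simplification.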

Using the full-parametric modelling approach, we are thus able to correctly identify the realisation of a superposition of three AR[1] processes as coming from an SRD process. This was not possible by means of DFA. The original model, a superposition of AR[1] processes (A.2), was not included in the model canon we used to describe the example series. However, such a superposition can be well approximated by ARMA[p, q] processes.
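To make the notion of an AR[1] superposition concrete, the following sketch generates a weighted sum of three independent AR[1] processes and estimates its sample autocorrelation at a few lags. The coefficients, weights, series length and seed are hypothetical choices for illustration; they are not the parameters of model (A.2), which are not repeated in this section.

```python
import random


def simulate_ar1(phi, n, rng, burn_in=500):
    """Simulate n values of x_t = phi * x_{t-1} + eps_t with unit-variance Gaussian noise."""
    x, values = 0.0, []
    for t in range(n + burn_in):
        x = phi * x + rng.gauss(0.0, 1.0)
        if t >= burn_in:  # discard the transient from the arbitrary start value
            values.append(x)
    return values


def superpose(components, weights):
    """Weighted pointwise sum of equally long component series."""
    length = len(components[0])
    return [sum(w * c[t] for w, c in zip(weights, components)) for t in range(length)]


def sample_acf(x, lag):
    """Biased sample autocorrelation of the series x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) * (v - mean) for v in x)
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return cov / var


rng = random.Random(42)           # fixed seed so the run is reproducible
phis = [0.99, 0.9, 0.5]           # hypothetical AR[1] coefficients (slow, medium, fast)
weights = [0.2, 0.4, 0.4]         # hypothetical mixing weights
n = 5000

components = [simulate_ar1(phi, n, rng) for phi in phis]
series = superpose(components, weights)

for lag in (1, 10, 100):
    print(f"sample ACF at lag {lag}: {sample_acf(series, lag):.3f}")
```

Each component has an exponentially decaying autocorrelation, so the sum is SRD by construction; a component with a coefficient close to one can nevertheless keep correlations visible over a wide range of lags in a finite record.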

[Figure A.10: HIC (ordinate) plotted against model order [p, q] (abscissa), from [0, 0] to [8, 7], for ARMA[p, q] and FARIMA[p, d, q] models.]

Figure A.10: HIC for various ARMA[p, q] (green) and FARIMA[p, d, q] (red) models fitted to the realisation of the AR model (A.2). The model orders [p, q] are plotted along the abscissa.

Model       $\hat{d}\,(\sigma_{\hat{d}})$
[1, d, 0]   0.148(0.010)
[1, d, 1]   0.158(0.011)
[2, d, 0]   0.160(0.011)
[2, d, 1]   0.000(0.015)
[2, d, 2]   0.000(0.037)
[3, d, 1]   0.000(0.037)
[3, d, 2]   0.012(0.017)
[4, d, 1]   0.000(0.040)
[3, d, 3]   0.028(0.038)
[4, d, 2]   0.027(0.038)
[5, d, 1]   0.000(0.042)
[4, d, 3]   0.023(0.022)
[5, d, 2]   0.000(0.041)

Table A.2: Estimates and asymptotic standard deviation of the fractional difference parameter obtained for the FARIMA[p, d, q]models with smallest HIC.

ARMA[p, q] (left)

     Model f     Model g     p-val
1    [3, 2]      [2, 2]      1
2    [3, 2]      [3, 1]      0.713
3    [2, 2]      [2, 1]      0.584
4    [3, 1]      [2, 1]      0.939

FARIMA[p, d, q] (right)

     Model f     Model g     p-val
1    [2, d, 1]   [2, d, 0]   <0.001
2    [2, d, 1]   [1, d, 1]   <0.001
3    [2, d, 1]   [1, d, 0]   <0.001

Table A.3: p-values for a likelihood-ratio test of model g being an admissible simplification of model f for ARMA[p, q] and FARIMA[p, d, q] models of the realisation stemming from the AR[1]-superposition (A.2).

A.4 Summary

In a simulation study, we investigated bias and variance for a DFA-based estimator for the Hurst exponent using realisations of power-law noise. These properties of the estimator are not only influenced by the time series length but also by the Hurst exponent of the underlying process. Without knowledge about bias and variance, inference about the Hurst exponent cannot be made.

We further considered the inference of LRD by means of DFA with respect to the notions of sensitivity and specificity. Inferring that an LRD process underlies a given time series requires more than showing that the data are compatible with an LRD process in a certain range of scales. Other possible correlation structures, especially SRD, also have to be excluded.

Power-law-like behaviour in some range of scales of the DFA fluctuation function alone is frequently taken as evidence for LRD. To reliably infer power-law scaling, it must not be assumed but has to be established. This can be done by estimating local slopes and investigating them for constancy over a sufficient range. However, finite data sets bring along natural variability. To decide whether a fluctuating slope estimate can be considered constant, we calculated empirical confidence intervals for an LRD and a simple SRD model.
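A local slope estimate of this kind can be sketched in a few lines. The version below uses centred finite differences of log10 F(s) against log10 s, which is one simple choice and an assumption here; the estimator behind Figure A.9 may differ (for instance, a straight-line fit in a sliding window). The synthetic fluctuation function is an exact power law with exponent 0.6, chosen only to check that the slope is recovered.

```python
import math


def local_slopes(scales, fluctuations):
    """Centred finite-difference slopes of log10 F(s) with respect to log10 s.

    Returns (log10 s, slope) pairs for the interior scales.
    """
    log_s = [math.log10(s) for s in scales]
    log_f = [math.log10(f) for f in fluctuations]
    return [
        (log_s[i], (log_f[i + 1] - log_f[i - 1]) / (log_s[i + 1] - log_s[i - 1]))
        for i in range(1, len(log_s) - 1)
    ]


# Synthetic example: F(s) = s^0.6 on logarithmically spaced scales from 10^1.5 to 10^4.
scales = [10 ** (1.5 + 0.1 * k) for k in range(26)]
fluctuations = [s ** 0.6 for s in scales]

slopes = local_slopes(scales, fluctuations)
spread = max(a for _, a in slopes) - min(a for _, a in slopes)
print(f"mean local slope: {sum(a for _, a in slopes) / len(slopes):.3f}, spread: {spread:.2e}")
```

For real data the slopes fluctuate from scale to scale, which is why they need to be compared against confidence regions obtained from simulated realisations of candidate models rather than judged by eye.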

Discussing typical difficulties of interpreting DFA results, we note that scaling cannot be concluded from a straight-line fit to the fluctuation function in a log-log representation. Additionally, we show that a local slope estimate $\hat{H}_{\mathrm{DFA}n} > 0.5$ for large scales does not necessarily imply long memory. If the length of the time series is not sufficiently large compared to the time scales involved, $\hat{H}_{\mathrm{DFA}n} = 0.5$ may not be reached even for SRD processes. Finally, we demonstrated that it is not valid to infer a scaling region of the ACF from a finite scaling region of the fluctuation function.

With the Prague temperature anomalies and a corresponding artificial series from an SRD process, we exemplified the problems and pitfalls discussed. Using DFA, we could not discriminate the Prague record from the artificial series. Furthermore, by means of a log-log plot, we classified the underlying process of both series as LRD, leading to a false positive result for the SRD process. A reliable identification was possible with the full-parametric ansatz (Section 3.4).

Because it is always possible to find an SRD process to describe a finite set of data, some criterion is needed to evaluate the performance of the description, taking the complexity of the model into account. We thus approach the problem of distinguishing SRD and LRD processes on the basis of a realisation from the parametric modelling point of view developed in Chapter 3.

Appendix B

Long-Range Dependence – Effects,

In document Spectral Analysis of Stochastic Processes (Page 60-65)