CONSTRUCTING THE ARMA BASE PROCESS - EXTENDED ARTA PROCESSES


CHAPTER 4. EXTENDED ARTA PROCESSES

4.2. CONSTRUCTING THE ARMA BASE PROCESS

Since the autocorrelation of a stationary ARMA(p, q) process is given by ρk = γ(k)/γ(0), the fraction can be reduced by eliminating σ². Thus, the ρk are independent of σ², and one can set σ² such that the {Zt; t = 1, 2, . . .} have a standard normal distribution, i.e. are N(0, 1), without modifying the autocorrelation structure of the base process.

[37] provides another formula for the autocovariance that is more appropriate for actually computing the autocovariances:

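\[ \gamma(k) - \alpha_1 \gamma(k-1) - \ldots - \alpha_p \gamma(k-p) = \sigma^2 \sum_{k \le j \le q} \beta_j \psi_{j-k}, \qquad 0 \le k < \max(p, q+1) \tag{4.7} \]

and

\[ \gamma(k) - \alpha_1 \gamma(k-1) - \ldots - \alpha_p \gamma(k-p) = 0, \qquad k \ge \max(p, q+1). \tag{4.8} \]

Defining β0 = 1, βj = 0 for j > q and αj = 0 for j > p, the ψj can be computed from

\[ \psi_j - \sum_{0 < k \le j} \alpha_k \psi_{j-k} = \beta_j, \qquad 0 \le j < \max(p, q+1) \]

and

\[ \psi_j - \sum_{0 < k \le p} \alpha_k \psi_{j-k} = 0, \qquad j \ge \max(p, q+1). \]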

Thus, for a given ARMA(p, q) process with σ̃² being the variance of the white noise, one can solve Equations 4.7 and 4.8 to obtain the autocovariance at lag 0, γ̃(0), and then set the new variance to

σ² = σ̃²/γ̃(0) (4.9)

resulting in a new N(0, 1) process with the same autocorrelations as the old process.
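The rescaling in Equation 4.9 can be illustrated with a minimal Python sketch. It assumes NumPy is available; the function names (psi_weights, arma_autocovariance, normalized_innovation_variance) are hypothetical and not taken from the original work. The sketch treats Equations 4.7 and 4.8 as one linear system in γ(0), . . . , γ(p), using γ(−k) = γ(k) and the fact that the right-hand side of Equation 4.7 is an empty sum for k > q.

```python
import numpy as np

def psi_weights(alpha, beta, n):
    """psi_0..psi_{n-1} from psi_j - sum_{0<k<=j} alpha_k psi_{j-k} = beta_j."""
    p, q = len(alpha), len(beta)
    psi = np.zeros(n)
    for j in range(n):
        # beta_0 = 1 and beta_j = 0 for j > q, as defined in the text
        b_j = 1.0 if j == 0 else (beta[j - 1] if j <= q else 0.0)
        psi[j] = b_j + sum(alpha[k - 1] * psi[j - k]
                           for k in range(1, min(j, p) + 1))
    return psi

def arma_autocovariance(alpha, beta, sigma2, max_lag):
    """gamma(0)..gamma(max_lag) of a stationary ARMA(p, q) process."""
    p, q = len(alpha), len(beta)
    b = [1.0] + list(beta)
    psi = psi_weights(alpha, beta, q + 1)

    def rhs(k):
        # right-hand side of Equation 4.7; the sum is empty (zero) for k > q
        return sigma2 * sum(b[j] * psi[j - k] for j in range(k, q + 1))

    # linear system for gamma(0)..gamma(p), using gamma(-k) = gamma(k)
    A = np.eye(p + 1)
    for k in range(p + 1):
        for i in range(1, p + 1):
            A[k, abs(k - i)] -= alpha[i - 1]
    b_vec = np.array([rhs(k) for k in range(p + 1)], dtype=float)
    gamma = list(np.linalg.solve(A, b_vec))

    # higher lags follow recursively from the same equations
    for k in range(p + 1, max_lag + 1):
        gamma.append(rhs(k) + sum(alpha[i - 1] * gamma[k - i]
                                  for i in range(1, p + 1)))
    return np.array(gamma[: max_lag + 1])

def normalized_innovation_variance(alpha, beta, sigma2_old):
    """New white-noise variance according to Equation 4.9."""
    gamma0_old = arma_autocovariance(alpha, beta, sigma2_old, 0)[0]
    return sigma2_old / gamma0_old

# Illustrative ARMA(1, 1) with arbitrarily chosen coefficients
alpha, beta = [0.5], [0.4]
sigma2_new = normalized_innovation_variance(alpha, beta, sigma2_old=2.0)
gamma_new = arma_autocovariance(alpha, beta, sigma2_new, 3)
```

With the new variance, gamma_new[0] should equal 1 up to rounding, while the ratios gamma_new[k]/gamma_new[0] are the same as before the rescaling.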

Finding suitable base process autocorrelations ρ = (ρ1, ρ2, . . . , ρr) such that the extended ARTA process has autocorrelations ρ̂ = (ρ̂1, ρ̂2, . . . , ρ̂r) requires a numerical search procedure. Observe from Equation 4.3 that it is only possible to compute ρ̂h from a given ρh, but not the other way round. Since in our case ρ̂h is given (i.e. estimated from the trace) and we have to determine the corresponding ρh, this has to be done numerically by a search algorithm. More precisely, we are searching for the ρh that minimizes

(ρ̂h − ω(ρh))² (4.10)

where ρ̂h is the desired ARTA correlation and ω(ρh) is the ARTA correlation that results from a base process correlation ρh. The minimum of Equation 4.10 can be determined by a golden section search [131]. This search algorithm always maintains three points x1, x2, x3, where x1 and x3 are the lower and upper bound of the interval under consideration and x2 is a point within this interval. The initial values for x1 and x3 are set according to Proposition 4.1, i.e. x1 and x3 define the interval [−1, 0] if ρ̂h < 0 or [0, 1] if ρ̂h > 0. For ρ̂h = 0 no search is necessary, since this implies ρh = 0. The optimal value for x2 divides the interval [x1, x3] such that x2 has a fractional distance of 0.38197 to x1 and 0.61803 to x3 (cf. [131]). The search procedure then iteratively selects a new point x4 that divides the interval [x2, x3] into the fractions mentioned above, evaluates Equation 4.10 at x2 and x4 and, depending on the outcome, continues to search either in the interval [x1, x4] with interior point x2 or in the interval [x2, x3] with interior point x4, until an interval of a width less than a given tolerance has been found. The mid-point of that interval is the base process autocorrelation ρh that results in a correlation for the extended ARTA model that is close enough to ρ̂h.

For autocorrelation coefficients estimated from real traces it is very likely that coefficients that are only a few lags apart have similar values, e.g. the difference between ρ̂h and ρ̂h+1 might be small in many cases. For these lags the intervals examined by the search algorithm are identical for the first iterations of the procedure. To improve the performance of the search algorithm, already computed pairs (ρh, ω(ρh)) should be saved in a table such that they only have to be computed once and can be looked up if they are required again.

This way the autocorrelation structure can be determined numerically up to an arbitrary accuracy. A similar search procedure for ARTA models is given in [48].
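The search can be sketched in Python as follows. This is one common variant of golden section search, not necessarily the exact procedure of [131]. Since Equation 4.3 is not repeated here, the sketch uses a stand-in for ω: the known lag correlation of Yt = exp(Zt) for a standard lognormal marginal. The names omega_lognormal, find_base_correlation and table are hypothetical.

```python
import math

INV_PHI = (math.sqrt(5.0) - 1.0) / 2.0  # 0.61803..., the golden section fraction

def omega_lognormal(rho):
    """Stand-in for omega (assumption): correlation of exp(Z_t), Z_t ~ N(0, 1)."""
    return (math.exp(rho) - 1.0) / (math.e - 1.0)

def find_base_correlation(target, omega, table, tol=1e-8):
    """Golden section search for the rho that minimizes (target - omega(rho))^2.

    `table` maps already evaluated rho values to omega(rho), so that several
    lags with similar targets can share evaluations.
    """
    if target == 0.0:
        return 0.0  # no search necessary
    # initial interval as described in the text: [-1, 0] or [0, 1]
    x1, x3 = (-1.0, 0.0) if target < 0.0 else (0.0, 1.0)

    def cost(rho):
        if rho not in table:
            table[rho] = omega(rho)
        return (target - table[rho]) ** 2

    x2 = x3 - INV_PHI * (x3 - x1)  # fractional distance 0.38197 from x1
    x4 = x1 + INV_PHI * (x3 - x1)  # fractional distance 0.61803 from x1
    while x3 - x1 > tol:
        if cost(x2) < cost(x4):
            # continue in [x1, x4]; the old x2 becomes the upper interior point
            x3, x4 = x4, x2
            x2 = x3 - INV_PHI * (x3 - x1)
        else:
            # continue in [x2, x3]; the old x4 becomes the lower interior point
            x1, x2 = x2, x4
            x4 = x1 + INV_PHI * (x3 - x1)
    return 0.5 * (x1 + x3)

# Two neighbouring lags with similar target correlations share one table
table = {}
rho_a = find_base_correlation(0.50, omega_lognormal, table)
evaluations_first = len(table)
rho_b = find_base_correlation(0.52, omega_lognormal, table)
```

Because both searches start from the same interval, their first iterations look up identical keys in the table, which is the saving described above. A dictionary keyed on exact floating-point values only helps while the examined points coincide exactly; this holds for the early iterations here.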

Once the base process autocorrelation structure ρ = (ρ1, ρ2, . . . , ρr) has been determined, an ARMA(p, q) process has to be constructed that exhibits this structure. This can be done by using a general purpose optimization algorithm that minimizes the difference between ρ and the autocorrelations of the ARMA model that is constructed during the minimization process.

Recall that we required the ARMA(p, q) process to be stationary; thus, we additionally have to penalize non-stationary solutions.

The stationarity of an ARMA(p, q) model only depends on the AR coefficients αi of the process, i.e. the process is stationary if the zeros of the polynomial α(z) = 1 − α1z − α2z² − . . . − αpz^p lie outside the unit circle [35]. More formally, the ARMA(p, q) process is stationary if and only if α(z) ≠ 0 for all z ∈ C with |z| ≤ 1.
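The stationarity condition can be checked numerically from the roots of α(z). The following is a minimal sketch assuming NumPy; the helper names ar_roots and is_stationary are hypothetical.

```python
import numpy as np

def ar_roots(alpha):
    """Roots of alpha(z) = 1 - alpha_1 z - ... - alpha_p z^p."""
    # np.roots expects the coefficients ordered from the highest power down
    coeffs = np.concatenate((-np.asarray(alpha, dtype=float)[::-1], [1.0]))
    return np.roots(coeffs)

def is_stationary(alpha):
    """True if all roots of alpha(z) lie strictly outside the unit circle."""
    return bool(np.all(np.abs(ar_roots(alpha)) > 1.0))

# AR(1) with alpha_1 = 0.5 has the single root z = 2
stationary_example = is_stationary([0.5])
# alpha_1 + alpha_2 > 1 puts one root inside the unit circle
nonstationary_example = is_stationary([0.5, 0.6])
```

The roots are complex in general, so the comparison uses their modulus.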

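Hence, we use the following goal function to fit an ARMA(p, q) process to a set of given autocorrelations:

\[ \operatorname*{arg\,min}_{\alpha_1, \ldots, \alpha_p, \beta_1, \ldots, \beta_q} \; \sum_{i=1}^{r} \left( \frac{\rho_i^{*}}{\rho_i} - 1 \right)^{2} + \varrho \sum_{\xi,\, \alpha(\xi) = 0} \min\left(0, \left(\xi - (1 + \varepsilon)\right)\right)^{2}. \tag{4.11} \]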

The first term is the objective function that measures the difference between the autocorrelation coefficients; the second term is the penalty function. The ρi in Equation 4.11 are the autocorrelation coefficients to be achieved and the ρ*i are computed from the ARMA process as described in Equations 4.7 and 4.8. Thus, if the α1, . . . , αp, β1, . . . , βq describe a stationary ARMA process, the ρ*i are the autocorrelation coefficients of the ARMA(p, q) model that is constructed during the minimization process. If the ARMA process is not stationary, the values can be computed anyway, but cannot be interpreted as autocorrelation coefficients. In this case the penalty function is used to obtain a larger value for the goal function, forcing the optimization algorithm to leave the non-stationary region.

For the penalty function all roots ξ of α(z) are considered; since the roots are complex in general, ξ is to be read as the modulus |ξ| in the comparison. Let ε > 0 be some small constant, i.e. for an implementation of Equation 4.11 ε should be the smallest value such that 1 + ε ≠ 1. Then the term ξ − (1 + ε) is non-negative if the roots are outside the unit circle, i.e. the model is stationary, and min(0, (ξ − (1 + ε)))² is zero. For non-stationary models at least one root lies inside the unit circle and min(0, (ξ − (1 + ε)))² > 0. The penalty function is multiplied by a factor ϱ (the weight of the second term in Equation 4.11), which is increased in each iteration of the minimization to ensure that the algorithm leaves the area with non-stationary solutions. For a stationary solution the goal function only depends on the difference between the autocorrelation coefficients, and the optimal solution that exactly matches the desired autocorrelations results in a goal function value of zero.
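The goal function can be sketched in Python under a few labeled assumptions: the fit term is taken as a sum of squared relative deviations, the penalty uses the modulus of each root, and a singular linear system is treated as an infinitely bad candidate. The names arma_acf, goal and penalty_weight are hypothetical, and NumPy is assumed.

```python
import numpy as np

def arma_acf(alpha, beta, r):
    """rho*_1..rho*_r via Equations 4.7 and 4.8 (sigma^2 cancels out)."""
    p, q = len(alpha), len(beta)
    b = np.concatenate(([1.0], beta))
    psi = np.zeros(q + 1)
    for j in range(q + 1):
        psi[j] = b[j] + sum(alpha[k - 1] * psi[j - k]
                            for k in range(1, min(j, p) + 1))

    def rhs(k):
        # right-hand side of Equation 4.7 with sigma^2 = 1; empty sum for k > q
        return sum(b[j] * psi[j - k] for j in range(k, q + 1))

    A = np.eye(p + 1)
    for k in range(p + 1):
        for i in range(1, p + 1):
            A[k, abs(k - i)] -= alpha[i - 1]
    b_vec = np.array([rhs(k) for k in range(p + 1)], dtype=float)
    gamma = list(np.linalg.solve(A, b_vec))
    for k in range(p + 1, r + 1):
        gamma.append(rhs(k) + sum(alpha[i - 1] * gamma[k - i]
                                  for i in range(1, p + 1)))
    gamma = np.array(gamma)
    return gamma[1 : r + 1] / gamma[0]

def goal(params, p, q, rho_target, penalty_weight, eps=np.finfo(float).eps):
    """Value of the goal function for params = (alpha_1..alpha_p, beta_1..beta_q)."""
    params = np.asarray(params, dtype=float)
    rho_target = np.asarray(rho_target, dtype=float)
    alpha, beta = params[:p], params[p : p + q]
    try:
        rho_star = arma_acf(alpha, beta, len(rho_target))
    except np.linalg.LinAlgError:
        return np.inf  # assumption: treat a singular system as infeasible
    # fit term, assumed here to be squared relative deviations
    fit = np.sum((rho_star / rho_target - 1.0) ** 2)
    # penalty term over all roots of alpha(z), using their modulus
    roots = np.roots(np.concatenate((-alpha[::-1], [1.0])))
    penalty = np.sum(np.minimum(0.0, np.abs(roots) - (1.0 + eps)) ** 2)
    return fit + penalty_weight * penalty

# AR(1) with alpha_1 = 0.5 reproduces the targets 0.5, 0.25, 0.125 exactly
target = [0.5, 0.25, 0.125]
value_exact = goal([0.5], 1, 0, target, penalty_weight=10.0)
value_nonstationary = goal([1.5], 1, 0, target, penalty_weight=10.0)
```

A function of this shape could be handed to any general purpose minimizer, with penalty_weight raised between successive runs as described above. The relative form of the fit term requires all target autocorrelations to be non-zero.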

Hence, finding an appropriate ARMA(p, q) base process for a given marginal distribution FY and given autocorrelations ρ̂ = (ρ̂1, ρ̂2, . . . , ρ̂r) for the extended ARTA process consists of three steps:

1. For each ρ̂h find a ρh such that Corr[Yt, Yt+h] = ρ̂h using Equation 4.2.

2. Fit an ARMA(p, q) model to the autocorrelations ρ= (ρ1, ρ₂, . . . , ρr) determined in the first step.

3. Adjust the variance of the innovations of the ARMA(p, q) base process resulting from the second step according to Equation 4.9.
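The three steps can be traced on a deliberately simple special case. The sketch below assumes a standard lognormal marginal, for which the ARTA correlation has a closed form that can be inverted directly, and an AR(1) base process, for which matching the lag-1 correlation is trivial. In general, step 1 needs the numerical search and step 2 the optimization described above. All function names are hypothetical, and NumPy is assumed.

```python
import math
import numpy as np

def omega_lognormal(rho):
    """ARTA correlation of Y_t = exp(Z_t) for base correlation rho (assumed marginal)."""
    return (math.exp(rho) - 1.0) / (math.e - 1.0)

def step1_base_correlation(target):
    """Step 1: closed-form inverse of omega_lognormal (special case only)."""
    return math.log(1.0 + target * (math.e - 1.0))

def step2_fit_ar1(rho1):
    """Step 2: an AR(1) process has rho_k = alpha^k, so alpha = rho_1."""
    return rho1

def step3_innovation_variance(alpha, sigma2_old=1.0):
    """Step 3: Equation 4.9 with gamma(0) = sigma^2 / (1 - alpha^2) for AR(1)."""
    gamma0_old = sigma2_old / (1.0 - alpha ** 2)
    return sigma2_old / gamma0_old

def simulate(alpha, sigma2, n, seed=0):
    """Generate n values of the ARTA process with lognormal marginal."""
    rng = np.random.default_rng(seed)
    innovations = rng.normal(0.0, math.sqrt(sigma2), n)
    z = np.empty(n)
    z[0] = rng.standard_normal()  # start in the stationary N(0, 1) distribution
    for t in range(1, n):
        z[t] = alpha * z[t - 1] + innovations[t]
    # F_Y^{-1}(Phi(z)) reduces to exp(z) for the standard lognormal distribution
    return np.exp(z)

target_lag1 = 0.5
rho1 = step1_base_correlation(target_lag1)
alpha = step2_fit_ar1(rho1)
sigma2 = step3_innovation_variance(alpha)
y = simulate(alpha, sigma2, n=1000)
```

Only the lag-1 correlation is matched in this example; higher lags of an AR(1) base process are then fixed at α^k.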

Extended ARTA models can have any marginal distribution for which the inverse cdf can be computed. This includes uniform, triangular, normal and lognormal distributions. Distributions from the Johnson system, which have been used for the original ARTA process from Section 2.2, can of course be applied as well. From the class of Phase-type distributions, exponential and Erlang distributions could serve as marginal distributions. Furthermore, gamma and χ² distributions, which are related to the Erlang distribution but are not Phase-type, can be used. See e.g. [104] for a definition and closed-form expressions for most of the mentioned distributions. For some distributions no closed-form expression for the inverse cdf exists. In these cases fast numerical algorithms are available: the inverse cdf of the normal, lognormal and Johnson distributions can be computed using [163]; for Erlang, gamma and χ² distributions the algorithm from [20] can be used. A brief overview of the mentioned distributions and their properties can be found in Appendix B.
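For marginals with a closed-form inverse cdf, the transformation of a base process value is short. The sketch below uses only the Python standard library, where statistics.NormalDist provides the standard normal cdf and inverse cdf. The function names and parameter values are illustrative assumptions, and the numerical algorithms of [163] and [20] are not reproduced.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def inv_cdf_uniform(u, a, b):
    return a + u * (b - a)

def inv_cdf_exponential(u, rate):
    return -math.log(1.0 - u) / rate

def inv_cdf_triangular(u, a, c, b):
    """Triangular distribution on [a, b] with mode c."""
    f_c = (c - a) / (b - a)  # cdf value at the mode
    if u < f_c:
        return a + math.sqrt(u * (b - a) * (c - a))
    return b - math.sqrt((1.0 - u) * (b - a) * (b - c))

def inv_cdf_lognormal(u, mu=0.0, sigma=1.0):
    return math.exp(mu + sigma * STD_NORMAL.inv_cdf(u))

def arta_transform(z, inv_cdf):
    """Map a N(0, 1) base process value z to the marginal given by inv_cdf."""
    return inv_cdf(STD_NORMAL.cdf(z))

# z = 0 corresponds to the median of each marginal distribution
y_exp = arta_transform(0.0, lambda u: inv_cdf_exponential(u, rate=2.0))
y_uni = arta_transform(0.0, lambda u: inv_cdf_uniform(u, a=1.0, b=3.0))
y_tri = arta_transform(0.0, lambda u: inv_cdf_triangular(u, a=0.0, c=1.0, b=3.0))
y_log = arta_transform(0.0, inv_cdf_lognormal)
```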

For the above considerations the marginal distribution FY is assumed to be given. Furthermore, the determination of the order p, q of the base process has been left open. These issues will be addressed in Chapter 7, where the ideas for extended ARTA models and further process types developed in the following chapters are integrated into an algorithmic framework.


In document Fitting simulation input models for correlated traffic data (Page 70-73)