
3.4 Model estimation

3.4.1 Some technical details

Recall that from Equation (3.2), a general basic ACD(p,q) model is defined by

$$
y_t = \psi_t \varepsilon_t, \qquad \psi_t = \omega + \sum_{i=1}^{p} \alpha_i y_{t-i} + \sum_{j=1}^{q} \beta_j \psi_{t-j}, \qquad t = 1, 2, \ldots, n,
$$

with $\omega > 0$ and $\alpha_i, \beta_j \ge 0$ for all $i, j$. Moreover, to ensure stationarity and the existence of the unconditional expected duration, the model requires that $\sum_i \alpha_i + \sum_j \beta_j < 1$. Together, these conditions imply that $0 \le \alpha_i, \beta_j < 1$.
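
A short worked step shows why the sum condition is tied to the unconditional expected duration. It assumes, as is standard for ACD models but not restated in this section, that the innovations $\varepsilon_t$ have unit mean and are independent of the past, and that the process is stationary with $E(y_t) = E(\psi_t) = \mu$. Taking expectations on both sides of the recursion for $\psi_t$ gives

$$
\mu = \omega + \Big( \sum_{i=1}^{p} \alpha_i + \sum_{j=1}^{q} \beta_j \Big) \mu
\quad \Longrightarrow \quad
\mu = \frac{\omega}{1 - \sum_{i=1}^{p} \alpha_i - \sum_{j=1}^{q} \beta_j},
$$

which is finite and positive only when $\sum_i \alpha_i + \sum_j \beta_j < 1$.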

Given a set of initial values $\hat{\theta}_o$ for the parameters to be estimated, initial values for $\psi_1$ and $\psi_2$ need to be determined to start the estimation procedure. In the spirit of Engle and Russell (1998), they are obtained in the following way. For the ACD(1,1) models, $\psi_1 = \hat{\omega}_o / (1 - \hat{\beta}_{1o})$. For the ACD(2,2) models, $\psi_1 = \hat{\omega}_o / (1 - \hat{\beta}_{1o} - \hat{\beta}_{2o})$ and $\psi_2 = \hat{\omega}_o + \hat{\alpha}_{1o} y_1 + \hat{\beta}_{1o} \psi_1$. We found that the choice of starting values had little effect on the final parameter estimates.
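
The ACD(1,1) recursion and its starting value can be sketched in R as follows. This is a minimal illustration rather than the code used in the study: the function name `acd11_psi` and the toy durations are assumptions introduced here.

```r
# Conditional durations psi_t for an ACD(1,1) model.
# acd11_psi is a hypothetical helper, not a function from the thesis.
acd11_psi <- function(y, omega, alpha, beta) {
  n <- length(y)
  psi <- numeric(n)
  psi[1] <- omega / (1 - beta)  # starting value psi_1 as described in the text
  if (n >= 2) {
    for (t in 2:n) {
      # psi_t = omega + alpha * y_{t-1} + beta * psi_{t-1}
      psi[t] <- omega + alpha * y[t - 1] + beta * psi[t - 1]
    }
  }
  psi
}

# Toy durations, invented for illustration only.
y <- c(1.2, 0.8, 1.5, 0.9)
psi <- acd11_psi(y, omega = 0.1, alpha = 0.1, beta = 0.8)
```

With these toy inputs the first value is 0.1 / (1 - 0.8) = 0.5, and each later value depends only on the previous duration and the previous conditional duration.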

Model estimation is essentially an optimization problem. In this study, the general-purpose R optimization function optim, which was introduced in Section 2.4.2, is used for parameter estimation. We have chosen the default method in optim, which is an implementation of the method of Nelder and Mead (1965), so that the optimization process uses only function values and does not require an analytic form of the first derivative of the objective function.

In order to ensure the satisfaction of the model’s stationarity and moment existence conditions, the following transformation is imposed on the searching range for possible parameter values in the optimization process.

$$
\text{tsparam} = \log\left( \frac{\text{param} - \text{lb}}{\text{ub} - \text{param}} \right),
$$

where ‘param’ stands for the original parameter value, ‘tsparam’ for the transformed parameter value, lb = lower bound, and ub = upper bound of the parameter value range. Hence,

$$
\text{param} = \text{lb} + \frac{\text{ub} - \text{lb}}{1 + \exp(-\text{tsparam})}. \tag{3.24}
$$

Therefore, while the transformed values can vary over the whole real number line (from $-\infty$ to $+\infty$), the original parameter values can only vary between a specified lower limit and upper limit. For example, we may set lb = 0 and ub = 1 for $\alpha_i$ and $\beta_j$ in the process of parameter estimation so that the condition $0 \le \alpha_i, \beta_j < 1$ will always be satisfied.
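
The transformation and its inverse in Equation (3.24) can be written as two small R functions. The names `to_unconstrained` and `to_constrained` are hypothetical and chosen here for illustration. The sketch checks that the two maps undo each other and that transformed values map back inside the bounds.

```r
# Map a bounded parameter to the real line (tsparam) ...
to_unconstrained <- function(param, lb, ub) {
  log((param - lb) / (ub - param))
}

# ... and back to the interval between lb and ub, as in Equation (3.24).
to_constrained <- function(tsparam, lb, ub) {
  lb + (ub - lb) / (1 + exp(-tsparam))
}

alpha <- 0.15
ts <- to_unconstrained(alpha, lb = 0, ub = 1)
back <- to_constrained(ts, lb = 0, ub = 1)

# Moderately large positive or negative inputs stay strictly inside (0, 1).
extremes <- to_constrained(c(-30, 0, 30), lb = 0, ub = 1)
```

In such a setup the optimizer would search over the transformed values, and the objective function would apply `to_constrained` before evaluating the likelihood.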

Once an optimization process is initiated, the estimation results are recorded when the process converges. If an optimization run reaches its iteration limit without converging, its output is used as the initial values for a second run. The optimization runs continue in this way until the convergence signal is obtained.
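
The restart scheme can be sketched as a loop around optim. This is an illustration under stated assumptions: `fit_until_converged`, the `max_runs` safety cap and the deliberately small `maxit` are introduced here, and the Rosenbrock test function merely stands in for the negative log-likelihood. The only optim feature relied upon is the `convergence` element of its result, which is 0 on successful completion.

```r
# Stand-in objective (Rosenbrock function), not the ACD likelihood.
fn <- function(x) (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2

# Re-run optim from the previous output until it reports convergence.
# max_runs is a safety cap added for this sketch.
fit_until_converged <- function(par, fn, maxit = 100, max_runs = 20) {
  runs <- 0
  repeat {
    fit <- optim(par, fn, method = "Nelder-Mead",
                 control = list(maxit = maxit))
    runs <- runs + 1
    if (fit$convergence == 0 || runs >= max_runs) break
    par <- fit$par  # output of this run becomes the next starting point
  }
  fit$runs <- runs
  fit
}

start <- c(-1.2, 1)
fit <- fit_until_converged(start, fn)
```

After the call, `fit$runs` reports how many runs were needed and `fit$convergence` shows whether the last run converged or the cap was reached.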

Based on Equation (2.8), AIC can then be calculated by

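$$
\mathrm{AIC}(k) = -2 \log L(y \mid \hat{\theta}, y_1) + 2k \tag{3.25}
$$

for ACD(1,1) models, where $\hat{\theta}$ is a $k \times 1$ M-estimate vector. Accordingly, a Pseudo-AIC (AIC$^*$) is defined by

$$
\mathrm{AIC}^*(k) = -2 \log L^*(y \mid \hat{\theta}, y_1) + 2k. \tag{3.26}
$$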

The QMLE, $\hat{\theta}$, in Equations (3.25) and (3.26) is a solution of the conditional likelihood equation

$$
\frac{\partial \log L^*(y \mid \hat{\theta}, y_1)}{\partial \hat{\theta}} = 0, \tag{3.27}
$$

for ACD(1,1) models. In the same way, AIC($k$) and AIC$^*$($k$) can be calculated for ACD(2,2) models. Note that the likelihood functions $L(y \mid \hat{\theta}, y_1)$ and $L^*(y \mid \hat{\theta}, y_1)$ have been specified in Section 3.3 for various ACD models.
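
As an illustration of how such a score is computed, the sketch below evaluates a Pseudo-AIC for an ACD(1,1) model. It assumes an exponential quasi-likelihood, $\log L^* = -\sum_{t=2}^{n} (\log \psi_t + y_t / \psi_t)$, which is a common choice for ACD models but is an assumption here, since the likelihoods of Section 3.3 are not reproduced in this section. The function names and toy data are also hypothetical.

```r
# Negative exponential quasi-log-likelihood for ACD(1,1), conditional on y_1.
# The exponential form is an assumption of this sketch.
acd11_negloglik <- function(theta, y) {
  omega <- theta[1]; alpha <- theta[2]; beta <- theta[3]
  n <- length(y)
  stopifnot(n >= 2)
  psi <- numeric(n)
  psi[1] <- omega / (1 - beta)
  for (t in 2:n) psi[t] <- omega + alpha * y[t - 1] + beta * psi[t - 1]
  sum(log(psi[-1]) + y[-1] / psi[-1])  # terms t = 2, ..., n
}

# AIC = -2 log L + 2k, written in terms of the negative log-likelihood.
aic <- function(negloglik, k) 2 * negloglik + 2 * k

y <- c(1.2, 0.8, 1.5, 0.9)          # toy durations, illustration only
theta <- c(0.1, 0.1, 0.8)           # omega, alpha_1, beta_1
nll <- acd11_negloglik(theta, y)
aic_value <- aic(nll, k = length(theta))
```

In practice `theta` would be the estimate returned by the optimizer rather than a fixed vector, and `k` the number of estimated parameters.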

To obtain the result in Section 2.4.3, some special R functions have been written for solving Equation (2.32). The built-in R function optim is used as a minimizer to find the numerical solution. The minimization objective function is the negative sample log-likelihood

$$
-\log L^*(y \mid \hat{\theta}, y_1) \equiv -\sum_{t=2}^{n} \log f(y_t \mid \mathcal{F}_{t-1}, \hat{\theta}). \tag{3.28}
$$

Recall also that

$$
\mathrm{GIC}(k) = -2 \log L(y \mid \hat{\theta}, y_1) + 2\,\mathrm{EMD}, \tag{3.29}
$$

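where $\mathrm{EMD} = \mathrm{tr}\{ R(\psi, \hat{G})^{-1} Q(\psi, \hat{G}) \}$. The two $k \times k$ matrices, $R(\psi, \hat{G})$ and $Q(\psi, \hat{G})$, are defined by

$$
R(\psi, \hat{G}) = -\frac{1}{n} \sum_{t=1}^{n} \left. \frac{\partial \psi(y_t, \theta)}{\partial \theta} \right|_{\theta = \hat{\theta}}
$$

and

$$
Q(\psi, \hat{G}) = \frac{1}{n} \sum_{t=1}^{n} \left. \psi(y_t, \theta) \frac{\partial \log f(y_t \mid \mathcal{F}_{t-1}, \theta)}{\partial \theta^{T}} \right|_{\theta = \hat{\theta}},
$$

where $\psi(y_t, \theta) = \frac{\partial}{\partial \theta} \log f(y_t \mid \mathcal{F}_{t-1}, \theta)$.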

But as we pointed out in the discussion of the model evaluation results given in Table 2.6 and Figure 2.1, the estimated residuals from the above fitting procedure still contain significant autocorrelation. This violation of the independence condition by the estimated residual series leads us to believe that the parameter estimation results could be unreliable because of an incorrect specification of the log-likelihood function $\log L^*(y \mid \hat{\theta}, y_1)$. Therefore, we decided to use a constrained log-likelihood function as the minimization objective function, to ensure that the remaining autocorrelation is minimized and the estimated residual mean is unity. In this way, the model estimation is still under an M-estimation framework.

The L-B statistic is used as a measure of autocorrelation in the estimated residuals. A penalty term, built from the L-B statistic of the estimated residuals and the absolute difference between the residual mean and unity, is added to the negative log-likelihood function. The new minimization objective function is defined as (assuming an ACD(1,1) model)

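$$
-\log L^*(y \mid \hat{\theta}, y_1) + C(y \mid \hat{\omega}, \hat{\alpha}_1, \hat{\beta}_1), \tag{3.30}
$$

where $C(y \mid \hat{\omega}, \hat{\alpha}_1, \hat{\beta}_1)$ is the penalty term, which is defined by

$$
C(y \mid \hat{\omega}, \hat{\alpha}_1, \hat{\beta}_1) = c_1 \, [\text{L-B}(\hat{\varepsilon})] + c_2 \, [\mathrm{abs}(\mathrm{mean}(\hat{\varepsilon}) - 1)], \tag{3.31}
$$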

where the L-B statistic value is calculated using the built-in R function Box.test, and $c_1$, $c_2$ are two arbitrarily determined positive numbers.² The notation ‘abs’ stands for the ‘taking absolute value’ operation and ‘mean’ for the ‘calculating sample mean’ operation. The notation $\hat{\varepsilon} \equiv \{\hat{\varepsilon}_t\}$ is used to represent the estimated residual series. Since $\hat{\varepsilon}_t = y_t / \psi_t$, the penalty term in the new minimization function is a function of the parameter estimates $\hat{\omega}$, $\hat{\alpha}_1$, and $\hat{\beta}_1$, which are always greater than zero. Therefore, the parameter estimation for an ACD model is now implemented by minimizing the negative sample log-likelihood function $-\log L^*(y \mid \hat{\theta}, y_1)$ subject to a small residual L-B score and a unit mean constraint.

² In the self-written R functions, the L-B statistic is calculated with the following R commands:

    nj = round(log(n) + 0.5)
    residLB = Box.test(residX, lag = nj, type = "Ljung")$statistic

where residLB ≡ L-B($\hat{\varepsilon}$) and residX ≡ $\hat{\varepsilon}$. The two positive constants are set to be $c_1 = n/500$ and $c_2 = 10^5$, where $n$ is the sample size.
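
The penalized objective of Equations (3.30) and (3.31) can be sketched for an ACD(1,1) model as below. Several parts are assumptions rather than the thesis code: the function name `penalized_objective`, the exponential quasi-likelihood used for $-\log L^*$, the simulated data, and the reading of the second constant as $10^5$. The Box.test call follows the commands quoted in the footnote.

```r
# Penalized negative log-likelihood for ACD(1,1): a hedged sketch.
penalized_objective <- function(theta, y, c1, c2) {
  omega <- theta[1]; alpha <- theta[2]; beta <- theta[3]
  n <- length(y)
  stopifnot(n >= 2)
  psi <- numeric(n)
  psi[1] <- omega / (1 - beta)
  for (t in 2:n) psi[t] <- omega + alpha * y[t - 1] + beta * psi[t - 1]

  # Exponential quasi-likelihood (assumed form), terms t = 2, ..., n.
  negloglik <- sum(log(psi[-1]) + y[-1] / psi[-1])

  # Estimated residuals and their Ljung-Box statistic.
  resid <- y / psi
  nj <- round(log(n) + 0.5)
  lb_stat <- unname(Box.test(resid, lag = nj, type = "Ljung")$statistic)

  # Penalty: c1 * L-B + c2 * |mean(resid) - 1|.
  negloglik + c1 * lb_stat + c2 * abs(mean(resid) - 1)
}

set.seed(1)
y <- rexp(200)                  # simulated durations, illustration only
theta <- c(0.1, 0.1, 0.8)       # omega, alpha_1, beta_1
n <- length(y)
value <- penalized_objective(theta, y, c1 = n / 500, c2 = 1e5)  # c2 assumed 10^5
unpenalized <- penalized_objective(theta, y, c1 = 0, c2 = 0)
```

Because both penalty components are non-negative, the penalized value can never fall below the unpenalized one. A large `c2` makes even a small departure of the residual mean from unity costly.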

With the addition of the penalty term to the minimization objective function, a problem arises in calculating GIC scores. A close look at the L-B statistic formula (Equation A.20) shows that the parameters enter it in a complicated way. This makes the EMD term in Equation (3.29) difficult to calculate, because it is very hard to express the $\psi(y_t, \theta)$ function analytically and hence to evaluate the matrices $R(\psi, \hat{G})$ and $Q(\psi, \hat{G})$. Therefore, we turn to AIC for the evaluation of ACD models. Equation (3.25) is valid for evaluating ACD models, as we have justified the use of AIC under an M-estimator framework in Section 2.5.1. The model estimation results presented in the following section are obtained from optimization processes based on the minimization objective function (3.30). The calculated AIC scores for all fitted ACD models are reported in Section 3.5.

In document Further developments of two point process models for fine scale time series : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Statistics at Massey University, Albany (Auckland), New Zealand (Page 91-94)