A Model for Time Series Analysis


A. H. Pooi

Sunway University Business School Sunway University

Bandar Sunway, Malaysia [email protected]

Abstract

Consider a time series model in which the response $r(t+\Delta t)$ at a short time $\Delta t$ ahead of the present time $t$ depends on the present response $r(t)$ and $l-1$ other responses before time $t$ via a conditional distribution with parameter $\mathbf{x}(t)$, and the parameter $\mathbf{x}(t+\Delta t)$ depends on the present parameter $\mathbf{x}(t)$ and $m-1$ other parameters before time $t$ via another conditional distribution with parameter $\boldsymbol{\theta}$. We propose a method for estimating the above two types of conditional distributions. A data set on interest rates is used to compare the performance of the proposed model and the Chan, Karolyi, Longstaff and Sanders (CKLS) model. The comparison shows that the prediction intervals derived from the proposed model have coverage probabilities and expected lengths which are comparable to those of the prediction intervals based on the CKLS model.

Mathematics Subject Classification: 62M10

Keywords: Time series model, conditional distributions, prediction intervals.

1 Introduction

Given the observed response variables $y_1, y_2, \ldots, y_{t-1}$, one may use the conditional probability density function (pdf) $f(y_t \mid y_1, y_2, \ldots, y_{t-1})$ to specify a time series model for the next response $y_t$. Some examples of important time series models are as follows:

(1) The ARMA($p$, $q$) model (Box and Jenkins, 1970) given by

$$y_t = c + \sum_{i=1}^{p} \varphi_i y_{t-i} + \varepsilon_t + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i}$$

where $c$ and the $\varphi_i$ and $\theta_i$ are constants, and the error terms $\varepsilon_t$ are assumed to be independent and identically normally distributed random variables with mean zero and variance $\sigma^2$.

(2) The Auto-Regressive Conditional Heteroscedasticity (ARCH) model (Engle, 1982) for the time series $\varepsilon_t$ is given by $\varepsilon_t = \sigma_t z_t$ where $z_t$ is a strong white noise process, and the series $\sigma_t^2$ is modeled by

$$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2$$

where the $\alpha_i$ are constants.

(3) The Generalized ARCH (GARCH) model (Bollerslev, 1986). This model is similar to the ARCH model except that the series $\sigma_t^2$ is now modeled by

$$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 + \sum_{i=1}^{p} \beta_i \sigma_{t-i}^2$$

where the $\beta_i$ are constants.

(4) State space model (Anderson and Moore, 1979; Lewis, 1986). An example of the state space model is

$$\mathbf{x}_{t+1} = f_t(\boldsymbol{\theta}, \mathbf{x}_t, \boldsymbol{\varepsilon}_t), \qquad y_t \text{ has pdf } g_t(\,\cdot \mid \mathbf{x}_t, \boldsymbol{\theta})$$

where $\mathbf{x}_t \in R^d$ is the vector of the unobserved state variables, $y_t$ that of the observed response variable, $\boldsymbol{\theta}$ is a vector of $p$ unknown fixed parameters, $\boldsymbol{\varepsilon}_t$ is a vector of random variables, the $\boldsymbol{\varepsilon}_t$ being independent of each other, $f_t$ is a known Borel measurable function, and $g_t$ is an absolutely continuous probability distribution function (pdf) with bounded density.

Among the above four examples, although only the last example gives the conditional pdf of the next response directly, the equations relating the next response to the observed responses enable us to find the conditional pdf for the next responses in the first three examples.

Presently, an alternative time series model is proposed. In the proposed model, we assume that the response $r(t+\Delta t)$ at a short time $\Delta t$ ahead of the present time $t$ may depend on the present response $r(t)$ and $l-1$ other responses $r(t-\Delta t), r(t-2\Delta t), \ldots, r(t-(l-1)\Delta t)$ before time $t$ via a conditional distribution with parameter $\mathbf{x}(t)$. We also assume that the parameter $\mathbf{x}(t+\Delta t)$ may depend on the present parameter $\mathbf{x}(t)$ and $m-1$ other parameters $\mathbf{x}(t-\Delta t), \mathbf{x}(t-2\Delta t), \ldots, \mathbf{x}(t-(m-1)\Delta t)$ before time $t$ via another conditional distribution with parameter $\boldsymbol{\theta}$. We may refer to the parameter $\mathbf{x}(t)$ as a Type 1 parameter, and $\boldsymbol{\theta}$ as a Type 2 parameter.

The dependence of the parameter $\mathbf{x}(t)$ on $t$ is incorporated in the model to take care of the situation when the time series formed by the responses is non-stationary. Furthermore, when the present Type 1 parameter $\mathbf{x}(t)$ and $m-1$ other Type 1 parameters before time $t$ are given, a conditional distribution is imposed on the future Type 1 parameter $\mathbf{x}(t+\Delta t)$ to describe the variation of the future Type 1 parameter as $t$ increases.

The above conditional distributions with respectively Type 1 and Type 2 parameters provide a mechanism for generating the future responses $r(t+\Delta t), r(t+2\Delta t), \ldots$.

The proposed model and the Chan, Karolyi, Longstaff and Sanders (CKLS) model are compared using a set of interest rate data. It is found that the prediction intervals derived from the proposed model have a performance which is comparable to that of the prediction intervals derived from the CKLS model in terms of the ability of the prediction intervals to cover the future interest rates, and the average length of the prediction intervals.


2 A Time Series Model

Let $\mathbf{r}^*(t) = [r(t-(l-1)\Delta t), \ldots, r(t-\Delta t), r(t)]$ be a vector formed by the present response $r(t)$ and $l-1$ other responses before time $t$. We assume that the response $r(t+\Delta t)$ at a short time $\Delta t$ ahead of time $t$ depends on $\mathbf{r}^*(t)$ via the conditional pdf $f_{\mathbf{x}(t)}(r(t+\Delta t) \mid \mathbf{r}^*(t))$ with Type 1 parameter $\mathbf{x}(t)$.

Next let $\mathbf{x}^*(t) = [\mathbf{x}(t-(m-1)\Delta t), \ldots, \mathbf{x}(t-\Delta t), \mathbf{x}(t)]$ be a vector formed by the Type 1 parameter $\mathbf{x}(t)$ at the present time $t$ and $m-1$ other Type 1 parameters before time $t$. We assume that the Type 1 parameter $\mathbf{x}(t+\Delta t)$ at a short time $\Delta t$ ahead of time $t$ depends on $\mathbf{x}^*(t)$ via the conditional pdf $f_{\boldsymbol{\theta}}(\mathbf{x}(t+\Delta t) \mid \mathbf{x}^*(t))$ with Type 2 parameter $\boldsymbol{\theta}$ which does not depend on $t$.

The above two types of conditional pdf are sufficient to specify a model for the responses. For a given integer $J$, the future responses $r(t+\Delta t), r(t+2\Delta t), \ldots, r(t+J\Delta t)$ may be generated from the given values of $\mathbf{x}^*(t)$ and $\mathbf{r}^*(t)$ using the following procedure:

(1) Generate $r(t+\Delta t)$ from $\mathbf{r}^*(t)$ using the conditional pdf $f_{\mathbf{x}(t)}(r(t+\Delta t) \mid \mathbf{r}^*(t))$.

(2) For $j = 1, 2, \ldots, J-1$:

(i) Generate $\mathbf{x}(t+j\Delta t)$ from the value $\mathbf{x}^*(t+(j-1)\Delta t) = [\mathbf{x}(t+(j-1)\Delta t-(m-1)\Delta t), \ldots, \mathbf{x}(t+(j-2)\Delta t), \mathbf{x}(t+(j-1)\Delta t)]$ using the conditional pdf $f_{\boldsymbol{\theta}}(\mathbf{x}(t+j\Delta t) \mid \mathbf{x}^*(t+(j-1)\Delta t))$.

(ii) Generate $r(t+(j+1)\Delta t)$ from the value $\mathbf{r}^*(t+j\Delta t) = [r(t+j\Delta t-(l-1)\Delta t), \ldots, r(t+(j-1)\Delta t), r(t+j\Delta t)]$ using the conditional pdf $f_{\mathbf{x}(t+j\Delta t)}(r(t+(j+1)\Delta t) \mid \mathbf{r}^*(t+j\Delta t))$.

We may estimate the above conditional distributions using a type of multivariate non-normal distribution called the multivariate power-normal distribution which will be described in Section 3. Sections 4 and 5 will deal with the estimation of the conditional distributions.
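
The two-step generation procedure of this section can be sketched in Python as follows. This is only an illustration of the order in which the two conditional distributions are used. The callables `sample_r` and `sample_x` are hypothetical stand-ins for draws from the Type 1 and Type 2 conditional pdfs, and the toy samplers at the bottom are arbitrary choices that do not come from the paper.

```python
import math
import random

def generate_responses(r_star, x_star, J, sample_r, sample_x):
    """Generate r(t+dt), ..., r(t+J*dt) following steps (1) and (2).

    r_star : the l most recent responses, oldest first (r*(t))
    x_star : the m most recent Type 1 parameter vectors, oldest first (x*(t))
    sample_r(x, r_hist) : hypothetical draw of the next response given r* and x
    sample_x(x_hist)    : hypothetical draw of the next Type 1 parameter given x*
    """
    r_hist = list(r_star)
    x_hist = [list(x) for x in x_star]
    l, m = len(r_hist), len(x_hist)
    out = []
    # Step (1): r(t+dt) from r*(t), using the present parameter x(t)
    r_next = sample_r(x_hist[-1], r_hist)
    out.append(r_next)
    r_hist = (r_hist + [r_next])[-l:]
    # Step (2): alternate between the two conditional distributions
    for j in range(1, J):
        x_next = sample_x(x_hist)               # (i)  x(t+j*dt) from x*(t+(j-1)*dt)
        x_hist = (x_hist + [x_next])[-m:]
        r_next = sample_r(x_hist[-1], r_hist)   # (ii) r(t+(j+1)*dt) from r*(t+j*dt)
        out.append(r_next)
        r_hist = (r_hist + [r_next])[-l:]
    return out

# Toy stand-ins (assumptions for illustration only, not the paper's distributions)
rng = random.Random(0)

def toy_sample_r(x, r_hist):
    # Gaussian draw centred on the latest response; x[0] acts as a scale
    return rng.gauss(r_hist[-1], abs(x[0]))

def toy_sample_x(x_hist):
    # small multiplicative perturbation of the latest parameter vector
    return [c * math.exp(rng.gauss(0.0, 0.05)) for c in x_hist[-1]]

path = generate_responses([0.050, 0.051], [[0.001]], J=5,
                          sample_r=toy_sample_r, sample_x=toy_sample_x)
```

In the paper's setting, `sample_r` and `sample_x` would be replaced by draws from the estimated conditional distributions of Sections 4 and 5.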

3 Conditional pdf Derived from Multivariate Power-normal Distribution

Yeo and Johnson (2000) considered the following power transformation

$$\tilde{\varepsilon} = \psi(\lambda^+, \lambda^-, z) = \begin{cases} [(z+1)^{\lambda^+} - 1]/\lambda^+, & z \ge 0,\ \lambda^+ \ne 0 \\ \log(z+1), & z \ge 0,\ \lambda^+ = 0 \\ -[(-z+1)^{\lambda^-} - 1]/\lambda^-, & z < 0,\ \lambda^- \ne 0 \\ -\log(-z+1), & z < 0,\ \lambda^- = 0 \end{cases} \tag{3.1}$$

If $z$ has the standard normal distribution, then $\tilde{\varepsilon}$ has a non-normal distribution which is derived by a type of power transformation of a random variable with normal distribution. We may say that $\tilde{\varepsilon}$ has a power-normal distribution.
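
As a minimal sketch, the transformation in Equation (3.1) and a draw from the resulting power-normal distribution can be written as below. The function names are illustrative choices, not part of the paper.

```python
import math
import random

def power_transform(z, lam_plus, lam_minus):
    """Equation (3.1): separate power parameters for z >= 0 and z < 0."""
    if z >= 0:
        if lam_plus != 0:
            return ((z + 1.0) ** lam_plus - 1.0) / lam_plus
        return math.log(z + 1.0)
    if lam_minus != 0:
        return -((-z + 1.0) ** lam_minus - 1.0) / lam_minus
    return -math.log(-z + 1.0)

def draw_power_normal(n, lam_plus, lam_minus, seed=0):
    """Transform n standard normal draws into power-normal draws."""
    rng = random.Random(seed)
    return [power_transform(rng.gauss(0.0, 1.0), lam_plus, lam_minus)
            for _ in range(n)]

sample = draw_power_normal(1000, lam_plus=0.5, lam_minus=1.5)
```

With both parameters equal to 1 the transformation is the identity, so the power-normal family contains the standard normal distribution as a special case.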

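We may now use the univariate power-normal distribution to obtain the multivariate power-normal distribution. First let $\mathbf{y}$ be a vector consisting of $k$ correlated random variables. The vector $\mathbf{y}$ is said to have a $k$-dimensional power-normal distribution with parameters $\boldsymbol{\mu}$, $\mathbf{H}$, $\lambda_i^+$, $\lambda_i^-$, $\sigma_i$, $1 \le i \le k$, if

$$\mathbf{y} = \boldsymbol{\mu} + \mathbf{H}\boldsymbol{\varepsilon} \tag{3.2}$$

where $\boldsymbol{\mu} = E(\mathbf{y})$, $\mathbf{H}$ is an orthogonal matrix, $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_k$ are uncorrelated,

$$\varepsilon_i = \sigma_i\,[\tilde{\varepsilon}_i - E(\tilde{\varepsilon}_i)] \,/\, \{\operatorname{var}(\tilde{\varepsilon}_i)\}^{1/2}, \tag{3.3}$$

$\sigma_i > 0$ is a constant, and $\tilde{\varepsilon}_i$ has a power-normal distribution with parameters $\lambda_i^+$ and $\lambda_i^-$.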

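When the values of $y_1, y_2, \ldots, y_{k-1}$ are given, we may find an approximation for the conditional pdf of $y_k$ by using the following numerical procedure:

(1) Select a large integer $N_p > 0$ and compute

$$y_k^{(i_p)} = y_k^- + (i_p - 1)h, \qquad 1 \le i_p \le N_p,$$

where $y_k^-$ and $y_k^+$ are such that the probability that $y_k$ lies between $y_k^-$ and $y_k^+$ is close to 1, and $h = (y_k^+ - y_k^-)/N_p$.

(2) Form the vector $\mathbf{y}^{(i_p)} = [y_1, y_2, \ldots, y_{k-1}, y_k^{(i_p)}]^T$ and find the value $\boldsymbol{\varepsilon}^{(i_p)}$ of $\boldsymbol{\varepsilon}$ such that

$$\mathbf{y}^{(i_p)} = \boldsymbol{\mu} + \mathbf{H}\boldsymbol{\varepsilon}^{(i_p)}.$$

(3) Replace $\lambda^+, \lambda^-$ in Equation (3.1) by $\lambda_i^+, \lambda_i^-$ and find $z$ such that $\varepsilon_i = \varepsilon_i^{(i_p)}$. Let the answer of $z$ be denoted by $z_i^{(i_p)}$.

(4) Compute

$$f^{(i_p)} = \prod_{i=1}^{k} \frac{1}{\sqrt{2\pi}} \operatorname{Exp}\!\left(-\tfrac{1}{2}\,[z_i^{(i_p)}]^2\right) \left|\frac{dz_i}{d\varepsilon_i}\right|_{z_i = z_i^{(i_p)}}.$$

(5) Estimate the conditional pdf (evaluated at $y_k^{(i_p)}$) of $y_k$ by

$$f^{(i_p)} \Big/ \sum_{i_p'=1}^{N_p} f^{(i_p')}.$$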
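
The numerical procedure of this section can be sketched as follows, under some stated assumptions. The moments $E(\tilde{\varepsilon}_i)$ and $\operatorname{var}(\tilde{\varepsilon}_i)$ are computed here by a simple Riemann sum over $z \in [-8, 8]$, the grid end points `y_lo` and `y_hi` are supplied by the caller, and every function name is illustrative. The returned weights sum to one over the grid. Dividing them by `h` to obtain density values is an assumption of this sketch.

```python
import math

def psi(z, lp, lm):
    """Transformation (3.1)."""
    if z >= 0:
        return ((z + 1) ** lp - 1) / lp if lp != 0 else math.log(z + 1)
    return -((1 - z) ** lm - 1) / lm if lm != 0 else -math.log(1 - z)

def psi_inv(e, lp, lm):
    """Inverse of (3.1); assumes e lies in the range of psi."""
    if e >= 0:
        return (lp * e + 1) ** (1 / lp) - 1 if lp != 0 else math.exp(e) - 1
    return 1 - (1 - lm * e) ** (1 / lm) if lm != 0 else 1 - math.exp(-e)

def dpsi(z, lp, lm):
    """Derivative of (3.1) with respect to z."""
    return (1 + z) ** (lp - 1) if z >= 0 else (1 - z) ** (lm - 1)

def psi_moments(lp, lm, n=4001, zmax=8.0):
    """Mean and standard deviation of psi(z), z standard normal (Riemann sum)."""
    h = 2 * zmax / (n - 1)
    m1 = m2 = 0.0
    for j in range(n):
        z = -zmax + j * h
        w = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi) * h
        e = psi(z, lp, lm)
        m1 += w * e
        m2 += w * e * e
    return m1, math.sqrt(m2 - m1 * m1)

def conditional_pdf(y_given, mu, H, sigma, lams, y_lo, y_hi, n_p=2000):
    """Grid approximation of the conditional distribution of y_k given y_1..y_{k-1}."""
    k = len(mu)
    h = (y_hi - y_lo) / n_p
    stats = [psi_moments(lp, lm) for lp, lm in lams]
    grid, f = [], []
    for ip in range(1, n_p + 1):
        yk = y_lo + (ip - 1) * h                                  # step (1)
        d = [v - m for v, m in zip(list(y_given) + [yk], mu)]
        # step (2): H is orthogonal, so eps = H^T (y - mu)
        eps = [sum(H[r][i] * d[r] for r in range(k)) for i in range(k)]
        val = 1.0
        for i in range(k):                                        # steps (3)-(4)
            mean, sd = stats[i]
            lp, lm = lams[i]
            z = psi_inv(mean + sd * eps[i] / sigma[i], lp, lm)    # invert (3.3), then (3.1)
            dz_de = sd / (sigma[i] * dpsi(z, lp, lm))
            val *= math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi) * abs(dz_de)
        grid.append(yk)
        f.append(val)
    total = sum(f)
    weights = [v / total for v in f]                              # step (5)
    return grid, weights, h
```

When all $\lambda_i^+ = \lambda_i^- = 1$ the distribution is multivariate normal, which gives a convenient check: the grid mean should then agree with the usual normal conditional mean.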

4 Estimation of Conditional pdf with Type 1 Parameter

Suppose the observed responses are $r(j\Delta t)$ where $1 \le j \le N$.

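The following is a procedure for estimating the conditional pdf of $r(t+\Delta t)$ when the values of $r(t-(l-1)\Delta t), \ldots, r(t-\Delta t)$ and $r(t)$ are given:

(1) Obtain $\tilde{\mathbf{r}}^{(n_1)} = [\tilde{r}_{n_1-l}, \ldots, \tilde{r}_{n_1-1}, \tilde{r}_{n_1}]^T$ where $\tilde{r}_{n_1} = r(t-(N_1-n_1)\Delta t)$, $1 \le n_1 \le N_1$ and $N_1 > 0$ is a chosen integer. The value $\tilde{\mathbf{r}}^{(n_1)}$ may be viewed as the $n_1$-th observed value of a certain $(l+1) \times 1$ vector $\tilde{\mathbf{r}}$ of random variables, and its $i$-th component is denoted by $\tilde{r}_i^{(n_1)}$.

(2) Compute

$$\bar{r}_i = \frac{1}{N_1}\sum_{n_1=1}^{N_1} \tilde{r}_i^{(n_1)} \quad\text{and}\quad m_{ij}^{(k_1,k_2)} = \frac{1}{N_1}\sum_{n_1=1}^{N_1} [\tilde{r}_i^{(n_1)}]^{k_1}[\tilde{r}_j^{(n_1)}]^{k_2}, \qquad 1 \le i, j \le l+1;\ 0 \le k_1, k_2 \le 1.$$

(3) Compute the $l+1$ eigenvectors of the variance-covariance matrix $\{m_{ij}^{(1,1)} - m_{ij}^{(1,0)} m_{ij}^{(0,1)}\}$ and form the matrix $\mathbf{H}_1$ of which the $i$-th column is the $i$-th eigenvector.

(4) Compute $\mathbf{s}^{(n_1)} = \mathbf{H}_1^T(\tilde{\mathbf{r}}^{(n_1)} - \bar{\mathbf{r}})$, $1 \le n_1 \le N_1$.

(5) Compute

$$m_i^{(k)} = \frac{1}{N_1}\sum_{n_1=1}^{N_1} [s_i^{(n_1)}]^k, \qquad 1 \le i \le l+1;\ k = 2, 3, 4.$$

(6) Find $(\lambda_i^+, \lambda_i^-)$ and $\sigma_i$ such that $E(\varepsilon_i^k) = m_i^{(k)}$, where $\varepsilon_i$ is defined in Equation (3.3) and $1 \le i \le l+1$; $k = 2, 3, 4$.

(7) From the multivariate power-normal distribution of $\tilde{\mathbf{r}} = \bar{\mathbf{r}} + \mathbf{H}_1\boldsymbol{\varepsilon}$, we estimate the conditional pdf of $r(t+\Delta t)$ using the conditional pdf of $\tilde{r}_{l+1}$ given $\tilde{r}_1 = r(t-(l-1)\Delta t), \ldots, \tilde{r}_{l-1} = r(t-\Delta t)$ and $\tilde{r}_l = r(t)$.

We note that the vector formed by

(i) the values in the upper triangle of the variance-covariance matrix in (3),

(ii) the components of $\bar{\mathbf{r}}$, and

(iii) $\lambda_i^+$, $\lambda_i^-$ and $\sigma_i$, $1 \le i \le l+1$,

may be viewed as an estimate of the Type 1 parameter $\mathbf{x}(t)$ which has a total of $n_c = (l+1)(l+2)/2 + 4(l+1)$ components.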
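
Steps (1) to (5) of the above procedure, together with the component count $n_c$, can be sketched with NumPy as below. The sketch assumes that the input holds the $N_1 + l$ most recent responses up to time $t$, oldest first, and it stops before step (6), since the moment-matching search for $(\lambda_i^+, \lambda_i^-, \sigma_i)$ would need a numerical solver that is not specified here. Function names are illustrative.

```python
import numpy as np

def fit_moments(r, l, n1):
    """Steps (1)-(5) for the N_1 + l most recent responses (oldest first)."""
    r = np.asarray(r, dtype=float)
    assert r.size == n1 + l
    # Step (1): N_1 overlapping windows of l + 1 consecutive responses
    windows = np.array([r[i:i + l + 1] for i in range(n1)])     # shape (N_1, l+1)
    # Step (2): sample means and the matrix m^(1,1) - m^(1,0) m^(0,1)
    r_bar = windows.mean(axis=0)
    cov = windows.T @ windows / n1 - np.outer(r_bar, r_bar)
    # Step (3): eigenvectors of the variance-covariance matrix as columns of H1
    _, H1 = np.linalg.eigh(cov)
    # Step (4): row n_1 of s is s^(n_1) = H1^T (r~^(n_1) - r_bar)
    s = (windows - r_bar) @ H1
    # Step (5): second, third and fourth sample moments of each component of s
    moments = {k: (s ** k).mean(axis=0) for k in (2, 3, 4)}
    return r_bar, cov, H1, moments

def n_components(l):
    """Number of components n_c of the Type 1 parameter."""
    return (l + 1) * (l + 2) // 2 + 4 * (l + 1)

# Illustrative data only: a simulated random walk, not the interest rate series
rng = np.random.default_rng(0)
r = np.cumsum(rng.normal(size=203))
r_bar, cov, H1, moments = fit_moments(r, l=3, n1=200)
```

Because $\mathbf{H}_1$ diagonalizes the variance-covariance matrix, the second moments returned in step (5) coincide with its eigenvalues, which is a quick way to check an implementation.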

5 Estimation of Conditional pdf with Type 2 Parameter

The following procedure may be used to estimate the conditional distribution of $\mathbf{x}(t+\Delta t)$ when the values of $\mathbf{x}(t-(m-1)\Delta t), \ldots, \mathbf{x}(t-\Delta t)$ and $\mathbf{x}(t)$ are given:

(1) Obtain $\tilde{\mathbf{x}}^{(n_2)} = [\tilde{\mathbf{x}}_{n_2-m}, \ldots, \tilde{\mathbf{x}}_{n_2-1}, \tilde{\mathbf{x}}_{n_2}]$ where $\tilde{\mathbf{x}}_{n_2} = \mathbf{x}((N_1+l+m+n_2-1)\Delta t)$, $1 \le n_2 \le N_2 = N - N_1 - m - l$. The value $\tilde{\mathbf{x}}^{(n_2)}$ may be viewed as the $n_2$-th observed value of a certain vector $\tilde{\mathbf{x}}$ consisting of $(m+1)n_c$ random variables.

(2) Obtain $\tilde{\mathbf{x}}_{[q]}^{(n_2)}$ as a vector consisting of the initial $m n_c$ components of $\tilde{\mathbf{x}}^{(n_2)}$ and $q$ initial components of $\tilde{\mathbf{x}}_{n_2}$, $1 \le q \le n_c$. The value $\tilde{\mathbf{x}}_{[q]}^{(n_2)}$ may be viewed as the $n_2$-th observed value of a certain vector $\tilde{\mathbf{x}}_{[q]}$ consisting of $m n_c + q$ random variables.

(3) For $1 \le q \le n_c$, apply the method in Section 4 to find a multivariate power-normal distribution for $\tilde{\mathbf{x}}_{[q]}$ and use the resulting distribution to generate a value $x_q^*(t+\Delta t)$ for the $q$-th component of $\mathbf{x}(t+\Delta t)$ when $\mathbf{x}(t-(m-1)\Delta t), \ldots, \mathbf{x}(t-\Delta t), \mathbf{x}(t), x_1^*(t+\Delta t), x_2^*(t+\Delta t), \ldots, x_{q-1}^*(t+\Delta t)$ are given.

(4) Repeat (3) to generate $m^*$ values of $[x_1^*(t+\Delta t), x_2^*(t+\Delta t), \ldots, x_{n_c}^*(t+\Delta t)]$, and use the method in Section 4 to find a multivariate power-normal distribution as an estimate of the conditional distribution of $\mathbf{x}(t+\Delta t)$.
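
The data layout in steps (1) and (2) and the component-by-component generation in step (3) can be sketched as follows. The callable `fit_and_sample` is a hypothetical placeholder for "fit a power-normal distribution to the truncated vectors and draw the $q$-th component given the conditioning values"; it is not implemented here and is not an interface from the paper.

```python
def stack_parameters(x_seq, m):
    """Step (1): each row concatenates m + 1 consecutive Type 1 parameter vectors."""
    return [[c for x in x_seq[i:i + m + 1] for c in x]
            for i in range(len(x_seq) - m)]

def truncate(stacked, m, n_c, q):
    """Step (2): keep the first m*n_c components plus q components of the last vector."""
    return [row[:m * n_c + q] for row in stacked]

def generate_next(x_cond, n_c, fit_and_sample):
    """Step (3): draw the n_c components of x(t+dt) one at a time.

    x_cond holds the m*n_c conditioning values; each new component is drawn
    given x_cond together with the components already generated.
    """
    drawn = []
    for q in range(1, n_c + 1):
        drawn.append(fit_and_sample(q, list(x_cond) + drawn))
    return drawn

# Illustrative layout check with n_c = 3 and m = 1
x_seq = [[float(10 * t + c) for c in range(3)] for t in range(5)]
stacked = stack_parameters(x_seq, m=1)      # 4 rows of (m + 1) * n_c = 6 values
partial = truncate(stacked, m=1, n_c=3, q=2)  # 4 rows of m * n_c + q = 5 values
```

Generating the components sequentially means that each fitted distribution has dimension $m n_c + q$ rather than $(m+1)n_c$ throughout, with only the last fit reaching the full dimension.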

6 Performance of Model

We may construct a prediction interval for the future response and a prediction region for the future Type 1 parameter from the conditional pdf with Type 1 and Type 2 parameters respectively, and assess the performance of the model by using the expected sizes of the prediction interval and prediction region, and the ability of the interval and region to cover the respective future values.

From the conditional pdf of $r(t+\Delta t)$ when the values of $r(t-(l-1)\Delta t), \ldots, r(t-\Delta t)$ and $r(t)$ are given, we may use the $100(\alpha/2)\%$ point $L_\alpha$ and $100(1-\alpha/2)\%$ point $U_\alpha$ of the conditional pdf to form a nominally $100(1-\alpha)\%$ prediction interval $[L_\alpha, U_\alpha]$ for the future value $r(t+\Delta t)$. The important characteristics of the prediction interval $[L_\alpha, U_\alpha]$ are its

(A1) coverage probability $P_1$ which is defined as the probability that $[L_\alpha, U_\alpha]$ will cover the future observation $r(t+\Delta t)$, and

(B1) expected length $L_1$ which is defined as the expected value of the length $U_\alpha - L_\alpha$.

Estimates of $P_1$ and $L_1$ may be obtained via the following procedure. Suppose there are $N$ observed values $r_1, r_2, \ldots, r_N$ of the response. For $1 \le n_w \le N_w$, we use the method in Section 4 to obtain, from the values $r_{n_w}, r_{n_w+1}, \ldots, r_{n_w+N_1+l-1}$, an estimated conditional pdf, and construct the prediction interval based on the estimated conditional pdf.

The prediction interval may or may not cover the observed future observation $r_{n_w+N_1+l}$. The proportion of times (out of $N_w$ times) the prediction interval covers the observed future observation $r_{n_w+N_1+l}$ is then an estimate of the coverage probability $P_1$. Furthermore the average value of $U_\alpha - L_\alpha$ over $1 \le n_w \le N_w$ is an estimate of the expected length $L_1$.
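
The rolling evaluation described above can be sketched as follows. The interval constructor is passed in as a callable, so the same loop works for any model. The `random_walk_interval` function is a deliberately simple stand-in (last value plus or minus a normal quantile times the standard deviation of first differences) and is not the paper's power-normal interval.

```python
import statistics

def evaluate_intervals(r, n1, l, make_interval):
    """Estimate coverage probability P_1 and expected length L_1.

    Each window holds N_1 + l consecutive responses; the interval built from
    it is compared with the response that immediately follows the window.
    """
    window = n1 + l
    n_w = len(r) - window                     # N_w = N - N_1 - l intervals
    hits, total_len = 0, 0.0
    for start in range(n_w):
        lo, hi = make_interval(r[start:start + window])
        future = r[start + window]
        hits += lo <= future <= hi
        total_len += hi - lo
    return hits / n_w, total_len / n_w

def random_walk_interval(window, z=1.96):
    """Stand-in interval constructor (an assumption, not the paper's method)."""
    diffs = [b - a for a, b in zip(window, window[1:])]
    sd = statistics.stdev(diffs)
    return window[-1] - z * sd, window[-1] + z * sd
```

A call such as `evaluate_intervals(rates, 200, 1, random_walk_interval)`, with `rates` a hypothetical list of observed responses, returns the pair of estimates for that stand-in model.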

We note that when the values of $\mathbf{x}(t-(m-1)\Delta t), \ldots, \mathbf{x}(t-\Delta t)$ and $\mathbf{x}(t)$ are given, the random variable $\mathbf{x}(t+\Delta t)$ may be written as $\mathbf{x}(t+\Delta t) = \bar{\mathbf{r}}_x + \mathbf{H}_x\boldsymbol{\varepsilon}_x$ of which the right side has a structure which is similar to that of the right side of Equation (3.2). Thus like $\varepsilon_i$ in Equation (3.3), $\varepsilon_{xi}$ is a function of a random variable (denoted as $z_{xi}$) having a standard normal distribution. A nominally $100(1-\alpha)\%$ prediction region $R_\alpha$ for $\mathbf{x}(t+\Delta t)$ can now be formed from the values of $\mathbf{x}(t+\Delta t)$ of which the corresponding $z_{x1}, z_{x2}, \ldots, z_{xn_c}$ have a sum of squares

$$D^2 = \sum_{i=1}^{n_c} z_{xi}^2$$

which is less than or equal to the $100(1-\alpha)\%$ point $\chi^2_{n_c,\alpha}$ of a chi square distribution with $n_c$ degrees of freedom. The important characteristics of the prediction region $R_\alpha$ are its

(A2) coverage probability $P_2$ which is defined as the probability that $R_\alpha$ will cover the future value $\mathbf{x}(t+\Delta t)$, and

(B2) expected size $L_2$ which is defined as the expected value of the size of $R_\alpha$ in the $n_c$-dimensional space.

A measure of the size of $R_\alpha$ can be obtained as follows. We first transform the $n_c$-dimensional space for $\mathbf{x}(t+\Delta t)$ to an $n_c$-dimensional spherical polar coordinate system with center $\bar{\mathbf{r}}_x$. We next choose a large number ($N_p$, say) of polar angles which are of the same size, and find the average of the radial distances of the corresponding $N_p$ points $\mathbf{x}(t+\Delta t)$ of which $D^2 = \chi^2_{n_c,\alpha}$. The average radial distance is then a measure of the size of $R_\alpha$.

Estimates of $P_2$ and $L_2$ may be obtained in a way similar to that used for estimating $P_1$ and $L_1$.
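
The membership test for the prediction region reduces to comparing $D^2$ with a chi square percentage point. The sketch below assumes the values $z_{xi}$ have already been obtained. It also includes a small Monte Carlo routine that approximates the percentage point using only the standard library, as a sanity check on the value 28.87 that the numerical example in Section 7 quotes for 18 degrees of freedom. Function names are illustrative.

```python
import random

def in_prediction_region(z_values, chi2_point):
    """Return D^2 and whether it is within the chi square percentage point."""
    d2 = sum(z * z for z in z_values)
    return d2, d2 <= chi2_point

def mc_chi2_point(df, level=0.95, n=50_000, seed=0):
    """Monte Carlo approximation of a chi square percentage point.

    Simulates n sums of df squared standard normal draws and returns the
    empirical quantile at the requested level.
    """
    rng = random.Random(seed)
    draws = sorted(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
                   for _ in range(n))
    return draws[int(level * n)]

approx_point = mc_chi2_point(18)
```

In practice a statistics library would supply the exact percentage point; the simulation is only meant to make the role of the threshold concrete.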

For a prediction interval (or region) to be classified as satisfactory, it should at least have a coverage probability which is not too far from the target value. Among two prediction intervals (or regions) of which the coverage probabilities are close to the target value, the one with a smaller expected size is deemed to be a better prediction interval (or region).

The model would be considered as satisfactory if the related prediction interval and prediction region are satisfactory.

7 Numerical Examples

The fluctuation of interest rates is very important in investment decisions and risk management in the financial markets. One-factor models are a popular class of models for describing the fluctuation of interest rates. In a one-factor model, the interest rate $r(t)$ at time $t$ may be specified via the stochastic differential equation

$$dr(t) = \mu(t, r(t))\,dt + \sigma(t, r(t))\,dW(t)$$

where $\mu$ and $\sigma$ are respectively the drift and diffusion term of the interest rate process, and $W$ is a Brownian motion.

An important example of a one-factor model is the Chan, Karolyi, Longstaff and Sanders (CKLS) model:

$$dr(t) = \kappa\,(\mu - r(t))\,dt + \sigma\, r(t)^{\gamma}\,dW(t), \qquad r(0) = r_0,$$

where $\kappa, \mu, \sigma > 0$, $\gamma \ge 0$ are constants.

The above one-factor model may be used for determining the prices of bonds, bond options, swaps, caps, floors, etc.

In what follows, the CKLS model will be used to fit a real dataset on interest rates.

Consider the 6-month Treasury bill rates data obtained from the link http://research.stlouisfed.org/fred2/categories/116 under the file name WTB6MS.xls. The total number of data points is $N = 2566$ and $\Delta t = 7/365$ represents the length of a one-week period. The maximum likelihood estimates of the parameters in the CKLS model for the dataset are found to be respectively $\kappa = 0.0280$, $\mu = 0.0615$, $\sigma = 0.1733$ and $\gamma = 0.9599$.
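
To make the CKLS dynamics concrete, the following is a minimal Euler-Maruyama simulation sketch. It assumes the mean-reverting parameterization $dr = \kappa(\mu - r)\,dt + \sigma r^{\gamma}\,dW$ and reads the four reported estimates in the order $(\kappa, \mu, \sigma, \gamma)$; both the parameterization and the starting value `r0` are assumptions of this sketch, and the discretization scheme is a generic choice rather than the estimation method used in the paper.

```python
import math
import random

def simulate_ckls(r0, kappa, mu, sigma, gamma, dt, n_steps, seed=0):
    """Euler-Maruyama path of dr = kappa*(mu - r)*dt + sigma*r**gamma*dW."""
    rng = random.Random(seed)
    path = [r0]
    for _ in range(n_steps):
        # Truncate at zero so that r**gamma stays real if a step overshoots
        r = max(path[-1], 0.0)
        dw = rng.gauss(0.0, math.sqrt(dt))
        path.append(r + kappa * (mu - r) * dt + sigma * r ** gamma * dw)
    return path

# Weekly steps with the reported estimates; r0 = 0.05 is an arbitrary start
path = simulate_ckls(r0=0.05, kappa=0.0280, mu=0.0615, sigma=0.1733,
                     gamma=0.9599, dt=7 / 365, n_steps=52)
```

Repeating the simulation over many seeds and taking empirical quantiles of the one-step-ahead values is one possible way to form prediction intervals from such a model.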

Let $l = 1$ and choose a value of 200 for $N_1$ (see Section 4). For $t = j\Delta t$, $j = N_1 + l, N_1 + l + 1, \ldots, N - 1$, we use the method in Section 4 to estimate the conditional pdf of $r(t+\Delta t)$ when $r(t-(l-1)\Delta t), \ldots, r(t-\Delta t)$ and $r(t)$ are given. From the conditional pdf of $r(t+\Delta t)$, we find a nominally 95% prediction interval for $r(t+\Delta t)$. The total number of prediction intervals which can be obtained will then be $N - N_1 - l$. From the $N - N_1 - l$ prediction intervals, the estimates of the coverage probability and expected length of the prediction interval are found to be $\hat{P}_1 = 0.9226$ and $\hat{L}_1 = 0.00612$ respectively.

By using the CKLS model, we can obtain a total of $N - N_1 - l$ corresponding prediction intervals which give 0.9496 and 0.006132 as the estimates of the coverage probability and expected length respectively.

Table 7.1 gives the estimates of coverage probability and expected length for the cases when $1 \le l \le 5$ are used. The table shows that the prediction intervals based on the proposed model have smaller coverage probabilities and shorter expected lengths when compared with those based on the CKLS model. We also note that the expected length of the prediction interval based on the proposed model clearly decreases as $l$ increases from 1 to 3 but the decrease becomes less obvious after $l = 3$.

Table 7.2 displays the estimates of the coverage probability and expected length when l is fixed at 3 but N₁ ranges from 200 to 1500. From the table, we see that for suitable choices of N₁, the prediction interval based on the proposed model can have an estimated coverage probability which is close to the target value, but yet has an expected length which is shorter than that of the prediction interval based on the CKLS model.

Table 7.1 Estimated coverage probability and expected length of prediction interval [α = 0.05, $N_1$ = 200].

l       $\hat{P}_1$   $\hat{L}_1$
1       0.9226        0.00612
2       0.9163        0.00532
3       0.9153        0.00517
4       0.9178        0.00510
5       0.9165        0.00502
CKLS    (0.9496)      (0.00613)

Table 7.2 Estimated coverage probability and expected length of prediction interval [α = 0.05, l = 3, the values in parentheses are the estimated coverage probability and expected length of prediction interval based on the CKLS model].

$N_1$   $\hat{P}_1$         $\hat{L}_1$
200     0.9153 (0.9496)     0.00517 (0.00613)
500     0.9210 (0.9442)     0.00547 (0.00638)
1000    0.9405 (0.9507)     0.00547 (0.00639)
1500    0.9774 (0.9652)     0.00434 (0.00483)

Now let $l = 2$, $m = 1$ and $N_1 = 200$. For $t = j\Delta t$, $j = N_1 + l, N_1 + l + 1, \ldots, N - 1$, we use the method in Section 4 to estimate the Type 1 parameter $\mathbf{x}(t)$ for the conditional pdf of $r(t+\Delta t)$ when $r(t-(l-1)\Delta t), \ldots, r(t-\Delta t)$ and $r(t)$ are given. From the estimated values of $\mathbf{x}(t)$, we next obtain $\tilde{\mathbf{x}}^{(n_2)}$ for $1 \le n_2 \le N_2 = N - N_1 - m - l$, and use the method in Section 5 to estimate the parameter $\boldsymbol{\theta}$ of the conditional pdf of $\mathbf{x}(t+\Delta t)$ when $\mathbf{x}^*(t) = [\mathbf{x}(t)]$ is given. For a given value of $n_2$, a nominally 95% prediction region $R_\alpha$ for $\mathbf{x}((N_1+l+m+n_2-1)\Delta t)$ can be found. To find out whether the observed $\mathbf{x}((N_1+l+m+n_2-1)\Delta t)$ lies in the prediction region $R_\alpha$, we compute

$$D^2 = \sum_{i=1}^{n_c} z_{xi}^2$$

where $n_c = 18$ (see Section 6), and find out whether the computed $D^2$ is less than the 95% point 28.87 of the chi square distribution with 18 degrees of freedom. When $n_2 = 1$ and 2000, the computed values of $D^2$ are found to be 2.44 and 7.55 respectively, indicating that the observed values of $\mathbf{x}((N_1+l+m+n_2-1)\Delta t)$ lie in the corresponding prediction regions. As the above two computed values of $D^2$ are both less than the 95% point of the chi square distribution, there are no good reasons to suspect the validity of the conditional pdf of $\mathbf{x}(t+\Delta t)$. A more thorough investigation of the conditional pdf of $\mathbf{x}(t+\Delta t)$ would be possible if we could alleviate the problem of long computing time involved in estimating the high-dimensional distributions.
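
As a consistency check on the quantities quoted in this example, the component count from Section 4 and the definition of $N_2$ from Section 5 can be evaluated for $l = 2$, $m = 1$, $N_1 = 200$ and $N = 2566$:

```latex
n_c = \frac{(l+1)(l+2)}{2} + 4(l+1) = \frac{3 \cdot 4}{2} + 4 \cdot 3 = 18,
\qquad
N_2 = N - N_1 - m - l = 2566 - 200 - 1 - 2 = 2363 .
```

Both index values 1 and 2000 at which $D^2$ is reported therefore lie within the range $1 \le n_2 \le N_2$, and the 18 degrees of freedom of the chi square distribution match $n_c$.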

8 Concluding Remarks

The model proposed for time series analysis features a fairly general conditional distribution of the future response when the values of the present and past observations are given. Although the mean of the conditional distribution is restricted to be a linear function of the present and past observations, the model is fairly general as the variance, skewness and kurtosis of the conditional distribution are modeled as functions of the present and past observations. The generality of the model is further enhanced by describing the parameter of the conditional distribution of the future response using a multi-dimensional time series.


The performance of the conditional distribution of the future response in the proposed model has been examined and found to be comparable to that of the well-known CKLS model when a real dataset on interest rates is used.

However, the performance of the multi-dimensional time series formed by the parameters of the conditional distributions of the future responses in the proposed model has not been extensively examined due to the requirement of long computing time.

The proposed model may be applied to other datasets in finance and other areas where the underlying distributions are unimodal and non-normal, and the time series of the responses is non-stationary.

References

[1] B.D.O. Anderson and J.B. Moore, Optimal Filtering, Prentice-Hall Inc., Englewood Cliffs, New Jersey, 1979.

[2] T. Bollerslev, Generalized Autoregressive Conditional Heteroscedasticity, Journal of Econometrics, 31 (1986), 307-327.

[3] G. Box and G. Jenkins, Time Series Analysis: Forecasting and Control, Holden-Day, San Francisco, 1970.

[4] R. Engle, Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation, Econometrica, 50 (1982), 987-1008.

[5] F.L. Lewis, Optimal Estimation, John Wiley & Sons, New York, 1986.

[6] I.K. Yeo and R.A. Johnson, A new family of power transformations to improve normality or symmetry, Biometrika, 87 no. 4 (2000), 954-959.