RESEARCH · Open Access

Maximum likelihood estimators in linear regression models with Ornstein-Uhlenbeck process

Hongchang Hu, Xiong Pan* and Lifeng Xu

*Correspondence: pxjlh@163.com. Faculty of Information Engineering, China University of Geosciences, Wuhan, 430074, China. Full list of author information is available at the end of the article.
Abstract

The paper studies the linear regression model

\[ y_t = x_t^T\beta + \varepsilon_t, \quad t = 1, 2, \ldots, n, \]

where

\[ d\varepsilon_t = \lambda(\mu - \varepsilon_t)\,dt + \sigma\,dB_t, \]

with parameters $\lambda, \sigma \in \mathbb{R}^+$, $\mu \in \mathbb{R}$, and $\{B_t, t \ge 0\}$ the standard Brownian motion. Firstly, the maximum likelihood (ML) estimators of $\beta$, $\lambda$ and $\sigma^2$ are given. Secondly, under general conditions, the asymptotic properties of the ML estimators are investigated. Then limiting distributions for likelihood ratio test statistics of the hypothesis are given. Lastly, the validity of the method is illustrated by two real examples.

MSC: 62J05; 62M10; 60J60

Keywords: linear regression model; maximum likelihood estimator; Ornstein-Uhlenbeck process; asymptotic property; likelihood ratio test
1 Introduction
Consider the following linear regression model:

\[ y_t = x_t^T\beta + \varepsilon_t, \quad t = 1, 2, \ldots, n, \quad (1.1) \]

where the $y_t$'s are scalar response variables, the $x_t$'s are explanatory variables, $\beta$ is an $m$-dimensional unknown parameter, and $\{\varepsilon_t\}$ is an Ornstein-Uhlenbeck process, which satisfies the linear stochastic differential equation (SDE)

\[ d\varepsilon_t = \lambda(\mu - \varepsilon_t)\,dt + \sigma\,dB_t \quad (1.2) \]

with parameters $\lambda, \sigma \in \mathbb{R}^+$, $\mu \in \mathbb{R}$ and $\{B_t, t \ge 0\}$ the standard Brownian motion.
It is well known that the linear regression model is among the most important and popular models in the statistical literature, and it has attracted many investigators. For the ordinary linear regression model (when the errors are independent and identically distributed (i.i.d.) random variables), Wang and Zhou [], Anatolyev [], Bai and Guo [], Chen [], Gil et al. [], Hampel et al. [], Cui [], Durbin [] and Li and Yang [] used various estimation methods to obtain estimators of the unknown parameters in (1.1) and discussed some large- or small-sample properties of these estimators. Recently, linear regression with serially correlated errors has attracted increasing attention from statisticians and economists. One case of considerable interest is that where the errors are autoregressive processes; Hu [], Wu [], and Fox and Taqqu [] established asymptotic normality with the usual $\sqrt{n}$-normalization in the case of long-memory stationary Gaussian observation errors. Giraitis and Surgailis [] extended this result to non-Gaussian linear sequences. Koul and Surgailis [] established the asymptotic normality of the Whittle estimator in linear regression models with non-Gaussian long-memory moving average errors. Shiohama and Taniguchi [] estimated the regression parameters in a linear regression model with an autoregressive process. Fan [] investigated moderate deviations for M-estimators in linear models with $\varphi$-mixing errors.
The Ornstein-Uhlenbeck process was originally introduced by Ornstein and Uhlenbeck [] as a model for particle motion in a fluid. In the physical sciences, the Ornstein-Uhlenbeck process is a prototype of a noisy relaxation process, whose probability density function $f(x,t)$ can be described by the Fokker-Planck equation (see Janczura et al. [], Debbasch et al. [], Gillespie [], Ditlevsen and Lansky [], Garbaczewski and Olkiewicz [], Plastino and Plastino []):

\[ \frac{\partial f(x,t)}{\partial t} = \frac{\partial}{\partial x}\bigl[\lambda(x-\mu)f(x,t)\bigr] + \frac{\sigma^2}{2}\,\frac{\partial^2 f(x,t)}{\partial x^2}. \]

This process is now widely used in many areas of application. The main characteristic of the Ornstein-Uhlenbeck process is its tendency to return towards the long-term equilibrium $\mu$. This property, known as mean reversion, is found in many real-life processes, e.g., in commodity and energy price processes (see Fasen [], Yu [], Geman []). A number of papers are concerned with the Ornstein-Uhlenbeck process, for example, Janczura et al. [], Zhang et al. [], Rieder [], Iacus [], Bishwal [], Shimizu [], Zhang and Zhang [], Chronopoulou and Viens [], Lin and Wang [] and Xiao et al. []. It is well known that the solution of model (1.2) is an autoregressive process. For constant, functional or random coefficient autoregressive models, many authors (for example, Magdalinos [], Andrews and Guggenberger [], Fan and Yao [], Berk [], Goldenshluger and Zeevi [], Liebscher [], Baran et al. [], Distaso [] and Harvill and Ray []) used various estimation methods to obtain estimators and discussed asymptotic properties of these estimators, or investigated hypothesis testing.
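The mean-reverting stationary behaviour can be checked against the Fokker-Planck equation directly: the Gaussian density with mean $\mu$ and variance $\sigma^2/(2\lambda)$ makes the right-hand side vanish. A minimal numerical sketch of this check (all parameter values are illustrative assumptions, not from the paper):

```python
import numpy as np

# Stationary density of the OU process dX = lam*(mu - X)dt + sigma*dB:
#   f(x) = sqrt(lam/(pi*sigma^2)) * exp(-lam*(x-mu)^2/sigma^2),
# i.e. a normal density with mean mu and variance sigma^2/(2*lam).
# It should satisfy the stationary Fokker-Planck equation
#   0 = d/dx[lam*(x-mu)*f(x)] + (sigma^2/2) * d^2 f/dx^2.
lam, mu, sigma = 1.5, 0.3, 0.8  # assumed illustrative values

x = np.linspace(mu - 4, mu + 4, 4001)
h = x[1] - x[0]
f = np.sqrt(lam / (np.pi * sigma**2)) * np.exp(-lam * (x - mu)**2 / sigma**2)

drift = lam * (x - mu) * f
# central differences for d/dx and d^2/dx^2 (interior points only)
d_drift = (drift[2:] - drift[:-2]) / (2 * h)
d2_f = (f[2:] - 2 * f[1:-1] + f[:-2]) / h**2

residual = d_drift + 0.5 * sigma**2 * d2_f
print(np.max(np.abs(residual)))  # ~0 up to discretization error
```

The residual vanishes up to the $O(h^2)$ error of the finite differences, confirming that the normal law $N(\mu, \sigma^2/(2\lambda))$ is the stationary solution.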
By (1.1) and (1.2), we can obtain the more general process satisfying the SDE

\[ dy_t = \lambda\bigl(L(t,\lambda,\mu,\beta) - y_t\bigr)\,dt + \sigma\,dB_t, \quad (1.3) \]

and Wang [] established the existence of a successful coupling for a class of stochastic differential equations given by (1.3). Bishwal [] investigated the uniform rate of weak convergence of the minimum contrast estimator in the Ornstein-Uhlenbeck process (1.2).
The solution of model (1.2) is given by

\[ \varepsilon_t = e^{-\lambda t}\varepsilon_0 + \mu\bigl(1 - e^{-\lambda t}\bigr) + \sigma\int_0^t e^{\lambda(s-t)}\,dB_s, \quad (1.4) \]

where $\int_0^t e^{\lambda(s-t)}\,dB_s \sim N\bigl(0, \frac{1-\exp(-2\lambda t)}{2\lambda}\bigr)$.
The process observed in discrete time is more relevant in statistics and economics. Therefore, by (1.4), the Ornstein-Uhlenbeck time series for $t = 1, 2, \ldots, n$ is given by

\[ \varepsilon_t = e^{-\lambda d}\varepsilon_{t-1} + \mu\bigl(1 - e^{-\lambda d}\bigr) + \sigma\sqrt{\frac{1 - e^{-2\lambda d}}{2\lambda}}\,\eta_t, \quad (1.5) \]

where the $\eta_t \sim N(0,1)$ are i.i.d. random errors and $d$ is an equidistant time lag, fixed in advance.

Models (1.1) and (1.5) include many special cases, such as linear regression models with constant coefficient autoregressive errors (when $\mu = 0$; see Hu [], Wu [], Maller [], Pere [] and Fuller []), Ornstein-Uhlenbeck time series or processes (when $\beta = 0$; see Rieder [], Iacus [], Bishwal [], Shimizu [] and Zhang and Zhang []), and constant coefficient autoregressive processes (when $\mu = 0$, $\beta = 0$; see Chambers [], Hamilton [], Brockwell and Davis [] and Abadir and Lucas [], etc.).
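The recursion (1.5) with $\mu = 0$ is an AR(1) scheme and can be simulated exactly. A small sketch (all parameter values are assumptions for illustration) that also checks the stationary variance $\sigma^2/(2\lambda)$ of the process:

```python
import numpy as np

rng = np.random.default_rng(0)

# Exact discretization of the OU process (mu = 0, eps_0 = 0):
#   eps_t = exp(-lam*d)*eps_{t-1} + sigma*sqrt((1 - exp(-2*lam*d))/(2*lam))*eta_t
lam, sigma, d, n = 1.0, 0.5, 0.1, 100_000  # assumed illustrative values

a = np.exp(-lam * d)                                          # AR(1) coefficient
s = sigma * np.sqrt((1 - np.exp(-2 * lam * d)) / (2 * lam))   # innovation sd
eta = rng.standard_normal(n)

eps = np.empty(n)
prev = 0.0
for t in range(n):
    prev = a * prev + s * eta[t]
    eps[t] = prev

# The stationary variance of the OU process is sigma^2/(2*lam).
print(eps.var(), sigma**2 / (2 * lam))
```

The sample variance settles near $\sigma^2/(2\lambda) = 0.125$, matching the continuous-time stationary law.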
The paper discusses models (1.1) and (1.5). The organization of the paper is as follows. In Section 2 estimators of $\beta$, $\lambda$ and $\sigma^2$ are given by the quasi-maximum likelihood method. Under general conditions, the existence and consistency of the quasi-maximum likelihood estimators, as well as their asymptotic normality, are investigated in Section 3. Hypothesis testing is discussed in Section 4. Some preliminary lemmas are presented in Section 5. The main proofs of the theorems are presented in Section 6, with two real examples in Section 7.
2 Estimation method
Without loss of generality, we assume that $\mu = 0$ and $\varepsilon_0 = 0$ in the sequel. Write the 'true' model as

\[ y_t = x_t^T\beta_0 + e_t, \quad t = 1, 2, \ldots, n \quad (2.1) \]

and

\[ e_t = \exp(-\lambda_0 d)e_{t-1} + \sigma_0\sqrt{\frac{1 - \exp(-2\lambda_0 d)}{2\lambda_0}}\,\eta_t, \quad (2.2) \]

where the $\eta_t \sim N(0,1)$ are i.i.d.

By (2.2), we have

\[ e_t = \sigma_0\sqrt{\frac{1 - \exp(-2\lambda_0 d)}{2\lambda_0}}\sum_{j=1}^t \exp\bigl(-\lambda_0 d(t-j)\bigr)\eta_j. \quad (2.3) \]

Thus $e_t$ is measurable with respect to the $\sigma$-field $\mathcal{H}_t$ generated by $\eta_1, \eta_2, \ldots, \eta_t$, and

\[ Ee_t = 0, \qquad \operatorname{Var}(e_t) = \frac{\sigma_0^2\bigl(1 - \exp(-2\lambda_0 d\,t)\bigr)}{2\lambda_0}. \quad (2.4) \]
Using similar arguments to those of Rieder [] or Maller [], we get the log-likelihood of $y_2, y_3, \ldots, y_n$ conditional on $y_1$:

\[ \Phi_n(\beta,\lambda,\sigma^2) = \log L_n = -\frac{n-1}{2}\log\frac{\pi\sigma^2}{\lambda} - \frac{n-1}{2}\log\bigl(1 - \exp(-2\lambda d)\bigr) - \frac{\lambda}{\sigma^2\bigl(1 - \exp(-2\lambda d)\bigr)}\sum_{t=2}^n\bigl(\varepsilon_t - \exp(-\lambda d)\varepsilon_{t-1}\bigr)^2. \quad (2.5) \]

We maximize (2.5) to obtain the quasi-maximum likelihood (QML) estimators, denoted by $\hat\sigma_n^2$, $\hat\beta_n$ and $\hat\lambda_n$ (when they exist).
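A direct transcription of the conditional log-likelihood (2.5) can be sanity-checked against a sum of per-observation Gaussian log-densities with innovation variance $\sigma^2(1-e^{-2\lambda d})/(2\lambda)$; both expressions are algebraically identical. The following sketch does exactly that (function and parameter names are ours, not the paper's):

```python
import numpy as np

def ou_loglik(eps, lam, sigma2, d):
    """Conditional log-likelihood (2.5): eps_2,...,eps_n given eps_1 for the
    recursion eps_t = exp(-lam*d)*eps_{t-1} + innovation, where the
    innovation variance is v = sigma2*(1 - exp(-2*lam*d))/(2*lam)."""
    a = np.exp(-lam * d)
    c = 1 - np.exp(-2 * lam * d)
    resid = eps[1:] - a * eps[:-1]
    n1 = len(resid)  # n - 1 terms
    return (-0.5 * n1 * np.log(np.pi * sigma2 / lam)
            - 0.5 * n1 * np.log(c)
            - lam / (sigma2 * c) * np.sum(resid**2))

# Cross-check against a direct sum of Gaussian log-densities.
rng = np.random.default_rng(1)
lam, sigma2, d = 2.0, 0.3, 0.25  # assumed illustrative values
a = np.exp(-lam * d)
v = sigma2 * (1 - np.exp(-2 * lam * d)) / (2 * lam)  # innovation variance
eps = np.empty(500)
eps[0] = 0.0
for t in range(1, 500):
    eps[t] = a * eps[t - 1] + np.sqrt(v) * rng.standard_normal()

resid = eps[1:] - a * eps[:-1]
direct = np.sum(-0.5 * np.log(2 * np.pi * v) - resid**2 / (2 * v))
ll_closed = ou_loglik(eps, lam, sigma2, d)
print(ll_closed, direct)  # the two values agree
```

The agreement follows because $2\pi v = \pi\sigma^2(1-e^{-2\lambda d})/\lambda$, so the two normalizing constants coincide term by term.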
Then the first derivatives of $\Phi_n$ may be written as

\[ \frac{\partial\Phi_n}{\partial\sigma} = -\frac{n-1}{\sigma} + \frac{2\lambda}{\sigma^3\bigl(1-\exp(-2\lambda d)\bigr)}\sum_{t=2}^n\bigl(\varepsilon_t - \exp(-\lambda d)\varepsilon_{t-1}\bigr)^2, \quad (2.6) \]

\[ \frac{\partial\Phi_n}{\partial\lambda} = \frac{n-1}{2\lambda} - \frac{(n-1)d\exp(-2\lambda d)}{1-\exp(-2\lambda d)} - \frac{2d\lambda\exp(-\lambda d)}{\sigma^2\bigl(1-\exp(-2\lambda d)\bigr)}\sum_{t=2}^n\bigl(\varepsilon_t - \exp(-\lambda d)\varepsilon_{t-1}\bigr)\varepsilon_{t-1} - \frac{1-(1+2d\lambda)\exp(-2\lambda d)}{\sigma^2\bigl(1-\exp(-2\lambda d)\bigr)^2}\sum_{t=2}^n\bigl(\varepsilon_t - \exp(-\lambda d)\varepsilon_{t-1}\bigr)^2 \quad (2.7) \]

and

\[ \frac{\partial\Phi_n}{\partial\beta} = \frac{2\lambda}{\sigma^2\bigl(1-\exp(-2\lambda d)\bigr)}\sum_{t=2}^n\bigl(\varepsilon_t - \exp(-\lambda d)\varepsilon_{t-1}\bigr)\bigl(x_t - \exp(-\lambda d)x_{t-1}\bigr). \quad (2.8) \]
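The score expressions (2.6)-(2.8) can be verified numerically by comparing them with central finite differences of the log-likelihood. A sketch under assumed data and parameter values (the identities are algebraic, so any data will do):

```python
import numpy as np

rng = np.random.default_rng(2)
d = 0.5
n, m = 60, 2
X = rng.standard_normal((n, m))
y = rng.standard_normal(n)  # arbitrary data; the identities hold pointwise

def loglik(beta, lam, sigma):
    a = np.exp(-lam * d)
    c = 1 - np.exp(-2 * lam * d)
    eps = y - X @ beta
    r = eps[1:] - a * eps[:-1]
    return (-0.5 * (n - 1) * np.log(np.pi * sigma**2 / lam)
            - 0.5 * (n - 1) * np.log(c)
            - lam / (sigma**2 * c) * np.sum(r**2))

def score(beta, lam, sigma):
    """Analytic partial derivatives (2.6)-(2.8)."""
    a = np.exp(-lam * d)
    c = 1 - np.exp(-2 * lam * d)
    eps = y - X @ beta
    r = eps[1:] - a * eps[:-1]
    dx = X[1:] - a * X[:-1]
    g_beta = 2 * lam / (sigma**2 * c) * (r[:, None] * dx).sum(axis=0)
    g_lam = ((n - 1) / (2 * lam) - (n - 1) * d * np.exp(-2 * lam * d) / c
             - 2 * d * lam * a / (sigma**2 * c) * np.sum(r * eps[:-1])
             - (1 - (1 + 2 * d * lam) * np.exp(-2 * lam * d))
               / (sigma**2 * c**2) * np.sum(r**2))
    g_sigma = -(n - 1) / sigma + 2 * lam / (sigma**3 * c) * np.sum(r**2)
    return g_beta, g_lam, g_sigma

beta0, lam0, sig0 = np.array([0.4, -0.2]), 1.3, 0.7
gb, gl, gs = score(beta0, lam0, sig0)

h = 1e-6
num_gl = (loglik(beta0, lam0 + h, sig0) - loglik(beta0, lam0 - h, sig0)) / (2 * h)
num_gs = (loglik(beta0, lam0, sig0 + h) - loglik(beta0, lam0, sig0 - h)) / (2 * h)
e0 = np.array([h, 0.0])
num_gb0 = (loglik(beta0 + e0, lam0, sig0) - loglik(beta0 - e0, lam0, sig0)) / (2 * h)
print(gl - num_gl, gs - num_gs, gb[0] - num_gb0)  # all ~0
```

Each analytic derivative agrees with its numerical counterpart to finite-difference accuracy.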
Thus $\hat\sigma_n^2$, $\hat\beta_n$, $\hat\lambda_n$ satisfy the following estimating equations:

\[ \hat\sigma_n^2 = \frac{2\hat\lambda_n}{(n-1)\bigl(1-\exp(-2\hat\lambda_n d)\bigr)}\sum_{t=2}^n\bigl(\hat\varepsilon_t - \exp(-\hat\lambda_n d)\hat\varepsilon_{t-1}\bigr)^2, \quad (2.9) \]

\[ \frac{\hat\sigma_n^2\bigl(1-(1+2d\hat\lambda_n)\exp(-2\hat\lambda_n d)\bigr)}{2\hat\lambda_n} - \frac{2d\hat\lambda_n\exp(-\hat\lambda_n d)}{n-1}\sum_{t=2}^n\bigl(\hat\varepsilon_t - \exp(-\hat\lambda_n d)\hat\varepsilon_{t-1}\bigr)\hat\varepsilon_{t-1} - \frac{1-(1+2d\hat\lambda_n)\exp(-2\hat\lambda_n d)}{\bigl(1-\exp(-2\hat\lambda_n d)\bigr)(n-1)}\sum_{t=2}^n\bigl(\hat\varepsilon_t - \exp(-\hat\lambda_n d)\hat\varepsilon_{t-1}\bigr)^2 = 0 \quad (2.10) \]

and

\[ \sum_{t=2}^n\bigl(\hat\varepsilon_t - \exp(-\hat\lambda_n d)\hat\varepsilon_{t-1}\bigr)\bigl(x_t - \exp(-\hat\lambda_n d)x_{t-1}\bigr) = 0, \quad (2.11) \]

where $\hat\varepsilon_t = y_t - x_t^T\hat\beta_n$.
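The estimating equations suggest a simple computational strategy: for fixed $\lambda$, (2.11) is a linear system for $\hat\beta$ (ordinary least squares on the quasi-differenced data), (2.9) then yields $\hat\sigma^2$, and $\hat\lambda$ can be found by a one-dimensional search over the profiled likelihood. A sketch on simulated data (sample size, grid bounds and parameter values are arbitrary choices of ours):

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated data from the model y_t = x_t' beta + e_t with OU errors (2.1)-(2.2).
n, d = 3000, 0.1
lam0, sigma0 = 1.0, 0.5
beta0 = np.array([1.0, -2.0])
a0 = np.exp(-lam0 * d)
s0 = sigma0 * np.sqrt((1 - np.exp(-2 * lam0 * d)) / (2 * lam0))
e = np.empty(n); prev = 0.0
for t in range(n):
    prev = a0 * prev + s0 * rng.standard_normal()
    e[t] = prev
X = rng.standard_normal((n, 2))
y = X @ beta0 + e

def profile(lam):
    """beta_hat(lam) from the quasi-differenced normal equations (2.11),
    then sigma2_hat(lam) from (2.9), then the profiled log-likelihood."""
    a = np.exp(-lam * d); c = 1 - np.exp(-2 * lam * d)
    dX = X[1:] - a * X[:-1]; dy = y[1:] - a * y[:-1]
    beta = np.linalg.solve(dX.T @ dX, dX.T @ dy)
    r = dy - dX @ beta
    sigma2 = 2 * lam * np.sum(r**2) / ((n - 1) * c)
    ll = (-0.5 * (n - 1) * np.log(np.pi * sigma2 / lam)
          - 0.5 * (n - 1) * np.log(c) - lam * np.sum(r**2) / (sigma2 * c))
    return ll, beta, sigma2

grid = np.linspace(0.2, 3.0, 281)
lam_hat = max(grid, key=lambda l: profile(l)[0])
_, beta_hat, sigma2_hat = profile(lam_hat)
print(lam_hat, beta_hat, sigma2_hat)
```

On this simulated sample the recovered $(\hat\lambda_n, \hat\beta_n, \hat\sigma_n^2)$ land close to the true values $(1.0, (1,-2)^T, 0.25)$, consistent with the consistency results of Section 3.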
To obtain our results, the following conditions are sufficient (see Maller []).

(A1) $X_n = \sum_{t=1}^n x_t x_t^T$ is positive definite for sufficiently large $n$, and

\[ \lim_{n\to\infty}\max_{1\le t\le n} x_t^T X_n^{-1} x_t = 0. \quad (2.12) \]

(A2)

\[ \limsup_{n\to\infty}\,|\tilde\lambda|_{\max}\bigl(X_n^{-1/2} Z_n X_n^{-T/2}\bigr) < 1, \quad (2.13) \]

where $Z_n = \frac{1}{2}\sum_{t=2}^n\bigl(x_t x_{t-1}^T + x_{t-1} x_t^T\bigr)$, $X_n^{1/2}$ denotes a square root of $X_n$ (with $X_n^{-T/2}$ its transposed inverse), and $|\tilde\lambda|_{\max}(\cdot)$ denotes the maximum in absolute value of the eigenvalues of a symmetric matrix.
For ease of exposition, we introduce the following notation, which will be used later in the paper. Let $\theta = (\beta^T, \lambda)^T$ be an $(m+1)$-vector. Define

\[ S_n(\theta) = \sigma^2\frac{\partial\Phi_n}{\partial\theta} = \sigma^2\Bigl(\Bigl(\frac{\partial\Phi_n}{\partial\beta}\Bigr)^T, \frac{\partial\Phi_n}{\partial\lambda}\Bigr)^T, \qquad F_n(\theta) = -\sigma^2\frac{\partial^2\Phi_n}{\partial\theta\,\partial\theta^T}. \quad (2.14) \]
By (2.7) and (2.8), we get the components of $F_n(\theta)$:

\[ -\sigma^2\frac{\partial^2\Phi_n}{\partial\beta\,\partial\beta^T} = \frac{2\lambda}{1-\exp(-2\lambda d)}\sum_{t=2}^n\bigl(x_t - \exp(-\lambda d)x_{t-1}\bigr)\bigl(x_t - \exp(-\lambda d)x_{t-1}\bigr)^T =: \frac{2\lambda}{1-\exp(-2\lambda d)}X_n(\lambda), \quad (2.15) \]

\[ -\sigma^2\frac{\partial^2\Phi_n}{\partial\beta\,\partial\lambda} = -\frac{2d\lambda\exp(-\lambda d)}{1-\exp(-2\lambda d)}\sum_{t=2}^n\bigl(\varepsilon_{t-1}x_t + \varepsilon_t x_{t-1} - 2\exp(-\lambda d)x_{t-1}\varepsilon_{t-1}\bigr) - \frac{2\bigl(1-(1+2d\lambda)\exp(-2\lambda d)\bigr)}{\bigl(1-\exp(-2\lambda d)\bigr)^2}\sum_{t=2}^n\bigl(\varepsilon_t - \exp(-\lambda d)\varepsilon_{t-1}\bigr)\bigl(x_t - \exp(-\lambda d)x_{t-1}\bigr) \quad (2.16) \]
and

\[ -\sigma^2\frac{\partial^2\Phi_n}{\partial\lambda^2} = \frac{\sigma^2(n-1)}{2\lambda^2} - \frac{2\sigma^2(n-1)d^2\exp(-2\lambda d)}{\bigl(1-\exp(-2\lambda d)\bigr)^2} + \frac{2d^2\lambda\exp(-2\lambda d)}{1-\exp(-2\lambda d)}\sum_{t=2}^n\varepsilon_{t-1}^2 + \frac{2d\exp(-\lambda d)\bigl[(2-d\lambda) - (2+3d\lambda)\exp(-2\lambda d)\bigr]}{\bigl(1-\exp(-2\lambda d)\bigr)^2}\sum_{t=2}^n\bigl(\varepsilon_t - \exp(-\lambda d)\varepsilon_{t-1}\bigr)\varepsilon_{t-1} + \frac{4d\exp(-2\lambda d)\bigl[d\lambda - 1 + (1+d\lambda)\exp(-2\lambda d)\bigr]}{\bigl(1-\exp(-2\lambda d)\bigr)^3}\sum_{t=2}^n\bigl(\varepsilon_t - \exp(-\lambda d)\varepsilon_{t-1}\bigr)^2. \quad (2.17) \]
Hence we have

\[ F_n(\theta) = \begin{pmatrix} \dfrac{2\lambda}{1-\exp(-2\lambda d)}X_n(\lambda) & -\sigma^2\dfrac{\partial^2\Phi_n}{\partial\beta\,\partial\lambda} \\[2mm] * & -\sigma^2\dfrac{\partial^2\Phi_n}{\partial\lambda^2} \end{pmatrix}, \quad (2.18) \]

where the $*$ indicates that the elements are filled in by symmetry. By (2.17), noting that $E\bigl[(e_t - \exp(-\lambda_0 d)e_{t-1})e_{t-1}\bigr] = 0$ and $E\bigl[(e_t - \exp(-\lambda_0 d)e_{t-1})^2\bigr] = \sigma_0^2\bigl(1-\exp(-2\lambda_0 d)\bigr)/(2\lambda_0)$, we have

\[ E\Bigl(-\sigma^2\frac{\partial^2\Phi_n}{\partial\lambda^2}\Bigr)\Big|_{\theta=\theta_0} = (n-1)\sigma_0^2\Bigl[\frac{1}{2\lambda_0^2} + \frac{2d\exp(-2\lambda_0 d)\bigl[-1+(1+d\lambda_0)\exp(-2\lambda_0 d)\bigr]}{\lambda_0\bigl(1-\exp(-2\lambda_0 d)\bigr)^2}\Bigr] + \frac{2d^2\lambda_0\exp(-2\lambda_0 d)}{1-\exp(-2\lambda_0 d)}\sum_{t=2}^n Ee_{t-1}^2 =: \Psi_n(\theta_0,\sigma_0^2) = O(n). \quad (2.19) \]

Thus

\[ D_n = E\bigl(F_n(\theta_0)\bigr) = \begin{pmatrix} \dfrac{2\lambda_0}{1-\exp(-2\lambda_0 d)}X_n(\lambda_0) & 0 \\ 0 & \Psi_n(\theta_0,\sigma_0^2) \end{pmatrix}. \quad (2.20) \]
3 Large sample properties of the estimators

Theorem 3.1 Suppose that conditions (A1)-(A2) hold. Then there is a sequence $A_n \downarrow 0$ such that, for each $A > 0$, as $n \to \infty$, the probability

\[ P\bigl(\text{there are estimators } (\hat\theta_n, \hat\sigma_n^2) \text{ with } S_n(\hat\theta_n) = 0 \text{ and } (\hat\theta_n, \hat\sigma_n^2) \in \bar N_n(A)\bigr) \to 1. \quad (3.1) \]

Furthermore,

\[ (\hat\theta_n, \hat\sigma_n^2) \to_p (\theta_0, \sigma_0^2), \quad n \to \infty, \quad (3.2) \]

where, for each $n = 1, 2, \ldots$, $A > 0$ and $A_n \in (0, \sigma_0^2)$, we define the neighborhoods

\[ N_n(A) = \bigl\{\theta \in \mathbb{R}^{m+1}: (\theta - \theta_0)^T D_n(\theta - \theta_0) \le A^2\bigr\} \]

and

\[ \bar N_n(A) = N_n(A) \cap \bigl\{\sigma^2 \in \bigl[\sigma_0^2 - A_n,\ \sigma_0^2 + A_n\bigr]\bigr\}. \quad (3.3) \]
Theorem 3.2 Suppose that conditions (A1)-(A2) hold. Then

\[ \frac{1}{\hat\sigma_n}F_n^{T/2}(\hat\theta_n)(\hat\theta_n - \theta_0) \to_D N(0, I_{m+1}), \quad n \to \infty. \quad (3.4) \]

In the following, we investigate some special cases of models (1.1) and (1.5). From Theorem 3.1 and Theorem 3.2 we obtain the following results; their proofs are omitted.

Corollary 3.3 If $\beta_0 = 0$, then

\[ \frac{\sqrt{\Psi_n(\theta_0,\sigma_0^2)}}{\hat\sigma_n}(\hat\lambda_n - \lambda_0) \to_D N(0, 1), \quad n \to \infty. \quad (3.5) \]

Corollary 3.4 If $\beta_0 = 0$ and the limit $\psi(\theta_0,\sigma_0^2) = \lim_{n\to\infty}\Psi_n(\theta_0,\sigma_0^2)/n$ exists, then

\[ \sqrt{n}(\hat\lambda_n - \lambda_0) \to_D N\bigl(0, \sigma_0^2/\psi(\theta_0,\sigma_0^2)\bigr), \quad n \to \infty. \quad (3.6) \]
4 Hypothesis testing
In order to fit a data set $\{y_t, t = 1, 2, \ldots, n\}$, we may use model (1.1) or an Ornstein-Uhlenbeck process with a constant mean level,

\[ dy_t = \lambda(\mu - y_t)\,dt + \sigma\,dB_t. \quad (4.1) \]

If $\beta \ne 0$, then we use model (1.1), namely models (2.1) and (2.2). If $\beta = 0$, then we use model (4.1). How do we decide whether $\beta \ne 0$ or $\beta = 0$? In this section we consider this hypothesis testing problem and obtain limiting distributions for likelihood ratio (LR) test statistics (see Fan and Jiang []).

Under the null hypothesis

\[ H_0: \beta = 0, \quad \lambda > 0, \quad \sigma^2 > 0, \quad (4.2) \]

let $\hat\lambda_n^0$ and $(\hat\sigma_n^0)^2$ be the corresponding ML estimators of $\lambda$ and $\sigma^2$ computed under $H_0$. Also let

\[ \hat L_n = -2\Phi_n\bigl(\hat\beta_n, \hat\lambda_n, \hat\sigma_n^2\bigr) \quad (4.3) \]

and

\[ \hat L_n^0 = -2\Phi_n\bigl(0, \hat\lambda_n^0, (\hat\sigma_n^0)^2\bigr). \quad (4.4) \]
By (2.5) and (2.9), we have

\[ \hat L_n = (n-1)\log\frac{\pi\hat\sigma_n^2}{\hat\lambda_n} + (n-1)\log\bigl(1-\exp(-2\hat\lambda_n d)\bigr) + \frac{2\hat\lambda_n}{\hat\sigma_n^2\bigl(1-\exp(-2\hat\lambda_n d)\bigr)}\sum_{t=2}^n\bigl(\hat\varepsilon_t - \exp(-\hat\lambda_n d)\hat\varepsilon_{t-1}\bigr)^2 \]
\[ = (n-1)\log\frac{\pi\hat\sigma_n^2}{\hat\lambda_n} + (n-1)\log\bigl(1-\exp(-2\hat\lambda_n d)\bigr) + (n-1) \]
\[ = (n-1)(\log\pi + 1) + (n-1)\Bigl[\log\hat\sigma_n^2 + \log\bigl(1-\exp(-2\hat\lambda_n d)\bigr) - \log\hat\lambda_n\Bigr]. \quad (4.5) \]

And similarly,

\[ \hat L_n^0 = (n-1)(\log\pi + 1) + (n-1)\Bigl[\log(\hat\sigma_n^0)^2 + \log\bigl(1-\exp(-2\hat\lambda_n^0 d)\bigr) - \log\hat\lambda_n^0\Bigr]. \quad (4.6) \]

By (4.5) and (4.6), we have

\[ \tilde d(n) = \hat L_n^0 - \hat L_n = (n-1)\log\frac{(\hat\sigma_n^0)^2}{\hat\sigma_n^2} + (n-1)\Bigl[\log\frac{1-\exp(-2\hat\lambda_n^0 d)}{1-\exp(-2\hat\lambda_n d)} - \log\frac{\hat\lambda_n^0}{\hat\lambda_n}\Bigr] \]
\[ = (n-1)\Bigl(\frac{(\hat\sigma_n^0)^2}{\hat\sigma_n^2} - 1\Bigr) + (n-1)\Bigl[\frac{1-\exp(-2\hat\lambda_n^0 d)}{1-\exp(-2\hat\lambda_n d)} - \frac{\hat\lambda_n^0}{\hat\lambda_n}\Bigr] + o_p(1) \]
\[ = (n-1)\frac{(\hat\sigma_n^0)^2 - \hat\sigma_n^2}{\sigma_0^2} + (n-1)\Bigl[\frac{1-\exp(-2\hat\lambda_n^0 d)}{1-\exp(-2\hat\lambda_n d)} - \frac{\hat\lambda_n^0}{\hat\lambda_n}\Bigr] + o_p(1) \]
\[ = (n-1)\frac{(\hat\sigma_n^0)^2 - \hat\sigma_n^2}{\sigma_0^2} + o_p(1). \quad (4.7) \]

Large values of $\tilde d(n)$ suggest rejection of the null hypothesis.

Theorem 4.1 Suppose that conditions (A1)-(A2) hold. If $H_0$ holds, then

\[ \tilde d(n) \to_D \chi^2(m), \quad n \to \infty. \quad (4.8) \]
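Theorem 4.1 justifies comparing $\tilde d(n)$ with a $\chi^2(m)$ critical value. The following sketch computes the LR statistic on simulated data with a genuinely nonzero $\beta$, so the test should reject; the 95% quantile of $\chi^2(2)$, about 5.991, is hardcoded as a known constant, and the grid bounds and parameter values are our assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)

# Data with a genuinely nonzero beta, so H0: beta = 0 should be rejected.
n, d = 2000, 0.1
lam0, sigma0 = 1.0, 0.5
beta0 = np.array([0.8, -0.5])
a0 = np.exp(-lam0 * d)
s0 = sigma0 * np.sqrt((1 - np.exp(-2 * lam0 * d)) / (2 * lam0))
e = np.empty(n); prev = 0.0
for t in range(n):
    prev = a0 * prev + s0 * rng.standard_normal()
    e[t] = prev
X = rng.standard_normal((n, 2))
y = X @ beta0 + e

def neg2_loglik(y, X, lam_grid):
    """Minimized -2*log-likelihood, profiling beta and sigma^2 for each lam
    on a grid.  X may have zero columns (the null model beta = 0)."""
    best = np.inf
    for lam in lam_grid:
        a = np.exp(-lam * d); c = 1 - np.exp(-2 * lam * d)
        dy = y[1:] - a * y[:-1]
        if X.shape[1] > 0:
            dX = X[1:] - a * X[:-1]
            beta = np.linalg.solve(dX.T @ dX, dX.T @ dy)
            r = dy - dX @ beta
        else:
            r = dy
        sigma2 = 2 * lam * np.sum(r**2) / ((n - 1) * c)
        ll = (-0.5 * (n - 1) * np.log(np.pi * sigma2 / lam)
              - 0.5 * (n - 1) * np.log(c) - lam * np.sum(r**2) / (sigma2 * c))
        best = min(best, -2 * ll)
    return best

grid = np.linspace(0.2, 3.0, 141)
d_stat = neg2_loglik(y, X[:, :0], grid) - neg2_loglik(y, X, grid)
CHI2_95_DF2 = 5.991  # 95% quantile of chi-square with m = 2 degrees of freedom
print(d_stat, d_stat > CHI2_95_DF2)  # large statistic => reject H0
```

Because the null model is nested in the alternative (same $\lambda$ grid, $\beta$ restricted to 0), the statistic is nonnegative by construction, and with $\beta \ne 0$ it far exceeds the critical value.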
5 Some lemmas
Throughout this paper, let $C$ denote a generic positive constant which may take different values at each occurrence. To prove our main results, we first introduce the following lemmas.

Lemma 5.1 If conditions (A1) and (A2) hold, then for any $\lambda \in \mathbb{R}^+$ the matrix $X_n(\lambda)$ is positive definite for large enough $n$, and

\[ \lim_{n\to\infty}\max_{1\le t\le n} x_t^T X_n^{-1}(\lambda)x_t = 0. \]
Proof Let $\tilde\lambda_1$ and $\tilde\lambda_m$ be the smallest and largest roots of $|Z_n - \tilde\lambda X_n| = 0$. Then, from an exercise in Rao [],

\[ \tilde\lambda_1 \le \frac{u^T Z_n u}{u^T X_n u} \le \tilde\lambda_m \]

for unit vectors $u$. Thus by (2.13) there are some $\delta \in (0,1)$ and $N(\delta)$ such that $n \ge N(\delta)$ implies

\[ u^T Z_n u \le (1-\delta)\,u^T X_n u. \quad (5.1) \]

By (2.12) and (5.1), we have

\[ u^T X_n(\lambda)u = \sum_{t=2}^n\bigl(u^T x_t - \exp(-\lambda d)u^T x_{t-1}\bigr)^2 = \sum_{t=2}^n(u^T x_t)^2 + \exp(-2\lambda d)\sum_{t=2}^n(u^T x_{t-1})^2 - 2\exp(-\lambda d)\,u^T Z_n u \]
\[ \ge \bigl(1 + \exp(-2\lambda d) - 2\exp(-\lambda d)(1-\delta)\bigr)u^T X_n u\bigl(1+o(1)\bigr) = \Bigl[\bigl(1-\exp(-\lambda d)\bigr)^2 + 2\delta\exp(-\lambda d)\Bigr]u^T X_n u\bigl(1+o(1)\bigr) =: C(\lambda,\delta)\,u^T X_n u\bigl(1+o(1)\bigr), \quad (5.2) \]

where the $o(1)$ accounts for the boundary terms $(u^T x_1)^2$ and $(u^T x_n)^2$, which are negligible by (2.12).

By Rao [] and (2.12), we have

\[ \sup_u\frac{(u^T x_t)^2}{u^T X_n u} = x_t^T X_n^{-1} x_t \to 0 \quad \text{uniformly in } 1 \le t \le n. \quad (5.3) \]

From (5.2), (5.3) and $C(\lambda,\delta) > 0$,

\[ x_t^T X_n^{-1}(\lambda)x_t = \sup_u\frac{(u^T x_t)^2}{u^T X_n(\lambda)u} \le \sup_u\frac{(u^T x_t)^2}{C(\lambda,\delta)\,u^T X_n u}\bigl(1+o(1)\bigr) \to 0. \quad (5.4) \]

This completes the proof. □
Lemma 5.2 The matrix $D_n$ is positive definite for large enough $n$, $E(S_n(\theta_0)) = 0$ and $\operatorname{Var}(S_n(\theta_0)) = \sigma_0^2 D_n$.

Proof Note that $X_n(\lambda_0)$ is positive definite for large enough $n$ and $\Psi_n(\theta_0,\sigma_0^2) > 0$; it is then easy to show that $D_n$ is positive definite for large enough $n$. By (2.2) and (2.8), we have

\[ \sigma_0^2 E\Bigl(\frac{\partial\Phi_n}{\partial\beta}\Bigr)\Big|_{\theta=\theta_0} = \frac{2\lambda_0}{1-\exp(-2\lambda_0 d)}\sum_{t=2}^n E\Bigl[\bigl(e_t - \exp(-\lambda_0 d)e_{t-1}\bigr)\Bigr]\bigl(x_t - \exp(-\lambda_0 d)x_{t-1}\bigr) = \frac{2\lambda_0}{1-\exp(-2\lambda_0 d)}\,\sigma_0\sqrt{\frac{1-\exp(-2\lambda_0 d)}{2\lambda_0}}\sum_{t=2}^n\bigl(x_t - \exp(-\lambda_0 d)x_{t-1}\bigr)E\eta_t = 0. \quad (5.5) \]

Note that $e_{t-1}$ and $\eta_t$ are independent, so $E(\eta_t e_{t-1}) = 0$. Thus, by (2.7) and $E\eta_t^2 = 1$, we have

\[ E\Bigl(\frac{\partial\Phi_n}{\partial\lambda}\Bigr)\Big|_{\theta=\theta_0} = \frac{n-1}{2\lambda_0} - \frac{(n-1)d\exp(-2\lambda_0 d)}{1-\exp(-2\lambda_0 d)} - \frac{1-(1+2d\lambda_0)\exp(-2\lambda_0 d)}{\sigma_0^2\bigl(1-\exp(-2\lambda_0 d)\bigr)^2}\cdot\frac{(n-1)\sigma_0^2\bigl(1-\exp(-2\lambda_0 d)\bigr)}{2\lambda_0} = 0. \quad (5.6) \]

Hence, from (5.5) and (5.6),

\[ E\bigl(S_n(\theta_0)\bigr) = \sigma_0^2 E\Bigl(\Bigl(\frac{\partial\Phi_n}{\partial\beta}\Bigr)^T, \frac{\partial\Phi_n}{\partial\lambda}\Bigr)^T\Big|_{\theta=\theta_0} = 0. \quad (5.7) \]
By (2.2) and (2.8), since $e_t - \exp(-\lambda_0 d)e_{t-1} = \sigma_0\sqrt{(1-\exp(-2\lambda_0 d))/(2\lambda_0)}\,\eta_t$ with i.i.d. $\eta_t$, we have

\[ \operatorname{Var}\Bigl(\sigma_0^2\frac{\partial\Phi_n}{\partial\beta}\Big|_{\theta=\theta_0}\Bigr) = \operatorname{Var}\Bigl(\frac{2\lambda_0}{1-\exp(-2\lambda_0 d)}\sum_{t=2}^n\bigl(e_t - \exp(-\lambda_0 d)e_{t-1}\bigr)\bigl(x_t - \exp(-\lambda_0 d)x_{t-1}\bigr)\Bigr) = \frac{2\sigma_0^2\lambda_0}{1-\exp(-2\lambda_0 d)}X_n(\lambda_0). \quad (5.8) \]

Note that $\{\eta_t e_{t-1}, \mathcal{H}_t\}$ is a martingale difference sequence with $\operatorname{Var}(\eta_t e_{t-1}) = E\eta_t^2\,Ee_{t-1}^2 = Ee_{t-1}^2$, and that $\operatorname{Cov}\bigl(\sum_t\eta_t e_{t-1}, \sum_t(\eta_t^2-1)\bigr) = 0$. Hence, writing

\[ \sigma_0^2\frac{\partial\Phi_n}{\partial\lambda}\Big|_{\theta=\theta_0} = -\sigma_0 d\exp(-\lambda_0 d)\sqrt{\frac{2\lambda_0}{1-\exp(-2\lambda_0 d)}}\sum_{t=2}^n\eta_t e_{t-1} - \frac{\sigma_0^2\bigl(1-(1+2d\lambda_0)\exp(-2\lambda_0 d)\bigr)}{2\lambda_0\bigl(1-\exp(-2\lambda_0 d)\bigr)}\sum_{t=2}^n\bigl(\eta_t^2 - 1\bigr), \]

we obtain

\[ \operatorname{Var}\Bigl(\sigma_0^2\frac{\partial\Phi_n}{\partial\lambda}\Big|_{\theta=\theta_0}\Bigr) = \frac{2d^2\lambda_0\sigma_0^2\exp(-2\lambda_0 d)}{1-\exp(-2\lambda_0 d)}\sum_{t=2}^n Ee_{t-1}^2 + (n-1)\frac{\sigma_0^4\bigl(1-(1+2d\lambda_0)\exp(-2\lambda_0 d)\bigr)^2}{2\lambda_0^2\bigl(1-\exp(-2\lambda_0 d)\bigr)^2} = \sigma_0^2\Psi_n(\theta_0,\sigma_0^2), \quad (5.9) \]

where the last equality follows from (2.19) by a direct calculation.
By (2.7), (2.8) and noting that $e_{t-1}$ and $\eta_t$ are independent, every cross term in the covariance involves either $E(\eta_t^2 e_{t-1}) = E\eta_t^2\,Ee_{t-1} = 0$ or $E\eta_t^3 = 0$, so

\[ \operatorname{Cov}\Bigl(\sigma_0^2\frac{\partial\Phi_n}{\partial\beta}, \sigma_0^2\frac{\partial\Phi_n}{\partial\lambda}\Bigr)\Big|_{\theta=\theta_0} = 0. \quad (5.10) \]

From (5.8)-(5.10), it follows that $\operatorname{Var}(S_n(\theta_0)) = \sigma_0^2 D_n$. The proof is completed. □

Lemma 5.3 (Maller []) Let $W_n$ be a symmetric random matrix with eigenvalues $\tilde\lambda_j(n)$, $1 \le j \le d$. Then

\[ W_n \to_p I \iff \tilde\lambda_j(n) \to_p 1, \quad n \to \infty,\ 1 \le j \le d. \]

Lemma 5.4 For each $A > 0$,

\[ \sup_{\theta\in N_n(A)}\bigl\|D_n^{-1/2}F_n(\theta)D_n^{-T/2} - \Omega_n\bigr\| \to_p 0, \quad n\to\infty, \quad (5.11) \]

and also

\[ \Omega_n \to_p I_{m+1}\ \text{uniformly on } N_n(A), \quad (5.12) \]

\[ \lim_{c\to 0}\limsup_{A\to\infty}\limsup_{n\to\infty} P\Bigl(\inf_{\theta\in N_n(A)}\lambda_{\min}\bigl(D_n^{-1/2}F_n(\theta)D_n^{-T/2}\bigr) \le c\Bigr) = 0, \quad (5.13) \]

where, for $\theta = (\beta^T,\lambda)^T$,

\[ \Omega_n = \begin{pmatrix} \dfrac{\lambda\bigl(1-\exp(-2d\lambda_0)\bigr)}{\lambda_0\bigl(1-\exp(-2d\lambda)\bigr)}I_m & 0 \\[2mm] 0 & \dfrac{-\sigma_0^2\,\partial^2\Phi_n/\partial\lambda^2\big|_{\theta=\theta_0}}{\Psi_n(\theta_0,\sigma_0^2)} \end{pmatrix}. \]
Proof Let $X_n(\lambda) = X_n^{1/2}(\lambda)X_n^{T/2}(\lambda)$ be a square root decomposition of $X_n(\lambda)$. Then

\[ D_n = \operatorname{diag}\Bigl(\sqrt{\tfrac{2\lambda_0}{1-\exp(-2\lambda_0 d)}}\,X_n^{1/2}(\lambda_0),\ \sqrt{\Psi_n(\theta_0,\sigma_0^2)}\Bigr)\cdot\operatorname{diag}\Bigl(\sqrt{\tfrac{2\lambda_0}{1-\exp(-2\lambda_0 d)}}\,X_n^{T/2}(\lambda_0),\ \sqrt{\Psi_n(\theta_0,\sigma_0^2)}\Bigr) = D_n^{1/2}D_n^{T/2}. \quad (5.14) \]

Let $\theta \in N_n(A)$. Then

\[ (\theta-\theta_0)^T D_n(\theta-\theta_0) = \frac{2\lambda_0}{1-\exp(-2d\lambda_0)}(\beta-\beta_0)^T X_n(\lambda_0)(\beta-\beta_0) + (\lambda-\lambda_0)^2\,\Psi_n(\theta_0,\sigma_0^2) \le A^2. \quad (5.15) \]
From (2.18), (2.20) and (5.14),

\[ D_n^{-1/2}F_n(\theta)D_n^{-T/2} - \Omega_n = \begin{pmatrix} W_1 & W_2 \\ W_2^T & W_3 \end{pmatrix}, \quad (5.16) \]

where

\[ W_1 = \frac{\lambda\bigl(1-\exp(-2d\lambda_0)\bigr)}{\lambda_0\bigl(1-\exp(-2d\lambda)\bigr)}\Bigl[X_n^{-1/2}(\lambda_0)X_n(\lambda)X_n^{-T/2}(\lambda_0) - I_m\Bigr], \quad (5.17) \]

\[ W_2 = \sqrt{\frac{1-\exp(-2d\lambda_0)}{2\lambda_0}}\cdot\frac{X_n^{-1/2}(\lambda_0)\bigl(-\sigma_0^2\,\partial^2\Phi_n/\partial\beta\,\partial\lambda\bigr)}{\sqrt{\Psi_n(\theta_0,\sigma_0^2)}} \quad (5.18) \]

and

\[ W_3 = \frac{-\sigma_0^2\,\partial^2\Phi_n/\partial\lambda^2 - \bigl(-\sigma_0^2\,\partial^2\Phi_n/\partial\lambda^2\big|_{\theta=\theta_0}\bigr)}{\Psi_n(\theta_0,\sigma_0^2)}. \quad (5.19) \]

Let

\[ N_n^\beta(A) = \Bigl\{\beta: \frac{2\lambda_0}{1-\exp(-2d\lambda_0)}(\beta-\beta_0)^T X_n(\lambda_0)(\beta-\beta_0) \le A^2\Bigr\} \quad (5.20) \]

and

\[ N_n^\lambda(A) = \Bigl\{\theta: |\lambda-\lambda_0| \le \frac{A}{\sqrt{\Psi_n(\theta_0,\sigma_0^2)}}\Bigr\}. \quad (5.21) \]
As the first step, we will show that, for each $A > 0$,

\[ \sup_{\theta\in N_n(A)}\|W_1\| \to_p 0. \quad (5.22) \]

In fact, note that

\[ W_1 = \frac{\lambda\bigl(1-\exp(-2d\lambda_0)\bigr)}{\lambda_0\bigl(1-\exp(-2d\lambda)\bigr)}X_n^{-1/2}(\lambda_0)\bigl[X_n(\lambda) - X_n(\lambda_0)\bigr]X_n^{-T/2}(\lambda_0) = \frac{\lambda\bigl(1-\exp(-2d\lambda_0)\bigr)}{\lambda_0\bigl(1-\exp(-2d\lambda)\bigr)}X_n^{-1/2}(\lambda_0)(T_1 + T_2 + T_3)X_n^{-T/2}(\lambda_0), \quad (5.23) \]
where

\[ T_1 = \bigl(\exp(-d\lambda_0) - \exp(-d\lambda)\bigr)\sum_{t=2}^n x_{t-1}\bigl(x_t - \exp(-d\lambda_0)x_{t-1}\bigr)^T, \qquad T_2 = T_1^T, \]
\[ T_3 = \bigl(\exp(-d\lambda_0) - \exp(-d\lambda)\bigr)^2\sum_{t=2}^n x_{t-1}x_{t-1}^T. \]

Let $u, v \in \mathbb{R}^m$ with $|u| = |v| = 1$, and let $u_n^T = u^T X_n^{-1/2}(\lambda_0)$, $v_n = X_n^{-T/2}(\lambda_0)v$. By (5.2), $u_n^T X_n u_n \le C$ and $v_n^T X_n(\lambda_0)v_n = 1$; moreover, by the mean value theorem, $|\exp(-d\lambda_0) - \exp(-d\lambda)| \le d|\lambda-\lambda_0| \le dA/\sqrt{\Psi_n(\theta_0,\sigma_0^2)}$ on $N_n^\lambda(A)$. By the Cauchy-Schwarz inequality and Lemma 5.1,

\[ |u_n^T T_1 v_n| \le d|\lambda-\lambda_0|\Bigl(\sum_{t=2}^n(u_n^T x_{t-1})^2\Bigr)^{1/2}\Bigl(\sum_{t=2}^n\bigl(v_n^T\bigl(x_t - \exp(-d\lambda_0)x_{t-1}\bigr)\bigr)^2\Bigr)^{1/2} \le \frac{CdA}{\sqrt{\Psi_n(\theta_0,\sigma_0^2)}} \to 0, \quad (5.24) \]

and similarly $|u_n^T T_2 v_n| \to 0$ and

\[ |u_n^T T_3 v_n| \le d^2|\lambda-\lambda_0|^2\Bigl(\sum_{t=2}^n(u_n^T x_{t-1})^2\Bigr)^{1/2}\Bigl(\sum_{t=2}^n(v_n^T x_{t-1})^2\Bigr)^{1/2} \le \frac{Cd^2A^2}{\Psi_n(\theta_0,\sigma_0^2)} \to 0, \quad (5.25) \]

uniformly over $N_n(A)$. Hence (5.22) follows from (5.23)-(5.25). For the second step, we will show that
\[ \sup_{\theta\in N_n(A)}|W_2| \to_p 0. \quad (5.26) \]

Note that

\[ \varepsilon_t = y_t - x_t^T\beta = x_t^T(\beta_0-\beta) + e_t \quad (5.27) \]

and

\[ \varepsilon_t - \exp(-d\lambda)\varepsilon_{t-1} = \bigl(x_t - \exp(-d\lambda)x_{t-1}\bigr)^T(\beta_0-\beta) + \sigma_0\sqrt{\frac{1-\exp(-2d\lambda_0)}{2\lambda_0}}\,\eta_t + \bigl(\exp(-d\lambda_0) - \exp(-d\lambda)\bigr)e_{t-1}. \quad (5.28) \]

Write

\[ J_1 = \sqrt{\frac{1-\exp(-2d\lambda_0)}{2\lambda_0}}\,X_n^{-1/2}(\lambda_0)\Bigl(-\sigma_0^2\frac{\partial^2\Phi_n}{\partial\beta\,\partial\lambda}\Bigr), \]

so that $|W_2| \le |J_1|/\sqrt{\Psi_n(\theta_0,\sigma_0^2)}$. Substituting (5.27) and (5.28) into (2.16), $J_1$ becomes a finite linear combination, with coefficients bounded uniformly on $N_n(A)$, of the vectors $X_n^{-1/2}(\lambda_0)T_i$, $i = 4, \ldots, 8$,
where

\[ T_4 = \sum_{t=2}^n x_{t-1}^T(\beta_0-\beta)\bigl(x_t - \exp(-\lambda d)x_{t-1}\bigr), \qquad T_5 = \sum_{t=2}^n\bigl(x_t - \exp(-\lambda d)x_{t-1}\bigr)^T(\beta_0-\beta)\,x_{t-1}, \]
\[ T_6 = \sum_{t=2}^n\eta_t x_{t-1}, \qquad T_7 = \sum_{t=2}^n e_{t-1}x_{t-1}, \qquad T_8 = \sum_{t=2}^n\eta_t\bigl(x_t - \exp(-\lambda d)x_{t-1}\bigr). \]

For $\beta \in N_n^\beta(A)$ and each $A > 0$, we have

\[ \bigl((\beta-\beta_0)^T x_t\bigr)^2 = (\beta-\beta_0)^T X_n^{1/2}(\lambda_0)X_n^{-1/2}(\lambda_0)x_t x_t^T X_n^{-T/2}(\lambda_0)X_n^{T/2}(\lambda_0)(\beta-\beta_0) \le \Bigl(\max_{1\le t\le n} x_t^T X_n^{-1}(\lambda_0)x_t\Bigr)(\beta-\beta_0)^T X_n(\lambda_0)(\beta-\beta_0) \le CA^2\max_{1\le t\le n} x_t^T X_n^{-1}(\lambda_0)x_t. \quad (5.29) \]
By (5.29) and Lemma 5.1, we have

\[ \sup_{\beta\in N_n^\beta(A)}\max_{1\le t\le n}\bigl|(\beta-\beta_0)^T x_t\bigr| \to 0, \quad n\to\infty, \text{ for each } A > 0. \quad (5.30) \]

Using the Cauchy-Schwarz inequality and (5.30), we obtain

\[ |u_n^T T_4| \le \Bigl(\sum_{t=2}^n\bigl(x_{t-1}^T(\beta_0-\beta)\bigr)^2\Bigr)^{1/2}\Bigl(\sum_{t=2}^n\bigl(u_n^T\bigl(x_t - \exp(-\lambda d)x_{t-1}\bigr)\bigr)^2\Bigr)^{1/2} \le C\sqrt n\max_{1\le t\le n}\bigl|(\beta-\beta_0)^T x_t\bigr| = o(\sqrt n), \quad (5.31) \]

uniformly on $N_n(A)$; here $\sum_t(u_n^T(x_t - \exp(-\lambda d)x_{t-1}))^2 = u_n^T X_n(\lambda)u_n \le C$ by (5.2). Using a similar argument for $T_5$, we obtain $u_n^T T_5 = o(\sqrt n)$. Since $\operatorname{Var}(u_n^T T_6) = \sum_{t=2}^n(u_n^T x_{t-1})^2 \le C = o(n)$, the Chebyshev inequality gives $u_n^T T_6 = o_p(\sqrt n)$.
By Lemma 5.1 and (2.3), we have

\[ \operatorname{Var}(u_n^T T_7) = \operatorname{Var}\Bigl(\sum_{t=2}^n u_n^T x_{t-1}\,e_{t-1}\Bigr) = \frac{\sigma_0^2\bigl(1-\exp(-2\lambda_0 d)\bigr)}{2\lambda_0}\sum_{j=1}^{n-1}\Bigl(\sum_{t=j+1}^n u_n^T x_{t-1}\exp\bigl(-\lambda_0 d(t-1-j)\bigr)\Bigr)^2 \le C\,n\max_{1\le t\le n}\bigl(u_n^T x_t\bigr)^2 = o(n), \quad (5.32) \]

since $\sum_{t\ge j+1}\exp(-\lambda_0 d(t-1-j)) \le (1-\exp(-\lambda_0 d))^{-1}$ and $\max_t(u_n^T x_t)^2 \le \max_t x_t^T X_n^{-1}(\lambda_0)x_t \to 0$ by Lemma 5.1. Thus, by the Chebyshev inequality, $u_n^T T_7 = o_p(\sqrt n)$, and $T_8$ is handled in the same way as $T_6$. Since the coefficients multiplying $T_4, \ldots, T_8$ in $J_1$ are bounded on $N_n(A)$ and $\Psi_n(\theta_0,\sigma_0^2)$ grows at the rate $n$, the bounds above yield (5.26).
For the third step, we will show that

\[ \sup_{\theta\in N_n(A)}|W_3| \to_p 0. \quad (5.33) \]

Write

\[ J_2 = -\sigma_0^2\frac{\partial^2\Phi_n}{\partial\lambda^2} + \sigma_0^2\frac{\partial^2\Phi_n}{\partial\lambda^2}\Big|_{\theta=\theta_0}, \]

so that $W_3 = J_2/\Psi_n(\theta_0,\sigma_0^2)$. By (2.17), $J_2$ decomposes term by term as

\[ J_2 = T_9 + T_{10} + T_{11} + T_{12} + T_{13}, \quad (5.34) \]

where $T_9$ and $T_{10}$ collect the two deterministic terms of (2.17), and $T_{11}$, $T_{12}$, $T_{13}$ are the differences of the three random sums in (2.17) evaluated at $(\lambda, \varepsilon_t)$ and at $(\lambda_0, e_t)$, respectively.
By (5.15), $|\lambda-\lambda_0| \le A/\sqrt{\Psi_n(\theta_0,\sigma_0^2)} = O(n^{-1/2})$ on $N_n(A)$, and the deterministic coefficient functions in (2.17) are continuously differentiable in $\lambda$ on $\mathbb{R}^+$. Hence

\[ T_9 = \frac{\sigma_0^2(n-1)}{2\lambda^2} - \frac{\sigma_0^2(n-1)}{2\lambda_0^2} = \frac{(n-1)\sigma_0^2(\lambda_0^2-\lambda^2)}{2\lambda^2\lambda_0^2} = o(n) \]

and similarly $T_{10} = o(n)$, uniformly on $N_n(A)$. For $T_{11}$, by (5.27),

\[ \sum_{t=2}^n\varepsilon_{t-1}^2 = \sum_{t=2}^n\bigl(x_{t-1}^T(\beta_0-\beta)\bigr)^2 + 2\sum_{t=2}^n x_{t-1}^T(\beta_0-\beta)e_{t-1} + \sum_{t=2}^n e_{t-1}^2. \]

The first sum is $o(n)$ uniformly by (5.30); the second is $o_p(n)$ by the same variance argument as in (5.32) together with the Chebyshev inequality; and $\sum_t e_{t-1}^2 = O_p(n)$ by (2.4). Combining these facts with the $o(1)$ difference of the coefficient functions,

\[ T_{11} = \frac{2d^2\lambda\exp(-2\lambda d)}{1-\exp(-2\lambda d)}\sum_{t=2}^n\varepsilon_{t-1}^2 - \frac{2d^2\lambda_0\exp(-2\lambda_0 d)}{1-\exp(-2\lambda_0 d)}\sum_{t=2}^n e_{t-1}^2 = o_p(n) \]

uniformly on $N_n(A)$.
The remaining terms are treated in the same way. By (5.28) and (5.30),

\[ \sum_{t=2}^n\bigl(\varepsilon_t - \exp(-\lambda d)\varepsilon_{t-1}\bigr)\varepsilon_{t-1} = \sigma_0\sqrt{\frac{1-\exp(-2\lambda_0 d)}{2\lambda_0}}\sum_{t=2}^n\eta_t e_{t-1} + o_p(n), \]

and since $\{\eta_t e_{t-1}, \mathcal{H}_t\}$ is a martingale difference sequence,

\[ \operatorname{Var}\Bigl(\sum_{t=2}^n\eta_t e_{t-1}\Bigr) = \sum_{t=2}^n Ee_{t-1}^2 = O(n), \]

so that $\sum_{t=2}^n\eta_t e_{t-1} = O_p(\sqrt n)$. Likewise, $\sum_{t=2}^n(\varepsilon_t - \exp(-\lambda d)\varepsilon_{t-1})^2 = \sum_{t=2}^n(e_t - \exp(-\lambda_0 d)e_{t-1})^2 + o_p(n) = O_p(n)$. Since the corresponding coefficient functions are continuously differentiable in $\lambda$ and $|\lambda-\lambda_0| = O(n^{-1/2})$ on $N_n(A)$, the differences $T_{12}$ and $T_{13}$ are $o_p(n)$ uniformly on $N_n(A)$. As $\Psi_n(\theta_0,\sigma_0^2)$ grows at the rate $n$, it follows that $\sup_{\theta\in N_n(A)}|W_3| \to_p 0$, which is (5.33). Together, (5.22), (5.26) and (5.33) prove (5.11).
Since $|\lambda-\lambda_0| \le A/\sqrt{\Psi_n(\theta_0,\sigma_0^2)} \to 0$ on $N_n(A)$, we have $\lambda(1-\exp(-2d\lambda_0))/(\lambda_0(1-\exp(-2d\lambda))) \to 1$ as $n\to\infty$. To prove (5.12), it therefore remains to show that

\[ \frac{-\sigma_0^2\,\partial^2\Phi_n/\partial\lambda^2\big|_{\theta=\theta_0}}{\Psi_n(\theta_0,\sigma_0^2)} \to_p 1, \quad n\to\infty. \]

This follows immediately from (2.19) and the Markov inequality. Finally, we will prove (5.13). By (5.11) and (5.12), we have

\[ D_n^{-1/2}F_n(\theta)D_n^{-T/2} \to_p I_{m+1}, \quad n\to\infty, \quad (5.35) \]

uniformly in $\theta \in N_n(A)$ for each $A > 0$. Thus, by Lemma 5.3,

\[ \lambda_{\min}\bigl(D_n^{-1/2}F_n(\theta)D_n^{-T/2}\bigr) \to_p 1, \quad n\to\infty. \quad (5.36) \]

This implies (5.13). □
Lemma 5.5 (Hall and Heyde []) Let $\{S_{ni}, \mathcal{F}_{ni}, 1 \le i \le k_n, n \ge 1\}$ be a zero-mean, square-integrable martingale array with differences $X_{ni}$, and let $\eta^2$ be an a.s. finite random variable. Suppose that $\sum_i E\{X_{ni}^2 I(|X_{ni}| > \varepsilon)\,|\,\mathcal{F}_{n,i-1}\} \to_p 0$ for all $\varepsilon > 0$, and $\sum_i E\{X_{ni}^2\,|\,\mathcal{F}_{n,i-1}\} \to_p \eta^2$. Then

\[ S_{nk_n} = \sum_i X_{ni} \to_D Z, \]

where the random variable $Z$ has the characteristic function $E\{\exp(-\tfrac{1}{2}\eta^2 t^2)\}$.
6 Proofs of theorems
Proof of Theorem 3.1 Take $A > 0$, let

\[ M_n(A) = \bigl\{\theta\in\mathbb{R}^{m+1}: (\theta-\theta_0)^T D_n(\theta-\theta_0) = A^2\bigr\} \quad (6.1) \]

be the boundary of $N_n(A)$, and let $\theta \in M_n(A)$. Using (2.14) and a Taylor expansion, for each $\sigma^2 > 0$ we have

\[ \Phi_n(\theta,\sigma^2) = \Phi_n(\theta_0,\sigma^2) + (\theta-\theta_0)^T\frac{\partial\Phi_n(\theta_0,\sigma^2)}{\partial\theta} + \frac{1}{2}(\theta-\theta_0)^T\frac{\partial^2\Phi_n(\tilde\theta,\sigma^2)}{\partial\theta\,\partial\theta^T}(\theta-\theta_0) = \Phi_n(\theta_0,\sigma^2) + \frac{1}{\sigma^2}(\theta-\theta_0)^T S_n(\theta_0) - \frac{1}{2\sigma^2}(\theta-\theta_0)^T F_n(\tilde\theta)(\theta-\theta_0), \quad (6.2) \]

where $\tilde\theta$ lies between $\theta$ and $\theta_0$. Let $Q_n(\theta) = \frac{1}{2}(\theta-\theta_0)^T F_n(\tilde\theta)(\theta-\theta_0)$ and $v_n(\theta) = \frac{1}{A}D_n^{T/2}(\theta-\theta_0)$, which is a unit vector for $\theta \in M_n(A)$. Take $c > 0$ and $\theta\in M_n(A)$; by (6.2), we obtain

\[ P\bigl(\Phi_n(\theta,\sigma^2) \ge \Phi_n(\theta_0,\sigma^2)\ \text{for some } \theta\in M_n(A)\bigr) \]
\[ \le P\bigl((\theta-\theta_0)^T S_n(\theta_0) \ge Q_n(\theta),\ Q_n(\theta) > \tfrac{1}{2}cA^2\ \text{for some } \theta\in M_n(A)\bigr) + P\bigl(Q_n(\theta) \le \tfrac{1}{2}cA^2\ \text{for some } \theta\in M_n(A)\bigr) \]
\[ \le P\bigl(v_n^T(\theta)D_n^{-1/2}S_n(\theta_0) > \tfrac{1}{2}cA\ \text{for some } \theta\in M_n(A)\bigr) + P\bigl(v_n^T(\theta)D_n^{-1/2}F_n(\tilde\theta)D_n^{-T/2}v_n(\theta) \le c\ \text{for some } \theta\in M_n(A)\bigr) \]
\[ \le P\bigl(\bigl|D_n^{-1/2}S_n(\theta_0)\bigr| > \tfrac{1}{2}cA\bigr) + P\Bigl(\inf_{\theta\in N_n(A)}\lambda_{\min}\bigl(D_n^{-1/2}F_n(\tilde\theta)D_n^{-T/2}\bigr) \le c\Bigr). \quad (6.3) \]

By Lemma 5.2 and the Chebyshev inequality, we obtain

\[ P\bigl(\bigl|D_n^{-1/2}S_n(\theta_0)\bigr| > \tfrac{1}{2}cA\bigr) \le \frac{\operatorname{tr}\operatorname{Var}\bigl(D_n^{-1/2}S_n(\theta_0)\bigr)}{(cA/2)^2} = \frac{4(m+1)\sigma_0^2}{c^2A^2}. \quad (6.4) \]

Let $A \to \infty$, then $c \downarrow 0$, and using (5.13), we have

\[ P\Bigl(\inf_{\theta\in N_n(A)}\lambda_{\min}\bigl(D_n^{-1/2}F_n(\tilde\theta)D_n^{-T/2}\bigr) \le c\Bigr) \to 0. \quad (6.5) \]

By (6.3)-(6.5), we have

\[ \lim_{A\to\infty}\liminf_{n\to\infty} P\bigl(\Phi_n(\theta,\sigma^2) < \Phi_n(\theta_0,\sigma^2)\ \text{for all } \theta\in M_n(A)\bigr) = 1. \quad (6.6) \]

By Lemma 5.1, $\lambda_{\min}(X_n(\lambda_0)) \to \infty$ as $n\to\infty$; hence $\lambda_{\min}(D_n) \to \infty$. Moreover, from (5.13),

\[ \inf_{\theta\in N_n(A)}\lambda_{\min}\bigl(F_n(\theta)\bigr) \to_p \infty. \]

This implies that $\Phi_n(\theta,\sigma^2)$ is concave in $\theta$ on $N_n(A)$ with probability tending to one. Noting this fact and (6.6), we get

\[ \lim_{A\to\infty}\liminf_{n\to\infty} P\Bigl(\sup_{\theta\in M_n(A)}\Phi_n(\theta,\sigma^2) < \Phi_n(\theta_0,\sigma^2),\ \Phi_n(\theta,\sigma^2)\ \text{is concave on } N_n(A)\Bigr) = 1. \quad (6.7) \]

On the event in the brackets, the continuous function $\Phi_n(\theta,\sigma^2)$ has a unique maximum in $\theta$ over the compact neighborhood $N_n(A)$, attained at an interior point. Hence

\[ \lim_{A\to\infty}\liminf_{n\to\infty} P\bigl(S_n(\hat\theta_n(A)) = 0\ \text{for a unique } \hat\theta_n(A)\in N_n(A)\bigr) = 1. \]

Moreover, there is a sequence $A_n \to \infty$ such that $\hat\theta_n = \hat\theta_n(A_n)$ satisfies

\[ \liminf_{n\to\infty} P\bigl(S_n(\hat\theta_n) = 0\ \text{and } \hat\theta_n\ \text{maximizes } \Phi_n(\theta,\sigma^2)\ \text{uniquely in } N_n(A)\bigr) = 1. \]

This $\hat\theta_n = (\hat\beta_n^T, \hat\lambda_n)^T$ is a QML estimator for $\theta_0$. It is clearly consistent, and

\[ \lim_{A\to\infty}\liminf_{n\to\infty} P\bigl(\hat\theta_n\in N_n(A)\bigr) = 1. \]

Since $\hat\theta_n = (\hat\beta_n^T, \hat\lambda_n)^T$ is a QML estimator for $\theta_0$, the statistic $\hat\sigma_n^2$ given by (2.9) is a QML estimator for $\sigma_0^2$. To complete the proof, we will show that $\hat\sigma_n^2 \to_p \sigma_0^2$ as $n\to\infty$. If $\hat\theta_n \in N_n(A)$, then $\hat\beta_n \in N_n^\beta(A)$ and $\hat\lambda_n \in N_n^\lambda(A)$.
By (5.27) and (5.28), we have

\[ \hat\varepsilon_t - \exp(-\hat\lambda_n d)\hat\varepsilon_{t-1} = \bigl(x_t - \exp(-\hat\lambda_n d)x_{t-1}\bigr)^T(\beta_0 - \hat\beta_n) + \bigl(e_t - \exp(-\hat\lambda_n d)e_{t-1}\bigr). \quad (6.8) \]

By (2.9), (2.11) and (6.8), we have

\[ (n-1)\hat\sigma_n^2 = \frac{2\hat\lambda_n}{1-\exp(-2\hat\lambda_n d)}\sum_{t=2}^n\bigl(\hat\varepsilon_t - \exp(-\hat\lambda_n d)\hat\varepsilon_{t-1}\bigr)^2 \]
\[ = \frac{2\hat\lambda_n}{1-\exp(-2\hat\lambda_n d)}\sum_{t=2}^n\bigl(\hat\varepsilon_t - \exp(-\hat\lambda_n d)\hat\varepsilon_{t-1}\bigr)\Bigl[\bigl(x_t - \exp(-\hat\lambda_n d)x_{t-1}\bigr)^T(\beta_0-\hat\beta_n) + \bigl(e_t - \exp(-\hat\lambda_n d)e_{t-1}\bigr)\Bigr] \]
\[ = \frac{2\hat\lambda_n}{1-\exp(-2\hat\lambda_n d)}\sum_{t=2}^n\bigl(\hat\varepsilon_t - \exp(-\hat\lambda_n d)\hat\varepsilon_{t-1}\bigr)\bigl(e_t - \exp(-\hat\lambda_n d)e_{t-1}\bigr), \quad (6.9) \]

where the term involving $(\beta_0-\hat\beta_n)$ vanishes by (2.11). From (6.8), it follows that

\[ \sum_{t=2}^n\Bigl[\bigl(x_t - \exp(-\hat\lambda_n d)x_{t-1}\bigr)^T(\beta_0-\hat\beta_n)\Bigr]^2 = \sum_{t=2}^n\bigl(\hat\varepsilon_t - \exp(-\hat\lambda_n d)\hat\varepsilon_{t-1}\bigr)^2 - 2\sum_{t=2}^n\bigl(\hat\varepsilon_t - \exp(-\hat\lambda_n d)\hat\varepsilon_{t-1}\bigr)\bigl(e_t - \exp(-\hat\lambda_n d)e_{t-1}\bigr) + \sum_{t=2}^n\bigl(e_t - \exp(-\hat\lambda_n d)e_{t-1}\bigr)^2. \quad (6.10) \]

From (2.2), we get

\[ \sum_{t=2}^n\bigl(e_t - \exp(-\hat\lambda_n d)e_{t-1}\bigr)^2 = \sum_{t=2}^n\Bigl(\exp(-\lambda_0 d)e_{t-1} + \sigma_0\sqrt{\frac{1-\exp(-2\lambda_0 d)}{2\lambda_0}}\,\eta_t - \exp(-\hat\lambda_n d)e_{t-1}\Bigr)^2. \]