Local power of some tests in exponential family nonlinear models
Artur J. Lemonte
Department of Statistics, University of S~ao Paulo, Brazil
a r t i c l e
i n f o
Article history: Received 8 May 2010 Accepted 13 December 2010 Available online 21 December 2010 Keywords:
Asymptotic expansions Exponential family Gradient test Likelihood ratio test Local power Nonlinear model Score test Wald test
a b s t r a c t
In this paper we obtain asymptotic expansions up to order n1/2 for the nonnull
distribution functions of the likelihood ratio, Wald, score and gradient test statistics in exponential family nonlinear models (Cordeiro and Paula, 1989), under a sequence of Pitman alternatives. The asymptotic distributions of all four statistics are obtained for testing a subset of regression parameters and for testing the dispersion parameter, thus generalising the results given inCordeiro et al. (1994)andFerrari et al. (1997). We also present Monte Carlo simulations in order to compare the finite-sample performance of these tests.
&2010 Elsevier B.V.
1. Introduction
The class of exponential family nonlinear models (EFNLMs), first defined byCordeiro and Paula (1989), is a natural extension of the generalised linear models (GLMs) and of the normal nonlinear regression models discussed in details by
McCullagh and Nelder (1989) and Ratkowsky (1983, 1990), respectively. This class of models is defined by a set of independent random variables with a distribution in the exponential family and by a monotonic function that relates the mean response to a nonlinear predictor involving covariates and unknown regression parameters. The definition of such models also includes a dispersion parameter (the variance in normal models, for example). The book byWei (1998)gives a comprehensive introduction to EFNLMs. In particular, the author pays more attention to regression diagnostics and influence analysis for these models. There are several recent articles considering the class of EFNLMs; see for example,Lin et al. (2003),
Cysneiros and Ferrari (2006),Cordeiro and Santana (2008),Ferrari and Cysneiros (2008),Cavalcante et al. (2009),Simas and Cordeiro(2009)andKosmidis and Firth (2009), among others.
In EFNLMs hypothesis testing inference is usually performed using the likelihood ratio, Wald and Rao score tests. A new criterion for testing hypothesis, referred to as the gradient test, has been proposed byTerrell (2002). Its statistic shares the same first-order asymptotic properties with the likelihood ratio, Wald and score statistics and is very simple when compared with the other three classic tests. In fact,Rao (2005)wrote: ‘‘The suggestion by Terrell is attractive as it is simple to compute. It would be of interest to investigate the performance of the [gradient] statistic.’’ To the best of our knowledge, however, there is no mention in the statistical literature on the use of the gradient test in EFNLMs.
In this paper, our main objective is to derive nonnull asymptotic expansions to order n1/2of the distribution functions of
the likelihood ratio, Wald, score and gradient statistics under a sequence of local alternatives, i.e. a sequence of Pitman alternatives converging to the null hypothesis at a convergence rate n1/2, and to compare the local power of the tests which
Contents lists available atScienceDirect
journal homepage:www.elsevier.com/locate/jspi
Journal of Statistical Planning and Inference
0378-3758 & 2010 Elsevier B.V. doi:10.1016/j.jspi.2010.12.010
Fax: + 55 11 30916130.
E-mail address: [email protected]
Open access under the Elsevier OA license.
are based on these statistics in EFNLMs, thus generalising the results presented inCordeiro et al. (1994)andFerrari et al. (1997). Additionally, in order to compare the finite-sample performance of these tests we also consider a Monte Carlo simulation study.
The nonnull asymptotic expansions up to order n1/2for the densities of the likelihood ratio and Wald statistics were
derived byHayakawa (1975), while an analogous result for the score statistic was obtained byHarris and Peers (1980). Recently, the asymptotic expansion up to order n1/2for the density of the gradient statistic was derived byLemonte and Ferrari (2010). The expansions obtained by these authors are extremely general but it can be very difficult or even impossible to particularize their formulas for specific regression models. As we shall see below, we have been able to apply their results for the EFNLMs.
The rest of the paper is organized as follows. Section 2 briefly describes the likelihood ratio, Wald, score and gradient tests. Section 3 defines the class of EFNLMs. In Section 4, we derive the nonnull asymptotic expansions for testing hypotheses on the parameters of the model. We compare the local powers of the likelihood ratio, Wald, score and gradient tests in Section 5. We give an explicit and general formula for the difference in power of any two criteria. Monte Carlo simulation results on the finite-sample performance of the tests are presented and discussed in Section 6. Finally, Section 7 closes the paper with some conclusions.
2. Asymptotic tests: background
Let ‘ð
h
Þdenote the total log-likelihood function and consider the partitionh
¼ ðh
> 1,h
> 2Þ
>
, where the dimensions of
h
1andh
2are q and k q, respectively, i.e.
h
is a k-vector of unknown parameters. Let Uhand Khdenote the score function and theinformation matrix for
h
, respectively. The partition forh
induces the corresponding partitionsUh¼ ðU>h1,U > h2Þ > , Kh¼ Kh11 Kh12 Kh21 Kh22 ! , K1h ¼ K11 K12 K21 K22 ! , where K1
h is the inverse of Kh. Suppose the interest lies in testing the composite null hypothesis H0:
h
2¼h
20 againstH1:
h
2ah
20, whereh
20is a specified vector. Hence,h
1is a vector of nuisance parameters.The likelihood ratio (S1), Wald (S2), score (S3) and gradient (S4) statistics for testing H0versus H1are given, respectively, by
S1¼2f‘ð ^
h
Þ‘ð ~h
Þg, S2¼ ð ^h
2h
20Þ>K^ 221 ð ^h
2h
20Þ, S3¼ ~U > h2 ~ K22U~h2, S4¼ ~U > h2ð ^h
2h
20Þ, where ^h
¼ ð ^h
>1, ^h
> 2Þ > and ~h
¼ ð ~h
>1,h
> 20Þ >denote the maximum likelihood estimators of
h
¼ ðh
> 1,h
> 2Þ
>
under H1 and H0,
respectively, ^K22¼K22ð ^
h
Þ, ~K22¼K22ð ~h
Þand ~Uh2¼Uh2ð ~h
Þ. The limiting distribution of S1, S2, S3and S4isw
2kqunder H0and
w
2kq,l, i.e. a non-central chi-square distribution with k q degrees of freedom and an appropriate non-centrality parameter
l
,under H1. The null hypothesis is rejected for a given nominal level,
g
say, if the test statistic exceeds the upper 100ð1g
Þ%quantile of the
w
2kqdistribution.
Clearly, S4has a very simple form and does not involve knowledge of the information matrix, neither expected nor
observed, unlike S2and S3.Terrell (2002)points out that the gradient statistic ‘‘is not transparently non-negative, even though
it must be so asymptotically.’’ His Theorem 2 implies that if the log-likelihood function is concave and is differentiable at ~
h
, then S4Z0.3. Exponential family nonlinear models
We assume that the random variables y1,y,ynare independent and each ylhas a probability density function of the form
p
ðy;x
l,f
Þ ¼exp½f
fyx
lbðx
lÞg þcðy,f
Þ, l ¼ 1, . . . ,n, ð1Þwhere bðÞ and cð,Þ are known appropriate functions. The mean and the variance of ylare EðylÞ ¼
m
l¼dbðx
lÞ=dx
l andvarðylÞ ¼
f
1Vl, where Vl¼dm
l=dx
lis called the variance function andx
l¼qðm
lÞ ¼R V1
l d
m
lis a known one-to-one function ofm
l. The choice of the variance function Vlas a function ofm
ldetermines qðm
lÞ. We have Vl¼1½qðm
lÞ ¼m
l, Vl¼m
2l½qðm
lÞ ¼ 1=m
land Vl¼
m
3l½qðm
lÞ ¼ 1=ð2m
2lÞfor normal, gamma and inverse Gaussian models, respectively. The parametersx
landf
40 in(1) are called the canonical and precision parameters, respectively. Also,
f
1 is the dispersion parameter. Further, it is assumed that the precision parameter is unknown and it is the same for all observations.The systematic part of the model is defined by
dð
m
lÞ ¼Z
l¼f ðxl;b
Þ, l ¼ 1, . . . ,n, ð2Þwhere dðÞ is a known one-to-one differentiable link function, x>
l ¼ ðxl1, . . . ,xlmÞis a vector of known variables associated with
the lth observable response,
b
¼ ðb
1, . . . ,b
pÞ>is a set of unknown parameters to be estimated ðmrponÞ and f ðxl;b
Þis a knowncontinuously differentiable nonlinear function such that the local derivative matrix X
¼X
ð
b
Þ ¼@g
=@b
>has rank p for allb
, whereg
¼ ðZ
1, . . . ,Z
nÞ>. Here we assume only identifiability in the sense that distinctb
’s imply distinctZ
’s. The n p localmodel matrix X
Let ‘ð
b
,f
Þdenote the total log-likelihood function for a given EFNLM. The total score function and the total Fisher information matrix forb
are given, respectively, by Ub¼f
X>W1=2V1=2ðyl
Þ and Kb¼f
X>WX, whereW ¼ diagfw1, . . . ,wng with wl¼Vl1ðd
m
l=dZ
lÞ 2, V ¼ diagfV1, . . . ,Vng, y ¼ ðy1, . . . ,ynÞ> and
l
¼ ðm
1, . . . ,m
nÞ >. The maximum likelihood estimate (MLE) ^
b
ofb
can be obtained iteratively using standard reweighted least squares method (Cordeiro and Paula, 1989; Paula, 1992)XðmÞ>WðmÞXðmÞ
b
ðm þ 1Þ¼XðmÞ>WðmÞyðmÞ, m ¼ 0,1, . . . , where yðmÞ¼XðmÞb
ðmÞþNðmÞðyl
ðmÞÞ is an adjusted dependent variable and the matrix N assumes the formN ¼ diagfðd
m
1=dZ
1Þ 1, . . . ,ðd
m
n=dZ
nÞ 1g.
Estimation of the dispersion parameter
f
by the maximum likelihood method is a more difficult problem than the estimation ofb
and the complexity depends on the functional form of cðy,f
Þ. The MLE ^f
off
is a function of the deviance (Dp)of the model, which is defined as Dp¼2Pnl ¼ 1fvðylÞvð ^
m
lÞ þ ð ^m
lylÞqð ^m
lÞg, where vðzÞ ¼ zqðzÞbðqðzÞÞ and ^m
ldenotes the MLE ofm
l(l = 1,y,n). That is, given the estimate ^b
, the MLE off
can be found as the solution of the equationXn l ¼ 1 @cðyl,
f
Þ @f
f¼ ^f¼ Dp 2 Xn l ¼ 1 vðylÞ:When (1) is a two-parameter full exponential family distribution with canonical parameters
f
andfx
, the term cðy,f
Þcan be written as cðy,f
Þ ¼f
c0ðyÞ þ c1ðf
Þ þc2ðyÞ, and the estimate off
is obtained fromc1uð
f
Þ ¼1 n Dp 2 Xn l ¼ 1 tðylÞ ( ) ,where c1uð
f
Þ ¼dc1ðf
Þ=df
and tðylÞ ¼vðylÞ þc0ðylÞ, for l= 1,y,n.Table 1gives the functions c1ðf
Þ, v(y) and t(y) for normal,inverse Gaussian and gamma models. For normal and inverse Gaussian models we have that ^
f
¼n=Dp, whereas for thegamma model the MLE ^
f
is obtained from logð ^f
Þc
ð ^f
Þ ¼Dp=ð2nÞ, wherec
ðÞis the digamma function, thus requiring the useof a nonlinear numerical algorithm. For further details seeCordeiro and McCullagh (1991).
In what follows, we shall consider the tests which are based on the likelihood ratio (S1), Wald (S2), Rao score (S3) and
gradient (S4) statistics in the class of EFNLMs for testing a composite null hypothesis. The hypothesis of interest is
H0:
b
2¼b
20, which will be tested against the alternative hypothesis H1:b
2ab
20, whereb
is partitioned asb
¼ ðb
>1,b
> 2Þ>,
with
b
1¼ ðb
1, . . . ,b
qÞ>andb
2¼ ðb
q þ 1, . . . ,b
pÞ>. Here,b
20is a fixed column vector of dimension p q. The partition forb
vectorinduces the corresponding partitions Ub¼ ðU>b1,U > b2Þ >, with U b1¼
f
X > 1 W 1=2 V1=2ðyl
Þand Ub2¼f
X > 2 W 1=2 V1=2ðyl
Þ, Kb¼ Kb11 Kb12 Kb21 Kb22 ! ¼f
X > 1 WX 1 X > 1 WX 2 X> 2 WX 1 X > 2 WX 2 ! ,with the matrix X
partitioned as X ¼ ðX 1X 2Þ, X 1being n q and X
2being n ðpqÞ. The likelihood ratio, Wald, score and
gradient statistics for testing H0can be expressed, respectively, as
S1¼2f‘ð ^
b
1, ^b
2, ^f
Þ‘ð ~b
1,b
20, ~f
Þg, S2¼ ^f
ð ^b
2b
20Þ>ð ^R > ^ W ^RÞð ^b
2b
20Þ, S3¼ ~s>W~ 1=2~ X2ð ~R > ~ W ~RÞ1X~>2 W~ 1=2 ~s, S4¼ ~f
1=2 ~s>~ W1=2X~2ð ^b
2b
20Þ,where ð ^
b
1, ^b
2, ^f
Þ and ð ^b
1,b
20, ~f
Þ are the maximum likelihood estimators of ðb
1,b
2,f
Þ under H1 (unrestricted) and H0(restricted), respectively, s ¼
f
1=2V1=2ðyl
Þis the Pearson residual vector and R ¼ X
2X1B, where B ¼ ðX>1 WX1Þ1X>1 WX2
represents a q ðpqÞ matrix whose columns are the vectors of regression coefficients obtained in the weighted normal linear regression of the columns of X
2on the local model matrix X
1with W as a weight matrix. Here, tildes and hats indicate
quantities available at the restricted and unrestricted maximum likelihood estimators, respectively. The limiting distribution of all these statistics under H0is
w
2pq.Now, the problem under consideration is that of testing a composite null hypothesis H0:
f
¼f
0against H1:f
af
0,where
f
0is a positive specified value forf
andb
acts as a nuisance parameter. The four statistics are expressed as follows:S1¼2nfc1ð ^
f
Þc1ðf
0Þð ^f
f
0Þc1uð ^f
Þg, S2¼ nð ^f
f
0Þ2c100ð ^f
Þ,Table 1
Some special models.a
Model c1ðfÞ v(y) t(y)
Normal logðfÞ=2 y2/2 0
Inverse Gaussian logðfÞ=2 1/(2y) 0
Gamma flogðfÞlogfGðfÞg log(y) 1 1
a
S3¼
nfc1uð ^
f
Þc1uðf
0Þg2c100ð
f
0Þ, S4¼nfc1uð
f
0Þc1uð ^f
Þgð ^f
f
0Þ,where c100ð
f
Þ ¼dc1uðf
Þ=df
. For example, we have c1ðf
Þ ¼logðf
Þ=2 for normal and inverse Gaussian models, which yieldsS1¼2n log ^
f
f
0 ! ^f
f
0 ^f
! ( ) , S2¼S3¼ n 2 ^f
f
0 ^f
( )2 , S4¼ n 2 ^f
f
0f
0 ^f
f
0 ^f
( ) :For the gamma model, we have
S1¼2n
f
0log ^f
f
0 ! logG
ð ^f
ÞG
ðf
0Þ ! ð ^f
f
0Þð1c
ð ^f
ÞÞ ( ) , S2¼nf ^fc
uð ^f
Þ1gð ^f
f
0Þ2 ^f
, S3¼ nf
0flogð ^f
=f
0Þðc
ð ^f
Þc
ðf
0ÞÞgf
0c
uðf
0Þ1 , and S4¼nð ^f
f
0Þ log ^f
f
0 ! þc
ð ^f
Þc
ðf
0Þ ( ) ,where
c
uðÞis the trigamma function.4. Nonnull asymptotic expansions in EFNLMs
In this section we shall assume the following local alternative hypothesis H1n:
b
2¼b
20þe
, wheree
¼ ðe
q þ 1, . . . ,e
pÞ>withe
r¼Oðn1=2Þfor r = q+ 1,y,p. Since the expansions ofHayakawa (1975),Harris and Peers (1980)andLemonte and Ferrari (2010)were developed for continuous distributions, the results derived in this section are only valid for continuous EFNLMs such as normal, inverse Gaussian and gamma models.Some additional notation is in order. Let
e
¼ K 1 b11Kb12 Ipq !e
, A ¼ K 1 b11 0 0 0 ! , M ¼ K1b A,where Ipqis a ðpqÞ ðpqÞ identity matrix. Additionally, let Z¼XðX>WXÞ1X>¼ fzlmg, Z1¼X 1ðX > 1 WX 1Þ 1 X> 1 ¼ fz1lmg, X l ¼ @2
Z
l @b
r@b
s ¼ X 11l X 12l X 21l X 22l ! , r,s ¼ 1, . . . ,p, l ¼ 1, . . . ,n, Zd¼diagfz11, . . . ,znng, Z1d¼diagfz111, . . . ,z1nng, F ¼ diagff1, . . . ,fng, G ¼ diagfg1, . . . ,gng, t ¼ ðt1, . . . ,tnÞ>¼X
e
, e ¼ ðe1, . . . ,enÞ>¼X
2
e
, T ¼ diagft1, . . . ,tng, Tð2Þ¼T T, Tð3Þ¼Tð2ÞT and E ¼ diagfe1, . . . ,eng, where ‘‘ ’’ denotes the Hadamard (direct) product ofmatrices and fl¼ 1 Vl d
m
l dZ
l d2m
l dZ
2 l , gl¼ 1 Vl dm
l dZ
l d2m
l dZ
2 l 1 V2 l dVl dm
l dm
l dZ
l 3 , l ¼ 1, . . . ,n:The nonnull distributions of the statistics S1, S2, S3and S4under Pitman alternatives for testing H0:
b
2¼b
20in EFNLMs canbe expressed as
PrðSirxÞ ¼ Gpq,lðxÞ þ
X3 k ¼ 0
bikGpq þ 2k,lðxÞ þ Oðn1Þ, i ¼ 1,2,3,4,
where Gm,lðxÞ is the cumulative distribution function of a central chi-square variate with m degrees of freedom and
non-centrality parameter
l
. Here,l
¼f
trfK22:1ee
>g=2, where K22:1¼Kb22Kb21K1b11Kb12and trðÞ is the trace operator. The coefficientsbik’s (i=1,2,3,4 and k=0,1,2,3) can be written, after extensive algebra, as b11=b11 L +b11 NL , b12=b12 L + b12 NL , b13=b13 L + b13 NL , b21=b21 L + b21 NL , b22 =b22 L + b22 NL , b23=b23 L + b23 NL , b31= b31 L + b31 NL , b32= b32 L + b32 NL , b33= b33 L + b33 NL , b41= b41 L + b41 NL , b42= b42 L + b42 NL , b43= b43 L + b43 NL , with bL 11¼
f
2trfðF þ GÞET ð2Þ þFTð3Þg þ1 2trfFZ 1dTg, bNL 11¼f
2trfWTðC þ 2PÞg þ 1 2trfWJTg, bL 12¼f
6trfðFGÞT ð3Þg, bNL 12¼b L 13¼b NL 13¼0,bL 21¼
f
2trfðF þGÞET ð2Þ þFTð3Þg þ1 2trfFZ dT þ2GðZ dZ 1dÞTg, bNL 21¼f
2trfWTðC þ2PÞg þ 1 2trfWðUT þ 2HÞg, bL 22¼f
2trfGT ð3Þ g1 2trfðF þ2GÞðZ dZ 1dÞTg, bNL 22¼f
2trfWTCg 1 2trfWTðUJÞ þ 2WHg, bL 23¼f
6trfðF þ2GÞT ð3Þ g, bNL 23¼f
2trfWTCg, bL 31¼f
2trfðF þGÞET ð2ÞþFTð3Þ g þ1 2trfFZ 1dT þ ðFGÞðZdZ1dÞTg, bNL 31¼f
2trfWTðC þ2PÞg þ 1 2trfWTJg, b L 32¼ 1 2trfðFGÞðZ dZ 1dÞTg, bL 33¼f
6trfðFGÞT ð3Þg, bNL 32¼bNL33¼0, bL 41¼f
2trfðF þGÞET ð2Þ þFTð3Þg þ1 4trfð3F þ2GÞZ 1dTðF þ 2GÞZ dTg, bNL 41¼f
2trfWTðC þ2PÞg þ 1 4trfWTð3JUÞ2WHg, bL 42¼f
4trfFT ð3Þ g þ1 4trfðF þ2GÞðZ dZ 1dÞTg, bNL 42¼f
4trfWTCg þ 1 4trfWTðUJÞ þ2WHg, bL43¼f
12trfðF þ2GÞT ð3Þ g, bNL43¼f
4trfWTCg, where U ¼ diagfu1, . . . ,ungwith ul¼trfXlðX> WX Þ1g, J ¼ diagfj1, . . . ,jngwith jl¼trfX11lðX > 1 WX 1Þ 1 g, C ¼ diagfc1, . . . ,cngwith cl¼trfXl
e
e
>g, P ¼ diagfp1, . . . ,pngwith pl¼trfXle
d
> g, H ¼ diagfh1, . . . ,hngwith hl¼f
trfMXle
x>l g,d
> ¼ ð0>,e
>Þand x> l isthe lth line of X. The coefficients b
i0are obtained from bi0= (bi1+ bi2+ bi3), for i=1,2,3,4. The bik’s are of order n1/2and all
quantities except
e
are evaluated under the null hypothesis H0. Details of the derivation of these expressions are tedious and may beobtained from the author upon request.
A brief commentary on these coefficients is in order. They depend on the second derivative of the nonlinear function f ðxl;
b
Þ. They are functions of the local derivative matrix, of the dispersion parameter and of the means. They involve the linkfunction and its first and second derivatives, and the variance function through its first derivative. The coefficients bik L
(i= 1,2,3 and k= 0,1,2,3) become equal to the corresponding coefficients developed for GLMs (Cordeiro et al., 1994) by replacing Z
and Z
1by Z and Z1, respectively. On the other hand, the coefficients bik NL
(i=1,2,3 and k= 0,1,2,3) may be regarded as the amount of nonlinearity in the likelihood ratio, Wald and Rao score statistics induced by the systematic component of the model, since they vanish for linear models. The coefficients b4k
NL
(k= 0,1,2,3) for the gradient statistic have the same interpretation as above described. In particular, for GLMs we have b4k
NL
= 0 (k= 0,1,2,3). Hence, the nonnull asymptotic expansion of the gradient statistic is defined by the coefficients b4k
L
(k=0,1,2,3), which seems to be a new result in this class of models.
Some relevant reductions in the coefficients bik(i= 1,2,3,4 and k=0,1,2,3) can be achieved by examining special models.
For example, consider the null hypothesis H0:
b
¼b
0(i.e. q =0) and in the exponential family (1) consider an identity linkfunction, which implies that fl= 0 and gl¼ Vl2dVl=d
m
l(l= 1,y,n). Thus, the bik’s can be written asb11¼
f
2trfGET ð2Þ g þf
2trfWTðC þ2PÞg, b12¼f
6trfGT ð3Þ g, b13¼0, b21¼f
2trfGET ð2Þ g þtrfGZ dTg þf
2trfWTðC þ2PÞg þ 1 2trfWðUT þ 2HÞg, b22¼f
2trfGT ð3Þ gtrfGZ dTg þf
2trfWTCg 1 2trfWðUT þ 2HÞg,b23¼
f
3trfGT ð3Þ gf
2trfWTCg, b31¼f
2trfGET ð2Þ g1 2trfGZ dTg þf
2trfWTðC þ2PÞg, b32¼ 1 2trfGZ dTg b33¼f
6trfGT ð3Þ g, b41¼f
2trfGET ð2Þ g1 2trfGZ dTg þf
2trfWTðC þ2PÞg 1 4trfWðUT þ2HÞg, b42¼ 1 2trfGZ dTgf
4trfWTCg þ 1 4trfWðUT þ 2HÞg, b43¼f
6trfGT ð3Þ g þf
4trfWTCg,and bi0= (bi1+bi2+ bi3), for i =1,2,3,4. For the normal model, the above coefficients reduce to
b11¼b31¼
f
2trfWTðC þ2PÞg, b12¼b13¼b32¼0, b21¼f
2trfWTðC þ 2PÞg þ 1 2trfWðUT þ 2HÞg, b22¼ 2b42¼f
2trfWTCg 1 2trfWðUT þ 2HÞg, b23¼ 2b43¼f
2trfWTCg, b41¼f
2trfWTðC þ 2PÞg 1 4trfWðUT þ 2HÞg, and bi0= (bi1+bi2+ bi3), for i=1,2,3,4.We now turn to the problem of testing hypotheses on
f
, the precision parameter. The nonnull asymptotic distributions of the statistics S1, S2, S3and S4for testing H0:f
¼f
0under the local alternative H1n:f
¼f
0þe
, wheree
¼f
f
0is assumed tobe O(n1/2), is
PrðSirxÞ ¼ G1,lðxÞ þ
X3 k ¼ 0
bikG1 þ 2k,lðxÞ þ Oðn1Þ, i ¼ 1,2,3,4,
with
l
¼ nc100ðf
0Þe
2. The bik’s for the test of H0:f
¼f
0are easy to obtain and are given byb11¼ p
e
2f
0 , b12¼ nc000 1ðf
0Þe
3 6 , b13¼0, b21¼b31¼ pe
2f
0 c 000 1ðf
0Þe
2c100ðf
0Þ , b22¼b32¼ c000 1ðf
0Þe
2c100ðf
0Þ , b23¼b33¼ nc000 1ðf
0Þe
3 6 , b41¼ pe
2f
0 þc 000 1ðf
0Þe
4c100ðf
0Þ , b42¼ c000 1ðf
0Þe
4c100ðf
0Þ nc 000 1ðf
0Þe
3 4 , b43¼ nc000 1ðf
0Þe
3 12 , where c0001ð
f
Þ ¼dc100ðf
Þ=df
and bi0¼ ðbi1þbi2þbi3Þ, for i= 1,2,3,4. It should be noticed that the above expressions depend onthe model only through
f
and the rank of the matrix X; they do not involve the unknown parameterb
. The coefficients bik(i= 1,2,3 and k= 0,1,2,3) are exactly the same given byCordeiro et al. (1994)for GLMs. Therefore, the nonnull asymptotic expansions of the statistics S1, S2and S3for testing H0:
f
¼f
0are the same for any nonlinear regression structure with thesame p. The coefficients b4k(k =0,1,2,3), which are also valid for GLMs, seem to be a new result.
5. Power comparisons
It is known that, to the first-order of approximation, the likelihood ratio, Wald, score and gradient statistics have the same asymptotic distributional properties either under the null hypothesis or under a sequence of local alternatives. On the other hand, up to an error of order n1the corresponding criteria have the same size properties but their local powers differ in the
n1/2term. A meaningful comparison among the criteria can then be performed by comparing the nonnull asymptotic
expansions to order n1/2
In what follows, we shall compare the local powers among the rival tests based on the general nonnull asymptotic expansions derived in Section 4 for testing the null hypothesis H0:
b
2¼b
20. LetP
ibe the power function, up to order n1/2, ofthe test that uses the statistic Si, for i= 1,2,3,4. We have
P
iP
j¼X3 k ¼ 0
ðbjkbikÞGpq þ 2k,lðxÞ, ð3Þ
for iaj. It is well known that
Gm,lðxÞGm þ 2,lðxÞ ¼ 2gm þ 2,lðxÞ, ð4Þ
where gn,lðxÞ is the probability density function of a non-central chi-square random variable with
n
degrees of freedom andnon-centrality parameter
l
. From (3) and (4), after some algebra, we haveP
1P
4¼k1gpq þ 4,lðxÞ þ k2gpq þ 6,lðxÞ,P
2P
4¼k3gpq þ 4,lðxÞ þk4gpq þ 6,lðxÞ,P
3P
4¼k5gpq þ 4,lðxÞ þ k6gpq þ 6,lðxÞ,P
1P
2¼k7gpq þ 4,lðxÞ þk8gpq þ 6,lðxÞ,P
1P
3¼k9gpq þ 4,lðxÞ þ k10gpq þ 6,lðxÞ,P
2P
3¼k11gpq þ 4,lðxÞ þk12gpq þ 6,lðxÞ, ð5Þ where k1¼ 12trfðF þ 2GÞðZ dZ 1dÞTg þ12trfWTðJUÞ2WHg, k2¼f
6trfðF þ 2GÞT ð3Þ gf
2trfWTCg, k3¼3k1, k4¼3k2, k5¼k1trfðFGÞðZdZ 1dÞTg, k6¼f
2trfFT ð3Þ gf
2trfWTCg, k7¼ 2k1, k8¼ 2k2, k9¼k1k5, k10¼f
3trfðFGÞT ð3Þ g, k11¼ 3trfGðZdZ 1dÞTgtrfWTðUJÞ þ2WHg, k12¼f
trfGTð3Þgf
trfWTCg:From Eqs. (5) we have
P
14P
3if k9Z0 and k10Z0 with k9þk1040, and if k9r0 and k10r0 with k9þk10o0, we haveP
1oP
3. Also,P
1¼P
3if k9= k10= 0, i.e. F ¼ G, which occurs only for normal models with any link function. Additionally,Eqs. (5) show that unless the tests which are based on the statistics S1(likelihood ratio statistic) and S3(score statistic), is not
possible to have any other equality among the power functions in the class of EFNLMs for testing the null hypothesis H0:
b
2¼b
20, because the additional contribution yielded by the nonlinear regression structure given by the function f ðxl;b
Þin the k’s, which is represented by the quantities C, H, J and U, vanish only for linear models. It implies that only strict inequality holds for any other power comparison among the power functions. For example, from (5) we have
P
14P
4ð
P
1oP
4Þif k1Z0 and k2Z0 with k1þk240 (if k1r0 and k2r0 with k1þk2o0), and so on.We now consider the class of GLMs. In this class of models we have that C ¼ H ¼ J ¼ U ¼ 0. It is possible to show that
P
1¼P
2¼P
4if F ¼ 2G, that is d2m
l dZ
2 l ¼ 2 3Vl dm
l dZ
l 2 , l ¼ 1, . . . ,n:According toCordeiro et al. (1994), the GLMs for which this equality holds have the link function defined by
Z
l¼R
Vl3=2d
m
l(l=1,y,n), i.e. their log-likelihood functions are locally symmetric in the neighbourhood of the true parameter
b
. For the gamma model this function isZ
l¼m
1=3
l (l ¼ 1, . . . ,n). Additionally, we have that
P
3¼P
4for any GLM with identity link function, i.e.F ¼ 0. Also, the equality
P
1¼P
2¼P
3¼P
4holds only for normal models with identity link function. Power comparisons amongthe likelihood ratio, Wald and score tests in GLMs are not presented here and can be found inCordeiro et al. (1994). Now, we present an analytical comparison among the local powers of the four tests for testing the null hypothesis H0:
f
¼f
0. We haveP
iP
j¼ X3 k ¼ 0 ðbjkbikÞG1 þ 2k,lðxÞ: Thus,P
1P
2¼P
1P
3¼ c000 1ðf
0Þe
c100ðf
0Þ g5,lðxÞ þ nc000 1ðf
0Þe
3 6 g7,lðxÞ,P
1P
4¼ c000 1ðf
0Þe
2c100ðf
0Þ g5,lðxÞ nc000 1ðf
0Þe
3 6 g7,lðxÞ,P
2P
3¼0,P
2P
4¼P
3P
4¼ 3c000 1ðf
0Þe
2c100ðf
0Þ g5,lðxÞ nc000 1ðf
0Þe
3 2 g7,lðxÞ:For example, for normal and inverse Gaussian models we have that c1ð
f
Þ ¼logðf
Þ=2, which implies that c1uðf
Þ ¼1=ð2f
Þ, c100ðf
Þ ¼ 1=ð2f
2
Þ and c000
1ð
f
Þ ¼1=f
3. Hence, from above expressions we arrive at the following inequalities:
P
44P
14P
2¼P
3iff
4f
0, andP
4oP
1oP
2¼P
3iff
of
0.As we showed, there is no uniform superiority of one test with respect to the others for testing the null hypothesis H0:
b
2¼b
20in the class of EFNLMs. Hence, if the sample size is large, all tests could be recommended, since type I errorprobabilities of these tests do not significantly deviate from the true nominal level. However, the natural question now is how these tests perform when the sample size is small or of moderate size, and which one is the most reliable. In the next section, we shall use Monte Carlo simulations to put some light on this issue.
6. Simulation results
In this section we shall present the results of a Monte Carlo simulation in which we evaluate the finite-sample performance of the tests that use the likelihood ratio, Wald, score and gradient statistics in the following nonlinear normal model with identity link function:
m
l¼b
1xl1þexpðb
2xl2Þ þb
3xl3þ þb
pxlp, ð6Þwhere xl1 = 1, for l = 1,y,n. The lth line of the n p local matrix Xis given by x>l ¼ ðxl1,xl2, . . . ,xlpÞ, where xl1 n
= 1, x
l2¼xl2expð
b
2xl2Þand xlp n= xlpfor p Z3. The covariate values were selected as random draws from the uniform Uð1,2Þ
distribution and for fixed n those values were kept constant throughout the experiment. The number of Monte Carlo replications was 15,000, the nominal levels of the tests were
g
= 10%, 5% and 1%, and all simulations were performed using the Ox matrix programming language (Doornik, 2007). Ox is freely distributed for academic purposes and available athttp:// www.doornik.com.At the outset, the null hypothesis is H0:
b
p1¼b
p¼0, which is tested against a two-sided alternative, the sample size isn = 25 and
f
¼1 and 2. Different values of p were considered. The values of the response were generated usingb
1¼ ¼b
p2¼1. The null rejection rates (entries are percentages) of the four tests are presented inTable 2. Note thatthe likelihood ratio (S1) and Wald (S2) tests are markedly liberal, more so as the number of regressors increases. The tests that
use the statistics S3(score test) and S4(gradient test) are also liberal in most of the cases, but much less size distorted than the
likelihood ratio and Wald tests in all cases. For instance, when
f
¼2, p = 5 andg
¼5%, the rejection rates are 9.46% (S1), 12.11%(S2), 6.63% (S3) and 6.63% (S4). In some cases the score and gradient tests are slightly conservative. It is also noticeable that the
score and gradient tests present the same null rejection rates in most of the cases.
Table 3reports results for
f
¼1, p= 5 and sample sizes ranging from 15 to 150. The null hypothesis under test is H0:b
4¼b
5¼0. As expected, the null rejection rates of all the tests approach the corresponding nominal levels as the samplesize grows. Again, the score and gradient tests present the same null rejection rates in most of the cases and the best performances. For example, when n =15 and
g
¼5%, the null rejection rates are 13.26% (S1), 18.55% (S2), 7.59% (S3)and 7.60% (S4).
Overall, in small to moderate-sized samples the best performing tests are the score and the gradient tests. They are less size distorted than the other two. The gradient test has a slight advantage over the score test because the gradient statistic is
Table 2
Null rejection rates (%);f= 1 and 2, with n = 25.
p g¼10% g¼5% g¼1% S1 S2 S3 S4 S1 S2 S3 S4 S1 S2 S3 S4 f¼1 4 14.71 17.15 12.16 12.16 8.33 10.87 5.97 5.97 2.34 4.02 0.86 0.86 5 15.81 18.24 13.13 13.11 8.90 11.66 6.35 6.35 2.55 4.26 1.01 1.01 6 17.31 20.09 14.38 14.39 10.16 12.99 7.31 7.30 3.09 4.93 1.37 1.37 7 19.61 22.48 16.67 16.68 11.81 14.88 8.49 8.49 3.88 5.99 1.67 1.67 8 20.86 23.59 17.53 17.53 13.01 15.88 9.84 9.84 4.19 7.01 2.01 2.01 f¼2 4 14.10 16.92 11.63 11.63 7.90 10.34 5.53 5.53 1.99 3.69 0.81 0.81 5 16.19 18.73 13.49 13.49 9.46 12.11 6.63 6.63 2.46 4.35 0.98 0.98 6 17.52 20.35 14.52 14.52 10.34 12.96 7.49 7.49 2.91 5.00 1.20 1.20 7 19.31 22.21 16.39 16.37 11.82 14.81 8.73 8.73 3.58 6.05 1.60 1.60 8 20.99 23.87 17.90 17.88 13.17 16.38 9.71 9.73 4.35 6.92 2.15 2.14
more simpler to calculate than the score statistic for testing a subset of regression parameters in EFNLMs. In particular, it is not necessary to invert a matrix; see Section 3.
7. Concluding remarks
In this paper, we dealt with the issue of performing hypothesis testing concerning the parameters in EFNLMs. We considered the three classic tests, likelihood ratio, Wald and score tests, and a recently proposed test, the gradient test. We have derived formulae for the asymptotic expansions up to order n1/2
of the distribution functions of the likelihood ratio, Wald, score and gradient statistics, under a sequence of Pitman alternatives, for testing a subset of regression parameters and for testing the dispersion parameter in EFNLMs. In particular, we generalise the results given inCordeiro et al. (1994)and
Ferrari et al. (1997), which are valid only for GLMs. The formulae derived are simple to be used analytically to obtain closed-form expressions for these expansions in special models. Additionally, the power of all four criteria, which are equivalent to first order, was compared under specific conditions based on second-order approximations. Finally, we also present Monte Carlo simulations in order to compare the finite-sample performance of these tests for testing a subset of regression parameters. Based on the simulation results we can conclude that the score and gradient tests should be preferred.
Acknowledgement
The author gratefully acknowledge the financial support of FAPESP (Brazil).
References
Cavalcante, A.B., Cordeiro, G.M., Botter, D.A., Barroso, L.P., 2009. Asymptotic skewness in exponential family nonlinear models. Communications in Statistics—Theory and Methods 38, 2275–2287.
Cordeiro, G.M., Botter, D.A., Ferrari, S.L.P., 1994. Nonnull asymptotic distributions of three classic criteria in generalised linear models. Biometrika 81, 709–720.
Cordeiro, G.M., McCullagh, P., 1991. Bias correction in generalized linear models. Journal of the Royal Statistical Society B 53, 629–643. Cordeiro, G.M., Paula, G.A., 1989. Improved likelihood ratio statistics for exponential family nonlinear models. Biometrika 76, 93–100.
Cordeiro, G.M., Santana, R.G., 2008. Covariance matrix formula for exponential family nonlinear models. Communication in Statistics—Theory and Methods 37, 2724–2734.
Cysneiros, A.H.M.A., Ferrari, S.L.P., 2006. An improved likelihood ratio test for varying dispersion in exponential family nonlinear models. Statistics and Probability Letters 76, 255–265.
Doornik, J.A., 2007. An Object-Oriented Matrix Language—Ox 5, fifth ed. Timberlake Consultants Press, London.
Ferrari, S.L.P., Botter, D.A., Cribari-Neto, F., 1997. Local power of three classic criteria in generalised linear models with unknown dispersion. Biometrika 84, 482–485.
Ferrari, S.L.P., Cysneiros, A.H.M.A., 2008. Skovgaard’s adjustment to likelihood ratio tests in exponentional family nonlinear models. Statistics and Probability Letters 78, 3047–3055.
Harris, P., Peers, H.W., 1980. The local power of the efficient score test statistic. Biometrika 67, 525–529.
Hayakawa, T., 1975. The likelihood ratio criterion for a composite hypothesis under a local alternative. Biometrika 62, 451–460. Kosmidis, I., Firth, D., 2009. Bias reduction in exponential family nonlinear models. Biometrika 96, 793–804.
Lemonte, A.J., Ferrari, S.L.P., 2010. The local power of the gradient test. Annals of the Institute of Statistical Mathematics, doi:10.1007/s10463-010-0315-4. Lin, J., Wei, B.C., Nansong, Z., 2003. Testing for varying dispersion in discrete exponential family nonlinear models. Applied Mathematics—A Journal of
Chinese University B 18, 296–302.
McCullagh, P., Nelder, J.A., 1989. Generalized Linear Models, second ed. Chapman and Hall, London.
Paula, G.A., 1992. Bias correction for exponential family nonlinear models. Journal of Statistical Computation and Simulation 40, 43–54.
Rao, C.R., 2005. Score test: historical review and recent developments. In: Balakrishnan, N., Kannan, N., Nagaraja, H.N. (Eds.), Advances in Ranking and Selection, Multiple Comparisons, and Reliability. Birkhauser, Boston.
Ratkowsky, D.A., 1983. Nonlinear Regression Modeling: A Unified Practical Approach. Marcel Dekker, New York. Ratkowsky, D.A., 1990. Handbook of Nonlinear Regression Models. Marcel Dekker, New York.
Simas, A.B., Cordeiro, G.M., 2009. Adjusted pearson residuals in exponential family nonlinear models. Journal of Statistical Computation and Simulation 79, 411–425.
Terrell, G.R., 2002. The gradient statistic. Computing Science and Statistics 34, 206–215. Wei, B.C., 1998. Exponential Family Nonlinear Models. Springer, Singapore.
Table 3
Null rejection rates (%);f¼1, p = 5 and different sample sizes.
n g¼10% g¼5% g¼1% S1 S2 S3 S4 S1 S2 S3 S4 S1 S2 S3 S4 15 21.27 26.05 15.87 15.89 13.26 18.55 7.59 7.60 4.57 9.03 0.95 0.93 20 17.45 20.83 13.65 13.64 10.23 13.61 6.77 6.77 3.20 5.85 1.01 1.01 30 15.15 17.27 12.61 12.61 8.47 10.53 6.25 6.25 2.31 3.62 1.03 1.03 40 13.53 15.07 12.05 12.05 7.24 8.94 5.68 5.68 1.74 2.67 0.99 0.99 50 12.37 13.63 11.24 11.24 6.71 7.81 5.58 5.58 1.63 2.23 1.01 1.01 70 11.34 12.21 10.63 10.63 6.17 6.88 5.33 5.33 1.28 1.71 0.93 0.93 100 11.07 11.55 10.59 10.59 5.69 6.19 5.19 5.19 1.27 1.50 1.03 1.03 150 10.33 10.69 10.03 10.03 5.33 5.57 5.03 5.03 1.07 1.20 0.89 0.89