
2017 2nd International Conference on Information Technology and Management Engineering (ITME 2017) ISBN: 978-1-60595-415-8

Composite Quantile Generalized Quasi-Likelihood Ratio Tests for Varying Coefficient Regression Models

Jin-ju XU¹ and Zhong-hua LUO²,*

¹ School of Medicine Economic and Management, Anhui University of Chinese Medicine, Hefei, China, 230031

² School of Economics and Business Management, Gansu University of Traditional Chinese Medicine, Lanzhou, China, 730000

*Corresponding author

Keywords: Composite quantile regression, Varying coefficient model, Generalized quasi-likelihood ratio tests.

Abstract. A new test procedure, called the composite quantile generalized quasi-likelihood ratio (CQGQLR) test, is proposed in this paper to test whether all or some of the coefficients in varying coefficient regression models are constants or take some specific functional form. The test statistics are constructed by comparing the composite quantile quasi-likelihood functions under the null and alternative hypotheses. The proposed tests are applied to the Boston house price data. The simulation results and the real example illustrate the effectiveness and practical usefulness of the proposed test statistics.

AMS subject classifications. 62G05, 62G20, 60G42

Introduction

Since the seminal work of Koenker and Bassett (1978), there has been an abundance of literature on applications and theoretical extensions of quantile regression. Compared with conditional mean regression, regression quantiles have the important advantage of directly estimating the effects of the covariates on quantiles other than the center of the distribution. Quantile regression has been extensively applied in economics, finance, biology, medicine, and many other disciplines.

Many dimension-reduction techniques have been adopted for quantile regression, such as the additive model, the single index model and the varying coefficient quantile regression model. Honda (2004) and Cai and Xu (2008) considered the quantile varying coefficient model for time series data, Wu, Yu and Yu (2010) investigated the single index model for quantile regression, and Cai and Xiao (2012) discussed semiparametric quantile regression estimation in dynamic models with partially varying coefficients.

A varying coefficient regression model is a useful and natural extension of the classical linear regression model. Varying coefficient models assume the following conditional mean structure:

$$Y = \sum_{j=1}^{p} a_j(U) X_j + \varepsilon = A(U)^T X + \varepsilon, \tag{1.1}$$

where the $a_j(U)$ are unknown smooth functions and $\varepsilon$ is the random error.
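To make the structure of (1.1) concrete, the following Python sketch simulates data from a two-covariate varying coefficient model. The coefficient functions `a1` and `a2`, the uniform and normal sampling distributions, and all function names are assumptions chosen purely for illustration; none of them come from the paper.

```python
import math
import random


def a1(u):
    # Hypothetical smooth coefficient function (an assumption for illustration).
    return math.sin(2.0 * math.pi * u)


def a2(u):
    # Hypothetical smooth coefficient function (an assumption for illustration).
    return 1.0 + u * u


def simulate_varying_coefficient(n, noise_sd=1.0, seed=0):
    """Draw n tuples (U, X1, X2, Y) with Y = a1(U)*X1 + a2(U)*X2 + eps."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        u = rng.random()                # smoothing variable U in [0, 1)
        x1 = rng.gauss(0.0, 1.0)        # covariate X1
        x2 = rng.gauss(0.0, 1.0)        # covariate X2
        eps = rng.gauss(0.0, noise_sd)  # random error
        y = a1(u) * x1 + a2(u) * x2 + eps
        data.append((u, x1, x2, y))
    return data
```

Setting `noise_sd=0.0` gives responses that follow the coefficient functions exactly, which can be convenient when checking an estimation routine.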

Composite quantile regression (CQR) has recently gained considerable attention due to its ability to combine information across different quantile functions. CQR was proposed by Zou and Yuan (2008) for estimating the regression coefficients in classical linear regression models. Kai et al. considered local CQR estimation for semiparametric varying-coefficient partially linear models. However, to the best of our knowledge, few authors have considered testing problems for varying coefficient models, although such tests have broad potential applications. This motivates us to consider the problem within the framework of varying coefficient models. We propose a new test procedure, termed the composite quantile generalized quasi-likelihood ratio (CQGQLR) test, to test whether all or some of the coefficients are constants or take some specific functional form for varying coefficient regression models. The test statistics are constructed by comparing the composite quantile quasi-likelihood functions under the null and alternative hypotheses. We also apply the proposed tests to check whether existing models in the literature used to analyze the Boston house price data are appropriate. The simulation results and the real example illustrate the effectiveness and practical usefulness of the proposed test statistics.

Estimation of the Regression Coefficients

The varying coefficient quantile regression model takes the form

$$q_\tau(U_t, X_t) = \sum_{j=0}^{p} a_{j,\tau}(U_t) X_{tj} = A_\tau(U_t)^T X_t, \tag{2.1}$$

where $U_t \in \mathbb{R}^d$ is called the smoothing variable, $X_t = (X_{t0}, X_{t1}, \cdots, X_{tp})^T$ with $X_{t0} = 1$, the observations are i.i.d., and $A(U_t) = A_\tau(U_t) = (a_{0,\tau}(U_t), a_{1,\tau}(U_t), \cdots, a_{p,\tau}(U_t))^T$ is a vector of smooth coefficient functions of $U_t$, which might be some function of $X_{t0}, \ldots, X_{tp}$, time, or some other exogenous variable. Without loss of generality, we consider only the case in which $U_t$ in (2.1) is one-dimensional ($d = 1$). For simplicity, we drop $\tau$ from $a_{j,\tau}(\cdot)$ in what follows. To estimate the coefficient functions $A(\cdot)$, we apply the local fitting technique as follows. Assume $A(\cdot)$ has a continuous first derivative. For a given point $u$ and $U_i$ in a neighborhood of $u$, a Taylor expansion approximates $A(U_i)$ as

$$A(U_i) \approx \beta_0 + \beta_1 (U_i - u), \tag{2.2}$$

where $\beta_0 = A(u)$ and $\beta_1 = A'(u)$ is the first derivative of $A(\cdot)$ at $u$. Let $c_{\tau_k}$ denote the $100\tau_k\%$ quantile of $\varepsilon$. For a given $q$, let $\rho_{\tau_k}(r) = r\{\tau_k - I(r < 0)\}$, where $\tau_k = k/(q+1)$ for $k = 1, 2, \cdots, q$. Following the local CQR technique, $\beta_0$ and $\beta_1$ can then be estimated by minimizing the locally weighted CQR loss

$$\sum_{k=1}^{q} \sum_{i=1}^{n} \rho_{\tau_k}\{Y_i - X_i^T(\beta_0 + \beta_1 (U_i - u))\} K_h(U_i - u), \tag{2.3}$$

where $K(\cdot)$ is a kernel function, $K_h(x) = h^{-1} K(x/h)$, and $h = h_n$ is a sequence of positive numbers tending to zero, which controls the amount of smoothing used in estimation. The minimizer gives the local linear estimate of $A(u)$, denoted by $\hat{A}(u) = \hat{\beta}_0$.
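The ingredients of the locally weighted CQR loss can be sketched in a few lines of Python. This is a minimal illustration that only evaluates the loss for candidate values of $\beta_0$ and $\beta_1$ and does not perform the minimization. The Epanechnikov kernel and all function names are assumptions made for the example, and the local linear term uses the "+" sign of the Taylor expansion (2.2).

```python
def check_loss(r, tau):
    """Check function rho_tau(r) = r * (tau - 1{r < 0})."""
    return r * (tau - (1.0 if r < 0 else 0.0))


def quantile_levels(q):
    """Equally spaced levels tau_k = k / (q + 1), k = 1, ..., q."""
    return [k / (q + 1) for k in range(1, q + 1)]


def epanechnikov(x):
    # Kernel choice is an assumption; the paper only requires a kernel K.
    return 0.75 * (1.0 - x * x) if abs(x) <= 1.0 else 0.0


def local_cqr_loss(beta0, beta1, u, h, U, X, Y, q):
    """Evaluate the locally weighted CQR loss at the point u.

    beta0, beta1: coefficient lists of the same length as each row of X.
    U, X, Y: smoothing variable, covariate rows (first entry 1), responses.
    """
    total = 0.0
    for tau in quantile_levels(q):
        for Ui, Xi, Yi in zip(U, X, Y):
            weight = epanechnikov((Ui - u) / h) / h   # K_h(U_i - u)
            if weight == 0.0:
                continue
            fit = sum(x * (b0 + b1 * (Ui - u))
                      for x, b0, b1 in zip(Xi, beta0, beta1))
            total += check_loss(Yi - fit, tau) * weight
    return total
```

In practice the loss would be passed to a numerical optimizer over `beta0` and `beta1` at each grid point `u`; that step is omitted here.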

Test Statistics

Test of Functional Form of Varying Coefficients

The previous section was devoted to fitting a varying coefficient quantile regression model. We now turn to a general and interesting testing problem: checking whether the varying coefficients are of some specific functional form. This corresponds to the following hypotheses:

$$H_0: A_\tau(u) = A_{0,\tau}(u) \quad \text{versus} \quad H_1: A_\tau(u) \neq A_{0,\tau}(u), \tag{3.1}$$

where $A_{0,\tau}(u)$ is a vector of known functions.

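A likelihood ratio type test was proposed by Cai, Fan and Yao (2000) for the hypothesis testing problem formulated in (3.1) for the conditional mean regression model (1.1). The generalized likelihood ratio is defined as follows:

$$\ell(H_1) - \ell(H_0) = \frac{n}{2} \log\frac{RSS_0}{RSS_1} \approx \frac{n}{2}\,\frac{RSS_0 - RSS_1}{RSS_1}, \tag{3.2}$$

where

$$RSS_0 = \sum_{i=1}^{n} \Big(Y_i - \sum_{j=1}^{p} \hat{a}_{0j}(U_i) X_{ij}\Big)^2,$$

$\hat{a}_{0j}(\cdot)$ is the true or estimated value of the coefficients under $H_0$, and $RSS_1$ is the corresponding residual sum of squares under $H_1$.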

Motivated by Cai, Fan and Yao (2000), for varying coefficient quantile regression models we take the check function as the loss function instead of the squared error and propose a similar test statistic for the testing problem in (3.1). As elaborated in Komunjer (2005),

$$-\ell(H) = \sum_{i=1}^{n} \rho_\tau\Big\{Y_i - \sum_{j=0}^{p} a_j(U_i) X_{ij}\Big\}$$

can be regarded as the negative logarithm of the quasi-likelihood. The corresponding composite quantile generalized quasi-likelihood ratio (CQGQLR) test statistic is therefore defined as

$$T_n = \ell(H_1) - \ell(H_0) = \sum_{k=1}^{q}\sum_{i=1}^{n} \rho_{\tau_k}\Big\{Y_i - \sum_{j=0}^{p} a_{0j}(U_i) X_{ij}\Big\} - \sum_{k=1}^{q}\sum_{i=1}^{n} \rho_{\tau_k}\Big\{Y_i - \sum_{j=0}^{p} \hat{a}_j(U_i) X_{ij}\Big\}, \tag{3.3}$$

where

$$-\ell(H_1) = \sum_{k=1}^{q}\sum_{i=1}^{n} \rho_{\tau_k}\Big\{Y_i - \sum_{j=0}^{p} \hat{a}_j(U_i) X_{ij}\Big\},$$

$\hat{a}_j(\cdot)$ is the nonparametric estimate of $a_j(\cdot)$ obtained by the local linear estimation technique under the alternative hypothesis, and

$$-\ell(H_0) = \sum_{k=1}^{q}\sum_{i=1}^{n} \rho_{\tau_k}\Big\{Y_i - \sum_{j=0}^{p} a_{0j}(U_i) X_{ij}\Big\},$$

with $a_{0j}(\cdot)$ the true function under the null hypothesis.
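The CQGQLR statistic compares the composite check loss under the null hypothesis with the composite check loss under the alternative. The Python sketch below computes this difference from two vectors of fitted values. It assumes the fitted values $\sum_j a_{0j}(U_i) X_{ij}$ and $\sum_j \hat{a}_j(U_i) X_{ij}$ have already been obtained elsewhere; the function names are hypothetical.

```python
def check_loss(r, tau):
    """Check function rho_tau(r) = r * (tau - 1{r < 0})."""
    return r * (tau - (1.0 if r < 0 else 0.0))


def composite_check_loss(y, fitted, q):
    """Sum of check losses over tau_k = k / (q + 1), k = 1..q, and all i."""
    taus = [k / (q + 1) for k in range(1, q + 1)]
    return sum(check_loss(yi - fi, tau)
               for tau in taus
               for yi, fi in zip(y, fitted))


def cqgqlr_statistic(y, fitted_null, fitted_alt, q):
    """Composite loss under H0 minus composite loss under H1.

    Large values indicate that the null model fits markedly worse
    than the nonparametric alternative.
    """
    return (composite_check_loss(y, fitted_null, q)
            - composite_check_loss(y, fitted_alt, q))
```

For example, with `y = [1.0, 2.0, 3.0]`, null fits of zero, alternative fits equal to `y`, and `q = 3`, the alternative loss is zero and the statistic equals $(0.25 + 0.5 + 0.75) \times 6 = 9$.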

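Test of Constancy of Varying Coefficient

One special case of the hypothesis in (3.1) is that $A_{0,\tau}(u)$ is a vector of constants. The test then checks whether the varying coefficients are indeed varying, that is,

$$H_0: A_\tau(u) = A_{0\tau} \quad \text{versus} \quad H_1: A_\tau(u) \neq A_{0\tau}. \tag{3.4}$$

With a known constant vector $A_{0\tau}$, by the discussion above, the CQGQLR test statistic is defined as

$$T_n = \ell(H_1) - \ell(H_0) = \sum_{k=1}^{q}\sum_{i=1}^{n} \rho_{\tau_k}\Big\{Y_i - \sum_{j=0}^{p} a_{0j} X_{ij}\Big\} - \sum_{k=1}^{q}\sum_{i=1}^{n} \rho_{\tau_k}\Big\{Y_i - \sum_{j=0}^{p} \hat{a}_j(U_i) X_{ij}\Big\}, \tag{3.5}$$

where $\hat{a}_j(\cdot)$ is the nonparametric estimate of $a_j(\cdot)$ obtained by the local linear estimation technique under the alternative hypothesis, and $a_{0j}$ is the true constant value under the null hypothesis.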

Test of Constancy of Varying Coefficient with Unknown Value

In some applications, it may be more interesting to check the constancy of the varying coefficients when the true value $A_{0\tau}$ is unknown. Therefore, we consider the test statistic for the hypothesis in (3.4) with an unknown constant vector. Under the null hypothesis, one can estimate the coefficients $\hat{a}_{0j}$ by linear quantile regression and construct the quasi-likelihood as follows:

$$-\ell(H_0) = \sum_{k=1}^{q}\sum_{i=1}^{n} \rho_{\tau_k}\Big\{Y_i - \sum_{j=0}^{p} \hat{a}_{0j} X_{ij}\Big\}.$$

Then the composite quantile generalized quasi-likelihood ratio (CQGQLR) test statistic for the hypothesis testing problem in (3.4) with unknown $A_{0,\tau}$ is defined by

$$\begin{aligned}
T_n &= \ell(H_1) - \ell(H_0) \\
&= \Big[\sum_{k=1}^{q}\sum_{i=1}^{n} \rho_{\tau_k}\Big\{Y_i - \sum_{j=0}^{p} a_{0j} X_{ij}\Big\} - \sum_{k=1}^{q}\sum_{i=1}^{n} \rho_{\tau_k}\Big\{Y_i - \sum_{j=0}^{p} \hat{a}_j(U_i) X_{ij}\Big\}\Big] \\
&\quad + \Big[\sum_{k=1}^{q}\sum_{i=1}^{n} \rho_{\tau_k}\Big\{Y_i - \sum_{j=0}^{p} \hat{a}_{0j} X_{ij}\Big\} - \sum_{k=1}^{q}\sum_{i=1}^{n} \rho_{\tau_k}\Big\{Y_i - \sum_{j=0}^{p} a_{0j} X_{ij}\Big\}\Big] \\
&\equiv T_{n1} + T_{n2},
\end{aligned} \tag{3.6}$$

and $H_0$ is rejected for large values of $T_n$.
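The decomposition into two terms can be checked numerically. The Python sketch below assumes the reading in which the first term compares the true null value with the nonparametric fit and the second term compares the estimated null value with the true null value, so that the two terms telescope to the loss under the estimated null minus the loss under the alternative. The fitted values are taken as given inputs and all function names are hypothetical.

```python
def check_loss(r, tau):
    """Check function rho_tau(r) = r * (tau - 1{r < 0})."""
    return r * (tau - (1.0 if r < 0 else 0.0))


def composite_check_loss(y, fitted, q):
    """Sum of check losses over tau_k = k / (q + 1), k = 1..q, and all i."""
    taus = [k / (q + 1) for k in range(1, q + 1)]
    return sum(check_loss(yi - fi, tau)
               for tau in taus
               for yi, fi in zip(y, fitted))


def cqgqlr_decomposition(y, fit_true_null, fit_est_null, fit_alt, q):
    """Return (T_n1, T_n2, T_n) under the assumed reading of the decomposition.

    fit_true_null: fitted values using the true constant coefficients.
    fit_est_null:  fitted values using the estimated constant coefficients.
    fit_alt:       fitted values from the nonparametric estimate.
    """
    loss_true = composite_check_loss(y, fit_true_null, q)
    loss_est = composite_check_loss(y, fit_est_null, q)
    loss_alt = composite_check_loss(y, fit_alt, q)
    t_n1 = loss_true - loss_alt   # statistic with a known constant vector
    t_n2 = loss_est - loss_true   # correction for estimating the constants
    return t_n1, t_n2, t_n1 + t_n2
```

Because the true-null loss cancels in the sum, the total statistic can be computed in practice without knowing the true constant vector.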

A Real Example

In this section, we apply these methodologies to a real example. We analyze a subset of the Boston house price data (http://lib.stat.cmu.edu/datasets/boston) of Harrison and Rubinfeld (1978), which has been used to study the effect of air pollution on real estate prices in the greater Boston area in the 1970s. The data set consists of 506 observations on 14 variables. Following Cai and Xu (2008), who analyzed this data set using a varying coefficient quantile regression model, we focus on exploring the possible (linear, nonparametric or semiparametric) relationships between the dependent variable and some major factors which might affect the house price. We adopt the same notation as Cai and Xu (2008) to allow a comparison. $Y$ denotes the dependent variable, the median value of owner-occupied homes in units of 1,000 US dollars (house price). $U$ is the proportion of the population of lower educational status. $X_1$ is the average number of rooms per house in the area. $X_2$ denotes the per capita crime rate by town. $X_3$ is the full property tax rate per 1,000 US dollars. $X_4$ is the pupil/teacher ratio by town school district.

Many papers in the literature have investigated this data set; the reader is referred to Cai and Xu (2008) for details. In this section, we focus on two models. First, we consider the model from Cai and Xu (2008), which is the following quantile smooth coefficient model:

$$q_\tau(U_t, X_t) = a_{0,\tau}(U_t) + a_{1,\tau}(U_t) X_{t1} + a_{2,\tau}(U_t) X_{t2}^*, \tag{4.1}$$

where $X_{t2}^* = \log(X_{t2})$. The p-values for testing the constancy of the coefficients in model (4.1) are reported in Table 1. All the p-values in Table 1 are less than the significance level 0.05, which implies that the varying coefficients are indeed varying.

Table 1. The p-values for testing constancy in model (4.1).

τ        0.1     0.2     0.4     0.5     0.6     0.8
p-value  0.00    0.001   0.00    0.002   0.000   0.0001

Model (4.1) does not include the two variables $X_3$ and $X_4$. The reason, as claimed by Cai and Xu (2008), is that the functional coefficients for $X_3$ and $X_4$ may be constant. Therefore, we use the proposed test procedure to test whether the coefficients of $X_3$ and $X_4$ are constant. To this end, we consider the model

$$q_\tau(U_t, X_t) = a_{0,\tau}(U_t) + a_{3,\tau}(U_t) X_{t3}^* + a_{4,\tau}(U_t) X_{t4}^*, \tag{4.2}$$

and the testing problem with null hypothesis $H_0: A_\tau^*(u) = A_0^*$, where $A_\tau^*(u) = (a_{0,\tau}(u), a_{3,\tau}(u), a_{4,\tau}(u))^T$ and $A_0^*$ is a vector of unknown parameters. Following the test procedure in Section 3, we calculate the quasi-likelihood under the null hypothesis using linear parametric composite quantile regression. The corresponding p-values are reported in Table 2. All the p-values are greater than the significance level 0.05, so the hypothesis of constant coefficients is not rejected, which supports treating these coefficients as constant.

Table 2. The p-values for testing constancy in model (4.2).

τ        0.1     0.2     0.4     0.5     0.6     0.8
p-value  0.3402  0.3451  0.4571  0.4252  0.4784  0.4141
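The conclusions drawn from Tables 1 and 2 follow from comparing each p-value with the 0.05 level. A minimal Python sketch of that decision rule is given below, using the p-values exactly as printed in the two tables; the variable and function names are hypothetical.

```python
ALPHA = 0.05
TAUS = [0.1, 0.2, 0.4, 0.5, 0.6, 0.8]

# p-values as printed in Table 1 (model (4.1)) and Table 2 (model (4.2)).
P_TABLE1 = [0.00, 0.001, 0.00, 0.002, 0.000, 0.0001]
P_TABLE2 = [0.3402, 0.3451, 0.4571, 0.4252, 0.4784, 0.4141]


def reject_null(p_values, alpha=ALPHA):
    """Return, for each quantile level, whether H0 (constancy) is rejected."""
    return [p < alpha for p in p_values]


decisions_model_41 = dict(zip(TAUS, reject_null(P_TABLE1)))
decisions_model_42 = dict(zip(TAUS, reject_null(P_TABLE2)))
```

Here constancy is rejected at every reported quantile level for model (4.1) and at none of them for model (4.2).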

References

[1] Koenker, R. and Bassett, G., Regression Quantiles. Econometrica, 46, 33-50, 1978.

[2] Cai, Z. and Xu, X., Nonparametric Quantile Estimations for Dynamic Smooth Coefficient Models. Journal of the American Statistical Association, 103, 1595-1608, 2008.

[3] Honda, T., Quantile Regression in Varying Coefficient Models. Journal of Statistical Planning and Inference, 121, 113-125, 2004.

[4] Cai, Z. and Xiao, Z., Semiparametric Quantile Regression Estimation in Dynamic Models with Partially Varying Coefficients. Journal of Econometrics, 167, 413-425, 2012.

[5] Cai, Z., Fan, J. and Yao, Q., Functional-Coefficient Regression Models for Nonlinear Time Series. Journal of the American Statistical Association, 95, 941-956, 2000.

[6] Wu, T., Yu, K. and Yu, Y., Single-index Quantile Regression. Journal of Multivariate Analysis, 101, 1607-1621, 2010.

[7] Kai, B., Li, R. and Zou, H., New efficient estimation and variable selection methods for
