The Economic and Social Review, Vol. 26, No. 1, October, 1994, pp. 69-74
Artificial Regression Based Mis-Specification
Tests for Discrete Choice Models
A N T H O N Y M U R P H Y University College, Dublin
Abstract: Lagrange Multiplier (LM) tests for omitted variables, neglected heteroscedasticity and
other mis-specifications i n general discrete choice models may be simply and conveniently cal culated using an artificial regression. This artificial regression approach is likely to have better small sample properties than the more common outer product gradient (OPG) form of L M test.
I I N T R O D U C T I O N
D
avidson a n d M a c K i n n o n (1984b) derive convenient L a g r a n g e M u l t i p l i e r ( L M ) tests for o m i t t e d variables and neglected heteroscedasticity i n l o g i t a n d p r o b i t m o d e l s .1 T h e i r L M tests are convenient since t h e y arebased on a r t i f i c i a l regressions. They are also l i k e l y to have good s m a l l sample properties since they do n o t use t h e outer product g r a d i e n t (OPG) f o r m of t h e L M t e s t .2 I n s t e a d t h e i n f o r m a t i o n m a t r i x is calculated as t h e expectation o f
t h e outer p r o d u c t o f t h e score a n d is n o t j u s t a p p r o x i m a t e d by t h e o u t e r product of t h e score. The Davidson a n d M a c K i n n o n approach m a y be used to derive m a n y other mis-specification tests i n b o t h b i n a r y choice models (e.g. tests of n o r m a l i t y i n p r o b i t models and a s y m m e t r y i n l o g i t models) a n d some more general discrete choice models (e.g. n o r m a l i t y i n censored b i v a r i a t e p r o b i t models).
Paper presented at the Eighth Annual Conference of the Irish Economic Association. L Engle (1984) also suggests using this approach.
I n t h i s paper a r t i f i c i a l regression based L M tests for g e n e r a l discrete choice models are d e r i v e d . The tests are l i k e l y to have good s m a l l sample properties since t h e O P G f o r m o f t h e L M test is not used. The proposed L M tests m a y be used to detect a range of mis-specifications such as o m i t t e d v a r i a b l e s , n e g l e c t e d h e t e r o g e n e i t y , i n c o r r e c t f u n c t i o n a l f o r m a n d n o n -n o r m a l i t y / a s y m m e t r y i -n ordered probit/logit models.
I I G E N E R A L D I S C R E T E C H O I C E M O D E L
Consider a discrete choice model w i t h a r a n d o m sample of N i n d i v i d u a l s , denoted by subscript i , a n d J + l alternatives n u m b e r e d from 0 to J. L e t y^ be a n i n d i c a t o r v a r i a b l e for i n d i v i d u a l i and a l t e r n a t i v e j . T h u s , yy- equals one i f
i n d i v i d u a l i selects a l t e r n a t i v e j ; otherwise yy equals zero. L e t Py be t h e p r o b a b i l i t y t h a t i selects a l t e r n a t i v e j . py- depends on t h e parameter vector 8.
T h e t r u e p a r a m e t e r is assumed to be i n t h e i n t e r i o r of the parameter space. F o r a n y i n d i v i d u a l b o t h t h e s u m of t h e y^'s a n d t h e Py's across t h e J + l alternatives equal one.
W i t h a r a n d o m sample the log l i k e l i h o o d is:
1 = 1 Z V y l n P i j (1) i=i j=o
a n d t h e score equals:
51 = y± §Py
80 f t p y 50
(2)
w h i c h m a y be recast as:
61_ 50 = 1 1
• j
1 §P i j
/dT 50 (3)
= I I u
i jz
i jsince ZjSpy / 6 0 = 0. T h e u - ' s m a y be t h o u g h t of as standardised residuals. T h e y have zero means, variances equal t o 1 - Py and covariances equal t o
r W 1 81 81
I„„ = h m E — 6 9 N-+~ N 89 89'
= l i m E - U l A ^ | f (4) N- » o o N i j Pij 89 86'
_1 N " " N T ' T = l i m E — H z y Z y
since t h e sample is r a n d o m , E y ^2 = Ey^ = p^ a n d E y ^ y ^ = 0 w h e n j * k . T h e
i n f o r m a t i o n m a t r i x is assumed to be non-singular i n the neighbourhood o f the t r u e parameter value.
I l l L M T E S T S T A T I S T I C
The L M test statistic of the n u l l 9 = 60 is:
L M = l _ 5 _ i « - i » . (5)
N 86' e e 86
w h e r e b o t h the score a n d t h e observed i n f o r m a t i o n m a t r i x I0-- are e v a l u a t e d
u s i n g t h e r e s t r i c t e d parameter estimates 8.
T h e observed i n f o r m a t i o n m a t r i x is:
t o = E l y y S i i l
6 9 N f t 8 6 S 9 '
N i j Pij 86' 86' '
U n d e r s t a n d a r d weak r e g u l a r i t y conditions, t h e L M test s t a t i s t i c has a c h i -s q u a r e d d i -s t r i b u t i o n w i t h degree-s o f freedom e q u a l t o t h e n u m b e r o f restrictions u n d e r the n u l l .
I V A R T I F I C I A L R E G R E S S I O N B A S E D L M T E S T S T A T I S T I C
The L M test statistic m a y be calculated as:
L M = X l U y Z y
V1 J J V1 J
( l U i j Z i j ) (7)
where u ^ a n d z,j are the estimates of u ^ a n d z{- a t the r e s t r i c t e d p a r a m e t e r
13
Zy =
Pij 50y
I n (7) t h e L M t e s t s t a t i s t i c is s i m p l y the explained s u m of squares from t h e u n c e n t r e d a u x i l i a r y r e g r e s s i o n o f t h e u^'s o n t h e z^'s across a l l J + l a l t e r n a t i v e s a n d N i n d i v i d u a l s .
T h e L M test statistic is also asymptotically equal to N J times the R2 from t h i s
a u x i l i a r y regression. The proof is the same as i n McFadden, 1987. Note t h a t :
N J R2 = L M
i f f ? * '
and
N J i N J T j
Use t h e m e a n v a l u e t h e o r e m to expand the f i r s t t e r m as t h e p r o d u c t o f a stochastically bounded expression a n d 9 - 9 w h i c h has a p l i m of zero. T h u s t h e f i r s t t e r m has!a p l i m o f zero. F i n a l l y , note t h a t t h e expectation o f t h e second t e r m is one,;since Euj- = l - p ^ and X j Z j E u f j / N J = 1, and the variance tends to zero as N tends to i n f i n i t y . Thus the p l i m of N J times the R2 from the
a u x i l i a r y regression equals the L M test statistic.
V B I N A R Y C H O I C E M O D E L
I n t h e special case o f t w o alternatives 0 a n d 1, the log likelihood equals:
l = i { ( l - y i ) l n ( l - P i ) + y i l nP i} (10
i - 1
a n d t h e score equals:
81_ 89
= Z
( i r \ ( \
= 1 _ 1 -
+
Z ii
I
1
"
PiJ I
P i , (59
y i - P i 8pj
= Irixi
where y; is a n i n d i c a t o r variable (i.e. y ; is one i f a l t e r n a t i v e one is chosen a n d
zero otherwise) a n d pj is t h e p r o b a b i l i t y t h a t a l t e r n a t i v e one is chosen. T h e r;
are j u s t t h e scaled residuals — t h e y have a zero m e a n a n d a u n i t variance. N o t e t h a t , u s i n g the n o t a t i o n of t h e previous section:
1 1 yio = i - y i » y i i = yi.Pio = 1- P i . P i i = P i .ri = 1 ^ , ^ = i zs
j = 0 j = 0
The i n f o r m a t i o n m a t r i x is:
I
e e= l i m E ^ - I x ^ (5')
N - > ~ N ;
a n d t h e L M test statistic is:
L M =
^ I r
ii j J l x
ix [ j
( i r j X i ) (7')where t h e observed scaled residuals and regressors are:
f - y i~ P '
X; =
V P i ( l - P i )
1 8P i
1
VpTu
1^)^
As Davidson a n d M a c K i n n o n (1984) point out, t h e L M test statistic is j u s t t h e explained sum o f squares from the uncentred a u x i l i a r y regression of u;o n x;.
A s y m p t o t i c a l l y N t i m e s t h e R2 f r o m t h i s regression equals t h e L M t e s t
statistic.
V I C O N C L U S I O N
REFERENCES
DAVIDSON, R., and J . G . MacKINNON, 1983. "Small Sample Properties of Alternative Forms of the Lagrange Multiplier Test", Economic Letters, Vol. 12, pp. 269-275.
DAVIDSON, R., and. J . G . MacKINNON, 1984a. "Model Specification Tests Based on Artificial Linear Regressions", International Economic Review, Vol. 25, pp. 485-502. DAVIDSON, R., and J . G . MacKINNON, 1984b. "Convenient Specification Tests for
the Logit and Probit Models", Journal of Econometrics, Vol. 25, pp. 241-262.
DAVIDSON, R., and J . G . MacKINNON, 1993. Estimation and Inference in
Econometrics, New York: Oxford University Press.
E N G L E , R . F . , 1984. "Wald, Likelihood Ratio and Lagrange Multiplier Tests in Econometrics", Ch. 13, in Z. Griliches and M.D. Intriligator (eds.), Handbook of
Econometrics, Voli I I , pp. 775-826. Amsterdam: North-Holland.
G O D F R E Y , L . G . , 1988. Mis-Specification Tests in Econometrics, Cambridge: Cambridge University Press.
MacKINNON, J . G . , 1992. "Model Specification Tests and Artificial Regressions",
Journal of Economic Literature, Vol. XXX, pp. 102-146.