Economic and Social Review Vol. 9 No. 1

Randomisation and the von Neumann Function: A Variance Formula and a Problem

R. C. GEARY*

The Economic and Social Research Institute, Dublin

In a paper of many years ago (Geary 1952) what was termed the contiguity ratio was introduced, to determine whether, in probability, a statistical map has a pattern or whether the mapped statistics are distributed at random. This ratio is really a two-dimensional version of the von Neumann (1941) statistic, more familiar as that tabled for null-hypothesis normal OLS residuals by J. Durbin and G. S. Watson. Geary was also concerned with the OLS residual problem. He approached it in two ways, by randomisation and by classical OLS regression theory, his instruments being means and variances of the contiguity ratio.

A difficulty with randomisation treatment was expressed as follows: "The problem is to determine if there is a contiguity effect, i.e. if c (the contiguity ratio) has a significantly low value after the elimination of q independent variables by the least square method. As far as randomisation is concerned, it would appear that the test developed in this section can be applied formally, the z being the remainders after the contributions of the independent variables have been removed. To a certain extent the writer shares the misgivings of some other students about the validity of the randomization approach in its application to regression remainders. As each successive independent variable is removed, should not the degree of freedom be diminished? It does not seem so. What happens is that the variance (or range) of the remainders diminish as the effect of each independent variable is allowed for, the test becoming indeterminate when the number of independent variables (originally with mean zero) is one less than the number of observations n, i.e., when all the remainders are zero. Accordingly the formal application of the randomization procedure, without diminution of the number of degree of freedom, does not result in obvious inconsistency: we can conceive of cases where c will be significantly low even after removal of the effect of (n - 2) independent variables. Since doubts remain, however, the writer considered it desirable to examine the problem from the classical sampling aspect. In any case it will be interesting to compare the results of the two approaches. In the practical aspect the randomisation method has the advantage that it can be applied without the assumption of universal normality in the n observations, regarded as a random sample".

As far as the writer is aware the degrees of freedom problem has never been discussed in this application: the controversy in another context between K. Pearson and R. A. Fisher is part of statistical history. One of the objects of the present communication is to invite statisticians to discuss the problem.

The contiguity ratio context is too esoteric for a suitable discussion. The problem arises in the much simpler single dimension of the ratio. But the writer is unaware of any randomisation treatment of the von Neumann statistic, so he ventures to give one here without any claim to originality. One result is remarkable, as will be seen.

One is given a sample of n measures of any kind (they may be raw values, OLS residuals etc.), $x_1, x_2, \ldots, x_n$, ordered in a particular way. From a given function (e.g., the von Neumann ratio) one wants to make inferences about the character of the sample (is it probably non-normal, autoregressed etc.?). One considers the n! permutations of the sample values for each of which the test function has a value. These n! values are regarded as forming a frequency distribution. If the single value of the function found for the given ordered sample is near the ends of the frequency distribution (i.e., beyond the .05, .01 etc., limits) one rejects the hypothesis, exactly as in ordinary theory. A feature of the test is that no assumption is made about the frequency distribution from which the sample of n is drawn. In theory one could calculate the moments of the function (or at least the first four moments) and so estimate the frequency distribution using e.g., the Pearson curve system. Here we deal only with the first two moments, the mean and the variance, which suffice for most practical purposes.
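The randomisation procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the sample `trend`, and the choice of the sum of squared successive differences as test function are hypothetical, and the random-sampling fallback for large n is an addition not found in the paper, which works with moments instead of enumeration.

```python
import random
from itertools import permutations


def successive_difference_sum(x):
    """Illustrative order-dependent test function: sum of squared successive differences."""
    return sum((x[i] - x[i - 1]) ** 2 for i in range(1, len(x)))


def randomisation_p_value(x, statistic, max_exact=8, draws=20000, seed=0):
    """Share of orderings whose statistic is <= the value found for the given order."""
    observed = statistic(x)
    if len(x) <= max_exact:
        # Exact: each of the n! permutations contributes one value.
        values = [statistic(p) for p in permutations(x)]
    else:
        # Assumption (not in the paper): sample random orderings when n! is too large.
        rng = random.Random(seed)
        values = [statistic(rng.sample(list(x), len(x))) for _ in range(draws)]
    return sum(v <= observed for v in values) / len(values)


trend = [1, 2, 3, 4, 5, 6]  # hypothetical, strongly ordered sample
p_lower = randomisation_p_value(trend, successive_difference_sum)
print(p_lower)
```

For this sample only the ascending and descending orders attain the smallest possible value, so the lower-tail proportion is 2/720 and the hypothesis of random order would be rejected at the .01 limit.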

The test function cannot usefully be symmetrical in $(x_1, x_2, \ldots, x_n)$ because then all the n! values would be the same. The essence of the von Neumann ratio d is that it is not symmetrical (for n > 2) as it assumes that the sample elements are arrayed in a particular way. In fact, assuming, without loss of generality, that:

$$\sum_{i=1}^{n} x_i = 0 \tag{1}$$

as will always be the case with OLS residuals, d is given by

$$d = \sum_{i=2}^{n} (x_i - x_{i-1})^2 \Big/ \sum_{i=1}^{n} x_i^2 = N/D \tag{2}$$

The numerator N is asymmetrical, the denominator D symmetrical, i.e., D has the same value in all permutations. We need concern ourselves only with the numerator N. It is the fact of constant D that makes the moments of d exactly calculable. This is also the classical case when the sample is a normal one because then d is a homogeneous function of degree zero, with $nr^2 = \sum x^2$ in the denominator. The fact that when the sample is normal r is independent of d (Geary, 1933) makes the exact calculation of the moments of d, and hence the estimation of the frequency of d (as by Durbin-Watson) possible.

If f is any polynomial function of $(x_1, x_2, \ldots, x_n)$ ordered in a particular way the randomisation mean M(f) of f is the sum of f for all the permutations divided by n!. To find the mean of $d^2$ given by (2) or, in effect, $N^2$ we have to deal with terms in $x_i^4$, $x_i^3 x_j$, $x_i^2 x_j^2$, $x_i^2 x_j x_k$ and $x_i x_j x_k x_l$, all subscripts different. On taking means we may disregard subscripts and insert mean values of these terms, having regard only to exponents. These mean values may be written (in a notation which is obvious) (4), (31), (22), (211), (1111). Note that (31) = (13) etc.

Square (1) and take means. There are n terms of type $x_i^2$ and $n(n-1)/2$ of type $x_i x_j$. Hence:

$$n(2) + (11)[2n(n-1)]/2 = 0 \tag{3}$$

or

$$(11) = -(2)/(n-1) \tag{4}$$

As in (4), we can express all terms in two or more variables in single variable expressions. As an example of the method of derivation, we have

$$\sum_{i} x_i^3 \sum_{j} x_j = 0 \tag{5}$$

Multiplying out and taking means:

$$n(4) + n(n-1)(31) = 0 \tag{6}$$

or

$$(31) = -(4)/(n-1) \tag{7}$$

The derivation of other randomisation means we need is a little more complicated. We shall be content to give the results:

$$\begin{aligned}
(22) &= [n(2)^2 - (4)]/(n-1) \\
(211) &= [2(4) - n(2)^2]/[(n-1)(n-2)] \\
(1111) &= 3[n(2)^2 - 2(4)]/[(n-1)(n-2)(n-3)]
\end{aligned} \tag{8}$$
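The randomisation means (11), (31), (22), (211) and (1111) can be checked by brute force on a small sample. The sketch below assumes a hypothetical six-value sample that sums to zero, as (1) requires, and uses exact rational arithmetic; `power_mean` is an illustrative helper, not a function from the paper. It relies on the fact that averaging a product such as $x_1^2 x_2 x_3$ over all permutations is the same as averaging over all ordered choices of distinct subscripts.

```python
from fractions import Fraction
from itertools import permutations


def power_mean(x, exponents):
    """Mean of x_i^a * x_j^b * ... over all ordered tuples of distinct subscripts."""
    total, count = Fraction(0), 0
    for idx in permutations(range(len(x)), len(exponents)):
        term = Fraction(1)
        for i, e in zip(idx, exponents):
            term *= x[i] ** e
        total += term
        count += 1
    return total / count


# Hypothetical sample, chosen only so that the values sum to zero as in (1).
x = [Fraction(v) for v in (3, -1, 4, -5, 2, -3)]
n = len(x)
m2, m4 = power_mean(x, (2,)), power_mean(x, (4,))

# Each entry pairs the enumerated mean with the closed form in terms of (2) and (4).
checks = {
    "(11)": (power_mean(x, (1, 1)), -m2 / (n - 1)),
    "(31)": (power_mean(x, (3, 1)), -m4 / (n - 1)),
    "(22)": (power_mean(x, (2, 2)), (n * m2**2 - m4) / (n - 1)),
    "(211)": (power_mean(x, (2, 1, 1)), (2 * m4 - n * m2**2) / ((n - 1) * (n - 2))),
    "(1111)": (power_mean(x, (1, 1, 1, 1)),
               3 * (n * m2**2 - 2 * m4) / ((n - 1) * (n - 2) * (n - 3))),
}
for name, (enumerated, closed_form) in checks.items():
    print(name, enumerated, closed_form, enumerated == closed_form)
```

Because the arithmetic is exact, each enumerated mean should equal its closed form identically, not merely to rounding error.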

From (2),

$$D = n(2) \tag{9}$$

Expanding the numerator of (2):

$$N = (x_1^2 + x_n^2) + 2\sum_{i=2}^{n-1} x_i^2 - 2\sum_{i=2}^{n} x_i x_{i-1} \tag{10}$$

Hence, taking means,

$$M(N) = 2(2) + 2(n-2)(2) + 2(n-1)(2)/(n-1), \tag{11}$$

using (4). Hence M(N) = 2n(2). Then:

$$M(d) = M(N)/D = 2, \tag{12}$$

using (9).

The algebra of the calculation of $M(d^2)$ or, in effect, $M(N^2)$ is onerous but the result is simple. We regard N, given by the right side of (10), as three terms (A + B + C) with square $(A^2 + 2AB + \ldots + C^2)$ and aggregate the terms, having regard to coefficients and numbers of terms of each kind, $x_i^4$, $x_i^3 x_j$ etc., which, on taking means, are replaced by (4), (31) etc. Then, gathering terms we find:

$$\begin{aligned}
M(N^2) = {} & 2(2n-3)(4) - 8(2n-3)(31) + 2(2n^2 - 4n + 3)(22) \\
& - 8(n-2)(n-3)(211) + 4(n-2)(n-3)(1111).
\end{aligned} \tag{13}$$

Using (7) and (8) and collecting terms:

$$M(N^2) = 2n[(2n^2 - 3)(2)^2 - (4)]/(n-1). \tag{14}$$

As $M(d^2) = M(N^2)/D^2$ with D = n(2):

$$\operatorname{var}(d) = M(d^2) - [M(d)]^2 = \frac{2(2n - 3 - b_2)}{n(n-1)} \tag{15}$$

where $b_2 = (4)/(2)^2$, the familiar kurticity statistic in normal theory in which in fact its population value $\beta_2$ is 3.
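Substituting (12) and (14) into $\operatorname{var}(d) = M(d^2) - [M(d)]^2$, with D = n(2), gives $2(2n - 3 - b_2)/[n(n-1)]$. As a rough check, the sketch below enumerates all n! orderings of a hypothetical zero-sum sample of six values and compares the exact mean and variance of d with 2 and with this closed form. The sample and the function names are assumptions made for illustration only.

```python
from fractions import Fraction
from itertools import permutations


def d_ratio(x):
    """von Neumann ratio (2) for values that already sum to zero."""
    numerator = sum((x[i] - x[i - 1]) ** 2 for i in range(1, len(x)))
    return numerator / sum(v * v for v in x)


def randomisation_moments(x):
    """Exact mean and variance of d over all n! orderings of x."""
    values = [d_ratio(p) for p in permutations(x)]
    mean = sum(values) / len(values)
    variance = sum(v * v for v in values) / len(values) - mean**2
    return mean, variance


def variance_formula(n, b2):
    """Closed form for var(d): 2(2n - 3 - b2) / [n(n - 1)]."""
    return 2 * (2 * n - 3 - b2) / (n * (n - 1))


x = [Fraction(v) for v in (3, -1, 4, -5, 2, -3)]  # hypothetical sample, sum zero
n = len(x)
m2 = sum(v**2 for v in x) / n
m4 = sum(v**4 for v in x) / n
b2 = m4 / m2**2

mean, variance = randomisation_moments(x)
print(mean, variance, variance_formula(n, b2))
```

Exact fractions are used so that agreement is an identity and not a floating-point coincidence; enumeration is only practical for small n, which is the reason a closed form is useful.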

As a check on the quite elaborate, if elementary, algebra, consider the case of n = 2. There is then but a single value of d given by (2), for in this case d is symmetrical in $(x_1, x_2)$. $(4) = (x_1^4 + x_2^4)/2 = x_1^4$ since $x_1 + x_2 = 0$, and $(2) = x_1^2$. Hence $b_2 = 1$. Substituting then n = 2 and $b_2 = 1$ in (15) we find var(d) = 0, as we should.

Of course (15) is $O(n^{-1})$, which means that, with increasing n, d tends in probability towards 2 (see (12)). What, as announced above, is remarkable is that the coefficient of $b_2$ is $O(n^{-2})$. This implies that the variance is nearly independent of the frequency distribution from which the random sample of n (arrayed in any order) is drawn. As an example take n = 20 (one would scarcely be interested in e.g., residual autocorrelation for fewer observations) and $b_2$ = 1 and 6, a range probably covering most distributions. Values of the standard deviation (= square root of variance) of d given by (15) are:

| Value of $b_2$ | 1 | 6 |
|---|---|---|
| s.d. of d | 0.4353 | 0.4039 |

The difference is of small importance, having regard to the uses of the statistic d.

There is an interest in comparing the lower critical probability points derived from randomisation with the well-known $d_L$ of Durbin-Watson (1951). Assuming normality, $b_2 = 3$ and, since the mean is 2, lower .05 and .01 NHP critical points would be given respectively by $(2 - 1.9600s) = c_1$ and $(2 - 2.5759s) = c_2$, s being the randomisation standard deviation, $s^2$ given by (15). Following are comparisons with $d_L$ for various values of n, with k', the number of regression terms, equalling 1 and 2:

.01 probability

| n | s | $c_2$ | Durbin-Watson $d_L$, k' = 1 | Durbin-Watson $d_L$, k' = 2 |
|---|---|---|---|---|
| 20 | 0.4230 | 0.91 | 0.95 | 0.86 |
| 30 | 0.3523 | 1.09 | 1.13 | 1.07 |
| 40 | 0.3080 | 1.21 | 1.25 | 1.20 |
| 50 | 0.2770 | 1.29 | 1.32 | 1.28 |
| 60 | 0.2538 | 1.35 | 1.38 | 1.35 |
| 70 | 0.2356 | 1.40 | 1.43 | 1.40 |
| 80 | 0.2207 | 1.43 | 1.47 | 1.44 |
| 90 | 0.2084 | 1.46 | 1.50 | 1.47 |
| 100 | 0.1980 | 1.49 | 1.53 | 1.50 |
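The randomisation standard deviation and the lower critical points are simple enough to compute directly. The sketch below assumes the closed form $\operatorname{var}(d) = 2(2n - 3 - b_2)/[n(n-1)]$ that follows from (12) and (14), and the normal multipliers 1.9600 and 2.5759 quoted in the text; the function names are illustrative.

```python
import math


def randomisation_sd(n, b2=3.0):
    """Standard deviation of d under randomisation: sqrt(2(2n - 3 - b2) / [n(n - 1)])."""
    return math.sqrt(2 * (2 * n - 3 - b2) / (n * (n - 1)))


def lower_critical_points(n, b2=3.0):
    """(c1, c2): lower .05 and .01 points, treating the n! values as normally distributed."""
    s = randomisation_sd(n, b2)
    return 2 - 1.9600 * s, 2 - 2.5759 * s


# Sensitivity to kurtosis at n = 20.
for b2 in (1, 6):
    print(f"b2 = {b2}: s.d. of d = {randomisation_sd(20, b2):.4f}")

# s, c1 and c2 under normality (b2 = 3) for n = 20, 30, ..., 100.
for n in range(20, 101, 10):
    c1, c2 = lower_critical_points(n)
    print(f"n = {n:3d}  s = {randomisation_sd(n):.4f}  c1 = {c1:.4f}  c2 = {c2:.4f}")
```

At n = 20 this gives standard deviations of 0.4353 and 0.4039 for $b_2$ = 1 and 6, and s = 0.4230 under normality.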


The regularity in relationship is marked: from n = 30 on in no case do the c figures differ from the DW by more than 0.04. The differences, however, though small, are systematic, not random.

Clearly the correspondence between the c and the $d_L$ is good. As is well known, in Durbin-Watson there is a zone of indecision characterised by limits $d_L$ and $d_U$. As the previous table shows, the randomisation approach strongly favours $d_L$. The difference between $d_L$ and $d_U$ is the greater the smaller the value of n so that in the following table attention is confined to n = 20, 30, 40.

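| n | .05, k' = 1: $d_L - c_1$ | .05, k' = 1: $d_U - c_1$ | .05, k' = 2: $d_L - c_1$ | .05, k' = 2: $d_U - c_1$ | .01, k' = 1: $d_L - c_2$ | .01, k' = 1: $d_U - c_2$ | .01, k' = 2: $d_L - c_2$ | .01, k' = 2: $d_U - c_2$ |
|---|---|---|---|---|---|---|---|---|
| 20 | .03 | .24 | -.07 | .27 | .04 | .24 | -.05 | .36 |
| 30 | .04 | .18 | -.03 | .26 | .04 | .17 | -.02 | .25 |
| 40 | .04 | .14 | -.01 | .20 | .04 | .13 | -.01 | .19 |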

There is not a shadow of doubt that, with k' = 1, 2, $d_L$ is closer to the randomisation value than is $d_U$, to the extent that there seems no point in referring to $d_U$ in testing for absence of residual autoregression. As the randomisation formula is almost distribution-free (as regards the disturbances) the same may now be said of the $d_L$ series. Obviously the randomisation limits ($c_1$ and $c_2$) themselves can confidently be used for adjudging significance if too much precision be not attached to the null-hypothesis probability (i.e. to use "near" .05, .01 etc. instead of exact .05, .01 etc.). When n is not too small the foregoing assumption that the n! values are normally distributed is close enough for the purpose of the approximate probability statement.

In their original 1951 paper Durbin and Watson furnish tables only for the lower critical values of the von Neumann ratio, namely, $d_L$ and $d_U$. The upper critical values should also be considered: the null hypothesis (i.e., no residual autocorrelation) should also be rejected when the actual ratio exceeds $4 - d_L$ (4 being the algebraic maximum of the ratio). The latter situation is far more rare, in actual practice, than the former. It can occur only when consecutive values of the calculated disturbances tend to oscillate in sign from plus to minus, when, in fact, the change in signs test tau (Geary 1970)¹ is near n, the number of observations, instead of near n/2 when the disturbance values are in random order.
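The upper-tail case can be illustrated with a deliberately extreme, hypothetical series that alternates in sign at every step. The sketch below computes d and a plain count of sign changes. Treating that count as the tau statistic of Geary (1970) is an assumption made for illustration, and the value $d_L$ = 0.95 is the .01 probability, k' = 1, n = 20 entry quoted in the comparison table.

```python
def d_ratio(x):
    """von Neumann ratio (2), after centring so that the values sum to zero."""
    mean = sum(x) / len(x)
    z = [v - mean for v in x]
    numerator = sum((z[i] - z[i - 1]) ** 2 for i in range(1, len(z)))
    return numerator / sum(v * v for v in z)


def sign_changes(x):
    """Number of consecutive pairs with opposite signs (assumed stand-in for tau)."""
    return sum(1 for a, b in zip(x, x[1:]) if a * b < 0)


n = 20
oscillating = [(-1.0) ** i for i in range(n)]  # hypothetical disturbances: +1, -1, +1, ...
d = d_ratio(oscillating)
upper_limit = 4 - 0.95  # 4 - dL, with dL = 0.95 (.01 probability, k' = 1, n = 20)
print(d, sign_changes(oscillating), d > upper_limit)
```

Here every one of the 19 consecutive pairs changes sign and d = 76/20 = 3.8, well above the upper limit of 3.05, so the hypothesis of no residual autocorrelation would be rejected in the upper tail.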

1. None of the several papers on the tau test questioned the validity of the randomisation test applied to OLS disturbances. All comments bore on the discriminatory power of the test, compared with the DW and other tests. None of the commentators objected


The main point is, however, that, at the upper rejection limit, the discrepancies between the normal theory randomisation critical values and the DW values are exactly those shown in the previous tables.

Comparison between randomisation and DW null hypothesis limits has been confined to k' = 1 and k' = 2, i.e., for estimated OLS disturbances after removal of one and two indvars respectively. Comparison is not so satisfactory for regressions with more than two indvars. The nub of the problem raised in this paper and stated at the outset is that we have but one randomisation critical limit for given probability; there are k' values of $d_L$. With randomisation the manner by which the OLS disturbances are estimated (i.e., without regard to number of indvars) is disregarded: we simply have a time sequence of numbers (positive and negative since their sum must be zero) and the null hypothesis is that they are probably in non-random order. The fact that the estimated disturbances are related because their sum is zero is of no importance unless n is small. Basically the approach is the same in using the tau test. On the classical approach, on the contrary, with k' indvars there are (k' + 1) linear relations between the disturbances so that in significance-testing the number of independent variables involved is not n but (n - k' - 1), the number of degrees of freedom.

An obvious approach is as follows. If, in (15), one regards n as number of degrees of freedom established in the ordinary way, would the resulting values of $c_1$ and $c_2$ correspond to the Durbin-Watson values of $d_L$ for k' = 1, 2, ..., 5? The answer is "No". Attention may be confined to the most testing case of number of observations equal to 20.

With 1 to 5 indvars, degrees of freedom in the disturbance range from 18 to 14 and using (15) and, always assuming normality, the $c_1$ values for n = 18 and 14 are respectively 1.13 and 1.04, range 0.09. But the Durbin-Watson range for $d_L$ from k' = 1 to k' = 5 is 0.41. This comparison relates to probability .05. At .01 probability the contrast is also marked: for $c_2$ the range is 0.15, for $d_L$ 0.35. If any correction for number of degrees of freedom is necessary using randomisation, it will not be on these lines.
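The degrees-of-freedom substitution at .05 probability can be reproduced with a short calculation. This sketch assumes the closed form $\operatorname{var}(d) = 2(2n - 3 - b_2)/[n(n-1)]$ with $b_2 = 3$ and simply replaces n by the degrees of freedom 20 - k' - 1; the function name `c1_point` is illustrative.

```python
import math


def c1_point(n, b2=3.0):
    """Lower .05 randomisation point: 2 - 1.9600 * s, with s^2 = 2(2n - 3 - b2) / [n(n - 1)]."""
    s = math.sqrt(2 * (2 * n - 3 - b2) / (n * (n - 1)))
    return 2 - 1.9600 * s


observations = 20
for k in range(1, 6):
    dof = observations - k - 1  # 18 down to 14
    print(f"k' = {k}: degrees of freedom = {dof}, c1 = {c1_point(dof):.2f}")

c1_range = round(c1_point(18), 2) - round(c1_point(14), 2)
print(f"range of c1 = {c1_range:.2f}")
```

The two end values round to 1.13 and 1.04, a range of 0.09, which is far short of the 0.41 range quoted for the Durbin-Watson $d_L$ over the same k'.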


REFERENCES

DURBIN, J. and G. S. WATSON, 1951. "Testing for serial correlation in least squares regression II", Biometrika, Vol. 38.

GEARY, R. C., 1933. "A general expression for the moments of certain symmetrical functions of normal samples", Biometrika, Vol. XXV.

GEARY, R. C., 1952. "The contiguity ratio and statistical mapping", The Incorporated Statistician, Vol. 5, No. 3.

GEARY, R. C., 1970. "Relative efficiency of count of sign changes for assessing residual autoregression in least squares regression", Biometrika, Vol. 57.
