Inference for Noisy Samples


In Honour of Distinguished Professor Dr. Mir Masoom Ali On the Occasion of his 75th Birthday Anniversary PJSOR, Vol. 8, No. 3, pages 381-391, July 2012


Ibrahim A. Ahmad

Department of Statistics, Oklahoma State University, Stillwater, Oklahoma 74078

N. Herawati

Department of Mathematics, University of Lampung, Bandar Lampung, Indonesia 35145

Abstract

In the current work, some well-known inference procedures, including testing and estimation, are adjusted to accommodate noisy data that lead to nonidentically distributed samples. The two main cases addressed are the Poisson and the normal distributions, in both the one-sample and the two-sample settings. Other cases, including the exponential and the Pareto distributions, are briefly mentioned. In the Poisson case, the situation in which the sample size is random is also discussed.

Keywords: Noisy samples, Hypothesis testing and estimation, Poisson, Normal, Exponential, and Pareto.

1. Introduction

Statistical inference, with its estimation and hypothesis testing routes, is usually based on the concept of “random samples”. This means that the data are assumed to be “independent identically distributed” (iid) random variables. Although this setting is becoming less and less practical, it is still the way inference is presented to statistics students via textbooks, even at the graduate level; see Rohatgi and Saleh (2001) and Casella and Berger (2002). When the iid structure is violated, there are no guarantees that standard accepted procedures will continue to hold, and modifications often result in only partial and/or weaker solutions. This need not be the case in some noisy sampling schemes that are now in use. In the current work, it is shown that for some non-iid sampling schemes in the Poisson and normal cases, standard procedures can be modified while keeping the main characteristics of the estimates or test statistics intact. The sampling we discuss here includes the now popular rate sampling, cf. Thode (1997) and Ng and Tang (2005) for the Poisson distribution, and Moser et al. (1989) and Sprott and Farewell (1993) for the normal distribution, among others.

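In rate sampling in the Poisson case, the data are independent Poisson random variables $X_i$ such that $E(X_i) = \lambda C_i$, $i = 1, \dots, n$. In the Gaussian case, we have $E(X_i) = \mu C_i$ and $V(X_i) = \sigma^2 C_i^2$, or, in the homogeneous variance case, just $V(X_i) = \sigma^2$.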

Variations of rate sampling in the Poisson case have been researched in the literature, including work by Detre and White (1970), Sichel (1973), Thode (1997), Krishnamoorthy and Thomson (2004), and Ng and Tang (2005). Rate sampling in the normal case with homogeneous variance is known as regression through the origin. The celebrated Behrens-Fisher problem is a two-sample variation, and solutions for several different forms of it were discussed by Bernard (1982), Moser et al. (1989), Sprott and Farewell (1993), and Moser and Stevens (1992).

2. Poisson Inference

Let $X_1, \dots, X_n$ denote a sample of independent observations from Poisson distributions such that $X_i \sim P(\lambda_i)$ with $\lambda_i = \lambda C_i$, $i = 1, \dots, n$, where $C_i > 0$ is a known constant. Our task is to do inference about $\lambda$. The usual iid case falls at $C_1 = \dots = C_n = 1$.

The likelihood function is:

$$L(X, C; \lambda) = \prod_{i=1}^{n} \frac{e^{-\lambda C_i} (\lambda C_i)^{X_i}}{X_i!}. \tag{1}$$

Clearly one obtains the maximum likelihood estimate (MLE) of $\lambda$ to be

$$\hat\lambda_{MLE} = \hat\lambda_1 = \frac{\sum_i X_i}{\sum_i C_i}.$$

This estimate is complete and sufficient for $\lambda$ as well as unbiased, so it is the uniformly minimum variance unbiased estimate (UMVUE) of $\lambda$. This estimate suggests the following class of unbiased estimates for $\lambda$:

$$\hat\lambda_r = \frac{\sum_i C_i^{r-1} X_i}{\sum_i C_i^{r}}, \quad r \ge 0 \text{ is an integer.}$$

Now clearly,

$$V(\hat\lambda_r) = \lambda \frac{\sum_i C_i^{2r-1}}{\left(\sum_i C_i^{r}\right)^2}. \tag{2}$$
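As an illustration, the class of estimates and the variance formula (2) can be sketched in a few lines of Python. This is only a minimal sketch: the function names `lambda_hat_r` and `var_lambda_hat_r` and the data values are hypothetical and are not taken from the paper.

```python
def lambda_hat_r(x, c, r):
    """Estimate sum(C_i^(r-1) * X_i) / sum(C_i^r) of the Poisson rate."""
    num = sum(ci ** (r - 1) * xi for xi, ci in zip(x, c))
    den = sum(ci ** r for ci in c)
    return num / den


def var_lambda_hat_r(lam, c, r):
    """Variance lam * sum(C_i^(2r-1)) / (sum(C_i^r))^2, as in (2)."""
    return lam * sum(ci ** (2 * r - 1) for ci in c) / sum(ci ** r for ci in c) ** 2


# Hypothetical data: counts X_i observed with known constants C_i.
counts = [3, 7, 12]
exposures = [1.0, 2.0, 4.0]

mle = lambda_hat_r(counts, exposures, 1)  # r = 1 gives sum(X) / sum(C)
# Variances of the estimates for r = 0, 1, 2, 3 at an assumed true rate of 2.
variances = [var_lambda_hat_r(2.0, exposures, r) for r in range(4)]
```

With these assumed values, the variance for r = 1 is the smallest of the four, which agrees with the UMVUE property of the MLE.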

But since $\hat\lambda_1$ is the UMVUE, we have

$$V(\hat\lambda_1) = \frac{\lambda}{\sum_i C_i} \le V(\hat\lambda_r) = \lambda \frac{\sum_i C_i^{2r-1}}{\left(\sum_i C_i^{r}\right)^2}. \tag{3}$$

Hence, as a by-product, we have a simple proof of the inequality:

$$\left(\sum_i C_i^{r}\right)^2 \le \left(\sum_i C_i\right)\left(\sum_i C_i^{2r-1}\right), \tag{4}$$

where $C_i > 0$, $i = 1, \dots, n$. Note also that in the Poisson case $E(X_i) = V(X_i)$, $i = 1, \dots, n$. Thus we might try to find an unbiased estimate of $\lambda$ based on the sample variance. Precisely we have:

THEOREM (1): The following estimate is an unbiased estimate of $\lambda$:

$$\hat{\hat\lambda} = \frac{\sum_i C_i}{\left(\sum_i C_i\right)^2 - \sum_i C_i^2} \sum_i \left(X_i - C_i\hat\lambda\right)^2. \tag{5}$$

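PROOF: Note that

$$E\sum_i \left(X_i - C_i\hat\lambda\right)^2 = \sum_i E X_i^2 - 2\sum_i C_i E\left(X_i\hat\lambda\right) + \sum_i C_i^2 E\left(\hat\lambda^2\right). \tag{6}$$

But

$$E X_i^2 = \lambda C_i + \lambda^2 C_i^2, \quad E\left(X_i\hat\lambda\right) = \frac{\lambda C_i}{\sum_j C_j} + \lambda^2 C_i, \quad \text{and} \quad E\left(\hat\lambda^2\right) = V(\hat\lambda) + [E(\hat\lambda)]^2 = \frac{\lambda}{\sum_j C_j} + \lambda^2.$$

Summing and substituting into (6) yield that

$$E\sum_i \left(X_i - C_i\hat\lambda\right)^2 = \lambda\,\frac{\left(\sum_i C_i\right)^2 - \sum_i C_i^2}{\sum_i C_i}.$$

The result now follows.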
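The unbiasedness in Theorem (1) can also be checked numerically without simulation. The sketch below is only illustrative, and its function names are hypothetical. It uses the fact that $X_i - C_i\hat\lambda$ is a linear combination $\sum_k a_{ik} X_k$ with $a_{ik} = \delta_{ik} - C_i/\sum_j C_j$, which has mean zero and variance $\sum_k a_{ik}^2 \lambda C_k$.

```python
from statistics import variance


def lam_hat_hat(x, c):
    """Estimate (5): sum(C) / (sum(C)^2 - sum(C^2)) * sum((X_i - C_i*lam_hat)^2)."""
    s1 = sum(c)
    s2 = sum(ci * ci for ci in c)
    lam_hat = sum(x) / s1
    rss = sum((xi - ci * lam_hat) ** 2 for xi, ci in zip(x, c))
    return s1 / (s1 * s1 - s2) * rss


def expected_lam_hat_hat(lam, c):
    """Exact mean of (5) when X_k ~ Poisson(lam * C_k) are independent."""
    s1 = sum(c)
    s2 = sum(ci * ci for ci in c)
    total = 0.0
    for i, ci in enumerate(c):
        for k, ck in enumerate(c):
            a_ik = (1.0 if i == k else 0.0) - ci / s1
            total += a_ik ** 2 * lam * ck  # contribution to Var(X_i - C_i*lam_hat)
    return s1 / (s1 * s1 - s2) * total


# Hypothetical constants C_i and an assumed true rate.
mean_value = expected_lam_hat_hat(2.5, [0.5, 1.0, 2.0, 3.5])

# In the iid case (all C_i = 1) the estimate reduces to the sample variance.
iid_value = lam_hat_hat([3, 1, 4, 1, 5], [1, 1, 1, 1, 1])
sample_var = variance([3, 1, 4, 1, 5])
```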

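Let us briefly discuss the relation between $\hat\lambda_1 = \hat\lambda$ and $\hat{\hat\lambda}$. Since $\hat\lambda$ is the UMVUE of $\lambda$, if $E(\hat{\hat\lambda} \mid \hat\lambda) = \varphi(\hat\lambda)$, then, by completeness, $\varphi$ must be the identity function, i.e. $\varphi(\hat\lambda) = \hat\lambda$. Hence

$$\mathrm{Cov}(\hat\lambda, \hat{\hat\lambda}) = E(\hat\lambda\hat{\hat\lambda}) - \lambda^2 = E\left[\hat\lambda\, E(\hat{\hat\lambda} \mid \hat\lambda)\right] - \lambda^2 = E(\hat\lambda^2) - \lambda^2 = V(\hat\lambda). \tag{7}$$

Hence $\hat\lambda$ and $\hat{\hat\lambda}$ are positively correlated, with correlation coefficient $\rho(\hat\lambda, \hat{\hat\lambda}) = \{V(\hat\lambda)/V(\hat{\hat\lambda})\}^{1/2}$. In the special case $C_1 = \dots = C_n = 1$, $V(\hat{\hat\lambda}) = \lambda/n + 2\lambda^2/(n-1)$, thus

$$\rho(\hat\lambda, \hat{\hat\lambda}) = \frac{1}{\sqrt{1 + 2n\lambda/(n-1)}} \to \frac{1}{\sqrt{1 + 2\lambda}} \quad \text{as } n \to \infty. \tag{8}$$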
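A short numerical sketch of (8) in the iid case may help to see how quickly the limit is approached. The function names are hypothetical; the formulas are the ones stated in the text.

```python
import math


def corr_iid(n, lam):
    """Correlation between the sample mean and sample variance, iid Poisson case."""
    var_mean = lam / n                           # V(lam_hat)
    var_s2 = lam / n + 2 * lam ** 2 / (n - 1)    # V(lam_hat_hat)
    return math.sqrt(var_mean / var_s2)


def corr_limit(lam):
    """Limit of the correlation as n grows: 1 / sqrt(1 + 2*lam)."""
    return 1 / math.sqrt(1 + 2 * lam)


# Correlation for an assumed rate of 2 at a few sample sizes.
values = [corr_iid(n, 2.0) for n in (5, 50, 500)]
```

The correlation decreases as the rate grows, so the two estimates carry more distinct information for larger $\lambda$.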

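We may combine $\hat\lambda$ and $\hat{\hat\lambda}$ to obtain yet another unbiased estimate of $\lambda$. Note that $E(\hat{\hat\lambda}/\hat\lambda) = E\{(1/\hat\lambda)\, E(\hat{\hat\lambda} \mid \hat\lambda)\} = 1$. Thus

$$E\left\{\frac{\sum_i C_i}{\left(\sum_i C_i\right)^2 - \sum_i C_i^2} \cdot \frac{\sum_i \left(X_i - C_i\hat\lambda\right)^2}{\hat\lambda}\right\} = 1,$$

and this simplifies to

$$E\left\{\frac{\sum_i C_i \sum_i X_i^2}{\sum_i X_i}\right\} = \lambda \sum_i C_i^2 + \frac{\left(\sum_i C_i\right)^2 - \sum_i C_i^2}{\sum_i C_i}.$$

Hence

$$\hat\lambda^{*} = \frac{\sum_i C_i}{\sum_i C_i^2}\left(\frac{\sum_i X_i^2}{\sum_i X_i} - 1\right) + \frac{1}{\sum_i C_i} \tag{9}$$

is an unbiased estimate of $\lambda$.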

Sometimes in sampling from a Poisson distribution we do inverse sampling, where the sample size is random. Let $N$ be an integer-valued random variable, not necessarily independent of the $X_i$'s. We then estimate $\lambda$ by

$$\hat\lambda_{r,N} = \frac{\sum_{i=1}^{N} C_i^{r-1} X_i}{\sum_{i=1}^{N} C_i^{r}}, \quad r \ge 0, \tag{10}$$

and

$$\hat{\hat\lambda}_N = \frac{\sum_{i=1}^{N} C_i}{\left(\sum_{i=1}^{N} C_i\right)^2 - \sum_{i=1}^{N} C_i^2} \sum_{i=1}^{N} \left(X_i - C_i\hat\lambda_{1,N}\right)^2. \tag{11}$$

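Clearly, $E(\hat\lambda_{r,N}) = E[E(\hat\lambda_{r,N} \mid N)] = \lambda$, and

$$V(\hat\lambda_{r,N}) = E[V(\hat\lambda_{r,N} \mid N)] + V[E(\hat\lambda_{r,N} \mid N)] = \lambda\, E\left[\frac{\sum_{i=1}^{N} C_i^{2r-1}}{\left(\sum_{i=1}^{N} C_i^{r}\right)^2}\right]. \tag{12}$$

In the usual case when $C_1 = \dots = C_n = 1$, we get that $V(\hat\lambda_{r,N}) = \lambda E\{1/N\}$. On the other hand,

$$V(\hat{\hat\lambda}_N) = E[V(\hat{\hat\lambda}_N \mid N)] = \lambda E\left(N^{-1}\right) + 2\lambda^2 E\left((N-1)^{-1}\right). \tag{13}$$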

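Thus the correlation coefficient between $\hat\lambda_N$ and $\hat{\hat\lambda}_N$ is

$$\rho(\hat\lambda_N, \hat{\hat\lambda}_N) = \frac{1}{\sqrt{1 + 2\lambda\, E\left((N-1)^{-1}\right)/E\left(N^{-1}\right)}} \to \frac{1}{\sqrt{1 + 2\lambda K}},$$

provided that

$$\frac{E\left((N-1)^{-1}\right)}{E\left(N^{-1}\right)} \to K > 0.$$

Finally, we can easily deduce the following unbiased estimate:

$$\hat\lambda^{*}_N = \frac{\sum_{i=1}^{N} X_i^2}{\sum_{i=1}^{N} X_i} - \left(1 - \frac{1}{N}\right).$$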

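To test $H_0: \lambda = \lambda_0$ vs. $H_1: \lambda \ne \lambda_0$ or $\lambda > \lambda_0$ or $\lambda < \lambda_0$, use either of the following test statistics, both of which are asymptotically normal:

$$Z_1 = \frac{(\hat\lambda - \lambda_0)\sqrt{\sum_i C_i}}{\sqrt{\hat\lambda}} \quad \text{or} \quad Z_2 = \frac{(\hat\lambda - \lambda_0)\sqrt{\sum_i C_i}}{\sqrt{\hat{\hat\lambda}}}. \tag{15}$$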
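A minimal Python sketch of the two one-sample statistics in (15) follows. The function names, the data and the use of a two-sided normal p-value are assumptions made for illustration only.

```python
import math
from statistics import NormalDist


def one_sample_z(x, c, lam0):
    """Return (Z1, Z2) of (15) for testing H0: lambda = lam0."""
    s1 = sum(c)
    s2 = sum(ci * ci for ci in c)
    lam_hat = sum(x) / s1                                        # MLE
    rss = sum((xi - ci * lam_hat) ** 2 for xi, ci in zip(x, c))
    lam_hat_hat = s1 / (s1 * s1 - s2) * rss                      # estimate (5)
    z1 = (lam_hat - lam0) * math.sqrt(s1 / lam_hat)
    z2 = (lam_hat - lam0) * math.sqrt(s1 / lam_hat_hat)
    return z1, z2


def two_sided_p(z):
    """Two-sided p-value from the standard normal approximation."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical counts and known constants; null value lam0 = 2.
z1, z2 = one_sample_z([3, 7, 12], [1.0, 2.0, 4.0], 2.0)
p1 = two_sided_p(z1)
```

The two statistics differ only in the variance estimate used, so they can disagree noticeably in small samples such as this one.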

Next, let us consider the two-sample case: let $X_i \sim P(C_i\lambda_1)$, $i = 1, \dots, m$, and $Y_j \sim P(D_j\lambda_2)$, $j = 1, \dots, n$. Assume the $X_i$'s and the $Y_j$'s are independent. We want to test $H_0: \lambda_1/\lambda_2 = R_0$, where $R_0$ is a known constant. Again, $\hat\lambda_1 = \sum_i X_i / \sum_i C_i$ and $\hat\lambda_2 = \sum_j Y_j / \sum_j D_j$. Further, under $H_0$, we have the likelihood function (with $\lambda_2 = \lambda$) given by:

$$L(X, Y, C, D; R_0, \lambda) = \prod_{i=1}^{m} \frac{e^{-R_0\lambda C_i}(R_0\lambda C_i)^{X_i}}{X_i!} \prod_{j=1}^{n} \frac{e^{-\lambda D_j}(\lambda D_j)^{Y_j}}{Y_j!}. \tag{16}$$

Hence (16) leads to the null estimate of $\lambda$:

$$\hat\lambda_0 = \frac{\sum_i X_i + \sum_j Y_j}{R_0\sum_i C_i + \sum_j D_j}. \tag{17}$$

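Set the test statistic:

$$Z_3 = \frac{\hat\lambda_1 - R_0\hat\lambda_2}{\sqrt{\hat\lambda_0 R_0\left(\dfrac{1}{\sum_i C_i} + \dfrac{R_0}{\sum_j D_j}\right)}} = \frac{\dfrac{\sum_i X_i}{\sum_i C_i} - R_0\dfrac{\sum_j Y_j}{\sum_j D_j}}{\sqrt{\dfrac{\sum_i X_i + \sum_j Y_j}{R_0\sum_i C_i + \sum_j D_j}\, R_0\left(\dfrac{1}{\sum_i C_i} + \dfrac{R_0}{\sum_j D_j}\right)}}. \tag{18}$$

Reject $H_0$ when $|Z_3| \ge z_{\alpha/2}$ for the two-sided alternative, or when $Z_3 \ge z_{\alpha}$ (respectively $Z_3 \le -z_{\alpha}$) for a one-sided alternative. If we would like an unbiased null variance estimate for $\lambda$, we use the following: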

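THEOREM (2): Under $H_0$, the following is an unbiased estimate for the common parameter $\lambda$:

$$\hat{\hat\lambda}_0 = \left[\frac{R_0\sum_i C_i + \sum_j D_j}{\left(R_0\sum_i C_i + \sum_j D_j\right)^2 - \left(R_0^2\sum_i C_i^2 + \sum_j D_j^2\right)}\right]\left[\sum_i \left(X_i - R_0 C_i\hat\lambda_0\right)^2 + \sum_j \left(Y_j - D_j\hat\lambda_0\right)^2\right], \tag{19}$$

where $\hat\lambda_0$ is as given in (17).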

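PROOF: Let

$$\sum_i \left(X_i - R_0 C_i\hat\lambda_0\right)^2 + \sum_j \left(Y_j - D_j\hat\lambda_0\right)^2 = \sum_i X_i^2 + \sum_j Y_j^2 - 2\hat\lambda_0\left(R_0\sum_i C_i X_i + \sum_j D_j Y_j\right) + \hat\lambda_0^2\left(R_0^2\sum_i C_i^2 + \sum_j D_j^2\right). \tag{20}$$

The result follows by noticing that:

$$E\left(\sum_i X_i^2 + \sum_j Y_j^2\right) = \lambda\left(R_0\sum_i C_i + \sum_j D_j\right) + \lambda^2\left(R_0^2\sum_i C_i^2 + \sum_j D_j^2\right), \tag{21}$$

$$E\left(R_0\hat\lambda_0\sum_i C_i X_i\right) = \frac{\lambda R_0^2\sum_i C_i^2}{R_0\sum_i C_i + \sum_j D_j} + \lambda^2 R_0^2\sum_i C_i^2, \tag{22}$$

$$E\left(\hat\lambda_0\sum_j D_j Y_j\right) = \frac{\lambda\sum_j D_j^2}{R_0\sum_i C_i + \sum_j D_j} + \lambda^2\sum_j D_j^2, \tag{23}$$

and

$$E\left(\hat\lambda_0^2\right) = V(\hat\lambda_0) + \left(E(\hat\lambda_0)\right)^2 = \frac{\lambda}{R_0\sum_i C_i + \sum_j D_j} + \lambda^2. \tag{24}$$

The result follows from (21) to (24).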

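A test statistic based on $\hat{\hat\lambda}_0$ is:

$$Z_4 = \frac{\hat\lambda_1 - R_0\hat\lambda_2}{\sqrt{\hat{\hat\lambda}_0 R_0\left(\dfrac{1}{\sum_i C_i} + \dfrac{R_0}{\sum_j D_j}\right)}}. \tag{25}$$

Reject $H_0$ when $|Z_4| \ge z_{\alpha/2}$ for the two-sided alternative, or when $Z_4 \ge z_{\alpha}$ (respectively $Z_4 \le -z_{\alpha}$) for a one-sided alternative.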
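The two-sample quantities of this section can be computed together. The following is only a sketch under the stated model: the function name `two_sample_poisson`, the dictionary keys and the data are hypothetical, and the code assumes the pooled residual sum of squares is positive so that $Z_4$ is defined.

```python
import math


def two_sample_poisson(x, c, y, d, r0):
    """Null estimates and test statistics for H0: lambda1 / lambda2 = r0."""
    sc, sd = sum(c), sum(d)
    lam1 = sum(x) / sc
    lam2 = sum(y) / sd
    s = r0 * sc + sd
    lam0 = (sum(x) + sum(y)) / s                       # null MLE of the common rate
    # Variance-based null estimate: pooled sample with constants r0*C_i and D_j.
    q = r0 ** 2 * sum(ci * ci for ci in c) + sum(dj * dj for dj in d)
    rss = sum((xi - r0 * ci * lam0) ** 2 for xi, ci in zip(x, c))
    rss += sum((yj - dj * lam0) ** 2 for yj, dj in zip(y, d))
    lam00 = s / (s * s - q) * rss
    scale = r0 * (1 / sc + r0 / sd)                    # V(lam1 - r0*lam2) / lambda
    diff = lam1 - r0 * lam2
    return {
        "lam0": lam0,
        "lam00": lam00,
        "z3": diff / math.sqrt(lam0 * scale),
        "z4": diff / math.sqrt(lam00 * scale),
    }


# Hypothetical data for the two samples and a hypothesized ratio of 2.
res = two_sample_poisson([4, 9], [1.0, 2.0], [2, 3, 5], [1.0, 1.0, 2.0], 2.0)
```

Both statistics share the same numerator, so their signs always agree; only the scale of the denominator changes with the choice of null estimate.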

3. Normal Inference

Let $X_1, X_2, \dots, X_n$ denote a sample of independent observations from normal distributions such that $X_i \sim N(\mu C_i, \sigma^2)$, $i = 1, \dots, n$. Then the likelihood function is given by

$$L(X, C, \mu, \sigma^2) = (2\pi)^{-n/2}(\sigma^2)^{-n/2}\exp\left[-\frac{1}{2\sigma^2}\sum_i (X_i - \mu C_i)^2\right]. \tag{26}$$

Hence the MLEs of $\mu$ and $\sigma^2$ are, respectively:

$$\hat\mu_C = \frac{\sum_i C_i X_i}{\sum_i C_i^2} \quad \text{and} \quad \hat\sigma^2 = \frac{1}{n}\sum_i \left(X_i - C_i\hat\mu_C\right)^2. \tag{27}$$

Clearly, $\hat\mu_C$ is sufficient, complete and unbiased for $\mu$, so it is the UMVUE, while $E(\hat\sigma^2) = (n-1)\sigma^2/n$; thus the UMVUE for $\sigma^2$ is $S_C^2 = \frac{1}{n-1}\sum_i (X_i - C_i\hat\mu_C)^2$. Furthermore, we can verify that $\hat\mu_C$ and $S_C^2$ are independent, that $\hat\mu_C$ is normal with mean $\mu$ and variance $\sigma^2/\sum_i C_i^2$, and that $(n-1)S_C^2/\sigma^2$ is $\chi^2(n-1)$, as expected, since

$$\sum_i (X_i - C_i\mu)^2 = \sum_i \left(X_i - C_i\hat\mu_C\right)^2 + \left(\hat\mu_C - \mu\right)^2\sum_i C_i^2.$$

Let us turn our attention to a general setting of the two-sample case: let $X_i \sim N(\mu_1 C_i, \sigma^2)$, $i = 1, \dots, m$, and $Y_j \sim N(\mu_2 D_j, \tau\sigma^2)$, $j = 1, \dots, n$, where $C_1, \dots, C_m$, $D_1, \dots, D_n$ and $\tau$ are known constants. We want to test $H_0: \mu_1 = \gamma\mu_2 + \delta$ for given values $\gamma$ and $\delta$. Note that the above setting includes most known cases: (i) $\gamma = 1$, $\delta = 0$ is testing $H_0: \mu_1 = \mu_2$; (ii) $\gamma = 1$, $\delta$ given is testing $H_0: \mu_1 - \mu_2 = \delta$; (iii) $\delta = 0$ and $\gamma$ given is testing $H_0: \mu_1/\mu_2 = \gamma$. As shown above, the UMVUEs for $\mu_1$ and $\mu_2$ are, respectively, $\hat\mu_1 = \sum_i C_i X_i / \sum_i C_i^2$ and $\hat\mu_2 = \sum_j D_j Y_j / \sum_j D_j^2$. Now, we need to set an unbiased null variance estimate which is independent of $\hat\mu_1$ and $\hat\mu_2$. This is done as follows:

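THEOREM (3): Under $H_0: \mu_1 = \gamma\mu_2 + \delta$ and with $\mu_2 = \mu$, the MLE estimates of $\mu$ and $\sigma^2$ (adjusted to be unbiased) are:

$$\hat\mu_0 = \frac{\gamma\sum_i C_i\left(X_i - \delta C_i\right) + \frac{1}{\tau}\sum_j D_j Y_j}{\gamma^2\sum_i C_i^2 + \frac{1}{\tau}\sum_j D_j^2}, \tag{28}$$

and

$$\hat\sigma_0^2 = \frac{1}{m+n-1}\left[\sum_i \left(X_i - C_i(\gamma\hat\mu_0 + \delta)\right)^2 + \frac{1}{\tau}\sum_j \left(Y_j - D_j\hat\mu_0\right)^2\right]. \tag{29}$$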

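PROOF:

Under $H_0$, the likelihood function of the two samples is given by:

$$L(X, Y, C, D, \gamma, \delta, \tau; \mu, \sigma^2) = \left(\frac{1}{2\pi}\right)^{(m+n)/2}(\sigma^2)^{-(m+n)/2}\,\tau^{-n/2}\exp\left\{-\frac{1}{2\sigma^2}\left[\sum_i \left(X_i - (\gamma\mu + \delta)C_i\right)^2 + \frac{1}{\tau}\sum_j \left(Y_j - \mu D_j\right)^2\right]\right\}. \tag{30}$$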

Taking the log of $L$ and differentiating with respect to $\mu$ results in (28). Then differentiating with respect to $\sigma^2$ results in $\hat\sigma_0^{*2} = \frac{m+n-1}{m+n}\hat\sigma_0^2$. Correcting the bias comes through evaluating the following:

$$E\left[\sum_i \left(X_i - C_i(\gamma\hat\mu_0 + \delta)\right)^2 + \frac{1}{\tau}\sum_j \left(Y_j - D_j\hat\mu_0\right)^2\right] = \sum_i V(X_i) + \frac{1}{\tau}\sum_j V(Y_j) - V(\hat\mu_0)\left(\gamma^2\sum_i C_i^2 + \frac{1}{\tau}\sum_j D_j^2\right) = \sigma^2(m+n-1). \tag{31}$$

But

$$V(\hat\mu_0) = \frac{\sigma^2}{\gamma^2\sum_i C_i^2 + \frac{1}{\tau}\sum_j D_j^2}.$$

The result now follows.

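Hence, set the test statistic

$$T = \frac{\hat\mu_1 - \gamma\hat\mu_2 - \delta}{\hat\sigma_0\sqrt{\dfrac{1}{\sum_i C_i^2} + \dfrac{\gamma^2\tau}{\sum_j D_j^2}}} \sim t(m+n-1). \tag{32}$$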

4. Other Examples and Final Remarks

(A) Other examples:

The MLEs and the hypothesis tests presented earlier for non-iid samples from the Poisson and the normal distributions can be extended to various other cases. We illustrate this by two further examples.

Example (1): Let $X_i \sim \exp(\theta C_i, \sigma)$ with p.d.f. $f_i(x) = \sigma^{-1}\exp[-(x - \theta C_i)/\sigma]$, where $x \ge \theta C_i$, $C_i > 0$ and $\sigma > 0$. The likelihood function is:

$$L(X, C; \theta, \sigma) = \sigma^{-n}\exp\left[-\frac{1}{\sigma}\sum_i (X_i - \theta C_i)\right], \quad \min_{1 \le i \le n}\left(\frac{X_i}{C_i}\right) \ge \theta. \tag{33}$$

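Hence the MLE of $\theta$ is $\hat\theta = \min_{1 \le i \le n}(X_i/C_i)$ and of $\sigma$ is $\hat\sigma = (1/n)\sum_i (X_i - C_i\hat\theta)$. But clearly,

$$E(\hat\theta) = \theta + \frac{\sigma}{\sum_i C_i} \quad \text{and} \quad E(\hat\sigma) = \frac{n-1}{n}\sigma.$$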

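This leads to the UMVUEs of $\theta$ and $\sigma$ given by:

$$\hat\theta_{UB} = \min_{1 \le i \le n}\left(\frac{X_i}{C_i}\right) - \frac{n}{n-1}\cdot\frac{\hat\sigma}{\sum_i C_i} \tag{34}$$

and

$$\hat\sigma_{UB} = \frac{1}{n-1}\sum_i \left(X_i - C_i\hat\theta\right). \tag{35}$$

When $C_1 = C_2 = \dots = C_n = 1$, we get the known unbiased estimate of $\theta$,

$$\hat\theta_{UB} = \frac{nX_{(1)} - \bar X}{n-1}.$$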

Example (2): Let $X_i \sim f_i(x) = (\theta C_i)^{-1}$, $0 < x < \theta C_i$, $i = 1, 2, \dots, n$.

Thus the likelihood function is

$$L(X, C; \theta) = \prod_{i=1}^{n}(\theta C_i)^{-1}, \quad \max_{1 \le i \le n}\left(X_i C_i^{-1}\right) \le \theta. \tag{36}$$

Hence the MLE of $\theta$ is

$$\hat\theta = \max_{1 \le i \le n}\left(X_i C_i^{-1}\right). \tag{37}$$

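To find the UMVUE of $\theta$ we need to adjust for bias. Note that, with $F_i$ denoting the distribution function of $X_i$,

$$E(\hat\theta) = \int_0^{\theta} P(\hat\theta > u)\,du = \int_0^{\theta}\left[1 - \prod_i F_i(C_i x)\right]dx = \frac{n}{n+1}\theta. \tag{38}$$

Hence $\hat\theta_{UB} = \frac{n+1}{n}\max_{1 \le i \le n}\left(X_i C_i^{-1}\right)$ is the UMVUE of $\theta$.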

(B) Some Final Remarks:

(i) In the normal two-sample case, there is a major outstanding problem known as the “Behrens-Fisher” problem: when $\sigma_1^2 \ne \sigma_2^2$, there is no estimate for $\sigma_1^2/m + \sigma_2^2/n$ that can lead to a t-statistic, cf. Rohatgi and Saleh (2001). There are many suggestions of partial solutions. In this context, we offered one when we can assume $\sigma_2^2 = \tau\sigma_1^2$, see Section 3. Intermediate, approximate or partial solutions have been offered in the literature. For example, when one variance is known, say $\sigma_1^2$, or in the general case of $\sigma_1^2$, $\sigma_2^2$ both unknown, Satterthwaite (1941) and (1946) proposed a method that provides an approximate $\chi^2$ distribution for the standardized estimate $\nu^{*}\left(S_1^2/m + S_2^2/n\right)/\left(\sigma_1^2/m + \sigma_2^2/n\right)$, with degrees of freedom

$$\nu^{*} = \frac{\left(\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}\right)^2}{\dfrac{\sigma_1^4}{m^2(m-1)} + \dfrac{\sigma_2^4}{n^2(n-1)}}.$$

Hence $\nu^{*}$ could be estimated by replacing $\sigma_1^2$ and $\sigma_2^2$ by $S_1^2$ and $S_2^2$, respectively; see also Ames and Webster (1991) and Moser and Stevens (1992). The case when $\sigma_1^2$ is given was discussed by Maity and Sherman (2006). The value of $\nu^{*}$ is obtained via matching

$$\left(\nu^{*}\right)^2\, V\left(\frac{\sigma_1^2}{m} + \frac{S_2^2}{n}\right) \Big/ \left(\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}\right)^2 = 2\nu^{*}.$$

This gives

$$\nu^{*} = \left(\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}\right)^2 \Big/ \left\{\frac{\sigma_2^4}{n^2(n-1)}\right\}.$$

Then one replaces $\sigma_2^2$ with its estimate $S_2^2$. The above method is based on matching the variance of the proposed (standardized) estimate of $\sigma_1^2/m + \sigma_2^2/n$ to the variance of a Chi-square, solving for the induced parameter, and then estimating it. Needless to say, matching variances is not enough to guarantee matching of distributions. In a subsequent work we will provide a new method based on matching the distributions.
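For concreteness, the estimated Satterthwaite degrees of freedom described in remark (i), with the unknown variances replaced by the sample variances, can be computed as in the following sketch. The function name and the input values are hypothetical.

```python
def satterthwaite_df(s1_sq, s2_sq, m, n):
    """Approximate degrees of freedom for S1^2/m + S2^2/n (variance matching)."""
    a = s1_sq / m
    b = s2_sq / n
    return (a + b) ** 2 / (a * a / (m - 1) + b * b / (n - 1))


# Equal variances and equal sample sizes give 2*(m - 1) degrees of freedom.
df_equal = satterthwaite_df(4.0, 4.0, 10, 10)

# Unequal case with assumed sample variances 1 and 9 and sizes 8 and 12.
df_unequal = satterthwaite_df(1.0, 9.0, 8, 12)
```

The value always lies between min(m - 1, n - 1) and m + n - 2, so it interpolates between relying on one sample only and pooling both.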

(ii) Note that the essence of the Behrens-Fisher problem for testing $H_0: \mu_1 = \gamma\mu_2 + \delta$, with $\gamma$, $\delta$ known constants, is that the estimate of $V(\bar X - \gamma\bar Y) = \sigma_1^2/m + \gamma^2\sigma_2^2/n$, once standardized, is a weighted sum of the $\chi^2(m-1)$ and $\chi^2(n-1)$ Chi-squares with unknown weights

$$K_1(m, n) = \frac{\sigma_1^2}{m(m-1)}\left(\frac{\sigma_1^2}{m} + \frac{\gamma^2\sigma_2^2}{n}\right)^{-1} \quad \text{and} \quad K_2(m, n) = \frac{\gamma^2\sigma_2^2}{n(n-1)}\left(\frac{\sigma_1^2}{m} + \frac{\gamma^2\sigma_2^2}{n}\right)^{-1}.$$

When we replace $K_i(m, n)$, $i = 1, 2$, by their estimates based on $S_1^2$ and $S_2^2$, we lose the independence and the problem becomes intractable. We shall provide in the near future some plausible solutions based on what the weighted sum of Chi-squares looks like.

(iii) There are situations where, even for the model we discussed above, namely that the parameters are known to be proportionate, there may not be an explicit solution. However, in some of these cases approximate solutions may be possible. Let us illustrate this via the example of nonidentical Bernoulli trials: let $X_1, \dots, X_n$ be $n$ independent Bernoulli rv's such that $P(X_i = 1) = p_i = C_i p$, $i = 1, \dots, n$. As before, assume $C_i > 0$, $i = 1, \dots, n$, to be known. Thus, the likelihood function is:

$$L(X, C, p) = \prod_i (C_i p)^{X_i}(1 - C_i p)^{1 - X_i}. \tag{39}$$

Taking the log of $L(X, C, p)$ and differentiating with respect to $p$ we get

$$\sum_i \frac{X_i}{p} - \sum_i \frac{C_i(1 - X_i)}{1 - C_i p} = 0. \tag{40}$$

It is not difficult to show that the MLE of $p$ exists and is unique, but it is not explicit. If we need an explicit solution, we can use the approximation $C_i p/(1 - C_i p) \approx C_i p$, $i = 1, \dots, n$, and thus solve

$$\sum_i X_i - p\sum_i C_i(1 - X_i) = 0,$$

giving an approximate estimate

$$\hat p = \frac{\sum_i X_i}{\sum_i C_i(1 - X_i)}.$$

Using the well known formula $E(X/Y) \approx E(X)/E(Y)$, we get that

$$E(\hat p) \approx \frac{p\sum_i C_i}{\sum_i C_i - p\sum_i C_i^2} = p\left\{1 + \frac{p\sum_i C_i^2}{\sum_i C_i - p\sum_i C_i^2}\right\}. \tag{41}$$

Thus if $\sum_i C_i^2/\sum_i C_i \to K \ge 0$, then $E(\hat p) \to p\{1 + pK/(1 - pK)\}$. Hence $\hat p$ is asymptotically unbiased if $K = 0$, while it is positively biased if $K > 0$, in which case the estimate

$$\hat{\hat p} = \hat p\,\{1 + \hat p K\}^{-1} \tag{42}$$

is asymptotically unbiased. Finally, we use the well known formula

$$V\left(\frac{X}{Y}\right) \approx \left[\frac{E(X)}{E(Y)}\right]^2\left\{\frac{V(X)}{[E(X)]^2} + \frac{V(Y)}{[E(Y)]^2} - \frac{2\,\mathrm{cov}(X, Y)}{E(X)E(Y)}\right\}. \tag{43}$$

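We get that

$$V(\hat p) \approx \left[\frac{p\sum_i C_i}{\sum_i C_i(1 - pC_i)}\right]^2\left\{\frac{\sum_i C_i(1 - pC_i)}{p\left(\sum_i C_i\right)^2} + \frac{p\sum_i C_i^3(1 - pC_i)}{\left[\sum_i C_i(1 - pC_i)\right]^2} + \frac{2\sum_i C_i^2(1 - pC_i)}{\sum_i C_i\,\sum_i C_i(1 - pC_i)}\right\}. \tag{44}$$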
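To see how the explicit approximation compares with the exact MLE, one may solve the score equation (40) numerically. The following is a sketch, not a method from the paper: bisection is used because the score is decreasing in $p$, the finite-sample ratio $\sum_i C_i^2/\sum_i C_i$ is used in place of the limit $K$ (an assumption), and all names and data are hypothetical. It assumes at least one success and at least one failure in the data.

```python
def bernoulli_score(p, x, c):
    """Left-hand side of the score equation (40)."""
    return sum(x) / p - sum(ci * (1 - xi) / (1 - ci * p) for xi, ci in zip(x, c))


def bernoulli_mle(x, c, iters=200):
    """Root of the score on (0, 1/max(C)) by bisection; the score decreases in p."""
    lo, hi = 1e-12, 1.0 / max(c) - 1e-12
    if bernoulli_score(hi, x, c) > 0:
        return hi  # no interior root: the likelihood increases up to the boundary
    for _ in range(iters):
        mid = (lo + hi) / 2
        if bernoulli_score(mid, x, c) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def bernoulli_approx(x, c):
    """Explicit estimate p_hat and its corrected version p_hat / (1 + p_hat * K)."""
    p_hat = sum(x) / sum(ci * (1 - xi) for xi, ci in zip(x, c))
    k = sum(ci * ci for ci in c) / sum(c)  # finite-sample stand-in for K
    return p_hat, p_hat / (1 + p_hat * k)


# Hypothetical rare-event data with known constants C_i.
xs = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
cs = [1.0, 2.0, 1.0, 3.0, 2.0, 1.0, 2.0, 1.0, 3.0, 2.0]
p_mle = bernoulli_mle(xs, cs)
p_hat, p_corrected = bernoulli_approx(xs, cs)
```

For these assumed data the corrected estimate lies closer to the numerical MLE than the uncorrected one, in line with the positive bias noted for $K > 0$.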

References

1. Ames, M. H., and Webster, J. T. (1991). “On Estimating Approximate Degrees of Freedom,” The American Statistician, 45, 45-50.

2. Bernard, G. A. (1982). “A New Approach to the Behrens-Fisher Problem,” Utilitas Mathematica, 218, 261-277.

3. Chamberlin, S. R., and Sprott, D. A. (1989). “The Estimation of a Location Parameter when the Scale Parameter is Confined to a Finite Range: The Notion of Generalized Ancillary Statistic,” Biometrika, 76, 609-612.

4. Casella, G., and Berger, R. L. (2002). “Statistical Inference,” 2nd Ed., Duxbury, Pacific Grove, CA.

5. Detre, K., and White, C. (1970). “The Comparison for Two Poisson Distributed Observations,” Biometrics, 26, 851-854.

6. Krishnamoorthy, K., and Thomson, J. (2004). “A More Powerful Test for Comparing Two Poisson Means,” Journal of Statistical Planning and Inference, 119, 23-35.

7. Maity, A., and Sherman, M. (2006). “The Two-Sample T Test with One Variance Known,” The American Statistician, 60, 163-166.

8. Moser, B. K., and Stevens, G. R. (1992). “Homogeneity of Variance in the Two Sample Means Test,” The American Statistician, 46, 19-21.

9. Moser, B. K., Stevens, G. R., and Walts, C. L. (1989). “The Two Sample t Test Versus Satterthwaite’s Approximate F-Test,” Communications in Statistics-Theory and Methods, 18, 3963-3975.

10. Ng, H. K. T., and Tang, M. L. (2005). “Testing the Equality of Two Poisson Means Using the Rate Ratio,” Statistics in Medicine, 24, 955-965.

11. Rohatgi, V. K., and Saleh, Md. K. E. (2001). “An Introduction to Probability and Statistics,” 2nd Ed., Wiley & Sons, New York, NY.

12. Satterthwaite, F. E. (1941). “Synthesis of Variance,” Psychometrika, 6, 309-316.

13. Satterthwaite, F. E. (1946). “An Approximate Distribution of Estimates of Variance Component,” Biometric Bulletin, 6, 110-114.

14. Sichel, H. S. (1973). “On a Significance Test for Two Poisson Variables,” Applied Statistics, 22, 50-58.

15. Sprott, D. A., and Farewell, V. T. (1993). “The Differences Between Two Normal Means,” The American Statistician, 47, 126-128.