In this section, we summarize some known results about the effect of measurement error in linear models. Fuller (1987) gives a comprehensive overview of measurement error modeling and adjusted estimators for linear models; see Carroll et al. (2006) for detailed coverage of nonlinear models.
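The multiple linear regression model is defined as:
\[
Y_i = \beta_0 + \beta^t X_i + \epsilon_i, \quad i = 1, \ldots, n.
\]
Let $X_i = (x_{i1}, x_{i2}, \ldots, x_{ip})^t$ be the $p \times 1$ vector of covariates subject to possible measurement error, and let $\beta = (\beta_1, \beta_2, \ldots, \beta_p)^t$ be the coefficient parameter (written $\beta_x$ in the derivations that follow). The errors satisfy $\epsilon_i \overset{iid}{\sim} N(0, \sigma^2)$.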
Now suppose that we have additive measurement error:
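\[
W_i = X_i + U_i,
\]
where $W_i$ is unbiased for $X_i$, and $U_i$ is an independent random error with $E(U_i \mid X_i) = 0$ and $\mathrm{Var}(U_i \mid X_i) = \Sigma_{uu}$. Assume further that
\[
\begin{pmatrix} X_i \\ \epsilon_i \\ U_i \end{pmatrix} \sim MVN\left\{ \begin{pmatrix} \mu_x \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \Sigma_{xx} & 0 & 0 \\ 0 & \sigma^2 & 0 \\ 0 & 0 & \Sigma_{uu} \end{pmatrix} \right\}.
\]
Then
\[
\begin{pmatrix} X_i \\ W_i \end{pmatrix} \sim MVN\left\{ \begin{pmatrix} \mu_x \\ \mu_x \end{pmatrix}, \begin{pmatrix} \Sigma_{xx} & \Sigma_{xx} \\ \Sigma_{xx} & \Sigma_{xx}+\Sigma_{uu} \end{pmatrix} \right\}.
\]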
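The conditional distribution of $X$ given $W$ is:
\[
X \mid W \sim N_p\left( \mu_x + \Sigma_{xx}(\Sigma_{xx}+\Sigma_{uu})^{-1}(W - \mu_x),\; \Sigma_{xx} - \Sigma_{xx}(\Sigma_{xx}+\Sigma_{uu})^{-1}\Sigma_{xx} \right).
\]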
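The conditional distribution above can be evaluated numerically. The following is a minimal sketch, assuming NumPy is available; the function name `conditional_x_given_w` and the variable name `reliability` (for the matrix $\Sigma_{xx}(\Sigma_{xx}+\Sigma_{uu})^{-1}$) are illustrative choices, not notation from the text.

```python
import numpy as np

def conditional_x_given_w(mu_x, sigma_xx, sigma_uu, w):
    """Mean and covariance of X | W = w when W = X + U, with X and U
    independent and normally distributed (hypothetical helper)."""
    mu_x = np.atleast_1d(np.asarray(mu_x, dtype=float))
    w = np.atleast_1d(np.asarray(w, dtype=float))
    sigma_xx = np.atleast_2d(np.asarray(sigma_xx, dtype=float))
    sigma_uu = np.atleast_2d(np.asarray(sigma_uu, dtype=float))
    # Sigma_xx (Sigma_xx + Sigma_uu)^{-1}
    reliability = sigma_xx @ np.linalg.inv(sigma_xx + sigma_uu)
    cond_mean = mu_x + reliability @ (w - mu_x)
    cond_cov = sigma_xx - reliability @ sigma_xx
    return cond_mean, cond_cov

# Scalar illustration: mu_x = 2, sigma_x^2 = 4, sigma_u^2 = 1, observed w = 3
mean, cov = conditional_x_given_w([2.0], [[4.0]], [[1.0]], [3.0])
```

In this scalar illustration the observed value is pulled toward $\mu_x$: the conditional mean lies between $\mu_x = 2$ and $w = 3$, and the conditional variance is smaller than $\sigma_x^2$.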
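Hence, the conditional mean and variance are:
\[
E(X \mid W) = \mu_x + \Sigma_{xx}(\Sigma_{xx}+\Sigma_{uu})^{-1}(W - \mu_x),
\]
\[
\mathrm{Var}(X \mid W) = \Sigma_{xx} - \Sigma_{xx}(\Sigma_{xx}+\Sigma_{uu})^{-1}\Sigma_{xx}.
\]
With nondifferential measurement error,
\[
\begin{aligned}
E(Y \mid W) &= E\{E(Y \mid X, W) \mid W\} \\
&= E\{E(Y \mid X) \mid W\} \\
&= E(\beta_0 + \beta_x^t X \mid W) \\
&= \beta_0 + \beta_x^t[\mu_x - \Sigma_{xx}(\Sigma_{xx}+\Sigma_{uu})^{-1}\mu_x] + \beta_x^t[\Sigma_{xx}(\Sigma_{xx}+\Sigma_{uu})^{-1}]W \\
&= \beta_{w0} + \beta_w^t W.
\end{aligned}
\]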
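The naive least squares regression of $Y$ on $W$, without adjusting for the measurement error, yields a consistent estimate not of $\beta_x$ but of
\[
\beta_w^t = \beta_x^t[\Sigma_{xx}(\Sigma_{xx}+\Sigma_{uu})^{-1}].
\]
The residual variance of this regression of $Y$ on $W$ is:
\[
\begin{aligned}
\mathrm{Var}(Y \mid W) &= \beta_x^t \mathrm{Var}(X \mid W)\beta_x + \sigma^2 \\
&= \beta_x^t[\Sigma_{xx} - \Sigma_{xx}(\Sigma_{xx}+\Sigma_{uu})^{-1}\Sigma_{xx}]\beta_x + \sigma^2.
\end{aligned}
\]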
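The limit of the naive estimator and its residual variance can be checked by simulation. The sketch below assumes NumPy and uses arbitrary illustrative parameter values (two covariates, a fixed seed, and a large sample); none of these numbers come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Illustrative (assumed) parameter values
beta0 = 1.0
beta_x = np.array([2.0, -1.0])
mu_x = np.array([0.5, -0.5])
sigma_xx = np.array([[1.0, 0.3], [0.3, 2.0]])
sigma_uu = np.array([[0.5, 0.0], [0.0, 0.5]])
sigma2 = 1.0

# Generate true covariates, additive measurement error, and the response
X = rng.multivariate_normal(mu_x, sigma_xx, size=n)
U = rng.multivariate_normal(np.zeros(2), sigma_uu, size=n)
W = X + U
Y = beta0 + X @ beta_x + rng.normal(0.0, np.sqrt(sigma2), size=n)

# Naive least squares of Y on (1, W), ignoring the measurement error
design = np.column_stack([np.ones(n), W])
coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
naive_slopes = coef[1:]
resid_var = np.sum((Y - design @ coef) ** 2) / (n - design.shape[1])

# Theoretical limit: beta_w = (Sigma_xx + Sigma_uu)^{-1} Sigma_xx beta_x
beta_w = np.linalg.solve(sigma_xx + sigma_uu, sigma_xx @ beta_x)

# Theoretical residual variance: beta_x^t Var(X|W) beta_x + sigma^2
var_x_given_w = sigma_xx - sigma_xx @ np.linalg.solve(sigma_xx + sigma_uu, sigma_xx)
resid_var_theory = beta_x @ var_x_given_w @ beta_x + sigma2
```

With a sample of this size, `naive_slopes` should lie close to `beta_w` rather than to `beta_x`, and `resid_var` should lie close to `resid_var_theory`.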
A.2.0.1 Simple Linear Regression Model
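When $p = 1$, we have the simple linear regression model, and:
\[
E(X \mid W) = \frac{\sigma_u^2}{\sigma_x^2+\sigma_u^2}\mu_x + \frac{\sigma_x^2}{\sigma_x^2+\sigma_u^2}W,
\]
\[
\mathrm{Var}(X \mid W) = \frac{\sigma_u^2\sigma_x^2}{\sigma_x^2+\sigma_u^2}.
\]
With nondifferential measurement error,
\[
E(Y \mid W) = \beta_0 + \frac{\beta_x\sigma_u^2}{\sigma_x^2+\sigma_u^2}\mu_x + \frac{\beta_x\sigma_x^2}{\sigma_x^2+\sigma_u^2}W = \beta_0^* + \beta_x^* W.
\]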
The naive least squares regression of $Y$ on $W$, without adjusting for the measurement error, yields a consistent estimate not of $\beta_x$ but of $\beta_x^* = \lambda\beta_x$, where
\[
\lambda = \frac{\sigma_x^2}{\sigma_x^2+\sigma_u^2} < 1.
\]
The residual variance of this regression of $Y$ on $W$ is:
\[
\mathrm{Var}(Y \mid W) = \sigma^{*2} = \beta_x^2\mathrm{Var}(X \mid W) + \sigma^2 = \lambda\beta_x^2\sigma_u^2 + \sigma^2.
\]
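The simple linear regression quantities above are easy to compute directly. The following is a minimal sketch in plain Python; the function names and the numerical values in the example are assumptions chosen for illustration.

```python
def attenuation_factor(sigma2_x, sigma2_u):
    """lambda = sigma_x^2 / (sigma_x^2 + sigma_u^2)."""
    return sigma2_x / (sigma2_x + sigma2_u)

def naive_regression_parameters(beta0, beta_x, mu_x, sigma2_x, sigma2_u, sigma2):
    """Intercept, slope and residual variance of the regression of Y on W
    (hypothetical helper following the formulas in this section)."""
    lam = attenuation_factor(sigma2_x, sigma2_u)
    beta0_star = beta0 + (1.0 - lam) * beta_x * mu_x   # beta_0^*
    beta_x_star = lam * beta_x                         # beta_x^* = lambda * beta_x
    sigma2_star = lam * beta_x**2 * sigma2_u + sigma2  # sigma^{*2}
    return beta0_star, beta_x_star, sigma2_star

# Illustrative values: beta_0 = 1, beta_x = 2, mu_x = 3,
# sigma_x^2 = 4, sigma_u^2 = 1, sigma^2 = 1
lam = attenuation_factor(4.0, 1.0)
beta0_star, beta_x_star, sigma2_star = naive_regression_parameters(
    beta0=1.0, beta_x=2.0, mu_x=3.0, sigma2_x=4.0, sigma2_u=1.0, sigma2=1.0
)
```

For these values, $\lambda = 0.8$, so the naive slope is attenuated from 2 to 1.6, the intercept shifts from 1 to 2.2, and the residual variance grows from 1 to 4.2 (up to floating-point rounding).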
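The variance of the slope estimator calculated from the true data $(Y, X)$ would be
\[
\mathrm{Var}(\hat{\beta}_x) = \sigma^2/S_{xx} \approx \sigma^2/(n\sigma_x^2).
\]
The variance of the naive slope estimator calculated from the data $(Y, W)$ would be
\[
\mathrm{Var}(\hat{\beta}_x^*) \approx \sigma^{*2}/(n\sigma_w^2), \quad \text{where } \sigma_w^2 = \sigma_x^2 + \sigma_u^2.
\]
The naive estimate of the slope can be asymptotically less variable than the true-data estimator as long as:
\[
\mathrm{Var}(\hat{\beta}_x^*) < \mathrm{Var}(\hat{\beta}_x)
\;\Longleftrightarrow\;
\frac{\sigma^{*2}}{n\sigma_w^2} < \frac{\sigma^2}{n\sigma_x^2}
\;\Longleftrightarrow\;
\lambda\beta_x^2 < \frac{\sigma^2}{\sigma_x^2}.
\]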
As pointed out by Buzas, Stefanski and Tosteson (2005), this inequality can hold when $\sigma^2$ is large, $\sigma_u^2$ is large, or $\beta_x^2$ is small. Note that this phenomenon cannot occur with Berkson error, for which the variance of the naive estimator is never less than the variance of the true-data estimator asymptotically (Carroll et al., 2006).
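The variance comparison can be illustrated numerically. The sketch below, in plain Python, evaluates the two asymptotic variances and the equivalent condition $\lambda\beta_x^2 < \sigma^2/\sigma_x^2$; the function names and the two parameter settings are assumptions made for illustration only.

```python
def slope_variances(beta_x, sigma2_x, sigma2_u, sigma2, n):
    """Asymptotic variances of the true-data and naive slope estimators."""
    lam = sigma2_x / (sigma2_x + sigma2_u)
    sigma2_star = lam * beta_x**2 * sigma2_u + sigma2      # residual variance of Y on W
    var_true = sigma2 / (n * sigma2_x)                     # sigma^2 / (n sigma_x^2)
    var_naive = sigma2_star / (n * (sigma2_x + sigma2_u))  # sigma^{*2} / (n sigma_w^2)
    return var_true, var_naive

def naive_is_less_variable(beta_x, sigma2_x, sigma2_u, sigma2):
    """Check the condition lambda * beta_x^2 < sigma^2 / sigma_x^2."""
    lam = sigma2_x / (sigma2_x + sigma2_u)
    return lam * beta_x**2 < sigma2 / sigma2_x

# Setting 1 (assumed): small slope, large measurement error
case1 = dict(beta_x=0.5, sigma2_x=1.0, sigma2_u=1.0, sigma2=1.0)
# Setting 2 (assumed): large slope, small measurement error and residual variance
case2 = dict(beta_x=3.0, sigma2_x=1.0, sigma2_u=0.1, sigma2=0.5)

var_true1, var_naive1 = slope_variances(n=100, **case1)
var_true2, var_naive2 = slope_variances(n=100, **case2)
flag1 = naive_is_less_variable(**case1)
flag2 = naive_is_less_variable(**case2)
```

In the first setting the naive estimator has the smaller asymptotic variance, and in the second it does not. A smaller variance does not make the naive estimator preferable, since it remains consistent for $\lambda\beta_x$ rather than for $\beta_x$.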
VITA
Name
Juan Xiong
Education
DOCTOR OF PHILOSOPHY in Statistics, 2010
The University of Western Ontario, London, Ontario, Canada
Supervisor: Wenqing He
MASTER OF SCIENCE in Biostatistics, 2006
The University of Western Ontario, London, Ontario, Canada
Supervisor: Wenqing He
BACHELOR OF SCIENCE in Information and Computer Science, 2005
East China Normal University, Shanghai, China
Supervisor: Xuecheng Pang
Employment
Teaching Assistant, September 2006 - December 2009
Department of Statistical and Actuarial Sciences, The University of Western Ontario
Research Assistant, September 2005 - November 2010
Department of Statistical and Actuarial Sciences, The University of Western Ontario
Research Interests
Publications
Peer-Reviewed Journals/Proceedings
• W. He, J. Xiong and G. Y. Yi (2010). SIMEX R package for accelerated failure time models with covariates measurement error. Conditionally accepted by Journal of Statistical Software.
• J. Xiong and W. He (2009). Survival relevant gene selection in microarray data analysis with gene expression subject to measurement error. In JSM Proceedings, Biostatistics Section. Washington, DC: American Statistical Association.
• W. He, G. Y. Yi and J. Xiong (2007). Accelerated failure time models with covariates subject to measurement error, Statistics in Medicine, 26, 4817-4832.
Submissions/In Progress
• J. Xiong and W. He (2010). Survival relevant gene selection in microarray data analysis with gene expression subject to measurement error. Submitted.
• A. Yuan, G. Chen, W. He and J. Xiong (2009). Bayesian frequentist hybrid model with application to the analysis of gene copy number changes. Submitted.
Conference Presentations
• Juan Xiong and Wenqing He (2010). Predicting of survival time by combining mismeasured gene expression data from different platforms. The Annual General Meeting of the Statistical Society of Canada, Quebec City, QC, Canada, May 23 - May 26.
• Juan Xiong and Wenqing He (2009). Poster presentation at the Research and Education Day of the Department of Oncology, University of Western Ontario, The Best Western Lamplighter Inn, London, ON, Canada, June 12.
• Juan Xiong and Wenqing He (2009). Survival relevant gene selection in microarray data analysis with gene expression subject to measurement error. The Annual General Meeting of the Statistical Society of Canada, Vancouver, BC, Canada, May 3 June 3.
• Juan Xiong (2006). Accelerated failure time models with mismeasured covariates. The Second Annual MSc Day of the Department of Statistics & Actuarial Sciences, University of Western Ontario, London, ON, Canada, July 27.
Scholarships and Academic Awards
• Western Graduate Research Scholarship, The University of Western Ontario, Canada, (2005-2010)
• 3rd-class Academic Excellence Scholarship, East China Normal University, China, (2003 academic year)
• Outstanding Student Award, East China Normal University, China, (2002 academic year)
• 1st-class Academic Excellence Scholarship, East China Normal University, China, (2002 academic year)