Conclusions and Proposed Extensions - Bayesian Aspects of Classification Procedures

We used the Neyman-Pearson lemma to show that a popular classification procedure based on scoring can be improved in terms of the AUC criterion when the underlying populations have different variances. We proposed a quadratic recalibration that maximizes the AUC and contains the usual procedure based on raw scores as a special case when the population variances are equal. Our results rest on the bi-normal population assumption for the scores, which can be appropriate in many real-world settings. The gain in AUC grows as the difference between the variances of the two populations increases, with an increase of 25% recorded for the Restaurant Patron Tipping data in our illustration and modest improvements in AUC for other common reference datasets. We hope to extend this work by investigating the procedure for data sets in which the scores are sample averages from samples of various sizes, as this is a natural setting for normal scores with unequal variances.
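The idea behind the recalibration can be sketched numerically. The following is a minimal illustration, not necessarily the exact parameterization used in the chapter: it assumes bi-normal scores, ranks observations by the log likelihood ratio of the two normal densities (which is quadratic in the raw score), and compares the empirical AUC with that of the raw scores. The simulated data, the parameter values, and the function names `empirical_auc` and `quadratic_recalibration` are hypothetical choices made for this example.

```python
import numpy as np

def empirical_auc(neg_scores, pos_scores):
    """Mann-Whitney estimate of P(positive score > negative score); ties count 1/2."""
    neg = np.asarray(neg_scores, dtype=float)
    pos = np.asarray(pos_scores, dtype=float)
    diff = pos[:, None] - neg[None, :]  # all positive/negative pairs
    return float(np.mean(diff > 0) + 0.5 * np.mean(diff == 0))

def quadratic_recalibration(x, mu0, sd0, mu1, sd1):
    """Log likelihood ratio log f1(x)/f0(x) for N(mu1, sd1^2) versus N(mu0, sd0^2)."""
    x = np.asarray(x, dtype=float)
    return (np.log(sd0 / sd1)
            - (x - mu1) ** 2 / (2.0 * sd1 ** 2)
            + (x - mu0) ** 2 / (2.0 * sd0 ** 2))

# Hypothetical simulated scores: the two populations differ in mean and variance.
rng = np.random.default_rng(0)
neg = rng.normal(loc=0.0, scale=1.0, size=1000)
pos = rng.normal(loc=1.0, scale=3.0, size=1000)

# Plug-in parameter estimates (computed in-sample, for illustration only).
params = (neg.mean(), neg.std(ddof=1), pos.mean(), pos.std(ddof=1))

auc_raw = empirical_auc(neg, pos)
auc_quad = empirical_auc(quadratic_recalibration(neg, *params),
                         quadratic_recalibration(pos, *params))
print(f"raw AUC: {auc_raw:.3f}, recalibrated AUC: {auc_quad:.3f}")
```

When the two standard deviations are equal and the positive mean is larger, the log likelihood ratio reduces to an increasing linear function of the raw score, so the ranking and hence the AUC are unchanged, which matches the special case noted above.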

Chapter 4 Conclusion

In this dissertation, we have explored three specific areas of Bayesian classification procedures. The first chapter focused on a new classification procedure that uses a nonparametric mixture prior distribution and empirical Bayes techniques to minimize a loss function applicable to many scientific settings. The second chapter turned to a popular criterion for evaluating classifiers, the false discovery rate, and gave a way of estimating its Bayesian versions, the pFDR and the local false discovery rate, using a nonparametric mixture prior. The last chapter examined the AUC criterion in classification problems with normal observations, which can arise frequently when many covariates are combined into summary classification scores through averaging or regression techniques.

There are many interesting questions in the field of our work that can be explored further. For example, better ways of controlling local false discovery rates, and not just the FDR, would be useful. The sense in which an error rate is controlled is also open for additional work: current techniques focus on providing bounds on expected error rate values, while applications may also need more attention to sample-specific statements. Work by Jin and Cai (2007) suggests that it may be possible to make the techniques proposed in Chapter 2 of this dissertation more general by providing estimates of the noise distribution, because misspecification of that distribution can lead to inaccurate estimates of local false discovery rates. It would also be interesting to extend local FDR techniques to interaction effects in model selection in ways similar to the hierarchical FDR model proposed by Yekutieli (2008).
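The sensitivity to the noise distribution can be illustrated with a toy calculation. The sketch below assumes a simple two-groups model with known normal components, which is a deliberate simplification and not the nonparametric mixture prior of Chapter 2. The local false discovery rate is taken as fdr(z) = pi0 f0(z) / f(z), where f is the marginal density of the observations, and the calculation compares a correctly specified null density with a too-narrow N(0, 1) null. All parameter values and function names are hypothetical.

```python
import numpy as np

def normal_pdf(z, mean, sd):
    """Density of N(mean, sd^2) evaluated at z."""
    z = np.asarray(z, dtype=float)
    return np.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

# Hypothetical two-groups model: 90% nulls with noise sd 1.2, 10% signals near 3.
PI0, TRUE_NULL_SD, ALT_MEAN, ALT_SD = 0.9, 1.2, 3.0, 1.0

def marginal_pdf(z):
    """Marginal density f(z) = pi0 * f0(z) + (1 - pi0) * f1(z) under the true model."""
    return (PI0 * normal_pdf(z, 0.0, TRUE_NULL_SD)
            + (1.0 - PI0) * normal_pdf(z, ALT_MEAN, ALT_SD))

def local_fdr(z, assumed_null_sd):
    """Local fdr pi0 * f0(z) / f(z) with an assumed null sd, capped at 1."""
    ratio = PI0 * normal_pdf(z, 0.0, assumed_null_sd) / marginal_pdf(z)
    return np.minimum(1.0, ratio)

z = np.array([0.0, 2.0, 3.0, 4.0])
fdr_true = local_fdr(z, assumed_null_sd=TRUE_NULL_SD)  # correctly specified noise
fdr_misspecified = local_fdr(z, assumed_null_sd=1.0)   # theoretical N(0, 1) null
for zi, a, b in zip(z, fdr_true, fdr_misspecified):
    print(f"z = {zi:.1f}: fdr (true null) = {a:.3f}, fdr (N(0,1) null) = {b:.3f}")
```

In this toy setup the too-narrow null understates the local false discovery rate in the upper tail, which is the kind of inaccuracy that estimating the noise distribution is meant to reduce.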

Bibliography

[1] Benjamini, Y. and Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B (Methodological), 57(1): 289–300, 1995.

[2] Benjamini, Y., Krieger, A.M., and Yekutieli, D. Adaptive linear step-up False Discovery Rate controlling procedures. Biometrika, 93(3): 491–507, Sep 2006.

[3] Benjamini, Y. and Yekutieli, D. The control of the False Discovery Rate in multiple testing under dependency. The Annals of Statistics, 29(4): 1165–1188, 2001.

[4] Brown, L.D. Admissible estimators, recurrent diffusions, and insoluble boundary value problems. The Annals of Mathematical Statistics, 42(3): 855–903, 1971.

[5] Cai, T. and Jin, J. Optimal rates of convergence for estimating the null density and proportion of non-null effects in large-scale multiple testing. The Annals of Statistics, 38: 100–145, 2010.

[6] Duncan, D.B. A Bayesian approach to multiple comparisons. Technometrics, 7: 171–222, 1965.

[7] Efron, B., Tibshirani, R., Storey, J.D., and Tusher, V. Empirical Bayes analysis of a microarray experiment. Journal of the American Statistical Association, 96(456): 1151–1160, 2001.

[8] Efron, B. Large-Scale Simultaneous Hypothesis Testing: The Choice of a Null Hypothesis. Journal of the American Statistical Association, 99: 96–104, 2004.

[9] Efron, B. Local false discovery rates. Technical report. Division of Biostatistics, Stanford University, 2005.

[10] Efron, B. Empirical Bayes modeling, computation, and accuracy. Technical report. Stanford University, 2013.

[11] Efron, B. and Morris, C. Data analysis using Stein’s estimator and its generalizations. Journal of the American Statistical Association, 70: 311–319, 1975.

[12] Gasch, A.P., Spellman, P.T., Kao, C.M., Carmel-Harel, O., Eisen, M.B., Storz, G., and Botstein, D. Genomic expression programs in the response of yeast cells to environmental changes. Molecular Biology of the Cell, 11: 4241–4257, 2000.

[13] George, E.I. and Foster, D.P. Calibration and empirical Bayes variable selection. Biometrika, 87(4): 731–747, 2000.

dient descent. Proceedings of the twenty-first international conference on machine learning, 49–, 2004.

[15] Jin, J. and Cai, T. Estimating the null and the proportion of non-null effects in large-scale multiple comparisons. Journal of the American Statistical Association, 102: 495–506, 2007.

[16] Johnstone, I.M. and Silverman, B.W. Needles and straw in haystacks: empirical Bayes estimates of possibly sparse sequences. The Annals of Statistics, 32(4): 1594–1649, 2004.

[17] Lehmann, E. and Romano, J. Testing Statistical Hypotheses. Springer Texts in Statistics, 2005.

[18] Pepe, M. An interpretation for the ROC curve and inference using GLM procedures. Biometrics, 56: 352–359, 2000.

[19] Pepe, M. The Statistical Evaluation of Medical Tests for Classification and Prediction. Oxford University Press, 2003.

[20] Raykar, V.C. and Zhao, L.H. Nonparametric prior for adaptive sparsity. Proceedings of the 13th International Conference on Artificial Intelligence and Statistics, 629–636, 2010.

[21] Raykar, V.C. and Zhao, L.H. Empirical Bayesian thresholding for sparse signals using mixture loss functions. Statistica Sinica, 21: 449–474, 2011.

[22] Scott, J.G. and Berger, J.O. An exploration of aspects of Bayesian multiple testing. Journal of Statistical Planning and Inference, 136: 2144–2162, 2006.

[23] Storey, J.D. A direct approach to false discovery rates. Journal of the Royal Statistical Society. Series B, 64(3): 479–498, 2002.

[24] Storey, J.D. The positive false discovery rate: a Bayesian interpretation and the q-value. The Annals of Statistics, 31(6): 2013–2035, 2003.

[25] Wand, M.P. and Jones, M.C. Kernel Smoothing. Chapman and Hall/CRC, 1995.

[26] Yekutieli, D. Hierarchical False Discovery Rate controlling methodology. Journal of the American Statistical Association, 103(481): 309–316, 2008.

[27] Zhang, C.-H. Empirical Bayes and compound estimation of normal means. Statistica Sinica, 7: 181–193, 1997.
