Note that from the definition of E in (2.29), we have
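\begin{align*}
\|E\| &\le \|U_{I:} D V'_{J^c:}\| + \|U_{I^c:} D V'_{J:}\| + \|U_{I^c:} D V'_{J^c:}\| + \|Z_{IJ}\| \\
&\lesssim \sum_{l=1}^{r} \bigl( d_l \|U_{Il}\| \|V_{J^c l}\| + d_l \|U_{I^c l}\| \|V_{Jl}\| + d_l \|U_{I^c l}\| \|V_{J^c l}\| \bigr) + \sqrt{|I|} + \sqrt{|J|} && \text{by Lemma 8} \\
&\lesssim \sum_{l=1}^{r} \bigl( d_l \|V_{J_-^c l}\| + d_l \|U_{I_-^c l}\| + d_l \|U_{I_-^c l}\| \|V_{J_-^c l}\| \bigr) + \sqrt{|I_+|} + \sqrt{|J_+|} && \text{by Lemma 1} \\
&\lesssim d_1 \|V_{J_-^c l}\| + d_1 \|U_{I_-^c l}\| + \sqrt{|I_+|} + \sqrt{|J_+|} && \text{by (2.28)} \\
&= o(d_1) && \text{by (2.28) and (2.27).}
\end{align*}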
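By the convexity of the map $x \mapsto x^2$, and plugging in (2.28) and (2.27), we get
\begin{align*}
\|E\|^2 &\lesssim d_1^2 \|V_{J_-^c l}\|^2 + d_1^2 \|U_{I_-^c l}\|^2 + |I_+| + |J_+| \\
&\lesssim d_1^2 s_u^{q_u} \left( \frac{\sqrt{p_v \log(p_u \vee p_v)}}{d_1^2} \right)^{1 - q_u/2} + d_1^2 s_v^{q_v} \left( \frac{\sqrt{p_u \log(p_u \vee p_v)}}{d_1^2} \right)^{1 - q_v/2} \\
&\quad + s_u^{q_u} \left( \frac{d_1^2}{\sqrt{p_v \log(p_u \vee p_v)}} \right)^{q_u/2} + s_v^{q_v} \left( \frac{d_1^2}{\sqrt{p_u \log(p_u \vee p_v)}} \right)^{q_v/2} \\
&\lesssim d_1^2 s_u^{q_u} \left( \frac{\sqrt{p_v \log(p_u \vee p_v)}}{d_1^2} \right)^{1 - q_u/2} + d_1^2 s_v^{q_v} \left( \frac{\sqrt{p_u \log(p_u \vee p_v)}}{d_1^2} \right)^{1 - q_v/2},
\end{align*}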
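The last inequality above absorbs the two cardinality terms into the two leading terms. The following is a short sketch of why this is possible, not a step taken from the source. It uses the shorthand $\tau = \sqrt{p \log(p_u \vee p_v)}$, which is our notation, with $(s, q, p)$ standing for either $(s_u, q_u, p_v)$ or $(s_v, q_v, p_u)$. It assumes $\tau \ge 1$, which holds for instance whenever $p_u \vee p_v \ge 3$.

```latex
% Rewrite a leading term so that it can be compared with a cardinality term.
\begin{align*}
d_1^2 \, s^{q} \left( \frac{\tau}{d_1^2} \right)^{1 - q/2}
  &= d_1^2 \, s^{q} \cdot \frac{\tau}{d_1^2} \cdot \left( \frac{\tau}{d_1^2} \right)^{-q/2} \\
  &= \tau \, s^{q} \left( \frac{d_1^2}{\tau} \right)^{q/2} \\
  &\ge s^{q} \left( \frac{d_1^2}{\tau} \right)^{q/2}
  \qquad \text{since } \tau \ge 1 .
\end{align*}
```

Each cardinality term is therefore at most the matching leading term, so dropping it changes the bound only by a constant factor.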