Rayleigh Ratios and the Courant–Fischer Theorem
The most important property of symmetric matrices is that they have real eigenvalues and that they can be diagonalized by an orthogonal matrix.
Thus, if $A$ is an $n \times n$ symmetric matrix, then it has $n$ real eigenvalues $\lambda_1, \ldots, \lambda_n$ (not necessarily distinct), and there is an orthonormal basis of eigenvectors $(u_1, \ldots, u_n)$ (for a proof, see Gallier).
Another fact that is used frequently in optimization problems is that the eigenvalues of a symmetric matrix are characterized in terms of what is known as the Rayleigh ratio, defined by
\[
R(A)(x) = \frac{x^\top A x}{x^\top x}, \quad x \in \mathbb{R}^n,\ x \neq 0.
\]
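The defining property of the Rayleigh ratio can be checked numerically. The following Python sketch (the matrix and test vectors are arbitrary illustrative choices, not taken from the text) verifies that the ratio of any nonzero vector always lies between the smallest and largest eigenvalues:

```python
import numpy as np

def rayleigh_ratio(A, x):
    """Rayleigh ratio x^T A x / x^T x (x must be nonzero)."""
    return (x @ A @ x) / (x @ x)

# An arbitrary small symmetric matrix, for illustration only.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

eigvals = np.linalg.eigvalsh(A)  # sorted ascending: lambda_1 <= ... <= lambda_n
rng = np.random.default_rng(0)
for _ in range(5):
    x = rng.standard_normal(3)
    r = rayleigh_ratio(A, x)
    # For every nonzero x, the ratio lies between the extreme eigenvalues.
    assert eigvals[0] - 1e-12 <= r <= eigvals[-1] + 1e-12
```

Note that for the identity matrix the ratio is identically 1, matching its single eigenvalue.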
The following proposition is often used to prove the correctness of various optimization or approximation problems (for example, PCA).
Proposition A.1. (Rayleigh–Ritz) If $A$ is a symmetric $n \times n$ matrix with eigenvalues $\lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_n$ and if $(u_1, \ldots, u_n)$ is any orthonormal basis of eigenvectors of $A$, where $u_i$ is a unit eigenvector associated with $\lambda_i$, then
\[
\max_{x \neq 0} \frac{x^\top A x}{x^\top x} = \lambda_n
\]
(with the maximum attained for $x = u_n$), and
\[
\max_{x \neq 0,\, x \in \{u_{n-k+1}, \ldots, u_n\}^\perp} \frac{x^\top A x}{x^\top x} = \lambda_{n-k}
\]
(with the maximum attained for $x = u_{n-k}$), where $1 \leq k \leq n - 1$.
Equivalently, if $V_k$ is the subspace spanned by $(u_1, \ldots, u_k)$, then
\[
\lambda_k = \max_{x \neq 0,\, x \in V_k} \frac{x^\top A x}{x^\top x}, \quad k = 1, \ldots, n.
\]
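Proposition A.1 lends itself to a quick numerical sanity check. In this Python sketch (a random symmetric matrix, purely illustrative), the Rayleigh ratio over vectors in $V_k$ never exceeds $\lambda_k$, and $x = u_k$ attains that value exactly:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                  # a random symmetric matrix (illustrative)
lams, U = np.linalg.eigh(A)        # lams ascending; columns of U orthonormal

k = 3
best = -np.inf
for _ in range(2000):
    c = rng.standard_normal(k)
    x = U[:, :k] @ c               # a random vector in V_k = span(u_1, ..., u_k)
    best = max(best, (x @ A @ x) / (x @ x))

# The sampled maximum stays below lambda_k; x = u_k attains it exactly.
uk = U[:, k - 1]
exact = (uk @ A @ uk) / (uk @ uk)
assert best <= lams[k - 1] + 1e-10
assert abs(exact - lams[k - 1]) < 1e-10
```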
For our purposes, we also need the version of Proposition A.1 that applies to min instead of max.
Proposition A.2. (Rayleigh–Ritz) If $A$ is a symmetric $n \times n$ matrix with eigenvalues $\lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_n$ and if $(u_1, \ldots, u_n)$ is any orthonormal basis of eigenvectors of $A$, where $u_i$ is a unit eigenvector associated with $\lambda_i$, then
\[
\min_{x \neq 0} \frac{x^\top A x}{x^\top x} = \lambda_1
\]
(with the minimum attained for $x = u_1$), and
\[
\min_{x \neq 0,\, x \in \{u_1, \ldots, u_{i-1}\}^\perp} \frac{x^\top A x}{x^\top x} = \lambda_i
\]
(with the minimum attained for $x = u_i$), where $2 \leq i \leq n$.
Equivalently, if $W_k = V_{k-1}^\perp$ denotes the subspace spanned by $(u_k, \ldots, u_n)$ (with $V_0 = (0)$), then
\[
\lambda_k = \min_{x \neq 0,\, x \in W_k} \frac{x^\top A x}{x^\top x} = \min_{x \neq 0,\, x \in V_{k-1}^\perp} \frac{x^\top A x}{x^\top x}, \quad k = 1, \ldots, n.
\]
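The min version can be checked the same way. In this Python sketch (again with a random illustrative matrix), every Rayleigh ratio sampled from $W_k = \mathrm{span}(u_k, \ldots, u_n)$ is at least $\lambda_k$, and $x = u_k$ attains the minimum:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                  # random symmetric matrix (illustrative)
lams, U = np.linalg.eigh(A)

k = 2                              # 1-based index into lambda_1 <= ... <= lambda_n
Wk = U[:, k - 1:]                  # orthonormal basis of W_k = span(u_k, ..., u_n)
vals = []
for _ in range(2000):
    c = rng.standard_normal(Wk.shape[1])
    x = Wk @ c                     # a random vector in W_k
    vals.append((x @ A @ x) / (x @ x))

# Every sample is >= lambda_k, and u_k attains the minimum.
uk = U[:, k - 1]
assert min(vals) >= lams[k - 1] - 1e-10
assert abs((uk @ A @ uk) / (uk @ uk) - lams[k - 1]) < 1e-10
```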
Propositions A.1 and A.2 together are known as the Rayleigh–Ritz theorem.
As an application of Propositions A.1 and A.2, we give a proof of a proposition which is the key to the proof of Theorem 2.2.
Given an $n \times n$ symmetric matrix $A$ and an $m \times m$ symmetric matrix $B$, with $m \leq n$, if $\lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_n$ are the eigenvalues of $A$ and $\mu_1 \leq \mu_2 \leq \cdots \leq \mu_m$ are the eigenvalues of $B$, then we say that the eigenvalues of $B$ interlace the eigenvalues of $A$ if
\[
\lambda_i \leq \mu_i \leq \lambda_{n-m+i}, \quad i = 1, \ldots, m.
\]
The following proposition is known as the Poincaré separation theorem.
Proposition A.3. Let $A$ be an $n \times n$ symmetric matrix, $R$ be an $n \times m$ matrix such that $R^\top R = I$ (with $m \leq n$), and let $B = R^\top A R$ (an $m \times m$ matrix). The following properties hold:
(a) The eigenvalues of $B$ interlace the eigenvalues of $A$.
(b) If $\lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_n$ are the eigenvalues of $A$ and $\mu_1 \leq \mu_2 \leq \cdots \leq \mu_m$ are the eigenvalues of $B$, and if $\lambda_i = \mu_i$, then there is an eigenvector $v$ of $B$ with eigenvalue $\mu_i$ such that $Rv$ is an eigenvector of $A$ with eigenvalue $\lambda_i$.
Observe that Proposition A.3 implies that
\[
\lambda_1 + \cdots + \lambda_m \leq \operatorname{tr}(R^\top A R) \leq \lambda_{n-m+1} + \cdots + \lambda_n.
\]
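Both the interlacing property and the trace bound are easy to verify on a random example. In this Python sketch (all matrices are illustrative), $R$ is built with orthonormal columns via a QR factorization, so $R^\top R = I$:

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 7, 3
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                           # random symmetric matrix (illustrative)
R, _ = np.linalg.qr(rng.standard_normal((n, m)))  # orthonormal columns: R^T R = I
B = R.T @ A @ R                             # m x m compression of A

lam = np.linalg.eigvalsh(A)                 # lambda_1 <= ... <= lambda_n
mu = np.linalg.eigvalsh(B)                  # mu_1 <= ... <= mu_m

# Interlacing: lambda_i <= mu_i <= lambda_{n-m+i}, i = 1, ..., m.
for i in range(m):
    assert lam[i] - 1e-10 <= mu[i] <= lam[n - m + i] + 1e-10

# Trace bound: sum of the m smallest eigenvalues <= tr(R^T A R)
#              <= sum of the m largest eigenvalues.
t = np.trace(B)
assert lam[:m].sum() - 1e-10 <= t <= lam[-m:].sum() + 1e-10
```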
For the sake of completeness, we also prove the Courant–Fischer characterization of the eigenvalues of a symmetric matrix.
Theorem A.4. (Courant–Fischer) Let $A$ be a symmetric $n \times n$ matrix with eigenvalues $\lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_n$ and let $(u_1, \ldots, u_n)$ be any orthonormal basis of eigenvectors of $A$, where $u_i$ is a unit eigenvector associated with $\lambda_i$. If $\mathcal{V}_k$ denotes the set of subspaces of $\mathbb{R}^n$ of dimension $k$, then
\[
\lambda_k = \max_{W \in \mathcal{V}_{n-k+1}} \min_{x \in W,\, x \neq 0} \frac{x^\top A x}{x^\top x} = \min_{W \in \mathcal{V}_k} \max_{x \in W,\, x \neq 0} \frac{x^\top A x}{x^\top x}.
\]
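The max–min characterization can also be probed numerically: if $W$ has orthonormal basis matrix $W$, the minimum of the Rayleigh ratio over $W$ is the least eigenvalue of the compression $W^\top A W$. In this Python sketch (illustrative random matrices), no $(n-k+1)$-dimensional subspace beats $\lambda_k$, while $\mathrm{span}(u_k, \ldots, u_n)$ attains it:

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 6, 3
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                  # random symmetric matrix (illustrative)
lams, U = np.linalg.eigh(A)

def min_ratio_on(W):
    """Min of the Rayleigh ratio over the subspace with orthonormal basis W."""
    return np.linalg.eigvalsh(W.T @ A @ W)[0]

d = n - k + 1
# Random (n-k+1)-dimensional subspaces never give a min above lambda_k ...
for _ in range(200):
    W, _ = np.linalg.qr(rng.standard_normal((n, d)))
    assert min_ratio_on(W) <= lams[k - 1] + 1e-10
# ... and W = span(u_k, ..., u_n) attains lambda_k exactly.
assert abs(min_ratio_on(U[:, k - 1:]) - lams[k - 1]) < 1e-10
```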
Nikhil Bansal, Avrim Blum, and Shuchi Chawla. Correlation clustering. Machine Learning, 56:89–113, 2004.
Mikhail Belkin and Partha Niyogi. Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation, 15:1373–1396, 2003.
Fan R. K. Chung. Spectral Graph Theory, volume 92 of Regional Conference Series in Mathematics. AMS, first edition, 1997.
Eric D. Demaine and Nicole Immorlica. Correlation clustering with partial information. In S. Arora et al., editor, Working Notes of the 6th International Workshop on Approximation Algorithms for Combinatorial Problems, LNCS Vol. 2764, pages 1–13. Springer, 2003.
Jean H. Gallier. Discrete Mathematics. Universitext. Springer Verlag, first edition, 2011.
Jean H. Gallier. Geometric Methods and Applications, For Computer Science and Engineering. TAM, Vol. 38. Springer, second edition, 2011.
Chris Godsil and Gordon Royle. Algebraic Graph Theory. GTM No. 207. Springer Verlag, first edition, 2001.
Gene H. Golub and Charles F. Van Loan. Matrix Computations. The Johns Hopkins University Press, third edition, 1996.
Frank Harary. On the notion of balance of a signed graph. Michigan Math. J., 2(2):143–146, 1953.
Jao Ping Hou. Bounds for the least Laplacian eigenvalue of a signed graph. Acta Mathematica Sinica, 21(4):955–960, 2005.
Ravikrishna Kolluri, Jonathan R. Shewchuk, and James F. O'Brien. Spectral surface reconstruction from noisy point clouds. In Symposium on Geometry Processing, pages 11–21. ACM Press, July 2004.
Jérôme Kunegis, Stephan Schmidt, Andreas Lommatzsch, Jürgen Lerner, Ernesto William De Luca, and Sahin Albayrak. Spectral analysis of signed graphs for clustering, prediction and visualization. In
Jianbo Shi and Jitendra Malik. Normalized cuts and image segmentation. Transactions on Pattern Analysis and Machine Intelligence, 22(8):888–905, 2000.
Daniel Spielman. Spectral graph theory. In Uwe Naumann and Olaf Schenk, editors, Combinatorial Scientific Computing. CRC Press, 2012.
Ulrike von Luxburg. A tutorial on spectral clustering. Statistics and Computing, 17(4):395–416, 2007.
Stella X. Yu. Computational Models of Perceptual Organization. PhD thesis, Carnegie Mellon University, Pittsburgh, PA 15213, USA, 2003.
Stella X. Yu and Jianbo Shi. Multiclass spectral clustering. In 9th International Conference on Computer Vision, Nice, France, October 13–16. IEEE, 2003.