
4.3 General Projection Methods

4.3.1 Orthogonal Projection Methods

Let $A$ be an $n \times n$ complex matrix and $\mathcal{K}$ be an $m$-dimensional subspace of $\mathbb{C}^n$. As a notational convention we will denote by the same symbol $A$ the matrix and the linear mapping in $\mathbb{C}^n$ that it represents. We consider the eigenvalue problem:

find $u$ belonging to $\mathbb{C}^n$ and $\lambda$ belonging to $\mathbb{C}$ such that

$$Au = \lambda u. \tag{4.16}$$

An orthogonal projection technique onto the subspace $\mathcal{K}$ seeks an approximate eigenpair $\tilde{\lambda}, \tilde{u}$ to the above problem, with $\tilde{\lambda}$ in $\mathbb{C}$ and $\tilde{u}$ in $\mathcal{K}$, such that the following Galerkin condition is satisfied:

$$A\tilde{u} - \tilde{\lambda}\tilde{u} \perp \mathcal{K}, \tag{4.17}$$

or, equivalently,

$$(A\tilde{u} - \tilde{\lambda}\tilde{u}, v) = 0, \quad \forall v \in \mathcal{K}. \tag{4.18}$$

Assume that some orthonormal basis $\{v_1, v_2, \ldots, v_m\}$ of $\mathcal{K}$ is available and denote by $V$ the matrix with column vectors $v_1, v_2, \ldots, v_m$. Then we can solve the approximate problem numerically by translating it into this basis. Letting

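$$\tilde{u} = V y, \tag{4.19}$$

equation (4.18) becomes

$$(AVy - \tilde{\lambda} V y, v_j) = 0, \quad j = 1, \ldots, m.$$

Therefore, $y$ and $\tilde{\lambda}$ must satisfy

$$B_m y = \tilde{\lambda} y \tag{4.20}$$

with

$$B_m = V^H A V.$$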

If we denote by $A_m$ the linear transformation of rank $m$ defined by $A_m = \mathcal{P}_{\mathcal{K}} A \mathcal{P}_{\mathcal{K}}$, then we observe that the restriction of this operator to the subspace $\mathcal{K}$ is represented by the matrix $B_m$ with respect to the basis $V$. The following is a procedure for computing numerically the Galerkin approximations to the eigenvalues/eigenvectors of $A$, known as the Rayleigh-Ritz procedure.

ALGORITHM 4.5 Rayleigh-Ritz Procedure:

1. Compute an orthonormal basis $\{v_i\}_{i=1,\ldots,m}$ of the subspace $\mathcal{K}$. Let $V = [v_1, v_2, \ldots, v_m]$.

2. Compute $B_m = V^H A V$.

3. Compute the eigenvalues of $B_m$ and select the $k$ desired ones $\tilde{\lambda}_i$, $i = 1, 2, \ldots, k$, where $k \le m$.

4. Compute the eigenvectors $y_i$, $i = 1, \ldots, k$, of $B_m$ associated with $\tilde{\lambda}_i$, $i = 1, \ldots, k$, and the corresponding approximate eigenvectors of $A$, $\tilde{u}_i = V y_i$, $i = 1, \ldots, k$.

The above process only requires basic linear algebra computations. The numerical solution of the $m \times m$ eigenvalue problem in steps 3 and 4 can be treated by standard library subroutines such as those in EISPACK. Another important note is that in step 4 one can replace eigenvectors by Schur vectors to get approximate Schur vectors $\tilde{u}_i$ instead of approximate eigenvectors. Schur vectors $y_i$ can be obtained in a numerically stable way and, in general, eigenvectors are more sensitive to rounding errors than are Schur vectors.
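The four steps of the Rayleigh-Ritz procedure can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `rayleigh_ritz`, the random test matrix, and the rule "keep the `k` Ritz values of largest modulus" are choices made here, not part of the text, and NumPy's LAPACK-based `eig` stands in for the EISPACK routines mentioned above.

```python
import numpy as np

def rayleigh_ritz(A, S, k):
    """Rayleigh-Ritz sketch; the columns of S (n x m) are assumed to span K."""
    # Step 1: orthonormal basis of K from a reduced QR factorization.
    V, _ = np.linalg.qr(S)
    # Step 2: projected m x m matrix B_m = V^H A V.
    B = V.conj().T @ A @ V
    # Step 3: eigenvalues of B_m; keep the k of largest modulus (arbitrary choice).
    theta, Y = np.linalg.eig(B)
    idx = np.argsort(-np.abs(theta))[:k]
    # Step 4: approximate eigenvectors of A are u_i = V y_i.
    return theta[idx], V @ Y[:, idx], V

rng = np.random.default_rng(0)
n, m = 50, 8
A = rng.standard_normal((n, n))          # hypothetical test matrix
S = rng.standard_normal((n, m))          # hypothetical spanning set for K
ritz_vals, ritz_vecs, V = rayleigh_ritz(A, S, k=3)

# Galerkin condition (4.17): every residual is orthogonal to K, i.e. V^H r = 0.
R = A @ ritz_vecs - ritz_vecs * ritz_vals
galerkin_defect = np.linalg.norm(V.conj().T @ R)
```

The quantity `galerkin_defect` should be at rounding level, which is exactly the Galerkin condition expressed in the basis $V$. The residuals themselves are generally not small for a random subspace.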

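We can reformulate orthogonal projection methods in terms of projection operators as follows. Defining $\mathcal{P}_{\mathcal{K}}$ to be the orthogonal projector onto the subspace $\mathcal{K}$, the Galerkin condition (4.17) can be rewritten as

$$\mathcal{P}_{\mathcal{K}}(A\tilde{u} - \tilde{\lambda}\tilde{u}) = 0, \quad \tilde{\lambda} \in \mathbb{C}, \; \tilde{u} \in \mathcal{K},$$

or,

$$\mathcal{P}_{\mathcal{K}} A \tilde{u} = \tilde{\lambda}\tilde{u}, \quad \tilde{\lambda} \in \mathbb{C}, \; \tilde{u} \in \mathcal{K}. \tag{4.21}$$

Note that we have replaced the original problem (4.16) by an eigenvalue problem for the linear transformation $\mathcal{P}_{\mathcal{K}} A_{|\mathcal{K}}$, which is from $\mathcal{K}$ to $\mathcal{K}$. Another formulation of the above equation is

$$\mathcal{P}_{\mathcal{K}} A \mathcal{P}_{\mathcal{K}} \tilde{u} = \tilde{\lambda}\tilde{u}, \quad \tilde{\lambda} \in \mathbb{C}, \; \tilde{u} \in \mathbb{C}^n, \tag{4.22}$$

which involves the natural extension

$$A_m = \mathcal{P}_{\mathcal{K}} A \mathcal{P}_{\mathcal{K}}$$

of the linear operator $A'_m = \mathcal{P}_{\mathcal{K}} A_{|\mathcal{K}}$ to the whole space. In addition to the eigenvalues and eigenvectors of $A'_m$, $A_m$ has zero as a trivial eigenvalue, with every vector of the orthogonal complement of $\mathcal{K}$ being an eigenvector. Equation (4.21) will be referred to as the Galerkin approximate problem.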
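As a small numerical illustration of the relation between $A_m$ and $B_m$, the following sketch (assuming NumPy, with random complex test data and variable names chosen here for illustration) compares the spectrum of $\mathcal{P}_{\mathcal{K}} A \mathcal{P}_{\mathcal{K}}$ with that of $V^H A V$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 12, 4
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
S = rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))
V, _ = np.linalg.qr(S)            # orthonormal basis of K
P = V @ V.conj().T                # orthogonal projector onto K

B = V.conj().T @ A @ V            # m x m matrix B_m
Am = P @ A @ P                    # n x n extension A_m

eig_B = np.linalg.eigvals(B)
eig_Am = np.linalg.eigvals(Am)

# Largest distance from an eigenvalue of B_m to the spectrum of A_m.
match = max(np.min(np.abs(eig_Am - t)) for t in eig_B)
# Number of eigenvalues of A_m that vanish up to rounding.
num_zero = int(np.sum(np.abs(eig_Am) < 1e-8))
```

Under these assumptions `match` is at rounding level and `num_zero` equals $n - m$, the dimension of the orthogonal complement of $\mathcal{K}$.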

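The following proposition examines what happens in the particular case when the subspace $\mathcal{K}$ is invariant under $A$.

Proposition 4.3 If $\mathcal{K}$ is invariant under $A$ then every approximate eigenvalue / (right) eigenvector pair obtained from the orthogonal projection method onto $\mathcal{K}$ is exact.

Proof. An approximate eigenpair $\tilde{\lambda}, \tilde{u}$ is defined by

$$\mathcal{P}_{\mathcal{K}}(A\tilde{u} - \tilde{\lambda}\tilde{u}) = 0,$$

where $\tilde{u}$ is a nonzero vector in $\mathcal{K}$ and $\tilde{\lambda} \in \mathbb{C}$. If $\mathcal{K}$ is invariant under $A$ then $A\tilde{u}$ belongs to $\mathcal{K}$ and therefore $\mathcal{P}_{\mathcal{K}} A\tilde{u} = A\tilde{u}$. Then the above equation becomes

$$A\tilde{u} - \tilde{\lambda}\tilde{u} = 0,$$

showing that the pair $\tilde{\lambda}, \tilde{u}$ is exact.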
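Proposition 4.3 is easy to observe numerically. The sketch below (NumPy assumed; the construction $A = Q T Q^T$ with $T$ upper triangular is a device chosen here so that the first $m$ columns of $Q$ span an invariant subspace) checks that the Ritz pairs have residuals at rounding level.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 10, 3
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))   # random orthogonal matrix
T = np.triu(rng.standard_normal((n, n)))           # upper triangular factor
A = Q @ T @ Q.T          # A Q = Q T, so span(Q[:, :m]) is invariant under A
V = Q[:, :m]             # orthonormal basis of the invariant subspace K

B = V.T @ A @ V          # B_m, equal to T[:m, :m] up to rounding
theta, Y = np.linalg.eig(B)
U = V @ Y                # approximate eigenvectors u_i = V y_i

# Residual norms ||A u_i - theta_i u_i||_2 of the Ritz pairs.
max_residual = np.linalg.norm(A @ U - U * theta, axis=0).max()
```

Replacing `V` by an orthonormal basis of a subspace that is not invariant would generally produce residuals of order one.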

An important quantity for the convergence properties of projection methods is the distance $\|(I - \mathcal{P}_{\mathcal{K}})u\|_2$ of the exact eigenvector $u$, supposed of norm 1, from the subspace $\mathcal{K}$. This quantity plays a key role in the analysis of projection methods. First, it is clear that the eigenvector $u$ cannot be well approximated from $\mathcal{K}$ if $\|(I - \mathcal{P}_{\mathcal{K}})u\|_2$ is not small, because we have

$$\|\tilde{u} - u\|_2 \ge \|(I - \mathcal{P}_{\mathcal{K}})u\|_2.$$

The fundamental quantity $\|(I - \mathcal{P}_{\mathcal{K}})u\|_2$ can also be interpreted as the sine of the acute angle between the eigenvector $u$ and the subspace $\mathcal{K}$. It is also the gap between the space $\mathcal{K}$ and the linear span of $u$. The following theorem establishes an upper bound for the residual norm of the exact eigenpair with respect to the approximate operator $A_m$, using this angle.
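The distance $\|(I - \mathcal{P}_{\mathcal{K}})u\|_2$ can be computed directly from an orthonormal basis of $\mathcal{K}$. The following sketch (NumPy assumed; a random unit vector stands in for the eigenvector $u$, and all names are illustrative) also samples elements of $\mathcal{K}$ to illustrate that none of them is closer to $u$ than $\mathcal{P}_{\mathcal{K}}u$.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 20, 5
V, _ = np.linalg.qr(rng.standard_normal((n, m)))   # orthonormal basis of K
u = rng.standard_normal(n)
u /= np.linalg.norm(u)                 # stand-in for a unit eigenvector

Pu = V @ (V.T @ u)                     # orthogonal projection of u onto K
sin_theta = np.linalg.norm(u - Pu)     # distance ||(I - P_K) u||_2
cos_theta = np.linalg.norm(Pu)         # ||P_K u||_2

# Distances from u to 1000 randomly drawn elements of K.
trial = V @ rng.standard_normal((m, 1000))
closest_trial = np.linalg.norm(trial - u[:, None], axis=0).min()
```

Since $u$ has unit norm, `sin_theta**2 + cos_theta**2` equals one up to rounding, which is the sine/cosine interpretation of the two projected components.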

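Theorem 4.3 Let $\gamma = \|\mathcal{P}_{\mathcal{K}} A (I - \mathcal{P}_{\mathcal{K}})\|_2$. Then the residual norms of the pairs $\lambda, \mathcal{P}_{\mathcal{K}}u$ and $\lambda, u$ for the linear operator $A_m$ satisfy respectively

$$\|(A_m - \lambda I)\mathcal{P}_{\mathcal{K}}u\|_2 \le \gamma \|(I - \mathcal{P}_{\mathcal{K}})u\|_2, \tag{4.23}$$

$$\|(A_m - \lambda I)u\|_2 \le \sqrt{|\lambda|^2 + \gamma^2}\; \|(I - \mathcal{P}_{\mathcal{K}})u\|_2. \tag{4.24}$$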

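Proof. For the first inequality we use the definition of $A_m$ to get

$$\begin{aligned}
\|(A_m - \lambda I)\mathcal{P}_{\mathcal{K}}u\|_2 &= \|\mathcal{P}_{\mathcal{K}}(A - \lambda I)(u - (I - \mathcal{P}_{\mathcal{K}})u)\|_2 \\
&= \|\mathcal{P}_{\mathcal{K}}(A - \lambda I)(I - \mathcal{P}_{\mathcal{K}})u\|_2 \\
&= \|\mathcal{P}_{\mathcal{K}}(A - \lambda I)(I - \mathcal{P}_{\mathcal{K}})(I - \mathcal{P}_{\mathcal{K}})u\|_2 \\
&\le \gamma \|(I - \mathcal{P}_{\mathcal{K}})u\|_2.
\end{aligned}$$

As for the second inequality we simply notice that

$$\begin{aligned}
(A_m - \lambda I)u &= (A_m - \lambda I)\mathcal{P}_{\mathcal{K}}u + (A_m - \lambda I)(I - \mathcal{P}_{\mathcal{K}})u \\
&= (A_m - \lambda I)\mathcal{P}_{\mathcal{K}}u - \lambda (I - \mathcal{P}_{\mathcal{K}})u.
\end{aligned}$$

Using the previous inequality and the fact that the two vectors on the right-hand side are orthogonal to each other we get

$$\begin{aligned}
\|(A_m - \lambda I)u\|_2^2 &= \|(A_m - \lambda I)\mathcal{P}_{\mathcal{K}}u\|_2^2 + |\lambda|^2 \|(I - \mathcal{P}_{\mathcal{K}})u\|_2^2 \\
&\le (\gamma^2 + |\lambda|^2)\|(I - \mathcal{P}_{\mathcal{K}})u\|_2^2,
\end{aligned}$$

which completes the proof.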
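The two bounds of Theorem 4.3 can be checked on a small example. The sketch below assumes NumPy and a random non-Hermitian test matrix; the variable names (`lhs_23`, `rhs_24`, and so on) are introduced here only to mirror the equation numbers. It relies on `np.linalg.eig` returning eigenvectors of unit 2-norm.

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 30, 6
A = rng.standard_normal((n, n))
V, _ = np.linalg.qr(rng.standard_normal((n, m)))   # orthonormal basis of K
I = np.eye(n)
P = V @ V.T                       # orthogonal projector onto K
Am = P @ A @ P                    # A_m = P_K A P_K
gamma = np.linalg.norm(P @ A @ (I - P), 2)

lam_all, X = np.linalg.eig(A)
lam, u = lam_all[0], X[:, 0]      # one exact eigenpair, ||u||_2 = 1
dist = np.linalg.norm((I - P) @ u)

lhs_23 = np.linalg.norm((Am - lam * I) @ (P @ u))   # left side of (4.23)
rhs_23 = gamma * dist
lhs_24 = np.linalg.norm((Am - lam * I) @ u)         # left side of (4.24)
rhs_24 = np.sqrt(abs(lam) ** 2 + gamma ** 2) * dist
```

For a random subspace `dist` is close to one, so both bounds are loose; they become informative when $\mathcal{K}$ nearly contains $u$.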

Note that $\gamma$ is bounded from above by $\|A\|_2$. A good approximation can therefore be achieved by the projection method in case the distance $\|(I - \mathcal{P}_{\mathcal{K}})u\|_2$ is small, provided the approximate eigenproblem is well conditioned. Unfortunately, in contrast with the Hermitian case, the fact that the residual norm is small does not in any way guarantee that the eigenpair is accurate, because of potential difficulties related to the conditioning of the eigenvalue.

If we translate the inequality (4.23) into matrix form by expressing everything in an orthonormal basis $V$ of $\mathcal{K}$, we would write $\mathcal{P}_{\mathcal{K}} = V V^H$ and immediately obtain

$$\|(V^H A V - \lambda I)V^H u\|_2 \le \gamma \|(I - V V^H)u\|_2,$$

which shows that $\lambda$ can be considered as an approximate eigenvalue for $B_m = V^H A V$ with residual of the order of $(I - \mathcal{P}_{\mathcal{K}})u$. If we scale the vector $V^H u$ to make it of 2-norm unity, and denote the result by $y_u$, we can rewrite the above inequality as

$$\|(V^H A V - \lambda I)y_u\|_2 \le \gamma \frac{\|(I - \mathcal{P}_{\mathcal{K}})u\|_2}{\|\mathcal{P}_{\mathcal{K}}u\|_2} \equiv \gamma \tan\theta(u, \mathcal{K}).$$

The above inequality gives a more explicit relation between the residual norm and the angle between $u$ and the subspace $\mathcal{K}$.

4.3.2 The Hermitian Case

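The approximate eigenvalues computed from orthogonal projection methods in the particular case where the matrix $A$ is Hermitian satisfy strong optimality properties which follow from the Min-Max principle and the Courant characterization seen in Chapter 1. These properties follow by observing that $(A_m x, x)$ is the same as $(Ax, x)$ when $x$ runs in the subspace $\mathcal{K}$. Thus, if we label the eigenvalues decreasingly, i.e., $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$, we have

$$\tilde{\lambda}_1 = \max_{x \in \mathcal{K},\, x \ne 0} \frac{(\mathcal{P}_{\mathcal{K}} A \mathcal{P}_{\mathcal{K}} x, x)}{(x, x)} = \max_{x \in \mathcal{K},\, x \ne 0} \frac{(\mathcal{P}_{\mathcal{K}} A x, \mathcal{P}_{\mathcal{K}} x)}{(x, x)} = \max_{x \in \mathcal{K},\, x \ne 0} \frac{(Ax, x)}{(x, x)}. \tag{4.25}$$

This is because $\mathcal{P}_{\mathcal{K}} x = x$ for any element in $\mathcal{K}$. Similarly, we can show that

$$\tilde{\lambda}_m = \min_{x \in \mathcal{K},\, x \ne 0} \frac{(Ax, x)}{(x, x)}.$$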

More generally, we have the following result.

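Proposition 4.4 The $i$-th largest approximate eigenvalue of a Hermitian matrix $A$, obtained from an orthogonal projection method onto a subspace $\mathcal{K}$, satisfies

$$\tilde{\lambda}_i = \max_{\substack{S \subseteq \mathcal{K} \\ \dim(S) = i}} \; \min_{x \in S,\, x \ne 0} \frac{(Ax, x)}{(x, x)}. \tag{4.26}$$

As an immediate consequence we obtain the following corollary.

Corollary 4.1 For $i = 1, 2, \ldots, m$ the following inequality holds:

$$\lambda_i \ge \tilde{\lambda}_i. \tag{4.27}$$

Proof. This is because

$$\tilde{\lambda}_i = \max_{\substack{S \subseteq \mathcal{K} \\ \dim(S) = i}} \; \min_{x \in S,\, x \ne 0} \frac{(Ax, x)}{(x, x)} \le \max_{\substack{S \subseteq \mathbb{C}^n \\ \dim(S) = i}} \; \min_{x \in S,\, x \ne 0} \frac{(Ax, x)}{(x, x)} = \lambda_i.$$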
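Corollary 4.1 can be observed on a random Hermitian matrix. The sketch below assumes NumPy; the test matrix and subspace are random and the variable names are illustrative. Note that `eigvalsh` returns eigenvalues in ascending order, so they are reversed to match the decreasing labeling used in the text.

```python
import numpy as np

rng = np.random.default_rng(5)
n, m = 40, 7
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (M + M.conj().T) / 2                      # Hermitian test matrix
S = rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))
V, _ = np.linalg.qr(S)                        # orthonormal basis of K
B = V.conj().T @ A @ V                        # B_m, Hermitian as well

lam = np.linalg.eigvalsh(A)[::-1]             # lambda_1 >= ... >= lambda_n
ritz = np.linalg.eigvalsh(B)[::-1]            # approximate eigenvalues, decreasing

gaps = lam[:m] - ritz                         # lambda_i - tilde(lambda)_i, i = 1..m
```

All entries of `gaps` are nonnegative up to rounding: each approximate eigenvalue approaches the corresponding exact one from below.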

A similar argument based on the Courant characterization results in the following theorem.

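Theorem 4.4 The approximate eigenvalue $\tilde{\lambda}_i$ and the corresponding eigenvector $\tilde{u}_i$ are such that

$$\tilde{\lambda}_1 = \frac{(A\tilde{u}_1, \tilde{u}_1)}{(\tilde{u}_1, \tilde{u}_1)} = \max_{x \in \mathcal{K},\, x \ne 0} \frac{(Ax, x)}{(x, x)},$$

and for $i > 1$:

$$\tilde{\lambda}_i = \frac{(A\tilde{u}_i, \tilde{u}_i)}{(\tilde{u}_i, \tilde{u}_i)} = \max_{\substack{x \in \mathcal{K},\, x \ne 0 \\ \tilde{u}_1^H x = \cdots = \tilde{u}_{i-1}^H x = 0}} \frac{(Ax, x)}{(x, x)}. \tag{4.28}$$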

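One may suspect that the general bounds seen earlier for non-Hermitian matrices may be improved for the Hermitian case. This is indeed the case. We begin by proving the following lemma.

Lemma 4.1 Let $A$ be a Hermitian matrix and $u$ an eigenvector of $A$ associated with the eigenvalue $\lambda$. Then the Rayleigh quotient $\mu \equiv \mu_A(\mathcal{P}_{\mathcal{K}}u)$ satisfies the inequality

$$|\lambda - \mu| \le \|A - \lambda I\|_2 \frac{\|(I - \mathcal{P}_{\mathcal{K}})u\|_2^2}{\|\mathcal{P}_{\mathcal{K}}u\|_2^2}. \tag{4.29}$$

Proof. From the equality

$$(A - \lambda I)\mathcal{P}_{\mathcal{K}}u = (A - \lambda I)(u - (I - \mathcal{P}_{\mathcal{K}})u) = -(A - \lambda I)(I - \mathcal{P}_{\mathcal{K}})u$$

and the fact that $A$ is Hermitian we get

$$|\lambda - \mu| = \left| \frac{((A - \lambda I)\mathcal{P}_{\mathcal{K}}u, \mathcal{P}_{\mathcal{K}}u)}{(\mathcal{P}_{\mathcal{K}}u, \mathcal{P}_{\mathcal{K}}u)} \right| = \left| \frac{((A - \lambda I)(I - \mathcal{P}_{\mathcal{K}})u, (I - \mathcal{P}_{\mathcal{K}})u)}{(\mathcal{P}_{\mathcal{K}}u, \mathcal{P}_{\mathcal{K}}u)} \right|.$$

The result follows from a direct application of the Cauchy-Schwarz inequality.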
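The quadratic dependence on $\|(I - \mathcal{P}_{\mathcal{K}})u\|_2$ in Lemma 4.1 can be seen on a small real symmetric example. In this sketch (NumPy assumed) the subspace is built, as an illustrative choice, to contain a perturbed copy of the eigenvector so that the angle is moderately small; the perturbation size `0.1` is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(6)
n, m = 30, 6
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                            # real symmetric test matrix
lam_all, U = np.linalg.eigh(A)
lam, u = lam_all[-1], U[:, -1]               # largest eigenvalue, unit eigenvector

S = rng.standard_normal((n, m))
S[:, 0] = u + 0.1 * rng.standard_normal(n)   # K contains a perturbed copy of u
V, _ = np.linalg.qr(S)

Pu = V @ (V.T @ u)                           # P_K u
mu = (Pu @ A @ Pu) / (Pu @ Pu)               # Rayleigh quotient of P_K u
error = abs(lam - mu)
bound = (np.linalg.norm(A - lam * np.eye(n), 2)
         * np.linalg.norm(u - Pu) ** 2 / np.linalg.norm(Pu) ** 2)   # (4.29)
```

Shrinking the perturbation by a factor of ten should reduce `bound` by roughly a factor of one hundred, reflecting the squared sine in (4.29).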

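Assuming as usual that the eigenvalues are labeled decreasingly, and letting $\mu_1 = \mu_A(\mathcal{P}_{\mathcal{K}}u_1)$, we can get from (4.25) that

$$0 \le \lambda_1 - \tilde{\lambda}_1 \le \lambda_1 - \mu_1 \le \|A - \lambda_1 I\|_2 \frac{\|(I - \mathcal{P}_{\mathcal{K}})u_1\|_2^2}{\|\mathcal{P}_{\mathcal{K}}u_1\|_2^2}.$$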

A similar result can be shown for the smallest eigenvalue. We can extend this inequality to the other eigenvalues at the price of a little complication in the equations. In what follows we will denote by $\tilde{Q}_i$ the sum of the spectral projectors associated with the approximate eigenvalues $\tilde{\lambda}_1, \tilde{\lambda}_2, \ldots, \tilde{\lambda}_{i-1}$. For any given vector $x$, $(I - \tilde{Q}_i)x$ will be the vector obtained by orthogonalizing $x$ against the first $i-1$ approximate eigenvectors. We consider a candidate vector of the form $(I - \tilde{Q}_i)\mathcal{P}_{\mathcal{K}}u_i$ in an attempt to use an argument similar to the one for the largest eigenvalue. This is a vector obtained by projecting $u_i$ onto the subspace $\mathcal{K}$ and then stripping it of its components in the first $i-1$ approximate eigenvectors.

Lemma 4.2 Let $\tilde{Q}_i$ be the sum of the spectral projectors associated with the approximate eigenvalues $\tilde{\lambda}_1, \tilde{\lambda}_2, \ldots, \tilde{\lambda}_{i-1}$ and define $\mu_i = \mu_A(x_i)$, where

$$x_i = \frac{(I - \tilde{Q}_i)\mathcal{P}_{\mathcal{K}}u_i}{\|(I - \tilde{Q}_i)\mathcal{P}_{\mathcal{K}}u_i\|_2}.$$

Then

$$|\lambda_i - \mu_i| \le \|A - \lambda_i I\|_2 \frac{\|\tilde{Q}_i u_i\|_2^2 + \|(I - \mathcal{P}_{\mathcal{K}})u_i\|_2^2}{\|(I - \tilde{Q}_i)\mathcal{P}_{\mathcal{K}}u_i\|_2^2}. \tag{4.30}$$

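Proof. To simplify notation we set $\alpha = 1/\|(I - \tilde{Q}_i)\mathcal{P}_{\mathcal{K}}u_i\|_2$. Then we write

$$(A - \lambda_i I)x_i = (A - \lambda_i I)(x_i - \alpha u_i),$$

and proceed as in the previous case to get

$$|\lambda_i - \mu_i| = |((A - \lambda_i I)x_i, x_i)| = |((A - \lambda_i I)(x_i - \alpha u_i), (x_i - \alpha u_i))|.$$

Applying the Cauchy-Schwarz inequality to the above equation, we get

$$|\lambda_i - \mu_i| \le \|A - \lambda_i I\|_2 \|x_i - \alpha u_i\|_2^2.$$

We can rewrite $\|x_i - \alpha u_i\|_2^2$ as

$$\begin{aligned}
\|x_i - \alpha u_i\|_2^2 &= \alpha^2 \|(I - \tilde{Q}_i)\mathcal{P}_{\mathcal{K}}u_i - u_i\|_2^2 \\
&= \alpha^2 \|(I - \tilde{Q}_i)(\mathcal{P}_{\mathcal{K}}u_i - u_i) - \tilde{Q}_i u_i\|_2^2.
\end{aligned}$$

Using the orthogonality of the two vectors inside the norm bars, this equality becomes

$$\begin{aligned}
\|x_i - \alpha u_i\|_2^2 &= \alpha^2 \left( \|(I - \tilde{Q}_i)(\mathcal{P}_{\mathcal{K}}u_i - u_i)\|_2^2 + \|\tilde{Q}_i u_i\|_2^2 \right) \\
&\le \alpha^2 \left( \|(I - \mathcal{P}_{\mathcal{K}})u_i\|_2^2 + \|\tilde{Q}_i u_i\|_2^2 \right).
\end{aligned}$$

This establishes the desired result.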

The vector $x_i$ has been constructed in such a way that it is orthogonal to all previous approximate eigenvectors $\tilde{u}_1, \ldots, \tilde{u}_{i-1}$. We can therefore exploit the Courant characterization (4.28) to prove the following result.

Theorem 4.5 Let $\tilde{Q}_i$ be the sum of the spectral projectors associated with the approximate eigenvalues $\tilde{\lambda}_1, \tilde{\lambda}_2, \ldots, \tilde{\lambda}_{i-1}$. Then the error between the $i$-th exact and approximate eigenvalues $\lambda_i$ and $\tilde{\lambda}_i$ is such that

$$0 \le \lambda_i - \tilde{\lambda}_i \le \|A - \lambda_i I\|_2 \frac{\|\tilde{Q}_i u_i\|_2^2 + \|(I - \mathcal{P}_{\mathcal{K}})u_i\|_2^2}{\|(I - \tilde{Q}_i)\mathcal{P}_{\mathcal{K}}u_i\|_2^2}. \tag{4.31}$$

Proof. By (4.28) and the fact that $x_i$ belongs to $\mathcal{K}$ and is orthogonal to the first $i-1$ approximate eigenvectors we immediately get

$$0 \le \lambda_i - \tilde{\lambda}_i \le \lambda_i - \mu_i.$$

The result follows from the previous lemma.
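The bound (4.31) can be evaluated for every $i = 1, \ldots, m$ on a test problem. The following is a sketch under assumptions made here: NumPy, a random real symmetric matrix, a random subspace, and a helper `bound_431` named after the equation number. For $i = 1$ the projector $\tilde{Q}_1$ is the zero matrix, which the empty slice reproduces.

```python
import numpy as np

rng = np.random.default_rng(7)
n, m = 30, 8
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                              # real symmetric test matrix
V, _ = np.linalg.qr(rng.standard_normal((n, m)))
I = np.eye(n)
P = V @ V.T                                    # orthogonal projector onto K

lam, U = np.linalg.eigh(A)
lam, U = lam[::-1], U[:, ::-1]                 # exact pairs, decreasing order
ritz, Y = np.linalg.eigh(V.T @ A @ V)
ritz, Ut = ritz[::-1], (V @ Y)[:, ::-1]        # approximate pairs, decreasing order

def bound_431(i):
    """Right-hand side of (4.31) for the i-th eigenvalue (i is 1-based)."""
    Q = Ut[:, :i - 1] @ Ut[:, :i - 1].T        # projector onto first i-1 Ritz vectors
    u = U[:, i - 1]
    num = np.linalg.norm(Q @ u) ** 2 + np.linalg.norm((I - P) @ u) ** 2
    den = np.linalg.norm((I - Q) @ (P @ u)) ** 2
    return np.linalg.norm(A - lam[i - 1] * I, 2) * num / den

errors = np.array([lam[i - 1] - ritz[i - 1] for i in range(1, m + 1)])
bounds = np.array([bound_431(i) for i in range(1, m + 1)])
```

With a random subspace the denominators can be small and the bounds correspondingly pessimistic; the inequality $0 \le \lambda_i - \tilde{\lambda}_i \le$ `bounds[i-1]` should nevertheless hold for each $i$.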

We point out that the above result is valid for $i = 1$, provided we define $\tilde{Q}_1 = 0$. The quantity $\|\tilde{Q}_i u_i\|_2$ represents the cosine of the acute angle between $u_i$ and the span of the previous approximate eigenvectors. In the ideal situation this should be zero. In addition, we should mention that the error bound is semi-a-priori, since it requires knowledge of the previous eigenvectors in order to get an idea of the quantity $\|\tilde{Q}_i u_i\|_2$.

We now turn our attention to the eigenvectors.

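Theorem 4.6 Let $\gamma = \|\mathcal{P}_{\mathcal{K}} A (I - \mathcal{P}_{\mathcal{K}})\|_2$, and consider any eigenvalue $\lambda$ of $A$ with associated eigenvector $u$. Let $\tilde{\lambda}$ be the approximate eigenvalue closest to $\lambda$ and $\delta$ the distance between $\lambda$ and the set of approximate eigenvalues other than $\tilde{\lambda}$. Then there exists an approximate eigenvector $\tilde{u}$ associated with $\tilde{\lambda}$ such that

$$\sin[\theta(u, \tilde{u})] \le \sqrt{1 + \frac{\gamma^2}{\delta^2}}\; \sin[\theta(u, \mathcal{K})]. \tag{4.32}$$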

Proof.

[Figure 4.1: Projections of the eigenvector $u$ onto $\mathcal{K}$ and then onto $\tilde{u}$.]

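Let us define the two vectors

$$v = \frac{\mathcal{P}_{\mathcal{K}}u}{\|\mathcal{P}_{\mathcal{K}}u\|_2} \quad \text{and} \quad w = \frac{(I - \mathcal{P}_{\mathcal{K}})u}{\|(I - \mathcal{P}_{\mathcal{K}})u\|_2} \tag{4.33}$$

and denote by $\phi$ the angle between $u$ and $\mathcal{P}_{\mathcal{K}}u$, as defined by $\cos\phi = \|\mathcal{P}_{\mathcal{K}}u\|_2$. Then, clearly

$$u = v\cos\phi + w\sin\phi,$$

which, upon multiplying both sides by $(A - \lambda I)$, leads to

$$(A - \lambda I)v\cos\phi + (A - \lambda I)w\sin\phi = 0.$$

We now project both sides onto $\mathcal{K}$, and take the norms of the resulting vectors to obtain

$$\|\mathcal{P}_{\mathcal{K}}(A - \lambda I)v\|_2 \cos\phi = \|\mathcal{P}_{\mathcal{K}}(A - \lambda I)w\|_2 \sin\phi. \tag{4.34}$$

For the right-hand side note that

$$\|\mathcal{P}_{\mathcal{K}}(A - \lambda I)w\|_2 = \|\mathcal{P}_{\mathcal{K}}(A - \lambda I)(I - \mathcal{P}_{\mathcal{K}})w\|_2 = \|\mathcal{P}_{\mathcal{K}} A (I - \mathcal{P}_{\mathcal{K}})w\|_2 \le \gamma. \tag{4.35}$$

For the left-hand side, we decompose $v$ further as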

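$$v = \tilde{u}\cos\omega + z\sin\omega,$$

in which $\tilde{u}$ is a unit vector from the eigenspace associated with $\tilde{\lambda}$, $z$ is a unit vector in $\mathcal{K}$ that is orthogonal to $\tilde{u}$, and $\omega$ is the acute angle between $v$ and $\tilde{u}$. We then obtain

$$\begin{aligned}
\mathcal{P}_{\mathcal{K}}(A - \lambda I)v &= \mathcal{P}_{\mathcal{K}}(A - \lambda I)[\cos\omega\, \tilde{u} + \sin\omega\, z] \\
&= \tilde{u}\,(\tilde{\lambda} - \lambda)\cos\omega + \mathcal{P}_{\mathcal{K}}(A - \lambda I)z\,\sin\omega.
\end{aligned} \tag{4.36}$$

The eigenvalues of the restriction of $\mathcal{P}_{\mathcal{K}}(A - \lambda I)$ to the orthogonal of $\tilde{u}$ are $\tilde{\lambda}_j - \lambda$, for $j = 1, 2, \ldots, m$ and $\tilde{\lambda}_j \ne \tilde{\lambda}$. Therefore, since $z$ is orthogonal to $\tilde{u}$, we have

$$\|\mathcal{P}_{\mathcal{K}}(A - \lambda I)z\|_2 \ge \delta > 0. \tag{4.37}$$

The two vectors in the right-hand side of (4.36) are orthogonal and by (4.37),

$$\begin{aligned}
\|\mathcal{P}_{\mathcal{K}}(A - \lambda I)v\|_2^2 &= |\tilde{\lambda} - \lambda|^2 \cos^2\omega + \sin^2\omega\, \|\mathcal{P}_{\mathcal{K}}(A - \lambda I)z\|_2^2 \\
&\ge \delta^2 \sin^2\omega.
\end{aligned} \tag{4.38}$$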

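To complete the proof we refer to Figure 4.1. The projection of $u$ onto $\tilde{u}$ is the projection onto $\tilde{u}$ of the projection of $u$ onto $\mathcal{K}$. Its length is $\cos\phi\cos\omega$ and as a result the sine of the angle $\theta$ between $u$ and $\tilde{u}$ is given by

$$\begin{aligned}
\sin^2\theta &= 1 - \cos^2\phi\,\cos^2\omega \\
&= 1 - \cos^2\phi\,(1 - \sin^2\omega) \\
&= \sin^2\phi + \sin^2\omega\,\cos^2\phi.
\end{aligned} \tag{4.39}$$

Combining (4.34), (4.35), (4.38) we obtain that

$$\sin\omega\,\cos\phi \le \frac{\gamma}{\delta}\,\sin\phi,$$

which together with (4.39) yields the desired result.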
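The eigenvector bound of Theorem 4.6 can be checked numerically in the Hermitian setting of this section. The sketch below assumes NumPy and a random real symmetric matrix, and builds a subspace that contains a perturbed copy of the target eigenvector (an illustrative choice, as are all variable names).

```python
import numpy as np

rng = np.random.default_rng(8)
n, m = 30, 8
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                              # real symmetric test matrix
I = np.eye(n)
lam_all, U = np.linalg.eigh(A)
lam, u = lam_all[-1], U[:, -1]                 # target eigenpair, ||u||_2 = 1

S = rng.standard_normal((n, m))
S[:, 0] = u + 0.05 * rng.standard_normal(n)    # K nearly contains u
V, _ = np.linalg.qr(S)
P = V @ V.T

ritz, Y = np.linalg.eigh(V.T @ A @ V)
j = np.argmin(np.abs(ritz - lam))              # approximate eigenvalue closest to lambda
u_t = V @ Y[:, j]                              # associated approximate eigenvector
delta = np.min(np.abs(np.delete(ritz, j) - lam))
gamma = np.linalg.norm(P @ A @ (I - P), 2)

sin_uK = np.linalg.norm(u - P @ u)             # sin of angle between u and K
sin_uut = np.linalg.norm(u - u_t * (u_t @ u))  # sin of angle between u and u_t
bound = np.sqrt(1 + gamma ** 2 / delta ** 2) * sin_uK
```

Because $\tilde{u}$ lies in $\mathcal{K}$, `sin_uut` can never be smaller than `sin_uK`; the theorem says it cannot be larger than `sin_uK` by more than the factor $\sqrt{1 + \gamma^2/\delta^2}$.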

This is a rather remarkable result given that it is so general. It tells us among other things that the only condition we need in order to guarantee that a projection method will deliver a good approximation in the Hermitian case is that the angle between the exact eigenvector and the subspace $\mathcal{K}$ be sufficiently small.

As a consequence of the above result we can establish bounds on eigenvalues that are somewhat simpler than those of Theorem 4.5. This results from the following proposition.

Proposition 4.5 The eigenvalues $\lambda$ and $\tilde{\lambda}$ in Theorem 4.6 are such that

$$|\lambda - \tilde{\lambda}| \le \|A - \lambda I\|_2 \sin^2\theta(u, \tilde{u}). \tag{4.40}$$

Proof. We start with the simple observation that $\tilde{\lambda} - \lambda = ((A - \lambda I)\tilde{u}, \tilde{u})$. Letting $\alpha = (u, \tilde{u}) = \cos\theta(u, \tilde{u})$ we can write

$$\tilde{\lambda} - \lambda = ((A - \lambda I)(\tilde{u} - \alpha u), \tilde{u}) = ((A - \lambda I)(\tilde{u} - \alpha u), \tilde{u} - \alpha u).$$

The result follows immediately by taking absolute values, exploiting the Cauchy-Schwarz inequality, and observing that $\|\tilde{u} - \alpha u\|_2 = \sin\theta(u, \tilde{u})$.
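Inequality (4.40) can also be checked directly. This is a minimal sketch, assuming NumPy and a random real symmetric test problem with an arbitrary random subspace; the names `eig_error` and `bound_440` are introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(9)
n, m = 30, 8
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                              # real symmetric test matrix
lam_all, U = np.linalg.eigh(A)
lam, u = lam_all[-1], U[:, -1]                 # exact eigenpair, ||u||_2 = 1

V, _ = np.linalg.qr(rng.standard_normal((n, m)))
ritz, Y = np.linalg.eigh(V.T @ A @ V)
j = np.argmin(np.abs(ritz - lam))              # approximate eigenvalue closest to lambda
lam_t, u_t = ritz[j], V @ Y[:, j]              # approximate eigenpair, ||u_t||_2 = 1

alpha = u @ u_t                                # cos of the angle, up to sign
sin_theta = np.linalg.norm(u_t - alpha * u)    # ||u_t - alpha u||_2
eig_error = abs(lam - lam_t)
bound_440 = np.linalg.norm(A - lam * np.eye(n), 2) * sin_theta ** 2
```

The eigenvalue error is controlled by the square of the sine of the eigenvector angle, so an eigenvector accurate to order $\epsilon$ gives an eigenvalue accurate to order $\epsilon^2$.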

Source: Numerical Methods for Large Eigenvalue Problems, pages 107-116.