5.4 Acceleration Techniques
5.4.2 Deflation by Preconditioning
The next class of methods also attempts to utilize spectral information gained in prior restart cycles to accelerate convergence. Instead of augmenting the Krylov space, the same information is used here to construct a sequence of preconditioners which can be improved as more accurate spectral information becomes available. The first such approach was introduced by Erhel et al. (1996).
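To motivate this approach, assume $U$ is an orthonormal basis of an $A$-invariant subspace $\mathcal{U}$ of dimension $k$, i.e.,
\[
AU = U A_U, \qquad A_U \in \mathbb{C}^{k\times k}.
\]
Note that $A_U$ is the specific representation of the orthogonal section $A_{\mathcal{U}}$ with respect to the basis $U$. Denoting by $U_\perp$ an orthonormal basis of the orthogonal complement $\mathcal{U}^\perp$, we can represent the action of $A$ as
\[
A \begin{bmatrix} U & U_\perp \end{bmatrix}
= \begin{bmatrix} U & U_\perp \end{bmatrix}
\begin{bmatrix} A_U & U^* A U_\perp \\ O & U_\perp^* A U_\perp \end{bmatrix}.
\]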
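Under the assumption that $k$ is small, it is feasible to solve systems involving $A_U$ directly, and thus to precondition by $M$ defined as
\[
M \begin{bmatrix} U & U_\perp \end{bmatrix}
= \begin{bmatrix} U & U_\perp \end{bmatrix}
\begin{bmatrix} A_U & O \\ O & I_{n-k} \end{bmatrix}
\tag{5.3}
\]
at each step of the iteration. The resulting right-preconditioned operator is given by
\[
AM^{-1} \begin{bmatrix} U & U_\perp \end{bmatrix}
= \begin{bmatrix} U & U_\perp \end{bmatrix}
\begin{bmatrix} I_k & U^* A U_\perp \\ O & U_\perp^* A U_\perp \end{bmatrix},
\quad\text{i.e.,}\quad
AM^{-1} = P_{\mathcal{U}} + A P_{\mathcal{U}^\perp}.
\tag{5.4}
\]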
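The identity (5.4) is easy to confirm numerically. The following is a minimal sketch, not part of the original analysis: the test matrix, its size, the diagonal shift, and the helper names `build_invariant_example` and `deflation_preconditioner_inverse` are all assumptions made for illustration. An exactly invariant subspace is manufactured by conjugating a block upper triangular matrix with an orthogonal matrix.

```python
import numpy as np

def build_invariant_example(n=8, k=3, seed=0):
    """Hypothetical test data: A with span(U) exactly A-invariant."""
    rng = np.random.default_rng(seed)
    # Orthogonal Q = [U, U_perp] from a QR factorization.
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    # Block upper triangular T; the shift keeps A and A_U well conditioned.
    T = rng.standard_normal((n, n)) + 2 * n * np.eye(n)
    T[k:, :k] = 0.0
    return Q @ T @ Q.T, Q[:, :k], Q[:, k:]

def deflation_preconditioner_inverse(A, U, U_perp):
    """M^{-1} = U A_U^{-1} U^* + U_perp U_perp^*, which follows from (5.3)."""
    A_U = U.conj().T @ A @ U
    return U @ np.linalg.solve(A_U, U.conj().T) + U_perp @ U_perp.conj().T

A, U, U_perp = build_invariant_example()
M_inv = deflation_preconditioner_inverse(A, U, U_perp)
P_U = U @ U.conj().T                 # orthogonal projector onto span(U)
P_perp = U_perp @ U_perp.conj().T    # orthogonal projector onto the complement
lhs = A @ M_inv
rhs = P_U + A @ P_perp               # right-hand side of (5.4)
print(np.linalg.norm(lhs - rhs))     # should be at rounding level
```

In practice one would apply $M^{-1}$ to vectors rather than form it as a dense matrix; the dense form is used here only to make the comparison with (5.4) direct.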
We now compare this preconditioning scheme with Morgan's method of augmenting the Krylov space $\mathcal{K}_m(A, r_0)$ by the $A$-invariant subspace $\mathcal{U}$.
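Theorem 5.4.2. Let $r_m^M$ denote the MR residual with respect to the correction space $\mathcal{U} + \mathcal{K}_m(A, r_0)$, where $\mathcal{U}$ is an $A$-invariant subspace, and let $r_m^E$ denote the MR residual with respect to the correction space $\mathcal{K}_m(AM^{-1}, r_0)$ resulting from preconditioning $A$ from the right by $M$ as defined in (5.3). Then there holds
\[
0 = \|P_{\mathcal{U}} r_m^M\| \le \|P_{\mathcal{U}} r_m^E\|
\quad\text{and}\quad
\|P_{\mathcal{U}^\perp} r_m^M\| \le \|P_{\mathcal{U}^\perp} r_m^E\|,
\tag{5.5}
\]
and therefore $\|r_m^M\| \le \|r_m^E\|$. If, in addition, $\mathcal{U}^\perp$ is also $A$-invariant, then $P_{\mathcal{U}} r_0 = 0$ implies $r_m^E = r_m^M$.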
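Proof. The left set of inequalities in (5.5) follows from $P_{\mathcal{U}} r_m^M = 0$, which is a restatement of the fact that augmenting with an invariant subspace $\mathcal{U}$ eliminates $\mathcal{U}$ from the residual (Lemma 5.2.3).
We next recall that $A_{\mathcal{U}^\perp} = P_{\mathcal{U}^\perp} A P_{\mathcal{U}^\perp}$ is the orthogonal section of $A$ onto $\mathcal{U}^\perp$ (cf. the remark following Lemma 4.3.3). Since $r_m^E = r_0 - AM^{-1}c$ for some $c \in \mathcal{K}_m(AM^{-1}, r_0)$, we obtain using (5.4)
\[
P_{\mathcal{U}^\perp} r_m^E
= P_{\mathcal{U}^\perp} r_0 - P_{\mathcal{U}^\perp} AM^{-1} c
= P_{\mathcal{U}^\perp} r_0 - P_{\mathcal{U}^\perp} A P_{\mathcal{U}^\perp} c
= P_{\mathcal{U}^\perp} r_0 - A_{\mathcal{U}^\perp} P_{\mathcal{U}^\perp} c.
\]
Moreover, $AM^{-1}\mathcal{U} = \mathcal{U}$ together with Lemma 4.3.3 yields
\[
P_{\mathcal{U}^\perp} c \in P_{\mathcal{U}^\perp} \mathcal{K}_m(AM^{-1}, r_0)
= \mathcal{K}_m(P_{\mathcal{U}^\perp} AM^{-1}, P_{\mathcal{U}^\perp} r_0)
= \mathcal{K}_m(A_{\mathcal{U}^\perp}, P_{\mathcal{U}^\perp} r_0).
\]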
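The last two statements show that $P_{\mathcal{U}^\perp} r_m^E$ is of the form $P_{\mathcal{U}^\perp} r_0 - A_{\mathcal{U}^\perp} \tilde{c}$ with $\tilde{c} \in \mathcal{K}_m(A_{\mathcal{U}^\perp}, P_{\mathcal{U}^\perp} r_0)$. On the other hand, by Proposition 5.2.3, there holds
\[
\|r_m^M\| = \min_{c \in \mathcal{K}_m(A_{\mathcal{U}^\perp},\, P_{\mathcal{U}^\perp} r_0)} \|P_{\mathcal{U}^\perp} r_0 - A_{\mathcal{U}^\perp} c\|,
\]
i.e., $\|r_m^M\|$ is the minimum over all expressions of this form. Since $P_{\mathcal{U}} r_m^M = 0$ gives $\|r_m^M\| = \|P_{\mathcal{U}^\perp} r_m^M\|$, this yields the right inequality of (5.5).
Next, assuming $A\mathcal{U}^\perp = \mathcal{U}^\perp$, (5.4) implies $AM^{-1} r_0 = A_{\mathcal{U}^\perp} r_0$ for $r_0 \in \mathcal{U}^\perp$, and thus $\mathcal{K}_m(AM^{-1}, r_0) = \mathcal{K}_m(A_{\mathcal{U}^\perp}, P_{\mathcal{U}^\perp} r_0)$, which shows that in this case both methods minimize over the same subspace, hence $r_m^E = r_m^M$.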
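The statement of the theorem can be observed on a small dense example. The sketch below is only an illustration under stated assumptions: the test matrices, sizes, seeds, and the helper names `build_example`, `krylov_basis`, `mr_residual` and `compare` are hypothetical, and the MR residuals are obtained from dense least-squares problems rather than from a GMRES implementation.

```python
import numpy as np

def build_example(n, k, seed, normal=False):
    """Hypothetical test matrix A with exactly invariant span(U)."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    if normal:
        # Symmetric A: every invariant subspace is then reducing.
        T = np.diag(rng.uniform(1.0, 10.0, n))
    else:
        # Block upper triangular T: span(U) invariant, span(U_perp) not.
        T = rng.standard_normal((n, n)) + 2 * n * np.eye(n)
        T[k:, :k] = 0.0
    return Q @ T @ Q.T, Q[:, :k], Q[:, k:]

def krylov_basis(B, r, m):
    """Orthonormal basis of K_m(B, r) = span{r, Br, ..., B^(m-1) r}."""
    cols = [r / np.linalg.norm(r)]
    for _ in range(m - 1):
        w = B @ cols[-1]
        cols.append(w / np.linalg.norm(w))
    Q, _ = np.linalg.qr(np.column_stack(cols))
    return Q

def mr_residual(op, C, r0):
    """r0 - op C y, with y the least-squares minimizer over span(C)."""
    y, *_ = np.linalg.lstsq(op @ C, r0, rcond=None)
    return r0 - op @ C @ y

def compare(A, U, U_perp, r0, m):
    A_U = U.T @ A @ U
    M_inv = U @ np.linalg.solve(A_U, U.T) + U_perp @ U_perp.T
    # Augmentation: correction space span(U) + K_m(A, r0).
    r_M = mr_residual(A, np.column_stack([U, krylov_basis(A, r0, m)]), r0)
    # Right preconditioning: correction space K_m(A M^{-1}, r0).
    AM = A @ M_inv
    r_E = mr_residual(AM, krylov_basis(AM, r0, m), r0)
    return r_M, r_E

n, k, m = 12, 3, 4
A, U, U_perp = build_example(n, k, seed=1)
r0 = np.random.default_rng(2).standard_normal(n)
r_M, r_E = compare(A, U, U_perp, r0, m)
P_U, P_perp = U @ U.T, U_perp @ U_perp.T
print(np.linalg.norm(P_U @ r_M), np.linalg.norm(P_U @ r_E))
print(np.linalg.norm(P_perp @ r_M), np.linalg.norm(P_perp @ r_E))
```

Calling `build_example(..., normal=True)` and choosing `r0` in the span of `U_perp` gives the situation of the last claim of the theorem, where both residuals are expected to coincide up to rounding.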
We note that the assumption $P_{\mathcal{U}} r_0 = 0$ is not restrictive, as it can be enforced by adding the correction $U A_U^{-1} U^* r_0$ to $x_0$, and the preconditioner is in any case built upon the premise that $A_U$ is easily invertible. However, since $P_{\mathcal{U}} r_0 = 0$ by no means implies that $P_{\mathcal{U}} r_m^E = 0$ for $m > 0$, it cannot be guaranteed that $\|r_m^E\| = \|r_m^M\|$ even for such a special choice of initial residual unless $A\mathcal{U}^\perp = \mathcal{U}^\perp$. In the finite-dimensional case, the condition that $\mathcal{U}^\perp$ be invariant whenever $\mathcal{U}$ is invariant (i.e., that all invariant subspaces also reduce $A$) characterizes normality of $A$. Hence, these two approaches are equivalent when $A$ is normal and $\mathcal{U}$ is invariant.
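To see why this correction enforces the assumption, one can write out the updated residual. The symbols $x_0'$ and $r_0'$ are introduced only for this sketch, $b$ denotes the right-hand side so that $r_0 = b - Ax_0$ (the usual convention, assumed here), and $\mathcal{U}$ is taken to be exactly $A$-invariant, so that $AU = UA_U$:
\[
x_0' = x_0 + U A_U^{-1} U^* r_0,
\qquad
r_0' = b - A x_0' = r_0 - A U A_U^{-1} U^* r_0 = r_0 - U A_U A_U^{-1} U^* r_0 = (I - UU^*) r_0 = P_{\mathcal{U}^\perp} r_0,
\]
and hence $P_{\mathcal{U}} r_0' = 0$.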
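The availability of an (exactly) $A$-invariant subspace $\mathcal{U}$, on the other hand, is an assumption that can rarely be satisfied in practice. For a non-invariant $\mathcal{U}$ one can nonetheless still define the preconditioner as in (5.3), where now $A_U$ is defined as $A_U := U^* A U$, resulting in
\[
AM^{-1} \begin{bmatrix} U & U_\perp \end{bmatrix}
= \begin{bmatrix} U & U_\perp \end{bmatrix}
\begin{bmatrix} I & U^* A U_\perp \\ U_\perp^* A U A_U^{-1} & U_\perp^* A U_\perp \end{bmatrix},
\]
based on the heuristic that $U_\perp^* A U A_U^{-1}$ will be small whenever $\mathcal{U}$ is nearly $A$-invariant.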
In Erhel et al. (1996) such nearly A-invariant spaces are obtained as the span of selected Ritz or harmonic Ritz vectors determined from Krylov spaces generated during previous cycles.
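The heuristic about the lower-left block can be illustrated by perturbing an exactly invariant subspace. This is a hedged sketch only: the random perturbation stands in for the inexactness of Ritz or harmonic Ritz vectors and is not how such spaces arise in Erhel et al. (1996); the test matrix and the helper names `build_invariant_example` and `coupling_norm` are assumptions.

```python
import numpy as np

def build_invariant_example(n=8, k=3, seed=0):
    """Hypothetical test data: A with span(U) exactly A-invariant."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    T = rng.standard_normal((n, n)) + 2 * n * np.eye(n)
    T[k:, :k] = 0.0
    return Q @ T @ Q.T, Q[:, :k]

def coupling_norm(A, U_exact, eps, seed=1):
    """2-norm of U_perp^* A U A_U^{-1} for a basis perturbed by size eps."""
    n, k = U_exact.shape
    E = np.random.default_rng(seed).standard_normal((n, k))
    # Re-orthonormalize the perturbed basis; the complete QR also supplies U_perp.
    Q, _ = np.linalg.qr(U_exact + eps * E, mode="complete")
    U, U_perp = Q[:, :k], Q[:, k:]
    A_U = U.T @ A @ U                      # A_U := U^* A U
    block = U_perp.T @ A @ U @ np.linalg.inv(A_U)
    return np.linalg.norm(block, 2)

A, U_exact = build_invariant_example()
for eps in (0.0, 1e-6, 1e-3, 1e-1):
    print(eps, coupling_norm(A, U_exact, eps))
```

For `eps = 0` the block vanishes up to rounding, and it grows with the size of the perturbation, which is the behaviour the heuristic relies on.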
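Baglama et al. (1998) propose a similar algorithm, which preconditions by (5.3) from the left, leading to the preconditioned operator
\[
M^{-1}A \begin{bmatrix} U & U_\perp \end{bmatrix}
= \begin{bmatrix} U & U_\perp \end{bmatrix}
\begin{bmatrix} I & A_U^{-1} U^* A U_\perp \\ O & U_\perp^* A U_\perp \end{bmatrix},
\quad\text{or}\quad
M^{-1}A = P_{\mathcal{U}} + A P_{\mathcal{U}^\perp} + (A^{-1} - I) P_{\mathcal{U}} A P_{\mathcal{U}^\perp},
\]
where we have again assumed that we are in the idealized case where $\mathcal{U}$ is exactly $A$-invariant. The MR correction of the left-preconditioned system is the solution of the minimization problem
\[
\|M^{-1} r_m^B\| = \min\{\|M^{-1}(r_0 - AM^{-1}c)\| : c \in \mathcal{K}_m(AM^{-1}, r_0)\}
\]
(cf. Section 4.6).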
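From (5.3), it is evident that
\[
M^{-1} = A^{-1} P_{\mathcal{U}} + P_{\mathcal{U}^\perp}
\]
and, consequently, if $A\mathcal{U} = \mathcal{U}$,
\[
P_{\mathcal{U}^\perp} M^{-1} v = P_{\mathcal{U}^\perp} v \quad\text{for all } v.
\]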
These are the essential ingredients for showing that Theorem 5.4.2 holds in exactly the same way with $r_m^B$ in place of $r_m^E$. The construction of an approximately invariant subspace $\mathcal{U}$ is accomplished by Baglama et al. (1998) by employing the IRA process (cf. Section 4.3.3).
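The closed form of the left-preconditioned operator and the projection identity can be checked in a few lines. This is a sketch under the idealizing assumption $A\mathcal{U} = \mathcal{U}$ with $A$ nonsingular, so that also $A^{-1}\mathcal{U} = \mathcal{U}$, $A P_{\mathcal{U}} = P_{\mathcal{U}} A P_{\mathcal{U}}$ and $P_{\mathcal{U}^\perp} A P_{\mathcal{U}} = O$. Starting from $M^{-1} = A^{-1} P_{\mathcal{U}} + P_{\mathcal{U}^\perp}$ and inserting $I = P_{\mathcal{U}} + P_{\mathcal{U}^\perp}$ on the right,
\begin{align*}
M^{-1}A
&= A^{-1} P_{\mathcal{U}} A P_{\mathcal{U}} + A^{-1} P_{\mathcal{U}} A P_{\mathcal{U}^\perp} + P_{\mathcal{U}^\perp} A P_{\mathcal{U}} + P_{\mathcal{U}^\perp} A P_{\mathcal{U}^\perp} \\
&= P_{\mathcal{U}} + A^{-1} P_{\mathcal{U}} A P_{\mathcal{U}^\perp} + (A P_{\mathcal{U}^\perp} - P_{\mathcal{U}} A P_{\mathcal{U}^\perp}) \\
&= P_{\mathcal{U}} + A P_{\mathcal{U}^\perp} + (A^{-1} - I) P_{\mathcal{U}} A P_{\mathcal{U}^\perp}.
\end{align*}
Similarly, $P_{\mathcal{U}^\perp} M^{-1} v = P_{\mathcal{U}^\perp} A^{-1} P_{\mathcal{U}} v + P_{\mathcal{U}^\perp} v = P_{\mathcal{U}^\perp} v$, because $A^{-1} P_{\mathcal{U}} v \in \mathcal{U}$.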
Kharchenko & Yeremin (1995) suggest another adaptive right preconditioner $\widetilde{M}$: After each GMRES cycle the Ritz values and the corresponding left² and right Ritz vectors of $A$ with respect to $\mathcal{K}_m$ are extracted. The aim is to obtain a preconditioner such that the extremal eigenvalues of $A$, which are approximated by the Ritz values, are translated to one (or at least to a small cluster around one) in the transition from $A$ to $A\widetilde{M}^{-1}$.
The extremal Ritz values are partitioned into, say, $k$ subsets $\Theta_j$ of nearby Ritz values. For each $\Theta_j$, a rank-one transformation of the form $I + v_j \tilde{v}_j^*$ is constructed, where $v_j$ and $\tilde{v}_j$ are linear combinations of the associated right and left Ritz vectors. These linear combinations are chosen to translate simultaneously all Ritz values of $\Theta_j$ into a small cluster around one, while satisfying certain stability criteria. One preconditioning step now consists of successive multiplication by these matrices, i.e.,
\[
\widetilde{M}^{-1} = (I + v_1 \tilde{v}_1^*) \cdots (I + v_k \tilde{v}_k^*) = I + V_k \widetilde{V}_k^*,
\qquad
V_k = \begin{bmatrix} v_1 & \cdots & v_k \end{bmatrix},
\quad
\widetilde{V}_k = \begin{bmatrix} \tilde{v}_1 & \cdots & \tilde{v}_k \end{bmatrix}.
\]
For the last equation we have made use of the fact that $\tilde{v}_j^* v_i = 0$ for $i \ne j$, since all eigenvalues of $H_m$ have geometric multiplicity one. Note that, if $\Theta_j$ has a small diameter and the Ritz values contained in $\Theta_j$ are good approximations of eigenvalues of $A$, then $v_j$ and $\tilde{v}_j$ are approximate right and left eigenvectors of $A$. Moreover, the implementation described in Kharchenko & Yeremin (1995) ensures that the diagonal matrix $D := \widetilde{V}_k^* V_k \in \mathbb{C}^{k\times k}$ is nonsingular.
² Left Ritz vectors are defined by $A^* \tilde{z}_j - \bar{\theta}_j \tilde{z}_j \perp \mathcal{K}_m$ and can be obtained from the left eigenvectors of $H_m$.
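The collapse of the product of rank-one updates into $I + V_k \widetilde{V}_k^*$ depends only on the biorthogonality $\tilde{v}_j^* v_i = 0$ for $i \ne j$, which makes all cross terms vanish. The sketch below checks this on synthetic data; the vectors are not Ritz vectors, and the helper names `biorthogonal_pair` and `product_of_rank_one_updates` are assumptions made for illustration.

```python
import numpy as np

def biorthogonal_pair(n, k, seed=0):
    """Hypothetical data: V, Vt with Vt^* V diagonal and nonsingular."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    Y = np.linalg.inv(X).conj().T      # columns of Y satisfy Y^* X = I
    d = rng.uniform(0.5, 2.0, k)       # arbitrary nonzero diagonal entries of D
    V = X[:, :k]
    Vt = Y[:, :k] * d                  # scale columns so that Vt^* V = diag(d)
    return V, Vt

def product_of_rank_one_updates(V, Vt):
    """(I + v_1 vt_1^*) ... (I + v_k vt_k^*), formed factor by factor."""
    n, k = V.shape
    P = np.eye(n, dtype=complex)
    for j in range(k):
        P = P @ (np.eye(n) + np.outer(V[:, j], Vt[:, j].conj()))
    return P

V, Vt = biorthogonal_pair(8, 3)
lhs = product_of_rank_one_updates(V, Vt)
rhs = np.eye(8) + V @ Vt.conj().T      # compact form I + V_k Vt_k^*
print(np.linalg.norm(lhs - rhs))       # should be at rounding level
```

The compact form is what makes the preconditioner cheap to apply: one step costs two products with tall, thin matrices instead of $k$ separate updates.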
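To compare this approach with the preconditioners presented thus far, we choose biorthonormal bases $U$ and $\widetilde{U}$ of $\mathcal{U} := \operatorname{span}\{v_1, \dots, v_k\}$ and $\widetilde{\mathcal{U}} := \operatorname{span}\{\tilde{v}_1, \dots, \tilde{v}_k\}$ such that $U^* U = I$, which are given e.g. by
\[
U = V_k \widetilde{S}^{-1}
\quad\text{and}\quad
\widetilde{U} := \widetilde{V}_k D^{-H} \widetilde{S}^{H},
\]
with $\widetilde{S}$ any matrix that satisfies $V_k^* V_k = \widetilde{S}^{H} \widetilde{S}$. In this notation the preconditioner $\widetilde{M}$ is given by
\[
\widetilde{M}^{-1} = I + U \widetilde{S} D \widetilde{S}^{-1} \widetilde{U}^* = I + U S \widetilde{U}^*,
\qquad
S := \widetilde{S} D \widetilde{S}^{-1}.
\]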
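The change of basis can be verified numerically. In the sketch below the factor $\widetilde{S}$ is taken from a Cholesky factorization of $V_k^* V_k$, which is just one admissible choice and not necessarily the one used by the authors; the synthetic data and the helper names `biorthogonal_pair` and `biorthonormal_bases` are likewise assumptions.

```python
import numpy as np

def biorthogonal_pair(n, k, seed=0):
    """Hypothetical data: V, Vt with Vt^* V diagonal and nonsingular."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    Y = np.linalg.inv(X).conj().T
    d = rng.uniform(0.5, 2.0, k)
    return X[:, :k], Y[:, :k] * d

def biorthonormal_bases(V, Vt):
    """Return U, Ut, S with U^*U = I, Ut^*U = I and V Vt^* = U S Ut^*."""
    D = Vt.conj().T @ V
    # Cholesky: V^*V = L L^*, so St = L^* satisfies V^*V = St^* St.
    L = np.linalg.cholesky(V.conj().T @ V)
    St = L.conj().T
    St_inv = np.linalg.inv(St)
    U = V @ St_inv                                   # U  = V St^{-1}
    Ut = Vt @ np.linalg.inv(D).conj().T @ St.conj().T  # Ut = Vt D^{-H} St^H
    S = St @ D @ St_inv                              # S  = St D St^{-1}
    return U, Ut, S

V, Vt = biorthogonal_pair(8, 3)
U, Ut, S = biorthonormal_bases(V, Vt)
k = V.shape[1]
err_orth = np.linalg.norm(U.conj().T @ U - np.eye(k))
err_biorth = np.linalg.norm(Ut.conj().T @ U - np.eye(k))
err_precond = np.linalg.norm(V @ Vt.conj().T - U @ S @ Ut.conj().T)
print(err_orth, err_biorth, err_precond)             # all at rounding level
```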
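We let $U_\perp$ denote an orthonormal basis of $\mathcal{U}^\perp$ and make the idealizing assumptions that both $\mathcal{U}$ and $\widetilde{\mathcal{U}}$ are invariant with respect to $A$ and $A^*$, respectively, i.e.,
\[
AU = U A_U
\quad\text{and}\quad
\widetilde{U}^* A = A_U \widetilde{U}^*,
\]
and that the eigenvalues corresponding to $\mathcal{U}$ (respectively $\widetilde{\mathcal{U}}$) are translated exactly to 1.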
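Substituting this in the definition of the preconditioner, we obtain using the biorthonormality of $U$ and $\widetilde{U}$,
\[
\widetilde{M}^{-1} \begin{bmatrix} U & U_\perp \end{bmatrix}
= \begin{bmatrix} U & U_\perp \end{bmatrix}
\begin{bmatrix} I + S & S \widetilde{U}^* U_\perp \\ O & I_{n-k} \end{bmatrix}
\]
and
\[
A\widetilde{M}^{-1} \begin{bmatrix} U & U_\perp \end{bmatrix}
= \begin{bmatrix} U & U_\perp \end{bmatrix}
\begin{bmatrix} A_U (I + S) & A_U S \widetilde{U}^* U_\perp + U^* A U_\perp \\ O & U_\perp^* A U_\perp \end{bmatrix}.
\]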
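In addition, our assumptions imply $A_U (I + S) = I$, i.e., $S = A_U^{-1} - I$ and $A_U S \widetilde{U}^* = (I - A_U) \widetilde{U}^* = \widetilde{U}^* (I - A)$, resulting in
\[
A\widetilde{M}^{-1} \begin{bmatrix} U & U_\perp \end{bmatrix}
= \begin{bmatrix} U & U_\perp \end{bmatrix}
\begin{bmatrix} I & \widetilde{U}^* (I - A) U_\perp + U^* A U_\perp \\ O & U_\perp^* A U_\perp \end{bmatrix}.
\]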
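This leads to
\[
A\widetilde{M}^{-1} = P_{\mathcal{U}} + P_{\mathcal{U}}^{\widetilde{\mathcal{U}}^\perp} (I - A) P_{\mathcal{U}^\perp} + A P_{\mathcal{U}^\perp}
\]
as the analogue to (5.4), where $P_{\mathcal{U}}^{\widetilde{\mathcal{U}}^\perp} = U \widetilde{U}^*$ denotes the oblique projection onto $\mathcal{U}$ along $\widetilde{\mathcal{U}}^\perp$. Thus, in view of $P_{\mathcal{U}^\perp} A \widetilde{M}^{-1} = A_{\mathcal{U}^\perp}$, the statement made in Theorem 5.4.2 holds also for this preconditioning approach.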
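The idealized operator identity and the relation $P_{\mathcal{U}^\perp} A \widetilde{M}^{-1} = A_{\mathcal{U}^\perp}$ can be checked on a diagonalizable test matrix whose exact right and left eigenvectors play the role of converged Ritz vectors. Everything in this sketch (the matrix, the real eigenvalues, the function name `idealized_check`) is an assumption for illustration, not the construction of Kharchenko & Yeremin (1995).

```python
import numpy as np

def idealized_check(n=8, k=3, seed=0):
    """Return the errors in the two identities for an exact invariant pair."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, n))          # right eigenvectors (columns)
    lam = rng.uniform(1.0, 5.0, n)           # real nonzero eigenvalues (assumed)
    Y = np.linalg.inv(X)                     # rows are left eigenvectors
    A = X @ np.diag(lam) @ Y
    # Orthonormal basis U of the right invariant subspace span(X[:, :k]).
    U, R = np.linalg.qr(X[:, :k])
    # Ut^* spans the matching left invariant subspace, scaled so Ut^* U = I.
    Ut_H = R @ Y[:k, :]
    Q, _ = np.linalg.qr(U, mode="complete")
    U_perp = Q[:, k:]
    A_U = U.T @ A @ U
    S = np.linalg.inv(A_U) - np.eye(k)       # from A_U (I + S) = I
    M_inv = np.eye(n) + U @ S @ Ut_H         # preconditioner I + U S Ut^*
    P_U = U @ U.T
    P_perp = U_perp @ U_perp.T
    P_obl = U @ Ut_H                         # oblique projector onto span(U)
    lhs = A @ M_inv
    rhs = P_U + P_obl @ (np.eye(n) - A) @ P_perp + A @ P_perp
    section = P_perp @ A @ P_perp            # orthogonal section onto the complement
    return np.linalg.norm(lhs - rhs), np.linalg.norm(P_perp @ lhs - section)

err_operator, err_section = idealized_check()
print(err_operator, err_section)             # both should be at rounding level
```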