
3.5 Connections With Existing Methods

In this section we discuss the similarities and differences between DMBR and other methods published in the literature. We show that most of these methods can be derived from Algorithm 3.3.1 simply by choosing $F_j$ appropriately. In Subsection 3.5.1 we discuss the connections with BGMRES-R [103].

3.5.1 Connections with BGMRES-R

We now compare DMBR with BGMRES-R [103]. For the sake of generality we consider a variant with a variable preconditioner, even though in [103] the algorithm is considered only for a fixed preconditioner. Recalling our discussion at the beginning of Section 3.2, BGMRES-R generates the inexact (block) Arnoldi relation

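\[ A Z_j = V_{j+1} L_j + \tilde{Q}_j, \]

where $L_j = V_{j+1}^H A Z_j$. Instead of storing $\tilde{Q}_j \in \mathbb{C}^{n \times s_j}$, however, BGMRES-R uses a clever algebraic manipulation and stores the QR decomposition

\[ \tilde{Q}_j = P_j G_j, \]

where $P_j \in \mathbb{C}^{n \times d_{j+1}}$ and $G_j \in \mathbb{C}^{d_{j+1} \times s_j}$. Using our notation, and omitting the algebraic details, we state that

\[ H_j = \begin{bmatrix} L_j \\ G_j \end{bmatrix}. \]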

Moreover, BGMRES-R computes the matrix $F_j$ according to Algorithm 3.4.1, and deflates $\hat{V}_{j+1}$ and $\hat{H}_j$ at the end of every iteration. Algorithm 3.5.3 shows a simplified pseudocode for both methods, highlighting the differences.
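The storage trick described above can be checked numerically. The following is a minimal NumPy sketch, not the implementation of [103]: the function name `inexact_arnoldi_split` and the random test data are hypothetical, and for simplicity $P$ has as many columns as $\tilde{Q}$ (a plain thin QR), whereas in BGMRES-R $P_j$ has $d_{j+1}$ columns.

```python
import numpy as np

def inexact_arnoldi_split(V, W):
    """Split W (playing the role of A Z_j) as W = V L + P G.

    V : (n, k) array with orthonormal columns (plays the role of V_{j+1}).
    W : (n, s) array.
    Returns L = V^H W and the thin QR factors P, G of the remainder W - V L.
    """
    L = V.conj().T @ W             # L_j = V_{j+1}^H A Z_j
    Q_tilde = W - V @ L            # remainder, orthogonal to range(V)
    P, G = np.linalg.qr(Q_tilde)   # thin QR: Q_tilde = P G
    return L, P, G

# Hypothetical sizes and random data, for illustration only.
rng = np.random.default_rng(0)
n, k, s = 12, 3, 2
V, _ = np.linalg.qr(rng.standard_normal((n, k)))
W = rng.standard_normal((n, s))

L, P, G = inexact_arnoldi_split(V, W)
H = np.vstack([L, G])              # H_j = [L_j; G_j]
VP = np.hstack([V, P])             # [V_{j+1}  P_j]
residual = np.linalg.norm(W - VP @ H)
```

In this sketch `residual` is at rounding level, which reflects that storing $P$ and $G$ carries the same information as storing $\tilde{Q}$ while yielding an exact relation with the stacked matrix $H$.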

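Notice that in BGMRES-R, because there is no deflation at the beginning of the first iteration, it holds that

\[ F_2^H \hat{\Lambda}_1 = F_2^H \begin{bmatrix} \hat{\Lambda}_0 \\ 0_{(p_0 - n_1) \times p} \end{bmatrix} = \begin{bmatrix} \hat{\Lambda}_0 \\ 0_{p_1 \times p} \end{bmatrix}, \]

and thus, for every $j$ we have that

\[ F_{j+1}^H \hat{\Lambda}_j = \hat{\Lambda}_j, \]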

that is, it is unnecessary to deflate $\hat{\Lambda}_j$ in BGMRES-R. Both methods are thus algebraically equivalent for every cycle in which DMBR chooses $k_1 = p_0$, and not equivalent otherwise. The lack of deflation at the beginning of the first iteration brings some crucial drawbacks for BGMRES-R in the following situations:

• When the deflations happen early in the convergence history. According to our numerical experiments, this behaviour seems to be common among the tested problems. The value of $k_j$ tends to decrease quickly in the first cycles, so that $k_m$ is small (often equal to one) at the end of the cycle (cf. Figure 3.1). In the extreme case $k_m = 1$, BGMRES-R then performs $p_0 + m - 1$ matrix-vector products and preconditioner applications in the following cycle, whereas DMBR performs only $m$. Also, notice that in such a case $[V_j \; P_{j+1}]$ has $2p_0 + j$ columns in BGMRES-R and $j + p_0 - 1$ columns in DMBR, meaning that the orthogonalization is more expensive in BGMRES-R than in DMBR.
• When the restart size $m$ is small. Since in BGMRES-R $k_1 = p_0$, a small restart makes the afore-

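Algorithm 3.5.3: Comparison between DMBR and BGMRES-R

1   Choose $X_0$, $m$ and a convergence criterion with its scaling matrix;
2   for cycle = 1, ..., m do
3       Compute $R_0 = B - A X_0$;
4       $\hat{V}_1 \hat{\Lambda}_0 = R_0$ (thin QR decomposition);

        DMBR
5a      Define $k_1 = p_0$ and $V_1 = \hat{V}_1$;
6a      for j = 1, ..., m do
7a          Choose $k_j$ and $F_j$;
8a          Deflate: $[V_j \; P_{j-1}] = \hat{V}_j F_j$;
9a          Deflate: $\Lambda_j = F_j^H \hat{\Lambda}_{j-1}$;
10a         Deflate: $H_{j-1} = F_j^H \hat{H}_{j-1}$;
11a         Apply Algorithm 3.2.1 obtaining $A Z_j = \hat{V}_{j+1} \hat{H}_j$;
12a         Set $\hat{\Lambda}_j = \begin{bmatrix} \Lambda_j \\ 0_{(k_j - n_j) \times p} \end{bmatrix}$;
13a         Solve: $\min_Y \| \hat{\Lambda}_j - \hat{H}_j Y \|_F$;
14a         if full convergence detected then break;
15a         ;
16a         Choose $k_{j+1}$ and $F_{j+1}$;
17a         Deflate: $[V_{j+1} \; P_j] = \hat{V}_{j+1} F_{j+1}$;
18a         Deflate: $H_j = F_{j+1}^H \hat{H}_j$;
19a     end for

        BGMRES-R
5b      Define $k_1 = p_0$ and $V_1 = \hat{V}_1$;
6b      for j = 1, ..., m do
7b          Choose $k_j$ and $F_j$;
8b          Deflate: $[V_j \; P_{j-1}] = \hat{V}_j F_j$;
9b          Deflate: $\Lambda_j = F_j^H \hat{\Lambda}_{j-1}$;
10b         Deflate: $H_{j-1} = F_j^H \hat{H}_{j-1}$;
11b         Apply Algorithm 3.2.1 obtaining $A Z_j = \hat{V}_{j+1} \hat{H}_j$;
12b         Set $\hat{\Lambda}_j = \begin{bmatrix} \hat{\Lambda}_{j-1} \\ 0_{(k_j - n_j) \times p} \end{bmatrix}$;
13b         Solve: $\min_Y \| \hat{\Lambda}_j - \hat{H}_j Y \|_F$;
14b         if full convergence detected then break;
15b         ;
16b         Choose $k_{j+1}$ and $F_{j+1}$;
17b         Deflate: $[V_{j+1} \; P_j] = \hat{V}_{j+1} F_{j+1}$;
18b         Deflate: $H_j = F_{j+1}^H \hat{H}_j$;
19b     end for

20      Update $R_0 = B - A X_0$;
21      $X_0 = X_0 + Z_m Y_m$;
22  end for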

• When the number of right-hand sides p0 is large. Same as above.
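The operation counts quoted in the first item can be reproduced with a few lines of arithmetic. This is a sketch under a simplifying assumption made here for illustration: iteration $j$ of a cycle applies the matrix (and the preconditioner) to exactly $k_j$ vectors. The values of `p0` and `m` are hypothetical.

```python
def cycle_applications(block_sizes):
    """Total matrix-vector (and preconditioner) applications in one cycle,
    assuming iteration j applies the operator to k_j vectors."""
    return sum(block_sizes)

p0, m = 20, 10                     # hypothetical number of right-hand sides and restart size

# Extreme case k_m = 1 at the end of the previous cycle:
bgmres_r_sizes = [p0] + [1] * (m - 1)   # BGMRES-R restarts with k_1 = p0, then deflates
dmbr_sizes = [1] * m                    # DMBR may deflate already in the first iteration

bgmres_r_cost = cycle_applications(bgmres_r_sizes)   # p0 + m - 1
dmbr_cost = cycle_applications(dmbr_sizes)           # m
```

With these hypothetical values the cycle costs 29 applications for BGMRES-R and 10 for DMBR, and the gap grows with $p_0$ while a small $m$ makes the fixed $p_0$ term dominate.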

One of the main novelties of DMBR over BGMRES-R is hence the deflation of $\hat{\Lambda}_j$, which allows the deflation steps to take place at the beginning of the iteration while still minimizing the norm of the true residual $R_0 - A X_j$. Other novelties we propose are the truncation (that is, setting $k_{max} < p_0$ in Algorithm 3.4.1) and the flexible preconditioner, as well as the decoupling of the deflation strategy from the method itself (that is, allowing any unitary $F_j$ to be chosen). We refer to Section 3.9 for more details on the numerical experiments and the practical differences between the behaviour of the two methods.

3.5.2 Connections with BFGMRESD

We now compare DMBR with BFGMRESD (as well as BFGMREST) [23, 78, 96]. Recalling what was mentioned at the beginning of this chapter, BFGMRESD performs a block size reduction of the (scaled) initial residual $R_0$, relying on its (near) rank deficiency. Rewriting (3.1.1) with our notation, BFGMRESD computes

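\[ R_0 = \hat{V}_1 \hat{\Lambda}_0 \quad \text{(QR decomposition)} \tag{3.5.1} \]

\[ \hat{\Lambda}_0 D_0 = U_+ \Sigma_+ W_+^H + U_- \Sigma_- W_-^H \quad \text{(singular value decomposition)} \tag{3.5.2} \]

where (3.5.2) has exactly the same dimensions as (3.4.1)-(3.4.3) (assuming $H_0 Y_0 = 0_{k_1 \times p}$) and where $D_0$ is the chosen nonsingular scaling matrix. Denoting the structures of BFGMRESD with a # superscript, it then proceeds by setting

\[ \Lambda_1^{\#} = \Sigma_+ W_+^H, \qquad V_1^{\#} = \hat{V}_1 U_+. \]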

Notice however that during the first iteration of DMBR using Algorithm 3.4.1 we have that $F_1 = [U_+ \; U_-]$ and thus

\[ \Lambda_1 = [U_+ \; U_-]^H \hat{\Lambda}_0 = \begin{bmatrix} \Sigma_+ W_+^H \\ \Sigma_- W_-^H \end{bmatrix} D_0^{-1}, \qquad [V_1 \; P_0] = \hat{V}_1 [U_+ \; U_-], \]

that is, $V_1 = V_1^{\#}$ and the first $k_1$ rows of $\Lambda_1$ are equal to $\Lambda_1^{\#} D_0^{-1}$. Therefore, apart from the scaling, we can rewrite BFGMRESD using the DMBR framework: $P_0$ is discarded as well as the last $d_1$ rows of $\Lambda_1$, $p_0$ is set to $k_1$, and the deflation happens during the first iteration (or right before the first iteration). Algorithm 3.5.6 shows a comparison between both methods.
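The relation between $\Lambda_1$ and $\Lambda_1^{\#}$ can be verified on a small example. The sketch below follows (3.5.1) and (3.5.2) with NumPy; the function name `initial_deflation`, the threshold rule used to pick $k_1$, the column-norm scaling $D_0$ and the test data are all assumptions made for illustration, since Algorithm 3.4.1 is not reproduced here.

```python
import numpy as np

def initial_deflation(R0, D0, tol):
    """QR of R0 followed by an SVD of the scaled triangular factor.

    The rule "keep singular values above tol" is a hypothetical stand-in
    for the selection of k_1 performed by Algorithm 3.4.1.
    """
    V1_hat, Lam0_hat = np.linalg.qr(R0)           # (3.5.1): R0 = V1_hat Lam0_hat
    U, sigma, Wh = np.linalg.svd(Lam0_hat @ D0)   # (3.5.2): Lam0_hat D0 = U Sigma W^H
    k1 = int(np.sum(sigma > tol))                 # number of directions kept
    F1 = U                                        # F_1 = [U_+  U_-]
    Lam1 = F1.conj().T @ Lam0_hat                 # Lambda_1 = F_1^H Lam0_hat
    V1 = (V1_hat @ F1)[:, :k1]                    # V_1 = V1_hat U_+
    Lam1_sharp = sigma[:k1, None] * Wh[:k1, :]    # Lambda_1^# = Sigma_+ W_+^H
    return V1, Lam1, Lam1_sharp, k1

# Hypothetical nearly rank-2 block residual with p = 4 columns.
rng = np.random.default_rng(1)
n, p = 30, 4
R0 = rng.standard_normal((n, 2)) @ rng.standard_normal((2, p))
R0 += 1e-8 * rng.standard_normal((n, p))
D0 = np.diag(1.0 / np.linalg.norm(R0, axis=0))   # assumed scaling: unit column norms

V1, Lam1, Lam1_sharp, k1 = initial_deflation(R0, D0, tol=1e-4)
# The first k1 rows of Lambda_1 equal Lambda_1^# D0^{-1}.
match = np.allclose(Lam1[:k1], Lam1_sharp @ np.linalg.inv(D0))
```

Here `match` holds because $[U_+ \; U_-]$ is unitary, so multiplying $\hat{\Lambda}_0 = U \Sigma W^H D_0^{-1}$ by its conjugate transpose leaves $\Sigma W^H D_0^{-1}$.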

Algorithm 3.5.6: Comparison between DMBR and BFGMRESD

1   Choose $X_0$, $m$ and a convergence criterion with its scaling matrix;
2   for cycle = 1, ..., m do
3       Compute $R_0 = B - A X_0$;
4       $\hat{V}_1 \hat{\Lambda}_0 = R_0$ (thin QR decomposition);

        DMBR
5a      Choose $k_1$ and $F_1$;
6a      Deflate: $[V_1 \; P_0] = \hat{V}_1 F_1$;
7a      Deflate: $\Lambda_1 = F_1^H \hat{\Lambda}_0$;
8a      Discard $P_0$ and last $d_1$ rows of $\Lambda_1$, and set $p_0 = k_1$;
9a      for j = 1, ..., m do
10a         Choose $k_j$ and $F_j$;
11a         Deflate: $[V_j \; P_{j-1}] = \hat{V}_j F_j$;
12a         Deflate: $\Lambda_j = F_j^H \hat{\Lambda}_{j-1}$;
13a         Deflate: $H_{j-1} = F_j^H \hat{H}_{j-1}$;
14a         Apply Algorithm 3.2.1 obtaining $A Z_j = \hat{V}_{j+1} \hat{H}_j$;
15a         Set $\hat{\Lambda}_j = \begin{bmatrix} \Lambda_j \\ 0_{(k_j - n_j) \times p} \end{bmatrix}$;
16a         Solve: $\min_Y \| \hat{\Lambda}_j - \hat{H}_j Y \|_F$;
17a         if full convergence detected then break;
18a         ;
19a     end for

        BFGMRESD
5b      Choose $k_1$ and $F_1$;
6b      Deflate: $[V_1 \; P_0] = \hat{V}_1 F_1$;
7b      Deflate: $\Lambda_1 = F_1^H \hat{\Lambda}_0$;
8b      Discard $P_0$ and last $d_1$ rows of $\Lambda_1$, and set $p_0 = k_1$;
9b      for j = 1, ..., m do
10b         Choose $k_j$ and $F_j$;
11b         Deflate: $[V_j \; P_{j-1}] = \hat{V}_j F_j$;
12b         Deflate: $\Lambda_j = F_j^H \hat{\Lambda}_{j-1}$;
13b         Deflate: $H_{j-1} = F_j^H \hat{H}_{j-1}$;
14b         Apply Algorithm 2.5.1 obtaining $A Z_j = V_{j+1} H_j$;
15b         Set $\Lambda_j = \begin{bmatrix} \Lambda_j \\ 0_{(k_j - n_j) \times p} \end{bmatrix}$;
16b         Solve: $\min_Y \| \Lambda_j - H_j Y \|_F$;
17b         if full convergence detected then break;
18b         ;
19b     end for

20      Update $R_0 = B - A X_0$;
21      $X_0 = X_0 + Z_m Y_m$;
22  end for

Thus one cycle of DMBR is algebraically equivalent to BFGMRESD only when no deflation occurs (in which case both algorithms are equivalent to BFGMRES). This happens because BFGMRESD generates a different correction subspace, since it discards P0 and does not orthogonalize V2 against it.

Another remark is that the truncation of $\Lambda_1^{\#}$ means, in other words, that the least squares problem minimizes only (a linear combination of) $k_1$ columns of the true block residual $R_j$ (namely those which have not converged yet), neglecting the remaining ones, which is what we are aiming for in a deflated scenario. This is however not true in the case of BFGMREST, which consists of BFGMRESD with $k_{max} < p_0$. BFGMREST truncates $\Lambda_1^{\#}$ even if the singular values of the scaled true residual are not small, meaning that BFGMREST potentially neglects (a linear combination of) columns of $R_j$ which have not converged yet. As demonstrated in Section 3.4, this is not the case for DMBR, which always minimizes the norm of the whole true block residual $R_j$ at every iteration, regardless of the chosen $k_j$ or $k_{max}$. This behaviour is one of the main contributions of DMBR over BFGMREST in the truncated scenario, and has been shown to provide a considerable computational gain in our numerical experiments. Naturally, another feature present in DMBR and not in BFGMRESD or BFGMREST is the possibility of deflating at every iteration.
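The difference between deflating and truncating the initial residual can be quantified with the singular values alone. The sketch below assumes, for simplicity, that the scaling matrix $D_0$ is the identity; the function name `neglected_norm` and the test matrix are hypothetical. It measures the Frobenius norm of the part of $R_0$ that is dropped when only the $k_1$ leading singular directions are kept, which equals $\|\Sigma_-\|_F$.

```python
import numpy as np

def neglected_norm(R0, k1):
    """Frobenius norm of the component of R0 outside its k1 leading
    singular directions (scaling matrix D_0 taken as the identity)."""
    sigma = np.linalg.svd(R0, compute_uv=False)   # singular values, descending
    return float(np.sqrt(np.sum(sigma[k1:] ** 2)))

# Hypothetical block residual with prescribed singular values 5, 3, 2, 1e-9.
rng = np.random.default_rng(2)
n, p = 25, 4
U, _ = np.linalg.qr(rng.standard_normal((n, p)))
W, _ = np.linalg.qr(rng.standard_normal((p, p)))
R0 = U @ np.diag([5.0, 3.0, 2.0, 1e-9]) @ W.T

dropped_by_deflation = neglected_norm(R0, 3)    # only the tiny singular value is dropped
dropped_by_truncation = neglected_norm(R0, 2)   # k_max = 2 also drops the value 2
```

In this example, keeping three directions neglects a component of negligible norm, whereas forcing $k_{max} = 2$ neglects a component of norm about 2 that has not converged, which is the situation the remark on BFGMREST describes.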