5.5 H-LU Decomposition
Note that $U_{i_\alpha,i_\alpha} \neq 0$ for all $i_\alpha$ is required for the system $LUx = b$ to be solvable.
The triangular matrices can be expressed as block-triangular matrices as follows:
off-diagonal blocks: $L|_{\tau\times\sigma} = 0$ for $\tau < \sigma$ and $U|_{\tau\times\sigma} = 0$ for $\tau > \sigma$,
diagonal blocks: $L|_{\tau\times\tau} = I$ and $U|_{\tau\times\tau} \in \mathcal{F}(\tau\times\tau)$ for $\tau\times\tau \in P$, (5.5.1)
where $U|_{\tau\times\tau}$ is no longer triangular. The advantage of the block-triangular decomposition is that it may be well defined even if the standard LU decomposition does not exist.
The solution of the system $Ax = b$, given the factorisation $A = LU$, is carried out by solving the two systems $Ly = b$ and $Ux = y$. The first system is treated by the procedure Forward_Substitution, where the input data are $L$, $I$, and $b$, and the output is $y$. The parameters must satisfy $\tau \in T(I\times I, P)$ and $y, b \in \mathbb{C}^I$, and $L$ must satisfy (5.5.1) with $P \subset T(I\times I)$. The procedure Forward_Substitution(L, I, y, b) is called with $\tau = I$, and the input vector $b$ is overwritten. The solution $y|_\tau$ of $L|_{\tau\times\tau}\, y|_\tau = b|_\tau$ is computed by the procedure shown in Appendix B.1.
The second system $Ux = y$ is solved using the procedure Backward_Substitution, where the input parameters are $\tau$, $y$, and $U$ (satisfying (5.5.1)), and the output is the vector $x \in \mathbb{C}^I$. This procedure is analogous to Forward_Substitution and is also described in Appendix B.1. The vector $y$ is overwritten in this procedure. The complete solution of the system $LUx = b$ is performed by the following procedure:
procedure Solve_LU(L, U, I, x, b); {L, U, I, b input; x output}
begin x := b;
  Forward_Substitution(L, I, x, x);
  Backward_Substitution(U, I, x, x);
end;
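As a minimal sketch of what Solve_LU does, the following Python code performs the same two sweeps on dense matrices stored as nested lists. It is a scalar analogue only: the block structure (5.5.1) and the hierarchical format are not modelled, $L$ is assumed to be lower triangular with unit diagonal, and the function names are illustrative rather than taken from Appendix B.1.

```python
def forward_substitution(L, x):
    """Solve L y = x in place; L is lower triangular with unit diagonal."""
    for i in range(len(x)):
        for j in range(i):
            x[i] -= L[i][j] * x[j]


def backward_substitution(U, x):
    """Solve U z = x in place; U is upper triangular with nonzero diagonal."""
    for i in reversed(range(len(x))):
        for j in range(i + 1, len(x)):
            x[i] -= U[i][j] * x[j]
        x[i] /= U[i][i]


def solve_lu(L, U, b):
    """Mirror of Solve_LU: copy b into x, then overwrite x twice."""
    x = list(b)
    forward_substitution(L, x)
    backward_substitution(U, x)
    return x


# Small usage example: A = L U with the factors below, b = A (1, 1)^T.
L = [[1.0, 0.0], [2.0, 1.0]]
U = [[2.0, 1.0], [0.0, 3.0]]
x = solve_lu(L, U, [3.0, 9.0])
```

As in the procedure above, a single work vector is used: it first holds $b$, then $y$, and finally $x$.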
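For example, taking the two sons of the index set $I$, the matrices have the block structure
\[
\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}
=
\begin{pmatrix} L_{11} & 0 \\ L_{21} & L_{22} \end{pmatrix}
\begin{pmatrix} U_{11} & U_{12} \\ 0 & U_{22} \end{pmatrix}.
\]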
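Multiplying out the block product on the right-hand side and comparing it block by block with $A$ gives four equations, one per block:

```latex
\begin{align*}
A_{11} &= L_{11} U_{11}, & A_{12} &= L_{11} U_{12}, \\
A_{21} &= L_{21} U_{11}, & A_{22} &= L_{21} U_{12} + L_{22} U_{22}.
\end{align*}
```

Each equation contains exactly one block that has not been determined by the previous ones, provided the equations are processed in this order.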
The subtasks are the following:
(i) Compute $L_{11}$ and $U_{11}$ as factors of the LU decomposition of $A_{11}$,
(ii) Use $L_{11}U_{12} = A_{12}$ to compute $U_{12}$,
(iii) Use $L_{21}U_{11} = A_{21}$ to compute $L_{21}$,
(iv) Compute $L_{22}$ and $U_{22}$ as LU decomposition of $L_{22}U_{22} = A_{22} - L_{21}U_{12}$.
The procedures Forward_M($L_{11}$, $U_{12}$, $A_{12}$, $\tau_1$, $\tau_2$) and ForwardT_M($U_{11}$, $L_{21}$, $A_{21}$, $\tau_2$, $\tau_1$) are used to solve problems (ii) and (iii), respectively. They are shown in Appendix B.2, and their descriptions can be found in [Hackbusch, 2016, section 7.6.3]. The right-hand side $A_{22} - L_{21}U_{12}$ of the LU decomposition in problem (iv) can be computed using the usual formatted multiplication.
It remains to determine the LU factors in $L_{11}U_{11} = \cdots$ and $L_{22}U_{22} = \cdots$. This defines a recursion, which is carried out until the leaves are reached, where the usual LU decomposition for full matrices is applied.
The desired LU factors of $A$ are produced by calling the procedure LU_Decomposition(L, U, A, I). In general, the problem $L|_{\tau\times\tau} U|_{\tau\times\tau} = A|_{\tau\times\tau}$ for $\tau \in T(I\times I, P)$ is solved using LU_Decomposition(L, U, A, τ).
procedure LU_Decomposition(L, U, A, τ);
if τ×τ ∈ P then compute L|τ×τ and U|τ×τ as LU factors of A|τ×τ
else for i = 1 to #S(τ) do
begin LU_Decomposition(L, U, A, τ[i]);
  for j = i+1 to #S(τ) do
  begin ForwardT_M(U, L, A, τ[j], τ[i]);
    Forward_M(L, U, A, τ[i], τ[j]);
    for r = i+1 to #S(τ) do
      A|τ[j]×τ[r] := A|τ[j]×τ[r] − L|τ[j]×τ[i] U|τ[i]×τ[r]
  end
end;                                                        (5.5.2)
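The recursion can be illustrated by a small dense analogue in Python. This is a sketch under simplifying assumptions, not the hierarchical algorithm itself: the matrix is a nested list, every cluster has exactly two sons obtained by halving an index range, exact arithmetic replaces the formatted operations (no rank truncation), no pivoting is done, and the leaves use the ordinary LU decomposition with unit lower triangular $L$. All function names are hypothetical. The factors overwrite `A`, with $L$ stored strictly below the diagonal and $U$ on and above it.

```python
def leaf_lu(A, lo, hi):
    """In-place LU without pivoting on the diagonal block A[lo:hi, lo:hi]."""
    for k in range(lo, hi):
        for i in range(k + 1, hi):
            A[i][k] /= A[k][k]                  # multiplier l_ik
            for j in range(k + 1, hi):
                A[i][j] -= A[i][k] * A[k][j]


def forward_m(A, r0, r1, c0, c1):
    """Solve L X = Z with L unit lower triangular in A[r0:r1, r0:r1];
    Z = A[r0:r1, c0:c1] is overwritten by X."""
    for i in range(r0, r1):
        for k in range(r0, i):
            for j in range(c0, c1):
                A[i][j] -= A[i][k] * A[k][j]


def forward_t_m(A, r0, r1, c0, c1):
    """Solve X U = Z with U upper triangular in A[c0:c1, c0:c1];
    Z = A[r0:r1, c0:c1] is overwritten by X."""
    for j in range(c0, c1):
        for i in range(r0, r1):
            for k in range(c0, j):
                A[i][j] -= A[i][k] * A[k][j]
            A[i][j] /= A[j][j]


def lu_decomposition(A, lo, hi, leaf=2):
    """Recursive block LU on the index range [lo, hi)."""
    if hi - lo <= leaf:
        leaf_lu(A, lo, hi)                      # leaf: usual LU decomposition
        return
    mid = (lo + hi) // 2
    lu_decomposition(A, lo, mid, leaf)          # (i)   L11 U11 = A11
    forward_m(A, lo, mid, mid, hi)              # (ii)  L11 U12 = A12
    forward_t_m(A, mid, hi, lo, mid)            # (iii) L21 U11 = A21
    for i in range(mid, hi):                    # (iv)  A22 := A22 - L21 U12
        for j in range(mid, hi):
            for k in range(lo, mid):
                A[i][j] -= A[i][k] * A[k][j]
    lu_decomposition(A, mid, hi, leaf)          # (iv)  L22 U22 = updated A22
```

Calling `lu_decomposition(A, 0, len(A))` on a copy of the matrix corresponds to the call with $\tau = I$. In the hierarchical setting the three triple loops would be replaced by the formatted block operations.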
The costs of Forward_Substitution(L, I, y, b) and Backward_Substitution(U, τ, x, y) can be estimated using Lemma D.18 in [Hackbusch, 2016]. Combining both costs gives the cost of solving the system $LUx = b$:
where $S_{\mathcal{H}}(r,P)$ is the storage cost of matrices in $\mathcal{H}(r,P)$, as described in Lemma D.17 of [Hackbusch, 2016].
The costs of solving the systems $LX = Z$ and $XU = Z$ are compared with those of a standard multiplication of $\mathcal{H}$-matrices, which yields
\[
N_{\text{Forward\_M}}(r,P) + N_{\text{ForwardT\_M}}(r,P) \le N_{MM}(P,r,r),
\]
where $N_{MM}(P,r,r)$ is given in (5.4.6). The procedure (5.5.2) that produces the LU decomposition does not require more operations than the matrix-matrix multiplication:
\[
N_{\text{LU decomposition}}(r,P) \le N_{MM}(P,r,r).
\]
5.5.1 Sparse Matrices
The inverse of the finite element matrix arising from a general elliptic differential operator can be well approximated by $\mathcal{H}$-matrices, as proved in [Hackbusch, 2016]. The finite element matrix in (2.5.10) is stored in sparse format, which is more convenient in terms of the storage cost and the cost of matrix-vector multiplication. However, the main reason to convert the matrix $L$ into the $\mathcal{H}(r,P)$ format is to apply hierarchical matrix operations such as the LU decomposition. The transfer is based on the following lemma:
Lemma 5.5.1. Let $\mathcal{H}(r,P) \subset \mathbb{C}^{I\times I}$ be an arbitrary $\mathcal{H}$-matrix format, with $P$ based on the admissibility condition (5.2.5). Furthermore, let $\operatorname{dist}(\tau,\sigma)$ be defined by (5.2.3) and (5.2.4b). Then any finite element matrix is in $\mathcal{H}(r,P)$ for any $r \in \mathbb{N}_0$.
Proof. See [Hackbusch, 2015, Lemma 9.4.a].
The transfer of a sparse matrix $M$ into the $\mathcal{H}$-format is performed in the following two steps:
(i) Set $M|_b := 0$ for $b \in P^+$,
(ii) Copy the block matrix $M|_b$ for $b \in P^-$ as a full matrix.
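The two steps can be sketched in Python as follows. The data layout is an assumption made for illustration only: the sparse matrix is a dictionary mapping index pairs to values, and the partition is a list of triples (row range, column range, admissible flag). The toy admissibility rule in the usage example (cluster indices differing by at least two) merely stands in for condition (5.2.5).

```python
def sparse_to_h(entries, partition):
    """Transfer a sparse matrix into a block list.

    entries:   dict {(i, j): value} holding the nonzero entries.
    partition: list of (rows, cols, admissible) with rows/cols as ranges.
    Returns a list of tuples (kind, rows, cols, data).
    """
    blocks = []
    for rows, cols, admissible in partition:
        if admissible:
            # Step (i): blocks in P+ become zero (rank 0). The transfer is
            # only exact if the sparse matrix has no entry in such a block.
            if any((i, j) in entries for i in rows for j in cols):
                raise ValueError("nonzero entry in an admissible block")
            blocks.append(("zero", rows, cols, None))
        else:
            # Step (ii): blocks in P- are copied as full matrices.
            dense = [[entries.get((i, j), 0.0) for j in cols] for i in rows]
            blocks.append(("full", rows, cols, dense))
    return blocks


def tridiagonal(n):
    """Sparse tridiagonal test matrix with stencil (-1, 2, -1)."""
    entries = {}
    for i in range(n):
        entries[(i, i)] = 2.0
        if i + 1 < n:
            entries[(i, i + 1)] = -1.0
            entries[(i + 1, i)] = -1.0
    return entries


# Usage: three clusters of size two; blocks of non-neighbouring clusters
# are treated as admissible (hypothetical stand-in for (5.2.5)).
clusters = [range(0, 2), range(2, 4), range(4, 6)]
partition = [(t, s, abs(a - b) >= 2)
             for a, t in enumerate(clusters)
             for b, s in enumerate(clusters)]
blocks = sparse_to_h(tridiagonal(6), partition)
```

The check in the admissible branch reflects the statement of Lemma 5.5.1: for a finite element matrix, the admissible blocks contain no nonzero entries, so setting them to zero loses nothing.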
One tries to optimise the $\mathcal{H}$-LU decomposition for sparse matrices using the following idea: find a permutation $P$ such that the LU decomposition of $PAP^H$ is sparser than that of $A$. The details of the sparsity pattern will be described in Section 5.5.2. The LU decomposition of a matrix $A \in \mathbb{C}^{I\times I}$ is determined by the ordering of the index set $I$, which is obtained from the cluster tree $T(I)$ (see the procedure Forward_Substitution). Hence, different permutations require different cluster trees. The construction of cluster trees for sparse matrices will be introduced in Section 5.5.3.
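The effect of the ordering on the sparsity of the LU factors can be seen in a classical small example, sketched here in Python with dense arithmetic and no pivoting. The arrow-shaped test matrix and all function names are chosen for illustration and are not taken from the source; for a real permutation matrix, $PAP^H$ is simply a symmetric reordering of rows and columns.

```python
def lu_nonzeros(A, tol=1e-14):
    """Run LU without pivoting on a copy of A and count the nonzero
    entries of L (strictly lower part) and U (upper part) together."""
    n = len(A)
    M = [row[:] for row in A]
    for k in range(n):
        for i in range(k + 1, n):
            if M[i][k] != 0.0:
                M[i][k] /= M[k][k]
                for j in range(k + 1, n):
                    M[i][j] -= M[i][k] * M[k][j]
    return sum(1 for row in M for v in row if abs(v) > tol)


def permute(A, perm):
    """Symmetric reordering: entry (i, j) of the result is A[perm[i]][perm[j]]."""
    return [[A[p][q] for q in perm] for p in perm]


# Arrow matrix: dense first row and column, otherwise diagonal.
n = 5
A = [[0.0] * n for _ in range(n)]
for i in range(n):
    A[0][i] = A[i][0] = 1.0
    A[i][i] = 10.0

fill_original = lu_nonzeros(A)
fill_reversed = lu_nonzeros(permute(A, list(reversed(range(n)))))
```

With the dense row and column ordered first, elimination fills in every entry of the factors; with the reversed ordering, the dense row and column come last and the factors keep the sparsity pattern of the matrix.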
Inverting a sparse finite element matrix yields a fully populated matrix. The inverse can be well approximated via the $\mathcal{H}$-LU decomposition, where the factors $L$ and $U$ are represented by hierarchical triangular matrices (5.5.1) in $\mathcal{H}(r,P)$ [Hackbusch, 2015, section 9.2.8], [Grasedyck et al., 2009].
Figure 5.5.1: (a) The graphs $G_1$ and $G_2$ and the separator $\gamma$. (b) The graphs represented in the graph matrix.