http://ijmms.hindawi.com © Hindawi Publishing Corp.
A NOTE ON COMPUTING THE GENERALIZED INVERSE $A^{(2)}_{T,S}$ OF A MATRIX $A$
XIEZHANG LI and YIMIN WEI
Received 10 May 2001 and in revised form 1 February 2002
The generalized inverse $A^{(2)}_{T,S}$ of a matrix $A$ is a $\{2\}$-inverse of $A$ with the prescribed range $T$ and null space $S$. A representation for the generalized inverse $A^{(2)}_{T,S}$ has recently been developed under the condition $\sigma(GA|_T)\subset(0,\infty)$, where $G$ is a matrix with $R(G)=T$ and $N(G)=S$. In this note, we remove the above condition. Three types of iterative methods for $A^{(2)}_{T,S}$ are presented for the case in which $\sigma(GA|_T)$ is a subset of the open right half-plane; they are extensions of existing computational procedures for $A^{(2)}_{T,S}$, including special cases such as the weighted Moore-Penrose inverse $A^{\dagger}_{M,N}$ and the Drazin inverse $A^{D}$. Numerical examples are given to illustrate our results.
2000 Mathematics Subject Classification: 15A09, 65F20.
1. Introduction. Given a complex matrix $A\in\mathbb{C}^{m\times n}$, any matrix $X\in\mathbb{C}^{n\times m}$ satisfying $XAX=X$ is called a $\{2\}$-inverse of $A$. Let $T$ and $S$ be subspaces of $\mathbb{C}^n$ and $\mathbb{C}^m$, respectively. A matrix $X\in\mathbb{C}^{n\times m}$ is called a $\{2\}$-inverse of $A$ with the prescribed range $T$ and null space $S$, denoted by $A^{(2)}_{T,S}$, if the following conditions are satisfied:
$$XAX=X,\qquad R(X)=T,\qquad N(X)=S, \tag{1.1}$$
where $R(X)$ is the range of $X$ and $N(X)$ is the null space of $X$. It is a well-known fact [1] that if $\dim T=\dim S^{\perp}\le\operatorname{rank}(A)$, then there exists a unique $A^{(2)}_{T,S}$ if and only if $AT\oplus S=\mathbb{C}^m$. It is obvious from the definition above that $AA^{(2)}_{T,S}=P_{AT,S}$ and $A^{(2)}_{T,S}A=P_{T,(A^*S^{\perp})^{\perp}}$, where $P_{S_1,S_2}$ is the projector onto the subspace $S_1$ along the subspace $S_2$.
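As a small illustration of definition (1.1), the following sketch checks the defining identity $XAX=X$ and the projector property of $AX$ on a hand-picked $2\times 2$ example. The matrices and the helper `matmul` are hypothetical choices made for this illustration only; they do not come from the paper.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


# Hypothetical example: T = span{e1}, S = span{e2}, so that AT + S = C^2.
A = [[2.0, 0.0],
     [0.0, 3.0]]
# X inverts A on T and annihilates S, so R(X) = T and N(X) = S.
X = [[0.5, 0.0],
     [0.0, 0.0]]

XAX = matmul(matmul(X, A), X)   # should reproduce X
AX = matmul(A, X)               # projector onto AT along S
XA = matmul(X, A)               # projector onto T

print(XAX == X)                 # the {2}-inverse identity XAX = X
print(matmul(AX, AX) == AX)     # AX is idempotent
```

Here $X$ is not an inverse of $A$ in the usual sense (it has rank one), which is exactly the situation a $\{2\}$-inverse with prescribed range and null space is meant to cover.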
There are seven important types of $\{2\}$-inverses of $A$: the Moore-Penrose inverse $A^{\dagger}$, the weighted Moore-Penrose inverse $A^{\dagger}_{M,N}$, the $W$-weighted Drazin inverse $A_{d,w}$, the Drazin inverse $A^{D}$, the group inverse $A^{\#}$, the Bott-Duffin inverse $A^{(-1)}_{(L)}$, and the generalized Bott-Duffin inverse $A^{(\dagger)}_{(L)}$. All of them are special cases of the generalized inverse $A^{(2)}_{T,S}$ of $A$ for specific $T$ and $S$.
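Lemma 1.1. (a) [1] Let $A\in\mathbb{C}^{m\times n}$. Then, for the Moore-Penrose inverse $A^{\dagger}$ and the weighted Moore-Penrose inverse $A^{\dagger}_{M,N}$,
(i) $A^{\dagger}=A^{(2)}_{R(A^*),N(A^*)}$;
(ii) $A^{\dagger}_{M,N}=A^{(2)}_{R(N^{-1}A^*M),N(N^{-1}A^*M)}$, where $N$ and $M$ are Hermitian positive definite matrices of order $n$ and $m$, respectively.
(b) [1,2,3] Let $A\in\mathbb{C}^{n\times n}$. Then, for the Drazin inverse $A^{D}$, the group inverse $A^{\#}$, the Bott-Duffin inverse $A^{(-1)}_{(L)}$, and the generalized Bott-Duffin inverse $A^{(\dagger)}_{(L)}$,
(iv) $A^{D}=A^{(2)}_{R(A^k),N(A^k)}$, where $k=\operatorname{Ind}(A)$;
(v) in particular, when $\operatorname{Ind}(A)=1$, $A^{\#}=A^{(2)}_{R(A),N(A)}$;
(vi) $A^{(-1)}_{(L)}=P_L(AP_L+P_{L^{\perp}})^{-1}=A^{(2)}_{L,L^{\perp}}$, where $L$ is a subspace of $\mathbb{C}^n$ such that $AL\oplus L^{\perp}=\mathbb{C}^n$ and $P_L$ is the orthogonal projector onto $L$;
(vii) $A^{(\dagger)}_{(L)}=A^{(-1)}_{(S)}=A^{(2)}_{S,S^{\perp}}$, where $S=R(P_LA)$.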
The $\{2\}$-inverse has many applications, for example, in iterative methods for solving nonlinear equations [1,9] and in statistics [6,7]. In particular, the $\{2\}$-inverse plays an important role in stable approximations of ill-posed problems and in linear and nonlinear problems involving rank-deficient generalized inverses [8,12]. In the literature, researchers have proposed many numerical methods for computing $A^{(2)}_{T,S}$; see [2,3,11,13,15,16,18].
As usual, we denote the spectrum and the spectral radius of $A$ by $\sigma(A)$ and $\rho(A)$, respectively. The notation $\|\cdot\|$ stands for the spectral norm. The following theorem, which is applied in this note, comes from the theory of semi-iterative methods.
Theorem 1.2 (see [5]). Let $B\in\mathbb{C}^{n\times n}$ be a nonsingular matrix and let $\sigma(B)\subset\Omega$, where $\Omega$ is a simply connected compact set excluding the origin. If a sequence of polynomials $\{s_m(z)\}_{m=0}^{\infty}$ converges uniformly to $1/z$ on $\Omega$, then $\{s_m(B)\}$ converges to $B^{-1}$.
In this note, a representation for the generalized inverse $A^{(2)}_{T,S}$ under the condition $\sigma(GA|_T)\subset\{z:\operatorname{Re}(z)>0\}$, where $G$ is a matrix with $R(G)=T$ and $N(G)=S$, is presented in Section 2. The Euler-Knopp iterative method and semi-iterative methods for $A^{(2)}_{T,S}$ with linear convergence are derived in Section 3. Quadratically convergent methods for $A^{(2)}_{T,S}$ are developed in Section 4. Finally, numerical examples are given in Section 5 to illustrate our results.
2. Representation. In this section, we give a representation for the generalized inverse $A^{(2)}_{T,S}$, which may be viewed as an application of the classical theory of summability to the representation of generalized inverses.
Lemma 2.1 (see [13]). Suppose $A\in\mathbb{C}^{m\times n}$. Let $T$ and $S$ be subspaces of $\mathbb{C}^n$ and $\mathbb{C}^m$, respectively, such that $AT\oplus S=\mathbb{C}^m$. Suppose that $G\in\mathbb{C}^{n\times m}$ satisfies $R(G)=T$ and $N(G)=S$. Denote by $\tilde{A}=(GA)|_T$ the restriction of $GA$ to $T$. Then $\operatorname{Ind}(GA)=1$ and
$$A^{(2)}_{T,S}=\tilde{A}^{-1}G. \tag{2.1}$$
It follows from Lemma 1.1 that the existence of $G$ is assured for each of the seven common types of generalized inverses: $A^*$, $N^{-1}A^*M$, $A(WA)^q$, $A^k$, $A$, $P_L$, and $P_S$. Now we are in a position to establish a representation theorem.
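Theorem 2.2. Let $A$, $T$, $S$, $G$, and $\tilde{A}$ be as in Lemma 2.1. If $\sigma(\tilde{A})$ is contained in a simply connected compact set $\Omega$ excluding the origin and a polynomial sequence $\{s_m(z)\}$ converges uniformly to $1/z$ on $\Omega$, then
$$A^{(2)}_{T,S}=\lim_{m\to\infty}s_m(\tilde{A})G. \tag{2.2}$$
Furthermore,
$$\frac{\|s_m(\tilde{A})G-A^{(2)}_{T,S}\|_P}{\|A^{(2)}_{T,S}\|_P}\le\max_{z\in\sigma(\tilde{A})}|zs_m(z)-1|+O(\epsilon), \tag{2.3}$$
where $P$ is an invertible matrix such that $P^{-1}GAP$ is the $\epsilon$-Jordan canonical form of $GA$, and $\|B\|_P=\|P^{-1}B\|$ for each $B\in\mathbb{C}^{n\times m}$.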
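Proof. Assume that $\sigma(\tilde{A})\subset\Omega$. Since $\{s_m(z)\}$ converges uniformly to $1/z$ on $\Omega$, Theorem 1.2 gives
$$\lim_{m\to\infty}s_m(\tilde{A})=\tilde{A}^{-1}. \tag{2.4}$$
It follows from Lemma 2.1 that
$$\lim_{m\to\infty}s_m(\tilde{A})G=\tilde{A}^{-1}G=A^{(2)}_{T,S}. \tag{2.5}$$
The error can be written as
$$s_m(\tilde{A})G-A^{(2)}_{T,S}=\bigl[s_m(\tilde{A})\tilde{A}-I\bigr]A^{(2)}_{T,S}. \tag{2.6}$$
Since $P$ is a nonsingular matrix such that $P^{-1}GAP$ is the $\epsilon$-Jordan canonical form of $GA$, it is well known that
$$\|P^{-1}GAP\|\le\rho(GA)+\epsilon. \tag{2.7}$$
Thus
$$\begin{aligned}
\|s_m(\tilde{A})G-A^{(2)}_{T,S}\|_P&=\bigl\|P^{-1}\bigl[s_m(\tilde{A})\tilde{A}-I\bigr]P\,P^{-1}A^{(2)}_{T,S}\bigr\|\\
&\le\bigl\|P^{-1}\bigl[s_m(\tilde{A})\tilde{A}-I\bigr]P\bigr\|\,\|A^{(2)}_{T,S}\|_P\\
&\le\Bigl[\max_{z\in\sigma(\tilde{A})}|s_m(z)z-1|+O(\epsilon)\Bigr]\|A^{(2)}_{T,S}\|_P.
\end{aligned} \tag{2.8}$$
The last inequality follows from the spectral mapping theorem, since $s_m(z)$ is a polynomial in $z$. This completes the proof.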
In order to apply the general error estimate of Theorem 2.2 to specific approximation procedures, it is convenient to have lower and upper bounds for $\sigma(\tilde{A})$. These are given in the next lemma.
Lemma 2.3. Let $A$, $T$, $S$, $G$, and $\tilde{A}$ be as in Lemma 2.1. Then for each $\lambda\in\sigma(\tilde{A})$,
$$\frac{1}{\|(GA)^{\#}\|}\le|\lambda|\le\|GA\|. \tag{2.9}$$
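Proof. We only show the first inequality since the second is trivial. It follows from Lemma 2.1 that $\operatorname{Ind}(GA)=1$. Then the Jordan canonical form of $GA$ is
$$GA=P\begin{pmatrix}C&0\\0&0\end{pmatrix}P^{-1},\qquad (GA)^{\#}=P\begin{pmatrix}C^{-1}&0\\0&0\end{pmatrix}P^{-1}, \tag{2.10}$$
where $C$ is invertible. For each $\lambda\in\sigma(\tilde{A})$, $1/\lambda\in\sigma(\tilde{A}^{-1})$ since $\tilde{A}$ is invertible. Consequently, we have
$$\frac{1}{|\lambda|}\le\rho(\tilde{A}^{-1})=\rho(C^{-1})\le\|(GA)^{\#}\|, \tag{2.11}$$
which leads to (2.9). This completes the proof.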
Remark 2.4. Theorem 2.2 extends the representation of $A^{(2)}_{T,S}$ in [15], in which $\sigma(GA|_T)\subset(0,\infty)$ is required. The theorem also recovers the representations of $A^{D}$ in [16] and $A^{\dagger}_{M,N}$ in [17] as special cases.
3. Iterative methods for $A^{(2)}_{T,S}$. In this section, we present applications of Theorem 2.2 and Lemma 2.3 in developing specific computational procedures for the generalized inverse $A^{(2)}_{T,S}$ and in estimating the corresponding error bounds.
A well-known summability method is the Euler-Knopp method. A series $\sum_{m=0}^{\infty}a_m$ is said to be Euler-Knopp summable with parameter $\alpha>0$ to the value $a$ if the sequence defined by
$$s_m=\alpha\sum_{i=0}^{m}\sum_{j=0}^{i}\binom{i}{j}(1-\alpha)^{i-j}\alpha^{j}a_j \tag{3.1}$$
converges to $a$. If we choose $a_m=(1-z)^m$, $m\ge0$, then as the Euler-Knopp transform of the series $\sum_{m=0}^{\infty}(1-z)^m$, we obtain a sequence $\{s_m(z)\}$, where
$$s_m(z)=\alpha\sum_{j=0}^{m}(1-\alpha z)^{j}. \tag{3.2}$$
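Clearly, $\lim_{m\to\infty}s_m(z)=1/z$ uniformly on any compact subset of the open set $E_{\alpha}:=\{z:|1-\alpha z|<1\}$. We assume that $\sigma(\tilde{A})\subset\{z:\operatorname{Re}(z)>0\}$. Denote
$$\phi:=\max_{\lambda\in\sigma(\tilde{A})}\Bigl\{|\operatorname{Arg}\lambda|:-\frac{\pi}{2}<\operatorname{Arg}\lambda<\frac{\pi}{2}\Bigr\}. \tag{3.3}$$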
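It follows from Lemma 2.3 that
$$\sigma(\tilde{A})\subset\bigl\{z=re^{i\theta}:r_1\le r\le r_2,\ -\phi\le\theta\le\phi\bigr\}=:F, \tag{3.4}$$
where $r_1=1/\|(GA)^{\#}\|$ and $r_2=\|GA\|$. It can be shown with the law of sines that
$$F\subset\{w:|w-g|\le g\},\quad\text{for } g=\frac{\|GA\|}{2\cos\phi}. \tag{3.5}$$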
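If a parameter $\alpha$ satisfies
$$0<\alpha<\frac{2\cos\phi}{\|GA\|}, \tag{3.6}$$
then $\sigma(\tilde{A})\subset E_{\alpha}$. There is always a simply connected compact set $\Omega$ such that $\sigma(\tilde{A})\subset\Omega\subset E_{\alpha}$. Hence $s_m(z)$ of (3.2) converges uniformly to $1/z$ on $\Omega$. It follows from Theorem 2.2 that
$$A^{(2)}_{T,S}=\alpha\sum_{n=0}^{\infty}(I-\alpha GA)^{n}G. \tag{3.7}$$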
Notice that if $A_m$ is the $m$th partial sum, that is, $A_m=\alpha\sum_{j=0}^{m}(I-\alpha GA)^{j}G$, then an iterative form for $\{A_m\}$ is given by
$$A_0=\alpha G,\qquad A_{m+1}=(I-\alpha GA)A_m+\alpha G,\quad m\ge0. \tag{3.8}$$
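Iteration (3.8) can be tried on a small example. The sketch below is only an illustration under stated assumptions: it takes a hypothetical rank-one real matrix $A$, chooses $G=A^*$ (so the limit is the Moore-Penrose inverse by Lemma 1.1), and uses plain Python lists so that it has no dependencies. The function names `matmul` and `euler_knopp` are made up for this sketch. For this $A$, the only nonzero eigenvalue of $GA$ is $25=\|GA\|$, so $\phi=0$ and any $0<\alpha<0.08$ is admissible.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def euler_knopp(A, G, alpha, steps):
    """Iteration (3.8): A_0 = alpha*G, A_{m+1} = (I - alpha*G*A) A_m + alpha*G."""
    GA = matmul(G, A)
    n = len(GA)
    # Iteration matrix I - alpha*GA and constant term alpha*G.
    M = [[(1.0 if i == j else 0.0) - alpha * GA[i][j] for j in range(n)]
         for i in range(n)]
    aG = [[alpha * g for g in row] for row in G]
    Am = aG
    for _ in range(steps):
        MA = matmul(M, Am)
        Am = [[MA[i][j] + aG[i][j] for j in range(len(aG[0]))]
              for i in range(n)]
    return Am


A = [[1.0, 2.0],
     [2.0, 4.0]]                        # hypothetical rank-one matrix
G = [list(col) for col in zip(*A)]      # G = A^T, the real case of G = A^*
alpha = 0.05                            # satisfies 0 < alpha < 2*cos(0)/25

X = euler_knopp(A, G, alpha, 60)
pinv = [[0.04, 0.08], [0.08, 0.16]]     # A^+ = A^T / 25 for this rank-one A
err = max(abs(X[i][j] - pinv[i][j]) for i in range(2) for j in range(2))
print(err)
```

With this choice of $\alpha$ the contraction factor on $T$ is $|1-25\alpha|=0.25$, so the error decreases by roughly a factor of four per step.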
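For an error bound, we note that the sequence of polynomials $\{s_m(z)\}$ satisfies
$$zs_{m+1}(z)-1=(1-\alpha z)\bigl[zs_m(z)-1\bigr]. \tag{3.9}$$
Thus
$$|zs_m(z)-1|=|1-\alpha z|^{m+1}\le\beta^{m+1}\to0\quad(m\to\infty), \tag{3.10}$$
where
$$\beta=\max_{z\in\sigma(\tilde{A})}|1-\alpha z|\le\max_{z\in F}|1-\alpha z|<1. \tag{3.11}$$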
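The scalar statements (3.2), (3.6), and (3.10) are easy to check numerically. The following sketch uses a hypothetical spectrum in the open right half-plane and a hypothetical value standing in for $\|GA\|$ (any number not smaller than the largest $|\lambda|$ works for this check); the function names are illustrative and not part of the paper.

```python
import cmath
import math


def alpha_upper_bound(eigs, norm_GA):
    """Right-hand side of (3.6): 2*cos(phi)/||GA|| with phi = max |Arg(lambda)|."""
    phi = max(abs(cmath.phase(lam)) for lam in eigs)
    return 2.0 * math.cos(phi) / norm_GA


def s(m, z, alpha):
    """Euler-Knopp polynomial (3.2): alpha * sum_{j=0}^{m} (1 - alpha*z)**j."""
    return alpha * sum((1 - alpha * z) ** j for j in range(m + 1))


eigs = [1 + 1j, 1 - 1j, 2.0]    # hypothetical spectrum with Re(z) > 0
norm_GA = 2.5                   # hypothetical stand-in for ||GA||
alpha = 0.5 * alpha_upper_bound(eigs, norm_GA)

for z in eigs:
    beta_z = abs(1 - alpha * z)                      # must be < 1 by (3.6)
    # Identity (3.10) with m = 5: |z*s_5(z) - 1| = |1 - alpha*z|**6.
    identity_gap = abs(abs(z * s(5, z, alpha) - 1) - beta_z ** 6)
    # Convergence s_m(z) -> 1/z for large m.
    limit_gap = abs(s(200, z, alpha) - 1 / z)
    print(z, beta_z, identity_gap, limit_gap)
```

The closer $\alpha$ is pushed toward the bound in (3.6), the closer $|1-\alpha z|$ gets to 1 for the eigenvalues of largest argument, and the slower the linear convergence becomes.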
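Actually, by the maximum modulus principle, $\max_{z\in F}|1-\alpha z|=\max_{z\in\partial F}|1-\alpha z|$. We denote the four parts of $\partial F$ as follows:
$$\begin{aligned}
\Gamma_1&=\{r_1e^{i\theta}:-\phi\le\theta\le\phi\},&\Gamma_2&=\{re^{i\phi}:r_1\le r\le r_2\},\\
\Gamma_3&=\{r_2e^{i\theta}:-\phi\le\theta\le\phi\},&\Gamma_4&=\{re^{-i\phi}:r_1\le r\le r_2\}.
\end{aligned} \tag{3.12}$$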
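If $z\in\Gamma_1$, then $|1-\alpha z|^2=1-2\alpha r_1\cos\theta+\alpha^2r_1^2$, and it is obvious that
$$\max_{z\in\Gamma_1}|1-\alpha z|=|1-\alpha r_1e^{i\phi}|. \tag{3.13}$$
With an analogous argument, we have
$$\max_{z\in\Gamma_3}|1-\alpha z|=|1-\alpha r_2e^{i\phi}|. \tag{3.14}$$
If $z\in\Gamma_2\cup\Gamma_4$, then $|1-\alpha z|^2=1-2\alpha r\cos\phi+\alpha^2r^2$ is a quadratic function of $r$ on $[r_1,r_2]$, which achieves its maximum at either $r=r_1$ or $r=r_2$. So
$$\max_{z\in\Gamma_2\cup\Gamma_4}|1-\alpha z|=\max\bigl\{|1-\alpha r_1e^{i\phi}|,\ |1-\alpha r_2e^{i\phi}|\bigr\}. \tag{3.15}$$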
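It follows from Theorem 2.2 and (3.10) that an error bound is given by
$$\frac{\|A_m-A^{(2)}_{T,S}\|_P}{\|A^{(2)}_{T,S}\|_P}\le\beta^{m+1}+O(\epsilon), \tag{3.16}$$
where
$$\beta\le\max\Bigl\{\bigl|1-\alpha e^{i\phi}/\|(GA)^{\#}\|\bigr|,\ \bigl|1-\alpha e^{i\phi}\|GA\|\bigr|\Bigr\}. \tag{3.17}$$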
Therefore, we have shown the following general convergence theorem.
Theorem 3.1. Let $A$, $T$, $S$, and $G$ be as in Lemma 2.1. Suppose the spectrum of $GA|_T$ is contained in the open right half-plane. Then the sequence $\{A_m\}$ of (3.8) converges linearly to $A^{(2)}_{T,S}$ if $0<\alpha<2\cos\phi/\|GA\|$, where $\phi$ is given by (3.3). Moreover, the relative error is bounded by (3.16).
We remark that Theorem 3.1 is an extension of corresponding results in [15,16]. The procedure of semi-iterative methods [5,10] for solving a linear system can easily be extended to solve
$$X=HX+C,\quad\text{for } C\in\mathbb{C}^{n\times n}. \tag{3.18}$$
If $\rho(H)<1$, then a sequence of matrices $\{X_m\}$, generated by
$$X_0=C;\qquad X_{m+1}=HX_m+C\quad(m\ge0), \tag{3.19}$$
converges to $(I-H)^{-1}C$. In general, let $1\notin\sigma(H)$. As usual, based on a sequence of polynomials $\{p_m(z)\}$ given by
$$p_m(z)=\sum_{i=0}^{m}\pi_{m,i}z^{i},\quad\text{where }\sum_{i=0}^{m}\pi_{m,i}=1, \tag{3.20}$$
the corresponding semi-iterative method induced by $\{p_m(z)\}$ for the computation of $(I-H)^{-1}C$ is defined as
$$Y_m=\sum_{i=0}^{m}\pi_{m,i}X_i,\quad m\ge0. \tag{3.21}$$
Moreover, the matrices $Y_m$ and the corresponding residual matrices $R_m$ are given by
$$Y_m=p_m(H)Y_0+q_{m-1}(H)C,\qquad R_m=p_m(H)\bigl[C-(I-H)Y_0\bigr], \tag{3.22}$$
where
$$q_{m-1}(z)=\frac{1-p_m(z)}{1-z}\quad\text{with } q_{-1}(z)=0. \tag{3.23}$$
If $\{q_m(H)\}$ converges to $(I-H)^{-1}$ or, equivalently, if $\{p_m(H)\}$ converges to 0, then the sequence $\{Y_m\}$ of (3.21) converges to $(I-H)^{-1}C$. In particular, for $H=I-GA|_T$ and $C=G$, $\{Y_m\}$ converges to $A^{(2)}_{T,S}$. With an application of Theorem 1.2, we have the following corollary.
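In particular, $\Omega_1$ is either a complex segment $[\alpha,\beta]$ excluding 1 or a closed ellipse in the left half-plane $\{z:\operatorname{Re}(z)<1\}$ with foci $\alpha$ and $\beta$. Let a sequence of polynomials $\{p_m(z)\}$ be given by
$$p_m(z)=\frac{T_m\bigl((z-\delta)/\xi\bigr)}{T_m\bigl((1-\delta)/\xi\bigr)},\qquad\delta=\frac{\alpha+\beta}{2},\qquad\xi=\frac{\beta-\alpha}{2}, \tag{3.24}$$
where $T_m$ is the $m$th Chebyshev polynomial. The semi-iterative method induced by $\{p_m(z)\}$ is the Chebyshev iterative method, which is optimal for the ellipse $\Omega_1$. The corresponding two-step stationary method with the same asymptotically optimal convergence rate is given by
$$\begin{aligned}
&Y_0=G;\qquad Y_1=\mu(HY_0+G);\\
&Y_{m+1}=\mu_0(HY_m+G)+\mu_1Y_m+\mu_2Y_{m-1}\quad(m\ge1),
\end{aligned} \tag{3.25}$$
where
$$\mu_0=\frac{4}{\bigl(\sqrt{1-\beta}+\sqrt{1-\alpha}\bigr)^2},\qquad\mu_1=\frac{\alpha+\beta}{2}\mu_0,\qquad\mu_2=1-\mu_0-\mu_1. \tag{3.26}$$
The sequence $\{Y_m\}$ converges asymptotically optimally to $A^{(2)}_{T,S}$.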
4. Quadratically convergent methods. The Newton-Raphson method for finding the root $1/z$ of the function $s(w)=w^{-1}-z$ is given by
$$w_{m+1}=w_m(2-zw_m),\quad\text{for a suitable } w_0. \tag{4.1}$$
For $\alpha>0$, a sequence of functions $\{s_m(z)\}$ is defined by
$$s_0(z)=\alpha,\qquad s_{m+1}(z)=s_m(z)\bigl[2-zs_m(z)\bigr]. \tag{4.2}$$
Let $z\in\sigma(GA|_T)$ and $0<\alpha<2\cos\phi/\|GA\|$. It follows from the recursive form $zs_{m+1}(z)-1=-[zs_m(z)-1]^2$ that
$$|zs_m(z)-1|=|\alpha z-1|^{2^m}\le\beta^{2^m}\to0,\quad\text{as } m\to\infty, \tag{4.3}$$
where an upper bound of $\beta$ is given by (3.17).
The great attraction of the Newton-Raphson method is the generally quadratic nature of the convergence. Using the above facts in conjunction with Lemma 2.3, we see that a sequence $\{s_m(\tilde{A})\}$ defined by
$$s_0(\tilde{A})=\alpha I,\qquad s_{m+1}(\tilde{A})=s_m(\tilde{A})\bigl[2I-\tilde{A}s_m(\tilde{A})\bigr] \tag{4.4}$$
has the property that $\lim_{m\to\infty}s_m(\tilde{A})G=A^{(2)}_{T,S}$. If we set $A_m=s_m(\tilde{A})G$, then
$$A_0=\alpha G,\qquad A_{m+1}=A_m(2I-AA_m). \tag{4.5}$$
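To see the quadratic behaviour of (4.5) concretely, the following sketch runs the iteration on a hypothetical rank-one real matrix with $G=A^*$, so that the limit is the Moore-Penrose inverse. The matrix, the parameter value, and the names `matmul` and `newton` are assumptions of this illustration. For this $A$ the only nonzero eigenvalue of $GA$ is 25, so with $\alpha=0.05$ the scalar residual in (4.3) is $0.25^{2^m}$.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def newton(A, G, alpha, steps):
    """Iteration (4.5): A_0 = alpha*G, A_{m+1} = A_m (2I - A A_m).

    Returns the list of all iterates A_0, ..., A_steps.
    """
    Am = [[alpha * g for g in row] for row in G]
    iterates = [Am]
    for _ in range(steps):
        AAm = matmul(A, Am)
        k = len(AAm)
        R = [[(2.0 if i == j else 0.0) - AAm[i][j] for j in range(k)]
             for i in range(k)]          # 2I - A A_m
        Am = matmul(Am, R)
        iterates.append(Am)
    return iterates


A = [[1.0, 2.0],
     [2.0, 4.0]]                        # hypothetical rank-one matrix
G = [list(col) for col in zip(*A)]      # G = A^T, the real case of G = A^*
alpha = 0.05                            # 0 < alpha < 2/25

pinv = [[0.04, 0.08], [0.08, 0.16]]     # A^+ = A^T / 25 for this A
errs = [max(abs(X[i][j] - pinv[i][j]) for i in range(2) for j in range(2))
        for X in newton(A, G, alpha, 6)]
for m, e in enumerate(errs):
    print(m, e)                         # errors shrink like 0.25**(2**m)
```

Six steps already reach rounding level here, whereas the linearly convergent iteration (3.8) with the same $\alpha$ needs several dozen steps for comparable accuracy.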
Corollary 4.1. Let $A$, $T$, $S$, and $G$ be as in Lemma 2.1. Suppose that $\sigma(GA|_T)\subset\{z:\operatorname{Re}(z)>0\}$. Then the sequence $\{A_m\}$ of (4.5) converges quadratically to $A^{(2)}_{T,S}$ for $0<\alpha<2\cos\phi/\|GA\|$. Furthermore, an error bound is given by
$$\frac{\|A_m-A^{(2)}_{T,S}\|_P}{\|A^{(2)}_{T,S}\|_P}\le\beta^{2^m}+O(\epsilon), \tag{4.6}$$
where an upper bound of $\beta$ is given in (3.17).
We remark that Corollary 4.1 is an extension of [4,13,15]. It covers the iterative methods for $A^{\dagger}_{M,N}$ in [17].
The Newton-Raphson procedure can be sped up by the successive matrix squaring technique in [14] if two parallel processors are available. In fact, the sequence in (4.5) is mathematically equivalent to
$$A_0=\alpha G,\qquad P_0=I-\alpha GA,\qquad A_{m+1}=(I+P_m)A_m,\qquad P_{m+1}=P_m^2. \tag{4.7}$$
There are two matrix multiplications in each step of both (4.5) and (4.7). However, $A_{m+1}$ and $P_{m+1}$ in (4.7) can be calculated simultaneously.
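One way to see why (4.7) works is that $P_m=(I-\alpha GA)^{2^m}$, so each step doubles the number of terms of the series (3.7) contained in $A_m$: after $m$ steps, $A_m$ in (4.7) is the partial sum with $2^m$ terms. The sketch below checks this on a hypothetical example by comparing three steps of (4.7) with seven steps of (3.8), which also yields the partial sum with eight terms. The example matrix and helper names are assumptions of this illustration.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def add(X, Y):
    """Entrywise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


A = [[1.0, 2.0],
     [2.0, 4.0]]                        # hypothetical rank-one matrix
G = [list(col) for col in zip(*A)]      # G = A^T, the real case of G = A^*
alpha = 0.05

GA = matmul(G, A)
n = len(GA)
P0 = [[(1.0 if i == j else 0.0) - alpha * GA[i][j] for j in range(n)]
      for i in range(n)]                # P_0 = I - alpha*GA
aG = [[alpha * g for g in row] for row in G]

# Successive matrix squaring (4.7): both updates use only (A_m, P_m),
# so they are independent of each other within a step.
Am, Pm = aG, P0
for _ in range(3):
    Am, Pm = add(Am, matmul(Pm, Am)), matmul(Pm, Pm)

# Euler-Knopp iteration (3.8): seven steps give the partial sum with 2**3 terms.
Bm = aG
for _ in range(7):
    Bm = add(matmul(P0, Bm), aG)

gap = max(abs(Am[i][j] - Bm[i][j]) for i in range(2) for j in range(2))
print(gap)
```

The tuple assignment in the squaring loop mirrors the remark in the text: neither update needs the result of the other, so the two products could be formed on separate processors.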
The two algorithms given by (3.8) and (4.5) are also valid, with slight modification, in the case when the spectrum of $\tilde{A}$ is contained in the left half-plane.
Moreover, all results in the previous two sections are valid without the restriction on $\sigma(GA)$ if $G$ is replaced by another matrix. This is stated in the following corollary.
Corollary 4.2. Let $A$, $T$, $S$, and $G$ be as in Lemma 2.1. Then Theorem 3.1 and Corollaries 3.2 and 4.1 are valid without any restriction on the spectrum of $GA|_T$ if $G$ is replaced by
$$G_0=G(GAG)^*G. \tag{4.8}$$
Proof. It suffices to show that
$$R(G_0)=R(G),\qquad N(G_0)=N(G),\qquad\sigma(G_0A|_T)\subset(0,\infty). \tag{4.9}$$
As a matter of fact, (4.9) is a direct result of [4, Lemma 3.4].
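The effect of the substitution (4.8) can be illustrated on a small case in which the original spectral condition fails. In the sketch below, which is a hypothetical example and not taken from the paper, $G=I$ (so $T=\mathbb{C}^2$, $S=\{0\}$, and $A^{(2)}_{T,S}=A^{-1}$) and $GA=A$ has the purely imaginary eigenvalues $\pm i\sqrt{2}$. After replacing $G$ by $G_0$, the product $G_0A$ is diagonal with entries 1 and 4, and the Newton iteration (4.5) converges. The matrices are real, so the conjugate transpose reduces to the transpose.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    """Transpose (equal to the conjugate transpose for real matrices)."""
    return [list(col) for col in zip(*X)]


A = [[0.0, 2.0],
     [-1.0, 0.0]]                 # eigenvalues +-i*sqrt(2): not in Re(z) > 0
G = [[1.0, 0.0],
     [0.0, 1.0]]                  # T = C^2, S = {0}

GAG = matmul(matmul(G, A), G)
G0 = matmul(matmul(G, transpose(GAG)), G)    # G_0 = G (GAG)^* G, see (4.8)
G0A = matmul(G0, A)                          # diag(1, 4): spectrum in (0, inf)

alpha = 0.4                                  # 0 < alpha < 2/||G0 A|| = 0.5
X = [[alpha * g for g in row] for row in G0]
for _ in range(8):                           # Newton iteration (4.5) with G_0
    AX = matmul(A, X)
    R = [[(2.0 if i == j else 0.0) - AX[i][j] for j in range(2)]
         for i in range(2)]
    X = matmul(X, R)

inv_A = [[0.0, -1.0], [0.5, 0.0]]            # exact inverse of A
err = max(abs(X[i][j] - inv_A[i][j]) for i in range(2) for j in range(2))
print(G0A, err)
```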
We remark on a disadvantage of the choice $G_0$ of (4.8). In the case of computing $A^{D}$ with $\operatorname{Ind}(A)=k\ge3$, we have $G_0=A^k(A^{2k+1})^*A^k$, and the condition number of $G_0A|_T$ will be extremely large since $\operatorname{cond}(G_0A|_T)=\operatorname{cond}(A|_T)^{4k+2}$. An accurate numerical solution cannot be obtained if there is any round-off error in $A$.
5. Examples. Three examples are given in this section to illustrate the computation of three types of $A^{(2)}_{T,S}$. All calculations were performed on a PC with MATLAB.
Example 5.1. Let $A$ and $W$ be 20 by 10 and 10 by 10 random matrices, with the stopping criterion $\|A_m-A_{m-1}\|_{\infty}\le\epsilon=10^{-10}$. Three special cases $A^{\dagger}$, $A^{\dagger}_{M,N}$, and $A_{d,w}$ are computed in this example. The choices of $G$, the number of iterations required, and the norms of the errors are listed in Table 5.1.
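Table 5.1. Newton-Raphson method for $A^{\dagger}$, $A^{\dagger}_{M,N}$, and $A_{d,w}$.

| $A^{(2)}_{T,S}$ | $G$ | $m$ | $\lVert A_m-A_{m-1}\rVert_{\infty}$ | $\lVert A_m-A^{(2)}_{T,S}\rVert_{\infty}$ |
|---|---|---|---|---|
| $A^{\dagger}$ | $A^*$ | 11 | 3.25E-14 | 2.56E-15 |
| $A^{\dagger}_{M,N}$ | $N^{-1}A^*M$ | 25 | 1.03E-15 | 3.09E-15 |
| $A_{d,w}$ | $AWA((WA)^*)^4WA$ | 36 | 1.96E-13 | 1.76E-07 |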
We remark that better accuracy for $A_{d,w}$ cannot be achieved: 1.7E-07 is the best attainable error $\|A_m-A_{d,w}\|_{\infty}$ even when $\|A_m-A_{m-1}\|_{\infty}\le\epsilon=10^{-10}$ is used as the stopping criterion. This is because the condition number of $GWAW|_T$ is as large as $10^{10}$. If the two-step semi-iterative method (3.25) is applied to compute $A^{(2)}_{T,S}$, then $\{Y_m\}$ converges to $A^{\dagger}$ after 54 iterations. However, the method fails to converge within 1500 iterations in the other two cases because the segments $[\alpha,\beta]$ containing $\sigma(GA|_T)$ are $[-187970,\,0.796]$ and $[-355800,\,0.9997]$, respectively, so that the asymptotic rate of convergence is too slow.
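Example 5.2. Let $A$ be the 8 by 8 matrix with a complex spectrum given by
$$A=\begin{pmatrix}
\tfrac{3}{2}&\tfrac{1}{3}&0&0&0&0&0&0\\
-\tfrac{1}{4}&1&0&0&0&0&0&0\\
-1&-1&\tfrac{3}{4}&-\tfrac{3}{4}&0&0&0&0\\
-1&-1&-\tfrac{3}{4}&\tfrac{3}{4}&0&0&0&0\\
0&0&0&0&\tfrac{3}{4}&-\tfrac{3}{4}&-1&-1\\
0&0&-1&0&-\tfrac{3}{4}&\tfrac{3}{4}&-1&-1\\
0&0&0&0&0&0&1&-\tfrac{1}{4}\\
0&0&0&0&0&0&\tfrac{1}{3}&\tfrac{3}{2}
\end{pmatrix}. \tag{5.1}$$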
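Table 5.2. $A^{(2)}_{T,S}$ of a Toeplitz matrix.

| $A^{(2)}_{T,S}$ | $G$ | No. of iterations by Newton's method | No. of iterations by SIM |
|---|---|---|---|
| $A^{\dagger}$ | $A^*$ | 10 | 63 |
| $A^{\dagger}_{M,N}$ | $N^{-1}A^*M$ | 11 | 75 |
| $A_{d,w}$ | $AWA((WA)^*)^4WA$ | 31 | >1500 |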
Example 5.3. Let $r$ and $c$ be a row vector and a column vector, respectively, such that
$$\begin{aligned}
r_1&=c_1=2.5,\\
r_j&=\frac{(-1)^{j}j}{16}+\frac{i(j-1)}{j},\quad\text{for } j=2,3,\ldots,16,\\
c_k&=\frac{(-1)^{k}k}{10},\quad\text{for } k=2,3,\ldots,10.
\end{aligned} \tag{5.2}$$
A $10\times16$ complex Toeplitz matrix $A$ is constructed from $r$ and $c$. The stopping criterion is the same as in Example 5.1. $M$ and $N$ are chosen to be positive definite diagonal matrices related to $A$, and $W$ is a random matrix. The numbers of iterations required by Newton's method and by the semi-iterative method (SIM) for $A^{(2)}_{T,S}$ are shown in Table 5.2.
The data show that Newton's method is much faster than SIM.
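For readers who wish to reproduce the test matrix of Example 5.3, the following sketch assembles a $10\times16$ Toeplitz matrix from a first row $r$ and a first column $c$. It assumes the reading $r_j=(-1)^jj/16+i(j-1)/j$ and $c_k=(-1)^kk/10$ of (5.2); the second formula in particular is an interpretation of the printed text and should be checked against the original article. The helper `toeplitz` is a hypothetical name written for this sketch.

```python
def toeplitz(c, r):
    """Toeplitz matrix with first column c and first row r.

    Entry (i, j) depends only on i - j; the corner entry is taken from c[0].
    """
    return [[c[i - j] if i >= j else r[j - i] for j in range(len(r))]
            for i in range(len(c))]


# Assumed reading of (5.2); list index 0 corresponds to r_1 and c_1.
r = [2.5] + [(-1) ** j * j / 16 + 1j * (j - 1) / j for j in range(2, 17)]
c = [2.5] + [(-1) ** k * k / 10 for k in range(2, 11)]

A = toeplitz(c, r)
print(len(A), len(A[0]))     # number of rows and columns
print(A[0][1], A[1][0])      # r_2 and c_2
```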
Acknowledgments. The authors thank the referees for their important suggestions. The second author was supported by the National Natural Science Foundation of China under grant No. 19901006 and the Doctoral Point Foundation of China. Part of the work was finished when Yimin Wei was visiting Harvard University and was supported by the China Scholarship Council.
References
[1] A. Ben-Israel and T. N. E. Greville, Generalized Inverses: Theory and Applications, Pure and Applied Mathematics, John Wiley & Sons, New York, 1974.
[2] Y.-L. Chen, Finite algorithms for the (2)-generalized inverse $A^{(2)}_{T,S}$, Linear and Multilinear Algebra 40 (1995), no. 1, 61–68.
[3] Y.-L. Chen, Iterative methods for computing the generalized inverses $A^{(2)}_{T,S}$ of a matrix $A$, Appl. Math. Comput. 75 (1996), no. 2-3, 207–222.
[4] Y.-L. Chen and X. Chen, Representation and approximation of the outer inverse $A^{(2)}_{T,S}$ of a matrix $A$, Linear Algebra Appl. 308 (2000), no. 1–3, 85–107.
[5] M. Eiermann, W. Niethammer, and R. S. Varga, A study of semi-iterative methods for non-symmetric systems of linear equations, Numer. Math. 47 (1985), no. 4, 505–533.
[6] A. J. Getson and F. C. Hsuan, {2}-Inverses and Their Statistical Application, Lecture Notes in Statistics, vol. 47, Springer-Verlag, New York, 1988.
[7] F. Hsuan, P. Langenberg, and A. J. Getson, The {2}-inverse with applications in statistics, Linear Algebra Appl. 70 (1985), 241–248.
[8] M. Z. Nashed, Generalized Inverses and Applications, Academic Press, New York, 1976.
[9] M. Z. Nashed and X. Chen, Convergence of Newton-like methods for singular operator equations using outer inverses, Numer. Math. 66 (1993), no. 2, 235–257.
[11] P. S. Stanimirović, Limit representations of generalized inverses and related methods, Appl. Math. Comput. 103 (1999), no. 1, 51–68.
[12] P-Å. Wedin, Perturbation results and condition numbers for outer inverses and especially for projections, Tech. Report UMINF 124.85, S-901 87, Inst. of Info. Proc., University of Umeå, Umeå, Sweden, 1985.
[13] Y. Wei, A characterization and representation of the generalized inverse $A^{(2)}_{T,S}$ and its applications, Linear Algebra Appl. 280 (1998), no. 2-3, 87–96.
[14] Y. Wei, Successive matrix squaring algorithm for computing the Drazin inverse, Appl. Math. Comput. 108 (2000), no. 2-3, 67–75.
[15] Y. Wei and H. Wu, The representation and approximation for the generalized inverse $A^{(2)}_{T,S}$, to appear in Appl. Math. Comput.
[16] Y. Wei and H. Wu, The representation and approximation for Drazin inverse, J. Comput. Appl. Math. 126 (2000), no. 1-2, 417–432.
[17] Y. Wei and H. Wu, The representation and approximation for the weighted Moore-Penrose inverse, Appl. Math. Comput. 121 (2001), no. 1, 17–28.
[18] Y. Wei and H. Wu, (T−S) splitting methods for computing the generalized inverse $A^{(2)}_{T,S}$ and rectangular systems, Int. J. Comput. Math. 77 (2001), 401–424.
Xiezhang Li: Department of Mathematics and Computer Science, Georgia Southern University, Statesboro, GA 30460, USA
E-mail address: [email protected]
Yimin Wei: Department of Mathematics, Fudan University, Shanghai 200433, China