APPROXIMATING THE INVERSE OF A SYMMETRIC POSITIVE DEFINITE MATRIX

Gordon Simons and Yi-Ching Yao

Abstract

It is shown for an $n \times n$ symmetric positive definite matrix $T = (t_{i,j})$ with negative off-diagonal elements, positive row sums and satisfying certain bounding conditions that its inverse is well approximated, uniformly to order $1/n^2$, by a matrix $S = (s_{i,j})$, where $s_{i,j} = \delta_{i,j}/t_{i,i} + 1/t_{\cdot\cdot}$, $\delta_{i,j}$ being the Kronecker delta function, and $t_{\cdot\cdot}$ being the sum of the elements of $T$. An explicit bound on the approximation error is provided.

Gordon Simons                    Yi-Ching Yao
Dept. of Statistics              Inst. of Statistical Science
Univ. of North Carolina          Academia Sinica
Chapel Hill, NC 27599-3260       Taipei, Taiwan
simons@stat.unc.edu              yao@stat.sinica.edu.tw
1. INTRODUCTION

We are concerned here with $n \times n$ symmetric matrices $T = (t_{i,j})$ which have negative off-diagonal elements and positive row (and column) sums, i.e.,
$$t_{i,j} = t_{j,i}, \quad t_{i,j} < 0 \ \text{for } i \neq j, \quad\text{and}\quad \sum_{k=1}^n t_{i,k} > 0, \qquad i,j = 1,\ldots,n.$$
Such matrices must be positive definite and hence fall into the class of M-matrices. (See, e.g., [2] for the definition and properties of M-matrices.) It is convenient to introduce an array $(u_{i,j})_{i,j=1}^n$ of positive numbers defined in terms of $T$ as follows:
$$u_{i,j} = -t_{i,j} \ \text{for } i \neq j, \quad\text{and}\quad u_{i,i} = \sum_{k=1}^n t_{i,k}, \qquad i,j = 1,\ldots,n.$$
Then we have
$$u_{i,j} > 0, \quad u_{i,j} = u_{j,i}, \quad t_{i,j} = -u_{i,j} \ \text{for } i \neq j, \quad\text{and}\quad t_{i,i} = \sum_{k=1}^n u_{i,k}, \qquad i,j = 1,\ldots,n. \tag{1}$$
Moreover, it is convenient to introduce the notation
$$m = \min_{i,j} u_{i,j}, \quad M = \max_{i,j} u_{i,j}, \quad t_{\cdot\cdot} = \sum_{i,j=1}^n t_{i,j} = \sum_{k=1}^n u_{k,k} > 0, \tag{2}$$
$$\|A\| = \max_{i,j} |a_{i,j}| \ \text{for a general matrix } A = (a_{i,j}),$$
and the $n \times n$ symmetric positive definite matrix $S = (s_{i,j})$, with
$$s_{i,j} = \frac{\delta_{i,j}}{t_{i,i}} + \frac{1}{t_{\cdot\cdot}},$$
where $\delta_{i,j}$ denotes the Kronecker delta function.

THEOREM.
$$\|T^{-1} - S\| \le \frac{C(m,M)}{n^2}, \quad\text{where}\quad C(m,M) = \Big(1 + \frac{M}{m}\Big)\frac{M}{m^2}.$$
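The theorem lends itself to a quick numerical sanity check. The sketch below is ours, not part of the paper; the helper names `build_T` and `approx_inverse` are our own. It draws a random matrix satisfying (1), forms $S$, and compares the approximation error with $C(m,M)/n^2$:

```python
import numpy as np

rng = np.random.default_rng(0)

def build_T(n, m=1.0, M=2.0, rng=rng):
    """Build T = (t_ij) as in (1): draw a symmetric array u_ij in [m, M],
    set t_ij = -u_ij off the diagonal and t_ii = sum_k u_ik."""
    U = rng.uniform(m, M, size=(n, n))
    U = (U + U.T) / 2                    # symmetric, entries still in [m, M]
    T = -U
    np.fill_diagonal(T, U.sum(axis=1))   # t_ii = sum_k u_ik
    return T

def approx_inverse(T):
    """S = (s_ij) with s_ij = delta_ij / t_ii + 1 / t.. ."""
    return np.diag(1.0 / np.diag(T)) + 1.0 / T.sum()

n, m, M = 50, 1.0, 2.0
T = build_T(n, m, M)
err = np.max(np.abs(np.linalg.inv(T) - approx_inverse(T)))
C = (1 + M / m) * M / m**2               # C(m, M) from the theorem
print(err <= C / n**2)                   # the theorem guarantees True
```

Since the entries of the generated $u$-array lie in $[m, M]$, using the nominal $m$ and $M$ can only overestimate the constant, so the theorem's bound still applies.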
The authors [5] use this theorem while establishing the asymptotic normality of a vector-valued estimator arising in a study of the Bradley-Terry model for paired comparisons. Depending on $n$, which goes to infinity in the asymptotic limit, we need to consider the inverse $T^{-1}$ of a matrix $T$ satisfying (1) with $m$ and $M$ being bounded away from 0 and infinity. Since it is impossible to obtain this inverse explicitly, except for a few special cases, we show that the approximate inverse $S$ is a workable substitute, with the attendant errors going to zero at the rate $1/n^2$ as $n \to \infty$.
Computing and estimating the inverse of a matrix has been extensively studied and described in the literature. See [1], [3], [4] and references therein. In [3], the characterization of inverses of symmetric tridiagonal and block tridiagonal matrices is discussed, which gives rise to stable algorithms for computing their inverses. [1] and [4] derive, among other things, upper and lower bounds for the elements of the inverse of a symmetric positive definite matrix. In particular, for a symmetric positive definite matrix $A = (a_{i,j})$ of dimension $n$, the following bounds on the diagonal elements of $A^{-1}$ are given in [1] and [4]:
$$\frac{1}{\beta} + \frac{(\beta - a_{i,i})^2}{\beta\big(\beta a_{i,i} - \sum_{k=1}^n a_{i,k}^2\big)} \le (A^{-1})_{i,i} \le \frac{1}{\alpha} - \frac{(a_{i,i} - \alpha)^2}{\alpha\big(\sum_{k=1}^n a_{i,k}^2 - \alpha a_{i,i}\big)},$$
where $\beta \ge \lambda_n$ and $0 < \alpha \le \lambda_1$, $\lambda_1$ and $\lambda_n$ being the smallest and largest eigenvalues of $A$, respectively. The next section contains the proof of the theorem, and some remarks are given in Section 3.
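As we have transcribed them, these bounds can be illustrated numerically; the script below is our own check on a random positive definite matrix (not an example from the paper), with $\alpha$ and $\beta$ taken to be the extreme eigenvalues themselves:

```python
import numpy as np

rng = np.random.default_rng(1)

# Random symmetric positive definite test matrix (ours, for illustration).
n = 8
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)

lam = np.linalg.eigvalsh(A)
alpha, beta = lam[0], lam[-1]     # 0 < alpha <= lambda_1, beta >= lambda_n
Ainv = np.linalg.inv(A)
s = (A * A).sum(axis=1)           # s_i = sum_k a_{i,k}^2

for i in range(n):
    a_ii = A[i, i]
    lower = 1 / beta + (beta - a_ii) ** 2 / (beta * (beta * a_ii - s[i]))
    upper = 1 / alpha - (a_ii - alpha) ** 2 / (alpha * (s[i] - alpha * a_ii))
    assert lower - 1e-9 <= Ainv[i, i] <= upper + 1e-9
print("diagonal bounds verified")
```

Both bounds are exact when $A$ is diagonal (and, e.g., for $2 \times 2$ matrices with $\alpha = \lambda_1$, $\beta = \lambda_2$), which is a useful sanity check on the transcription.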
2. PROOF OF THE THEOREM

Note that
$$T^{-1} - S = (T^{-1} - S)(I_n - TS) + S(I_n - TS),$$
where $I_n$ is the $n \times n$ identity matrix. Letting $V = I_n - TS$ and $W = SV$, we have $T^{-1} - S = (T^{-1} - S)V + W$.
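This identity holds for any invertible $T$ and any matrix $S$ of the same size, since $(T^{-1} - S)V + SV = T^{-1}(I_n - TS) = T^{-1} - S$. A quick numerical confirmation (our code, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(2)

# The identity T^{-1} - S = (T^{-1} - S)V + W with V = I - TS, W = SV
# holds for ANY invertible T and any S; check it on random matrices.
n = 6
T = rng.standard_normal((n, n)) + n * np.eye(n)   # comfortably invertible
S = rng.standard_normal((n, n))

V = np.eye(n) - T @ S
W = S @ V
lhs = np.linalg.inv(T) - S
assert np.allclose(lhs, lhs @ V + W)
print("identity verified")
```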
Thus the task is to show that $\|F\| \le C(m,M)$, where the matrices $F = n^2(T^{-1} - S)$ and $G = n^2 W$ satisfy the recursion
$$F = FV + G. \tag{3}$$
By the definitions of $S$, $V = (v_{i,j})$ and $W = (w_{i,j})$, it follows from (1) and (2) that
$$v_{i,j} = \delta_{i,j} - \sum_{k=1}^n t_{i,k} s_{k,j} = \delta_{i,j} - \sum_{k=1}^n t_{i,k}\Big(\frac{\delta_{k,j}}{t_{j,j}} + \frac{1}{t_{\cdot\cdot}}\Big) = \delta_{i,j} - \frac{t_{i,j}}{t_{j,j}} - \frac{u_{i,i}}{t_{\cdot\cdot}} = (1 - \delta_{i,j})\frac{u_{i,j}}{t_{j,j}} - \frac{u_{i,i}}{t_{\cdot\cdot}}, \tag{4}$$
and
$$w_{i,j} = \sum_{k=1}^n s_{i,k} v_{k,j} = \sum_{k=1}^n \Big(\frac{\delta_{i,k}}{t_{i,i}} + \frac{1}{t_{\cdot\cdot}}\Big)\Big((1 - \delta_{k,j})\frac{u_{k,j}}{t_{j,j}} - \frac{u_{k,k}}{t_{\cdot\cdot}}\Big)$$
$$= \sum_{k=1}^n \frac{\delta_{i,k}}{t_{i,i}}\Big((1 - \delta_{k,j})\frac{u_{k,j}}{t_{j,j}} - \frac{u_{k,k}}{t_{\cdot\cdot}}\Big) + \frac{1}{t_{\cdot\cdot}}\sum_{k=1}^n \Big((1 - \delta_{k,j})\frac{u_{k,j}}{t_{j,j}} - \frac{u_{k,k}}{t_{\cdot\cdot}}\Big)$$
$$= (1 - \delta_{i,j})\frac{u_{i,j}}{t_{i,i} t_{j,j}} - \frac{u_{i,i}}{t_{i,i} t_{\cdot\cdot}} - \frac{u_{j,j}}{t_{j,j} t_{\cdot\cdot}}. \tag{5}$$
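Formulas (4) and (5) are easy to confirm numerically against the defining products $V = I_n - TS$ and $W = SV$; the vectorized check below is ours:

```python
import numpy as np

rng = np.random.default_rng(3)

n, m, M = 7, 1.0, 2.0
U = rng.uniform(m, M, size=(n, n)); U = (U + U.T) / 2
T = -U; np.fill_diagonal(T, U.sum(axis=1))      # T as in (1)

t_dd = T.sum()                                  # t..
S = np.diag(1.0 / np.diag(T)) + 1.0 / t_dd
V = np.eye(n) - T @ S
W = S @ V

d = np.diag(T)
delta = np.eye(n)
# Formula (4): v_ij = (1 - delta_ij) u_ij / t_jj - u_ii / t..
V4 = (1 - delta) * U / d[None, :] - np.diag(U)[:, None] / t_dd
# Formula (5): w_ij = (1 - delta_ij) u_ij/(t_ii t_jj)
#                     - u_ii/(t_ii t..) - u_jj/(t_jj t..)
W5 = ((1 - delta) * U / (d[:, None] * d[None, :])
      - (np.diag(U) / d)[:, None] / t_dd
      - (np.diag(U) / d)[None, :] / t_dd)
assert np.allclose(V, V4) and np.allclose(W, W5)
print("(4) and (5) confirmed")
```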
Again by (1) and (2), we have
$$0 < \frac{u_{i,j}}{t_{i,i} t_{j,j}} \le \frac{M}{m^2 n^2}, \qquad 0 < \frac{u_{i,i}}{t_{i,i} t_{\cdot\cdot}} \le \frac{M}{m^2 n^2},$$
so that $|w_{i,j}| \le a/n^2$ and $|w_{i,j} - w_{i,k}| \le a/n^2$ for $i,j,k = 1,\ldots,n$, where $a = 2M/m^2$. Equivalently, in terms of the elements of $G = (g_{i,j})$:
$$|g_{i,j}| \le a \quad\text{and}\quad |g_{i,j} - g_{i,k}| \le a, \qquad i,j,k = 1,\ldots,n. \tag{6}$$
We now turn our attention to (3), expressed in terms of the matrix elements $f_{i,j}$ and $g_{i,j}$ in $F$ and $G$, respectively, and the formula for $v_{i,j}$ in (4):
$$f_{i,j} = \sum_{k=1}^n f_{i,k}(1 - \delta_{k,j})\frac{u_{k,j}}{t_{j,j}} - \sum_{k=1}^n f_{i,k}\frac{u_{k,k}}{t_{\cdot\cdot}} + g_{i,j}, \qquad i,j = 1,\ldots,n. \tag{7}$$
The task is to show $|f_{i,j}| \le C(m,M)$ for all $i$ and $j$.

Two things are readily apparent in (7). To begin with, apart from the factor $(1 - \delta_{k,j})$ in the first sum, which equals one except when $k = j$, the first and second sums are weighted averages of $f_{i,k}$, $k = 1,\ldots,n$; the positive weights $u_{k,j}/t_{j,j}$ and $u_{k,k}/t_{\cdot\cdot}$ each add to unity in the index $k$. Secondly, the index $i$ plays no essential role in the relationship; it can be viewed as fixed. If we take $i$ to be fixed and notationally suppress it in (7), then (7) assumes the form of $n$ linear equations in the $n$ unknowns $f_1,\ldots,f_n$:
$$f_j = \sum_{k=1}^n f_k(1 - \delta_{k,j})\frac{u_{k,j}}{t_{j,j}} - \sum_{k=1}^n f_k\frac{u_{k,k}}{t_{\cdot\cdot}} + g_j, \qquad j = 1,\ldots,n. \tag{8}$$
Instead of solving these equations, we will show that under the bounding conditions
$$|g_j| \le a, \qquad |g_j - g_k| \le a, \qquad j,k = 1,\ldots,n$$
(see (6)), any solution of (8) must satisfy the inequalities
$$|f_j| \le \frac{1}{2}\Big(1 + \frac{M}{m}\Big)a, \qquad j = 1,\ldots,n, \tag{9}$$
so that $|f_j| \le C(m,M)$, $j = 1,\ldots,n$,
thereby completing the proof.

Let $\alpha$ and $\beta$ be such that $f_\alpha = \max_{1 \le k \le n} f_k$ and $f_\beta = \min_{1 \le k \le n} f_k$. With no loss of generality, assume $|f_\alpha| \ge |f_\beta|$. (Otherwise, we may reverse the signs of the $f_k$'s and proceed analogously.) There are two cases to consider:

Case I: $f_\beta \ge 0$. Then
$$f_\alpha = \sum_{k=1}^n f_k(1 - \delta_{k,\alpha})\frac{u_{k,\alpha}}{t_{\alpha,\alpha}} - \sum_{k=1}^n f_k\frac{u_{k,k}}{t_{\cdot\cdot}} + g_\alpha \le \sum_{k=1}^n f_k\frac{u_{k,\alpha}}{t_{\alpha,\alpha}} - \sum_{k=1}^n f_k\frac{u_{k,k}}{t_{\cdot\cdot}} + g_\alpha$$
$$= \sum_{k=1}^n f_k\Big(\frac{u_{k,\alpha}}{t_{\alpha,\alpha}} - \frac{u_{k,k}}{t_{\cdot\cdot}}\Big) + g_\alpha \le f_\alpha \sum_{k \in A}\Big(\frac{u_{k,\alpha}}{t_{\alpha,\alpha}} - \frac{u_{k,k}}{t_{\cdot\cdot}}\Big) + g_\alpha,$$
where $A = \{k : u_{k,\alpha}/t_{\alpha,\alpha} > u_{k,k}/t_{\cdot\cdot}\}$. Let $\nu$ denote the cardinality of $A$, and observe that
$$\sum_{k \in A}\Big(\frac{u_{k,\alpha}}{t_{\alpha,\alpha}} - \frac{u_{k,k}}{t_{\cdot\cdot}}\Big) \le \frac{\nu M}{\nu M + (n - \nu)m} - \frac{\nu m}{\nu m + (n - \nu)M} \le \frac{M - m}{M + m}, \tag{10}$$
the first inequality being an immediate consequence of the constraints $m \le u_{i,j} \le M$ (see (2)) and the sum formulas in (1) and (2), the second inequality taking into account that the middle expression in (10) is a concave function of $\nu$ (when viewed as a continuous variable between 0 and $n$), with its maximum occurring at $\nu = n/2$. Thus,
$$f_\alpha \le f_\alpha \sum_{k \in A}\Big(\frac{u_{k,\alpha}}{t_{\alpha,\alpha}} - \frac{u_{k,k}}{t_{\cdot\cdot}}\Big) + g_\alpha \le f_\alpha\,\frac{M - m}{M + m} + g_\alpha \le f_\alpha\,\frac{M - m}{M + m} + a,$$
so that
$$f_\alpha \le \frac{1}{2}\Big(1 + \frac{M}{m}\Big)a = C(m,M),$$
thereby establishing (9) and completing the proof.

Case II: $f_\beta < 0$. Let $h_k = f_k - f_\beta \ge 0$, $k = 1,\ldots,n$. Then
$$h_\alpha = f_\alpha - f_\beta \le \sum_{k=1}^n f_k\frac{u_{k,\alpha}}{t_{\alpha,\alpha}} - \sum_{k=1}^n f_k\frac{u_{k,\beta}}{t_{\beta,\beta}} + g_\alpha - g_\beta = \sum_{k=1}^n h_k\frac{u_{k,\alpha}}{t_{\alpha,\alpha}} - \sum_{k=1}^n h_k\frac{u_{k,\beta}}{t_{\beta,\beta}} + g_\alpha - g_\beta$$
$$= \sum_{k=1}^n h_k\Big(\frac{u_{k,\alpha}}{t_{\alpha,\alpha}} - \frac{u_{k,\beta}}{t_{\beta,\beta}}\Big) + g_\alpha - g_\beta \le h_\alpha \sum_{k \in A}\Big(\frac{u_{k,\alpha}}{t_{\alpha,\alpha}} - \frac{u_{k,\beta}}{t_{\beta,\beta}}\Big) + g_\alpha - g_\beta,$$
where $A = \{k : u_{k,\alpha}/t_{\alpha,\alpha} > u_{k,\beta}/t_{\beta,\beta}\}$. The argument from this point proceeds analogously to that for Case I. Letting $\nu$ denote the cardinality of $A$, one obtains
$$\sum_{k \in A}\Big(\frac{u_{k,\alpha}}{t_{\alpha,\alpha}} - \frac{u_{k,\beta}}{t_{\beta,\beta}}\Big) \le \frac{\nu M}{\nu M + (n - \nu)m} - \frac{\nu m}{\nu m + (n - \nu)M} \le \frac{M - m}{M + m},$$
which leads to
$$h_\alpha \le h_\alpha\,\frac{M - m}{M + m} + g_\alpha - g_\beta \le h_\alpha\,\frac{M - m}{M + m} + a,$$
so that
$$f_\alpha \le h_\alpha \le \frac{1}{2}\Big(1 + \frac{M}{m}\Big)a,$$
thereby establishing (9) and completing the proof.
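The lemma just proved — any solution of (8) under the stated bounding conditions satisfies (9) — can be probed numerically by solving (8) directly for an admissible $g$. The sketch below is our own check for one random $T$ and $g$:

```python
import numpy as np

rng = np.random.default_rng(5)

n, m, M = 40, 1.0, 2.0
U = rng.uniform(m, M, size=(n, n)); U = (U + U.T) / 2
T = -U; np.fill_diagonal(T, U.sum(axis=1))      # T as in (1)
t_dd = T.sum(); d = np.diag(T)

# Coefficient matrix of (8): row j, column k holds
#   (1 - delta_kj) u_kj / t_jj  -  u_kk / t..
B = ((1 - np.eye(n)) * U / d[None, :]).T        # (1 - delta_kj) u_kj / t_jj
B -= np.diag(U)[None, :] / t_dd                 # minus u_kk / t..
a = 2 * M / m**2
g = rng.uniform(-0.5, 0.5, size=n)              # |g_j| <= a, |g_j - g_k| <= a
f = np.linalg.solve(np.eye(n) - B, g)           # solve f = B f + g, i.e. (8)
assert np.max(np.abs(f)) <= 0.5 * (1 + M / m) * a
print("inequality (9) holds")
```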
3. REMARKS

While our proof of the theorem is somewhat long, we do not see how to simplify it by using any of the well-known properties of M-matrices.

The bound $C(m,M)/n^2$ on the approximation error is a product of two factors, one depending on $m$ and $M$, the other on $n$. For large $n$, with $m$ and $M$ held bounded away from 0 and infinity, the elements of $S$ (and hence of $T^{-1}$) are all of order $1/n$, and the errors (i.e., the elements of $T^{-1} - S$) are uniformly $O(1/n^2)$ as $n \to \infty$. This fact is crucially used in the work of [5].

A particular case of the matrix $T$, described below, shows that the factor $1/n^2$ is best possible in the sense that any bound of the form $\tilde{C}(m,M)/\varphi(n)$ requires $\varphi(n) = O(n^2)$ as $n \to \infty$; no faster growth rate than $n^2$ is allowed. On the other hand, it is natural to ask whether the factor $C(m,M)$ is best possible. To clarify the issue, for given integer $n$ and given $m$ and $M$ $(0 < m \le M < \infty)$, let $Q_n(m,M)$ denote the set of $n \times n$ symmetric positive definite matrices satisfying (1) with $m \le u_{i,j} \le M$, $i,j = 1,\ldots,n$, and define
$$C_o(m,M) = \sup\{n^2\,\|T^{-1} - S\| : T \in Q_n(m,M),\ n = 1,2,\ldots\}.$$
It follows from the theorem that $C_o(m,M) \le C(m,M) = (1 + M/m)M/m^2$.
But for the special matrix $T$ satisfying (1) with $u_{1,1} = M$ and $u_{i,j} = m$ for all other $(i,j)$, we find that
$$(T^{-1})_{i,j} = \begin{cases} \dfrac{2}{2M + (n-1)m} & \text{for } i = j = 1,\\[4pt] \dfrac{1}{2M + (n-1)m} & \text{for } i = 1,\, j \neq 1 \text{ or } i \neq 1,\, j = 1,\\[4pt] \dfrac{3M + (2n-1)m}{(n+1)m(2M + (n-1)m)} & \text{for } i = j \neq 1,\\[4pt] \dfrac{M + nm}{(n+1)m(2M + (n-1)m)} & \text{for } 1 \neq i \neq j \neq 1. \end{cases}$$
So
$$(T^{-1} - S)_{1,1} = \frac{-2M}{(M + (n-1)m)(2M + (n-1)m)},$$
from which it follows that $C_o(m,M) \ge 2M/m^2$. The same matrix $T$ justifies the constraint on $\varphi(n)$ described above.

The gap between $2M/m^2$ and $(1 + M/m)M/m^2$ suggests that there might be room for improvement in our bound. Indeed, by computer, we have numerically inverted a very large number of matrices of various dimensions (some as large as $300 \times 300$) and found that the inequality $n^2\,\|T^{-1} - S\| \le 2M/m^2$ holds in all cases. It would therefore be interesting to see whether $C_o(m,M) = 2M/m^2$.
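Both the closed-form inverse above and a (much smaller) version of the reported experiment are easy to reproduce. The script below is our own sketch, with nominal bounds $m$ and $M$ standing in for the exact extremes of the $u_{i,j}$:

```python
import numpy as np

rng = np.random.default_rng(4)

# 1. Check the closed-form inverse of the special matrix
#    (u_11 = M, u_ij = m otherwise) against numerical inversion.
n, m, M = 10, 1.0, 3.0
U = np.full((n, n), m); U[0, 0] = M
T = -U; np.fill_diagonal(T, U.sum(axis=1))
Tinv = np.linalg.inv(T)
D = (n + 1) * m * (2 * M + (n - 1) * m)
assert np.isclose(Tinv[0, 0], 2 / (2 * M + (n - 1) * m))        # i = j = 1
assert np.isclose(Tinv[0, 1], 1 / (2 * M + (n - 1) * m))        # i = 1, j != 1
assert np.isclose(Tinv[1, 1], (3 * M + (2 * n - 1) * m) / D)    # i = j != 1
assert np.isclose(Tinv[1, 2], (M + n * m) / D)                  # 1 != i != j != 1

# 2. A small-scale version of the reported experiment: compare
#    n^2 ||T^{-1} - S|| with 2M/m^2 for random members of Q_n(m, M).
m, M = 1.0, 2.0
worst = 0.0
for n in (10, 30, 100):
    for _ in range(10):
        U = rng.uniform(m, M, size=(n, n)); U = (U + U.T) / 2
        T = -U; np.fill_diagonal(T, U.sum(axis=1))
        S = np.diag(1.0 / np.diag(T)) + 1.0 / T.sum()
        worst = max(worst, n**2 * np.max(np.abs(np.linalg.inv(T) - S)))

print(worst, "vs conjectured bound", 2 * M / m**2)
```

The theorem guarantees only `worst <= (1 + M/m) * M / m**2`; whether the tighter value $2M/m^2$ is the true supremum is exactly the open question raised above.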
We finish with one final observation. Surprisingly, it is possible to evaluate the second sum in (7) explicitly:
$$\sum_{k=1}^n f_{i,k}\frac{u_{k,k}}{t_{\cdot\cdot}} = -\frac{n^2 u_{i,i}}{t_{i,i}\, t_{\cdot\cdot}},$$
which is identical to, and permits a cancellation with, one of the three terms defining $g_{i,j}$ (cf. (5)). To obtain this, one multiplies both sides of (7) by $t_{j,j}$, adds over $j$ $(j = 1,\ldots,n)$, and carries out the suggested algebra. While we have not found much use for this identity, it does show that $f_\beta$, appearing in the proof of the theorem, is strictly negative. Since, as it turns out, $f_\alpha$ can be positive or negative, neither case described in the proof is superfluous.

REFERENCES
[1] G. H. Golub and Z. Strakos, Estimates in quadratic formulas, Numerical Algorithms, 8 (1994), 241-268.
[2] R. A. Horn and C. R. Johnson, Topics in Matrix Analysis, Cambridge University Press, New York, 1991.
[3] G. Meurant, A review of the inverse of symmetric tridiagonal and block tridiagonal matrices, SIAM J. Matrix Anal. Appl., 13 (1992), 707-728.
[4] P. D. Robinson and A. J. Wathen, Variational bounds on the entries of the inverse of a matrix, IMA Journal of Numerical Analysis, 12 (1992), 463-486.
[5] G. Simons and Y.-C. Yao, A large sample study of the Bradley-Terry model for paired comparisons, in preparation.