Numerical experiments - On computation of matrix logarithm times a vector

In this section we report some numerical examples from different matrices case with a different range of size. The algorithm have been implemented in C++ using LAPACK and BLAS for dense linear algebra operation [2] and SuiteSparse [7] for sparse-matrix dense-vector multiplication.

In all our experiments we have an square matrix A of dimension n and a vector b. We are interested in providing a suitable approximation of f = log(A)b. We are going denote f_directthe approximation of f when the matrix log(A) has computed and then it has been evaluated at the vector b. On the other hand, when f has been approximated by an Arnoldi approximation of order m, it has been denoted by f_m. Therefore the error estimators considered in our experiments are

ε_a = kf_m− f_directk, ε⁰_a= kf_m− f_m−1k, (absolute error), (3.8)

ε_r = kf_m− f_directk

kf_directk , ε⁰_r= kf_m− f_m−1k

kf_mk , (relative error). (3.9) The values ε_a and ε_r are not always computable in terms of a reasonable CPU time but ε⁰_a and ε⁰_r may be.

3.5.1 Random case

Let us start taking an artificial example which is built from an n-by-n matrix A =b a_b_ijsuch that _ba_ij follows an standard normal distribution. Now define the matrix A =a_ij as

a_ij =





 P

i6=j

|baij| i = j,

ba_ij i 6= j.

(3.10)

If b = (b₁, . . . , b_n) is so that b_i also follows an standard normal distribution for all i, then we will approximate the value

log A kAk

! b kbk.

We want to compute the principal logarithm of this matrix time a normal random and unit vector b. To do so, we consider three different methods; the Arnoldi method, the uniform Chebyshev rational method and the rational Arnoldi method whose poles are the ones of the uniform Chebyshev rational

approximation (2.6) in § 2.3 and the last pole is the infinity. Due to small matrix size, we compute logarithm of the whole matrix by Algorithm 2.2 and f_direct denotes the evaluation at b. In order to distinguish if the com-putations of the approximated values f_m are being doing appropriately we consider the expressions (3.9) and (3.8).

Figure 3.5 shows that in some how ε⁰_a and ε⁰_r may be used to approximate ε_a and ε_r respectively. The Figure 3.3 shows how the rational Arnoldi method is more accurate although more CPU time is required. On the other hand, Arnoldi method and uniform Chebyshev rational method have a similar ac-curacy while the latter seems to be more fast.

1x10^-16

Figure 3.3: Principal logarithm of the matrix A/kAk, defined in (3.10), times a normal random and unit vector using Arnoldi approximation, uniform ra-tional approximation and rara-tional Arnoldi approximation with order m and poles (2.6). The first two figures illustrate the values (3.8) and (3.9) for dif-ferent values of n respectively while the last figure shows the CPU time with respect to m

3.5.2 Matrix collection

As harder and real examples, let us take them from ‘The University of Florida Sparse Matrix Collection’ [8] a large set of sparse matrices that arise in real applications. It is widely used by the numerical linear algebra community for the development and performance evaluation of matrix algorithms. In order to have a first impression of the numerical approximation of log(A)b, let us take, for instance, the matrix HB/bcsstk01 whose size is 48 and f_direct needs around 2 min of CPU time. The Figure 3.8 shows the computation using rational Arnoldi approximation and the Figure 3.4 has been computed by Arnoldi approximation. Although the latter is faster, the former is more accurate.

Figure 3.4: Principal logarithm of the matrix HB/bcsstk01 times a normal random and unit vector using Arnoldi approximation with order m. The first two figures illustrate the values (3.8) and (3.9) respectively while the last figure shows the CPU time with respect to m

1x10^-12 1x10^-10 1x10^-8 1x10^-6 0.0001 0.01 1 100

0 5 10 15 20 25 30 35 40 45

absolute errors

εa εa'

1x10^-14 1x10^-12 1x10^-10 1x10^-8 1x10^-6 0.0001 0.01 1

0 5 10 15 20 25 30 35 40 45

relative errors

εr εr'

0 0.05 0.1 0.15 0.2 0.25

0 5 10 15 20 25 30 35 40 45

CPU time [sec]

Figure 3.5: Principal logarithm of the matrix HB/bcsstk01 times a normal random and unit vector using rational Arnoldi approximation with order m and poles (2.6). The first two figures illustrate the values (3.8) and (3.9) respectively while the last figure shows the CPU time with respect to m

On the other hand, Figure 3.6 shows the case of the matrix HB/bcsstk16 whose size is 4884. In this case, the evaluation vector b has also been chosen following a normal distribution in each coordinate and then it has been done unitary. The poles has also been the ones defined in the rational Chebyshev approximation of the principal logarithm § 2.3. In this case, it seems that the error is stagnated around 10⁻⁷ and the direct approximation needs around 8 minutes of CPU time.

Figure 3.6: Principal logarithm of the 4884-by-4884 matrix HB/bcsstk16 times a normal random and unit vector using Arnoldi approximation and ra-tional Arnoldi approximation with order m and poles (2.6). The first column of figures show the error values (3.8) and (3.9) in logarithmic scale while second column represents the CPU time that has been required for each value of m

We can also try to consider harder cases such as the matrix TKK/cbuckle which has a size of 13681 and the direct approximation needs 110 minutes.

The result is summarised in the Figure 3.7 whose performance as one expects in terms of error and CPU time, that is, at each increased value of m the estimated error is decreased while the CPU time is increased.

1x10^-6

Figure 3.7: Principal logarithm of the matrix TKK/cbuckle, of size 13681, times a normal random and unit vector using Arnoldi approximation and ra-tional Arnoldi approximation with order m and poles (2.6). The left column plots show the error values ε⁰_a and ε⁰_r in logarithmic scale while second plot represents the CPU time that has been required for each value of m

Finally, the hardest matrix executed has been AMD/G3 circuit. It has size 1585478 the direct approximation is unfeasible in terms of memory and computational effort. The Figure 3.8 shows that it requires a really large CPU time because of the bottleneck in the rational Arnoldi decomposition.

0.0001

Figure 3.8: Principal logarithm of the matrix AMD/G3 circuit, of size 1585478, times a normal random and unit vector using Arnoldi approxima-tion and raapproxima-tional Arnoldi approximaapproxima-tion with order m and poles (2.6). The left plot shows the error values ε⁰_a and ε⁰_r in logarithmic scale while second plot represents the CPU time that has been required for each value of m

Conclusions

The research carried out in this thesis have achieved the main goals that were pointed at the beginning of the project though some modifications have suffered. The project deals with the most recent and active research in mat-rix functions. At the beginning, we spent a lot of time looking into the state of the art of matrix functions while intermediate implementations were codi-fied as well as new concepts were learnt during this whole process. Once we understood the different definitions of matrix functions, their properties and the general techniques that allows us to approximate the matrix functions, we realised that the main gap and also the newest results in this area was the matrix function times a vector. I had the honour of working with Dr. Ken Hayami in this last part.

From the first part we decided to choose the principal logarithm as matrix function although the techniques used for computing the matrix logarithm times a vector are general and they can be applied to any matrix function un-der slightly modifications from a theoretical point of view. These techniques concern with Krylov subspace and the associated Arnoldi decompositions.

Firstly we focused on the polynomial case because it is the most natural and the first one that we can find in the literature, specially if one has studied iterative solvers for linear systems. After that, we moved to the rational case which includes the polynomial case. This rational Arnoldi approximation has turned to be a really good method in terms of accuracy although the CPU time might be improved. Iterative solvers for linear systems and eigenvalue problems have been studied from very beginning in numerical linear algebra and many techniques have been analysed deeply, they can be a baseline and an inspiration to new techniques. The first point to be consider may be the solution of the linear systems like (A − αI)x = b where the parameter α and vector b may be changed in consecutive sequence of linear systems. They can be solved at the same time using the Kyrlov property of invariability under scaling and translations. The second point is concerned with precondition-ers in the different linear systems that are needed to construct the rational Arnoldi approximation.

Bibliography

[1] Afanasjew M., Eiermann, M., Ernst, O., and Guttel, S. Im-plementation of a restarted Krylov subspace method for the evaluation of matrix functions. Linear Algebra and its Applications 429, 10 (2008), 2293–2314.

[2] Anderson, E. LAPACK users’ guide. Society for Industrial and Ap-plied Mathematics, Philadelphia :, 1999.

[3] Arsigny, V., Fillard, P., Pennec, X., and Ayache, N. Geomet-ric means in a novel vector space structure on symmetGeomet-ric positive-definite matrices. SIAM J. Matrix Analysis Applications 29, 1 (2006), 328–347.

[4] Berljafa, M., and G¨uttel, S. Generalized Rational Krylov De-compositions with an Application to Rational Approximation. SIAM Journal on Matrix Analysis and Applications 36, 2 (jan 2015), 894–916.

[5] Berljafa, M., and G¨uttel, S. Parallelization of the rational Arnoldi algorithm. Accepted for publication in SIAM J. Sci. Comput., 2017 (2015), 24.

[6] Cooke, R., and Arnold, V. Ordinary Differential Equations.

Springer Textbook. Springer Berlin Heidelberg, 1992.

[7] Davis, T. A. Direct Methods for Sparse Linear Systems. Society for Industrial and Applied Mathematics, jan 2006.

[8] Davis, T. A., and Hu, Y. The university of Florida sparse matrix collection. ACM Transactions on Mathematical Software 38, 1 (nov 2011), 1–25.

[9] Eiermann, M., and Ernst, O. G. A restarted Krylov subspace method for the evaluation of matrix functions. SIAM J. Numer. Anal.

44, 6 (2006), 2481—-2504 (electronic).

[10] Eld´en, L., and Wittmeyer-Koch, L. Numerical analysis: an in-troduction. Academic Press, 1990.

[11] Gantmacher, F. The theory of matrices. No. 1 in The Theory of Matrices. Chelsea Pub. Co., 1960.

[12] Golub, G., Nash, S., and Van Loan, C. A Hessenberg-Schur method for the problem AX + XB= C. IEEE Transactions on Automatic Control 24, 6 (dec 1979), 909–913.

[13] Golub, G. H., and Van Loan, C. F. Matrix Computations, 4th ed.

2013.

[14] G¨uttel, S. Rational Krylov approximation of matrix functions: Nu-merical methods and optimal pole selection. GAMM-Mitteilungen 36, 1 (aug 2013), 8–31.

[15] Higham, N. J. Accuracy and Stability of Numerical Algorithms. 2002.

[16] Higham, N. J. Functions of Matrices. Society for Industrial and Ap-plied Mathematics, jan 2008.

[17] Horn, R. A. Topics in Matrix Analysis. Cambridge University Press, New York, NY, USA, 1986.

[18] Israel, R. B., Rosenthal, J. S., and Wei, J. Z. Finding generat-ors for markov chains via empirical transition matrices, with applications to credit ratings. Mathematical Finance 11, 2 (2001), 245–265.

[19] Kelisky, R. P., and Rivlin, T. J. A rational approximation to the logarithm. Mathematics of Computation 22, 101 (jan 1968), 128–128.

[20] Knuth, D. E. The Art of Computer Programming, Volume 2 (3rd Ed.): Seminumerical Algorithms. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1997.

[21] Lehoucq, R. B., and Meerbergen, K. Using Generalized Cayley Transformations within an Inexact Rational Krylov Sequence Method.

SIAM Journal on Matrix Analysis and Applications 20, 1 (jan 1998), 131–148.

[22] Osborne, E. E. On pre-conditioning of matrices. J. ACM 7, 4 (Oct.

1960), 338–345.

[23] Paterson, M. S., and Stockmeyer, L. J. On the number of non-scalar multiplications necessary to evaluate polynomials. SIAM Journal on Computing (1973).

[24] Rivlin, T., and Shapiro, H. A unified approach to certain problems of approximation and minimization. Journal of the Society for Industrial

& Applied . . . 9, 4 (dec 1961), 670–699.

[25] Roger A. Horn, C. R. J. Topics in matrix analysis. Cambridge University Press, 1994.

[26] Ruhe, A. Rational krylov: A practical algorithm for large sparse non-symmetric matrix pencils. SIAM Journal on Scientific Computing 19, 5 (1998), 1535–1551.

[27] Saad, Y. Iterative Methods for Sparse Linear Systems. Notes 3, 2nd Edition (2003), xviii+528.

[28] Salsa, S. Partial Differential Equations in Action, vol. 99 of UNITEXT. Springer International Publishing, Cham, 2016.

[29] Skoogh, D. A parallel rational Krylov algorithm for eigenvalue compu-tations. Springer Berlin Heidelberg, Berlin, Heidelberg, 1998, pp. 521–

526.

[30] van der Vorst, H. A. Iterative Krylov Methods for Large Linear Systems, vol. 13. Cambridge University Press, 2003.

[31] Wilf, H. S., Enslein, K., and Ralston, A. Mathematical methods for digital computers. Wiley Wiley & Sons Inc New York ; London, 1960.

In document On computation of matrix logarithm times a vector (Page 62-73)