Nested iteration methods for nonlinear matrix problems


Geneste iteratie methoden voor niet-lineaire matrix problemen

(with a summary in Dutch)

Thesis

submitted for the degree of doctor at Universiteit Utrecht, on the authority of the Rector Magnificus, Prof. dr. W. H. Gispen, pursuant to the decision of the Board of Promotions (College van Promoties), to be defended in public on Monday 22 September 2003 at 10.30 in the morning

by

Jasper van den Eshof


Universiteit Utrecht

Co-supervisor (co-promotor): Dr. G.L.G. Sleijpen

Faculty of Mathematics and Computer Science, Universiteit Utrecht

The research described in this thesis was made financially possible by the Netherlands Organisation for Scientific Research (NWO).

Mathematics Subject Classification: 65F15, 65F10, 65F50.

Van den Eshof, Jasper

Nested iteration methods for nonlinear matrix problems

Thesis, Universiteit Utrecht. With a summary in Dutch.


Chapter 1

Introduction

Today’s applications in scientific computing require an increasingly complex coupling of various building blocks dedicated to specific visualization and numerical tasks. The Numlab scientific computing workbench [85] aims to let its users construct applications for scientific computing and visualization problems rapidly and conveniently. A key issue here is the availability of flexible (and relevant) modules. Focusing on the numerical part, we see that in large scale problems iterative solution methods often play a pivotal role. Iteration methods are solution methods that start with an initial guess and repeatedly improve the last obtained approximation until it satisfies the required accuracy. Traditionally, iteration methods are seen as the opposite of so-called direct methods (although there is a growing consensus among researchers that clever combinations of the two classes can lead to very good solvers). Whereas for direct methods there is now a vast amount of literature studying the efficiency, stability and accuracy of various methods, the situation for iterative methods is not as advanced. General purpose implementations of iterative methods are often not readily available. In fact, in practice experts are often needed to tune the specific methods on a problem by problem basis.

The general goal of this thesis is to prepare iterative methods for use in a scientific computing laboratory. As a concrete starting point we will study the numerical solution of two matrix problems:

• the computation of eigenvectors, and their corresponding eigenvalues, of large sparse matrices

• the multiplication of Green’s function with a source vector in simulations in quantum chromodynamics with overlap fermions.


iteration methods. This means that in each iteration step of a certain method a second iteration method is invoked to solve some subproblem. The nesting aspect is the focus of this thesis and is what makes these problems particularly interesting for a scientific computing laboratory, because of the necessity of coupling different iteration methods (or computational kernels). There are various reasons for nesting iteration methods.

For example, in some iterative methods the cost of an iteration step grows linearly with the iteration number. This can happen if for acceleration purposes a subspace is formed that is spanned by the approximations computed so far. The dimension of this subspace increases with every step of the iteration method, thereby increasing with every step the amount of work involved in the construction of an appropriate basis for this subspace. Introducing additional information in the process may reduce the number of required iterations and therefore result in a significant reduction of the overall cost. The necessary information might be obtained by partly solving the original problem or some relevant subproblem by invoking a second iteration method. This raises the question of how to couple the two levels of iteration or, posed differently, how accurately the embedded solver has to solve the subproblem. As a concrete example, we study, in the first part of this thesis, modern iterative solvers for computing a few eigenvectors and eigenvalues of large sparse matrices.

A different situation where nested iteration schemes appear naturally is when the problem to be solved consists of relatively simple problems that are coupled. An example where this occurs is the so-called Stokes problem from computational fluid dynamics. The Stokes problem consists of two coupled linear systems of equations. The velocity can be computed by solving a linear system which involves the unknown pressure, and the pressure in turn depends, in some discretizations, on the velocity through a second linear system. Here, two-level iteration methods are sometimes used in practical solution strategies. As a different and concrete example, we will study a linear problem that occurs in large scale simulations in quantum chromodynamics, the physical theory that describes the strong interaction between elementary particles. The challenge here is that the system of equations is only given implicitly by a matrix function that must be computed with an iterative method. The solution method that is used in practical simulations for this problem is a standard iteration method for linear systems that invokes a second iteration method for dealing with the matrix function. Again, we have to ask ourselves how accurately we have to compute this matrix function given a required and predefined precision for the whole problem.

In this thesis we study the two separate building blocks that make up a two-level iteration method for both problems and, in particular, the tuning of the coupling between the two levels of iteration, which is important for a scientific working environment. Our study should lead to strategies for automatically optimizing this coupling. This also puts emphasis on the individual components. For example, we need good termination criteria for the iterative methods. The outline of this thesis is as follows. Chapter 2 and Chapter 3 are dedicated to the eigenvalue problem. Chapter 4 and Chapter 5 are concerned with simulations in quantum chromodynamics with overlap fermions. In the remainder of this chapter we give a short summary of the work in this thesis and summarize some of our contributions.

1.1 The eigenvalue problem

The origins of eigenvalue problems are diverse and range from search engines for searching the Internet to the stability analysis of large structures. The problem can often be formulated as an algebraic eigenvalue problem where one tries to find a nonzero vector x and a scalar λ such that

$$Ax = \lambda x.$$

Usually only a very small number of eigenvectors and their corresponding eigenvalues are needed. In this case iterative projection methods (sometimes called subspace methods) come into the picture. These methods compute iteratively an approximation to the desired eigenpair by building up a subspace and in every iteration step they extract an approximation to the sought-after eigenpair from this subspace. Two components can be identified. First, there is the computation of appropriate vectors to expand the subspace, which may require the (approximate) solution of a linear system. If this is done iteratively we refer to this part as the inner iteration. The other key ingredient is the collection of the expansion vectors in a subspace and the subsequent extraction of good approximations to the wanted eigenpair by an extraction technique. It generally involves an orthogonalization method for constructing an orthogonal basis for the subspace. We term this the outer iteration.

The identified structure of iterative projection methods is also reflected by the outline of the first part of this thesis. In Chapter 2 we focus on the extraction of useful eigenvector/eigenvalue approximations from a given subspace and issues concerning the computation of the expansion vectors are treated in Chapter 3.

1.1.1 Chapter 2: The subspace extraction

We ask ourselves the following main question in Chapter 2: suppose a given subspace contains a good approximation to the eigenvector, how can we extract eigenvector approximations from that subspace? We will review the Rayleigh-Ritz method and prove that for eigenvalues that are in some sense in the ‘exterior’ of the spectrum, the approximations generated by the Rayleigh-Ritz method are guaranteed to be useful. We study this in more detail, for the situation that A is real symmetric, by deriving a priori error bounds for the eigenvector approximations expressed in terms of the eigenvalues of the matrix A and the angle of the subspace with the eigenvector of interest.


For eigenvalues that are in the interior of the spectrum, i.e., ‘interior eigenvalues’, the Rayleigh-Ritz method always constructs good approximations to the eigenvalues but the approximations to the eigenvectors might be useless. This means that alternative extraction methods must be considered for this type of eigenvector. One such alternative is the recently proposed harmonic Rayleigh-Ritz method, which can be seen as a variant of the Rayleigh-Ritz method. This method involves a parameter that should be chosen appropriately depending on the location of the part of the spectrum that is of interest. In the literature, many numerical experiments have been reported showing that harmonic Rayleigh-Ritz indeed resolves the problems of standard Rayleigh-Ritz for interior eigenvalues. Despite this practical success, the effect of the parameter on the method is complex and not well understood. This raises the question of how to choose this parameter when we are interested, for example, in the eigenpair whose eigenvalue is closest to some target value.

In Chapter 2 we address this question by showing that the harmonic Rayleigh-Ritz approximations of interest are equal to the approximations of the classical Rayleigh-Ritz method when applied to the transformed eigenvalue problem

$$(A - \tau I)^2 x = (\lambda - \tau)^2 x,$$

for a specific value of τ. We also use this relation to give a comparison of harmonic Rayleigh-Ritz and an alternative extraction method, specially designed for interior eigenvalues, known as the refined Rayleigh-Ritz method. For more theoretical purposes, we use the demonstrated equivalence to derive a posteriori and a priori error bounds for the eigenpair approximations of this method.
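As a short check of why the transformed problem has the same eigenvectors, assume only that Ax = λx. Applying A − τI twice gives

$$(A - \tau I)^2 x = (A - \tau I)\,(\lambda - \tau)\,x = (\lambda - \tau)^2 x.$$

For real symmetric A the transformed eigenvalues (λ − τ)² are nonnegative, so the eigenvalue of A closest to τ becomes the smallest eigenvalue of (A − τI)². An interior eigenvalue of A is thereby moved to the exterior of the transformed spectrum. This remark is only an illustration of the transformation and not a statement of the equivalence result proved in Chapter 2.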

The harmonic Rayleigh-Ritz method generates a whole set of approximations to eigenpairs of the matrix A, just as the Rayleigh-Ritz method. From this set an appropriate approximation to the eigenpair of interest must be selected. We conclude Chapter 2 by discussing a new criterion for doing this.

1.1.2 Chapter 3: Simple vector iterations

A reliable and robust extraction method results in an approximation from the subspace to the eigenpair of interest. Based on this approximation, we want to construct a vector to expand our subspace with. This is typically accomplished by iteratively solving some linear system. This is the inner iteration of the iterative projection method and is the subject of Chapter 3.

As a basis for an expansion strategy we will discuss simple vector iterations like inverse iteration and Rayleigh quotient iteration. These methods repeatedly compute a new approximation to an eigenvector based only on the approximation from the previous step. The important observation is that we can apply one step of some simple iteration scheme to the extracted approximation from the subspace and expand the subspace with the resulting vector. Many powerful subspace methods are based on this idea and sometimes the resulting methods are seen as accelerated versions of the simpler iterations. Therefore, we study in Chapter 3 simple vector iterations.

Of particular importance is Rayleigh quotient iteration, which computes a new approximation $u'$ based on a known approximation $u$ by solving the linear system

$$(A - \vartheta I)\,u' = u \quad \text{with} \quad \vartheta = \frac{u^T A u}{u^T u}. \qquad (1.1.1)$$

The matrix on the left is ill conditioned if ϑ is close to an eigenvalue and therefore accurate approximations to $u'$ are often too expensive to determine in practice. Rayleigh quotient iteration has appealing local convergence properties that are, unfortunately, lost when the matrix A in (1.1.1) is replaced by a nearby matrix that allows a cheaper computation of an approximate $u'$.
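The iteration (1.1.1) can be sketched in a few lines of Python. This is a minimal illustration with exact dense solves for a small real symmetric matrix, not the inexact setting studied in Chapter 3; the function name, the tolerance and the test matrix are assumptions made for this example.

```python
import numpy as np

def rayleigh_quotient_iteration(A, u, tol=1e-12, max_steps=20):
    """Rayleigh quotient iteration for a real symmetric matrix A.

    Each step solves (A - theta*I) u_new = u with theta the Rayleigh
    quotient of u, then normalizes u_new (hypothetical helper).
    """
    n = A.shape[0]
    u = u / np.linalg.norm(u)
    for _ in range(max_steps):
        theta = (u @ A @ u) / (u @ u)
        # Stop before the shifted matrix becomes numerically singular.
        if np.linalg.norm(A @ u - theta * u) < tol:
            break
        u_new = np.linalg.solve(A - theta * np.eye(n), u)
        u = u_new / np.linalg.norm(u_new)
    theta = (u @ A @ u) / (u @ u)
    return theta, u

# Small symmetric test matrix (an assumption for this sketch).
A = np.diag([1.0, 2.0, 4.0]) + 0.1 * np.ones((3, 3))
theta, u = rayleigh_quotient_iteration(A, np.array([1.0, 0.2, 0.1]))
residual = np.linalg.norm(A @ u - theta * u)
```

The residual test at the top of the loop avoids solving with a shifted matrix that is singular to working precision, which is the ill conditioning mentioned above.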

In Chapter 3 we propose a simple vector iteration that is based on the correction equation of the Jacobi-Davidson method. This iteration is mathematically equivalent to Rayleigh quotient iteration if the correction equation is solved exactly. However, it is observed in the literature that this correction equation is more robust with respect to replacing the exact matrix A with a nearby matrix. We will explain this by relating this iteration scheme to Rayleigh quotient iteration on a nearby matrix that possesses an eigenvector that is, with every step of the iteration, increasingly closer to the wanted eigenvector. This connection leads to convergence bounds for the simple iteration when the matrix A is replaced with some nearby matrix.

The correction equation may be solved with a (preconditioned) iterative solver for linear systems. Iterative solvers are usually terminated when a given relative residual precision for the linear system has been obtained. We discuss the effect of this criterion when used in the simple iteration. This confirms the results of Dembo et al. [26] for the more general class of inexact Newton methods. Their results show that higher order convergence can be achieved by working with an increasingly smaller tolerance. As a consequence, this gives a suitable sequence of tolerances for use in the Jacobi-Davidson method.

In our final section we discuss some numerical experiments where this strategy is applied for the full Jacobi-Davidson method, that is, an additional outer iteration is added to the simple iteration which contains the subspace acceleration. In these cases a trade-off has to be made between the amount of work that is spent in the inner iterations and in the outer iteration. Obviously, solving the correction equation very accurately is not efficient. Conversely, if the correction equation is solved less accurately then the number of outer iterations grows and therefore, for example, the cost for the orthogonalization of the basis for the subspace increases. We show by several numerical experiments that an improved condition, as discussed for the simple iteration, might also be useful for the complete method.


1.2 The overlap operator in quantum chromodynamics

In the second part of this thesis we start our discussion on numerical techniques for the overlap formulation in quantum chromodynamics (QCD), the physical theory that describes the strong interaction between elementary particles. This overlap formulation initiated a lot of research in solving linear systems of the form

$$(r G_5 - \operatorname{sign}(Q))\,x = b \quad (r \ge 1), \qquad (1.2.1)$$

where $Q$ and $G_5$ are sparse Hermitian indefinite matrices. In today’s simulations the dimension of $Q$ and $G_5$ is in the order of one to ten million. The matrix sign(Q) is the so-called matrix sign function or, more precisely, if we have the eigenvalue/eigenvector decomposition,

$$Q = X D X^*, \quad \text{with} \quad D = \operatorname{diag}(\lambda_1, \ldots, \lambda_n),$$

then the matrix sign function is defined as

$$\operatorname{sign}(Q) := X \operatorname{sign}(D) X^* = X \operatorname{diag}(\operatorname{sign}(\lambda_1), \ldots, \operatorname{sign}(\lambda_n))\, X^*,$$

where sign(t) is the standard sign function. Solving the full problem, that is solving (1.2.1) for x, given $G_5$ and $Q$, requires the solution of a simple linear system coupled to the nonlinear problem of computing the matrix sign function. Although the matrix Q is very sparse, the linear system in (1.2.1) is dense. The solution method that we consider, which is the method of choice in practical simulations, consists of applying a standard iterative solver for linear systems to (1.2.1). This is the outer iteration. One of the main advantages of using an iterative solver for this problem is that the matrix $r G_5 - \operatorname{sign}(Q)$ need not be known and stored explicitly, which is not feasible due to the density and large dimension of this matrix. Instead, we need to compute the product of this matrix with some vector in every outer iteration step (which is nevertheless still a computationally demanding task). Vector iteration methods for computing the product of sign(Q) with a generic vector are discussed in Chapter 4. In Chapter 5 we study the impact of an approximate matrix-vector product on various iterative solvers for linear systems, which should lead to strategies for tuning the precision of the matrix sign function times vector.
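The definition of sign(Q) through the eigendecomposition can be made concrete for a small dense matrix. The following Python sketch is only an illustration of the definition; a full eigendecomposition is not feasible at the dimensions mentioned above, which is why vector iteration methods are needed. The function name and the random test matrix are assumptions for this example.

```python
import numpy as np

def matrix_sign_times_vector(Q, y):
    """Compute sign(Q) @ y for a small Hermitian Q via Q = X D X^*.

    Hypothetical helper: only usable when Q is small enough for a
    dense eigendecomposition.
    """
    lam, X = np.linalg.eigh(Q)          # Q = X diag(lam) X^*
    return X @ (np.sign(lam) * (X.conj().T @ y))

rng = np.random.default_rng(0)
B = rng.standard_normal((6, 6))
Q = (B + B.T) / 2                       # real symmetric, indefinite in general
y = rng.standard_normal(6)

s1 = matrix_sign_times_vector(Q, y)
# For nonsingular Q, sign(Q)^2 = I, so applying it twice returns y.
s2 = matrix_sign_times_vector(Q, s1)
```

Only the action of sign(Q) on a vector is formed here, never the dense matrix sign(Q) itself, which mirrors how the outer iteration uses it.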

1.2.1 Chapter 4: Numerical methods for the overlap operator

In Chapter 4 we focus on the computation of the product of the matrix sign function with a generic vector, say y. The methods that we will consider are vector iteration methods that compute, in step k, an approximation of the form


where p is a polynomial of degree less than k. We give a unified treatment of, and propose various improvements to, a number of methods that have been considered previously in the literature. We consider, among others, methods based on Chebyshev polynomials, Lanczos approximations and methods exploiting the multi-shift Conjugate Gradient method. Special emphasis is put on explicit accuracy bounds on the inner iterations. This is important in order to be able to tune the precision of the computed matrix-vector product in every outer iteration step with strategies that we will propose in Chapter 5. We develop procedures for various approximation methods that guarantee a given accuracy for the matrix-vector product. In one particular method, frequently used by physicists, the matrix sign function is approximated by a rational matrix function written as a sum of poles; this gives

$$\operatorname{sign}(Q)\,y \approx \sum_{i=1}^{m} \omega_i\, Q\,(Q^2 + \tau_i I)^{-1} y.$$

The choice of the shifts $\tau_i$, the weights $\omega_i$ and the number of poles, m, depends on the type of rational approximation used, the location of the eigenvalues of Q and the required precision. This scheme reduces the problem to solving m so-called shifted linear systems, which may be efficiently accomplished with a method from the class of multi-shift Krylov subspace methods, which are variants of the standard iterative methods designed for solving families of shifted systems. The cost of this method depends, besides the standard cost of the conjugate gradient method, on the number of shifted systems to be solved.
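To make the sum-of-poles form concrete, the sketch below evaluates it with one dense solve per shift. Two assumptions should be stressed. First, the dense solves stand in for a multi-shift Krylov method and are only sensible for tiny matrices. Second, the weights and shifts come from the partial-fraction expansion of ((1+t)^{2m} − (1−t)^{2m}) / ((1+t)^{2m} + (1−t)^{2m}), a classical rational approximation of sign(t) chosen here because its coefficients have a closed form; it is not the Zolotarev approximation proposed in Chapter 4. All function names are hypothetical.

```python
import numpy as np

def polar_poles(m):
    """Weights and shifts of the rational function described above.

    With theta_i = pi*(i - 1/2)/(2m): omega_i = 1/(m cos^2 theta_i),
    tau_i = tan^2 theta_i (illustrative choice, not from the thesis).
    """
    theta = np.pi * (np.arange(1, m + 1) - 0.5) / (2 * m)
    weights = 1.0 / (m * np.cos(theta) ** 2)
    shifts = np.tan(theta) ** 2
    return weights, shifts

def sign_times_vector_poles(Q, y, weights, shifts):
    """Evaluate sum_i omega_i * Q (Q^2 + tau_i I)^{-1} y with dense solves."""
    n = Q.shape[0]
    Q2 = Q @ Q
    acc = np.zeros(n)
    for w, tau in zip(weights, shifts):
        acc += w * np.linalg.solve(Q2 + tau * np.eye(n), y)
    return Q @ acc

# Symmetric test matrix with known eigenvalues away from zero.
rng = np.random.default_rng(0)
X, _ = np.linalg.qr(rng.standard_normal((6, 6)))
lam = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
Q = X @ np.diag(lam) @ X.T
y = rng.standard_normal(6)

approx = sign_times_vector_poles(Q, y, *polar_poles(20))
exact = X @ (np.sign(lam) * (X.T @ y))
err = np.linalg.norm(approx - exact)
```

Each term in the loop is one shifted system with the same right-hand side y, which is the structure a multi-shift solver exploits to treat all m systems at roughly the cost of one.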

We will improve this method considerably by reducing the number of poles. First, we propose a new rational approximation based on the work of Zolotarev. This leads to a significant reduction of the number of necessary poles compared to the rational approximations previously used in the computation of the overlap operator. Furthermore, we propose a modification of the multi-shift iterative solver that saves computational work by using an individual tolerance for each shifted system. Again, we develop a procedure to guarantee a given accuracy. Chapter 4 is concluded with a comparative study with realistic configurations of the various improved methods on a parallel cluster computer. This shows that our new multi-shift approach based on Zolotarev’s work in combination with early termination of converged shifted systems is the most efficient.

1.2.2 Chapter 5: Inexact Krylov subspace methods

Matrix-vector products are an essential ingredient of iterative solvers for linear systems, in particular of the so-called Krylov subspace methods. In Chapter 5 we discuss the impact of an approximately computed matrix-vector product on a variety of iterative solvers for linear systems. Although this problem was motivated by the overlap formulation in quantum chromodynamics, we will give a very general treatment of this problem for linear systems of the form

$$Ax = b.$$

Following nomenclature often used in the literature, we will refer to Krylov subspace methods with approximate matrix-vector product as inexact Krylov subspace methods.

The ‘errors’ in the matrix-vector products essentially have two consequences: the accuracy of the iterative method is limited and, secondly, the convergence speed is altered. We investigate both aspects by studying the convergence behavior and smallest attainable value of the true residual, defined as $b - A x_k$ where $x_k$ is the computed approximation in step k of the iterative method. A consequence of working with an inexact matrix-vector product is that the computed residual in step k, $r_k$, usually is no longer the residual corresponding to the computed approximation $x_k$, hence $r_k \ne b - A x_k$. We have that

$$\underbrace{\|b - A x_k\|_2}_{\text{true residual}} \le \underbrace{\|r_k - (b - A x_k)\|_2}_{\text{residual gap}} + \underbrace{\|r_k\|_2}_{\text{computed residual}},$$

where the first quantity on the right is commonly referred to as the norm of the residual gap. This simple inequality forms the basis of our analysis. We argue that the attainable accuracy is determined by the norm of the residual gap whereas convergence speed is determined by the computed residuals. In Chapter 5 we study the residual gap and convergence behavior of the computed residuals for various Krylov subspace methods, including stationary methods, like Chebyshev iteration, as well as several non-stationary methods such as the Conjugate Gradient method and the GMRES method.
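The three quantities in this inequality can be observed in a small experiment. The sketch below uses Richardson iteration with a recursively updated residual as a deliberately simple stand-in for the Krylov methods of Chapter 5; the perturbation model (random noise of relative size `delta` added to each product), the test matrix and all names are assumptions for this illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
A = np.diag(np.linspace(1.0, 2.0, n))   # symmetric positive definite test matrix
b = rng.standard_normal(n)
alpha = 2.0 / 3.0                        # Richardson step 2/(lambda_min + lambda_max)
delta = 1e-6                             # relative precision of the matrix-vector product

def inexact_matvec(v):
    """Return A @ v plus a random perturbation of relative size delta."""
    g = rng.standard_normal(v.shape)
    g *= delta * np.linalg.norm(A @ v) / np.linalg.norm(g)
    return A @ v + g

x = np.zeros(n)
r = b.copy()                             # computed residual, r_0 = b - A x_0
for k in range(200):
    x = x + alpha * r
    r = r - alpha * inexact_matvec(r)    # recursive update with inexact product

true_res = np.linalg.norm(b - A @ x)
computed_res = np.linalg.norm(r)
gap = np.linalg.norm(r - (b - A @ x))
```

In this run the computed residual keeps decreasing while the true residual stagnates at the level of the residual gap, which is the separation between attainable accuracy and convergence speed described above.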

Bouras and Frayssé present in a recent technical report [14] a large number of numerical experiments in which, in step k of the GMRES method, the matrix-vector product is computed with a relative precision given by

$$\frac{\varepsilon}{\|b - A x_{k-1}\|_2}.$$

The value of ε is chosen in the order of the required residual precision. They empirically observe that, for this choice of the precision, the attainable precision of the inexact method is about ε. Furthermore, they notice from their numerical experiments that the convergence speed of the perturbed method is approximately as fast as for the exact GMRES method. They refer to this choice for the relative precision as a relaxation strategy, since it results in very accurate matrix-vector products in the early iterations but this precision is relaxed during the iteration process as soon as $x_{k-1}$ becomes a better approximation to the exact solution.

Our analysis in Chapter 5 explains the success of this strategy and shows that it is essentially correct and optimal. Furthermore, we point out when a similar strategy is appropriate for other Krylov subspace methods as well, and for some Krylov methods an even more aggressive relaxation strategy is proposed.


In the second part of Chapter 5 we discuss the computational advantages and drawbacks of the use of a relaxation strategy. We argue that the drawbacks can be overcome by preconditioning an inexact Krylov subspace method by another inexact Krylov subspace method set to a larger precision. This means, for example for the QCD problem, that we get, in total, a three-level iteration scheme. The nesting of inexact Krylov subspace methods can be a very effective tool in reducing the total cost of the matrix-vector multiplications. We demonstrate this for a Schur complement system that stems from a model that describes the steady barotropic flow in a homogeneous ocean with constant depth.


Chapter 2

Eigenvector approximations from a subspace

The research in this chapter is published as part of:

G. L. G. Sleijpen, J. van den Eshof, and P. Smit. Optimal a priori error bounds for the Rayleigh-Ritz method. Math. Comp., 72:677–684, 2003.

G. L. G. Sleijpen and J. van den Eshof. On the use of harmonic Ritz pairs in approximating internal eigenpairs.


2.1 Introduction

In many scientific computations it is at some point necessary to compute an eigenvector corresponding to some eigenvalue of a matrix A. Or, in other words, one wants to find an approximation to a pair (λ, x) (with $x \ne 0$) that satisfies

$$Ax = \lambda x.$$

Often the matrix A is of very large dimension but contains only a few nonzero elements, and only a small subset of the eigenvalues and eigenvectors is required. Iterative projection methods are designed for solving these large sparse eigenvalue problems and well-known examples of methods in this class include the Lanczos method [94, Chapter 13], the Davidson method [24] and Jacobi-Davidson [110], to mention only a few. There are two distinct aspects of this type of projection method. The first is the step-by-step construction of a subspace that contains approximations to the sought-after eigenvectors. The second aspect is the extraction of good eigenvector approximations from that subspace by using a projection technique.

The subspace projection is sometimes viewed as a way to accelerate the convergence of a simple iteration method, in a similar fashion as, for example, GMRES for systems of linear equations can be seen as an accelerated version of Richardson iteration. However, the situation for eigenvalue methods is often more delicate because frequently an approximate eigenpair from the subspace is used in the computation of a vector to expand the subspace or for restart purposes. For this reason the success of the solution method crucially depends on the success of extracting a good eigenvector approximation to a relevant eigenpair.

In this chapter we focus on the extraction phase. The expansion of the subspace is the subject of Chapter 3. This means that in this chapter we assume that we are given some subspace that contains a reasonable approximation to the eigenvector of interest to us which depends on the particular application. In the remainder of this section we outline the organization of this chapter.

The best-known method for forming approximations from a given subspace is the Rayleigh-Ritz method, which we discuss in Section 2.2. We then review a result that says that, if the subspace contains a good approximation to the wanted eigenvector, the Rayleigh-Ritz method constructs at least one approximate eigenpair for which the approximate eigenvalue is close to the eigenvalue of interest. We consider how good the associated approximate eigenvector (called a Ritz vector) is as an approximation to the eigenvector. This is the central question that we ask ourselves for Rayleigh-Ritz since it teaches us for which type of eigenvalues the Rayleigh-Ritz method is guaranteed to be an appropriate method. We will show that, in order for the eigenvector approximation to be relevant, it is sufficient that the target eigenvalue is in some sense an outlier in the spectrum. We will say that this eigenvalue is in the ‘exterior’ of the spectrum.

To get some insight into the behavior of the Ritz vectors as a function of the quality of the given subspace, we work out the details for the symmetric case by deriving error bounds for the Rayleigh-Ritz approximation to the eigenpair with the smallest eigenvalue. The bounds are expressed in terms of the eigenvalues of A and the angle between the subspace and the eigenvector of interest. We may therefore call these bounds truly a priori. (Obviously, all results can be transformed to statements about the largest eigenvalue and corresponding eigenvector by replacing A with −A.) This is the subject of Section 2.3.

In practical applications one is often searching for an eigenpair with the eigenvalue in some relevant region of the complex plane. For example, one is interested in the smallest eigenvalue or the one closest to some target value in the interior of the spectrum. Unfortunately, Rayleigh-Ritz is less suitable in this latter case. We discuss this in Section 2.3.4.

In particular for symmetric matrices there are various efforts to overcome the difficulties with finding interior eigenpairs. For example, Scott [102] argues that working with a shifted and inverted operator in Rayleigh-Ritz is preferable. Morgan points out in [86] that the necessary expensive inversion of the operator can be handled implicitly with a particular choice for the subspace. The resulting method has been given the name harmonic Rayleigh-Ritz in [93]. Independently of this work, the eigenvalue approximations of this method (the harmonic Ritz values) had already received considerable attention in the special case that the subspace is a so-called Krylov subspace. Then the harmonic Ritz values are equal to the roots of kernel polynomials, which play an important role in the theory of iterative minimal residual methods for linear systems, see [35, 84] and [32, Section 2.5] for some recent work and references. For general subspaces, harmonic Ritz values have also been studied in the context of Lehmann’s optimal inclusion intervals for eigenvalues [81, 82, 94, 7]. The connection between these different areas of research was made in [93].

In Section 2.4 we give a definition of harmonic Rayleigh-Ritz with respect to some shift parameter and we summarize some useful properties in Section 2.5. Subsequently, in Section 2.6, we compare harmonic Rayleigh-Ritz to refined Rayleigh-Ritz. Refined Rayleigh-Ritz, popularized by Jia [72], is another method to compute approximations from a subspace specially designed for eigenvectors with eigenvalues in the ‘interior’ of the spectrum. In Section 2.6.1 we give a relation that shows that both methods are equivalent in some sense. Although the relation between these two approaches is of interest on its own account, it turns out to be also useful in the rest of this chapter. If we vary the shift in the harmonic Rayleigh-Ritz method then the angle between the eigenvector approximation (the harmonic Ritz vector) and the target eigenvector changes. As an application of the relation between harmonic and refined Rayleigh-Ritz we also discuss in Section 2.6 the question of what shift for harmonic Rayleigh-Ritz minimizes this angle. This should provide insight into the issue of choosing this shift parameter.

The subject of Section 2.7 is a priori error bounds for the harmonic Rayleigh-Ritz method. We generalize well-known error bounds for Rayleigh-Ritz to the harmonic Rayleigh-Ritz context and discuss some of their limitations. A posteriori error bounds for the harmonic Ritz values are discussed in Section 2.8. By changing the shift in harmonic Rayleigh-Ritz different intervals can be obtained. Each interval contains at least one eigenvalue. We give a condition for a posteriori choosing a new shift that results in a smaller inclusion interval. Repeatedly relocating the shift using this condition will ultimately result in an, evidently appealing, optimal interval with respect to the given information. This interval can be used as an a posteriori error estimator.

So far we have assumed that we were able to identify the harmonic Ritz pair that has its approximate eigenvector close to the wanted eigenvector. When searching for the smallest and largest eigenvalues of a symmetric matrix with the Rayleigh-Ritz method, this is indeed not a difficult problem. However, when searching for an eigenpair with its eigenvalue closest to some target with harmonic Rayleigh-Ritz this is less obvious. For a particular shift, the harmonic Rayleigh-Ritz method produces a set of harmonic Ritz vectors. In practice, the eigenvector is unknown, and it is not obvious how to tell which vector from this set forms the best approximation to the target eigenvector. The problem of selecting a well-suited harmonic Ritz vector for a given shift is treated in Section 2.9.

Although some of the results in this chapter have practical applications, the purpose of this chapter is to provide insight rather than algorithms.

2.2 Rayleigh-Ritz approximations

Let $A \in \mathbb{C}^{n \times n}$ be a general matrix with eigenpairs (λ, x) and let $V \in \mathbb{C}^{n \times k}$ be a matrix whose columns form an orthonormal basis for the k-dimensional subspace $\mathcal{V}$. We are interested in techniques that compute approximations from a subspace to eigenpairs. The most important method in this class is the Rayleigh-Ritz method.

The Rayleigh-Ritz method obtains $k$ approximate eigenpairs $(\vartheta, u)$, the so-called Ritz pairs, by imposing the Ritz-Galerkin condition

$$Au - \vartheta u \perp \mathcal{V} \quad\text{with } u \in \mathcal{V}\setminus\{0\},$$

or equivalently,

$$V^*AVz - \vartheta z = 0 \quad\text{with } u := Vz \neq 0. \tag{2.2.1}$$

The value $\vartheta$ can be seen as an approximation to an eigenvalue of $A$ and is called a Ritz value. The associated vector $u$ (Ritz vector) forms an approximation to an eigenvector of $A$.¹ From (2.2.1) it follows that $\vartheta$ equals the so-called Rayleigh quotient, $\rho(u)$, of the vector $u$,

$$\vartheta = \rho(u) \quad\text{where}\quad \rho(v) := \frac{v^*Av}{v^*v}.$$

We will assume throughout this chapter that $\|u\|_2 = 1$.

¹ According to B.N. Parlett the terms Ritz value and Ritz vector are not correct in the non-Hermitian case for historical reasons. He proposed to overcome this problem of nomenclature by adding quotation marks in the non-Hermitian case, i.e., use the terms “Ritz value” and “Ritz vector”.
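As a concrete illustration of the definitions above, the following minimal Python sketch computes the two Ritz pairs of a small real symmetric matrix with respect to a two-dimensional subspace and checks the Ritz-Galerkin condition. The matrix, the basis vectors and the helper names (`ritz_pairs_2d`, `rayleigh_quotient`) are illustrative assumptions and do not come from the text; the closed-form solution of the projected $2\times 2$ problem assumes a nonzero off-diagonal entry.

```python
import math

def dot(p, q):
    return sum(a * b for a, b in zip(p, q))

def matvec(A, v):
    return [dot(row, v) for row in A]

def rayleigh_quotient(A, v):
    """rho(v) = v^T A v / v^T v for a real vector v."""
    return dot(v, matvec(A, v)) / dot(v, v)

def ritz_pairs_2d(A, v1, v2):
    """Ritz pairs of a real symmetric A w.r.t. span{v1, v2}; v1, v2 must be orthonormal."""
    Av1, Av2 = matvec(A, v1), matvec(A, v2)
    a, b, d = dot(v1, Av1), dot(v1, Av2), dot(v2, Av2)  # H = V^T A V = [[a, b], [b, d]]
    mean, rad = 0.5 * (a + d), math.hypot(0.5 * (a - d), b)
    pairs = []
    for theta in (mean - rad, mean + rad):
        z = (b, theta - a)             # solves (H - theta I) z = 0, assuming b != 0
        nz = math.hypot(z[0], z[1])
        u = [(z[0] * x + z[1] * y) / nz for x, y in zip(v1, v2)]  # unit-norm Ritz vector
        pairs.append((theta, u))
    return pairs

# Hypothetical test data: a symmetric tridiagonal matrix and an orthonormal basis of V.
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 4.0]]
v1 = [1.0, 0.0, 0.0]
v2 = [0.0, 1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)]

pairs = ritz_pairs_2d(A, v1, v2)
for theta, u in pairs:
    residual = [y - theta * x for x, y in zip(u, matvec(A, u))]
    # Ritz-Galerkin condition: the residual is orthogonal to both basis vectors of V.
    print(theta, dot(residual, v1), dot(residual, v2))
```

In this sketch each Ritz value coincides with the Rayleigh quotient of its Ritz vector, which mirrors the statement following (2.2.1).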

In this chapter we assume that we are looking for some approximation to a particular eigenpair that we denote by $(\lambda, x)$. In order to be able to construct robust algorithms for eigenvector computation, we need reliable methods for extracting from a subspace an approximation to this eigenvector $x$, and similarly for the eigenvalue. Therefore, we consider the following question: suppose that we are searching for an eigenpair $(\lambda, x)$ and that $\angle(\mathcal{V}, x)$ is small, is there then a Ritz pair $(\vartheta, u)$ such that $|\vartheta-\lambda|$ and $\angle(u, x)$ are small? For the Ritz values this question is answered positively by the following result.

Theorem 2.2.1 (Stewart and Jia [74]). There exists a Ritz value $\vartheta$ such that

$$|\vartheta-\lambda| \le 4\|A\|_2\,|\tan\angle(\mathcal{V}, x)|^{1/k}\,\bigl(2+|\tan\angle(\mathcal{V}, x)|\bigr)^{1-1/k}.$$

This shows that if the angle between the subspace $\mathcal{V}$ and the unknown eigenvector $x$ decreases there is always a Ritz value getting closer and closer to the eigenvalue $\lambda$. For the Ritz vectors the following result is well known. It was originally proved by Saad [98] for real symmetric matrices and later extended by Stewart to the general case [119].

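Theorem 2.2.2 (Stewart [119]). Let $u$ be a Ritz vector with respect to the space $\mathcal{V}$. Let $\mathcal{W}$ be the orthogonal complement of $u$ in $\mathcal{V}$. Then

$$\sin^2\angle(u, x) \le \left(1+\frac{\eta^2}{\alpha^2}\right)\sin^2\angle(\mathcal{V}, x), \tag{2.2.2}$$

where $\eta := \|VV^*A(I-VV^*)\|_2$ and $\alpha := \inf_{\|z\|_2=1}\|(W^*AW)z-\lambda z\|_2$ with $W$ an arbitrary orthogonal basis for $\mathcal{W}$.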

The problem with this bound is that the value $\alpha$ in general cannot be bounded from below a priori. It can be shown [74] that a similar result holds with $\lambda$ in the expression for $\alpha$ in Theorem 2.2.2 replaced by $\vartheta$. This gives the possibility of a posteriori checking the quality of the Ritz vector. Nevertheless, it can happen in practice that a good subspace $\mathcal{V}$ (i.e., $\angle(\mathcal{V}, x)$ small) results in a Ritz value close to the eigenvalue of interest, $\lambda$, but the theorem does not guarantee that the corresponding Ritz vector is a good approximation to $x$ because $\alpha$ is small. Unfortunately, in practical situations it is observed that in these cases the Ritz vector can be totally irrelevant. We return to this in Section 2.3.4 (see also [102, 74, 86]).

Theorem 2.2.2 does not exclude that there are eigenvalues $\lambda$ for which we can say beforehand that we can safely use the Ritz vector corresponding to $\vartheta$ as an approximation to $x$. This means that we have to show that $\alpha$ is bounded from below if $\vartheta$ is close to the target eigenvalue. This quantity $\alpha$ is unfortunately difficult to assess since it requires knowledge of the unknown Ritz vector. Therefore, we give the following variant of Theorem 2.2.2 that involves a quantity $\gamma(\cdot)$.

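Theorem 2.2.3. Let $(\vartheta, u)$ be a Ritz pair with respect to the space $\mathcal{V}$. If

$$\gamma(\vartheta) > 0 \quad\text{where}\quad \gamma(\mu) := \min_{z\perp x,\ z\neq 0}\left|\frac{z^*Az}{z^*z}-\mu\right|,$$

then

$$|\tan\angle(u, x)| \le \frac{\eta}{\gamma(\vartheta)}\,|\tan\angle(\mathcal{V}, x)|, \tag{2.2.3}$$

where $\eta := \|(I-xx^*)(A-\vartheta I)(I-xx^*)\|_2$.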

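Proof. Without loss of generality we can assume that the matrix $A$ is upper triangular and is of the form

$$A = \begin{bmatrix} \lambda & r \\ 0 & R \end{bmatrix}, \tag{2.2.4}$$

with $r \in \mathbb{R}^{1\times(n-1)}$ and $R \in \mathbb{R}^{(n-1)\times(n-1)}$ upper triangular. Hence, the eigenvector of interest is simply the first standard basis vector: $x = e_1$. Let $x_{\mathcal{V}}$ be the projection of $x$ onto the space $\mathcal{V}$. For the moment we assume that we can write

$$u' := (e_1^*u)^{-1}u = \begin{bmatrix} 1 \\ e \end{bmatrix} \quad\text{and}\quad x'_{\mathcal{V}} := (e_1^*x_{\mathcal{V}})^{-1}x_{\mathcal{V}} = \begin{bmatrix} 1 \\ f \end{bmatrix}. \tag{2.2.5}$$

The residual of the Ritz vector $u'$,

$$Au' - \vartheta u' = \begin{bmatrix} \lambda-\vartheta+re \\ (R-\vartheta I)e \end{bmatrix},$$

is by definition orthogonal to $u'$ and $x'_{\mathcal{V}}$. This results in the two equations

$$0 = \lambda-\vartheta+re+e^*(R-\vartheta I)e,$$

$$0 = \lambda-\vartheta+re+f^*(R-\vartheta I)e.$$

Equating both expressions and taking the absolute value on both sides gives

$$\gamma(\vartheta)\|e\|_2^2 \le |e^*(R-\vartheta I)e| = |f^*(R-\vartheta I)e| \le \|f\|_2\,\|e\|_2\,\|R-\vartheta I\|_2,$$

from which (2.2.3) follows. It remains to be checked that $u$ is not perpendicular to $x$, or equivalently,

$$u \neq \begin{bmatrix} 0 \\ e \end{bmatrix} =: u'$$

for some $e$ with $\|e\|_2 = 1$. This also implies that $x_{\mathcal{V}}$ is nonzero. The proof is by contradiction: writing out $u'^*(Au'-\vartheta u') = 0$ gives that $e^*(R-\vartheta I)e = 0$, which contradicts the assumption $\gamma(\vartheta) > 0$.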

It follows from this theorem that if $\gamma(\lambda) > 0$ then the Ritz vector associated to $\vartheta$ is a good approximation to $x$ if $\vartheta$ is close enough to $\lambda$. Theorem 2.2.1 shows that there is a Ritz value arbitrarily close to $\lambda$ if the quality of the subspace is high (that is, $\angle(\mathcal{V}, x)$ is small). Hence, for this type of eigenvalue we can safely use the Rayleigh-Ritz approximation without extra precautions. The condition $\gamma(\lambda) > 0$ means that $\lambda$ is, in some sense, an extreme eigenvalue. For example, if $A$ is normal it says that $\lambda$ is outside the convex hull of the other eigenvalues of $A$. In particular for the real symmetric case it means that we can expect sensible approximations for the ‘smallest’ and ‘largest’ eigenpair.

The bound in Theorem 2.2.3 does not provide a true a priori error bound since it requires knowledge of $\vartheta$. Therefore, it is difficult to interpret how the quality of the Ritz vector precisely depends on the quality of the subspace ($\angle(\mathcal{V}, x)$). In the next section we derive a priori error bounds for the Ritz vector in the real symmetric case when approximating the smallest eigenvalue.

2.3 A priori error bounds for the Ritz pair

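From now on we assume that $A \in \mathbb{R}^{n\times n}$ is symmetric and $V$ is real. The eigenpairs $(\lambda_i, x_i)$ of $A$ are numbered such that

$$\lambda_1 \le \lambda_2 \le \dots \le \lambda_n,$$

and we index the Ritz values in a similar fashion:

$$\vartheta_1 \le \vartheta_2 \le \dots \le \vartheta_{k-1} \le \vartheta_k.$$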

From Theorem 2.2.3 we know that the Rayleigh-Ritz method can be safely used for finding an approximation to the first eigenpair. In this section we want to make this statement more precise and we are interested in the Ritz pair, $(\vartheta_{\mathcal{V}}, u_{\mathcal{V}})$, for which $\sin^2\angle(u_{\mathcal{V}}, x_1)$ is minimal over all Ritz vectors $u_i$. This is the pair with the Ritz vector that makes the smallest angle with $x_1$ over all Ritz vectors. In the ideal case we would have that $u_{\mathcal{V}}$ is a multiple of $x_{\mathcal{V}}$, where $x_{\mathcal{V}}$ is the normalized projection of $x_1$ on $\mathcal{V}$. This would give $\sin^2\angle(u_{\mathcal{V}}, x_1) = \sin^2\angle(\mathcal{V}, x_1)$, which is optimal. Unfortunately, the approximation $u_{\mathcal{V}}$ is not a multiple of $x_{\mathcal{V}}$ in general.

In this section we derive optimal upper bounds for the first Ritz pair. We will moreover show that $\vartheta_{\mathcal{V}}$ equals $\vartheta_1$ given that the subspace contains a sufficiently accurate approximation. This is the subject of Section 2.3.2. For the convenience of the reader and for comparison purposes, we start in the next subsection by discussing some classical bounds for the first Ritz pair that can be found in the literature. Besides our theoretical interest in a priori error bounds, the new, sharper bounds can be used to improve a priori convergence bounds for iterative eigenvalue methods. Often, the analysis of these methods can be split into the construction of an upper bound on $\sin^2\angle(\mathcal{V}, x_1)$ and the analysis of the error contributed by the Rayleigh-Ritz method. For example, Theorem 1 in [98] gives a bound for the angle between $x_1$ and Krylov subspaces. Combining this with the classical and known error bounds discussed in the next section gives precisely the bound for the first eigenvector of Kaniel [75] for the Lanczos method. In the literature, these bounds are often improved by (implicitly) constructing better bounds for $\sin^2\angle(\mathcal{V}, x_1)$. In this section we focus on error bounds for the Rayleigh-Ritz method and our results are not restricted to a specific method.

2.3.1 A well-known upper bound

A first approach for obtaining a true a priori bound is suggested at the end of Section 11.9 in [94] where the elegant bounds of Kaniel [75] (see also [94, Theorem 11.9.2]) are the starting point. Using the notation $\varepsilon := \sin^2\angle(\mathcal{V}, x_1)$ these bounds are summarized by the following theorem.

Theorem 2.3.1 (Kaniel [75]).

$$\vartheta_1-\lambda_1 \le (\lambda_n-\lambda_1)\,\varepsilon, \tag{2.3.1}$$

$$\sin^2\angle(u_1, x_1) \le \frac{\vartheta_1-\lambda_1}{\lambda_2-\lambda_1}. \tag{2.3.2}$$

Furthermore, both inequalities are sharp.

We recall that for more general matrices we gave an error bound in Theorem 2.2.3 for the Ritz vector in case the corresponding Ritz value is close to an ‘extreme’ eigenvalue $\lambda$, that is, $\gamma(\lambda) > 0$. An interesting question is whether, for this more general situation, we can also derive a bound in terms of the eigenvalues of the matrix and $|\lambda-\vartheta|$ as in (2.3.2). It turns out that this is not possible, not even for $\gamma(\lambda) > 0$. To see this, let $u$ be some vector with $\rho(u) = \lambda$ and let $u$ and $A$ be decomposed as in (2.2.5) and (2.2.4), respectively. Then,

0 =ρ(u)−λ=r∗e+e∗Re.

This equality does not imply that $e = 0$ if $\gamma(\lambda) > 0$. Therefore, it follows from this example of the Rayleigh-Ritz method with a one-dimensional subspace that it is in general not possible to derive error bounds for the Ritz vector in terms of the quantity $|\vartheta-\lambda|$ and the eigenvalues of the matrix only (unless $r$ in (2.2.4) is zero).

We return to the issue of deriving a priori error bounds for the Ritz vectors in the symmetric case. From Theorem 2.3.1 we can easily obtain an error bound for the first Ritz vector that is truly a priori, in other words it is expressed in terms of ε and the eigenvalues of A. The proof of this statement is a straightforward combination of (2.3.1) and (2.3.2).

Theorem 2.3.2.

$$\sin^2\angle(u_1, x_1) \le \frac{\lambda_n-\lambda_1}{\lambda_2-\lambda_1}\,\varepsilon = \left(1+\frac{\lambda_n-\lambda_2}{\lambda_2-\lambda_1}\right)\varepsilon. \tag{2.3.3}$$

Although (2.3.3) is a combination of the sharp bounds (2.3.1) and (2.3.2), there is no guarantee that this bound is sharp itself. Since (2.3.2) attains equality if $u_1$ has a component in the direction of $x_2$, while for (2.3.1) equality is attained when there is a component in the direction of $x_n$, it is suggested that (2.3.3) may not be sharp. Indeed, in the next section we improve this bound and construct a sharp bound for $\varepsilon < \frac{\lambda_2-\lambda_1}{\lambda_n-\lambda_1}$. Notice that (2.3.3) is not useful when this condition on $\varepsilon$ is not fulfilled.

Another question that we address is whether $\vartheta_{\mathcal{V}}$ equals $\vartheta_1$. This is important for the selection problem, i.e., at some point, it is necessary to select the Ritz vector that makes the smallest angle with $x_1$.

2.3.2 A sharp upper bound

In his PhD thesis [117] and in technical report [116], Smit addressed the problem of obtaining optimal bounds for the Rayleigh-Ritz process. He derived such bounds for the case $\dim(\mathcal{V}) = 2$ and generated approximations for the $k$-dimensional case ($k > 2$) by numerical experiments. On the basis of his numerical results, he conjectured that when $\varepsilon < \frac{\lambda_2-\lambda_1}{\lambda_n-\lambda_1}$, the optimal bound for the $k$-dimensional case equals the optimal bound for the two-dimensional case. In this section we prove that this is indeed correct.

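For convenience we use the following notation. Let $\delta_{\mathcal{V}} := \min_j \sin^2\angle(u_j, x_1)$, where the minimum is taken over all Ritz vectors, $u_j$, with respect to $\mathcal{V}$. Put $\varepsilon_{\mathcal{V}} := \sin^2\angle(\mathcal{V}, x_1)$. For $\varepsilon > 0$ we define

$$\delta_k(\varepsilon) := \max\{\delta_{\mathcal{V}} \mid \dim(\mathcal{V}) = k,\ \varepsilon_{\mathcal{V}} \le \varepsilon\}.$$

The following lemma is an adaptation of Theorem 4.1 in [116]. We give a shorter proof and have added the statement that $\vartheta_{\mathcal{V}} = \vartheta_1$ in case $\varepsilon < \frac{\lambda_2-\lambda_1}{\lambda_n-\lambda_1}$, which we need in the remainder of this section.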

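Lemma 2.3.1. If $\dim(\mathcal{V}) = 2$ and $0 \le \varepsilon < \frac{\lambda_2-\lambda_1}{\lambda_n-\lambda_1}$, then $\vartheta_{\mathcal{V}} = \vartheta_1 < \lambda_2$. Furthermore,

$$\delta_2(\varepsilon) = \begin{cases} \frac{1}{2}(1+\varepsilon)-\frac{1}{2}\sqrt{(1-\varepsilon)^2-\kappa\varepsilon} & \text{if } \varepsilon < \frac{\lambda_2-\lambda_1}{\lambda_n-\lambda_1}, \\[4pt] \frac{1}{2}(1+\varepsilon) & \text{if } \varepsilon \ge \frac{\lambda_2-\lambda_1}{\lambda_n-\lambda_1}, \end{cases}$$

with $\kappa := \dfrac{(\lambda_n-\lambda_2)^2}{(\lambda_n-\lambda_1)(\lambda_2-\lambda_1)}$.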

Proof. Let $0 < \varepsilon < 1$ be given (the proof for $\varepsilon = 0$ and $\varepsilon = 1$ is obvious), and let $\mathcal{V}$ be such that $\sin^2\angle(\mathcal{V}, x_1) = \varepsilon$. We derive a sharp upper bound for the approximation to $x_1$ by the Ritz vectors with respect to $\mathcal{V}$. Because this bound is monotonically increasing this gives an expression for $\delta_k(\varepsilon)$. Notice that the Rayleigh-Ritz procedure is shift invariant and we are allowed to work with $A-\lambda_1 I$.

Let $(0, x_1)$, $(\mu_1, w_1)$ and $(\mu_2, w_2)$ be the three Ritz pairs of the shifted matrix $A-\lambda_1 I$ with respect to the three-dimensional subspace spanned by $\mathcal{V}$ and $x_1$, where we have numbered $\mu_1$ and $\mu_2$ such that $\mu_1 \le \mu_2$. The vectors $w_1$ and $w_2$ are normalized. It turns out that working with $w_1$ and $w_2$ simplifies the calculations a bit.

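We define for each pair $(c, s)^T$ on the unit circle a subspace $\mathcal{V}_s$ as the span of:

$$v_s^{(1)} := x_1\sqrt{1-\varepsilon}+c\,w_1\sqrt{\varepsilon}+s\,w_2\sqrt{\varepsilon} \quad\text{and}\quad v_s^{(2)} := -s\,w_1+c\,w_2.$$

For some pair $(c_0, s_0)^T$ we have that $\mathcal{V} = \mathcal{V}_{s_0}$.

With respect to this basis, the projected matrix $A_s := V_s^T(A-\lambda_1 I)V_s$ is given by

$$A_s = \begin{bmatrix} \varepsilon(c^2\mu_1+s^2\mu_2) & \sqrt{\varepsilon}\,sc\,(\mu_2-\mu_1) \\ \sqrt{\varepsilon}\,sc\,(\mu_2-\mu_1) & c^2\mu_2+s^2\mu_1 \end{bmatrix}.$$

Notice that if $s = 0$ or $c = 0$ the vector $x_{\mathcal{V}}$ is a Ritz vector and we can exclude this situation from our analysis.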

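Let $u_i := V_s z_i$, with $z_i = (t_i, 1)^T$ a scaled eigenvector of the projected matrix $A_s$. Then

$$\sin^2\angle(u_i, x_1) = 1-\frac{t_i^2}{1+t_i^2}(1-\varepsilon) = \frac{1+\varepsilon t_i^2}{1+t_i^2} = \frac{1}{2}(1+\varepsilon)-\frac{1}{2}(1-\varepsilon)\,\frac{t_i^2-1}{t_i^2+1}. \tag{2.3.4}$$

We are interested in the smallest possible value of $\max\{|t_1|, |t_2|\}$. It suffices to analyze the eigenvectors of

$$A'_s := \frac{1}{\mu_2 c^2}A_s = \begin{bmatrix} \varepsilon(\mu+\tau^2) & \sqrt{\varepsilon}\,\tau(1-\mu) \\ \sqrt{\varepsilon}\,\tau(1-\mu) & 1+\tau^2\mu \end{bmatrix}, \quad\text{where } \mu := \mu_1/\mu_2 \text{ and } \tau = s/c.$$

The ratio of the coordinates of the equation $A'_s(t, 1)^T = \vartheta'(t, 1)^T$ is given by:

$$t\,\varepsilon(\mu+\tau^2)+\sqrt{\varepsilon}\,\tau(1-\mu) = t^2\sqrt{\varepsilon}\,\tau(1-\mu)+t\,(\tau^2\mu+1).$$

The vector $(t, 1)^T$ is an eigenvector of $A'_s$ if and only if $t$ satisfies this equation. We investigate the possible values for $t$:

$$\frac{1-t^2}{t} = g(\tau) := \frac{\alpha}{\tau}+\beta\tau, \quad\text{where}\quad \alpha := \frac{1-\varepsilon\mu}{\sqrt{\varepsilon}\,(1-\mu)}, \quad \beta := \frac{\mu-\varepsilon}{\sqrt{\varepsilon}\,(1-\mu)}.$$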

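Because $\varepsilon < 1$ and $\mu \le 1$ we have that $\alpha > 0$. We start by giving a proof for $\tau > 0$.

We first consider the case where $\beta > 0$, or, equivalently, $\varepsilon < \mu$. Then $g(\tau)$ takes values between $2\sqrt{\alpha\beta}$ and $\infty$. Hence, $t$ takes values between $0$ and $\sqrt{\alpha\beta+1}-\sqrt{\alpha\beta}$ and between $-\infty$ and $-(\sqrt{\alpha\beta+1}+\sqrt{\alpha\beta})$. Because $z_1 \perp z_2$, we know that $t_1 = -t_2^{-1}$ and it easily follows that there is a $t_i$ in each of the two intervals. Define $t_1$ to be in the negative interval and notice that $|t_1| > |t_2|$. The value $|t_1| = \sqrt{\alpha\beta+1}+\sqrt{\alpha\beta}$ is the smallest possible value for $\max\{|t_1|, |t_2|\}$; this gives:

$$\frac{t^2-1}{t^2+1} = \sqrt{\frac{\alpha\beta}{\alpha\beta+1}} = \sqrt{1-\frac{(1-\mu)^2}{\mu}\,\frac{\varepsilon}{(1-\varepsilon)^2}}.$$

Inserting this in (2.3.4) gives the expression for $\delta_k(\varepsilon)$ when $\varepsilon < \mu$ and $\tau > 0$.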

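Now we show that in case $\varepsilon < \mu$ and $\tau > 0$, $\vartheta_{\mathcal{V}}$ equals $\vartheta_1$. Let $(t_i, 1)^T$ be an eigenvector of $A_s$, then the second component of the vector $A_s(t_i, 1)^T$ gives an expression for $\vartheta_i$:

$$\vartheta_i = \mu_2 c^2\left(t_i\sqrt{\varepsilon}\,\tau(1-\mu)+1+\tau^2\mu\right).$$

If we recall the signs of $t_i$, we have that $\vartheta_1 < \vartheta_2$ and because $|t_1| > |t_2|$ we get that $\vartheta_{\mathcal{V}} = \vartheta_1$.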

If $\beta \le 0$, or equivalently $\varepsilon \ge \mu$, then $g(\tau)$ takes all values. Therefore, $t$ can take all values between the same bounds. Consequently, there is a $\tau$ for which $t_1 = 1$ and $t_2 = -1$ are solutions. This corresponds to the worst possible situation. In this case we have two Ritz vectors, $u_1$ and $u_2$, that make the same angle with $x_1$ and $\sin^2\angle(u_i, x_1) = \frac{1}{2}(1+\varepsilon)$.

For $\tau < 0$ the same reasoning can be used. The proof for the expression of $\delta_k(\varepsilon)$ is concluded by noting that $\mu = \frac{\lambda_2-\lambda_1}{\lambda_n-\lambda_1}$ is the smallest possible value for $\mu$ and this is the worst situation.

Notice that the bound $(1+\varepsilon)/2$ holds for any orthogonal basis for $\mathcal{V}$. So, if $\varepsilon \ge \frac{\lambda_2-\lambda_1}{\lambda_n-\lambda_1}$, then the Ritz vectors are not guaranteed to contain a better approximation than, for example, simply the columns of the matrix $V$.
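The worst-case construction in the proof of Lemma 2.3.1 can be checked numerically. The following Python sketch is only an illustration under stated assumptions: it takes $A = \operatorname{diag}(\lambda_1, \lambda_2, \lambda_n)$ with the hypothetical values $(0, 1, 5)$, so that $x_1 = e_1$, $w_1 = e_2$, $w_2 = e_3$ and no shift is needed, picks $\tau = \sqrt{\alpha/\beta}$ (the minimizer of $g$ for $\beta > 0$), and compares the smallest $\sin^2$ angle of the two Ritz vectors with the closed-form expression. The function names are not from the text.

```python
import math

def dot(p, q):
    return sum(a * b for a, b in zip(p, q))

def ritz_values_and_sin2(lams, v1, v2):
    """For A = diag(lams) and orthonormal v1, v2: list of (Ritz value, sin^2 angle(u, e_1))."""
    Av1 = [l * x for l, x in zip(lams, v1)]
    Av2 = [l * x for l, x in zip(lams, v2)]
    a, b, d = dot(v1, Av1), dot(v1, Av2), dot(v2, Av2)   # projected matrix [[a, b], [b, d]]
    mean, rad = 0.5 * (a + d), math.hypot(0.5 * (a - d), b)
    out = []
    for theta in (mean - rad, mean + rad):
        z1, z2 = b, theta - a              # eigenvector of the projected matrix (needs b != 0)
        u = [z1 * x + z2 * y for x, y in zip(v1, v2)]
        out.append((theta, 1.0 - u[0] ** 2 / dot(u, u)))
    return out

def delta2(eps, lam1, lam2, lamn):
    """Closed-form expression of Lemma 2.3.1 for eps below (lam2 - lam1) / (lamn - lam1)."""
    kappa = (lamn - lam2) ** 2 / ((lamn - lam1) * (lam2 - lam1))
    return 0.5 * (1 + eps) - 0.5 * math.sqrt((1 - eps) ** 2 - kappa * eps)

lams = (0.0, 1.0, 5.0)           # assumed lambda_1, lambda_2, lambda_n
eps = 0.1                        # sin^2 of the angle between V and x_1
mu = (lams[1] - lams[0]) / (lams[2] - lams[0])
alpha = (1 - eps * mu) / (math.sqrt(eps) * (1 - mu))
beta = (mu - eps) / (math.sqrt(eps) * (1 - mu))
tau = math.sqrt(alpha / beta)    # minimizes g(tau) = alpha / tau + beta * tau
c = 1.0 / math.sqrt(1.0 + tau ** 2)
s = tau * c
v1 = [math.sqrt(1 - eps), c * math.sqrt(eps), s * math.sqrt(eps)]
v2 = [0.0, -s, c]

results = ritz_values_and_sin2(lams, v1, v2)
best_theta, best_sin2 = min(results, key=lambda pair: pair[1])
print(best_theta, best_sin2, delta2(eps, *lams))
```

For these values the best Ritz vector belongs to the smallest Ritz value, which lies below $\lambda_2$, in line with the statement $\vartheta_{\mathcal{V}} = \vartheta_1 < \lambda_2$.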

Now we are ready to give a proof for Conjecture 5.1 in [116]. This conjecture states that $\delta_k(\varepsilon) = \delta_2(\varepsilon)$ in case $\varepsilon < \frac{\lambda_2-\lambda_1}{\lambda_n-\lambda_1}$. As a consequence, the expression for $\delta_k(\varepsilon)$ is given by the expression in Lemma 2.3.1.

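Theorem 2.3.3. If $\varepsilon \in \left[0, \frac{\lambda_2-\lambda_1}{\lambda_n-\lambda_1}\right)$ then $\vartheta_{\mathcal{V}} = \vartheta_1 < \lambda_2$.

For all $k \in \{2, \dots, n-1\}$ and all $\varepsilon \in \left[0, \frac{\lambda_2-\lambda_1}{\lambda_n-\lambda_1}\right)$, we have:

$$\delta_k(\varepsilon) = \frac{1}{2}(1+\varepsilon)-\frac{1}{2}\sqrt{(1-\varepsilon)^2-\kappa\varepsilon}, \quad\text{with } \kappa := \frac{(\lambda_n-\lambda_2)^2}{(\lambda_n-\lambda_1)(\lambda_2-\lambda_1)}. \tag{2.3.5}$$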

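Proof. Assume that $\varepsilon_{\mathcal{V}} < \frac{\lambda_2-\lambda_1}{\lambda_n-\lambda_1}$. Then $\vartheta_1 < \lambda_2$ (see (2.3.1)). Consider the space $\mathcal{V}'$ spanned by $u_1$ and $x_{\mathcal{U}}$, where $x_{\mathcal{U}}$ is the normalized projection of $x_1$ on $\mathcal{U} := \operatorname{span}(u_2, \dots, u_k)$. Notice that $u_1$ and $x_{\mathcal{U}}$ are Ritz vectors with respect to this two-dimensional space $\mathcal{V}'$. Lemma 2.3.1 states that for this two-dimensional $\mathcal{V}'$, the angle between $u_1$ and $x_1$ is less than the angle between $x_{\mathcal{U}}$ and $x_1$. Since the angle between $x_{\mathcal{U}}$ and $x_1$ is smaller than the angle between any vector from $\mathcal{U}$ and $x_1$, we may conclude that $\vartheta_{\mathcal{V}} = \vartheta_1$.

Note that $\varepsilon_{\mathcal{V}} = \varepsilon_{\mathcal{V}'}$ and $\delta_{\mathcal{V}} = \delta_{\mathcal{V}'} \le \delta_2$, which implies that $\delta_k \le \delta_2$.

We now show that $\delta_2 \le \delta_k$. Let $\dim(\mathcal{V}) = 2$, then select an orthogonal system $v_3, \dots, v_k$ that is orthogonal to $u_1$, $u_2$, and $Au_1-\vartheta_1 u_1$. Then $(\vartheta_1, u_1)$ is also a Ritz pair of the space $\mathcal{V}'$ spanned by $u_1, u_2, v_3, \dots, v_k$. Since $\vartheta_1 < \lambda_2$, Cauchy’s Theorem (Theorem 10.1.1 in [94]) guarantees that the extension does not introduce a Ritz value in $[\lambda_1, \lambda_2)$. As argued above, $\vartheta_{\mathcal{V}} = \vartheta_{\mathcal{V}'} = \vartheta_1$. Moreover $\varepsilon_{\mathcal{V}'} \le \varepsilon_{\mathcal{V}}$. Apparently, $\delta_{\mathcal{V}} = \delta_{\mathcal{V}'} \le \delta_k$.

We have that $\delta_2 = \delta_k$ and Lemma 2.3.1 now gives the expression for $\delta_k$.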

We recall that the restriction on ε in Theorem 2.3.3 in this situation does not make this bound more restrictive than the ‘classical’ bound (2.3.3) in the previous section.

2.3.3 Some results based on Theorem 2.3.3

We mention a few consequences of Theorem 2.3.3. Corollaries 2.3.1 and 2.3.2 below generalize Corollaries 4.3 and 4.4, respectively, in [116]. The first corollary describes the behavior of the upper bound (2.3.5) for small $\varepsilon$.

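Corollary 2.3.1. For all $k \in \{2, \dots, n-1\}$, we have:

$$\delta_k(\varepsilon) = \varepsilon\left(1+\frac{1}{4}\,\frac{(\lambda_n-\lambda_2)^2}{(\lambda_n-\lambda_1)(\lambda_2-\lambda_1)}\right)+O(\varepsilon^2) \quad (\varepsilon \to 0). \tag{2.3.6}$$

Proof.

$$\sqrt{(1-\varepsilon)^2-\kappa\varepsilon} = 1-\varepsilon-\frac{1}{2}\kappa\varepsilon+O(\varepsilon^2) \quad\text{for } \varepsilon \to 0.$$

Inserting this in (2.3.5) and using the definition of $\kappa$ gives the required expression.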

Inequality (2.3.3) is of a linear form. Using Theorem 2.3.3 we can improve this by at most a factor two. Corollary 2.3.2 gives a linear bound that equals (2.3.5) in $\varepsilon = 0$ and $\varepsilon = \frac{\lambda_2-\lambda_1}{\lambda_n-\lambda_1}$. Notice that $\delta_k(\varepsilon)$ is a convex function in this interval and, hence, this is the best linear bound possible.

Figure 2.1: Illustration of different bounds for $\lambda_1 = 0$, $\lambda_2 = 1$ with $\lambda_n = 1.2$ (left picture) and $\lambda_n = 5$ (right picture). The bounds are along the vertical axis: Theorem 2.3.2 (dotted), Theorem 2.3.3 (solid), bound (2.3.7) of Corollary 2.3.2 (dash dot) and bound (2.3.8) of Corollary 2.3.3 (dashed). $\varepsilon$ is along the horizontal axis.

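Corollary 2.3.2. For all $k \in \{2, \dots, n-1\}$ and all $\varepsilon \in \left[0, \frac{\lambda_2-\lambda_1}{\lambda_n-\lambda_1}\right)$, we have:

$$\delta_k(\varepsilon) \le \varepsilon\left(1+\frac{1}{2}\,\frac{\lambda_n-\lambda_2}{\lambda_2-\lambda_1}\right). \tag{2.3.7}$$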

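The next corollary gives an upper bound for $\delta_k(\varepsilon)$ that better approximates the optimal bound (2.3.5) for small $\varepsilon$ and $\kappa \approx 1$.

Corollary 2.3.3. For all $k \in \{2, \dots, n-1\}$ and all $\varepsilon \in \left[0, \frac{\lambda_2-\lambda_1}{\lambda_n-\lambda_1}\right)$, we have:

$$\delta_k(\varepsilon) \le \varepsilon+\frac{\kappa}{2}\,\frac{\varepsilon}{1-\varepsilon}, \tag{2.3.8}$$

with $\kappa := \dfrac{(\lambda_n-\lambda_2)^2}{(\lambda_n-\lambda_1)(\lambda_2-\lambda_1)}$.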

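Proof. We rewrite the expression for $\delta_k(\varepsilon)$ in (2.3.5) for $\varepsilon < \frac{\lambda_2-\lambda_1}{\lambda_n-\lambda_1}$:

$$\delta_k(\varepsilon) = \varepsilon+\frac{1}{2}(1-\varepsilon)\left(1-\sqrt{1-\alpha}\right) = \varepsilon+\frac{1}{2}(1-\varepsilon)\,\frac{\alpha}{1+\sqrt{1-\alpha}} \quad\text{with } \alpha = \frac{\kappa\varepsilon}{(1-\varepsilon)^2}.$$

Multiplying the numerator and denominator in the second term by $1-\varepsilon$ and using $\kappa\varepsilon < (1-\varepsilon)^2$ gives the inequality.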
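To get a feeling for how the bounds (2.3.3), (2.3.5), (2.3.7) and (2.3.8) relate, they can be evaluated side by side. The Python sketch below does this for the eigenvalue configurations used in Figure 2.1; the function names and the sampling grid are assumptions made for illustration only.

```python
import math

def kappa(lam1, lam2, lamn):
    return (lamn - lam2) ** 2 / ((lamn - lam1) * (lam2 - lam1))

def classical_bound(eps, lam1, lam2, lamn):
    """Bound (2.3.3)."""
    return (lamn - lam1) / (lam2 - lam1) * eps

def sharp_bound(eps, lam1, lam2, lamn):
    """Bound (2.3.5); only meaningful for 0 <= eps < (lam2 - lam1) / (lamn - lam1)."""
    k = kappa(lam1, lam2, lamn)
    return 0.5 * (1 + eps) - 0.5 * math.sqrt((1 - eps) ** 2 - k * eps)

def linear_bound(eps, lam1, lam2, lamn):
    """Bound (2.3.7)."""
    return eps * (1 + 0.5 * (lamn - lam2) / (lam2 - lam1))

def rational_bound(eps, lam1, lam2, lamn):
    """Bound (2.3.8)."""
    return eps + 0.5 * kappa(lam1, lam2, lamn) * eps / (1 - eps)

def first_order_term(eps, lam1, lam2, lamn):
    """Leading term of the expansion (2.3.6)."""
    return eps * (1 + 0.25 * kappa(lam1, lam2, lamn))

rows = []
for lams in ((0.0, 1.0, 1.2), (0.0, 1.0, 5.0)):
    limit = (lams[1] - lams[0]) / (lams[2] - lams[0])
    for i in range(1, 10):
        eps = 0.1 * i * limit            # sample points strictly inside [0, limit)
        rows.append((lams, eps,
                     first_order_term(eps, *lams), sharp_bound(eps, *lams),
                     linear_bound(eps, *lams), rational_bound(eps, *lams),
                     classical_bound(eps, *lams)))

for row in rows:
    print(row)
```

On this grid the sharp bound lies between the first-order term and each of the three upper bounds, which is consistent with the convexity argument used for Corollary 2.3.2.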

To give some feeling for the quality of the different bounds, we have illustrated in Figure 2.1 the known bound (2.3.3) and the new bounds (2.3.5), (2.3.7), and (2.3.8) for a matrix with $\lambda_1 = 0$, $\lambda_2 = 1$ and two values for $\lambda_n$: $\lambda_n = 1.2$ and $\lambda_n = 5$. The left picture shows that for well-conditioned eigenvectors ($(\lambda_n-\lambda_1)/(\lambda_2-\lambda_1) \approx 1$), our bounds do not improve much on the straightforward bound from the previous section. In the right picture, the ratio between spread and gap is a little larger and the improvement is more apparent. Notice that the first two terms of the expansion of $\delta_k(\varepsilon)$ in (2.3.6) provide a lower bound on $\delta_k(\varepsilon)$. This shows that the classical bound (2.3.3) at best can be improved by a factor 4 and therefore is fairly sharp.

With respect to the problem of selection, choosing the smallest Ritz pair seems safe and guarantees correct selection asymptotically.

2.3.4 Discussion

Krylov subspaces tend to contain good approximations to the extremal eigenpairs even for small dimensional subspaces. In this situation the found Ritz vector can be deflated and our results can be applied to get statements for the next extremal eigenpair, and so on. However, if deflation is not possible, or not efficient, then other tools are required for eigenvalues that are more in the interior of the spectrum.

In general the situation is not so attractive for interior eigenvalues as indicated in the introduction of this chapter and by the results in Section 2.2. We continue this discussion now in more detail for the symmetric case.

Suppose that the angle $\angle(\mathcal{V}, x_j)$ between $\mathcal{V}$ and $x_j$ is small. Let $\lambda_j$ be the eigenvalue corresponding to $x_j$, where $\lambda_j$ is possibly in the interior of the spectrum. Then we know from Theorem 2.2.1 that there is a Ritz value $\vartheta$ close to $\lambda_j$. For symmetric matrices $A$ we can very easily obtain an even sharper estimate using an application of the Bauer-Fike theorem (Theorem 4.5.1 [94]); this shows that there exists a $\vartheta$ such that

$$|\vartheta-\lambda_j| \le \|(V^TAV-\lambda_j I)V^Tx_{\mathcal{V}}\|_2 = \|V^T(Ax_{\mathcal{V}}-\lambda_j x_{\mathcal{V}})\|_2 \le \max_i|\lambda_j-\lambda_i|\,\sqrt{\varepsilon}.$$

A small residual, $(V^TAV-\lambda_j I)V^Tx_{\mathcal{V}}$, is not sufficient for the existence of an eigenvector of $V^TAV$ that makes a small angle with $V^Tx_{\mathcal{V}}$ if there exist two Ritz values that are close to $\lambda_j$ [94, Section 11.6]. This is also expressed differently by Theorem 2.2.2, which for symmetric matrices reads for any $l \in \{1, \dots, k\}$:

$$\sin^2\angle(u_l, x_j) \le \left(1+\frac{\|VV^TA(I-VV^T)\|_2^2}{\min_{i\neq l}|\lambda_j-\vartheta_i|^2}\right)\sin^2\angle(\mathcal{V}, x_j). \tag{2.3.9}$$

Since this bound is sharp (see Remark 3.4 in [78]) and since there is no guarantee that $\min_{i\neq l}|\lambda_j-\vartheta_i|$ is not very small, the angle $\angle(u, x_j)$ might not be very small.

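Example 2.3.1. Consider the matrix

$$A = \begin{bmatrix} -\beta & & \\ & 0 & \\ & & \beta \end{bmatrix}.$$

The eigenvectors of this matrix are the standard basis vectors ($e_i$) and we want an approximation for the interior eigenpair $(0, e_2)$. Let the vectors $v_1$ and $v_2$ span $\mathcal{V}$:

$$v_1 = e_2\sqrt{1-\varepsilon}+\sqrt{\frac{\varepsilon}{2}}\,(e_1+e_3), \quad v_2 = \frac{1}{\sqrt{2}}\,(e_3-e_1).$$

We get with $V = [v_1, v_2]$ that

$$V^TAV = \begin{bmatrix} 0 & \sqrt{\varepsilon}\beta \\ \sqrt{\varepsilon}\beta & 0 \end{bmatrix}.$$

We have that $\vartheta_{1,2} = \pm\sqrt{\varepsilon}\beta$ and $u_{1,2} = \frac{1}{\sqrt{2}}(v_1 \pm v_2)$. Therefore, $\sin^2\angle(u_{1,2}, e_2) = (1+\varepsilon)/2$.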
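Example 2.3.1 is easy to reproduce numerically. The following Python sketch does so for the assumed values $\beta = 1$ and $\varepsilon = 10^{-2}$ (any $\beta > 0$ and $0 < \varepsilon < 1$ would do); the helper `proj` and the variable names are introduced here purely for illustration.

```python
import math

beta, eps = 1.0, 1e-2                       # assumed sample values
A_diag = (-beta, 0.0, beta)                 # A = diag(-beta, 0, beta)
v1 = [math.sqrt(eps / 2), math.sqrt(1 - eps), math.sqrt(eps / 2)]
v2 = [-1 / math.sqrt(2), 0.0, 1 / math.sqrt(2)]

def proj(p, q):
    """Entry p^T A q of the projected matrix V^T A V for the diagonal matrix A."""
    return sum(a * x * y for a, x, y in zip(A_diag, p, q))

h11, h12, h22 = proj(v1, v1), proj(v1, v2), proj(v2, v2)

ritz = []
for sign in (-1.0, 1.0):
    # H = [[0, h12], [h12, 0]] has eigenpairs (sign * h12, (1, sign) / sqrt(2)).
    u = [(x + sign * y) / math.sqrt(2) for x, y in zip(v1, v2)]
    theta = sum(a * x * x for a, x in zip(A_diag, u))   # Rayleigh quotient, u has unit norm
    sin2 = 1.0 - u[1] ** 2                              # sin^2 of the angle with e_2
    ritz.append((theta, sin2))

print(h11, h12, h22)
print(ritz)
```

Both Ritz values are within $\sqrt{\varepsilon}\beta$ of the target eigenvalue $0$, yet both Ritz vectors make an angle of roughly $45$ degrees with $e_2$.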

In this example we see that, if $\varepsilon$ approaches zero, there are two Ritz values approaching $\lambda$ (even though $\lambda$ is simple) and none of the corresponding Ritz vectors approximates the target eigenvector. This shows that it is not possible to give meaningful error bounds for eigenvectors with eigenvalues in the interior of the spectrum using information about $\angle(\mathcal{V}, x_j)$ and the spectrum of $A$ only. In practical applications of the Rayleigh-Ritz method, for example in the (Jacobi-)Davidson method when searching ‘interior’ eigenpairs, this lack of robustness of the Rayleigh-Ritz method can result in irregular convergence.

2.4 Harmonic Rayleigh-Ritz approximations

A simple strategy to make Rayleigh-Ritz robust for interior eigenpairs is to apply a spectral transformation such that the interesting eigenvalues are mapped to the exterior of the spectrum. For example, we can apply Rayleigh-Ritz to $(A-\xi I)^2$ if the interesting eigenvalues are close to some target $\xi$ (e.g., [86]) which gives

$$(A-\xi I)^2\hat{u}_i-\hat{\vartheta}_i\hat{u}_i \perp \mathcal{V} \quad\text{with } \hat{u}_i \in \mathcal{V}\setminus\{0\},$$

or,

$$V^T(A-\xi I)^2V\hat{z}_i-\hat{\vartheta}_i\hat{z}_i = 0 \quad\text{with } \hat{u}_i := V\hat{z}_i \neq 0. \tag{2.4.1}$$

This approach plays an important role in the rest of this chapter.

An alternative is to apply Rayleigh-Ritz to $(A-\sigma I)^{-1}$ for some shift $\sigma$ in the [...] $(A-\sigma I)\mathcal{V}$ to prevent the explicit inversion of the matrix $A-\sigma I$, which results in the harmonic Rayleigh-Ritz method. For details see [86]. We use the following equivalent definition, see [110, Theorem 5.1], which does not require the existence of the inverse of $A-\sigma I$. The harmonic Ritz pairs $(\tilde{\vartheta}_i, \tilde{u}_i)$ with respect to a shift $\sigma$ are given by imposing the Petrov-Galerkin condition

$$(A-\tilde{\vartheta}_i I)\tilde{u}_i \perp (A-\sigma I)\mathcal{V} \quad\text{with } \tilde{u}_i \in \mathcal{V}\setminus\{0\},$$

or, equivalently, from the generalized eigenvalue problem

$$V^T(A-\sigma I)^2V\tilde{z}_i-(\tilde{\vartheta}_i-\sigma)V^T(A-\sigma I)V\tilde{z}_i = 0 \quad\text{with } \tilde{u}_i := V\tilde{z}_i \neq 0. \tag{2.4.2}$$

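Just as Ritz values equal the Rayleigh quotients of Ritz vectors, harmonic Ritz values, $\tilde{\vartheta}_i$, are equal to the harmonic Rayleigh quotients, $\tilde{\rho}_\sigma(\tilde{u}_i)$, of the harmonic Ritz vectors $\tilde{u}_i$:

$$\tilde{\vartheta}_i = \tilde{\rho}_\sigma(\tilde{u}_i) \quad\text{where}\quad \tilde{\rho}_\sigma(v) := \sigma+\frac{v^T(A-\sigma I)^2v}{v^T(A-\sigma I)v}. \tag{2.4.3}$$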

We notice that the harmonic Rayleigh quotient is sometimes also called the Temple quotient, cf. [21, Equation (8.31)]. In principle it can happen that $\tilde{u}^T(A-\sigma I)\tilde{u} = 0$ in (2.4.3); in this case we will write $\tilde{\vartheta} = \infty$ and $(\tilde{\vartheta}-\sigma)^{-1} = 0$. Furthermore, the vectors $\tilde{u}_i$ are normalized and, following the convention in [93], we index the harmonic Ritz values as follows

$$\tilde{\vartheta}_{-l} \le \dots \le \tilde{\vartheta}_{-1} < \sigma < \tilde{\vartheta}_1 \le \dots \le \tilde{\vartheta}_{k-l}.$$

Ideally we should write something like $\tilde{\vartheta}_i^{\sigma}$ and $\hat{\vartheta}_i^{\xi}$ to express the non-trivial dependence of these values on the shifts $\sigma$ and $\xi$. However, we will drop the superscripts in order not to clutter the notation too much.
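The harmonic Rayleigh quotient (2.4.3) is straightforward to evaluate. The sketch below, in Python, does so for a diagonal matrix and a one-dimensional subspace, for which the harmonic Ritz value is simply the harmonic Rayleigh quotient of the spanning vector. The function name, the sample data and the use of `math.inf` for the case of a vanishing denominator are assumptions chosen to mirror the convention stated above.

```python
import math

def harmonic_rayleigh_quotient(A_diag, v, sigma):
    """Evaluate (2.4.3) for the real symmetric matrix A = diag(A_diag) and a real vector v.

    Returns math.inf when v^T (A - sigma I) v vanishes, following the convention in the text.
    """
    num = sum((a - sigma) ** 2 * x * x for a, x in zip(A_diag, v))
    den = sum((a - sigma) * x * x for a, x in zip(A_diag, v))
    return math.inf if den == 0.0 else sigma + num / den

A_diag = (1.0, 2.0, 4.0)      # assumed eigenvalues of a diagonal test matrix
v = (1.0, 1.0, 1.0)
sigma = 0.0
theta_h = harmonic_rayleigh_quotient(A_diag, v, sigma)

# Petrov-Galerkin check for the one-dimensional subspace span{v}:
# the residual (A - theta_h I) v should be orthogonal to (A - sigma I) v.
pg = sum((a - theta_h) * x * (a - sigma) * x for a, x in zip(A_diag, v))

# An assumed example in which the denominator vanishes: infinite harmonic Ritz value.
theta_inf = harmonic_rayleigh_quotient((-1.0, 0.0, 1.0), (1.0, 0.0, 1.0), 0.0)

print(theta_h, pg, theta_inf)
```

For an eigenvector with eigenvalue $\lambda \neq \sigma$ the quotient reduces to $\sigma + (\lambda-\sigma)^2/(\lambda-\sigma) = \lambda$, as one would expect.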

2.5 Useful properties of harmonic Rayleigh-Ritz

For the convenience of the reader, we summarize some properties of harmonic Rayleigh-Ritz with shift σ that turn out to be useful in the remainder of this chapter but are also of interest on their own account.

Lemma 2.5.1. Assume that $(A-\sigma I)V$ has full rank. Let $k = \dim(\mathcal{V})$ and let $l$, $m$, $k-l-m$ be the number of Ritz values of $A$ w.r.t. $\mathcal{V}$ less than $\sigma$, equal to $\sigma$, greater than $\sigma$, respectively. Then there exist $k$ reciprocals of shifted harmonic Ritz values $(\tilde{\vartheta}_i-\sigma)^{-1}$ (see (2.4.2)), of which $l$ are negative, $m$ equal zero and $k-l-m$ are positive.

There are $k$ linearly independent $\tilde{u}_i$. More precisely, the $\tilde{u}_i$ can be chosen mutually $(A-\sigma I)$-orthogonal and $(A-\sigma I)^2$-orthogonal.

Proof. The matrix $V^T(A-\sigma I)^2V$ is symmetric. Since it also has full rank it is strictly positive definite and the Cholesky decomposition $LL^T = V^T(A-\sigma I)^2V$ exists. Then the $\tilde{z}_i$ from (2.4.2) satisfy $y_i = L^T\tilde{z}_i$, where $((\tilde{\vartheta}_i-\sigma)^{-1}, y_i)$ is an eigenpair of $B := L^{-1}V^T(A-\sigma I)VL^{-T}$. The $(\tilde{\vartheta}_i-\sigma)^{-1}$ are real, possibly zero, because of the symmetry of this operator. Sylvester’s law of inertia [94, Fact 1.6] shows that the number of positive, negative and zero eigenvalues of $V^T(A-\sigma I)V$ equals these numbers for $B$. Finally, the $(A-\sigma I)$- and $(A-\sigma I)^2$-orthogonality follow easily from the orthogonality and $B$-orthogonality of the $y_i$.

This lemma shows that the number of Ritz values equal to the shift $\sigma$ equals the number of infinite harmonic Ritz values.

2.5.1 A minmax characterization for harmonic Ritz values

A useful characterization of the harmonic Ritz values is the following formulation of the minmax property, see also [81].

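Lemma 2.5.2. Assume that $(A-\sigma I)V$ has full rank. Let $k = \dim(\mathcal{V})$ and let $l$, $m$, $k-l-m$ be the number of Ritz values less than $\sigma$, equal to $\sigma$, greater than $\sigma$, respectively. In this case

$$\frac{1}{\tilde{\vartheta}_j-\sigma} = \max_{\mathcal{S}\subset\mathcal{V},\ \dim(\mathcal{S})=j}\ \min_{u\in\mathcal{S},\ u\neq 0}\ \frac{1}{\tilde{\rho}_\sigma(u)-\sigma} \quad\text{for } j \in \{1, \dots, k-l-m\},$$

$$\frac{1}{\tilde{\vartheta}_{-j}-\sigma} = \min_{\mathcal{S}\subset\mathcal{V},\ \dim(\mathcal{S})=j}\ \max_{u\in\mathcal{S},\ u\neq 0}\ \frac{1}{\tilde{\rho}_\sigma(u)-\sigma} \quad\text{for } j \in \{1, \dots, l\}.$$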

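Proof. Using the matrix $B$ defined in the proof of Lemma 2.5.1, the standard minmax characterization ([94, Theorem 10.2.1]) for the eigenvalues of $B$ yields for $j \ge 1$

$$\begin{aligned} \frac{1}{\tilde{\vartheta}_j-\sigma} &= \max_{\mathcal{S}\subset\mathbb{R}^k,\ \dim(\mathcal{S})=j}\ \min_{y\in\mathcal{S},\ y\neq 0}\ \frac{y^TBy}{y^Ty} \\ &= \max_{\mathcal{S}\subset\mathbb{R}^k,\ \dim(\mathcal{S})=j}\ \min_{\tilde{z}\in\mathcal{S},\ \tilde{z}\neq 0}\ \frac{\tilde{z}^TV^T(A-\sigma I)V\tilde{z}}{\tilde{z}^TV^T(A-\sigma I)^2V\tilde{z}} \\ &= \max_{\mathcal{S}\subset\mathcal{V},\ \dim(\mathcal{S})=j}\ \min_{\tilde{u}\in\mathcal{S},\ \tilde{u}\neq 0}\ \frac{\tilde{u}^T(A-\sigma I)\tilde{u}}{\tilde{u}^T(A-\sigma I)^2\tilde{u}}. \end{aligned}$$

A similar argument can be used for the harmonic Ritz values with a negative index.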

Notice that it is necessary to split the minmax property in two parts due to the way we index the harmonic Ritz values.

In case the subspace $\mathcal{V}$ is a Krylov space for $A$ it is known that the harmonic Ritz values interlace the Ritz values and the shift $\sigma$ [93, Section 7]. The following interesting corollary can be interpreted as a generalization of this interlace property for more general subspaces. It is an application of the minmax property for harmonic Ritz values (in Lemma 2.5.2) in combination with the statement from Lemma 2.5.1. (The corollary can also be viewed as a generalization of Theorem 2.1 in [7] to the case of indefinite matrices.)

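Corollary 2.5.1. Assume that $(A-\sigma I)V$ has full rank. Let $k = \dim(\mathcal{V})$ and let $l$, $k-l$ be the number of Ritz values less than $\sigma$, greater than $\sigma$, respectively. Then

$$\sigma < \vartheta_{l+j} \le \tilde{\vartheta}_j \quad\text{for } j \in \{1, \dots, k-l\},$$

$$\sigma > \vartheta_{l+1-j} \ge \tilde{\vartheta}_{-j} \quad\text{for } j \in \{1, \dots, l\}.$$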

Proof. We prove the first statement.

With $T := V^T(A-\sigma I)V$ and $R := (A-\sigma I)V-VT$ we have that $(A-\sigma I)V = VT+R$ and $V^TR = 0$. Hence, $x^TV^T(A-\sigma I)^2Vx = x^TT^2x+x^TR^TRx \ge x^TT^2x$ for all $x$. We know from Lemma 2.5.1 that for $j \ge 1$, $\vartheta_{l+j} > \sigma$ and $\tilde{\vartheta}_j > \sigma$. Since $Tz_i = (\vartheta_i-\sigma)z_i$ we have that $Tz_i = (\vartheta_i-\sigma)^{-1}T^2z_i$. Because $T^2$ is positive definite (there are no Ritz values equal to $\sigma$) we can use the minmax property for generalized eigenvalue problems and for harmonic Ritz values to get

$$\frac{1}{\vartheta_{l+j}-\sigma} = \max_{\mathcal{S}\subset\mathbb{R}^k,\ \dim(\mathcal{S})=j}\ \min_{y\in\mathcal{S},\ y\neq 0}\ \frac{y^TTy}{y^TT^2y} \ge \max_{\mathcal{S}\subset\mathbb{R}^k,\ \dim(\mathcal{S})=j}\ \min_{y\in\mathcal{S},\ y\neq 0}\ \frac{y^TTy}{y^T(T^2+R^TR)y} = \frac{1}{\tilde{\vartheta}_j-\sigma}.$$

2.5.2 Optimal inclusion intervals for eigenvalues

The harmonic Ritz values provide information about the location of some of the eigenvalues of $A$ that is optimal in some sense, just like the Ritz values. Paige, Parlett and Van der Vorst [93] pointed out an important relation between Lehmann intervals and harmonic Rayleigh-Ritz. They showed that the harmonic Ritz values with respect to the shift $\sigma$ give Lehmann’s optimal inclusion intervals for eigenvalues as described in the following proposition.

Proposition 2.5.1 (Lehmann [81]). Let $k = \dim(\mathcal{V})$ and let $l$, $k-l$ be the number of Ritz values less than $\sigma$, greater than $\sigma$, respectively.

For each $i = 1, \dots, l$, the Lehmann interval $[\tilde{\vartheta}_{-i}, \sigma]$ contains at least $i$ eigenvalues of $A$. For each $i = 1, \dots, k-l$, the Lehmann interval $[\sigma, \tilde{\vartheta}_i]$ contains at least $i$ of $A$’s eigenvalues. Moreover, in the absence of extra information no smaller intervals have this property.
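For a one-dimensional subspace, Corollary 2.5.1 and Proposition 2.5.1 can be observed directly: the single Ritz value lies between the shift and the single harmonic Ritz value, and the interval between the shift and the harmonic Ritz value contains at least one eigenvalue. The Python sketch below checks this for a diagonal test matrix and a few shifts; the matrix, the vector, the shifts and the function names are assumptions made for illustration, and shifts equal to the Ritz value are deliberately avoided.

```python
def rayleigh(A_diag, v):
    """Rayleigh quotient of v for A = diag(A_diag)."""
    return sum(a * x * x for a, x in zip(A_diag, v)) / sum(x * x for x in v)

def harmonic_rayleigh(A_diag, v, sigma):
    """Harmonic Rayleigh quotient (2.4.3); assumes v^T (A - sigma I) v != 0."""
    num = sum((a - sigma) ** 2 * x * x for a, x in zip(A_diag, v))
    den = sum((a - sigma) * x * x for a, x in zip(A_diag, v))
    return sigma + num / den

A_diag = (1.0, 2.0, 4.0)      # assumed eigenvalues of a diagonal test matrix
v = (1.0, 1.0, 1.0)           # spans the one-dimensional subspace V
theta = rayleigh(A_diag, v)   # the single Ritz value, 7/3

checks = []
for sigma in (0.0, 1.5, 3.0, 4.5):
    theta_h = harmonic_rayleigh(A_diag, v, sigma)
    lo, hi = min(sigma, theta_h), max(sigma, theta_h)   # Lehmann interval for k = 1
    inside = [a for a in A_diag if lo <= a <= hi]       # eigenvalues in the interval
    interlaced = lo <= theta <= hi                      # Ritz value between shift and theta_h
    checks.append((sigma, theta_h, inside, interlaced))

for entry in checks:
    print(entry)
```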

The phrase “extra information” in this proposition refers to extra outside information other than $V$ and $AV$. Kahan gives an alternative, but constructive, proof of the optimality of the Lehmann intervals by deriving an explicit matrix $\hat{A}$ with $AV = \hat{A}V$, such that the eigenvalues of $\hat{A}$ are at the end points of the intervals. His construction can also be used to compute the harmonic Ritz values, which can offer computational advantages in some cases, for example, w
