Lyapunov equations - Low-rank methods for parameter-dependent eigenvalue problems and matrix eq

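Given $n \in \mathbb{N}$, $A \in \mathbb{R}^{n\times n}$, and $B \in \mathbb{R}^{n\times m}$, we consider the following $n \times n$ Lyapunov equation

$$AX + XA^T = -BB^T. \tag{2.17}$$

In vectorized form, (2.17) is equivalent to the linear system

$$\mathcal{A}\,\mathrm{vec}(X) = (I \otimes A + A \otimes I)\,\mathrm{vec}(X) = -(B \otimes B)\,\mathrm{vec}(I_m), \tag{2.18}$$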

where $\mathcal{A} = I \otimes A + A \otimes I$. Using the spectral properties of the Kronecker sum, we obtain that (2.17) has a unique solution if and only if $\lambda_i + \lambda_j \neq 0$ for all $\lambda_i, \lambda_j \in \lambda(A)$, which we assume in the following. By transposing the whole equation (2.17), we see that both $X$ and $X^T$ are solutions, so uniqueness implies that the solution is symmetric. Furthermore, if $A \in \mathbb{R}^{n\times n}$ is stable (i.e., its spectrum lies in the left half of the complex plane, $\lambda(A) \subset \mathbb{C}_-$), the solution $X$ can be represented in the following way

$$X = \int_0^\infty e^{A\tau} B B^T e^{A^T\tau}\, d\tau,$$

which shows that $X$ is positive semidefinite.
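The equivalence between (2.17) and (2.18) can be checked numerically on a small example. The following is a minimal sketch, assuming NumPy and SciPy are available; the test matrices, the random seed, and the variable names are illustrative choices, and the dense Kronecker system is only practical for very small $n$.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

rng = np.random.default_rng(0)
n, m = 6, 2

# Illustrative stable matrix: shift a random matrix so that every
# eigenvalue has real part at most -1.
G = rng.standard_normal((n, n))
A = G - (np.abs(np.linalg.eigvals(G)).max() + 1.0) * np.eye(n)
B = rng.standard_normal((n, m))

# Kronecker-sum operator of (2.18), acting on the column-major vec(X).
I = np.eye(n)
K = np.kron(I, A) + np.kron(A, I)
rhs = -np.kron(B, B) @ np.eye(m).flatten(order="F")
X_kron = np.linalg.solve(K, rhs).reshape((n, n), order="F")

# Reference: SciPy solves A X + X A^H = Q, so we pass Q = -B B^T.
X_ref = solve_continuous_lyapunov(A, -B @ B.T)

residual = np.linalg.norm(A @ X_kron + X_kron @ A.T + B @ B.T)
min_eig = np.linalg.eigvalsh((X_kron + X_kron.T) / 2.0).min()
print(f"residual = {residual:.2e}, smallest eigenvalue of X = {min_eig:.2e}")
```

Both routes should agree up to rounding, and the computed solution should be symmetric with no significantly negative eigenvalues, in line with the integral representation.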

2.2. Lyapunov equations

In the following, we discuss the conditions that ensure low-rank structure in X and present an approach that exploits this to efficiently solve large-scale Lyapunov equations. Finally, we describe the role of Lyapunov equations in model reduction of linear dynamical systems with control, one of the most important applications.

2.2.1 Low-rank solutions of Lyapunov equations

We have shown that if $A$ is stable, (2.17) has a unique positive semidefinite solution $X$. Additionally, it has been shown in [Sab06, KT10] that if $A$ is symmetric, then the singular values of $X$ decay rapidly when $m \ll n$. More precisely, there exists a matrix $X_k \in \mathbb{R}^{n\times n}$ of rank at most $km$ such that

$$\|X - X_k\|_F \le \frac{8\,\|B\|_F}{|\lambda_{\max}(A)|} \exp\left(-\frac{k\pi^2}{\log(8\kappa(A))}\right),$$

where $\kappa(A)$ is the condition number of $A$. This implies that $X$ has an exponential eigenvalue decay:

$$\lambda_k(X) \lesssim \gamma^k, \quad \text{with } \gamma = \exp\left(-\frac{\pi^2}{m \log(8\kappa(A))}\right),$$

where $\lambda_k(X)$ denotes the $k$-th largest eigenvalue of $X$. We see that the decay rate deteriorates and vanishes as $\kappa(A) \to \infty$. This issue has been resolved in [GK14], where the authors show for certain situations that, as $\kappa(A) \to \infty$, the eigenvalue decay becomes exponential with respect to $\sqrt{k}$ instead of $k$:

$$\lambda_k(X) \lesssim \gamma^{\sqrt{k}}, \quad \text{with } \gamma = \exp\left(-\pi/\sqrt{2m}\right).$$
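The singular value decay can be observed directly on a small symmetric example. The sketch below is an illustration under our own assumptions: $A$ is taken to be the negative 1D discrete Laplacian, $B$ is a random vector ($m = 1$), and the relative threshold `rel_tol` used to define the numerical rank is an arbitrary choice.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

n, m = 100, 1

# Symmetric stable A: negative 1D discrete Laplacian, tridiag(1, -2, 1).
A = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
B = np.random.default_rng(1).standard_normal((n, m))

X = solve_continuous_lyapunov(A, -B @ B.T)

# For a symmetric positive semidefinite X, singular values and
# eigenvalues coincide; svd returns them in descending order.
sigma = np.linalg.svd(X, compute_uv=False)

# Decay factor gamma from the bound for symmetric A.
abs_eigs = np.abs(np.linalg.eigvalsh(A))
kappa = abs_eigs.max() / abs_eigs.min()
gamma = np.exp(-np.pi**2 / (m * np.log(8.0 * kappa)))

# Numerical rank: number of singular values above a relative threshold.
rel_tol = 1e-8
numerical_rank = int(np.sum(sigma > rel_tol * sigma[0]))
print(f"kappa(A) = {kappa:.1f}, gamma = {gamma:.3f}, "
      f"numerical rank = {numerical_rank} of {n}")
```

The numerical rank reported here is far below $n$, which is the property that the low-rank solvers of Section 2.2.2 rely on.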

2.2.2 Solving large-scale Lyapunov equations

For $n$ up to about 5000, a classical approach to solving (2.17) is a direct method, such as the Bartels-Stewart algorithm [BS72], which requires $O(n^3)$ operations. For larger values of $n$, these methods are not computationally feasible, as they require the Schur decomposition of $A$. Instead, various iterative approaches have been proposed that gain a computational advantage by exploiting sparsity in $A$ and the low-rank structure of the solution. In the following, we follow [Pen00] and describe one of the most popular approaches, the alternating direction implicit (ADI) iteration.

In the ADI method, the solution $X$ is generated as the limit of the iterates $X_i$, defined in the following way:

$$(A + p_i I)\,X_{i-1/2} = -BB^T - X_{i-1}(A^T - p_i I),$$

$$(A + p_i I)\,X_i^T = -BB^T - X_{i-1/2}^T(A^T - p_i I),$$

Chapter 2. Preliminaries

which can be combined into the single iteration step

$$X_i = (A - p_i I)(A + p_i I)^{-1} X_{i-1} (A^T - p_i I)(A^T + p_i I)^{-1} - 2p_i (A + p_i I)^{-1} B B^T (A^T + p_i I)^{-1}. \tag{2.19}$$
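The iteration step (2.19) can be tried out on a small dense problem. The following sketch is illustrative only: it assumes real negative shifts, a symmetric test matrix (negative 1D Laplacian), a starting iterate $X_0 = 0$, and a simple cyclic set of logarithmically spaced shifts, which is our own heuristic and not an optimal choice. The helper name `adi_step` is hypothetical.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def adi_step(A, B, X_prev, p):
    """One step of (2.19) with a real shift p < 0 (dense, for illustration)."""
    I = np.eye(A.shape[0])
    # (A + pI)^{-1}(A - pI) equals (A - pI)(A + pI)^{-1}: both are functions of A.
    S = np.linalg.solve(A + p * I, A - p * I)
    W = np.linalg.solve(A + p * I, B)
    return S @ X_prev @ S.T - 2.0 * p * (W @ W.T)

n, m = 50, 2
A = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
B = np.random.default_rng(2).standard_normal((n, m))

# Heuristic shifts (an assumption): 8 negative values spread geometrically
# over the spectral interval of A, reused cyclically.
abs_eigs = np.abs(np.linalg.eigvalsh(A))
shifts = -np.geomspace(abs_eigs.min(), abs_eigs.max(), 8)

X_ref = solve_continuous_lyapunov(A, -B @ B.T)
X = np.zeros((n, n))
errors = []
for i in range(24):
    X = adi_step(A, B, X, shifts[i % len(shifts)])
    errors.append(np.linalg.norm(X - X_ref) / np.linalg.norm(X_ref))

print(f"relative error after {len(errors)} steps: {errors[-1]:.2e}")
```

Each step costs two dense solves here; in a large-scale setting one would instead use sparse factorizations of $A + p_i I$ and the low-rank formulation.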

It can be shown that the errors $E_i = X - X_i$ satisfy

$$E_i = \left(r_i(A)\, r_i(-A)^{-1}\right) E_0 \left(r_i(A)\, r_i(-A)^{-1}\right)^T,$$

where $r_i$ is the polynomial $r_i(x) = (x - p_1)(x - p_2)\cdots(x - p_i)$. Thus, to ensure convergence, the shifts $p_1, p_2, \ldots$ need to be chosen such that $r_i(A)\, r_i(-A)^{-1} \approx 0$. Assuming that $A$ is diagonalizable, minimizing the spectral radius of $r_i(A)\, r_i(-A)^{-1}$ leads to the following ADI minimax problem

$$\{p_1, \ldots, p_i\} = \operatorname*{arg\,min}_{p_1, \ldots, p_i \in \mathbb{C}_-} \; \max_{x \in \lambda(A)} \frac{|r_i(x)|}{|r_i(-x)|}, \tag{2.20}$$

which indicates criteria for choosing the shifts. As the spectrum $\lambda(A)$ is usually not available, in practice (2.20) is often relaxed by replacing $\lambda(A)$ with a compact set $\mathcal{E} \subset \mathbb{C}$ such that $\lambda(A) \subset \mathcal{E}$:

$$\{p_1, \ldots, p_i\} = \operatorname*{arg\,min}_{p_1, \ldots, p_i \in \mathbb{C}_-} \; \max_{x \in \mathcal{E}} \frac{|r_i(x)|}{|r_i(-x)|}. \tag{2.21}$$

The relaxed ADI minimax problem has been solved exactly (see [Wac63]) only for the case of symmetric A. For the general case, several heuristic strategies for choosing close to optimal shifts have been proposed, see, e.g. [Pen00, Wac88, FG13].
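For a single real shift and a real interval, the relaxed problem (2.21) is easy to explore by brute force. The sketch below is an illustration under our own assumptions: $\mathcal{E} = [-b, -a]$ with arbitrarily chosen $a, b > 0$, a discretized grid in place of the continuous set, and a hypothetical helper `adi_objective`. It compares the grid minimizer with the geometric mean $-\sqrt{ab}$, at which the two endpoint values of the objective coincide.

```python
import numpy as np

def adi_objective(shifts, points):
    """max over x in `points` of |r_i(x)| / |r_i(-x)| for the given real shifts."""
    x = np.asarray(points, dtype=float)[:, None]
    p = np.asarray(shifts, dtype=float)[None, :]
    return np.max(np.prod(np.abs(x - p) / np.abs(-x - p), axis=1))

# Illustrative interval E = [-b, -a], discretized on a geometric grid.
a, b = 1e-2, 1e2
E = -np.geomspace(a, b, 2001)

# Brute-force search over single-shift candidates in E.
candidates = -np.geomspace(a, b, 2001)
values = np.array([adi_objective([p], E) for p in candidates])
p_best = candidates[np.argmin(values)]

# Geometric mean of the interval endpoints, with negative sign.
p_geo = -np.sqrt(a * b)
rho = (np.sqrt(b) - np.sqrt(a)) / (np.sqrt(b) + np.sqrt(a))
print(f"grid minimizer: {p_best:.4f}, geometric mean: {p_geo:.4f}")
print(f"objective at geometric mean: {adi_objective([p_geo], E):.6f}, rho = {rho:.6f}")
```

For several shifts the same objective function can be reused to compare heuristic shift sets, although an exhaustive search quickly becomes impractical.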

The ADI method can be implemented in a way that exploits the positive semidefiniteness of $X$ as well as its low-rank structure described in Section 2.2.1. In the low-rank version of the ADI method (LR-ADI), the iterates are replaced by their low-rank Cholesky factorizations $X_i = Z_i Z_i^T$, while the iteration step (2.19) can be written in the following way

$$Z_i = \left[\,(A - p_i I)(A + p_i I)^{-1} Z_{i-1}, \;\; \sqrt{-2p_i}\,(A + p_i I)^{-1} B\,\right],$$

with $Z_1 = \sqrt{-2p_1}\,(A + p_1 I)^{-1} B$. A drawback of LR-ADI is that the memory requirements and the computational cost per iteration grow with each iteration, since the low-rank factor $Z_i$ is enlarged by $m$ columns in each iteration ($\mathrm{rank}(Z_i) \le mi$, where $m = \mathrm{rank}(B)$). However, in practice, LR-ADI is an efficient method since the required number of iterations is usually low. Furthermore, the effect of this drawback can be reduced by performing low-rank truncation of the iterates.
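The low-rank recursion can be sketched in a few lines. The code below is a minimal dense illustration, not an efficient implementation: it assumes real negative shifts, uses the same heuristic cyclic shifts as a stand-in for properly chosen ones, and the helper names `lr_adi` and `compress` (an SVD-based column truncation with an arbitrary tolerance) are hypothetical.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def lr_adi(A, B, shifts):
    """Low-rank ADI factor Z with X ~ Z Z^T, for real shifts p < 0."""
    I = np.eye(A.shape[0])
    Z = None
    for p in shifts:
        new_block = np.sqrt(-2.0 * p) * np.linalg.solve(A + p * I, B)
        if Z is None:
            Z = new_block                       # Z_1
        else:
            # (A - pI)(A + pI)^{-1} Z_{i-1}, using that the two factors commute.
            old_block = np.linalg.solve(A + p * I, (A - p * I) @ Z)
            Z = np.hstack([old_block, new_block])
    return Z

def compress(Z, rel_tol=1e-8):
    """Truncate Z via an SVD, dropping directions below rel_tol * largest."""
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    keep = s > rel_tol * s[0]
    return U[:, keep] * s[keep]

n, m = 50, 2
A = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
B = np.random.default_rng(2).standard_normal((n, m))

abs_eigs = np.abs(np.linalg.eigvalsh(A))
base_shifts = -np.geomspace(abs_eigs.min(), abs_eigs.max(), 8)  # heuristic
shifts = np.tile(base_shifts, 3)

Z = lr_adi(A, B, shifts)
Zc = compress(Z)

X_ref = solve_continuous_lyapunov(A, -B @ B.T)
err = np.linalg.norm(Z @ Z.T - X_ref) / np.linalg.norm(X_ref)
err_c = np.linalg.norm(Zc @ Zc.T - X_ref) / np.linalg.norm(X_ref)
print(f"columns: {Z.shape[1]} -> {Zc.shape[1]}, errors: {err:.2e}, {err_c:.2e}")
```

The factor grows by $m$ columns per step, as described above, and the truncation step shows how the column count can be brought back down without a noticeable loss of accuracy.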

Other popular methods for solving large-scale Lyapunov equations include the rational Krylov projection method [HR92] and the extended Arnoldi method [Sim07]. In these methods, the approximate solution of the original Lyapunov equation is computed by projecting (2.17) onto $k$-dimensional (rational) Krylov subspaces. Solving the projected problem amounts to solving a small-scale $k \times k$ Lyapunov equation, which can be done efficiently using the Bartels-Stewart algorithm since, in practice, we usually have $k \ll n$. Projection techniques can also be used to accelerate the convergence of the ADI method. For example, in [BLT09], the Galerkin projection onto the subspace $\mathcal{V}_k \otimes \mathcal{V}_k$, where $V_k$ is an orthonormal basis of the column space $\mathcal{V}_k = \mathrm{range}(Z_k)$ of the current ADI iterate, is used for computing an approximate solution of the form $X \approx V_k R_k V_k^*$.
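The projection idea can be illustrated with the simplest possible subspace. The sketch below deliberately uses a standard (polynomial) Krylov subspace $\mathrm{span}\{B, AB, \ldots, A^{k-1}B\}$ instead of the rational or extended variants cited above, and a well-conditioned test matrix (a shifted Laplacian) chosen so that this simple subspace converges quickly. The helper name `krylov_basis` and all parameters are our own assumptions.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def krylov_basis(A, B, steps):
    """Orthonormal basis of span{B, AB, ..., A^(steps-1) B}."""
    Q, _ = np.linalg.qr(B)
    V = Q
    for step in range(steps - 1):
        W = A @ Q
        # Two Gram-Schmidt passes against all previous basis vectors.
        W = W - V @ (V.T @ W)
        W = W - V @ (V.T @ W)
        Q, _ = np.linalg.qr(W)
        V = np.hstack([V, Q])
    return V

n, m, steps = 200, 1, 25
# Well-conditioned symmetric stable test matrix: tridiag(1, -3, 1).
A = -3.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
B = np.random.default_rng(4).standard_normal((n, m))

V = krylov_basis(A, B, steps)

# Galerkin projection: small Lyapunov equation for Y, solved densely.
A_k = V.T @ A @ V
B_k = V.T @ B
Y = solve_continuous_lyapunov(A_k, -B_k @ B_k.T)
X_k = V @ Y @ V.T

rel_residual = (np.linalg.norm(A @ X_k + X_k @ A.T + B @ B.T)
                / np.linalg.norm(B @ B.T))
print(f"subspace dimension {V.shape[1]}, relative residual {rel_residual:.2e}")
```

Only the $k \times k$ equation is solved by a direct method; forming the full $n \times n$ matrix `X_k` is done here solely to evaluate the residual.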

Remark 2.13. As shown in [HS95, KPT14], Krylov subspace methods for solving Lyapunov equations can be effectively preconditioned with a few steps of the ADI method. For example, one step of the ADI method with a single shift $p$ defines the following preconditioner for (2.18):

$$P_{\mathrm{ADI}}^{-1} = (A - pI)^{-1} \otimes (A - pI)^{-1}. \tag{2.22}$$

Finding the optimal shift $p$ in (2.22) is equivalent to solving (2.21) with $i = 1$. As shown in [Sta91], for the case of a symmetric $A$, the optimal shift is $p = \sqrt{\lambda_{\max}(A)\lambda_{\min}(A)}$.

In a similar fashion, it is possible to derive a preconditioner for (2.18) based on the first $\ell$ steps of the sign function iteration for Lyapunov equations [KPT14]. In particular, for $\ell = 1$, this gives rise to the following preconditioner

$$P_{\mathrm{sign}}^{-1} = \frac{1}{2c}\left(I \otimes I + c^2 A^{-1} \otimes A^{-1}\right), \tag{2.23}$$

with the scaling factor $c = \sqrt{\|A\|_2 / \|A^{-1}\|_2}$, which can be approximated using $\|M\|_2 \approx \sqrt{\|M\|_1 \|M\|_\infty}$, see, e.g., [SB08]. Other known choices of preconditioners for (2.18) include the classical Jacobi and SSOR preconditioning [HS95].
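The effect of the two preconditioners can be inspected on a small symmetric example by forming the Kronecker matrices explicitly. This is only a sketch under our own assumptions (negative 1D Laplacian, dense Kronecker products, condition numbers measured through eigenvalue magnitudes); in practice the preconditioners would be applied through solves with $A - pI$ or $A$ rather than assembled.

```python
import numpy as np

n = 8
A = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
I = np.eye(n)
K = np.kron(I, A) + np.kron(A, I)          # operator of (2.18)

# ADI preconditioner (2.22) with the shift p = sqrt(lambda_max * lambda_min),
# which is positive because both eigenvalues are negative.
lam = np.linalg.eigvalsh(A)
p = np.sqrt(lam.max() * lam.min())
shifted_inv = np.linalg.inv(A - p * I)
P_adi_inv = np.kron(shifted_inv, shifted_inv)

# Sign-function preconditioner (2.23) with c = sqrt(||A||_2 / ||A^{-1}||_2).
A_inv = np.linalg.inv(A)
c = np.sqrt(np.linalg.norm(A, 2) / np.linalg.norm(A_inv, 2))
P_sign_inv = (np.kron(I, I) + c**2 * np.kron(A_inv, A_inv)) / (2.0 * c)

def eig_condition(M):
    """Ratio of largest to smallest eigenvalue magnitude."""
    ev = np.abs(np.linalg.eigvals(M))
    return ev.max() / ev.min()

cond_K = eig_condition(K)
cond_adi = eig_condition(P_adi_inv @ K)
cond_sign = eig_condition(P_sign_inv @ K)
print(f"cond: unpreconditioned {cond_K:.2f}, ADI {cond_adi:.2f}, sign {cond_sign:.2f}")
```

For this symmetric example both preconditioned operators have a much smaller eigenvalue spread than $\mathcal{A}$ itself.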

Remark 2.14. The ADI method can be extended to address generalized Lyapunov equations of the form

$$AXE^T + EXA^T = -BB^T,$$

where $A, E \in \mathbb{R}^{n\times n}$, $B \in \mathbb{R}^{n\times m}$, with $E$ symmetric positive definite and $\lambda E - A$ a stable pencil. As in LR-ADI, this extension can be formulated in terms of the low-rank Cholesky factors $Z_i$; the resulting method is known as the generalized low-rank ADI [Sty08].
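For small dense problems, the generalized equation can be related to the standard form (2.17) by multiplying with $E^{-1}$ from the left and $E^{-T}$ from the right, which gives $\tilde{A}X + X\tilde{A}^T = -\tilde{B}\tilde{B}^T$ with $\tilde{A} = E^{-1}A$ and $\tilde{B} = E^{-1}B$. The sketch below checks this reduction numerically; it is not the generalized low-rank ADI of [Sty08], which avoids forming $E^{-1}A$, and the test matrices are our own illustrative choices.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

rng = np.random.default_rng(3)
n, m = 20, 2

# Illustrative data: A symmetric negative definite, E symmetric positive
# definite, so that the pencil lambda*E - A is stable.
A = -3.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
G = rng.standard_normal((n, n))
E = G @ G.T / n + np.eye(n)
B = rng.standard_normal((n, m))

# Reduction to standard form.
A_tilde = np.linalg.solve(E, A)
B_tilde = np.linalg.solve(E, B)
X = solve_continuous_lyapunov(A_tilde, -B_tilde @ B_tilde.T)

# Residual of the original generalized equation.
rel_residual = (np.linalg.norm(A @ X @ E.T + E @ X @ A.T + B @ B.T)
                / np.linalg.norm(B @ B.T))
print(f"relative residual of the generalized equation: {rel_residual:.2e}")
```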

2.2.3 Lyapunov equation for Gramians of linear control systems

Suppose we are given the following continuous linear time-invariant dynamical system with control

$$\dot{x}(t) = Ax(t) + Bu(t),$$

$$y(t) = Cx(t),$$

with system matrices $A \in \mathbb{R}^{n\times n}$, $B \in \mathbb{R}^{n\times m}$, $C \in \mathbb{R}^{\ell\times n}$, state vector $x(t) \in \mathbb{R}^n$, input control vector $u(t) \in \mathbb{R}^m$ and output function $y(t) \in \mathbb{R}^{\ell}$. Furthermore, we assume that $A$ is stable


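given dynamical system, which can be very difficult to compute for very large values of $n$. To address this problem, we aim to find a reduced-order model

$$\dot{\hat{x}}(t) = \hat{A}\hat{x}(t) + \hat{B}u(t),$$

$$\hat{y}(t) = \hat{C}\hat{x}(t),$$

with $\hat{A} \in \mathbb{R}^{k\times k}$, $\hat{B} \in \mathbb{R}^{k\times m}$, $\hat{C} \in \mathbb{R}^{\ell\times k}$, $\hat{x}(t) \in \mathbb{R}^k$, $\hat{y}(t) \in \mathbb{R}^{\ell}$ and $k \ll n$.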

Ideally, when reducing the state space, we would like to remove states that are either

• hard to reach: input energy to guide the system into the state is very high;

• hard to observe: output energy generated from system being in the state is very low.

This idea is implemented in the balanced truncation algorithm [Moo81, PS82], which preserves stability of the dynamical system and provides computable error bounds. In order to provide the reduced model, the balanced truncation algorithm relies upon computation of the controllability Gramian $P$ and the observability Gramian $Q$, which are defined as the unique symmetric positive semidefinite solutions $P, Q \in \mathbb{R}^{n\times n}$ of the following Lyapunov equations:

$$AP + PA^T = -BB^T,$$

$$A^TQ + QA = -C^TC.$$

Given the Cholesky decompositions of the computed Gramians $P = P_C^T P_C$ and $Q = Q_C^T Q_C$, the optimal projection bases $V, W \in \mathbb{R}^{n\times k}$ are computed from the dominant left and right singular vectors of $P_C Q_C^T$, respectively, while the resulting reduced-order model is constructed as follows:

$$\hat{A} = W^T A V, \quad \hat{B} = W^T B, \quad \hat{C} = C V, \quad x(t) \approx V\hat{x}(t), \quad \text{and} \quad \hat{y}(t) = \hat{C}\hat{x}(t).$$
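The construction above can be sketched for a small dense system. The code below follows the usual square-root variant of balanced truncation, which is an assumption on our part since the scaling of the bases is not spelled out in the text: with the SVD $P_C Q_C^T = U \Sigma Z^T$, it sets $V = P_C^T U_k \Sigma_k^{-1/2}$ and $W = Q_C^T Z_k \Sigma_k^{-1/2}$. The factors are obtained from an eigendecomposition instead of a Cholesky factorization to tolerate numerically semidefinite Gramians. The helper names and test data are hypothetical.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def psd_factor(M):
    """Return F with M ~ F.T @ F for a symmetric positive semidefinite M."""
    w, U = np.linalg.eigh((M + M.T) / 2.0)
    return (U * np.sqrt(np.clip(w, 0.0, None))).T

def balanced_truncation(A, B, C, k):
    """Square-root balanced truncation to order k (dense sketch)."""
    P = solve_continuous_lyapunov(A, -B @ B.T)       # A P + P A^T = -B B^T
    Q = solve_continuous_lyapunov(A.T, -C.T @ C)     # A^T Q + Q A = -C^T C
    Pc, Qc = psd_factor(P), psd_factor(Q)
    U, s, Zt = np.linalg.svd(Pc @ Qc.T)              # s: Hankel singular values
    scale = 1.0 / np.sqrt(s[:k])
    V = (Pc.T @ U[:, :k]) * scale
    W = (Qc.T @ Zt[:k, :].T) * scale
    return W.T @ A @ V, W.T @ B, C @ V, W, V, s

n, m, ell, k = 30, 2, 1, 4
rng = np.random.default_rng(5)
A = -3.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)   # stable test matrix
B = rng.standard_normal((n, m))
C = rng.standard_normal((ell, n))

A_r, B_r, C_r, W, V, hsv = balanced_truncation(A, B, C, k)
max_real_part = np.linalg.eigvals(A_r).real.max()
print(f"reduced order {k}, largest real part of reduced poles: {max_real_part:.3f}")
print("leading Hankel singular values:", np.round(hsv[:k + 2], 6))
```

With this scaling the bases are biorthogonal, $W^T V = I_k$, and the discarded Hankel singular values `hsv[k:]` are the quantities that enter the computable error bounds mentioned above.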

In document Low-rank methods for parameter-dependent eigenvalue problems and matrix equations (Page 32-36)