
Taylor's theorem for matrix functions with applications to condition number estimation


Academic year: 2019


Contents lists available at ScienceDirect

Linear Algebra and its Applications

www.elsevier.com/locate/laa

Edvin Deadman, Samuel D. Relton

School of Mathematics, The University of Manchester, Manchester, M13 9PL, UK

Article info

Article history:

Received 27 April 2015
Accepted 6 April 2016
Available online 18 April 2016
Submitted by V. Mehrmann

MSC:

15A12
15A16

Keywords:

Matrix function
Taylor polynomial
Remainder
Condition number
Pseudospectrum
Fréchet derivative

Abstract

We derive an explicit formula for the remainder term of a Taylor polynomial of a matrix function. This formula generalizes a known result for the remainder of the Taylor polynomial for an analytic function of a complex scalar. We investigate some consequences of this result, which culminate in new upper bounds for the level-1 and level-2 condition numbers of a matrix function in terms of the pseudospectrum of the matrix. Numerical experiments show that, although the bounds can be pessimistic, they can be computed much faster than the standard methods. This makes the upper bounds ideal for a quick estimation of the condition number whilst a more accurate (and expensive) method can be used if further accuracy is required. They are also easily applicable to more complicated matrix functions for which no specialized condition number estimators are currently available.

© 2016 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

This work was supported by European Research Council Advanced Grant MATFUN (267526).

* Corresponding author.

E-mail addresses: [email protected] (E. Deadman), [email protected] (S.D. Relton).

http://dx.doi.org/10.1016/j.laa.2016.04.010


1. Introduction

Taylor’s theorem is a standard result in elementary calculus (see e.g. [17]). If $f:\mathbb{R}\to\mathbb{R}$ is $k$ times continuously differentiable at $a\in\mathbb{R}$, then the theorem states that there exists $R_k:\mathbb{R}\to\mathbb{R}$ such that

$$f(x) = \sum_{j=0}^{k} \frac{f^{(j)}(a)}{j!}(x-a)^j + R_k(x)$$

and $R_k(x) = o(|x-a|^k)$ as $x \to a$. Depending on any additional assumptions on $f$, various precise formulae for the remainder term $R_k(x)$ are available. For example, if $f$ is $k+1$ times continuously differentiable on the closed interval between $a$ and $x$, then

$$R_k(x) = \frac{f^{(k+1)}(c)}{(k+1)!}(x-a)^{k+1} \tag{1}$$

for some $c$ between $a$ and $x$. This is known as the Lagrange form of the remainder. Alternative expressions, such as the Cauchy form or the integral form for the remainder, are well known [17].

Taylor’s theorem generalizes to analytic functions in the complex plane: the remainder must now be expressed in terms of a contour integral. If $f(z)$ is complex analytic in an open subset $\mathcal{D} \subset \mathbb{C}$ of the complex plane, the $k$th-degree Taylor polynomial of $f$ at $a \in \mathcal{D}$ satisfies

$$f(z) = \sum_{j=0}^{k} \frac{f^{(j)}(a)}{j!}(z-a)^j + R_k(z),$$

where

$$R_k(z) = \frac{(z-a)^{k+1}}{2\pi i}\int_{\Gamma} \frac{f(w)\,dw}{(w-a)^{k+1}(w-z)}, \tag{2}$$

and $\Gamma$ is a circle, centred at $a$, such that $\Gamma \subset \mathcal{D}$. See [1, Chap. 5, Sec. 1.2] for a proof of this result.
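Formula (2) can be checked numerically. The following is a minimal sketch, assuming $f = \exp$, an arbitrarily chosen expansion point and evaluation point, and a circle of radius 2 that contains the evaluation point; the helper name `scalar_remainder` and the use of the trapezoidal rule are illustrative choices, not part of the paper.

```python
import numpy as np
from math import factorial

def scalar_remainder(f, a, z, k, radius, m=256):
    """Approximate R_k(z) in (2) with the m-point trapezoidal rule on |w - a| = radius."""
    theta = 2.0 * np.pi * np.arange(m) / m
    w = a + radius * np.exp(1j * theta)                  # nodes on the circle Gamma
    dw = 1j * radius * np.exp(1j * theta) * (2.0 * np.pi / m)
    integrand = f(w) / ((w - a) ** (k + 1) * (w - z))
    return (z - a) ** (k + 1) / (2j * np.pi) * np.sum(integrand * dw)

# Illustrative values (assumptions): expansion point a, evaluation point z, degree k.
a, z, k = 0.3, 0.8 + 0.2j, 4

# Remainder by definition: f(z) minus the degree-k Taylor polynomial (all derivatives of exp are exp).
taylor = sum(np.exp(a) / factorial(j) * (z - a) ** j for j in range(k + 1))
direct = np.exp(z) - taylor

# Remainder from the contour integral (2).
contour = scalar_remainder(np.exp, a, z, k, radius=2.0)
print(abs(direct - contour))
```

The two values should agree to roughly machine precision, because the trapezoidal rule converges geometrically for periodic analytic integrands.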

The first goal of this paper is to generalize (2) to matrices, thereby providing an explicit expression for the remainder term for the $k$th-degree Taylor polynomial of a matrix function. Note that it will not be possible to obtain an expression similar to (1) because its derivation relies on the mean value theorem, which does not have an exact analogue for matrix-valued functions. Our second goal is to investigate applications of this result in bounding the derivatives and condition numbers of matrix functions via pseudospectra.


exposition). Mathias [15] also obtains a normwise truncation error bound for matrix function Taylor polynomials which form part of the Schur–Parlett algorithm [4]. There are also a number of remainder theorems within the operator theory literature which can be applied to matrix functions. However, to our knowledge, this paper represents the first time an explicit remainder term (as opposed to a bound) has been specifically obtained for the Taylor polynomial of a matrix function.

The remaining sections of this paper are organized as follows. In section 2 we state and prove the remainder term for the $k$th-degree Taylor polynomial of a matrix function. In section 3 we investigate some applications of this result by bounding the first order remainder term using pseudospectral techniques and relating it to the condition number of $f(A)$. In section 4 we extend these results to the level-2 condition number of a matrix function, introduced in [13]. In section 5 we examine the behaviour of the pseudospectral bounds on some test problems and show that they can be computed efficiently. Finally in section 6 we present our conclusions and discuss some potential extensions of this work.

2. Remainder term for Taylor polynomials

The Taylor series theorems found in Higham’s monograph [11] primarily involve expanding $f(A)$ about a multiple of the identity matrix $I$:

$$f(A) = \sum_{j=0}^{\infty} \frac{f^{(j)}(\alpha)}{j!}(A-\alpha I)^j.$$

Our starting point is the more general Taylor series expansion in terms of Fréchet derivatives, obtained by Al-Mohy and Higham [2, Thm. 1]. Suppose that $f$ has a power series expansion $\sum_{j=0}^{\infty} a_j x^j$ with radius of convergence $r > 0$ centered at the origin. The interior of the circle $|x| < r$ defines a simply connected open set $\mathcal{D}$. Then, given $A, E \in \mathbb{C}^{n\times n}$ with $\Lambda(A), \Lambda(A+E) \subset \mathcal{D}$ (where $\Lambda(X)$ denotes the spectrum of the matrix $X$), Al-Mohy and Higham proved that

$$f(A+E) = \sum_{j=0}^{\infty} \frac{1}{j!} D_f^{[j]}(A,E), \tag{3}$$

where

$$D_f^{[j]}(A,E) = \frac{d^j}{dt^j}\bigg|_{t=0} f(A+tE). \tag{4}$$


terms, $D_f^{[1]}(A,E)$, coincides with the “standard” Fréchet derivative $L_f(A,E)$. Additionally, if $A$ and $E$ commute then we have $D_f^{[j]}(A,E) = E^j f^{(j)}(A)$, where $f^{(j)}$ denotes the $j$th derivative of the scalar function $f(x)$.

Before writing down the remainder term obtained by truncating the Taylor series in (3), we first recall the standard result that, for any invertible $A$ and $B$,

$$A^{-1} - B^{-1} = A^{-1}(B-A)B^{-1}. \tag{5}$$

We will also need the following lemma.

Lemma 2.1. Let $X(t) = A - tB$, where $t$ is a scalar. Then

$$\frac{d^j}{dt^j}\bigg|_{t=0} X(t)^{-1} = j!\,A^{-1}(BA^{-1})^j.$$

Proof. Note that

$$\frac{d}{dt}X^{-1} = -X^{-1}X'X^{-1},$$

where $X'$ denotes the derivative of $X$, and that, since higher derivatives of $X$ vanish,

$$\frac{d^j}{dt^j}X^{-1} = (-1)^j j!\,X^{-1}(X'X^{-1})^j.$$

The result then follows by substituting $X = A - tB$ and setting $t = 0$. □
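Both observations about $D_f^{[1]}(A,E)$ can be illustrated numerically. The sketch below assumes $f = \exp$, small random test matrices, and SciPy's `expm` and `expm_frechet`; it compares a central difference of $t \mapsto f(A+tE)$ at $t=0$ with the Fréchet derivative, and checks the commuting case with the (assumed, illustrative) direction $E_c = A^2$.

```python
import numpy as np
from scipy.linalg import expm, expm_frechet

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n)) / n   # illustrative test matrix (assumption)
E = rng.standard_normal((n, n)) / n   # illustrative perturbation direction (assumption)

# D_f^{[1]}(A, E): first derivative of t -> exp(A + tE) at t = 0, by a central difference.
h = 1e-5
D1_fd = (expm(A + h * E) - expm(A - h * E)) / (2 * h)

# Frechet derivative L_exp(A, E) as computed by SciPy.
L = expm_frechet(A, E, compute_expm=False)
err_frechet = np.linalg.norm(D1_fd - L)

# Commuting case: E_c = A @ A commutes with A, so D^{[1]} = E_c f'(A) = E_c exp(A).
E_c = A @ A
L_c = expm_frechet(A, E_c, compute_expm=False)
err_commuting = np.linalg.norm(L_c - E_c @ expm(A))

print(err_frechet, err_commuting)
```

The first discrepancy is limited by the finite difference step, while the second should be at the level of rounding error.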
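Lemma 2.1 says that the Taylor coefficients of $t \mapsto (A - tB)^{-1}$ at $t = 0$ are $A^{-1}(BA^{-1})^j$. A minimal numerical sketch, assuming small random matrices and a small value of $t$ so that the series converges quickly, sums these coefficients and compares with a direct inverse:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = np.eye(n) + 0.1 * rng.standard_normal((n, n))   # illustrative, well conditioned (assumption)
B = rng.standard_normal((n, n))
t = 1e-2                                            # small enough for fast convergence

Ainv = np.linalg.inv(A)
BAinv = B @ Ainv

# Sum_j t^j A^{-1} (B A^{-1})^j, i.e. the Taylor series whose j-th coefficient
# is (1/j!) d^j/dt^j X(t)^{-1} at t = 0 according to Lemma 2.1.
series = np.zeros((n, n))
term = Ainv.copy()                # j = 0 coefficient: A^{-1}
for j in range(30):
    series += t**j * term
    term = term @ BAinv           # next coefficient: A^{-1} (B A^{-1})^{j+1}

err = np.linalg.norm(series - np.linalg.inv(A - t * B))
print(err)
```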

Furthermore, we note that by the Cauchy–Hadamard theorem any power series in the complex plane converging to a function $f$ must converge on a circular domain with radius of convergence $r$ (which can be infinite). In the following results, for the purpose of maximizing generality, we say that $f$ has a power series expansion which converges on a simply connected open set $\mathcal{D}$. Clearly $\mathcal{D}$ must be a subset of this circular domain, but need not be circular itself. The reason for this distinction is that the $\epsilon$-pseudospectra of $A$, used in section 3, give rise to sets of differing shape.

We now state and prove the main result of this paper, which gives an explicit form of the remainder term when truncating (3).

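Theorem 2.2. Let $f$ have a power series expansion about the origin with radius of convergence $r$ and let $\mathcal{D} \subset \mathbb{C}$ be a simply connected open set within the circle of radius $r$ centered at 0. Let $A, E \in \mathbb{C}^{n\times n}$ be such that $\Lambda(A), \Lambda(A+E) \subset \mathcal{D}$. Then for any $k \in \mathbb{N}$

$$f(A+E) = T_k(A,E) + R_k(A,E),$$

where

$$T_k(A,E) = \sum_{j=0}^{k} \frac{1}{j!} D_f^{[j]}(A,E), \tag{6}$$

$$R_k(A,E) = \frac{1}{2\pi i}\int_{\Gamma} f(z)(zI-A-E)^{-1}\left[E(zI-A)^{-1}\right]^{k+1}dz, \tag{7}$$

and $\Gamma$ is a closed contour in $\mathcal{D}$ enclosing $\Lambda(A)$ and $\Lambda(A+E)$.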

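Proof. The result is proved by induction on $k$. For the case $k = 0$ we have $f(A+E) = f(A) + R_0(A,E)$. Then

$$R_0(A,E) = f(A+E) - f(A) = \frac{1}{2\pi i}\int_{\Gamma} f(z)\left[(zI-A-E)^{-1} - (zI-A)^{-1}\right]dz,$$

using the Cauchy integral definition of a matrix function. It follows from (5) that

$$R_0(A,E) = \frac{1}{2\pi i}\int_{\Gamma} f(z)(zI-A-E)^{-1}E(zI-A)^{-1}dz.$$

For the inductive step, we assume that $f(A+E) = T_k(A,E) + R_k(A,E)$. The remainder for the $(k+1)$st degree Taylor polynomial is given by

$$\begin{aligned}
R_{k+1}(A,E) &= f(A+E) - T_{k+1}(A,E) \\
&= f(A+E) - T_k(A,E) - \frac{1}{(k+1)!}D_f^{[k+1]}(A,E) \\
&= R_k(A,E) - \frac{1}{(k+1)!}\frac{d^{k+1}}{dt^{k+1}}\bigg|_{t=0} f(A+tE).
\end{aligned}$$

Substituting the inductive hypothesis for $R_k(A,E)$ and the Cauchy integral form for $f(A+tE)$, assuming that $t$ is sufficiently small, gives

$$\begin{aligned}
R_{k+1}(A,E) &= \frac{1}{2\pi i}\int_{\Gamma} f(z)(zI-A-E)^{-1}\left[E(zI-A)^{-1}\right]^{k+1}dz \\
&\quad - \frac{1}{2\pi i\,(k+1)!}\frac{d^{k+1}}{dt^{k+1}}\bigg|_{t=0}\int_{\Gamma} f(z)(zI-A-tE)^{-1}dz.
\end{aligned}$$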

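Applying Lemma 2.1 with $X(t) = (zI-A) - tE$ to the derivative in the second integral gives

$$\begin{aligned}
R_{k+1}(A,E) &= \frac{1}{2\pi i}\int_{\Gamma} f(z)\left\{(zI-A-E)^{-1}\left[E(zI-A)^{-1}\right]^{k+1} - (zI-A)^{-1}\left[E(zI-A)^{-1}\right]^{k+1}\right\}dz \\
&= \frac{1}{2\pi i}\int_{\Gamma} f(z)(zI-A-E)^{-1}\left[E(zI-A)^{-1}\right]^{k+2}dz,
\end{aligned}$$

where (5) has been used once more. This completes the proof. □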
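The remainder formula (7) can be verified numerically for small $k$. The sketch below assumes $f = \exp$, random $4\times 4$ test matrices, a circular contour of radius 4 centered at the origin (chosen well above $\|A\|_2 + \|E\|_2$ for matrices scaled like these, so that it encloses both spectra), and the trapezoidal rule; `remainder_contour` is a hypothetical helper name. For $k=0$ and $k=1$ the Taylor polynomials are $f(A)$ and $f(A) + L_f(A,E)$, which SciPy can evaluate directly.

```python
import numpy as np
from scipy.linalg import expm, expm_frechet

def remainder_contour(f, A, E, k, radius, m=200):
    """Trapezoidal-rule approximation of (7) on the circle |z| = radius."""
    n = A.shape[0]
    I = np.eye(n)
    total = np.zeros((n, n), dtype=complex)
    for theta in 2.0 * np.pi * np.arange(m) / m:
        z = radius * np.exp(1j * theta)
        dz = 1j * z * (2.0 * np.pi / m)
        res_AE = np.linalg.inv(z * I - A - E)        # (zI - A - E)^{-1}
        E_res_A = E @ np.linalg.inv(z * I - A)       # E (zI - A)^{-1}
        total += f(z) * res_AE @ np.linalg.matrix_power(E_res_A, k + 1) * dz
    return total / (2j * np.pi)

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n)) / n   # illustrative test matrix (assumption)
E = rng.standard_normal((n, n)) / n   # illustrative perturbation (assumption)
radius = 4.0

# Remainders by definition: R_0 = f(A+E) - f(A), R_1 = R_0 - L_f(A, E).
R0 = expm(A + E) - expm(A)
R1 = R0 - expm_frechet(A, E, compute_expm=False)

err0 = np.linalg.norm(R0 - remainder_contour(np.exp, A, E, 0, radius))
err1 = np.linalg.norm(R1 - remainder_contour(np.exp, A, E, 1, radius))
print(err0, err1)
```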

We end this section by briefly describing how Theorem 2.2 also allows us to obtain a remainder term for Padé approximants (this was first done in the scalar case by Elliott [5]). Suppose that we approximate $f(z)$ using a rational function $p_m(z)/q_n(z)$, where $p_m(z)$ and $q_n(z)$ are polynomials of degree $m$ and $n$ respectively. The Padé approximant is the unique choice (up to scalar multiples) of $p_m(z)$ and $q_n(z)$ such that $f(z) - p_m(z)/q_n(z) = O(z^{m+n+1})$. Therefore, using the same rational function to approximate the corresponding matrix function, we have $q_n(X)f(X) - p_m(X) = O(X^{m+n+1})$.

We introduce the truncation error term $S_{m,n}(X)$ to the Padé approximant such that

$$f(X) = q_n(X)^{-1}p_m(X) + S_{m,n}(X).$$

Then, by rearranging the above,

$$q_n(X)f(X) = p_m(X) + q_n(X)S_{m,n}(X).$$

The term $q_n(X)S_{m,n}(X)$ is then the remainder term if we consider $p_m(X)$ to be a power series expansion of $q_n(X)f(X)$ when we set $A = 0$ and $E = X$. The remainder has degree at least $m+n$ and so, by applying (7) with $k = m+n$, we obtain

$$S_{m,n}(X) = \frac{q_n(X)^{-1}X^{m+n+1}}{2\pi i}\int_{\Gamma}\frac{q_n(z)f(z)(zI-X)^{-1}}{z^{m+n+1}}\,dz,$$

where the closed contour $\Gamma$ encloses $\Lambda(X)$ and the origin.

3. Application to condition numbers and pseudospectra

In this section we use Theorem 2.2 to study the behaviour of the condition number of a matrix function, which measures the sensitivity of $f(A)$ to small perturbations in $A$. The results in this section are applicable for any induced matrix norm. Our approach requires borrowing a number of techniques from the analysis of pseudospectra. Recall that the $\epsilon$-pseudospectrum of a matrix $X$ is the set

$$\Lambda_\epsilon(X) = \left\{z \in \mathbb{C} : \|(zI-X)^{-1}\| \ge \epsilon^{-1}\right\}. \tag{8}$$
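In the 2-norm, the resolvent norm in (8) equals the reciprocal of the smallest singular value of $zI - X$, so membership in the $\epsilon$-pseudospectrum reduces to a singular value test. The sketch below uses this standard fact; the helper name `in_pseudospectrum` and the $2\times 2$ Jordan block example are illustrative assumptions.

```python
import numpy as np

def in_pseudospectrum(X, z, eps):
    """True if z lies in the 2-norm eps-pseudospectrum of X, i.e. sigma_min(zI - X) <= eps."""
    n = X.shape[0]
    sigma_min = np.linalg.svd(z * np.eye(n) - X, compute_uv=False)[-1]
    return sigma_min <= eps

# A 2x2 Jordan block: nonnormal, with the single eigenvalue 0.
J = np.array([[0.0, 1.0],
              [0.0, 0.0]])
eps = 1e-2

# z = 0.05 is five times further than eps from the eigenvalue, yet it lies in
# the eps-pseudospectrum because of nonnormality; z = 0.5 does not.
print(in_pseudospectrum(J, 0.05, eps))
print(in_pseudospectrum(J, 0.5, eps))
```

For a normal matrix the same test would reduce to checking whether $z$ is within distance $\epsilon$ of an eigenvalue.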


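Lemma 3.1. Let $f$ and $\mathcal{D}$ satisfy the criteria of Theorem 2.2. Furthermore let $\epsilon > 0$ be such that $\Lambda_\epsilon(A) \subset \mathcal{D}$ and $\Lambda_\epsilon(A+E) \subset \mathcal{D}$, and take $\tilde{\Gamma}_\epsilon \subset \mathcal{D}$ to be a closed contour that encloses both $\Lambda_\epsilon(A)$ and $\Lambda_\epsilon(A+E)$. Then the remainder term $R_k(A,E)$ is bounded by

$$\|R_k(A,E)\| \le \frac{\|E\|^{k+1}\tilde{L}_\epsilon}{2\pi\epsilon^{k+2}}\max_{z\in\tilde{\Gamma}_\epsilon}|f(z)|, \tag{9}$$

where $\tilde{L}_\epsilon$ is the length of $\tilde{\Gamma}_\epsilon$. In particular, when a circular contour centered at 0 is used,

$$\|R_k(A,E)\| \le \frac{\|E\|^{k+1}\tilde{\rho}_\epsilon}{\epsilon^{k+2}}\max_{\theta\in[0,2\pi]}|f(\tilde{\rho}_\epsilon e^{i\theta})|, \tag{10}$$

where $\tilde{\rho}_\epsilon = \max\{|z| : z \in \Lambda_\epsilon(A+E) \cup \Lambda_\epsilon(A)\}$ is the radius of the circle.

(Note that tildes on $\tilde{L}_\epsilon$, $\tilde{\Gamma}_\epsilon$, and $\tilde{\rho}_\epsilon$ are used because, for this result only, the contour needs to enclose $\Lambda_\epsilon(A+E)$ in addition to $\Lambda_\epsilon(A)$. For subsequent results, the contour need only enclose $\Lambda_\epsilon(A)$ and the tildes are dropped.)

Proof. The proof is analogous to that of the bound

$$\|f(A)\| \le \frac{\tilde{L}_\epsilon}{2\pi\epsilon}\max_{z\in\tilde{\Gamma}_\epsilon}|f(z)|,$$

obtained by Trefethen and Embree [19, Ch. 14]. We bound the norm of $R_k(A,E)$ by noting that

$$\|R_k(A,E)\| \le \frac{\|E\|^{k+1}}{2\pi}\int_{\tilde{\Gamma}_\epsilon}|f(z)|\,\|(zI-A-E)^{-1}\|\,\|(zI-A)^{-1}\|^{k+1}\,|dz|.$$

On $\tilde{\Gamma}_\epsilon$ we have $\|(zI-A-E)^{-1}\| \le \epsilon^{-1}$ and $\|(zI-A)^{-1}\| \le \epsilon^{-1}$. The first part of the lemma follows immediately. For the second part, take $\tilde{\Gamma}_\epsilon$ to be a circle with center 0 and radius $\tilde{\rho}_\epsilon = \max\{|z| : z \in \Lambda_\epsilon(A+E) \cup \Lambda_\epsilon(A)\}$. □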

We can also use this result to bound the absolute condition number of a matrix function. Recall that the absolute condition number measures the first order sensitivity of $f(A)$ to small perturbations in $A$ and is given by [11, Chap. 3]

$$\mathrm{cond}_{\mathrm{abs}}(f,A) := \lim_{\tau\to 0}\sup_{\|E\|\le\tau}\frac{\|f(A+E)-f(A)\|}{\tau} = \max_{\|E\|\le 1}\|L_f(A,E)\|. \tag{11}$$


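Corollary 3.2. Let $f$ and $\mathcal{D}$ satisfy the criteria of Theorem 2.2. Let $\epsilon > 0$ be such that $\Lambda_\epsilon(A) \subset \mathcal{D}$, and let $\Gamma_\epsilon \subset \mathcal{D}$ be a closed contour of length $L_\epsilon$ that encloses the $\epsilon$-pseudospectrum. Then

$$\mathrm{cond}_{\mathrm{abs}}(f,A) \le \frac{L_\epsilon}{2\pi\epsilon^2}\max_{z\in\Gamma_\epsilon}|f(z)|. \tag{12}$$

In particular, when a circular contour centered at 0 is used,

$$\mathrm{cond}_{\mathrm{abs}}(f,A) \le \frac{\rho_\epsilon}{\epsilon^2}\max_{\theta\in[0,2\pi]}|f(\rho_\epsilon e^{i\theta})|, \tag{13}$$

where $\rho_\epsilon = \max\{|z| : z \in \Lambda_\epsilon(A)\}$ is the pseudospectral radius of $A$.

Proof. Set $k = 0$ in (9). Consider $\|E\| = \alpha < \epsilon$ so that, by an equivalent definition of the $\epsilon$-pseudospectrum, we have $\Lambda(A+E) \subset \Lambda_\epsilon(A)$. Then, since $R_0(A,E) = L_f(A,E) + o(\|E\|)$, we have

$$\|L_f(A,E) + o(\alpha)\| \le \frac{\alpha L_\epsilon}{2\pi\epsilon^2}\max_{z\in\Gamma_\epsilon}|f(z)|.$$

We divide by $\alpha$ and take the supremum over all $E$ such that $\|E\| \le \alpha$ to obtain

$$\sup_{\|E\|\le\alpha}\|L_f(A,E/\alpha) + o(\alpha)/\alpha\| \le \frac{L_\epsilon}{2\pi\epsilon^2}\max_{z\in\Gamma_\epsilon}|f(z)|.$$

The proof of (12) is completed by taking the limit $\alpha \to 0$ and recalling that the absolute condition number of a matrix function is given by the operator norm of the Fréchet derivative (11).

The proof of (13) is essentially the same, except that (10) is taken as the starting point rather than (9).

Note that an alternative proof of the corollary can be obtained by starting with the integral representation of the Fréchet derivative

$$L_f(A,E) = \frac{1}{2\pi i}\int_{\Gamma} f(z)(zI-A)^{-1}E(zI-A)^{-1}dz,$$

and bounding it above using the techniques from the proof of Lemma 3.1. □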


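Lemma 3.3 (Reddy, Schmid, and Henningson). Let $W(A)$ be the numerical range of $A$ and $\Delta_\delta$ be a closed disk of radius $\delta$. Then for all $\epsilon > 0$

$$\Lambda_\epsilon(A) \subset W(A) + \Delta_\epsilon,$$

where set addition is defined componentwise; that is, $S_1 + S_2 = \{s_1 + s_2 : s_1 \in S_1,\ s_2 \in S_2\}$.

Proof. See Reddy, Schmid, and Henningson [16, Thm. 2.1]. □

Since the numerical radius, $r(A) := \sup_{z\in W(A)}|z|$, is bounded above by $\|A\|_2$, we know that the $\epsilon$-pseudospectral radius is no larger than $\|A\|_2 + \epsilon$. Thus we obtain the following corollary.

Corollary 3.4. Let $f$, $\mathcal{D}$, and $\epsilon > 0$ satisfy the criteria of Corollary 3.2 and suppose that $\|A\|_2 + \epsilon < r$, the radius of convergence for the power series expansion of $f$. Then

$$\mathrm{cond}_{\mathrm{abs}}(f,A) \le \frac{\|A\|_2+\epsilon}{\epsilon^2}\max_{|z|=\|A\|_2+\epsilon}|f(z)|. \tag{14}$$

Proof. The circle of radius $\|A\|_2 + \epsilon$ around the origin encloses $\Lambda_\epsilon(A)$ and is of length $2\pi(\|A\|_2 + \epsilon)$. Using this contour in (13) gives the desired result. □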
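The bound (14) needs only $\|A\|_2$ and the maximum modulus of the scalar function on a circle. The following is a minimal sketch, assuming $f = \exp$ (for which $\max_{|z|=\rho}|e^z| = e^\rho$), $\epsilon = 1$, and a random test matrix; the helper name `cond_bound_exp` is hypothetical. It compares the bound with a sampled lower estimate of the 2-norm condition number obtained from $\|L_f(A,E)\|_2$ over random unit-norm directions $E$.

```python
import numpy as np
from scipy.linalg import expm_frechet

def cond_bound_exp(A, eps=1.0):
    """Right-hand side of (14) for f = exp, using max_{|z| = rho} |exp(z)| = exp(rho)."""
    rho = np.linalg.norm(A, 2) + eps
    return rho / eps**2 * np.exp(rho)

rng = np.random.default_rng(3)
n = 5
A = rng.standard_normal((n, n)) / np.sqrt(n)   # illustrative test matrix (assumption)
bound = cond_bound_exp(A)

# Lower estimate of cond_abs in the 2-norm: sample directions E with ||E||_2 = 1.
sampled = 0.0
for _ in range(50):
    E = rng.standard_normal((n, n))
    E /= np.linalg.norm(E, 2)
    L = expm_frechet(A, E, compute_expm=False)
    sampled = max(sampled, np.linalg.norm(L, 2))

print(sampled, bound)
```

The sampled value never exceeds the bound, and the gap between the two gives a rough sense of how pessimistic (14) is for a given matrix and choice of $\epsilon$.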

One potential application of this result is in the design and analysis of algorithms for computing matrix functions. Many such algorithms work by rescaling $A$ to be of small norm, applying the function to this scaled matrix (via a Padé approximant or Taylor series), and then undoing the effect of the scaling. This corollary may allow us to better understand the numerical effect of applying the matrix function to the scaled matrix, since such analysis is typically done only in exact arithmetic.

We end this section by briefly mentioning a related theorem due to Lui [14, Thm. 3.1], concerning the relationship between the pseudospectra of $A$ and $f(A)$. The theorem is restated here in our notation. Recall that $R_k(A,E)$ was defined in Theorem 2.2 and that $R_0(A,E) = L_f(A,E) + o(\|E\|)$.

Lemma 3.5 (Lui). Let $\epsilon$, $f$, and $\Gamma_\epsilon$ satisfy the conditions of Corollary 3.2. Furthermore let $f(\Lambda_\epsilon(A)) = \{f(z) : z \in \Lambda_\epsilon(A)\}$ and $M = \max_{\|E\|\le\epsilon}\|R_0(A,E)\|$. Then $f(\Lambda_\epsilon(A)) \subset \Lambda_M(f(A))$.

Proof. If $z$ is an eigenvalue of $A+E$ with $\|E\| \le \epsilon$ (so that $z \in \Lambda_\epsilon(A)$), then $f(z)$ is an eigenvalue of $f(A+E) = f(A) + R_0(A,E)$ and $\|R_0(A,E)\| \le M$. □

This result shows that, to first order in $\epsilon$, the $\epsilon$-pseudospectrum of $A$ is related to the


4. Application to higher order condition numbers

Higham and Relton [13] introduce the level-$q$ condition number for matrix functions, which is defined recursively by

$$\mathrm{cond}^{(q)}_{\mathrm{abs}}(f,A) := \lim_{\alpha\to 0}\sup_{\|Z\|\le\alpha}\frac{\left|\mathrm{cond}^{(q-1)}_{\mathrm{abs}}(f,A+Z) - \mathrm{cond}^{(q-1)}_{\mathrm{abs}}(f,A)\right|}{\alpha}, \tag{15}$$

where $\mathrm{cond}^{(1)}_{\mathrm{abs}}(f,A) := \mathrm{cond}_{\mathrm{abs}}(f,A)$. In section 3 we focused on the first order remainder term, $R_0(A,E)$, and results concerning the condition number $\mathrm{cond}_{\mathrm{abs}}(f,A)$ but, by choosing $k > 0$ in Lemma 3.1, we can attempt to extend results such as Corollary 3.2 to these higher order condition numbers.

Before proceeding, we must first investigate the relationship between the $D_f^{[j]}(A,E)$ defined in (4) and higher order Fréchet derivatives. Recall that $D_f^{[j]}(A,E)$ is a special case of the $j$th order Fréchet derivative in which the perturbation in each direction is $E$. In [13] a definition of the $j$th order Fréchet derivative, assuming it is continuous in $A$, is given in terms of the mixed partial derivative:

$$L_f^{(j)}(A,E_1,\ldots,E_j) = \frac{\partial}{\partial s_1}\cdots\frac{\partial}{\partial s_j}\bigg|_{(s_1,\ldots,s_j)=0} f(A + s_1E_1 + \cdots + s_jE_j). \tag{16}$$

The following theorem expresses this $j$th order Fréchet derivative in terms of a contour integral.

Theorem 4.1. Let $f$ be $j$ times Fréchet differentiable such that the $j$th Fréchet derivative is continuous at $A$, and let $\Gamma$ be a closed contour enclosing $\Lambda(A)$ such that $f$ is analytic inside and on $\Gamma$. Then, the $j$th order Fréchet derivative of a matrix function $f(A)$ in the directions $E_1,\ldots,E_j$ is given by

$$L_f^{(j)}(A,E_1,\ldots,E_j) = \frac{1}{2\pi i}\int_{\Gamma} f(z)(zI-A)^{-1}\sum_{\sigma\in S_j}\prod_{i=1}^{j}E_{\sigma(i)}(zI-A)^{-1}\,dz, \tag{17}$$

where $S_j$ is the set of permutations of $\{1,2,\ldots,j\}$. In particular the derivative $D_f^{[j]}(A,E)$ is given by

$$D_f^{[j]}(A,E) = \frac{j!}{2\pi i}\int_{\Gamma} f(z)(zI-A)^{-1}\left[E(zI-A)^{-1}\right]^{j}dz. \tag{18}$$

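Proof. For any choice of $s_i$ (in some neighbourhood of 0) and $E_i$, we can write $f(A + s_1E_1 + \cdots + s_jE_j)$ as a Cauchy integral by using the standard Cauchy integral definition

$$L_f^{(j)}(A,E_1,\ldots,E_j) = \frac{\partial}{\partial s_1}\cdots\frac{\partial}{\partial s_j}\bigg|_{(s_1,\ldots,s_j)=0}\frac{1}{2\pi i}\int_{\tilde{\Gamma}} f(z)\left(zI - (A + s_1E_1 + \cdots + s_jE_j)\right)^{-1}dz.$$

Using the Leibniz integral rule, the differential operator

$$\frac{\partial}{\partial s_1}\cdots\frac{\partial}{\partial s_j}\bigg|_{(s_1,\ldots,s_j)=0}$$

can be brought inside the integral sign. The required integrand is then obtained by using the identity

$$\frac{d}{dx}U^{-1} = -U^{-1}\frac{dU}{dx}U^{-1}.$$

The result (17) follows by then restricting the contour to the closed curve $\Gamma$ containing $\Lambda(A)$. The second part of the theorem, (18), follows by setting $E_1 = \cdots = E_j$. □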

Theorem 4.1 shows that, to first order, the $k$th remainder term in the Taylor series is simply the $(k+1)$st derivative, as we might expect. Specifically, comparing (18) with (7) we find

$$R_k(A,E) = \frac{1}{(k+1)!}D_f^{[k+1]}(A,E) + o(\|E\|^{k+1}).$$

In addition, Theorem 4.1 allows us to prove the following theorem, which uses the pseudospectrum of $A$ to bound the norm of the $j$th order Fréchet derivative.

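Theorem 4.2. Let $f$ satisfy the criteria of Theorem 4.1 and let $\Gamma_\epsilon$ be a closed contour enclosing $\Lambda_\epsilon(A)$ such that $f$ is analytic inside and on $\Gamma_\epsilon$. Then the $j$th order Fréchet derivative can be bounded by

$$\|L_f^{(j)}(A,E_1,\ldots,E_j)\| \le \frac{j!\,L_\epsilon}{2\pi\epsilon^{j+1}}\max_{z\in\Gamma_\epsilon}|f(z)|\prod_{i=1}^{j}\|E_i\|, \tag{19}$$

where $L_\epsilon$ is the length of $\Gamma_\epsilon$.

Proof. In (17) use the contour $\Gamma_\epsilon$, take norms, and note that $\|(zI-A)^{-1}\| \le \epsilon^{-1}$ on $\Gamma_\epsilon$. □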


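any interest. Instead we restrict ourselves to the case $q = 2$ and the level-2 condition number.

Lemma 4.3. Let $f$ satisfy the criteria of Theorem 4.1 and let $\Gamma_\epsilon$ be a closed contour enclosing $\Lambda_\epsilon(A)$ such that $f$ is analytic inside and on $\Gamma_\epsilon$. The level-2 condition number is bounded by

$$\mathrm{cond}^{(2)}_{\mathrm{abs}}(f,A) \le \frac{L_\epsilon}{\pi\epsilon^3}\max_{z\in\Gamma_\epsilon}|f(z)|.$$

When a circular contour centered at 0 is used,

$$\mathrm{cond}^{(2)}_{\mathrm{abs}}(f,A) \le \frac{2\rho_\epsilon}{\epsilon^3}\max_{\theta\in[0,2\pi]}|f(\rho_\epsilon e^{i\theta})|,$$

where $\rho_\epsilon$, the pseudospectral radius, is the radius of the circle.

Proof. Higham and Relton [13, Sec. 5] give an upper bound for the level-2 absolute condition number in terms of the norm of the second Fréchet derivative

$$\mathrm{cond}^{(2)}_{\mathrm{abs}}(f,A) \le \max_{\|E_1\|=1}\max_{\|E_2\|=1}\|L_f^{(2)}(A,E_1,E_2)\|. \tag{20}$$

Substituting the bound from (19) into (20) gives the required results. □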

5. Numerical experiments

In this section we show how our pseudospectral bounds on the condition number, (12) and (13), can be used to estimate the condition number of matrix functions in practice. We also find that they are cheaper than alternative approaches and, therefore, one might use the pseudospectral bound as a quick estimate of the condition number. If this estimate is unsatisfactorily large we can use existing methods to estimate it more accurately. The term “unsatisfactorily large” can be made precise in the following manner: many applications only require the first few digits of the result to be correct, so that a relative error of, for example, 1e-4 is perfectly acceptable. When using a backward stable algorithm the relative error is approximately bounded above by the condition number multiplied by the unit roundoff ($u = 2^{-53}$ in IEEE double precision arithmetic).

Throughout this section, to compute our bound on the condition number, we will be using (12)

$$\mathrm{cond}_{\mathrm{abs}}(f,A) \le \frac{L_\epsilon}{2\pi\epsilon^2}\max_{z\in\Gamma_\epsilon}|f(z)|,$$

where $\Gamma_\epsilon$ is a closed contour of length $L_\epsilon$ that encloses the $\epsilon$-pseudospectrum of $A$ and lies


$$\mathrm{cond}_{\mathrm{rel}}(f,A) = \mathrm{cond}_{\mathrm{abs}}(f,A)\,\frac{\|A\|}{\|f(A)\|}.$$

Combining these two results allows us to bound the relative condition number from above. This bound will be cheap to compute provided that the cost of computing $L_\epsilon$ and $\max_{z\in\Gamma_\epsilon}|f(z)|$ is sufficiently small.
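The notion of an “unsatisfactorily large” estimate described at the start of this section amounts to a one-line check: the bound on the relative condition number times the unit roundoff should not exceed the tolerated relative error. A minimal sketch, in which the function name `bound_is_satisfactory` is a hypothetical helper and the tolerance 1e-4 is the example value quoted in the text:

```python
u = 2.0 ** -53                      # unit roundoff in IEEE double precision
target_rel_error = 1e-4             # example tolerance from the text

# Largest condition number for which cond * u stays within the tolerance.
max_acceptable_cond = target_rel_error / u

def bound_is_satisfactory(cond_bound):
    """True if cond_bound * u stays within the target relative error."""
    return cond_bound * u <= target_rel_error

print(max_acceptable_cond)
print(bound_is_satisfactory(1e8), bound_is_satisfactory(1e13))
```

Under these assumptions a pseudospectral bound below roughly 9e11 already certifies four correct digits, even if it overestimates the true condition number by a few orders of magnitude; only larger bounds call for a more accurate estimator.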

In order to use this bound in practice we must choose which matrix norm to consider, the value of $\epsilon$, and the contour $\Gamma_\epsilon$. We will use the Frobenius norm since, in this norm, there is an explicit formula for the condition number which can be computed using [11, Alg. 3.17]. However, the pseudospectrum is not defined in the Frobenius norm, since it requires the use of an induced norm. To resolve this, one can easily show that the absolute condition number in the Frobenius norm is bounded above by $\sqrt{n}$ times the condition number in the 2-norm, where $n$ is the size of the matrix. Hence we have

$$\mathrm{cond}_{\mathrm{rel}}(f,A,\|\cdot\|_F) \le \sqrt{n}\,\mathrm{cond}_{\mathrm{abs}}(f,A,\|\cdot\|_2)\,\frac{\|A\|_F}{\|f(A)\|_F}.$$

The right-hand side of this inequality is what we will compute, where the $\mathrm{cond}_{\mathrm{abs}}(\cdot)$ term is bounded above by (12).

It remains to choose $\epsilon$ and $\Gamma_\epsilon$. Looking at (12) we see that, heuristically, in order to minimize the upper bound we would like $\epsilon$ to be reasonably far from 0. Some of our test functions will have power series that are convergent in a circle of radius 1 around the point $z = 0$; for these cases we choose $\Gamma_\epsilon$ to be a circle centered at 0 with radius 0.99 and find the largest $\epsilon$ such that the $\epsilon$-pseudospectral radius lies inside this circle. This is computed using the nonlinear optimization routine fminbnd in MATLAB. When our function has a power series with an infinite radius of convergence we choose $\epsilon = 1$ and take $\Gamma_\epsilon$ to be a circle centered at the mean of the eigenvalues of $A$ ($\gamma = n^{-1}\sum\lambda_i$) with radius equal to the $\epsilon$-pseudospectral radius of $A - \gamma I$. Finally, to find $\max|f(z)|$ on these contours, we again use the nonlinear optimization routine fminbnd in MATLAB. We use psapsr by Guglielmi and Overton [7] to compute the $\epsilon$-pseudospectral radii throughout.

We will compare our pseudospectral method described above (hereafter referred to as condpseudo) in the Frobenius norm against two alternative methods for computing the condition number: funm_condest_fro from the Matrix Function Toolbox [10] and an “exact” method detailed by Higham [11, Alg. 3.17], which we refer to as condold and condexact, respectively.

The method condold uses finite difference approximations to the derivatives of the matrix function and has $O(n^3)$ cost. Meanwhile condexact expresses the condition number as the 2-norm of a matrix $K_f(A) \in \mathbb{C}^{n^2\times n^2}$, called the Kronecker form of the matrix function, which must be computed explicitly with cost $O(n^5)$. Therefore condexact is impractical for all but the smallest problems.
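The $O(n^5)$ cost of the Kronecker form approach can be seen from a direct construction: one Fréchet derivative, at $O(n^3)$ cost, for each of the $n^2$ unit matrices. The sketch below is an illustration in that spirit for $f = \exp$ using SciPy's `expm_frechet`; it is an assumption-laden sketch and not a reproduction of [11, Alg. 3.17], and `kronecker_form_exp` is a hypothetical helper name.

```python
import numpy as np
from scipy.linalg import expm_frechet

def kronecker_form_exp(A):
    """Build K_exp(A) column by column from L_exp(A, E_ij) over the unit matrices E_ij."""
    n = A.shape[0]
    K = np.zeros((n * n, n * n))
    for j in range(n):
        for i in range(n):
            E = np.zeros((n, n))
            E[i, j] = 1.0                              # unit matrix E_ij
            L = expm_frechet(A, E, compute_expm=False)
            K[:, j * n + i] = L.flatten(order="F")     # vec stacks columns
    return K

# Illustrative real test matrix (assumption): a 2x2 Jordan block.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])

# Absolute condition number in the Frobenius norm: the 2-norm of the Kronecker form.
cond_abs_fro = np.linalg.norm(kronecker_form_exp(A), 2)
print(cond_abs_fro)
```

As a sanity check, for $A = 0$ the Fréchet derivative of the exponential is $L(0, E) = E$, so the Kronecker form is the identity and the condition number is 1.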

Fig. 1. Condition number estimates/bounds for the matrix function corresponding to $f(x) = \log(1+x)$ in the Frobenius norm over 29 test matrices. We have condold and condexact overlapping almost entirely. Left: The condition number estimates/bounds. Right: The ratios of condpseudo and condold to condexact.

We compare the three different algorithms on four matrix functions corresponding to the scalar functions $\log(1+x)$, $(1+x)^{1/15}$, $\exp(x)$, and $\cos(x)$. The first two of these have a power series representation which is convergent for $|x| < 1$, whilst the latter have globally convergent power series. The matrix functions are computed using logm and expm in MATLAB, along with cosm from [3], and powerm_fre_new by Higham and Lin [12].

For each function we use 29 test matrices (of size $n = 10$) from the Matrix Computation Toolbox [9] and plot both the computed condition numbers and the ratio of condpseudo and condold to condexact. For the first two functions, where we need all eigenvalues to lie within the region of convergence, we transform each matrix to have eigenvalues centered at 0 with $\|A\|_2 = 1$ so that all eigenvalues lie within the unit disk.

In Fig. 1 we see the condition number as computed by the three methods, for each of the 29 test matrices, using the function $f(x) = \log(1+x)$. The results are ordered by decreasing condition number as computed by condexact. We can immediately see that condpseudo is indeed an upper bound and is usually 2–4 orders of magnitude larger than the exact condition number. Meanwhile condold is generally a very good estimate of the condition number.

Next in Fig. 2 we compare the condition numbers when using the function $f(x) = (1+x)^{1/15}$. In this case we see very similar behaviour to the previous function: condold and condexact are almost identical whilst condpseudo provides an upper bound that is generally 2–4 orders of magnitude larger than the true condition number.

Fig. 3 shows the results when using $f(x) = \exp(x)$. In this case condpseudo performs slightly better than previously, being only 1–3 orders of magnitude above condexact on most test problems.

Fig. 2. Condition number estimates/bounds for the matrix function corresponding to $f(x) = (1+x)^{1/15}$ in the Frobenius norm over 29 test matrices. We have condold and condexact overlapping almost entirely. Left: The condition number estimates/bounds. Right: The ratios of condpseudo and condold to condexact.

Fig. 3. Condition number estimates/bounds for the matrix function corresponding to $f(x) = \exp(x)$ in the Frobenius norm over 29 test matrices. We have condold and condexact overlapping almost entirely. Left: The condition number estimates/bounds. Right: The ratios of condpseudo and condold to condexact.

Fig. 4. Condition number estimates/bounds for the matrix function corresponding to $f(x) = \cos(x)$ in the Frobenius norm over 29 test matrices. We have condold and condexact overlapping almost entirely. Left: The condition number estimates/bounds. Right: The ratios of condpseudo and condold to condexact.

eigenvalues extending far into the complex plane: as the cosine function grows exponentially in the direction of the imaginary axis, $|\cos(z)|$ is extremely large on the chosen contour.

Each of these four cases shows that condpseudo provides a reliable upper bound on the condition number and is generally just a few orders of magnitude above the true value. We also note that, since condpseudo needs only the scalar function $f(x)$ and does not need to compute the derivatives of a matrix function, via finite differences or otherwise, it can easily be applied to very complicated matrix functions such as $\cos(\sqrt{A})$ (for which no specially designed algorithms exist) with no modification. In this particular case we would apply our algorithm to the function

$$f(A) = \sum_{k=0}^{\infty}\frac{(-1)^k A^k}{(2k)!},$$

which is analytic and equivalent to $\cos(\sqrt{A})$ away from the origin, regardless of which branch of the square root function is selected. Matrix functions such as this can arise in finite element semidiscretization of the wave equation. For example, the second order differential equation

$$y''(t) + Ay(t) = g(t), \qquad y(0) = y_0, \qquad y'(0) = y_0',$$

has the solution

$$y(t) = \cos(\sqrt{A}\,t)\,y_0 + (\sqrt{A})^{-1}\sin(\sqrt{A}\,t)\,y_0' + \int_0^t (\sqrt{A})^{-1}\sin\!\left(\sqrt{A}\,(t-s)\right)g(s)\,ds,$$

where $\sqrt{A}$ denotes any square root of $A$ [6, p. 124], [18]; see also [11, Prob. 4.1] for the case $g(t) = 0$.
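Because condpseudo only needs the scalar function on the contour, the power series for $\cos(\sqrt{A})$ above matters only through its scalar counterpart $\sum_k (-1)^k z^k/(2k)!$. The sketch below, with an arbitrarily chosen complex test point and a hypothetical helper `cos_sqrt_series`, illustrates that the scalar series agrees with $\cos(\sqrt{z})$ and that the value does not depend on the branch of the square root.

```python
import numpy as np
from math import factorial

def cos_sqrt_series(z, terms=30):
    """Partial sum of sum_k (-1)^k z^k / (2k)!, the scalar series for cos(sqrt(z))."""
    return sum((-1) ** k * z ** k / factorial(2 * k) for k in range(terms))

z = 1.5 - 2.0j                          # illustrative test point (assumption)
root = np.sqrt(z)                       # principal branch of the square root

series_value = cos_sqrt_series(z)
branch_plus = np.cos(root)
branch_minus = np.cos(-root)            # the other branch of the square root

print(abs(series_value - branch_plus), abs(branch_plus - branch_minus))
```

Since cosine is even, both branches give the same value, which is why the series defines a single-valued function that can be maximized over a contour without tracking branch cuts.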

Our next experiment compares the speed of estimating the condition number as the size of the matrix grows. Here we focus on the function $f(x) = (1+x)^t$ for $t = 1/5, 1/15, 1/52$ and for $n$ between 10 and 1000. For each value of $n$ we take $A$ to be a matrix with elements normally distributed with zero mean and unit variance, scaled to have unit norm. Since condexact is an $O(n^5)$ algorithm it becomes increasingly impractical as $n$ grows: instead we will compare condpseudo against condold and a different algorithm, condhili. This latter algorithm, designed by Higham and Lin [12], estimates the condition number of matrix powers in a similar manner to condold but actually computes the derivatives of the matrix function, as opposed to using finite difference approximations. The algorithm is designed to estimate the condition number in the 1-norm but has similar computational complexity to condold, which works in the Frobenius norm. This experiment was run on a laptop with an Intel dual-core i7 processor using MATLAB R2014b.

Fig. 5. Runtime in seconds and resulting speedup when computing the matrix function corresponding to $f(x) = (1+x)^t$ for $t = 1/5, 1/15, 1/52$ using condpseudo, condold, and condhili as $n$ varies between 10 and 1000. The x-axis shows $n$, the size of the test matrix, whilst the y-axis denotes the runtime and speedup, respectively. Left: Runtime in seconds when running each algorithm. Right: Speedup when using condpseudo compared to condold and condhili.

of $t$ whilst the right-hand plot shows the speedup when using condpseudo relative to the other methods. The x-axis shows $n$, the size of the matrices, whilst the y-axis shows the runtime in seconds (left-hand plot) and the speedup obtained (right-hand plot). We see that condpseudo is much cheaper than the alternatives for fairly small matrices and appears to settle at around 1.5 times faster than condold and 2 times faster than condhili, respectively, on this machine. This would suggest that using condpseudo is beneficial for applications where low-accuracy solutions are required and is particularly good in situations where lots of small matrix functions need to be computed.

6. Conclusions

The main results in this paper are as follows. We have obtained an explicit expression for the remainder term of a matrix function Taylor polynomial (Theorem 2.2). Combining this with use of the $\epsilon$-pseudospectrum of $A$ leads to upper bounds on the condition numbers of $f(A)$. Our numerical experiments demonstrated that our bounds can be used for practical computations: they provide a cheap upper bound on the condition number which is often only a few orders of magnitude too large. This means that our bounds could be used as a quick estimate of the condition number and, if this estimate is too large, for instance if the estimate suggests that an insufficient number of correct significant figures might be obtained in computing $f(A)$, then existing methods can be used to obtain the condition number more accurately.

Another benefit of our approach is that it can easily be applied to bound the condition number of complicated matrix functions such as $\cos(\sqrt{A})$ without modification, as there are currently no specialized methods for computing such quantities.

the behaviour of existing algorithms to compute matrix functions (see the discussion of Corollary 3.4). This will be the subject of future work.

Acknowledgements

We are very grateful to Nick Higham for numerous helpful comments on earlier drafts of this work and to the referee who helped to significantly increase the quality of the manuscript.

References

[1] Lars V. Ahlfors, Complex Analysis, third edition, McGraw-Hill, New York, ISBN 978-0-0700-0657-7, 1979.

[2] Awad H. Al-Mohy, Nicholas J. Higham, The complex step approximation to the Fréchet derivative of a matrix function, Numer. Algorithms 53 (1) (2010) 133–148, http://dx.doi.org/10.1007/s11075-009-9323-y.

[3] Awad H. Al-Mohy, Nicholas J. Higham, Samuel D. Relton, New algorithms for computing the matrix sine and cosine separately or simultaneously, SIAM J. Sci. Comput. 37 (1) (2015) A456–A487, http://dx.doi.org/10.1137/140973979.

[4] Philip I. Davies, Nicholas J. Higham, A Schur–Parlett algorithm for computing matrix functions, SIAM J. Matrix Anal. Appl. 25 (2) (2003) 464–485, http://dx.doi.org/10.1137/S0895479802410815.

[5] David Elliott, Truncation errors in Padé approximations to certain functions: an alternative approach, Math. Comp. (ISSN 0025-5718) 21 (99) (1967) 398–406.

[6] F.R. Gantmacher, The Theory of Matrices, vol. 1, Chelsea, New York, ISBN 0-8284-0131-4, 1959.

[7] Nicola Guglielmi, Michael L. Overton, Fast algorithms for the approximation of the pseudospectral radius of a matrix, SIAM J. Matrix Anal. Appl. 32 (4) (2011) 1166–1192.

[8] Kurt Hensel, Über Potenzreihen von Matrizen, J. Reine Angew. Math. 155 (42) (1926) 100–110.

[9] Nicholas J. Higham, The Matrix Computation Toolbox, http://www.maths.manchester.ac.uk/~higham/mctoolbox.

[10] Nicholas J. Higham, The Matrix Function Toolbox, http://www.maths.manchester.ac.uk/~higham/mftoolbox.

[11] Nicholas J. Higham, Functions of Matrices: Theory and Computation, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, ISBN 978-0-898716-46-7, 2008.

[12] Nicholas J. Higham, Lijing Lin, An improved Schur–Padé algorithm for fractional powers of a matrix and their Fréchet derivatives, SIAM J. Matrix Anal. Appl. 34 (3) (2013) 1341–1360, http://dx.doi.org/10.1137/130906118.

[13] Nicholas J. Higham, Samuel D. Relton, Higher order Fréchet derivatives of matrix functions and the level-2 condition number, SIAM J. Matrix Anal. Appl. 35 (3) (2014) 1019–1037, http://dx.doi.org/10.1137/130945259.

[14] S.-H. Lui, A pseudospectral mapping theorem, Math. Comp. 72 (244) (2003) 1841–1854.

[15] Roy Mathias, Approximation of matrix-valued functions, SIAM J. Matrix Anal. Appl. 14 (4) (1993) 1061–1063.

[16] S.C. Reddy, P.J. Schmid, D.S. Henningson, Pseudospectra of the Orr–Sommerfeld operator, SIAM J. Appl. Math. 53 (1993) 15–47, http://dx.doi.org/10.1137/0153002.

[17] Walter Rudin, Real and Complex Analysis, third edition, McGraw-Hill, New York, ISBN 0070542341, 1986.

[18] Steven M. Serbin, Rational approximations of trigonometric matrices with application to second-order systems of differential equations, Appl. Math. Comput. 5 (1) (1979) 75–92, http://dx.doi.org/10.1016/0096-3003(79)90011-0.

[19] Lloyd Nicholas Trefethen, Mark Embree, Spectra and Pseudospectra: The Behavior of Nonnormal Matrices and Operators, Princeton University Press, 2005.

[20] H.W. Turnbull, A matrix form of Taylor’s theorem, Proc. Edinb. Math. Soc. (2) 2 (1930) 33–54, http://dx.doi.org/10.1017/S0013091500007537.
