HAL Id: hal-01881736
https://hal.archives-ouvertes.fr/hal-01881736
Submitted on 28 Nov 2019
HAL is a multi-disciplinary open access
archive for the deposit and dissemination of
sci-entific research documents, whether they are
pub-lished or not. The documents may come from
teaching and research institutions in France or
abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire HAL, est
destinée au dépôt et à la diffusion de documents
scientifiques de niveau recherche, publiés ou non,
émanant des établissements d’enseignement et de
recherche français ou étrangers, des laboratoires
publics ou privés.
Second order accurate asynchronous scheme for
modeling linear partial differential equations
Asma Toumi, Guillaume Dufour, Ronan Perrussel, Thomas Unfer
To cite this version:
Asma Toumi, Guillaume Dufour, Ronan Perrussel, Thomas Unfer. Second order accurate
asyn-chronous scheme for modeling linear partial differential equations. Applied Numerical Mathematics,
Elsevier, 2017, 121, pp.115-133. �10.1016/j.apnum.2017.06.014�. �hal-01881736�
Second
order
accurate
asynchronous
scheme
for
modeling
linear
partial
differential
equations
✩
Asma Toumi
a,
∗
,
Guillaume Dufour
a,
1,
Ronan Perrussel
b,
Thomas Unfer
b aONERA,2avenueEdouardBelin,31400Toulouse,FrancebLAPLACE-ENSEEIHT2,rueCharlesCamichelBP7122,31071ToulouseCedex7,France
We propose an asynchronous methodfor the explicitintegration of multi-scale partial differentialequations.Thismethodisrestrictedbyalocal CFL(CourantFriedrichsLewy) condition rather than the traditional global CFL condition. Moreover, contrary to other local time-stepping(LTS) methods,the asynchronousalgorithmpermits the selection of independenttimestepsineachmeshelement.WederivedanasynchronousRunge–Kutta 2(ARK2)schemefromastandardexplicitRunge–Kuttamethodandweproved thatthe ARK2 schemeis secondorderconvergent. Comparing withthe classicalintegration, the asynchronousschemeiseffectiveintermsofcomputationtime.
1. Introduction
Numerical simulationhasbecome a centraltool for themodeling ofmanyphysicalsystems (fluiddynamics,plasmas, electromagnetism,etc.).Multi-scalephenomenamaketheintegrationofthesephysicalsystemsdifficultintermsofaccuracy andcomputation time. Time-steppingintegration techniquesused forsolving suchproblems generallyfall intotwo cate-gories:explicitandimplicitschemes.Intheexplicitschemes,allunknownvariablesarecomputedatthecurrenttimelevel fromquantitiesalreadyavailable.TimestepisthenlimitedbythemostrestrictiveCFLconditionoverthewhole computa-tiondomain.Theimplicitmethodallowtoovercomethistimestepconstraintbutcomeswithanadditionalcomputational costarisingfromsolvinglargelinearsystems.Thustheimplicitapproachbecomeseffectiveintermsofcomputationalcost onlyifthetimestepsusedarequitelargewithrespecttotheCFLcondition.Howeverthisisnotalwayspossible,especially whenstronglycoupledandmulti-scale phenomenasuchasmicrowaveplasmagenerationarestudied.
Toovercomesuchproblems,anumberoflocaltime-steppingapproacheshavebeendeveloped.Thesemethodsare re-strictedby a local CFL condition ratherthan the traditionalglobalCFL condition. In [14], Osher andSanders proposed a local time-steppingscheme forone-dimensional scalar conservation laws.They gavea thorough analysis ofa first order spatial discretization witha localforward Euler time-stepping scheme. This scheme allows each element to take either an entire time step or some fixed number ofsmaller steps. Tang andWarneck proposed in [17] a class of high resolu-tion local time step schemesfor hyperbolic conservations by projecting the solutionincrements ateach localtime step. Savcenco et al.[15] constructed a multirate schemefor parabolicproblems. This schemewas obtainedby adaptation of
✩ ThisworkwasfundedbytheFrenchNationalResearchAgencyundercontractNo. ANR-11-MONU-0019.
*
Correspondingauthor.E-mailaddresses:[email protected](A. Toumi),[email protected](G. Dufour),[email protected](R. Perrussel),
[email protected](T. Unfer). 1 Fax:+33(0)562252593.
an implicitRosenbrock-type scheme.The ideais tocompute apredictionforall thecells ofthemeshwiththetrapezoid formula. Then the finemeshis updated usinginformationfromthecoarse meshby quadraticinterpolation. Dawsonand Kirby[2],developedupwindmethodsforsolvingconservationlawswhichallowlocaltimerefinementtobecoupledwith localspatialrefinement.Inthatschemealimiterisappliedwhichisadaptedtotheoutcomeofpreviousstages.W. Hunds-dorfer etal.[8],notedthat severalmultirate schemes forconservationlaws haveone ofthefollowing defects: thereare schemes thatare locallyinconsistant,e.g.[14,2],andschemes thatare notmass-conservative,e.g. [17,15].Theydiscussed these two defects for one-dimensional conservationlaws. In the same context, an error analysisis presented in [7], for explicitpartitionedRunge–Kuttamethodsandmultiratemethodsappliedtoconservationlaws.Hundsdorferandal.showed that the differentmultirate methods studiedin[8] leadto orderreduction ofthe schemes.Toguarantee mass conserva-tion, flux-based decompositions are studied. In this case, the accuracy may deteriorate. Another approach developed by BergerandOliger[1]involvesautomaticallytakingsmallertimestepswherethemeshisrefined.Intheirapproachrefined grids arelaid overregions of the coarsemesh. Informationis then passed betweenthe grids by means ofinjectionand interpolation. Flaherty etal.[3]developed a parallel,adaptive discontinuous Galerkinmethodwitha localforward Euler schemewhichreliesoninterpolatingvaluesintime atinterfacesbetweentimestepsofdifferentsizes.Thisscheme, how-ever, doesnot appear to conserveflux along these interfaces.Also, only firstorder intime methodsare discussed. Note that for thesedifferentalgorithms, local time steps are usually selectedto be fractions ofthe globaltime step. Recently a newtime integrationapproachhasbeenappliedtoequationsofnon-linearelastodynamics [9].Itisbasedonadiscrete spacetimeformofHamilton’svariationalprinciple.Thisalgorithmpermitstheselectionofindependenttime stepsineach meshelement.However,thisapproachisonlyapplicabletoHamiltoniansystems.In[11]OmelchenkoandKarimabadi pre-sentedan asynchronousapproachtoa diffusion–advection–reactionequationinone dimension.Theirmethodisbasedon discrete-eventsimulation(DES). Theydefinedtheconcept of“fluxcapacitor”, avariableinwhichtheystoredtheflux be-tween two cells until a later event. Mass conservationis guaranteed whenthe neighboring cell isupdated andthe flux capacitors areemptied.Inordertoimprovetheaccuracy oftheirasynchronous method,Omelchenko andKarimabadi de-velopedin [12]a second orderDESalgorithmandshowedthat,atleastnumerically, thesecond orderisindeedattained on one-dimensionalgasdynamicstestproblems.Morerecently,theyextendedin[13] theirDESalgorithmtomultiple di-mensions inthecaseof“logicallyuniformmeshes”. In[16],V.A.Semiletov andS.A.Karabasovproposed an asynchronous time-stepping algorithm for the Compact Accurately Boundary Adjusting high-REsolutionTechnique (CABARET), which is an Eulermethod fornon-linear aeroacoustic problems. Numericaltests up to 3D show that the asynchronous algorithm maintains thesamesecond-orderconvergencerateastheoriginal single-timesteppingschemeandthat italsodecreases the absolutenumerical error. However, the question of the performance in terms of computational time remains open. In [10]V.A.Semiletov andS.A.Karabasov adaptedtheir asynchronous methodforsolvingfluid dynamicsequations.Their approach isbasedon atransformationof thegoverningequationsinspaceandtime inorderto rewritethe initial prob-lem on a uniform Cartesian grid, with adapted conditions at the grid interfaces. As for the numerical implementation of their scheme, the authors combined their new asynchronous method with their CABARET scheme presented in [16] which results ina quasi-second orderscheme.The correspondingnumericaltestswere limitedto two-dimensional prob-lems.
Inthispaperwefocusontheasynchronous methodproposedbyone oftheco-authorT.Unfer.Hismethoduses local stabilitycriteriaandcanbeusedonanarbitrarymesh.In[22],hedevelopedanupwindasynchronousforwardEulerscheme forthetransportequationinonedimension.Westudiedtheextensionofthismethodtoanarbitraryspacedimensionand we proved through a thoroughanalysis that the asynchronous scheme isfirst order convergent. In addition,we noticed thattheasynchronousschemereducesnumericaldiffusionandCPUtimeincomparisontotheclassicalfirstorderscheme. Toimprovetheconvergencerateoftheasynchronousscheme,T.Unferproposed in[21] anextensionoftheasynchronous integration procedure to higher order. This method was explored in some details from a practical viewpoint. We then explored it froma moretheoretical viewpointandweproved that theasynchronous schemeasit was presentedin[21] isatmostoforderone.ThederivationofhigherorderexplicitLTSmethodsisaseriousproblemandonlyafewmethods are available. In [18], A. Taube et al. proposed an ADER-DG scheme withtime-accurate LTS forthe Maxwell equations. Theyfound vianumericalstudythattheir approachmaintainsthe accuracyoftheglobaltime-steppingADER-DGscheme “withaverysmallincreaseincomputationaleffort”.Grote etal.[4]derivedan explicitLTS schemefromstandardexplicit Runge–Kutta(RK)methods.TheyhaveprovedthattheirschemepreservestheaccuracyoftheoriginalRKscheme.However, theirapproachisapplicableonlyinthecaseoftwodistinctregionsinthemesh:a“coarse”regionwiththelargerelements anda“fine”regionwiththesmallerelements.Inthefineregion,thetimestepmustbeafractionofthecoarsetime step. Here,wedevelopanasynchronousRK2schemeinanarbitrarymesh.
This paperisoutlined asfollows.In section 2, wesummarizebasicideas ofthe asynchronousmethodology. Then,we consideran arbitraryspacediscretizationofa givenlinearpartialdifferentialequationandweapplytheARK2method.In section 3,we provethattheasynchronousschemeissecond orderconvergent.Next,numericalexperimentsthatillustrate the convergencerateof theasynchronous schemeand validate thegain incomputation time are presented insection 4. Finally,someconclusionsandfinalremarksaregiveninsection5.
2. Numericalasynchronoussimulation
Forstability reasons,anyexplicitmethod hasto fulfilla time-step restriction.Thismayreduce the methodefficiency, becauseautomaticgridgenerationtoolsforunstructuredmeshesmayproduceverysmallcellsingeometricallycomplicated partsofthecomputationaldomain.Toovercomethisproblem,weproposeanasynchronousschemewithlocaltimesteps.
We consider
a polygonal domain in
R
d (d≥
1). LetT = {
Ki
,
i=
1,
. . . ,
N}
be a partition of the domainin N polyhedral volumes Ki. For all i
∈ J
1,
NK
,N (
i)
= {
j,
|
Ki∩
Kj|=
0}
is the set of indices of neighboring volumes of Ki;|
Ki∩
Kj|
denotesthe(
d−
1)
positivemeasureofKi∩
Kj.LetN
+(
i)
(resp.N
−(
i)
)bethesetoftheindicesoftheneighbors oftheelementKiwhoreceive(resp.send)theoutflow(resp.inflow)fluxesfrom(resp.to) Ki.Wesupposeanarbitraryspacediscretizationofagivenlinearpartialdifferentialequationwithoutasourceterm,forthe sakeofsimplicity.Thisleadstoasystemofcoupledordinarydifferentialequations
dyi
dt
(
t)
=
Biy(
t),
∀
Ki∈
T
,
(1)whereyiandy arerespectivelyvectorsthatcontainthedegreesoffreedom(dofs)intheelementKiandinalltheelements, andBiamatrixcomingfromthespacediscretization.
2.1. Asynchronousmethodology
Theasynchronousmethodologyisbasedontwokeyideas.Firstofall,theasynchronousalgorithmpermitstheselection ofindependenttimesteps ineach meshelement.Then,thesequenceofupdatetimesofthecellsisnolongerdefinedby continuously addinga uniformtimestep asintheclassical case. Weneed tointroduce thesequenceof themosturgent refreshtimetag
(
ti)
i≥1 definedby t1
=
min j(
tj)
:=
ti,
t 2=
min min j=i(
tj),
2ti
,
. . .
where for all i, the time step
ti of the element Ki is determined from the local stability restriction. Let
(
tri)
r be the sequenceofupdatetimesofagivenelementKiwhichisdefinedbytri
=
rti
,
∀
r≥
0.
Foreachelement,thesourcetermdependsonlyonthecurrentcellwhereasthefluxesdependalsoonneighbors.Hence thesecondbasicideaconsistsinseparatingthesourcetermandthefluxes.Inotherwords,weneedtosplitthetermBi (1) inthreecontributions: Vi foralocalvolumecontribution, Fi+,j foralocalfluxcontributionandFi−,j foraflux contribution fromtheneighbor.Then(1)becomes
dyi
dt
(
t)
=
Viyi(
t)
+
j∈N (i)F+i,jyi
(
t)
+
Fi−,jyj(
t),
∀
Ki∈
T
.
(2)Theconcept oftheasynchronous timeintegrationcan besummarizedinthreemainphases.First, atthestart-up time all densities andfluxes are initialized. Then, a second phase iscarried out by continuously applyingthe following steps untiltheglobalsimulationclockisadvancedpastthesimulationfinishtime;seealsoFig. 1:findthemosturgentcelltobe refreshedbyusingtheCFLcondition,computethevaluesofthedensitieswhichareneededtocomputethefluxes,compute thenext refreshtimeofthe currentcellandthenupdate thevaluesofthefluxes. Finally,thethird phaseistobuild the solutionattheoutputtime.
ThenumberofupdatedcellsinthesecondstepinFig. 1dependsonthetreatedsystemandonthespacediscretization method. Forexample, in the caseof an upwind scheme,we need to update only the outflowfluxes andthen only the currentcell Ki andtheneighborsKj,
∀
j∈
N
+(
i)
are involved;seealso[20].Whereasforacentered scheme,we needat leasttoupdatealltheneighbors.Inallcases,onlyfewcellsareinvolved.Note that the critical point for speedingup an asynchronous method interms of computation time is the searching algorithmforthemosturgenttimetagtobetreated,somealternativeoptionsarepresentedin[22].
2.2. ARK2algorithm
TopresenttheasynchronousRunge–Kutta2algorithm,westartwiththesemi-discretizedsystemwritteninthefollowing formasexplainedinsection2
dyi
dt
(
t)
=
Viyi(
t)
+
j∈N (i)Fig. 1. The four steps of the second phase of the asynchronous algorithm with five elements.
Wedenoteby ymi theapproximationofyi
(
mti
)
andbyy
t m i theapproximationof dyi dt
(
mti
)
.Pre-computedupdatetimescorrespond tomultiplesof
ti ineach Ki.The timeadvanceisperformedusingtheDiscreteTimeScheduler(DTS) algorithmasdefinedin[22],tcuristhecurrentsimulationtime.
Attcursuchthattcur
/
ti isaninteger p,wehavetoperformthefollowingoperations1. Computethetentativedofsattcur
˜
yi
(
tcur)
= ˜
yi(
tpre,i)
+ (
tcur−
tpre,i)
˜
yt i
(
tmid,i),
where tmid,i
=
tcur−
tcur−tpre,i 2
andtpre,i isthe last time wherethe tentative dofs hasbeen updated. The tentative mid-pointslopeisdefinedasfollows
y
˜
t i
(
tmid,i)
=
Vi yip−1+
ti 2
y
t p−1 i
+
j∈N+(i)Fi+,j yip−1
+
tmid,i,j− (
p−
1)
tiy
t p−1 i
+
j∈N−(i)Fi−,j ynj j
+
tmid,i,j−
njtj
y
t nj j
,
where nj=
tpre,i
tj , tmid,i,j
=
tnext,i,j−
tnext,i,j−
tpre,i,j 2with tpre,i,j
=
max(
njtj
,
(
p−
1)
ti)
and tnext,i,j=
min((
nj+
1)
tj,
pti
)
.2. Foralltheneighborsoftheelementi,i.e. j
∈
N (
i)
,weupdatethetentativevalues˜
yj
(
tcur)
= ˜
yj(
tpre,j)
+ (
tcur−
tpre,j)
y
˜
t j
(
tmid,j),
tpre,jisthelasttimewherethetentativedofshasbeenupdatedandtmid,j isthe“mid-point”timefortheelement j. 3. Affectthedofsandupdatethelastrefreshtimes
ypi
= ˜
yi(
tcur),
tpre,i=
tcurand tpre,j=
tcur,
∀
j∈
N
(
i).
4. Afterallelements Ki suchthattcur
/
tiisanintegerhavebeenupdated,updatetheirslopesy
t p i
=
Viyip+
j∈N (i) Fi+,jyip+
Fi−,j ynj j+ (
pti
−
njtj
)
y
t nj j
,
where nj=
tcur
tj
.
Fig. 2. Notation:the temporalevolutionofthesystembetweentwoinstantstri
i and t ri+1
i wheretheelements j andn arethe neighborsofi (left).
DependencyoftheelementKiontheotherelementsofthemesh(right).(Forinterpretationofthereferencestocolorinthisfigure,thereaderisreferred
tothewebversionofthisarticle.)
Thefinalsimulationtimeshouldcorrespondtoacommonmeetingtimeofallcellsi.e. amultipleofthelowestcommon multiple(LCM)ofallthelocaltimesteps.Ifitisnotthecase,weuseaninterpolationatleastoforder2.
3. Propertiesoftheasynchronousscheme
3.1. ConvergenceoftheARK2scheme
The asynchronousRK2 schemeisderived fromthe standardexplicitRunge–Kutta 2method.We shallnowprove that theARK2schemeissecondorderconvergent.
3.1.1. Notations
1. Let usbegin withsome notations concerning the temporal evolution of the systembetween two successive update timesofanelement Ki whicharemultiplesof
ti;seealsoFig. 2(left)
(a) Lettri+1
i
:= (
ri+
1)
ti=
tcur≤
T whereT isthefinaltimeofthesimulation.Supposethattheelement Kihasbeenupdatedm timesbetweentri
i andt ri+1
i attheinstants:t1pre,i,t2pre,i,...,tmpre,i,0
≤
m≤
ti minj∈N (i)(tj)+
1. (b) Letδ
ik=
tk pre,i−
t k−1pre,i, for all 2
≤
k≤
m,δ
i1=
t1pre,i−
t ri i ,δ
m+1 i=
t ri+1 i−
tmpre,i and m+1 k=1δ
ki=
ti. Then, tkmid,i=
tkpre,i
− δ
ki/
2 for1≤
k≤
m andtmmid+1,i=
tri+1i
− δ
m+1 i/
2.(c) Supposethatforall j
∈
N (
i)
wehavethefollowingassumptions: i. betweentrii andt ri+1
i ,the element j has beenupdated z ri
j timesatinstantswhich are multiplesof
tj then 0
≤
zrij
≤
m.Forexample,inFig. 2(left),z rin
=
1 andzrji=
5 wheren and j aretwoneighborsofi.ii. rj
tj
≤
riti andattheinstanttkpre,i,rkj
tj denotethelast updatetimeoftheelement j whichismultipleof
tj justbefore tkpre,i.Notethat theremaybe situationswhere r k+1
j
tj
=
rkjtj.In Fig. 2forinstance,forthe neighborn,r4 j
tj
=
r3jtj
=
r2jtj
=
r1jtj
=
rntn. iii. tk mid,i,j
=
tknext,i,j−
tk next,i,j−
tkpre,i,j 2=
t k next,i,j+
tkpre,i,j 2 ,with: A. tkpre,i,j=
max(
rkjtj
,
riti
)
; B. tknext,i,j=
min((
rkj+
1)
tj,
(
ri+
1)
ti)
.2. Theupdate ofan element Ki involvesitsneighbors.Theneighborsof Ki dependontheir neighbors,theneighborsof the neighborsof Ki involves their neighbors and soon. Then, we introduce thefollowing notations; see also Fig. 2 (right).
(a) Theelements Kj, j
∈
N (
i)
,aretheneighborsofKi.(b) Theelements K{i,1},
{
i,
1} ∈
N (
j)
,aretheneighborsofKj andthentheneighborsoftheneighborsofKi. (c) Theelements K{i,2},{
i,
2} ∈
N ({
i,
1})
,aretheneighborsoftheneighborsoftheneighborsofKi.Lethri
{i,ts} bethenumberofdependenciespresentedbytheelement Ki betweent
ri
i andts,wherets isthetimeofthe last synchronizationofthesolutionwhichcanparticularlybe theinitialtime t0.Inother words,theelements K
{i,hrii,ts} usetheirvaluescomputedatthetimeofthelastsynchronizationofthesolution.Notethatthestudyofthedependency will benecessaryforthetheoreticalstudy oftheARK2schemewhichneeds thecontrolofthetemporal evolutionof the system.However, forthe asynchronousalgorithm,theupdateofa givenelementinvolvesonlytheneighborsand then noneedoftheneighborsoftheneighbors.Thisdependencyisimplicitanditishiddenbehindtheintermediate stepsofthealgorithm.
3. Letusintroducesomeelementsfortheidentification:
(
Bx)
i=
Vixi+
j∈N+(i) Fi+,jxi+
j∈N−(i) Fi−,jxj(
B Pix)
i=
Vixi+
j∈N+(i) Fi+,jxiwhere Piisadiagonalprojectionmatrixonthedofoftheelementi.
Proposition1.Theevolutionofanygivencelli betweenthetwoinstantstri
i
=
ritiandtiri+1
=
t rii
+
tiisgivenbythefollowingexpression yri+1 i
=
y ri i+
ti Vi yri i+
ti 2
y
t ri i
+
m+1 k=1
δ
kij∈N+(i) Fi+,j yri i
+
tkmid,i,j−
riti
y
t ri i
+
m+1 k=1
δ
ikj∈N−(i) Fi−,j yr k j j
+
tkmid,i,j−
rkjtj
y
t rk j j
.
Proof. Bydefinition,wehave
yri+1 i
= ˜
y tm pre,i i+ δ
m+1 i˜
yt i
(
tmmid+1,i)
= ˜
yt m−1 pre,i i+ δ
m i˜
yt i
(
tmmid,i)
+ δ
mi +1y
˜
t i
(
tmmid+1,i)
..
.
=
yri i+
m+1 k=1
δ
ki˜
yt i
(
tkmid,i)
whereforall1
≤
k≤
m+
1y
˜
t i
(
tkmid,i)
=
Vi yri i+
ti 2
y
t ri i
+
j∈N+(i) Fi+,j yri i+
tkmid,i,j−
riti
y
t ri i
+
j∈N−(i) Fi−,j yr k j j+
tkmid,i,j−
rkjtj
y
t rk j j
Toconclude,wejustusethat
km=+11δ
ki=
ti.2
Proposition2.Foranygivenelementi,andbetweentwoinstantstri
i andt ri+1
i ,wehavethefollowingresult
m
+1 k=1
δ
ki×
tkmid,i,j=
(
2ri+
1)(
ti)
2
2
,
∀
j∈
N
(
i).
(4)Inparticular,betweenthetwoinstantsts,anytimeofthesynchronizationofthesolution,andts
+
tiwehave∀
j∈
N (
i)
m
+1 k=1
δ
ki×
tkmid,i,j=
(
ti)
2
Proof. Wefirstprovethat,foragivenelement j
∈
N (
i)
,betweentwosuccessivetimesofupdateoftheelement Kjwhich are multiples oftj and which are betweentiri and trii+1, tmidk ,i,j is constant. In other words, if ri
ti
≤ (
rj+
1)
tj<
(
rj+
l)
tj≤ (
ri+
l+
1)
ti,l integer,thenforallk suchthat(
rj+
l)
tj<
tkpre,i≤ (
rj+
l+
1)
tjwehavetkpre,i,j
=
max(
rkjtj
,
riti
)
=
max((
rj+
l)
tj,
riti
)
= (
rj+
l)
tj and tknext,i,j=
min((
rkj+
1)
tj, (
ri+
1)
ti)
=
min((
rj+
l+
1)
tj, (
ri+
1)
ti)
= (
rj+
l+
1)
tj.
Asaresult, tkmid,i,j=
(
2(
rj+
l)
+
1)
tj 2.
(5)Now,weproveEquation(4).Supposethatforall j
∈
N (
i)
,zrij isthenumberoftimesofupdateoftheelement Kjattimes thataremultiplesof
tj betweenri
tiand
(
ri+
1)
ti.Therearethreecases:1. If zri
j
=
0 thenrjtj
≤
riti
≤ (
ri+
1)
ti≤ (
rj+
1)
tj andforall1≤
k≤
m+
1,tmidk ,i,j
=
(
ri+
1)
ti+
riti 2
=
(
2ri+
1)
ti 2,
therefore, m+1 k=1
δ
ki×
tkmid,i,j=
(
2ri+
1)(
ti)
2 2.
2. If zri j=
1 thenwehave m+1 k=1
δ
ki×
tkmid,i,j=
((
rj+
1)
tj+
riti
)
2× ((
rj+
1)
tj−
riti
)
+
((
ri+
1)
ti+ (
rj+
1)
tj)
2×
(
ri+
1)
ti− (
rj+
1)
tj=
1 2(
ri+
1)
2ti2
−
r2iti2
=
(
2ri+
1)(
ti)
2 2 3. If zrij
≥
2 thenrjtj
≤
riti
< (
rj+
1)
tj< ...
< (
rj+
zrji)
tj≤ (
ri+
1)
ti< (
rj+
zrji+
1)
tj.Using(5)we havethe followingresult m+1 k=1
δ
ki×
tkmid,i,j=
((
rj+
1)
tj+
riti
)
2× ((
rj+
1)
tj−
riti
)
+
zrij−1 s=1(
2(
rj+
s)
+
1)
tj 2×
tj+
((
ri+
1)
ti+ (
rj+
z ri j)
tj)
2× ((
ri+
1)
ti− (
rj+
z ri j)
tj)
=
((
rj+
1)
tj)
2 2−
(
riti
)
2 2+
((
ri+
1)
ti)
2 2−
((
rj+
zrji)
tj)
2 2+
zrij−1 l=1(
2(
rj+
l)
+
1)
×
(
tj)
2 2.
Notethatz ri j−1l=1
(
2(
rj+
l)
+
1)
isasumofanarithmeticsequencewithacommondifferenceequaltotwo.Therefore, zrij−1 l=1(
2(
rj+
l)
+
1)
=
zrij−1 l=0(
2(
rj+
l)
+
1)
−(
2rj+
1)
=
zri j(
4rj+
2zrji)
2− (
2rj+
1)
=
zri j 2+
2rj(
zrji−
1)
−
1andthen m
+1 k=1
δ
ik×
tkmid,i,j=
(
2ri+
1)(
ti)
2 2.
Theproofisthencomplete.
2
Proposition3.Foranyelementi andateachtimetri
i
=
riti,rianinteger,wehavethefollowingexpressionoftheslopei
y
t ri i
=
B Piyri i+
j∈N−(i) Fi−,j yrj j+ δ
i,j⎛
⎝
B Pjyrj j+
{i,1}∈N−(j) F{−i,1},jyr{i,1} {i,1}⎞
⎠
+
hrii,ts−1 z=1⎛
⎝
z n=1δ
{i,n−1},{i,n} {i,n}∈N−({i,n−1}) F{−i,n−1},{i,n}⎞
⎠
×
⎛
⎝
B P{i,z}yr{i,z} {i,z}+
{i,z+1}∈N−({i,z}) F{−i,z},{i,z+1}y{r{ii,,zz++11}}⎞
⎠ ,
whereforall j
∈
N
−(
i)
,δ
i,j=
riti
−
rjtj
>
0 andrjtjisthelasttime,multipleof
tj,ofupdatingtheelement j justbefore trii.
Forall1
≤
z<
hrii,ts
−
1,r{i,z}t{i,z}
<
tri
i and
δ
{i,z−1},{i,z}=
r{i,z−1}t{i,z−1}
−
r{i,z}t{i,z}
>
0 wheretsisthetimeofthelastsyn-chronizationofthesolutionandhri
{i,ts}isthenumberofdependenciespresentedbytheelementKibetweent
ri
i andts.Wealsosupposed
thatK{i,0}
≡
Kj,j∈
N (
i)
aretheneighborsofKi. Proof. Theexpressionoftheslopei attrii isgivenbythefollowingformula
y
t ri i
=
Viyrii+
j∈N+(i) Fi+,jyri i+
j∈N−(i) F−i,j yrj j+ δ
i,jy
t rj j
=
B Piyri i+
j∈N−(i) Fi−,j yrjj+ δ
i,jy
t rj j
,
theelements j aretheneighborsoftheelementi andforall j
∈
N
−(
i)
,δ
i,j=
riti
−
rjtj
>
0 whererjtj
=
t rjj
<
t rii is thelasttime,multipleof
tj,wheretheelement j hasbeenupdated.UsingthenotationsillustratedinFig. 2wehavethe followingresult
y
t rj j
=
B Pjyrj j+
{i,1}∈N−(j) F−j,{i,1} yr{i{i,,11}}+ δ
j,{i,1}y
t r{i,1} {i,1}
,
where for all j
∈
N (
i)
and for all{
i,
1}
∈
N (
j)
, r{i,1}t{i,1}
=
t r{i,1}{i,1}
<
trj
j is the last time, multiple of
t{i,1}, where the element K{i,1}hasbeenupdated.Then
δ
j,{i,1}=
rjtj
−
r{i,1}t{i,1}
>
0 andy
t r{i,1} {i,1}
=
B P{i,1}yr{i,1} {i,1}+
{i,2}∈N−({i,1}) F{−i,1},{i,2} yr{i,2} {i,2}+ δ
{i,1},{i,2}y
t r{i,2} {i,2}
.
Likewise,y
t r{i,2} {i,2}
=
B P{i,2}yr{i,2} {i,2}+
{i,3}∈N−({i,2}) F{−i,2},{i,3} yr{i,3} {i,3}+ δ
{i,2},{i,3}y
t r{i,3} {i,3}
.
In the same way, we define for all 3
≤
z<
hrii,ts
−
1, the slopesy
t r{i,z} {i,z} and
δ
{i,z−1},{i,z}=
r{i,z−1}t{i,z−1}
−
r{i,z}t{i,z}
>
0.Inparticular,forz=
hriFig. 3. The temporal evolution of the system between two instants of synchronization tqand tq+1=tq+ t
max; example of five elements.
y
t r {i,hri i,ts−1} {i,hrii,ts−1}
=
B P{ i,hrii,ts−1}y r {i,hri i,ts−1} {i,hrii,ts−1}+
{i,hrii,ts}∈N−({i,hrii,ts−1}) F− {i,hrii,ts−1},{i,hrii,ts} yts {i,hrii,s}+ δ
{i,hri i,ts−1},{i,h ri i,ts}y
t ts {i,hrii,ts}
,
where,t r {i,hrii,ts } {i,hrii,ts}=
ts.Consequently,y
t ri i
=
B Piyri i+
j∈N−(i) F−i,j yrj j+ δ
i,j B Pjyrj j+
{i,1}∈N−(j) F−j,{i,1} yr{i,1} {i,1}+ δ
j,{i,1} B P{i,1}yr{i,1} {i,1}+
{i,2}∈N−({i,1}) F{−i,1},{i,2} yr{i,2} {i,2}+ δ
{i,1},{i,2}...
+ ... + δ
{i,hri i−2},{i,hrii−1} B P{ i,hrii−1}y r {i,hrii−1} {i,hrii−1}+
{i,hrii}∈N−({i,hrii−1}) F− {i,hrii −1},{i,hrii} y0 {i,hrii}+ δ
{i,hrii −1},{i,hrii}(
B y 0)
{i,hrii} andtheny
t ri i
=
B Piyri i+
j∈N−(i) F−i,j⎛
⎝
yrjj+ δ
i,j⎛
⎝
B Pjyrj j+
{i,1}∈N−(j) F{−i,1},jyr{i,1} {i,1}⎞
⎠
⎞
⎠
+
hrii−1 z=1
⎛
⎝
z n=1δ
{i,n−1},{i,n} {i,n}∈N−({i,n−1}) F{−i,n−1},{i,n}⎞
⎠
×
⎛
⎝
B P{i,z}yr{i,z} {i,z}+
{i,z+1}∈N−({i,z}) F{−i,z},{i,z+1}y{r{ii,,zz++11}}⎞
⎠ ,
wesupposedthat K{i,0}
≡
Kj, j∈
N (
i)
aretheneighborsofKi.2
Tostudytheconvergencerateoftheasynchronous scheme,we firstneedtosupposethatthesolutionissynchronized at the instants ts which are multiples of a fixed time step
tmax. In the case of meshes where all the elements have a common meeting time, the solution is automatically synchronized at the instants which are multiples of the lowest common multiple (LCM) of all the local time steps and then
tmax is equal to the LCM. If it is not the case, so that the local time steps are completely independent, we have to suppose that the solution is regularly synchronizedat the instantsmultiplesof
tmax where,inthiscase,
tmax iscomputedinfunctionofthetimesteps
ti.Forinstance,wecan choose
tmax
=
maxi(
ti)
. Note that the hypothesis of regular synchronizationof the solution hasno influence on the asynchronousaspect;seeFig. 3.Infact,thecellsadvanceindependentlybetweentq andtq+1.Itisimportantto notethattheregularsynchronizationofthesolutionwasimposedonlyfordemonstrationpurposes.Numerically,thesynchronization is only madeatthefinal time of thesimulation independentlyof themesh; seethe ARK2algorithm. Let
(
tq)
q≥0 be the sequence defined by tq+1
=
tq+
tmax. The studyof the convergence of the ARK2 scheme will be done between two instantsofsynchronization:tq andtq+1.Let(
tnq)
1≤n≤Nq bethe sequenceofthemosturgentrefreshtime tags betweentq andtq+1,where Nq
=
isi andforalli, si=
tmax
ti
.
(
tki,q)
k is thesequenceofthe temporalevolutionof theelement i betweentq andtq+1 definedbytki,q
=
tq+
kti(Fig. 3). Theorem1.Foralltheelementsi andatanyinstanttri
i,q
=
tq+
ritisuchthattq
<
tri,iq≤
tq+1,thefollowingexpressionholdstrueyt ri i,q i
=
yt q i+ (
riti
)(
B yt q)
i+
(
riti
)
2 2(
B 2ytq)
i (6)+
2ri n=3(
ti)
nα
n,ri(
B Pi)
n−2(
B2ytq)
i+
Rri,iasyn,
(7) where:α
n,0=
0∀
n≥
3,α
3,ri=
α
3,ri−1+
(ri−1)2+(ri−1) 2 ,α
4,ri=
α
3,ri−1+
α
4,ri−1+
(ri−1)2 2 ,α
n,ri=
α
n,ri−1+
α
n−1,ri−1+
αn−2,ri−1 2∀
5≤
i≤
2(
ri−
1)
,α
2ri−1,ri=
α
2(ri−1),ri−1+
α2(ri −1)−1,ri −1 2 ,α
2ri,ri=
α2(ri −1),ri −1 2 ,and Rri i,asyn∞≤
Mrii n=3 Cn,ri(
ti)
nBn
∞ytq∞ (8) Mri
i
=
max(
2ri,
hiri)
andforall3≤
n≤
M rii,Cn,riisaboundedconstant.
Proof. TheproofofTheorem 1isverytechnicalandmanyformulasarelong,itisthenplacedintheappendix.
2
Theorem2.TheasynchronousRunge–Kutta2schemeisasecondorderaccuratescheme. Proof. Let yi
(
tq+1)
bethesemi-discrete exactsolutionoftheODE(3)attq+1, ytq+1
i,syn isthesolutionoftheclassicalglobal timestepRunge–Kutta2methodattq+1and ytiq+1 theasynchronoussolution.Then,
yi(
tq+1)
−
yt q+1 i ∞≤
yi(
tq+1)
−
yt q+1 i,syn∞+
yt q+1 i,syn−
yt q+1 i ∞ (9)TheclassicalglobaltimestepRK2schemeissecondorderconvergentthen,
yi(
tq+1)
−
ytq+1
i,syn
∞≤
C0(
tmax)
3,
where,for
tmaxsufficientlysmall,C0isaconstantindependentof
tmax.Tocompletetheproofwethusneedtotreatthe secondtermontherightof(9).First,usingTheorem 1,wegetforalli
yt si q,i i
=
y tq i+ (
siti
)(
B yt q)
i+
(
siti
)
2 2(
B 2ytq)
i+
2si n=3(
ti)
nα
n,si(
B Pi)
n−2(
B2ytq)
i+
R tqsi,i i,asynwhereforalli,tsi
q,i
=
tq+
siti,si
=
Etmax
ti
.Asafirststep,supposethat tmax
ti isaninteger.Asaresult, ytiq+1
≡
yt si q,i i=
yt q i+ (
tmax)(
B yt q)
i+
(
tmax)
2 2(
B 2ytq)
i+
2si n=3(
ti)
nα
n,si(
B Pi)
n−2(
B2ytq)
i+
Rt q+1 i,asyn Consequently, ytiq,syn+1−
yitq+1∞=
Etiq+1∞,
where Etiq+1=
2si n=3(
ti)
nα
n,siB n−2(
B2ytq)
i−
2si n=3(
ti)
nα
n,si(
B Pi)
n−2(
B2ytq)
i−
Rt q+1 i,asyn=
2si n=3(
ti)
nα
n,sij∈N−(i) Fi,j n−2
(
B2ytq)
j−
Rt q+1 i,asyn,
B is linear.
Therefore,
Etiq+1∞≤
C1(
tmax)
3,
(10)wherefor
tmax sufficientlysmall,C1 isaconstantindependentof
tmax.Finally,notethatforthecasewhere tmaxti isnot an integerwe justneed tosynchronizethesolutionattq+1 using asecond orderinterpolationasitwas explainedwhen theARK2algorithmhasbeenpresented.
2
3.2. ComparisonwiththeLTS-RK2schemepresentedin[4]
In[4],Grote etal.derived an explicitLTS-RK schemefromstandardexplicitRunge–Kutta(RK) methodsinthe caseof twodistinctregionsinthemesh:a“coarse”regionwiththelargerelementsanda“fine”regionwiththesmallerelements. Inthefine region,thetime stepmustbe afractionofthecoarse time step.TheyusetheclassicalRK methodofagiven orderinsidethefineregionandquadratureformulainsidethecoarseregion.Theintermediatevaluesneededatthecoarse andfine mesh interface are obtainedthrough a judicious combinationof interpolation andTaylor expansion. Theyhave proved that their schemepreserves the accuracy of the original RK scheme. We started from their LTS-RK scheme and we adapted the algorithm presented in [4] to the asynchronous aspect in order to apply the three main phases ofthe asynchronousintegration.Inparticular,eachcellistreatedindependently.Inthispaper,weonlyfocusonordertwo.Inthe caseofthemeshimposedin[4],weprovedthattheARK2degeneratesintotheLTS-RK2scheme.Moreprecisely,usingthe asynchronousalgorithm, wegetthe sameresultateachglobaltime step
t, where
t denote thetime stepdictatedby theCFL-conditioninthecoarserpartofthemesh.Whereas,ateachtimestepofsizen
t
/
p wheret
/
p isthelocaltime stepinsidetherefinedregionofthemeshwithp andn areintegersand0<
n<
p,theresultisdifferent.Infact,wehave thefollowingresultProposition4.Inthecaseoftwodistinctregionsinthemeshwherethetimestepofthefineregionisafractionofthetimestepofthe
coarseregion,theARK2schemedegeneratesintotheLTS-RK2schemepresentedin[4].
Proof. Resumingthenotationusedin[4]:
=
c∪
f wherec isthecoarseregionwiththelargetimestep
tc and
f isthefineregionwiththesmalltimestep
tf
=
ptc,p isaninteger.Introducesomeelementsfortheidentification(
Bx)
i=
Vixi+
j∈N+(i) F+i,jxi+
j∈N−(i) Fi−,jxj(
B P x)
i=
Vixi+
j∈N+(i) F+i,jxi+
j∈N− f(i) Fi−,jxj(
B(
I−
P)
x)
i=
j∈Nc−(i) Fi−,jxjwhere
N
c(
i)
andN
f(
i)
aretheneighborsoftheelement i whichare respectivelyinthecoarsemesh andthefine mesh. Inotherwords, j∈
N
c(
i)
then j∈
N (
i)
∩
c and j∈
N
f(
i)
then j∈
N (
i)
∩
f. P isadiagonalmatrix,itsdiagonalentries areequaltozeroorone.Itidentifytheunknownsassociatedwiththelocallyrefinedregion.(
I−
P)
identifytheunknowns associated withthe coarse region. Theprevious resultsare proved for anygiven ofmesh, inparticular fora mesh with two regions. In fact,using the previous notationswe havetmax
=
tc.This isa natural choicebecause the solutionis regularlysynchronizedattheinstantsmultiplesoftc.Thenthesequence
(
tq)
qisdefinedbytq+1=
tq+
tc,foralli∈
f,(
tki,q)
0≤k≤p is definedby tki,q=
ktf andfor all j
∈
c,(
tkj,q)
0≤k≤1 is definedby t0j,q=
tq andt 1j,q
=
tq+1.Let ususethe previousresultsofProposition 1andProposition 2forafineelement,theproofforthecoarseelementscanbedoneinthe sameway.Foralli∈
f wehaveyt ri+1 i,q i
=
y trii,q i+
tf B P ytrii,q i+
(
tf)
2 2⎛
⎝
B Py
t trii,q
⎞
⎠
i+
tf B(
I−
P)
ytq i+
(
2ri+
1)(
tf)
2 2 B(
I−
P)
y
t tq i
,
withy
t trii,q i
=
B P ytrii,q i+
B(
I−
P)
ytq+
ritf
y
t tq i
ComparingwiththeexpressionoftheslopegiveninProposition 3,thedependencieswiththeneighborsaresimplified.In fact,inthisparticularcase,thereareonlytwotypesofelementsthefinecells whicharegroupedinthematrix P andthe coarse cellswhichare groupedinthe matrix
(
I−
P)
.Allthefineelements hasbeenupdatedinthesametimesandit is alsothecaseforthecoarseelements.Consequently,theexpressionof yi (6),i a fineelement,inTheorem 1iswrittenin thefollowingformyt ri i,q i
=
yt q i+ (
ritf
)(
B yt q)
i+
(
ritf
)
2 2(
B 2ytq)
i+
2ri m=3(
tf)
mα
m,ri(
B Pi)
m−2(
B2ytq)
i,
andthenwefindthesameexpressiongivenin[4].2
3.3. OriginalityoftheasynchronousformulationinARK2
The originality ofour approachwithrespect to theLTS schemedeveloped in[4],andmore generallywithrespect to anyotherstandardlocaltime-steppingapproach,lieswithinitsabilitytoadapttoanyarbitrarymeshanditsresiliencefor beingsecondorderconvergentschemeindependentlyofthechosenmesh.Indeed,forthestandardLTSschemes,localtime stepsareusuallyselectedtobefractionsoftheglobaltimestepwhereastheasynchronousschemepermitstheselectionof independenttimestepsineachmeshelement.
However, asmentioned inthe introduction,the ideaof usinga differenttime-step inevery element alreadyexists in the literature.In[11],OmelchenkoandKarimabadiintroduced such anasynchronous schemefordrift-diffusion transport. The major difference between their asynchronous approach and the one presented herelies within the way the fluxes are updated. In[11],theconcept of“flux capacitor”, avariablein whichtheflux betweentwo cellsis stored,is used.In ourapproach,desynchronizedfluxescanbeusedwhichallows fluxesrecomputationsinordertoensurethesecond-order convergence.
In [18], A. Taube etal. proposed an ADER-DG scheme with independent time steps forthe Maxwell equations with arbitrary high order of accuracy in space and time, although no proof of the convergence rate is given. Their method consistsinreplacingthetimederivativeswithspacederivatives.Thecomputationofthefluxbetweenthegridcellsisthen basedonthesolutionofgeneralizedRiemannproblems.Ourasynchronousmethodisdifferent,sinceitisderivedfromthe standardexplicitRunge–Kutta2methodandthusthenumericalsolutionneedstogothroughintermediatestagesinorder tobeadvancedintime.
4. Numericaltests
Weconsiderthetwo-dimensionalMaxwellequationsintransverseelectricmode:
⎧
⎪
⎪
⎪
⎪
⎨
⎪
⎪
⎪
⎪
⎩
∂ ∂tEx=
1∂∂yHz in× (
0,
T)
∂ ∂tEy= −
1 ∂∂xHz in× (
0,
T)
∂ ∂tHz=
1 μ∂∂yEx−
1 μ∂∂xEy in× (
0,
T)
(11) where E=
Ex,
Ey,
0 TandH
= (
0,
0,
Hz)
T arerespectivelytheelectricandmagneticfields.and
μ
are,respectively,the electricpermittivity andmagneticpermeability.isthecomputationaldomainandT is thefinal timeofthesimulation. Forthespatialdiscretization,weapplythesecondorder P1-discontinuousGalerkinschemewithupwindfluxes.
4.1. NumericalconvergenceoftheARK2scheme
We shall now present numerical experiments which validate the order of convergence of the ARK2 method derived in section 3. The one-dimensional tests assessing the second order convergence ofthe ARK2 scheme havealready been published in [20]. Inthispaper, weconsider thetwo-dimensional Maxwell equations(11). Theinitial condition:
(
Ex)
0=
(
Ey)
0= (
Hz)
0=
cos(
2π
x)
,is a rectangular domain of size
[
0,
1]
× [
0,
1]
and the boundary conditions are supposed periodic.First, we divide
[
0,
1]
in X -direction, respectivelyin Y -direction,into treeequalparts. Theleft andrightintervals are discretized with an equidistantmesh of sizex,respectively
y.The middle interval isdiscretized withan equidistant mesh of size p
x
/
q, respectively py
/
q where p andq are integers; seealso Fig. 4 left. We start with a simplecase, mesh 1, wherethelocaltime steps arefractionsof theglobaltimesteps with p=
4 andq=
1.Then foramoregeneral case, mesh 2, we supposethat p=
2, q=
3 and then p/
q<
1.Finally, mesh 3is totest the ARK2algorithmin thecase wherethetimestepsareindependent,wesupposethatthepositionsxiand yi followapolynomiallowandthusxi