Contents lists available at ScienceDirect
European Journal of Operational Research
journal homepage: www.elsevier.com/locate/ejor

Stochastics and Statistics

Compact Markov-modulated models for multiclass trace fitting

Giuliano Casale a,∗, Andrea Sansottera b, Paolo Cremonesi b
a Department of Computing, Imperial College London, 180 Queen’s Gate, SW7 2AZ, London, UK
b Politecnico di Milano, DEIB, Via Ponzio 34/5, 20133 Milan, Italy

Article info
Article history: Received 8 May 2014. Accepted 6 June 2016. Available online xxx.
Keywords: Counting process; Marked Markov-modulated Poisson process; Trace fitting

Abstract
Markov-modulated Poisson processes (MMPPs) are stochastic models for fitting empirical traces for simulation, workload characterization and queueing analysis purposes. In this paper, we develop the first counting process fitting algorithm for the marked MMPP (M3PP), a generalization of the MMPP for modeling traces with events of multiple types. We initially explain how to fit two-state M3PPs to empirical traces of counts. We then propose a novel form of composition, called interposition, which enables the approximate superposition of several two-state M3PPs without incurring state space explosion. Compared to exact superposition, where the state space grows exponentially in the number of composed processes, in interposition the state space grows linearly in the number of composed M3PPs. Experimental results indicate that the proposed interposition methodology provides accurate results against artificial and real-world traces, with a significantly smaller state space than superposed processes.
© 2016 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
1. Introduction
The Markov-modulated Poisson process (MMPP) is a general fitting tool for traces with correlated arrivals (Fischer & Meier-Hellstern, 1993) which finds application in modeling network traffic (Okamura, Dohi, & Trivedi, 2009), disk I/O patterns (Verma & Anand, 2007), grid and cloud workloads (Li, Muskulus, & Wolters, 2006), and self-adaptive software systems (Perez-Palacin, Merseguer, & Mirandola, 2012). A central property of MMPPs is the composability with other Markov models, such as queueing systems (Horváth, Horváth, & Telek, 2009; Horváth, 2012; Houdt, 2012) or stochastic Petri nets (Perez-Palacin et al., 2012), which enables using non-renewal processes in performance models. This is important because autocorrelation and temporal dependence can greatly affect system performance and therefore need to be taken into account in system modeling (Mi, Zhang, Riska, Smirni, & Riedel, 2007).
The MMPP is a special case of the Markovian Arrival Process (MAP) (Neuts, 1979) in which the departure process is a modulation of Poisson processes and the active process is chosen according to the state of a continuous-time Markov chain. In this paper, we propose a generalization of the MMPP, which we call the marked MMPP (M3PP), and develop a scalable methodology for fitting M3PPs to empirical datasets. The M3PP can be regarded as a specialization of the MMAP, the marked extension of the MAP (He & Neuts, 1998). A marked point process associates a class label to each arrival (He & Neuts, 1998; He, 2001), thus it is useful to model traces with events of multiple classes (e.g., read and write requests in disk drives). Marked processes are also important in the analysis of priority queues and multiclass models (Buchholz, Kemper, & Kriege, 2010; Horváth et al., 2009; Horváth, 2012; Houdt, 2012). However, few techniques exist for their fitting and they all focus on matching moments of the inter-arrival time process for marked MAPs (MMAPs) (Buchholz et al., 2010; Horváth et al., 2009). Therefore, no techniques exist yet for fitting marked Markov-modulated processes to count data. Still, to limit monitoring overheads, only count data can be extracted from certain classes of computer and communications systems (e.g., network links). This motivates the investigation into methodologies to fit marked count data.

∗ Corresponding author. Tel.: +44 20 759 42920.
E-mail addresses: [email protected] (G. Casale), [email protected] (A. Sansottera), [email protected] (P. Cremonesi).
In this paper, we fill this gap by developing approximate fitting algorithms for the counting process of the M3PP. In particular, we first explain how to approximately fit a two-state MMPP counting process and then extend the idea to two-state M3PPs. The proposed approach is applicable to traces with aggregated or coarse-grained measurements, which cannot be analysed using approaches based on inter-arrival times, since these require moments from a trace recording all the arrivals. The main drawback is that counting processes are mathematically less tractable than inter-arrival processes, therefore one normally focuses on low-order moments of counts.
http://dx.doi.org/10.1016/j.ejor.2016.06.005
0377-2217/© 2016 The Authors. Published by Elsevier B.V.
Two-state models are often insufficient to fit complex traces, therefore we also study the approximate fitting of large M3PPs. In the single class setting, a known limitation of MMPPs is the inability to simultaneously fit many statistical descriptors due to the non-linearity of their underlying equations (Bodrog, Heindl, Horváth, & Telek, 2008; Heindl, Horváth, & Gross, 2006; Horváth & Telek, 2009). This has led to the definition of several approaches to fit complex traces by composing multiple small-sized MMPPs or MAPs using Kronecker operators (Andersen & Nielsen, 1998; Casale, Zhang, & Smirni, 2010; Horváth & Telek, 2002). These methods employ composition operators for moment fitting, offering a different trade-off between computational cost and fitting accuracy compared to fitting methods based on the EM algorithm (Breuer, 2002; Horváth & Okamura, 2013; Klemm, Lindemann, & Lohmann, 2003). In particular, the superposition operator allows one to describe a trace by the statistical multiplexing of several MMPPs, at the expense of an exponential growth of the number of states in the resulting process (Sriram & Whitt, 1986). This state space explosion is an obstacle for the application of MMPPs and MAPs to modeling real systems; for example it considerably slows down, or even renders infeasible, the numerical evaluation of queueing models by matrix geometric methods (Bini, Meini, Steffé, Pérez, & Houdt, 2012; Pérez, Velthoven, & Houdt, 2008).
In this paper, we tackle the state space explosion problem of superposition by showing that M3PPs admit a particular form of composition, which we call interposition, that enables several MMPPs to share the same state space without mutually affecting the marginal distributions of their counting processes. However, interposition introduces spurious covariances between class arrivals that may be seen as the cost of the state-space reduction. The interposition method defines an original approach to build large Markov-modulated processes, in which a set of J two-state M3PPs is merged into a single M3PP process with just J + 1 states and without affecting the marginal counting processes of the initial M3PPs. We identify general conditions for interposition to result in a valid MMAP and observe that these conditions can be satisfied by M3PPs, but not by general MMAPs. The ability to interpose processes makes a case for using M3PPs, instead of general MMAPs, for fitting count data. We then define a method to automatically identify the M3PPs to be interposed and a mixed-integer linear program (MILP) that can help in automatically identifying the order of interposition of the M3PPs.
We conclude the paper by reporting fitting experiments for a set of artificial and real-world traces. We show that the proposed M3PP fitting algorithms are widely applicable and run efficiently even in the case of approximate fitting. We also find that interposition is much more scalable than superposition, while retaining comparable accuracy.
Summarizing, our main contributions are as follows:
• Section 3 defines fitting algorithms for the counting process of two-state M3PPs with arbitrary number of classes. Our formulas are in closed-form for exact fitting and use quadratic programming for approximate fitting. As a by-product, our approach also introduces the first infeasibility adjustment methodology for approximate fitting of unmarked MMPPs.
• Section 4 introduces the new notion of interposition. This is an aid to compose multiple M3PPs, while preserving their statistical properties, as we rigorously establish in Theorem 1.
• Section 5 develops a methodology for fitting empirical traces using the interposition operator.
The paper reports in Section 6 an experimental study on random models and a real trace validating the effectiveness of the proposed models and algorithms. In addition to the above, a description of necessary background is given in Section 2. Section 7 concludes the paper.
2. Background
2.1. Model and notation
An m-state MMPP is specified by a continuous-time Markov chain (CTMC) with irreducible infinitesimal generator $Q$, having $m$ states, and rate vector $(\lambda^{(1)}, \ldots, \lambda^{(m)})$. When the CTMC is in state $j$, the MMPP generates arrivals according to a Poisson process with rate $\lambda^{(j)}$. The effect of the modulating action of the underlying CTMC is to enable the modeling of non-Poisson, possibly nonrenewal, arrival processes. We assume the underlying CTMC to be initialized according to its steady-state distribution so that the process is time-stationary. For ease of comparison with the literature, we use throughout the paper the MAP $(D_0, D_1)$ notation, where for a MMPP $D_1 = \mathrm{diag}(\lambda^{(1)}, \ldots, \lambda^{(m)})$ and $D_0 = Q - D_1$.
We define a M3PP[K] as a MMPP in which arrivals are marked with one out of $K$ available classes. This may be seen as a marking of the Poisson processes of the MMPP where one decomposes each rate $\lambda^{(j)}$, $1 \le j \le m$, into arrival rates $q_{j,k}\lambda^{(j)}$, $1 \le k \le K$, subject to $\sum_k q_{j,k} = 1$, $q_{j,k} \ge 0$. When an arrival is generated by a Poisson process with rate $q_{j,k}\lambda^{(j)}$ it is said to be of class $k$. Equivalently, in matrix notation, augmenting a MMPP $(D_0, D_1)$ with marks defines a set of matrices $D_{1,k} = \mathrm{diag}(q_{1,k}\lambda^{(1)}, \ldots, q_{m,k}\lambda^{(m)})$, such that $D_1 = \sum_{k=1}^{K} D_{1,k}$. The tuple $(D_0, D_{1,1}, \ldots, D_{1,K})$ is called the representation of the M3PP[K]. Also, $(D_0, D_1)$ is the embedded MMPP, which we refer to as the unmarked process.
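As a minimal sketch of the representation just defined, the following Python snippet assembles $(D_0, D_{1,1}, \ldots, D_{1,K})$ from a generator, a rate vector and a table of marking probabilities. The function name `m3pp_representation`, the nested-list matrix layout and the numerical values are illustrative assumptions and do not come from the paper.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def m3pp_representation(
    Q: Sequence[Sequence[float]],
    rates: Sequence[float],
    marks: Sequence[Sequence[float]],
) -> Tuple[Matrix, List[Matrix]]:
    """Build (D0, [D_{1,1}, ..., D_{1,K}]) for an m-state M3PP[K].

    Q     : m x m CTMC generator (rows sum to zero)
    rates : Poisson rates lambda^(1), ..., lambda^(m), one per state
    marks : marks[j][k] = q_{j,k}; each row must be a probability vector
    (hypothetical helper: names and data layout are assumptions)
    """
    m, K = len(rates), len(marks[0])
    for row in marks:
        if min(row) < 0 or abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each row of marks must be a probability vector")
    # D_{1,k} = diag(q_{1,k} lambda^(1), ..., q_{m,k} lambda^(m))
    D1k = [
        [[marks[i][k] * rates[i] if i == j else 0.0 for j in range(m)]
         for i in range(m)]
        for k in range(K)
    ]
    # D0 = Q - D1, where D1 = diag(rates) = sum_k D_{1,k}
    D0 = [
        [Q[i][j] - (rates[i] if i == j else 0.0) for j in range(m)]
        for i in range(m)
    ]
    return D0, D1k


# Illustrative two-state, two-class process (values chosen for the example)
D0, D1k = m3pp_representation(
    Q=[[-3.0, 3.0], [1.0, -1.0]],
    rates=[3.0, 4.0],
    marks=[[1 / 3, 2 / 3], [3 / 4, 1 / 4]],
)
```

By construction every row of $D_0 + \sum_k D_{1,k}$ sums to zero, which is a quick sanity check on any representation built this way.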
In the rest of the paper we often deal with processes having $m = 2$ states. In this case, for readability, we use $\lambda$ and $\lambda'$ in place of $\lambda^{(1)}$ and $\lambda^{(2)}$, and $q_k$ and $q'_k$ in place of $q_{1,k}$ and $q_{2,k}$.
2.2. Problem statement
Let us consider a M3PP[K] and let $n_k(t) \ge 0$ be the number of arrivals of class $k$ at time $t$ after initialization, subject to $n_k(0) = 0$ for all classes. The counting process of the M3PP[K] is the CTMC with state $\mathbf{n}(t) = (n_1(t), n_2(t), \ldots, n_K(t))$. We shall denote by $n(t) = \sum_{k=1}^{K} n_k(t)$ the aggregate arrival count. Given an initial state probability vector, the evolution over time of this process is characterized by a matrix $P(\mathbf{n}, t)$, with element $p_{i,j}(\mathbf{n}, t)$ in row $i$ and column $j$ representing the probability that a M3PP initialized in state $i$ is in state $j$ at time $t$ with an arrival count $\mathbf{n}(t)$. We refer to $P(\mathbf{n}, t)$ as the counting process matrix. The counting process evolves according to the Kolmogorov forward equations
$$\frac{\partial P(\mathbf{n}, t)}{\partial t} = P(\mathbf{n}, t) D_0 + \sum_{k=1}^{K} P(\mathbf{n} - \mathbf{e}_k, t) D_{1,k}, \tag{1}$$
where $\mathbf{e}_k$ is a column vector of zeros except for a one in position $k$. From this equation, it is simple to derive factorial moment functions, from which moments of the counting process can be obtained in closed form (He & Neuts, 1998). For example, this method yields the mean arrival count of class $k$ at time $t$ as
$$\mu_k(t) = E[n_k(t)] = \mu_k t, \qquad \mu_k = \boldsymbol{\pi} D_{1,k} \mathbf{1}, \tag{2}$$
where $\boldsymbol{\pi}$ is the stationary distribution of the embedded CTMC, $\boldsymbol{\pi} Q = \mathbf{0}$, $\boldsymbol{\pi}\mathbf{1} = 1$, $\mu_k$ is the mean arrival rate of class $k$ in the time-stationary process, and $\mathbf{0}$ and $\mathbf{1}$ are respectively column vectors of zeros and ones. The variance of class $k$ in counts (also called variance-time curve) is (He & Neuts, 1998)
$$\mathrm{Var}[n_k(t)] = \left(\mu_k - 2\mu_k^2 + 2\mathbf{c}_k D_{1,k}\mathbf{1}\right) t - 2\mathbf{c}_k \left(I - e^{Qt}\right) \mathbf{d}_k, \tag{3}$$
where $\mathbf{c}_k = \boldsymbol{\pi} D_{1,k} (\mathbf{1}\boldsymbol{\pi} - Q)^{-1}$ and $\mathbf{d}_k = (\mathbf{1}\boldsymbol{\pi} - Q)^{-1} D_{1,k} \mathbf{1}$. The above formulas and notations are also valid for the unmarked process ($K = 1$), but in that case we omit the dependence on class $k$. The covariance of counts for classes $k$ and $h$ is given by
$$\mathrm{Cov}[n_k(t), n_h(t)] = \tfrac{1}{2}\left(\mathrm{Var}[n_k(t) + n_h(t)] - \mathrm{Var}[n_k(t)] - \mathrm{Var}[n_h(t)]\right), \tag{4}$$
where $\mathrm{Var}[n_k(t) + n_h(t)]$ can be obtained from (3) by replacing $\mu_k$ with $\mu_k + \mu_h$ and $D_{1,k}$ with $D_{1,k} + D_{1,h}$.
The M3PP[K] fitting problem is to find a representation $(D_0, D_{1,1}, \ldots, D_{1,K})$ such that (2)–(4) match, to the best possible extent, the corresponding empirical moments at given timescales $t$. Such a representation needs to be valid, meaning that all rates need to be non-negative real numbers and the generator $Q$ of the embedded CTMC must be valid and irreducible.
2.3. Two-state MMPP fitting
In order to fit a two-state M3PP[K], we propose to separately fit the MMPP for the unmarked process and the class markings of the M3PP. Since MMPP fitting is well-understood, we just summarize the MMPP fitting method proposed in Heffes and Lucantoni (1986). This algorithm receives in input the following descriptors of the counting process:
$$\mu = \frac{E[n(t)]}{t}, \qquad I(t) = \frac{\mathrm{Var}[n(t)]}{E[n(t)]}, \qquad I = \lim_{t \to \infty} I(t), \qquad \mu^{(3)}(t) = E\left[(n(t) - \mu t)^3\right],$$
where $\mu$ is the rate, $I(t)$ is the index of dispersion for counts (Sriram & Whitt, 1986) at a given timescale $t$, $I$ is the asymptotic index of dispersion value for $t \to \infty$, and $\mu^{(3)}(t)$ is the third central moment of counts at timescale $t$. The objective is to fit the rates used in the MMPP representation
$$D_0 = \begin{bmatrix} -(\lambda + r) & r \\ r' & -(\lambda' + r') \end{bmatrix}, \qquad D_1 = \mathrm{diag}(\lambda, \lambda'),$$
subject to all the parameters being non-negative and to $\lambda + \lambda' > 0$. Note in particular that we exclude the trivial cases $\lambda = \lambda'$ and $r = r' = 0$, where the MMPP degenerates into a Poisson process, which does not require a two-state model.
Fitting the above descriptors to a two-state MMPP requires solving for the unknowns $r$, $r'$, $\lambda$, and $\lambda'$ using a nonlinear system composed of the following four equations (Heffes & Lucantoni, 1986)
$$\mu = \frac{\lambda r' + \lambda' r}{x}, \qquad I = 1 + \frac{2(\lambda - \lambda')^2 r r'}{x^2(\lambda r' + \lambda' r)}, \qquad \frac{I - I(t)}{I - 1} = \frac{1 - e^{-xt}}{xt}, \qquad h = (\lambda' - \lambda)\bar{x}, \tag{5}$$
where $x = r + r'$, $\bar{x} = r' - r$, $t$ is an arbitrary finite timescale at which we want to fit the index of dispersion, and
$$h = \left(g^{(3)}(1, t) - k_1 - k_2 + k_3\mu - k_4\mu x\right)\left(k_4 + (k_3/x) - k_5\right)^{-1}.$$
The parameters in the last equation are $k_1 = \mu^3 t^3$, $k_2 = 3\mu^2(I - 1)t^2$, $k_3 = 3\mu(I - 1)t/x$, $k_4 = 3\mu(I - 1)t e^{-xt}/x^2$, $k_5 = 6\mu(I - 1)(1 - e^{-xt})/x^3$, and $g^{(3)}(1, t)$ is the third factorial moment of counts at timescale $t$.
We point to Heffes and Lucantoni (1986, Eq. (14d)) for explicit expressions of $g^{(3)}(1, t)$.
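The first three equations in (5) can be evaluated directly to obtain the second-order descriptors of a given two-state MMPP. The sketch below does so in Python; the function name `mmpp2_descriptors`, the argument names (`lam_p` and `r_p` stand for the rates of the second state) and the numerical values are assumptions made for illustration.

```python
import math


def mmpp2_descriptors(lam: float, lam_p: float, r: float, r_p: float, t: float):
    """Return (mu, I, I(t)) of a two-state MMPP from the first three equations in (5).

    lam, lam_p : Poisson rates in the two states
    r, r_p     : transition rates out of the first and second state
    t          : finite timescale for the index of dispersion
    (hypothetical helper, not part of the original paper)
    """
    x = r + r_p
    if x <= 0 or t <= 0:
        raise ValueError("need r + r_p > 0 and t > 0")
    weighted = lam * r_p + lam_p * r          # equals mu * x
    mu = weighted / x
    I_inf = 1.0 + 2.0 * (lam - lam_p) ** 2 * r * r_p / (x ** 2 * weighted)
    # (I - I(t)) / (I - 1) = (1 - e^{-xt}) / (xt); expm1 keeps precision for small xt
    I_t = I_inf - (I_inf - 1.0) * (-math.expm1(-x * t)) / (x * t)
    return mu, I_inf, I_t


# Illustrative parameters
mu, I_inf, I_t = mmpp2_descriptors(lam=5.0, lam_p=1.0, r=0.5, r_p=1.5, t=1.0)
```

For these values the mean rate is 4 and the asymptotic index of dispersion is 1.75, with $I(t)$ lying strictly between 1 and $I$.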
Heffes and Lucantoni (1986) propose the following fitting method. First, compute $x = r + r'$, solving at an arbitrary timescale $t = t_1$ the fixed point equation
$$x = \frac{1}{t}\,\frac{I - 1}{I - I(t)}\left(1 - e^{-xt}\right), \tag{6}$$
obtained by rewriting the third equation appearing in (5). Then, compute $h$ at a second arbitrary timescale $t = t_2$. If $h \ne 0$, the fitting formulas are then given by
$$r = \frac{x}{2}\left(1 + \frac{1}{\sqrt{4y + 1}}\right), \qquad r' = x - r, \qquad \lambda = \mu - \frac{h r}{x \bar{x}}, \qquad \lambda' = \mu + \frac{h r'}{x \bar{x}},$$
where $y = (I - 1)\mu x^3 (2h^2)^{-1}$ and $\bar{x} = r' - r$ as in (5). If $h = 0$, the explicit fitting formulas instead become
$$r = r' = \frac{x}{2}, \qquad \lambda = \mu - \frac{1}{2}\sqrt{2(I - 1)\mu x}, \qquad \lambda' = \mu + \frac{1}{2}\sqrt{2(I - 1)\mu x}.$$
The main drawback of this method is that the formulas can return negative rates for some combinations of the input parameters. In this case, approximate fitting techniques are required. Yet, to the best of our knowledge these are not available in the literature on MMPP counting process fitting.
3. Fitting the counting process of a two-state M3PP[K]
In this section, we develop a method to fit two-state M3PP[K]s with representation
$$D_0 = \begin{bmatrix} -(\lambda + r) & r \\ r' & -(\lambda' + r') \end{bmatrix}, \qquad D_{1,k} = \mathrm{diag}(q_k\lambda, q'_k\lambda'),$$
where $\lambda + \lambda' > 0$, $\sum_{k=1}^{K} q_k = \sum_{k=1}^{K} q'_k = 1$, $q_k \ge 0$, $q'_k \ge 0$, for all $1 \le k \le K$. As mentioned before, the availability of exact methods to fit unmarked MMPPs justifies an approach where we separately fit the embedded MMPP first, followed by the marking process that defines the M3PP. In practice, this means that we first fit $D_0$ and $D_1$ using a MMPP fitting algorithm applied to the unmarked trace.
Then we determine the individual $D_{1,k}$ matrices using the marked trace and subject to the condition $D_1 = \sum_{k=1}^{K} D_{1,k}$, which ensures that the embedded MMPP does not change. In Section 3.1, we discuss how to perform approximate fitting of the unmarked process using an extension of the algorithm by Heffes and Lucantoni. In Section 3.2, we discuss the fitting of the marked process.
3.1. Step 1: approximate MMPP fitting
The fitting algorithm of Heffes and Lucantoni described in Section 2.3 cannot be applied to cases where the input set of descriptors $\mu$, $I(t)$, $I$ and $\mu^{(3)}(t)$ is infeasible for the considered MMPP. We have found this to happen frequently in real traces, and even though one may repeatedly attempt to fit different timescales $t_1$ and $t_2$ until finding a feasible set of descriptors, we believe that better solutions are needed and possible. We have not found previous works addressing this problem, at least for fitting counting processes. Our approximation consists in sacrificing the degree of freedom of the third moment of counts $\mu^{(3)}(t)$ to restore feasibility of all second-order descriptors. As before, we focus on cases where the M3PP process is not Poisson, thus we assume $\lambda \ne \lambda'$, $r > 0$, $r' > 0$.
We begin by characterizing the feasibility region of the index of dispersion for a two-state MMPP.
Proposition 1. In a two-state MMPP with $\lambda \ne \lambda'$, the index of dispersion satisfies $I > I(t) > 1$ for any timescale $t$.
Proof. From the second equation in (5), it readily follows that $I > 1$ since $\lambda \ne \lambda'$. Moreover, since $x > 0$, it follows from (6) that $I > I(t)$. Noting that for $x > 0$ it is
$$0 < \frac{1 - e^{-xt}}{xt} < 1, \qquad \forall t > 0,\ x > 0,$$
by (5) and the constraint $I > I(t)$ we conclude that $I(t) > 1$.
The previous proposition states the well-known fact that MMPPs can only fit traces that have greater variability than a Poisson process. We now show that asking for $r > 0$ and $r' > 0$, to exclude the case where the MMPP is a Poisson process, is implied by the previous statement and thus always verified.
Proposition 2. For $I > 1$, the conditions $r > 0$ and $r' > 0$ are always satisfied.
Proof. For the case $h \ne 0$, observe that $y > 0$ if $I > 1$. Therefore $1/\sqrt{4y + 1} < 1$, which implies $r > 0$ and $r' > 0$. Since $x$ is positive, it also follows that $0 < r < x$ and $0 < r' < x$.
For the case $h = 0$, the result follows immediately given that $x > 0$, since $r = r' = 0$ would imply a Poisson process that would have $I = 1$, against the assumption.
Therefore, provided that $I > I(t) > 1$, feasibility follows by ensuring that arrival rates are non-negative and $\lambda + \lambda' > 0$. Using (5), we can reformulate this requirement as:
$$-\frac{\mu x}{r'} \le \frac{h}{\bar{x}} \le \frac{\mu x}{r}. \tag{7}$$
Note that this expression implicitly characterizes the feasible region for the rate $\mu$ and, via the $h$ term, for the third moment $g^{(3)}(1, t)$. If one of these conditions is not met, we propose to relax the fitting method by ignoring the matching of the third moment $g^{(3)}(1, t)$ as follows. Consider first the following lower bound on $r$.
Proposition 3. Without loss of generality, assume $\lambda > \lambda'$. Then in a feasible MMPP(2) it is $r \ge u$, where
$$u = \frac{(I - 1)x^2}{2\mu + (I - 1)x}, \tag{8}$$
in which $x = r + r'$ is the solution of (6).
Proof. We substitute the first equation in (5) into the second one and find after simple algebra
$$\lambda = \lambda' + \sqrt{\frac{(I - 1)\mu x^3}{2 r r'}}. \tag{9}$$
Obtaining $\lambda'$ from the first equation in (5) and equating to (9), we get the feasibility constraint
$$\lambda' = \mu - \frac{r'}{x}\sqrt{\frac{(I - 1)\mu x^3}{2 r r'}} \ge 0. \tag{10}$$
The result follows using $r' = x - r$ and solving for $r$, which yields the lower bound $u$.
Any value $u \le r < x$ leads to a feasible solution, thus we can for example choose the middle point of this interval, $r = u + (x - u)/2$. Our suggestion of the middle point is convenient to keep the formulas in closed-form, however other choices are possible. After choosing $r$, the other parameters are easily obtained by setting $r' = x - r$ and using (9) and (10) to obtain $\lambda$ and $\lambda'$.
Summarizing, provided that $I > I(t_1) > 1$, the approximate method we have defined fits $\mu$, $I(t_1)$ and $I$ exactly at the expense of sacrificing an infeasible third moment $\mu^{(3)}(t_2)$, where $t_1$ and $t_2$ are arbitrary timescales.
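The approximate procedure of this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `fit_mmpp2_second_order` is hypothetical, equation (6) is solved by bisection on $z = x t_1$ rather than by fixed-point iteration, $r$ is taken at the midpoint of the interval $[u, x]$, and the convention $\lambda > \lambda'$ of Proposition 3 is used.

```python
import math


def fit_mmpp2_second_order(mu: float, I_inf: float, I_t1: float, t1: float):
    """Fit (lam, lam_p, r, r_p) matching mu, I(t1) and I, ignoring the third moment.

    Requires I_inf > I_t1 > 1 (Proposition 1). Hypothetical helper, not from the paper.
    """
    if not (I_inf > I_t1 > 1.0) or mu <= 0 or t1 <= 0:
        raise ValueError("need I > I(t1) > 1, mu > 0 and t1 > 0")
    # Third equation in (5): (1 - e^{-z}) / z must equal this target, with z = x * t1
    target = (I_inf - I_t1) / (I_inf - 1.0)

    def phi(z: float) -> float:
        return -math.expm1(-z) / z  # strictly decreasing from 1 to 0 for z > 0

    lo, hi = 1e-12, 1.0
    while phi(hi) > target:          # bracket the root
        hi *= 2.0
    for _ in range(200):             # bisection (assumption: replaces fixed-point iteration)
        mid = 0.5 * (lo + hi)
        if phi(mid) > target:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi) / t1

    u = (I_inf - 1.0) * x ** 2 / (2.0 * mu + (I_inf - 1.0) * x)  # lower bound (8)
    r = 0.5 * (u + x)                # midpoint of [u, x]
    r_p = x - r
    gap = math.sqrt((I_inf - 1.0) * mu * x ** 3 / (2.0 * r * r_p))  # lam - lam_p, from (9)
    lam_p = mu - (r_p / x) * gap     # feasibility constraint (10)
    lam = lam_p + gap
    return lam, lam_p, r, r_p


# Illustrative descriptors (chosen for the example, not taken from a real trace)
lam, lam_p, r, r_p = fit_mmpp2_second_order(mu=4.0, I_inf=1.75, I_t1=1.4257507312, t1=1.0)
```

Because $r$ is chosen strictly above $u$, the fitted $\lambda'$ is strictly positive, and the first two equations in (5) can be re-evaluated on the output to confirm that $\mu$ and $I$ are matched.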
3.2. Step 2: fitting the marking process
In the second step of the fitting algorithm we determine the $D_{1,k}$ matrices. After Step 1, all the statistical properties of the unmarked process are constrained by the $D_0$ and $D_1$ matrices of the MMPP, thus the focus of this step is to fit the class-specific properties and the class covariances. We consider both exact and approximate fitting methods.
There are several ways of choosing which empirical descriptors to fit with a M3PP and each choice leads to equations that may differ in tractability compared to other choices. We have experimented with several possibilities, and we have found that fitting a set of central moments, such as mean class rates $\mu_k$, class variances, or covariances, typically leads to a difficult non-convex formulation. Conversely, we have found it more efficient to fit a two-state M3PP[K] using the mean class rates and per-class contributions to the asymptotic index of dispersion $I$. Other efficient approaches are possible, such as fitting mean class rates and relative covariances or fitting mean class rates and differences between the class variances. However, we here focus on the first method, which we believe provides the best combination of efficiency (quadratic programming) and ease of interpretation.
3.2.1. Fitting mean class rates and per-class contributions to the index of dispersion
We begin by considering the more general problem of fitting the mean class rates $\mu_k$ and
$$g_k(t) = \mathrm{Var}[n_k(t)] + \mathrm{Cov}\Big[n_k(t), \sum_{i \ne k} n_i(t)\Big].$$
The $g_k(t)$ terms have the simple interpretation of modeling the contribution of class $k$ to the variance-time curve of the unmarked process, since $\sigma(t) = \mathrm{Var}\big[\sum_{k=1}^{K} n_k(t)\big] = \sum_{k=1}^{K} g_k(t)$. We then specialize the method to the index of dispersion, using the fact that $I(t) = \sigma(t)/E[n(t)] = \sum_{k=1}^{K} g_k(t)/E[n(t)]$.
Our goal is to compute the $D_{1,k}$ matrices from the $g_k(t)$ values, given $D_0$ and $D_1 = \sum_{k=1}^{K} D_{1,k}$. Recall that $D_{1,k} = \mathrm{diag}(q_k\lambda, q'_k\lambda')$. The problem under consideration is to determine the values of the probabilities $q_k$ and $q'_k$ that uniquely define the $D_{1,k}$ matrices, given that $\lambda$ and $\lambda'$ are known from the $D_1$ matrix fitted in Step 1.
Exact matching. We now give formulas to fit the probabilities $q_k$ and $q'_k$ from $\mu_k$ and $g_k(t)$. Note that the arbitrary timescale $t$ used in the statement does not need to be equal to the timescales used for fitting the embedded MMPP.
Proposition 4. Given $(D_0, D_1)$ and, for each class $k$, $\mu_k$ and $g_k(t)$ at an arbitrary timescale $t$, the parameters of the M3PP can be computed as follows:
$$q_k = f_{1,1}\mu_k + f_{1,2}\, g_k(t), \tag{11}$$
$$q'_k = f_{2,1}\mu_k + f_{2,2}\, g_k(t), \tag{12}$$
where $f_{1,1} = -F_2 x/F$, $f_{1,2} = \lambda' r/F$, $f_{2,1} = F_1 x/F$, $f_{2,2} = -\lambda r'/F$, in which $F = F_1\lambda' r - F_2\lambda r'$,
$$F_1 = \lambda r' x^{-4}\left(2(\lambda - \lambda')\, r\, e^{-xt} + t x\left(x^2 - 2(\lambda' - \lambda)\, r\right) + 2(\lambda' - \lambda)\, r\right),$$
$$F_2 = \lambda' r x^{-4}\left(2(\lambda' - \lambda)\, r' e^{-xt} + t x\left(x^2 - 2(\lambda - \lambda')\, r'\right) + 2(\lambda - \lambda')\, r'\right),$$
and $x = r + r'$ is the solution of (6).
Proof. The expressions of $\mu_k$ and $g_k(t)$ for a two-state M3PP[K] process are found from the definitions to be
$$\mu_k = \frac{\lambda q_k r' + \lambda' q'_k r}{x}, \tag{13}$$
$$g_k(t) = F_1 q_k + F_2 q'_k, \tag{14}$$
where $F_1$ and $F_2$ are constant coefficients given $D_0$, $D_1$ and the reference timescale $t$. Solving (13) for $q_k$ we obtain
$$q_k = \frac{\mu_k x - \lambda' q'_k r}{\lambda r'}. \tag{15}$$
Substituting (15) into (14), we obtain the fitting formulas (11) and (12).
We are now ready to determine the contributions to the index of dispersion. Note that Proposition 4 may also be used to fit the contribution of class $k$ to $I(t)$ in the embedded MMPP, i.e., $G_k(t) = g_k(t)/E[n(t)]$. This is because we can rewrite the fitting formulas as
$$q_k = c_{1,1}\mu_k + c_{1,2}\, G_k(t), \tag{16}$$
$$q'_k = c_{2,1}\mu_k + c_{2,2}\, G_k(t), \tag{17}$$
where $c_{1,1} = f_{1,1}$, $c_{1,2} = f_{1,2}\mu t$, $c_{2,1} = f_{2,1}$, $c_{2,2} = f_{2,2}\mu t$. The asymptotic expressions of the coefficients as $t \to \infty$ are then readily obtained as
$$c_{1,1} = -\frac{2(\lambda' - \lambda)\, r' + x^2}{2\lambda r'(\lambda - \lambda')}, \qquad c_{1,2} = \frac{\mu x^2}{2\lambda r'(\lambda - \lambda')},$$
$$c_{2,1} = \frac{2(\lambda - \lambda')\, r + x^2}{2\lambda' r(\lambda - \lambda')}, \qquad c_{2,2} = -\frac{\mu x^2}{2\lambda' r(\lambda - \lambda')}.$$
Combining these expressions with the requirements $q_k \ge 0$ and $q'_k \ge 0$, we find after simple passages this lower bound required to hold for the feasibility of the per-class contributions to the asymptotic index of dispersion
$$G_k \ge \frac{\mu_k}{\mu}\left(1 + \frac{2\max\left((\lambda - \lambda')\, r,\ (\lambda' - \lambda)\, r'\right)}{x^2}\right). \tag{18}$$
For two-state M3PPs, the minimum value of the right-hand side is achieved for Poisson processes, where $G_k = \mu_k/\mu$ and $\sum_k G_k = I = 1$.
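The asymptotic coefficients in (16)–(17) are simple rational functions of the MMPP parameters. The following Python sketch evaluates them and maps a pair $(\mu_k, G_k)$ to the marking probabilities; the function names and the numerical values are illustrative assumptions, and the case $\lambda = \lambda'$ is excluded because the coefficients are undefined there.

```python
def asymptotic_mark_coefficients(lam: float, lam_p: float, r: float, r_p: float):
    """Return ((c11, c12), (c21, c22)), the t -> infinity coefficients of (16)-(17)."""
    if lam == lam_p or lam <= 0 or lam_p <= 0 or r <= 0 or r_p <= 0:
        raise ValueError("need distinct positive rates and positive transition rates")
    x = r + r_p
    d = lam - lam_p
    mu = (lam * r_p + lam_p * r) / x
    c11 = -(2.0 * (lam_p - lam) * r_p + x ** 2) / (2.0 * lam * r_p * d)
    c12 = mu * x ** 2 / (2.0 * lam * r_p * d)
    c21 = (2.0 * d * r + x ** 2) / (2.0 * lam_p * r * d)
    c22 = -mu * x ** 2 / (2.0 * lam_p * r * d)
    return (c11, c12), (c21, c22)


def marks_from_contribution(coeffs, mu_k: float, G_k: float):
    """Apply (16)-(17): return (q_k, q'_k) for one class. Hypothetical helper."""
    (c11, c12), (c21, c22) = coeffs
    return c11 * mu_k + c12 * G_k, c21 * mu_k + c22 * G_k


# Illustrative values: a class with mean rate 1.325 and contribution 0.4625
coeffs = asymptotic_mark_coefficients(lam=5.0, lam_p=1.0, r=0.5, r_p=1.5)
q_k, q_pk = marks_from_contribution(coeffs, mu_k=1.325, G_k=0.4625)
```

Feeding the aggregate values $\mu_k = \mu$ and $G_k = I$ into the same map returns $q_k = q'_k = 1$, which is the normalization that constraint (21) later enforces.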
Approximate matching. Some values of the descriptors may lead to unfeasible values of the parameters (e.g., negative probabilities). In this case, we choose to fit the per-class arrival rates $\mu_k$ exactly and find the feasible values $\tilde{G}_k$ that minimize the following L2-norm
$$\sum_{k=1}^{K}\left(\frac{\tilde{G}_k - G_k}{G_k}\right)^2 = \frac{1}{2}\mathbf{x}^T H \mathbf{x} + \mathbf{f}^T\mathbf{x} + K, \tag{19}$$
with $\mathbf{x} = (\tilde{G}_1, \ldots, \tilde{G}_K)$, $H = \mathrm{diag}(2/G_1^2, \ldots, 2/G_K^2)$, $\mathbf{f}^T = (-2/G_1, \ldots, -2/G_K)$, $K$ being the number of classes of the M3PP, and subject to the constraints
$$-c_{i,2}\tilde{G}_k \le c_{i,1}\mu_k \qquad \forall i = 1, 2;\ k = 1, \ldots, K, \tag{20}$$
$$c_{i,2}\sum_{k=1}^{K}\tilde{G}_k = 1 - c_{i,1}\mu \qquad \forall i = 1, 2. \tag{21}$$
These constraints are derived using equations (16) and (17) for $q_k$ and $q'_k$. In particular, (20) stems directly from the constraints $q_k \ge 0$ and $q'_k \ge 0$, whereas (21) follows from the conditions $\sum_{k=1}^{K} q_k = 1$ and $\sum_{k=1}^{K} q'_k = 1$.
The above optimization program may also be used for fitting mean class rates $\mu_k$ and the contributions $g_k(t)$ to the index of dispersion $I(t)$. In order to do so, it is sufficient to replace the coefficients $c_{i,j}$ with the coefficients $f_{i,j}$, $G_k$ with $g_k(t)$, and $\tilde{G}_k$ with $\tilde{g}_k(t)$.
In either case, the program has a convex quadratic objective function, thus its minimizer can be efficiently obtained using standard quadratic programming solvers. Once the problem is solved and the feasible values $\tilde{G}_k$ or $\tilde{g}_k(t)$ are obtained, the parameters $q_k$ and $q'_k$ are readily fitted using (16) and (17).
4. Compositional fitting of M3PP[K]s
In the previous section, we have defined a general purpose fitting method for two-state M3PPs. We now consider the problem of exploiting the additional degrees of freedom of a M3PP[K] to increase the flexibility of the fitting. In this case, exact fitting methods capable of exploiting all available degrees of freedom do not exist even for unmarked processes. Therefore we focus on compositional fitting, where one builds a large process by composition of smaller processes that are simpler to fit to count data. The drawback of compositional approaches of this kind is that they use just a few degrees of freedom of a general MMAP in return for ease of fitting.
We first review and generalize superposition methods for unmarked MMPPs. Afterwards, we introduce a novel form of composition, called interposition, which offers a different trade-off between accuracy and compactness of the representation. Note that we do not consider methods that are specific to inter-arrival processes, such as the Kronecker product composition (Casale et al., 2010).
4.1. Superposition
Unmarked case. Consider a set of $J$ MMPPs $(D_0^j, D_1^j)$, $1 \le j \le J$; their superposition is the MMPP process $(D_0^+, D_1^+)$ where
$$D_0^+ = \bigoplus_{j=1}^{J} D_0^j, \qquad D_1^+ = \bigoplus_{j=1}^{J} D_1^j,$$
in which $\oplus$ denotes the Kronecker sum operator (Brewer, 1978). Superposition naturally arises in networking to describe the traffic process obtained by merging separate traffic flows, each described by a MMPP. The superposed process defines the multiplexing of $J$ channels, each with inter-arrival times described by the $j$-th MMPP. This process has mean arrival rate and variance-time curve equal to the sum of mean arrival rates and variance-time curves of the individual MMPPs. The index of dispersion is a weighted sum of the IDCs of the individual MMPPs, i.e., $I(t) = \sum_j (\mu_j/\mu) I_j(t)$, where $\mu = \sum_j \mu_j$ is the mean arrival rate of the superposition. Also, if MMPP $j$ has $m_j$ states, the superposed process has $\prod_{j=1}^{J} m_j$ states, thus its size grows exponentially with $J$.
Marked case. Let $\mathcal{K} = \{1, \ldots, K\}$ be a set of classes. We consider $J$ M3PPs and assume that M3PP $j$ generates arrivals of classes $\mathcal{K}_j \subset \mathcal{K}$, with $\mathcal{K}_1 \cup \mathcal{K}_2 \cup \cdots \cup \mathcal{K}_J = \mathcal{K}$. In this case, some M3PPs contribute to arrivals of class $k$, thus
$$D_0^+ = \bigoplus_{j=1}^{J} D_0^j, \qquad D_{1,k}^+ = \bigoplus_{j=1}^{J} D_{1,k}^j, \quad \forall k \in \mathcal{K},$$
where we define $D_{1,k}^j = 0$ if $k \notin \mathcal{K}_j$, where $0$ is here a matrix of all zeros of order $m_j$. The statistical properties of the embedded unmarked process are obtained as in an unmarked superposition. Lastly, note that if each M3PP has two states, the resulting process is a M3PP with $2^J$ states and $K$ classes. Thus, the main drawback of the superposition method is the state-space explosion, which limits the composition to a small number of processes.
4.2. Interposition
We now propose a new form of composition for Markov-modulated processes that tackles the state-space explosion problem of superposition. Informally, our idea is to define an operator by which several M3PPs can be defined upon the same state space, without interfering with their respective marginal counting processes. This allows us to preserve in the interposition the counting process properties fitted in isolation for each composed M3PP. We characterize the main feature of this new composition operator in Theorem 1, given later. Note that the results in this section are also applicable to unmarked MMPPs, and thus represent advances also for unmarked processes.
Consider a set of $J$ two-state M3PPs where the $i$-th process has $K_i$ classes. Assume without loss of generality that classes are labelled such that each class appears in one and only one M3PP. The M3PPs have representation
$$D_0^i = \begin{bmatrix} -r_i - \lambda_i & r_i \\ r'_i & -r'_i - \lambda'_i \end{bmatrix}, \qquad D_{1,k}^i = \mathrm{diag}(q_{i,k}\lambda_i,\ q'_{i,k}\lambda'_i), \quad k = 1, \ldots, K_i,$$
where we define the probabilities $\sum_{k=1}^{K_i} q_{i,k} = 1$, $q_{i,k} \ge 0$, and $\sum_{k=1}^{K_i} q'_{i,k} = 1$, $q'_{i,k} \ge 0$, and the rates $r_i = \sum_{j=i}^{J} \alpha_j$ and $r'_i = \sum_{j=1}^{i} \beta_j$, for given constants $\alpha_j \ge 0$ and $\beta_j \ge 0$. Equivalently, given the values of the $r_i$ and $r'_i$ constants, we can define the rate differences $\alpha_i = r_i - r_{i+1}$ and $\beta_i = r'_i - r'_{i-1}$, with boundary values $\beta_1 = r'_1$ and $\alpha_J = r_J$.
We now define the interposition operator for M3PPs. Given the set of $J$ two-state M3PPs $(D_0^i, D_{1,1}^i, \ldots, D_{1,K_i}^i)$, the interposed process $(D_0^*, D_{1,1}^*, \ldots, D_{1,K}^*)$ is the M3PP with $J + 1$ states, $K = \sum_{i=1}^{J} K_i$ classes, and representation
$$D_0^* = \begin{bmatrix} - & \alpha_1 & \alpha_2 & \cdots & \alpha_J \\ \beta_1 & - & \alpha_2 & \cdots & \alpha_J \\ \vdots & \ddots & - & \ddots & \vdots \\ \beta_1 & \cdots & \beta_{J-1} & - & \alpha_J \\ \beta_1 & \cdots & \beta_{J-1} & \beta_J & - \end{bmatrix}, \qquad D_{1,k}^* = \begin{bmatrix} q_{i,c}\lambda_i I_i & 0 \\ 0 & q'_{i,c}\lambda'_i I_{J+1-i} \end{bmatrix},$$
where $I_n$ is the identity matrix of order $n$, class $k = \sum_{j=1}^{i-1} K_j + c$ is the class in the interposed process associated to class $c$ of the $i$-th composed M3PP, and the diagonal elements of $D_0^*$ are such that $(D_0^* + \sum_k D_{1,k}^*)\mathbf{1} = \mathbf{0}$. Throughout this section, we denote by $\mathcal{K}_i$ the set of class indexes in the interposed process associated to the $i$-th composed M3PP, and by $\bar{\mathcal{K}}_i = \{1, \ldots, K\} \setminus \mathcal{K}_i$ its complement.
The interposed process may be seen as a M3PP modulated by a CTMC with an infinitesimal generator $Q^* = D_0^* + \sum_{k=1}^{K} D_{1,k}^*$ that allows exact CTMC aggregation (Bolch, Greiner, de Meer, & Trivedi, 1998, Chap. 4). Specifically, for each of the initial M3PPs, we can define a partition of the states of the interposed process such that, by exact aggregation of the CTMC $Q^*$, one recovers the corresponding CTMC of the initial M3PP. For example, we may consider the aggregation
$$Q^* = \begin{bmatrix} -r_1 & \alpha_1 & \alpha_2 & \cdots & \alpha_J \\ \beta_1 & -r_2 - r'_1 & \alpha_2 & \cdots & \alpha_J \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ \beta_1 & \cdots & \beta_{J-1} & -r_J - r'_{J-1} & \alpha_J \\ \beta_1 & \cdots & \beta_{J-1} & \beta_J & -r'_J \end{bmatrix} \Rightarrow \begin{bmatrix} -r_1 & \sum_{i=1}^{J}\alpha_i \\ \beta_1 & -\beta_1 \end{bmatrix} = \begin{bmatrix} -r_1 & r_1 \\ r'_1 & -r'_1 \end{bmatrix} = Q_1,$$
where $Q_1 = D_0^1 + \sum_k D_{1,k}^1$ is the infinitesimal generator of the first composed M3PP. Similarly, for each partition, the definition of $D_{1,k}^*$ ensures that the departure rates are identical to the ones in the composed M3PP.
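The construction of the interposed representation can be sketched as follows in Python. The function name `interpose`, the tuple layout used to describe each two-state M3PP and the numerical values are assumptions for illustration; the check on the signs of $\alpha_j$ and $\beta_j$ reflects the non-negativity requirement stated above.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]
# Assumed layout for a two-state M3PP: (lam, lam_p, r, r_p, q, q_p),
# where q[c] and q_p[c] are the marking probabilities of class c in each state.
TwoStateM3PP = Tuple[float, float, float, float, Sequence[float], Sequence[float]]


def interpose(procs: Sequence[TwoStateM3PP]) -> Tuple[Matrix, List[Matrix]]:
    """Return (D0*, [D1*_1, ..., D1*_K]) for J two-state M3PPs in the given order."""
    J = len(procs)
    r = [p[2] for p in procs]
    r_p = [p[3] for p in procs]
    # alpha_i = r_i - r_{i+1} (alpha_J = r_J); beta_i = r'_i - r'_{i-1} (beta_1 = r'_1)
    alpha = [r[i] - (r[i + 1] if i + 1 < J else 0.0) for i in range(J)]
    beta = [r_p[i] - (r_p[i - 1] if i > 0 else 0.0) for i in range(J)]
    if min(alpha + beta) < 0:
        raise ValueError("infeasible ordering: some alpha_j or beta_j is negative")
    n = J + 1
    D0 = [[0.0] * n for _ in range(n)]
    for s in range(n):
        for c in range(n):
            if c > s:
                D0[s][c] = alpha[c - 1]   # entries right of the diagonal
            elif c < s:
                D0[s][c] = beta[c]        # entries left of the diagonal
    D1: List[Matrix] = []
    for i, (lam, lam_p, _, _, q, q_p) in enumerate(procs):
        for qc, qpc in zip(q, q_p):
            # first i+1 states use q*lam, the remaining states use q'*lam'
            diag = [qc * lam if s <= i else qpc * lam_p for s in range(n)]
            D1.append([[diag[s] if s == c else 0.0 for c in range(n)] for s in range(n)])
    for s in range(n):
        # diagonal chosen so that rows of D0* + sum_k D1*_k sum to zero
        D0[s][s] = -(sum(D0[s]) + sum(Dk[s][s] for Dk in D1))
    return D0, D1


# Two illustrative two-class processes
A = (3.0, 4.0, 3.0, 1.0, [1 / 3, 2 / 3], [3 / 4, 1 / 4])
B = (2.0, 3.0, 1.0, 4.0, [1 / 2, 1 / 2], [2 / 3, 1 / 3])
D0_star, D1_star = interpose([A, B])
```

The order of the processes matters: swapping two entries of the input list changes the $\alpha_j$ and $\beta_j$ differences and may turn a feasible interposition into an infeasible one.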
In order to prove that the marginal counting process of each composed M3PP, for all classes $k \in \mathcal{K}_i$, is preserved by the interposition operator, we require additional notation. Let $P(\mathbf{n}_i, t)$, $\mathbf{n}_i = (n_1, \ldots, n_{K_i})$, be the $2 \times 2$ counting process matrix for the $i$-th M3PP at time $t$. Similarly, let $P^*(\mathbf{n}, t)$ be the $(J+1) \times (J+1)$ counting process matrix for the composed M3PP, where $\mathbf{n} = (\mathbf{n}_1, \ldots, \mathbf{n}_J)$. Let $\mathbf{0}_n$ and $\mathbf{1}_n$ be row vectors of size $n$ of all zeros and all ones, respectively, and define the aggregation matrix (Buchholz, 1994)
$$V_i = \begin{bmatrix} \mathbf{1}_i & \mathbf{0}_{J+1-i} \\ \mathbf{0}_i & \mathbf{1}_{J+1-i} \end{bmatrix}.$$
For an arbitrary $(J+1) \times (J+1)$ matrix $X$, the $2 \times 2$ matrix $X_i = W_i X V_i^T$, where $W_i = \mathrm{diag}(\mathbf{1}_{J+1} V_i^T)^{-1} V_i$, gives the aggregation of $X$ into the two states associated to the $i$-th composed M3PP. Therefore $P_i^*(\mathbf{n}, t) = W_i P^*(\mathbf{n}, t) V_i^T$ is the counting process matrix aggregated onto the states of the $i$-th composed M3PP. From this matrix, we can readily compute the marginal counting process matrix of the $i$-th M3PP as
$$P_i^*(\mathbf{n}_i, t) = \sum_{n_h :\, h \in \bar{\mathcal{K}}_i} P_i^*(\mathbf{n}, t),$$
where the summation is over all classes not associated to the $i$-th M3PP. Using this notation, we prove the main characterization result for the interposed process, which states that the interposition operator does not affect the marginal counting process for the $i$-th M3PP.
Theorem1. Assumeallprocessestobetime-stationary,then P
(
ni,t)
=P∗i(
ni,t)
,∀
t≥ 0,ni≥ 0. (22)Proof. Forthe composed M3PPit is simple to verifythat Vi has thefollowingproperties
Q∗VTi =ViTQi, D∗1,kVTi =VTiDi1,k,
∀
k∈Ki. (23) Note that conditions of this kind naturallyarise in the study of minimal representations of Markovian and Rational Arrival Pro-cesses(Buchholz&Telek,2013).LetusthenconsidertheKolmogorovforwardequationforP∗(n,
t). Pre-multiplying(1)by Wi andpost-multiplying byVTi,we ob-tain
∂
P∗i(
n,t)
∂
t =WiP ∗(
n,t)
D∗ 0V T i + k∈Ki P∗i(
n− ek,t)
Di1,k + k∈K¯i WiP∗(
n− ek,t)
D∗1,kVTi,wherewehaveused(23)tosimplifytheexpressionandthe nota-tionn− ek indicates theremovalofa job ofclassk fromn. Now pluggingtheidentityD∗0=Q∗−K
k=1D∗1,kandusingagain(23)we get
∂
P∗i(
n,t)
∂
t =P ∗ i(
n,t)(
Qi− k∈Ki Di1,k)
+ k∈Ki P∗i(
n− ek,t)
Di1,k + k∈K¯i Wi(
P∗(
n− ek,t)
−P∗(
n,t))
D∗1,kVTi =P∗i(
n,t)
Di0+ k∈Ki P∗i(
n− ek,t)
Di1,k + k∈K¯i Wi(
P∗(
n− ek,t)
− P∗(
n,t))
D∗1,kVTi.WenowconsiderthemarginalprobabilityP∗i
(
ni,t)
,forwhichthe lastexpressionimplies∂
P∗i(
ni,t)
∂
t = nh:h∈K¯i P∗i(
n,t)
Di0+ nh:h∈K¯i k∈Ki P∗i(
n− ek,t)
Di1,k + nh:h∈K¯i k∈K¯i Wi(
P∗(
n− ek,t)
− P∗(
n,t))
D∗1,kVTi. (24) Since theinfinitesummations inthelast termof(24)are onthe sameclassindexes(K¯isets),andP∗(
n− ek,t)
=0whennk=0,by symmetrythedoublesummationvanishes.Thus(24)reducesto∂
P∗i(
ni,t)
∂
t =P ∗ i(
ni,t)
Di0+ k∈Ki P∗i(
ni− ek,t)
Di1,k,which is identical to the Kolmogorov forward equation for the counting process of the i-th composed M3PP. In order to prove this statement, we just need to show that the initial conditions oftheKolmogorov forwardequationsarethesame,i.e.,P
(
ni,0)
=P∗i
(
ni,0)
,∀
ni.Sinceweareconsideringtime-stationaryprocesses, the initial state of the interposed process is determined by the equilibrium distribution of the CTMC with generator Q∗. Con-versely, forthe i-thM3PP thisisgiven bythe equilibrium distri-butionoftheCTMCwithgeneratorQi.By(23)weseethatforan arbitraryvectorπ
$$W_i \pi Q^* V_i^T = W_i \pi V_i^T Q^i = \pi^i Q^i.$$

Therefore, if we choose the initial distribution to be the time-stationary distribution $\pi Q^* = 0$, $\pi \mathbf{1} = 1$, we readily find that its aggregation corresponds to the time-stationary distribution of the i-th M3PP, i.e., $\pi^i Q^i = 0$ and $\pi^i \mathbf{1} = 1$, where the last condition holds since we use exact aggregation. This concludes the proof.

We remark that (23) provides a condition for a general MMAP to define a valid interposition. It is simple to see that the diagonal structure of the $D_{1,k}$ matrices in a M3PP satisfies these assumptions. While our analysis does not exclude that other kinds of MMAPs may be used for interposition, the conditions required to satisfy (23) do not seem to readily suggest alternatives other than the M3PP. This motivates the use of M3PPs as a building block for interposition.

4.2.1. Feasibility of an interposition
We now give examples of valid and invalid interposed processes. Consider the following two-state M3PP[2]s:
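$$A_0 = \begin{bmatrix} -6 & 3 \\ 1 & -5 \end{bmatrix}, \quad
B_0 = \begin{bmatrix} -3 & 1 \\ 4 & -7 \end{bmatrix}, \quad
C_0 = \begin{bmatrix} -9 & 7 \\ 2 & -5 \end{bmatrix}, \quad
E_0 = \begin{bmatrix} -7 & 4 \\ 1 & -3 \end{bmatrix},$$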
where $A_{1,1} = \mathrm{diag}(1,3)$, $A_{1,2} = E_{1,1} = \mathrm{diag}(2,1)$, $B_{1,1} = C_{1,1} = \mathrm{diag}(1,2)$, and $B_{1,2} = C_{1,2} = E_{1,2} = \mathrm{diag}(1,1)$. It is easy to see that the interposition of $A = (A_0, A_{1,1}, A_{1,2})$ and $B = (B_0, B_{1,1}, B_{1,2})$ satisfies the assumptions on the non-negativity of the rates $\alpha_j$ and $\beta_j$ and yields the composed 3-state M3PP[4]

$$D^*_0 = \begin{bmatrix} -8 & 2 & 1 \\ 1 & -8 & 1 \\ 1 & 3 & -11 \end{bmatrix},$$

with $D^*_{1,1} = \mathrm{diag}(1,3,3)$, $D^*_{1,2} = \mathrm{diag}(2,1,1)$, $D^*_{1,3} = \mathrm{diag}(1,1,2)$, $D^*_{1,4} = \mathrm{diag}(1,1,1)$. The per-class arrival rates and variances of counts in the interposed process match the corresponding statistics of the two classes of $A$ and the two classes of $B$. Conversely, the interposition of processes $A$ and $C = (C_0, C_{1,1}, C_{1,2})$ is invalid because the non-diagonal rates of $C_0$ are both greater than the corresponding rates in $A_0$. Similarly, the interposition of $A$ and $E = (E_0, E_{1,1}, E_{1,2})$ is invalid, even though $B$ and $E$ are the same process, because their states are ordered in a different way and this affects the computation of the $\alpha_j$ and $\beta_j$ coefficients. This last example provides intuition on the fact that, to obtain a feasible interposed process, one may need to re-order the states of the M3PPs. In the next section, we describe an algorithm to find a feasible interposition of a given set of M3PPs, if one exists.

4.2.2. Class covariances
While the interposed process preserves the marginal counting properties of the composed M3PPs, when compared to superposition this comes at the expense of introducing spurious covariances between arrivals of different M3PPs. This is because, by definition of the interposed process, a transition in the state space of a composed M3PP also changes the current state of the other composed M3PPs. In order to quantify the magnitude of these covariances, we look at the asymptotic covariances, which contribute to the index of dispersion. Observe that, by plugging (3) into (4), we find after some simplifications
$$\sigma_{k,h} = \lim_{t\to\infty} \mathrm{Cov}[n_k(t), n_h(t)] = -2\mu_k\mu_h + \pi D_{1,h}(\mathbf{1}\pi - Q)^{-1} D_{1,k}\mathbf{1} + \pi D_{1,k}(\mathbf{1}\pi - Q)^{-1} D_{1,h}\mathbf{1}.$$

Let $P = (\mathbf{1}\pi - Q)^{-1}$ and observe that this by construction is a stochastic matrix with equilibrium vector $\pi$. Using the fact that $\pi D_{1,k} = \mu_k$, for any class $k$, and after determining the structure of the $P$ matrices for the interposed process, it is possible to compute their Jordan canonical form, which after algebraic simplifications yields the formula
$$\sigma_{k,h} = \frac{r_i r_j\,(q_{j,h}\lambda_j - q_{j,h}\lambda_j)(q_{i,k}\lambda_i - q_{i,k}\lambda_i)(x_j + x_i)}{x_j^2\, x_i^2}, \tag{25}$$
where it is assumed that $k \in \mathcal{K}_i$ and $h \in \mathcal{K}_j$, and $i < j$. Some remarks on the formulas are as follows:
• When any of the two M3PPs tends to a Poisson process, a pair of departure rates at the numerator of (25) annihilates and $\sigma_{k,h}$ goes to zero.
• The order of the denominator suggests that a way to reduce the covariance introduced by interposition may consist in spending degrees of freedom to maximize $x_i$ and $x_j$ in the embedded MMPPs, for fixed arrival rates. When applying such a scheme, one should however take into account the bounds in Proposition 3, since an increase of the value of $x_i$ also reduces fitting flexibility in the embedded MMPP.
• Since $x_i > 0$ for any $i$, we note that the sign of the covariance is determined only by the differences between the per-class arrival rates within each of the composed M3PPs.
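As a numerical illustration of the covariance introduced by interposition, the following Python sketch evaluates the asymptotic covariance expression of this section on the interposition of A and B from Section 4.2.1 and, for comparison, on their exact superposition built with Kronecker sums. The helper names (`stationary`, `class_rates`, `asymptotic_cov`) are introduced here for illustration only. The aggregation matrices `VA` and `VB` are an assumption: they are inferred from the diagonal structure of the example and are not taken from the paper's definition of $V_i$.

```python
import numpy as np

def stationary(Q):
    """Equilibrium vector pi of a CTMC generator Q (pi Q = 0, pi 1 = 1)."""
    n = Q.shape[0]
    lhs = np.vstack([Q.T, np.ones(n)])      # balance equations + normalization
    rhs = np.zeros(n + 1)
    rhs[-1] = 1.0
    return np.linalg.lstsq(lhs, rhs, rcond=None)[0]

def class_rates(D0, D1):
    """Per-class arrival rates pi D_{1,k} 1."""
    pi = stationary(D0 + sum(D1))
    return np.array([pi @ D @ np.ones(len(pi)) for D in D1])

def asymptotic_cov(D0, D1, k, h):
    """sigma_{k,h} = -2 mu_k mu_h + pi D_h Z D_k 1 + pi D_k Z D_h 1,
    with Z = (1 pi - Q)^{-1}, as in the expression quoted in this section."""
    Q = D0 + sum(D1)
    one = np.ones(Q.shape[0])
    pi = stationary(Q)
    Z = np.linalg.inv(np.outer(one, pi) - Q)
    mu_k, mu_h = pi @ D1[k] @ one, pi @ D1[h] @ one
    return (-2.0 * mu_k * mu_h
            + pi @ D1[h] @ Z @ D1[k] @ one
            + pi @ D1[k] @ Z @ D1[h] @ one)

# Two-state M3PP[2]s A and B from Section 4.2.1.
A0 = np.array([[-6.0, 3.0], [1.0, -5.0]])
A1 = [np.diag([1.0, 3.0]), np.diag([2.0, 1.0])]
B0 = np.array([[-3.0, 1.0], [4.0, -7.0]])
B1 = [np.diag([1.0, 2.0]), np.diag([1.0, 1.0])]

# Interposed 3-state M3PP[4] reported in Section 4.2.1.
I0 = np.array([[-8.0, 2.0, 1.0], [1.0, -8.0, 1.0], [1.0, 3.0, -11.0]])
I1 = [np.diag([1.0, 3.0, 3.0]), np.diag([2.0, 1.0, 1.0]),
      np.diag([1.0, 1.0, 2.0]), np.diag([1.0, 1.0, 1.0])]

# Exact superposition (4 states) via Kronecker sums, for comparison.
E = np.eye(2)
S0 = np.kron(A0, E) + np.kron(E, B0)
S1 = [np.kron(D, E) for D in A1] + [np.kron(E, D) for D in B1]

# Assumed aggregation matrices: interposed states -> states of A and of B.
VA = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]])
VB = np.array([[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
QA, QB, QI = A0 + sum(A1), B0 + sum(B1), I0 + sum(I1)
cond23 = (np.allclose(QI @ VA.T, VA.T @ QA)
          and np.allclose(QI @ VB.T, VB.T @ QB)
          and all(np.allclose(I1[k] @ VA.T, VA.T @ A1[k]) for k in range(2))
          and all(np.allclose(I1[2 + k] @ VB.T, VB.T @ B1[k]) for k in range(2)))

rates_int = class_rates(I0, I1)          # per-class rates, interposition
rates_sup = class_rates(S0, S1)          # per-class rates, superposition
cov_int = asymptotic_cov(I0, I1, 0, 2)   # class 1 of A vs class 1 of B
cov_sup = asymptotic_cov(S0, S1, 0, 2)   # same pair, exact superposition
cov_int_flat = asymptotic_cov(I0, I1, 0, 3)  # class 1 of A vs class 2 of B

print("condition (23) holds:", cond23)
print("rates (interposed):  ", rates_int)
print("rates (superposed):  ", rates_sup)
print("cross-covariances:   ", cov_int, cov_sup, cov_int_flat)
```

Under these assumptions, both compositions give the same per-class arrival rates, the cross-covariance is numerically zero for the superposition, and the interposed process shows a small positive cross-covariance (about 0.045) between class 1 of A and class 1 of B. The covariance with class 2 of B is zero because that class has the same rate in both states, in line with the third remark above.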
5. Fitting the interposed process

In this section we consider two issues that arise in compositional fitting based on interposition. First, we consider the decision problem involving the mapping of a marked trace into a set of two-state M3PPs. Then we show that the problem of finding a feasible interposition of a set of $J$ second-order M3PPs, where the i-th M3PP models any subset $\mathcal{K}_i$ of the $K$ classes, can be formulated as a MILP and hence solved using an integer programming solver.
5.1. Fitting a marked trace into a set of M3PPs

Given a trace with arrivals of $K$ classes, if the class arrivals are independent it is possible to fit the trace using a superposition of $K$ independent MMPPs, one for each class, and then reduce the size of the resulting process using interposition. However, if the classes have a significantly large covariance, this method may not produce good results. Therefore, we propose two heuristic fitting methods to address these two cases. We assess the effectiveness of these methods later in Section 6.
5.1.1. Independent method

In this method, we ignore class covariances and fit each class into a separate two-state MMPP. The method is similar to superposition, but returns a much smaller M3PP[K] with $K+1$ states, instead of $2^K$ states. Interposition uses the order of composition obtained from the MILP method that we present in Section 5.2.
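To give a sense of the difference in state-space size, the following Python sketch builds the generator of the exact superposition of $K$ two-state modulating chains through repeated Kronecker sums and compares its order with the $K+1$ states quoted above. The switching rates in `Qs` are hypothetical values chosen only for illustration, and the helper names `kron_sum` and `superpose` are not from the paper.

```python
from functools import reduce

import numpy as np

def kron_sum(A, B):
    """Kronecker sum of two generators: joint chain of two independent CTMCs."""
    return np.kron(A, np.eye(B.shape[0])) + np.kron(np.eye(A.shape[0]), B)

def superpose(generators):
    """Generator of the exact superposition of independent modulating chains."""
    return reduce(kron_sum, generators)

K = 5
# Hypothetical two-state generators, one per class (illustrative rates only).
Qs = [np.array([[-(1.0 + k), 1.0 + k], [2.0, -2.0]]) for k in range(K)]

Q_sup = superpose(Qs)
superposed_states = Q_sup.shape[0]   # grows as 2**K
interposed_states = K + 1            # state count quoted for the interposed M3PP[K]

print(superposed_states, interposed_states)
```

For $K = 5$ the superposed generator already has 32 states against 6 for the interposed process, and the gap doubles with every additional class.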
5.1.2. Covariance-based method

We initially build the asymptotic covariance matrix