Synthetically generated Traces - AN ALGORITHMIC APPROACH


Section 5.3) and for fitting general acyclic PH distributions for CAPPs we use a tool called Momfit (cf. [42] and Section 6.2). Using these distributions, a CHEP or CAPP is constructed as described in Chapter 7. Most of the MAPs presented in this study are taken from [100], where different MAP fitting algorithms have been compared.

8.1. Synthetically generated Traces

To assess whether CHEPs and CAPPs can capture the characteristics of other stochastic processes, several different processes have been simulated to obtain synthetically generated traces.

The first trace has been generated by a CHEP(4, 5, 4) consisting of a Hyper-Erlang distribution with three branches, phases S = (1, 1, 2), rates λ = (0.12, 0.46, 2.97) and initial probabilities τ = (0.08, 0.55, 0.37), and an ARMA(5, 4) base process with AR coefficients α = (0.50, 1.09, −0.42, −0.25, 0.078) and MA coefficients β = (1.24, −0.69, −1.39, 0.03).
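The marginal distribution quoted above can be explored with a minimal sketch. It only draws independent samples from the Hyper-Erlang distribution (choose a branch with probability τ_i, then draw an Erlang variate with S_i phases and rate λ_i); it does not reproduce the correlated CHEP, whose construction is described in Chapter 7. The function and constant names are illustrative assumptions, not part of GFIT or of the original study.

```python
import random

# Parameters of the Hyper-Erlang marginal quoted in the text.
PHASES = (1, 1, 2)            # S: number of phases per Erlang branch
RATES = (0.12, 0.46, 2.97)    # lambda: rate of each branch
PROBS = (0.08, 0.55, 0.37)    # tau: initial (branch) probabilities


def hyper_erlang_mean(phases, rates, probs):
    """Mean of a Hyper-Erlang distribution: sum_i tau_i * S_i / lambda_i."""
    return sum(p * s / r for s, r, p in zip(phases, rates, probs))


def sample_hyper_erlang(phases, rates, probs, n, rng):
    """Draw n independent samples: pick a branch, then sample its Erlang."""
    branches = range(len(probs))
    samples = []
    for _ in range(n):
        i = rng.choices(branches, weights=probs)[0]
        # An Erlang(S_i, lambda_i) variate is a gamma variate with
        # integer shape S_i and scale 1 / lambda_i.
        samples.append(rng.gammavariate(phases[i], 1.0 / rates[i]))
    return samples


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed, chosen arbitrarily
    xs = sample_hyper_erlang(PHASES, RATES, PROBS, 100_000, rng)
    print("theoretical mean:", hyper_erlang_mean(PHASES, RATES, PROBS))
    print("sample mean:     ", sum(xs) / len(xs))
```

Comparing the sample mean with the closed-form mean is a quick sanity check that the parameters were transcribed correctly before any fitting is attempted.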

Figure 8.1 shows the results of a CHEP(4, 8, 2) and a CAPP(4, 3, 5) fitted to the trace. As one can see from Figures 8.1(a) and 8.1(b), GFIT was able to recreate the marginal distribution, while Momfit resulted in an acyclic PH distribution with a similar cdf but a different density function. In both cases the autocorrelation was captured almost exactly (cf. Figure 8.1(c)). Observe that the fitted models have a different base process order than the original CHEP, which is caused by slight variations in the autocorrelation structure between the original CHEP and the generated trace.

Figure 8.1.: Fitting results for a trace generated by a CHEP. Panels: (a) cumulative distribution function, (b) probability density function, (c) autocorrelations for lags 1-100. Curves: Trace CHEP, CHEP(4,8,2), CAPP(4,3,5).

CHAPTER 8. EXPERIMENTAL RESULTS

Figure 8.2.: Fitting results for a trace generated by a MAP(3). Panels: (a) cumulative distribution function, (b) probability density function, autocorrelations for lags 1-30. Curves: MAP(3), Trace MAP(3), CHEP(3,5,2), CAPP(3,4,3).

As a second example, a trace was generated from the MAP(3) that was already used in Section 3.2.2. The results are shown in Figure 8.2 and give a similar picture as for the first trace. Again, GFIT was able to deliver a slightly better fit of the distribution than Momfit, but in both cases the autocorrelation was fitted almost exactly by the CHEP(3, 5, 2) and the CAPP(3, 4, 3), respectively.
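The autocorrelation comparisons in these examples rest on lag-k autocorrelation coefficients estimated from a trace. A minimal sketch of one common estimator follows; the normalization by the total sum of squared deviations is an assumption, and the study may use a different variant.

```python
def autocorrelations(trace, max_lag):
    """Sample autocorrelation coefficients of a trace for lags 1..max_lag.

    Each lag-k sum of products is divided by the total sum of squared
    deviations, a common convention (assumed here, not taken from the text).
    """
    n = len(trace)
    mean = sum(trace) / n
    dev = [x - mean for x in trace]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
        for k in range(1, max_lag + 1)
    ]


if __name__ == "__main__":
    # A strictly alternating toy trace: strongly negative at lag 1,
    # strongly positive at lag 2.
    toy_trace = [1.0, 2.0] * 50
    print(autocorrelations(toy_trace, 2))
```

Applying the same function to the trace and to a long sample path of the fitted model gives the two curves that are compared lag by lag.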

To generate traces whose distribution is atypical for a PH distribution and therefore more difficult to fit than in the two previous examples, ARTA processes with different marginal distributions have been used for trace generation. The first process has a Johnson bounded marginal distribution and an ARMA(9, 5) base process.

As one can see from Figure 8.3, both GFIT and Momfit were able to provide a sufficient approximation of the distribution. Note that the Hyper-Erlang distribution returned by GFIT was not adequate for capturing the autocorrelation structure; thus, a transformation was necessary to bring the distribution into series canonical form and add an additional state. Therefore, both processes in Figure 8.3 are CAPPs, although the CAPP(6, 9, 3) resulted from a transformed Hyper-Erlang distribution with 5 states. For the general acyclic PH distribution returned by Momfit, which was used for the second process, the CAPP(5, 8, 7), no transformation was necessary.

Figure 8.3.: Fitting results for a trace generated by an ARTA process with Johnson marginal distribution. Panels: (a) cumulative distribution function, (b) probability density function, autocorrelations for lags 1-50. Curves: Trace ARTA, CAPP(6,9,3), CAPP(5,8,7).

The last synthetically generated trace was the most difficult to fit. It was generated from an ARTA process with Weibull marginal distribution and a base process that exhibits positive and negative autocorrelation. The fitting results are shown in Figure 8.4. For both distributions, the Hyper-Erlang distribution returned by GFIT and the APH distribution fitted according to the moments by Momfit, several transformation steps were necessary to obtain a marginal distribution that could capture the negative correlation of the trace. Figures 8.4(a) and 8.4(b) show that the CAPP(11, 7, 2) using the transformed Hyper-Erlang distribution provides a slightly better approximation of the distribution than the CAPP(8, 5, 3) using an APH distribution fitted according to the moments. As one can see from Figure 8.4(c), both processes were able to provide a good fit of the autocorrelation structure of the trace.
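To illustrate how such a trace can arise, the following sketch generates an ARTA-style series with a Weibull marginal distribution. It is a simplification under stated assumptions: the base process is a Gaussian AR(1) process rather than the ARMA base process used in the study, and the coefficient, shape and scale values are hypothetical. Each base value Z_t is mapped through the standard normal cdf and the inverse Weibull cdf; a negative AR coefficient yields autocorrelations that alternate in sign.

```python
import math
import random


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def weibull_inverse_cdf(u, shape, scale):
    """Quantile function of the Weibull distribution."""
    return scale * (-math.log(1.0 - u)) ** (1.0 / shape)


def arta_weibull_trace(n, ar_coeff, shape, scale, rng):
    """Gaussian AR(1) base process mapped to a Weibull marginal."""
    noise_sd = math.sqrt(1.0 - ar_coeff ** 2)  # keeps Var(Z_t) = 1
    z = rng.gauss(0.0, 1.0)                    # start in the stationary law
    trace = []
    for _ in range(n):
        u = normal_cdf(z)
        u = min(max(u, 1e-12), 1.0 - 1e-12)    # guard against log(0)
        trace.append(weibull_inverse_cdf(u, shape, scale))
        z = ar_coeff * z + noise_sd * rng.gauss(0.0, 1.0)
    return trace


if __name__ == "__main__":
    # Hypothetical parameter values, not those of the study.
    trace = arta_weibull_trace(10_000, -0.5, 1.5, 2.0, random.Random(7))
    print("sample mean:", sum(trace) / len(trace))
```

Because the transformation is monotone but nonlinear, the autocorrelations of the resulting trace differ in magnitude from those of the base process, which is one reason why traces of this kind are demanding test cases.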

Figure 8.4.: Fitting results for a trace generated by an ARTA process with Weibull marginal distribution. Panels: (a) cumulative distribution function (distribution over t), (b) probability density function (density over t), (c) autocorrelations for lags 1-20 (autocorrelation over lag). Curves: Trace ARTA, CAPP(11,7,2), CAPP(8,5,3).

From the previous examples we have seen that CHEPs and CAPPs are able to model a variety of distribution shapes with different autocorrelation structures. However, the basic parameters used for the fitting steps are not chosen in a systematic way. Usually an exhaustive search is performed over a subset of the parameter region to find the best model for the available data. For example, GFIT tries several combinations for the number of branches and the number of phases for each branch and selects the best model from these combinations (cf. [151] and Sect. 5.3). A similar approach is proposed in Sect. 7.2 for selecting the order of the ARMA(p, q) base process. This can be done if the parameter region is not too large, which is often the case, but reaches its limits for cases where a large number of phases or AR and MA coefficients are required to model the distribution or autocorrelation structure, respectively.

Additionally, it is unclear which measures are most important for the fitting quality.
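The exhaustive search over a bounded parameter region can be sketched generically as follows. This is not the procedure of Sect. 7.2 or of GFIT; `fit_error` is a hypothetical callable that would fit a model of order (p, q) and return a scalar error, and `toy_fit_error` is a stand-in used only to make the example runnable.

```python
from itertools import product


def exhaustive_order_search(max_p, max_q, fit_error):
    """Return the (p, q) pair with the smallest fit_error(p, q).

    Every combination with 0 <= p <= max_p and 0 <= q <= max_q is tried,
    so the cost is (max_p + 1) * (max_q + 1) model fits.
    """
    candidates = product(range(max_p + 1), range(max_q + 1))
    return min(candidates, key=lambda pq: fit_error(*pq))


def toy_fit_error(p, q):
    # Hypothetical stand-in for a real fitting step: penalizes the
    # distance from order (3, 2) plus a small cost for model size.
    return (p - 3) ** 2 + (q - 2) ** 2 + 0.01 * (p + q)


if __name__ == "__main__":
    print(exhaustive_order_search(5, 5, toy_fit_error))
```

The number of model fits grows with the product of the ranges, which shows why the approach reaches its limits when many phases or coefficients are required, and the single scalar error hides the open question of which measures should enter it.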

For the analysis of technical systems and also in simulation models where several parameters have an unknown impact on the results, ideas from statistical design of experiments [34] are often used to evaluate the influence of parameters in a systematic way and to identify the parameters that have the main influence on the result measures. It seems promising to consider similar ideas for the parametrization of CHEPs and CAPPs. To the best of the author's knowledge, design of experiments has not been used before for assessing fitting approaches for stochastic processes. However, although design of experiments has the potential to give new insights into the effect of different parameters on the fitting quality and may allow one to find good parameter settings with less effort, the identification of important factors and the experimental setup require careful planning and preliminary considerations to yield meaningful results. Since the different measures and parameters are highly dependent, a straightforward application of factorial designs of experiments [115] does not result in a useful approach. Consequently, the application of more systematic approaches from experimental design for setting parameters of CHEPs and CAPPs is identified as a promising area which should be considered in future research.

In document Fitting simulation input models for correlated traffic data (Page 118-122)