SERIAL CORRELATION - RANDOM NUMBER GENERATION

4.4.2 SERIAL CORRELATION

It is frequently the case that one is interested in the extent to which a set of data is autocorrelated (i.e., self-correlated). This is particularly true, for example, in a steady-state analysis of the waits experienced by consecutive jobs entering a service node. Intuitively, particularly if the utilization of the service node is high, there will be a high positive correlation between the wait $w_i$ experienced by the $i$th job and the wait $w_{i+1}$ experienced by the next job. Indeed, there will be a statistically significant positive correlation between $w_i$ and $w_{i+j}$ for some range of small, positive $j$ values.

In general, let $x_1, x_2, \ldots, x_n$ be data which is presumed to represent $n$ consecutive observations of some stochastic process whose serial correlation we wish to characterize.

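In the $(u_i, v_i)$ notation used previously in this section, we pick a (small) fixed positive integer $j \ll n$ and then associate $u_i$ with $x_i$ and $v_i$ with $x_{i+j}$ as illustrated

$$
\begin{array}{rccccccccccccc}
u: &     &        &     & x_1     & x_2     & x_3     & \cdots & x_i     & \cdots & x_{n-j} & x_{n-j+1} & \cdots & x_n \\
v: & x_1 & \cdots & x_j & x_{1+j} & x_{2+j} & x_{3+j} & \cdots & x_{i+j} & \cdots & x_n     &           &        &
\end{array}
$$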

The integer $j > 0$ is called the autocorrelation lag (or shift). Although the value $j = 1$ is generally of primary interest, it is conventional to calculate the serial correlation for a range of lag values $j = 1, 2, \ldots, k$ where $k \ll n$.*

Because of the lag we must resolve how to handle the “non-overlap” in the data at the beginning and end. The standard way to handle this non-overlap is to do the obvious — ignore the extreme data values. That is, define the sample autocovariance for lag j, based only on the n − j overlapping values, as

$$
c_j = \frac{1}{n-j} \sum_{i=1}^{n-j} (x_i - \bar{x})(x_{i+j} - \bar{x}) \qquad j = 1, 2, \ldots, k,
$$

where the sample mean, based on all $n$ values, is

$$
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i.
$$

The associated autocorrelation is then defined as follows.

Definition 4.4.5 The sample autocorrelation for lag $j$ is

$$
r_j = \frac{c_j}{c_0} \qquad j = 1, 2, \ldots, k,
$$

where the sample variance is

$$
c_0 = s^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2.
$$
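As a minimal sketch (not the book's program), the autocovariance and Definition 4.4.5 can be evaluated directly in Python. The function name `sample_autocorrelation` and the five-value data list are illustrative assumptions. The mean is computed first and the lagged sums afterward, so the data are traversed more than once.

```python
def sample_autocorrelation(x, k):
    """Return [r_1, ..., r_k] using deviations about the sample mean.

    Hypothetical helper: c_j is averaged over the n - j overlapping
    pairs, and c_0 is the sample variance with divisor n.
    """
    n = len(x)
    if not 0 < k < n:
        raise ValueError("need 0 < k < n")
    mean = sum(x) / n                                  # first pass: sample mean
    c0 = sum((xi - mean) ** 2 for xi in x) / n         # sample variance c_0
    r = []
    for j in range(1, k + 1):
        # lagged cross-products over the n - j overlapping values
        cj = sum((x[i] - mean) * (x[i + j] - mean) for i in range(n - j)) / (n - j)
        r.append(cj / c0)
    return r


data = [1.0, 2.0, 3.0, 4.0, 5.0]   # illustrative data only
print(sample_autocorrelation(data, 2))
```

For this small increasing sequence the lag-1 value works out to $c_1 / c_0 = 1/2$, which can be checked by hand from the definitions.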

* Serial statistics are commonly known as auto-statistics, e.g., autocovariance and autocorrelation.

Computational Considerations

The problem with the “obvious” definition of the sample autocovariance is that an implementation based on this definition would involve a two-pass algorithm. For that reason, it is common to use the following alternate definition of the sample autocovariance.

Although this definition is not algebraically equivalent to the “obvious” definition, if $j \ll n$ then the numerical difference between these two autocovariance definitions is slight.

Because it can be implemented as a one-pass algorithm, Definition 4.4.6 is preferred (in conjunction with Definition 4.4.5).

Definition 4.4.6 The sample autocovariance for lag $j$ is

$$
c_j = \left( \frac{1}{n-j} \sum_{i=1}^{n-j} x_i x_{i+j} \right) - \bar{x}^2 \qquad j = 1, 2, \ldots, k.
$$
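The one-pass claim can be illustrated with a short Python sketch for a single fixed lag. The function names, the seed, and the use of uniform pseudo-random data are assumptions made only for illustration. The first function follows the “obvious” definition and needs the mean before it can start; the second follows Definition 4.4.6 and reads each value exactly once.

```python
import random
from collections import deque


def autocov_two_pass(x, j):
    """'Obvious' c_j: deviations about the mean, so the mean must come first."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[i] - mean) * (x[i + j] - mean) for i in range(n - j)) / (n - j)


def autocov_one_pass(stream, j):
    """c_j per Definition 4.4.6, consuming the data in a single pass (j >= 1)."""
    if j < 1:
        raise ValueError("lag j must be at least 1")
    recent = deque(maxlen=j)      # the j most recent values
    total = 0.0                   # running sum of x_i
    cosum = 0.0                   # running sum of x_i * x_{i+j}
    n = 0
    for x in stream:
        if len(recent) == j:
            cosum += recent[0] * x    # recent[0] is the value j steps back
        recent.append(x)
        total += x
        n += 1
    mean = total / n
    return cosum / (n - j) - mean * mean


rng = random.Random(12345)                      # arbitrary seed
x = [rng.random() for _ in range(10000)]        # illustrative data only
gap = abs(autocov_two_pass(x, 1) - autocov_one_pass(iter(x), 1))
print(gap)    # small when j is much less than n
```

On a very short series the two values can differ noticeably; the agreement relies on the lag being small relative to the sample size.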

Example 4.4.3 A modified version of program ssq2 was used to generate a sample of waits and services experienced by 10 000 consecutive jobs processed through an M/M/1 service node, in steady-state, with arrival rate 1.0, service rate 1.25, and utilization 1/1.25 = 0.8. Definitions 4.4.5 and 4.4.6 were used to compute the corresponding sample autocorrelations $r_j$ for $j = 1, 2, \ldots, 50$, in what is commonly known as a sample autocorrelation function, or correlogram, illustrated in Figure 4.4.6.

Figure 4.4.6. Sample autocorrelation functions for the M/M/1 queue. (Plot of $r_j$ against lag $j$ from 0 to 50, vertical scale from $-0.2$ to $1.0$; • marks wait, ◦ marks service.)

As expected, the sample wait autocorrelation is positive and high for small values of j, indicating that each job’s wait is strongly (auto)correlated with the wait of the next few jobs that follow. Also, as expected, the sample autocorrelation decreases monotonically toward zero as j increases. The rate of decrease may be slower than expected; if the utilization were smaller (larger), the rate of decrease would be higher (lower). It is quite surprising that the wait times of two jobs separated by 49 intervening jobs have a moderately strong positive correlation. Also, as expected, the sample service autocorrelation is essentially zero for all values of j, consistent with the stochastic independence of the service times.

We select the first three values of the sample autocorrelation function for the wait times in order to interpret the magnitude of the $r_j$ values. First, $r_0 = 1$ means that there is perfect correlation between each observation and itself, since the lag associated with $r_0$ is zero. The next sample autocorrelation value, $r_1 = 0.957$, indicates that adjacent (lag 1) jobs, such as job number 15 and job number 16, have a statistically significant strong positive correlation. If the 15th job has a long wait, then the 16th job is almost certain to also have a long wait. Likewise, if the 15th job has a short wait, then the 16th job is almost certain to also have a short wait. Anyone who has waited in a busy queue recognizes this notion intuitively. Finally, consider the estimated lag-two sample autocorrelation $r_2 = 0.918$. This autocorrelation is not quite as strong as the lag-one autocorrelation due to the increased temporal distance between the wait times. The positive value of $r_2$ indicates that wait times two jobs apart (e.g., the 29th and the 31st wait times) tend to be above the mean wait time together or below the mean wait time together.
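An experiment in the spirit of Example 4.4.3 can be sketched in Python. This is not program ssq2: the function names, the seed, and the Lindley-type delay recurrence are assumptions made for illustration, and the node starts empty with no warm-up period, so the output only approximates steady-state behavior. The numbers produced will not reproduce the values quoted in the example.

```python
import random


def simulate_mm1(n, arrival_rate, service_rate, seed=1):
    """Return (waits, services) for n consecutive jobs at a single-server FIFO node.

    Assumption: wait = delay in queue + service time, with the delay
    obtained from the recurrence d_i = max(0, d_{i-1} + s_{i-1} - a_i),
    where a_i is the interarrival time preceding job i.
    """
    rng = random.Random(seed)
    waits, services = [], []
    delay, prev_service = 0.0, 0.0
    for _ in range(n):
        interarrival = rng.expovariate(arrival_rate)
        delay = max(0.0, delay + prev_service - interarrival)
        service = rng.expovariate(service_rate)
        waits.append(delay + service)
        services.append(service)
        prev_service = service
    return waits, services


def autocorrelation(x, j):
    """r_j = c_j / c_0 with c_j as in Definition 4.4.6 (c_0 is the sample variance)."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum(v * v for v in x) / n - mean * mean
    cj = sum(x[i] * x[i + j] for i in range(n - j)) / (n - j) - mean * mean
    return cj / c0


waits, services = simulate_mm1(10000, arrival_rate=1.0, service_rate=1.25)
for j in (1, 2, 10, 50):
    print(j, round(autocorrelation(waits, j), 3), round(autocorrelation(services, j), 3))
```

The wait column should start high and decay with the lag, while the service column should stay near zero, mirroring the qualitative pattern described for Figure 4.4.6.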

Graphical Considerations

Several formats are common for displaying the sample autocorrelation function. In Figure 4.4.6, we plot the $r_j$ values as points. Another common practice is to draw “spikes” from the horizontal axis to the $r_0, r_1, \ldots, r_k$ values. It is certainly not appropriate to connect the points to produce a piecewise-linear function. This would imply that $r_j$ is defined for non-integer values of $j$, which it is not.

Statistical Considerations

The previous example indicated that jobs separated by 50 lags have wait times that are positively correlated. But how do we know that $r_{50}$ differs significantly from 0? Leaving out the details, Chatfield (2004, page 56) indicates that an $r_j$ value will fall within the limits $\pm 2/\sqrt{n}$ with approximate probability 0.95 when the lag $j$ values are uncorrelated.

In the previous example with $n = 10\,000$, for instance, these limits are at $\pm 0.02$, indicating that all of the wait time sample autocorrelation values plotted differ significantly from 0.

For the service times, only $r_{10} = 0.022$ falls outside of these limits. Experience dictates that this is simply a function of random sampling variability rather than some relationship between service times separated by 10 jobs. We have set up our service time model with independent service times, so we expect a flat sample autocorrelation function for service times. The spurious value can be ignored.
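The arithmetic behind these limits is easy to check. The short Python sketch below uses a hypothetical helper name, `significance_limit`, and only the sample autocorrelation values quoted in the text.

```python
import math


def significance_limit(n):
    """Half-width 2 / sqrt(n) of the approximate limits about zero."""
    return 2.0 / math.sqrt(n)


limit = significance_limit(10000)
print(limit)

# Values quoted in the text: wait r_1 and r_2, and service r_10.
quoted = {"wait r_1": 0.957, "wait r_2": 0.918, "service r_10": 0.022}
for label, r in quoted.items():
    print(label, "outside limits" if abs(r) > limit else "inside limits")
```

The service value exceeds the limit only marginally, whereas the quoted wait values exceed it by a wide margin.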

The high autocorrelation that typically exists in the time-sequenced stochastic data produced by a simulation makes the statistical analysis of the data a challenge. Specifically, if we wish to make an interval estimate of some steady-state statistic like, for example, the average wait in a service node, we must be prepared to deal with the impact of autocorrelation on our ability to make accurate estimates of the standard deviation. Much of so-called “classical” statistics relies on the assumption that the values sampled are drawn independently from a population. This is often not the case in discrete-event simulation, and appropriate measures must be taken in order to compute valid interval estimates.

Program acs

To implement Definition 4.4.6 as a one-pass algorithm for a fixed lag value $j$ involves nothing more than storing the values $x_i$, $x_{i+j}$, accumulating the $x_i$ sum, and accumulating the $x_i^2$ and $x_i x_{i+j}$ “cosums.” It is a greater challenge to construct a one-pass algorithm that will compute $c_j$ for a range of lags $j = 1, 2, \ldots, k$. In addition to the accumulation of the $x_i$ sum, the simultaneous computation of $c_0, c_1, \ldots, c_k$ involves storing the $k + 1$ consecutive values $x_i, x_{i+1}, \ldots, x_{i+k}$ and accumulating the $k + 1$ (lagged) $x_i x_{i+j}$ cosums for $j = 0, 1, 2, \ldots, k$. The $k + 1$ cosums can be stored as an array of length $k + 1$. A more interesting queue data structure is required to store the values $x_i, x_{i+1}, \ldots, x_{i+k}$. This queue has been implemented as a circular array in the program acs. A circular array is a natural choice here because the queue length is fixed at $k + 1$ and efficient access to all the elements in the queue, not just the head and tail, is required. In the following algorithm the box indicates the rotating head of the circular queue. An array index $p$ keeps track of the current location of the rotating head; the initial value is $p = 0$.

Algorithm 4.4.1 Program acs is based on the following algorithm. A circular queue is initially filled with $x_1, x_2, \ldots, x_k, x_{k+1}$, as illustrated by the boxed elements below. The lagged products $x_1 x_{1+j}$ are computed for all $j = 0, 1, \ldots, k$, thereby initializing the $k + 1$ cosums. Then the next data value is read into the (old) head of the queue location, $p$ is incremented by 1 to define a new head of the queue location, the lagged products $x_2 x_{2+j}$ are computed for all $j = 0, 1, \ldots, k$, and the cosums are updated. This process is continued until all the data has been read and processed. (The case $n \bmod (k + 1) = 2$ is illustrated.)

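$$
\begin{array}{lcccccccl}
(i = k+1)  & \boxed{x_1}     & x_2            & x_3            & \cdots & x_{k-1} & x_k          & x_{k+1}          & (p = 0) \\
(i = k+2)  & x_{k+2}         & \boxed{x_2}    & x_3            & \cdots & x_{k-1} & x_k          & x_{k+1}          & (p = 1) \\
(i = k+3)  & x_{k+2}         & x_{k+3}        & \boxed{x_3}    & \cdots & x_{k-1} & x_k          & x_{k+1}          & (p = 2) \\
\vdots     & \vdots          & \vdots         & \vdots         &        & \vdots  & \vdots       & \vdots           & \vdots \\
(i = 2k)   & x_{k+2}         & x_{k+3}        & x_{k+4}        & \cdots & x_{2k}  & \boxed{x_k}  & x_{k+1}          & (p = k-1) \\
(i = 2k+1) & x_{k+2}         & x_{k+3}        & x_{k+4}        & \cdots & x_{2k}  & x_{2k+1}     & \boxed{x_{k+1}}  & (p = k) \\
(i = 2k+2) & \boxed{x_{k+2}} & x_{k+3}        & x_{k+4}        & \cdots & x_{2k}  & x_{2k+1}     & x_{2k+2}         & (p = 0) \\
\vdots     & \vdots          & \vdots         & \vdots         &        & \vdots  & \vdots       & \vdots           & \vdots \\
(i = n)    & x_{n-1}         & x_n            & \boxed{x_{n-k}} & \cdots & x_{n-4} & x_{n-3}      & x_{n-2}          & (p = 2)
\end{array}
$$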

After the last data value, $x_n$, has been read, the associated lagged products computed, and the cosums updated, all that remains is to “empty” the queue. This can be accomplished by effectively reading $k$ additional 0-valued data values. For more details, see program acs.
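The steps of Algorithm 4.4.1 can be sketched in Python as follows. This is an illustrative reading of the algorithm, not the source of program acs: the function name `acs_one_pass`, the variable names, and the choice to return $r_0, r_1, \ldots, r_k$ as a list are all assumptions.

```python
def acs_one_pass(data, k):
    """Return [r_0, ..., r_k] from one pass over `data` using a circular array.

    Hypothetical sketch: cosum[j] accumulates x_i * x_{i+j}, and the
    autocovariances follow Definition 4.4.6 with c_0 as the variance.
    """
    size = k + 1
    hold = []                 # circular array of the k + 1 most recent values
    cosum = [0.0] * size
    total = 0.0               # running sum of x_i
    n = 0
    p = 0                     # index of the head (the oldest stored value)
    stream = iter(data)

    for x in stream:          # fill the queue with x_1, ..., x_{k+1}
        hold.append(float(x))
        total += x
        n += 1
        if n == size:
            break
    if n < size:
        raise ValueError("need at least k + 1 data values")

    for x in stream:          # process the head, then overwrite it with new data
        head = hold[p]
        for j in range(size):
            cosum[j] += head * hold[(p + j) % size]
        hold[p] = float(x)
        total += x
        n += 1
        p = (p + 1) % size

    for _ in range(size):     # empty the queue: k + 1 heads remain to be processed
        head = hold[p]
        for j in range(size):
            cosum[j] += head * hold[(p + j) % size]
        hold[p] = 0.0         # a 0 makes lagged products past x_n vanish
        p = (p + 1) % size

    mean = total / n
    c = [cosum[j] / (n - j) - mean * mean for j in range(size)]
    return [cj / c[0] for cj in c]


print(acs_one_pass([1.0, 2.0, 3.0, 4.0, 5.0], 2))
```

Because `data` is consumed through an iterator, the same function works on a list or on values produced one at a time, for example by a generator that reads a file line by line. The modular index `(p + j) % size` is what lets every element of the queue be reached from the head without moving any stored value.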

4.4.3 EXERCISES

Exercise 4.4.1 Prove that the orthogonal distance from the point $(u_i, v_i)$ to the line $a u + b v + c = 0$ is in fact

$$
d_i = \frac{|a u_i + b v_i + c|}{\sqrt{a^2 + b^2}}.
$$

Hint: consider the squared distance $(u - u_i)^2 + (v - v_i)^2$ from $(u_i, v_i)$ to a point $(u, v)$ on the line and show that $d_i^2$ is the smallest possible value of this distance.

Exercise 4.4.2 (a) If $u_i' = \alpha_u u_i + \beta_u$ and $v_i' = \alpha_v v_i + \beta_v$ for $i = 1, 2, \ldots, n$ and constants $\alpha_u$, $\alpha_v$, $\beta_u$, and $\beta_v$, how does the covariance of the $u'$, $v'$ data relate to the covariance of the $u$, $v$ data? (b) Same question for the correlation coefficients. (c) Comment.

Exercise 4.4.3^a The orthogonal distance regression derivation presented in this section treats both variables equally; there is no presumption that one variable is “independent” and the other is “dependent.” Consider the more common regression approach in which the equation of the regression line is $v = au + b$, consistent with a model that treats $u$ as independent and $v$ as dependent. That is, given the data $(u_i, v_i)$ for $i = 1, 2, \ldots, n$ and the line defined by the equation $v = au + b$, the conventional (non-orthogonal) distance from the point $(u_i, v_i)$ to the line is

$$
\delta_i = |v_i - (a u_i + b)|.
$$

(a) What choice of the $(a, b)$ parameters will minimize the conventional mean-square distance

$$
\Delta = \frac{1}{n} \sum_{i=1}^{n} \delta_i^2 = \frac{1}{n} \sum_{i=1}^{n} (v_i - a u_i - b)^2?
$$

(b) Prove that the minimum value of $\Delta$ is $(1 - r^2) s_v^2$.

Exercise 4.4.4 Prove Theorem 4.4.3.

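Exercise 4.4.5 To what extent are these two definitions of the autocovariance different?

$$
c_j' = \frac{1}{n-j} \sum_{i=1}^{n-j} (x_i - \bar{x})(x_{i+j} - \bar{x}) \qquad \text{and} \qquad c_j = \left( \frac{1}{n-j} \sum_{i=1}^{n-j} x_i x_{i+j} \right) - \bar{x}^2
$$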

Exercise 4.4.6 (a) Generate a figure like the one in Figure 4.4.6 but corresponding to a utilization of 0.9. Do this for three different rngs streams. Comment. (b) Repeat for a utilization of 0.7. (You can ignore the service autocorrelations.) Comment.

Exercise 4.4.7 If Definition 4.4.6 is used in conjunction with Definition 4.4.5, there is no guarantee that $|r_j| \le 1$ for all $j = 1, 2, \ldots, k$. If it is important to guarantee that this inequality is true, then how should the two definitions be modified?

In document Discrete Event Simulation - A First Course - Lemmis Park (Page 176-181)