3.3.3 SINGLE-SERVER MACHINE SHOP

The third discrete-event simulation model considered in this section is a single-server machine shop. This is a simplified version of the multi-server machine shop model used in Section 1.1 to illustrate model building at the conceptual, specification, and computational level (see Example 1.1.1).

A single-server machine shop model is essentially identical to the single-server service node model first introduced in Section 1.2, except for one important difference. The service node model is open in the sense that an effectively infinite number of jobs are available to arrive from “outside” the system and, after service is complete, return to the “outside”. In contrast, the machine shop model is closed because there are a finite number of machines (jobs) that are part of the system — as illustrated in Figure 3.3.8, there is no “outside”.

Figure 3.3.8. Single-server machine shop system diagram (operational machines feed a queue and a single server; repaired machines return to the operational pool).

In more detail, there is a finite population of statistically identical machines, all of which are initially in an operational state (so the server is initially idle and the queue is empty). Over time these machines fail, independently, at which time they enter a broken state and request repair service at the single-server service node.* Once repaired, a machine immediately re-enters an operational state and remains in this state until it fails again. Machines are repaired in the order in which they fail, without interruption. Correspondingly, the queue discipline is FIFO, non-preemptive and conservative. There is no feedback.

To make the single-server machine shop model specific, we assume that the service (repair) time is a Uniform(1.0, 2.0) random variate, that there are M machines, and that the time a machine spends in the operational state is an Exponential(100.0) random variate. All times are in hours. Based on 100 000 simulated machine failures, we want to estimate the steady-state time-averaged number of operational machines and the server utilization as a function of M.

* Conceptually the machines move along the network arcs indicated, from the operational pool into and out of service and then back to the operational pool. In practice, the machines are usually stationary and the server moves to the machines. The time, if any, for the server to move from one machine to another is part of the service time.

Program ssms

Program ssms simulates the single-server machine shop described in this section. This program is similar to program ssq2, but with two important differences.

• The library rngs is used to provide an independent source of random numbers to both the simulated machine failure process and the machine repair process.

• The failure process is defined by the array failure which represents the time of next failure for each of the M machines.

The time-of-next-failure list (array) is not maintained in sorted order and so it must be searched completely each time a machine failure is simulated. The efficiency of this O(M) search could be a problem for large M. Exercise 3.3.7 investigates computational efficiency improvements associated with an alternative algorithm and associated data structure.
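The approach described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not program ssms itself: Python's `random.Random` stands in for the rngs library, two separately seeded generators stand in for the two independent streams, and the names `simulate_machine_shop`, `broken`, and `completion` are hypothetical. Once the requested number of failures has been simulated, no further failures are generated and the queue is allowed to drain.

```python
import math
import random
from collections import deque


def simulate_machine_shop(M, n_failures, seed=12345):
    """Return (time-averaged operational machines, server utilization)."""
    fail_rng = random.Random(seed)        # stream for operational times
    repair_rng = random.Random(seed + 1)  # stream for repair times

    # failure[i] = time of the next failure of machine i (inf while broken).
    failure = [fail_rng.expovariate(1.0 / 100.0) for _ in range(M)]
    broken = deque()        # FIFO queue of broken machines; broken[0] is in repair
    completion = math.inf   # time of the next repair completion (inf if idle)
    t = 0.0
    area_broken = 0.0       # time integral of the number of broken machines
    area_busy = 0.0         # time integral of the server-busy indicator
    failures = 0

    while failures < n_failures or broken:
        # Unsorted array, so finding the next failure is an O(M) scan.
        i = min(range(M), key=failure.__getitem__)
        # After n_failures failures, stop generating failures and drain the queue.
        t_fail = failure[i] if failures < n_failures else math.inf
        t_next = min(t_fail, completion)

        area_broken += (t_next - t) * len(broken)
        area_busy += (t_next - t) * (1 if broken else 0)
        t = t_next

        if t_fail <= completion:            # event: machine i fails
            failures += 1
            failure[i] = math.inf
            broken.append(i)
            if len(broken) == 1:            # server was idle, start repair now
                completion = t + repair_rng.uniform(1.0, 2.0)
        else:                               # event: repair of broken[0] completes
            j = broken.popleft()
            failure[j] = t + fail_rng.expovariate(1.0 / 100.0)
            if broken:
                completion = t + repair_rng.uniform(1.0, 2.0)
            else:
                completion = math.inf

    l_bar = area_broken / t                 # time-averaged number of broken machines
    return M - l_bar, area_busy / t


operational, utilization = simulate_machine_shop(M=20, n_failures=2000)
print(f"M = 20: operational = {operational:.2f}, utilization = {utilization:.2f}")
```

The `min(...)` call is the complete search of the unsorted array; replacing it with a sorted structure is the subject of Exercise 3.3.7.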

Example 3.3.5 Because the time-averaged number in the service node $\bar{l}$ represents the time-averaged number of broken machines, $M - \bar{l}$ represents the time-averaged number of operational machines. Program ssms was used to estimate $M - \bar{l}$ for values of M between 20 and 100, as illustrated in Figure 3.3.9.

Figure 3.3.9. Time-averaged number of operational machines as a function of the number of machines. (Horizontal axis: Number of Machines (M), 0 to 100; vertical axis: $M - \bar{l}$, 0 to 80.)

As expected, for small values of M the time-averaged number of operational machines is essentially M. This is consistent with low server utilization and a correspondingly small value of $\bar{l}$. The angled dashed line indicates the ideal situation where all machines are continuously operational (i.e., $\bar{l} = 0$). Also, as expected, for large values of M the time-averaged number of operational machines is essentially constant, independent of M. This is consistent with a saturated server (utilization of 1.0) and a correspondingly large value of $\bar{l}$. The horizontal dashed line indicates that this saturated-server constant value is approximately 67.
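The value of approximately 67 is consistent with a rough flow-balance argument, offered here as a sketch that assumes steady state. Each operational machine fails at rate 1/100 per hour, the mean of a Uniform(1.0, 2.0) repair time is 1.5 hours, and the symbol $\bar{x}$ denotes server utilization. Equating the long-run failure rate with the long-run repair rate gives:

```latex
\frac{M - \bar{l}}{100} = \frac{\bar{x}}{1.5}
\quad\Longrightarrow\quad
M - \bar{l} = \frac{100}{1.5}\,\bar{x} \approx 66.7\,\bar{x}
```

With a saturated server ($\bar{x} = 1$) the right-hand side is about 66.7, which agrees with the horizontal dashed line.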

Distribution Parameters

The parameters used in the distributions in the models presented in this section, e.g., µ = 2.0 for the average interarrival time and β = 0.20 for the feedback probability in the single-server service node with feedback, have been drawn from thin air. This has been done in order to focus on the simulation modeling and associated algorithms. Chapter 9 on “Input Modeling” focuses on techniques for estimating realistic parameters from data.

3.3.4 EXERCISES

Exercise 3.3.1 Let β be the probability of feedback and let the integer-valued random variable X be the number of times a job feeds back. (a) For x = 0, 1, 2, . . . what is Pr(X = x)? (b) How does this relate to the discussion of acceptance/rejection in Section 2.3?

Exercise 3.3.2a (a) Relative to Example 3.3.2, based on 1 000 000 arrivals, generate a table of $\bar{x}$ and $\bar{q}$ values for β from 0.00 to 0.24 in steps of 0.02. (b) What data structure did you use and why? (c) Discuss how external arrivals are merged with fed-back jobs.

Exercise 3.3.3 For the model of a single-server service node with feedback presented in this section, there is nothing to prevent a fed-back job from colliding with an arriving job. Is this a model deficiency that needs to be fixed and, if so, how would you do it?

Exercise 3.3.4a Modify program ssq2 to account for a finite service node capacity. (a) For capacities of 1, 2, 3, 4, 5, and 6 construct a table of the estimated steady-state probability of rejection. (b) Also, construct a similar table if the service time distribution is changed to be Uniform(1.0, 3.0). (c) Comment on how the probability of rejection depends on the service process. (d) How did you convince yourself these tables are correct?

Exercise 3.3.5a Verify that the results in Example 3.3.4 are correct. Provide a table of values corresponding to the figure in this example.

Exercise 3.3.6 (a) Relative to Example 3.3.5, construct a figure or table illustrating how $\bar{x}$ (utilization) depends on M. (b) If you extrapolate linearly from small values of M, at what value of M will saturation ($\bar{x} = 1$) occur? (c) Can you provide an empirical argument or equation to justify this value?

Exercise 3.3.7a In program ssms the time-of-next-failure list (array) is not maintained in sorted order and so the list must be searched completely each time another machine failure is simulated. As an alternative, implement an algorithm and associated sorted data structure to determine if a significant improvement in computational efficiency can be obtained. (You may need to simulate a huge number of machine failures to get an accurate estimate of computational efficiency improvement.)

Exercise 3.3.8 (a) Relative to Example 3.3.2, compare a FIFO queue discipline with a priority queue discipline where fed-back jobs go to the head of the queue (i.e., re-enter service immediately). (b) Is the following conjecture true or false: although statistics for the fed-back jobs change, system statistics do not change?

Exercise 3.3.9a (a) Repeat Exercise 3.3.7 using M = 120 machines, with the time a machine spends in the operational state increased to an Exponential(200.0) random variate. (b) Use M = 180 machines with the time spent in the operational state increased accordingly. (c) What does the O(·) computational complexity of your algorithm seem to be?

STATISTICS

Sections

4.1. Sample Statistics (program uvs)
4.2. Discrete-Data Histograms (program ddh)
4.3. Continuous-Data Histograms (program cdh)
4.4. Correlation (programs bvs and acs)

The first three sections in this chapter consider the computation and interpretation of univariate-data sample statistics. If the size of the sample is small then essentially all that can be done is compute the sample mean and standard deviation. Welford’s algorithm for computing these quantities is presented in Section 4.1. In addition, Chebyshev’s inequality is derived and used to illustrate how the sample mean and standard deviation are related to the distribution of the data in the sample.

If the size of the sample is not small, then a sample data histogram can be computed and used to analyze the distribution of the data in the sample. Two algorithms are presented in Section 4.2 for computing a discrete-data histogram. A variety of examples are presented as well. Similarly, a continuous-data histogram algorithm and a variety of examples are presented in Section 4.3. In both of these sections concerning histograms, the empirical cumulative distribution function is presented as an alternative way to graphically present a data set. The fourth section deals with the computation and interpretation of paired-data sample statistics, in both paired-correlation and auto-correlation applications.

Correlation is apparent in many discrete-event simulations. The wait times of adjacent jobs in a single-server node, for example, tend to be above their means and below their means together, that is, they are “positively” correlated.

Discrete-event simulations generate experimental data, frequently a lot of experimental data. To facilitate the analysis of all this data, it is conventional to compress the data into a handful of meaningful statistics. We have already seen examples of this in Sections 1.2 and 3.1, where job averages and time averages were used to characterize the performance of a single-server service node. Similarly, in Sections 1.3 and 3.1, averages were used to characterize the performance of a simple inventory system. This kind of “within-the-run” data generation and statistical computation (see programs ssq2 and sis2, for example) is common in discrete-event simulation. Although less common, there is also a “between-the-runs” kind of statistical analysis commonly used in discrete-event simulation. That is, a discrete-event simulation program can be used to simulate the same stochastic system repeatedly; all you need to do is change the initial seed for the random number generator from run to run. This process is known as replication.
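The distinction between within-the-run and between-the-runs statistics can be sketched in Python. This is an illustration only, not one of the book's programs: the function `average_wait` is a hypothetical stand-in for a single-server service node, and the Exponential interarrival times with mean 2.0 and Uniform(1.0, 2.0) service times are assumed for the sake of the example. Each call computes one within-the-run average; changing the seed produces one replication.

```python
import random
import statistics


def average_wait(n_jobs, seed):
    """Within-the-run statistic: average time in the node over n_jobs jobs."""
    rng = random.Random(seed)
    arrival = 0.0
    departure = 0.0
    total_wait = 0.0
    for _ in range(n_jobs):
        arrival += rng.expovariate(1.0 / 2.0)   # assumed mean interarrival 2.0
        start = max(arrival, departure)         # FIFO, single server
        departure = start + rng.uniform(1.0, 2.0)
        total_wait += departure - arrival       # delay in queue plus service
    return total_wait / n_jobs


# Between-the-runs analysis: one run per seed, then summarize across runs.
seeds = [1, 2, 3, 4, 5, 6, 7, 8]
replications = [average_wait(10_000, seed) for seed in seeds]
between_mean = statistics.fmean(replications)
between_sd = statistics.pstdev(replications)    # 1/n form of the variance
print(f"mean over replications = {between_mean:.3f}, sd = {between_sd:.3f}")
```

The same seed always reproduces the same run, so the only source of variation between the entries of `replications` is the change of seed.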

In either case, each time a discrete-event simulation program is used to generate data it is important to appreciate that this data is only a sample from that much larger population that would be produced, in the first case, if the program were run for a much longer time or, in the second case, if more replications were used. Analyzing a sample and then inferring something about the population from which the sample was drawn is the essence of statistics. We begin by defining the sample mean and standard deviation.

4.1.1 SAMPLE MEAN AND STANDARD DEVIATION

Definition 4.1.1 Given the sample $x_1, x_2, \ldots, x_n$ (either continuous or discrete data) the sample mean is

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i.$$

The sample variance is the average of the squared differences about the sample mean

$$s^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2.$$

The sample standard deviation is the positive square root of the sample variance, $s = \sqrt{s^2}$.

The sample mean is a measure of central tendency of the data values. The sample variance and standard deviation are measures of dispersion, that is, the spread of the data about the sample mean. The sample standard deviation has the same “units” as the data and the sample mean. For example, if the data has units of sec then so do the sample mean and standard deviation. The sample variance would have units of sec$^2$. Although the sample variance is more amenable to mathematical manipulation (because it is free of the square root), the sample standard deviation is typically the preferred measure of dispersion since it has the same units as the data. (Because $\bar{x}$ and $s$ have the same units, the ratio $s/\bar{x}$, known as the coefficient of variation, has no units. This statistic is commonly used only if the data is inherently non-negative, although it is inferior to $s$ as a measure of dispersion because a common shift in the data results in a change in $s/\bar{x}$.)

For statistical reasons that will be explained further in Chapter 8, a common alternative definition of the sample variance (and thus the sample standard deviation) is

$$\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \quad \text{rather than} \quad \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2.$$

Provided one of these definitions is used consistently for all computations, the choice of which equation to use for $s^2$ is largely a matter of taste, based on statistical considerations listed below. There are three reasons that the 1/(n − 1) form of the sample variance appears almost universally in introductory statistics books:

• The sample variance is undefined when n = 1 when using the 1/(n − 1) form. This is intuitive in the sense that a single observed data value indicates nothing about the spread of the distribution from which the data value is drawn.

• The 1/(n − 1) form of the sample variance is an unbiased estimate of the population variance when the data values are drawn independently from a population with a finite mean and variance. The designation of an estimator as “unbiased” in statistical terminology implies that the sample average of many such sample variances converges to the population variance.

• The statistic

$$\sum_{i=1}^{n} (x_i - \bar{x})^2$$

has n − 1 “degrees of freedom”. It is common practice in statistics (particularly in a sub-field known as analysis of variance) to divide a statistic by its degrees of freedom.

Despite these compelling reasons, why do we still use the 1/n form of the sample variance as given in Definition 4.1.1 consistently throughout the book? Here are five reasons:

• The sample size n is typically large in discrete-event simulation, making the difference between the results obtained by using the two definitions small.

• The unbiased property associated with the 1/(n − 1) form of the sample variance applies only when observations are independent. Observations within a simulation run are typically not independent in discrete-event simulation.

• In the unlikely case that only n = 1 observation is collected, the algorithms presented in this text using the 1/n form are able to compute a numeric value (zero) for the sample variance. The reader should understand that a sample variance of zero in the case of n = 1 does not imply that there is no variability in the population.

• The 1/n form of the sample variance enjoys the status as a “plug-in” (details given in Chapter 6) estimate of the population variance. It also is the “maximum likelihood estimate” (details given in Chapter 9) when sampling from bell-shaped populations.

• Authors of many higher-level books on mathematical statistics prefer the 1/n form of the sample variance.
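The two forms of the sample variance can be compared directly. The following Python sketch uses a small made-up sample (the data values are hypothetical) and checks the hand computation against the standard library, where `statistics.pvariance` implements the 1/n form and `statistics.variance` implements the 1/(n − 1) form.

```python
import math
import statistics

sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # hypothetical data
n = len(sample)

mean = sum(sample) / n                          # sample mean
ss = sum((x - mean) ** 2 for x in sample)       # sum of squared differences

var_n = ss / n                                  # 1/n form (Definition 4.1.1)
var_n1 = ss / (n - 1)                           # 1/(n - 1) form
s = math.sqrt(var_n)                            # sample standard deviation
cv = s / mean                                   # coefficient of variation

print(mean, var_n, var_n1, s, cv)

# With a single observation the 1/n form gives zero,
# while the 1/(n - 1) form is undefined.
single = [3.0]
print(statistics.pvariance(single))             # 0.0
try:
    statistics.variance(single)
except statistics.StatisticsError as err:
    print("1/(n - 1) form undefined for n = 1:", err)
```

For this sample the two variances are 4.0 and 32/7; the ratio of the two forms is always n/(n − 1), which approaches 1 as n grows.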

The relationship between the sample mean and sample standard deviation is summarized by Theorem 4.1.1. The proof is left as an exercise.

Theorem 4.1.1 In a root-mean-square (rms) sense, the sample mean is that unique value that best fits the sample $x_1, x_2, \ldots, x_n$ and the sample standard deviation is the corresponding smallest possible rms value. In other words, if the rms value associated with any value of $x$ is

$$d(x) = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - x)^2}$$

then the smallest possible rms value of $d(x)$ is $d(\bar{x}) = s$ and this value is achieved if and only if $x = \bar{x}$.
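Theorem 4.1.1 can be checked numerically. The sketch below is an illustration, not a proof: it uses 50 Uniform(0.0, 2.0) variates as a hypothetical sample, and the function name `rms_about` and the grid spacing of 0.001 are arbitrary choices.

```python
import math
import random


def rms_about(x, sample):
    """The rms value d(x) of the sample about the point x."""
    return math.sqrt(sum((xi - x) ** 2 for xi in sample) / len(sample))


rng = random.Random(2024)
sample = [rng.uniform(0.0, 2.0) for _ in range(50)]   # hypothetical sample

mean = sum(sample) / len(sample)
s = rms_about(mean, sample)        # d(x-bar) is the sample standard deviation

# Evaluate d(x) on a grid over [0, 2] and locate the smallest value.
grid = [i / 1000 for i in range(2001)]
best = min(grid, key=lambda x: rms_about(x, sample))

print(f"sample mean = {mean:.4f}, grid minimizer = {best:.3f}")
print(f"s = {s:.4f}, d(best) = {rms_about(best, sample):.4f}")
```

The grid minimizer agrees with the sample mean to within the grid spacing, and no grid point gives an rms value smaller than s.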

Example 4.1.1 Theorem 4.1.1 is illustrated in Figure 4.1.1 for a random variate sample of size n = 50 that was generated using a modified version of program buffon. The data corresponds to 50 observations of the x-coordinate of the righthand endpoint of a unit-length needle dropped at random (see Example 2.3.10). The 50 points in the sample are indicated with hash marks.

Figure 4.1.1. Sample mean minimizes rms. (Plot of $d(x)$ versus $x$ for $0.0 \le x \le 2.0$, with the minimum value $s$ attained at $x = \bar{x}$; the 50 sample points are shown as hash marks along the $x$-axis.)

For this sample $\bar{x} \cong 1.095$ and $s \cong 0.354$. Consistent with Theorem 4.1.1 the smallest value of $d(x)$ is $d(\bar{x}) = s$ as illustrated.

The portion of Figure 4.1.1 formed by the hash marks along the $x$-axis is known as a univariate scatter diagram: a convenient way to visualize a sample, provided the sample size is not too large. Later in this chapter we will discuss histograms, a superior way to visualize a univariate sample if the sample size is sufficiently large.

Chebyshev’s Inequality

To better understand how the sample mean and standard deviation are related, appreciate that the number of points in a sample that lie within k standard deviations of the mean can be bounded by Chebyshev’s inequality. The derivation of this fundamental result is based on a simple observation: the points that make the largest contribution to the sample standard deviation are those that are most distant from the sample mean.

Given the sample $S = \{x_1, x_2, \ldots, x_n\}$ with mean $\bar{x}$ and standard deviation $s$, and given a parameter $k > 1$, define the set*

$$S_k = \{\, x_i \mid \bar{x} - ks < x_i < \bar{x} + ks \,\}$$

consisting of all those $x_i \in S$ for which $|x_i - \bar{x}| < ks$. For $k = 2$ and the sample in Example 4.1.1, the set $S_2$ is defined by the hash marks that lie within the $2ks = 4s$ wide interval centered on $\bar{x}$ as shown by the portion of Figure 4.1.1 given below.

[Scatter diagram from Figure 4.1.1 with the interval from $\bar{x} - 2s$ to $\bar{x} + 2s$ marked; the interval has width $4s$ and is centered on $\bar{x}$.]

Let $p_k = |S_k|/n$ be the proportion of $x_i$ that lie within $\pm ks$ of $\bar{x}$. This proportion is the probability that an $x_i$ selected at random from the sample will be in $S_k$. From the definition of $s^2$

$$n s^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{x_i \in S_k} (x_i - \bar{x})^2 + \sum_{x_i \in \bar{S}_k} (x_i - \bar{x})^2$$

where the set $\bar{S}_k$ is the complement of $S_k$. If the contribution to $s^2$ from all the points close to the sample mean (the points in $S_k$) is ignored then, because the contribution to $s^2$ from each point in $\bar{S}_k$ (the points far from the sample mean) is at least $(ks)^2$, the previous equation becomes the following inequality

$$n s^2 \ge \sum_{x_i \in \bar{S}_k} (x_i - \bar{x})^2 \ge \sum_{x_i \in \bar{S}_k} (ks)^2 = |\bar{S}_k| (ks)^2 = n(1 - p_k) k^2 s^2.$$

If $n s^2$ is eliminated the result is Chebyshev’s inequality

$$p_k \ge 1 - \frac{1}{k^2} \qquad (k > 1).$$

(This inequality says nothing for $k \le 1$.) From this inequality it follows that, in particular, $p_2 \ge 0.75$. That is, for any sample at least 75% of the points in the sample must lie within $\pm 2s$ of the sample mean.

* Some values in the sample may occur more than once. Therefore, S is actually a multi-set. For the purposes of counting, however, the elements of S are treated as distinguishable so that the cardinality (size) of S is |S| = n.

For the data illustrated by the scatter diagram the true percentage of points within $\pm 2s$ of the sample mean is 98%. That is, for the sample in Example 4.1.1, as for most samples, the 75% in the $k = 2$ form of Chebyshev’s inequality is very conservative. Indeed, it is common to find approximately 95% of the points in a sample within $\pm 2s$ of the sample mean. (See Exercise 4.1.7.)
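How conservative the bound is can be seen by computing $p_k$ directly. The Python sketch below is an illustration under assumptions: the sample of 1000 Exponential variates with mean 1.0 is hypothetical (it is not the data of Example 4.1.1), and `proportion_within` is a name chosen here.

```python
import math
import random


def proportion_within(sample, k):
    """Proportion p_k of points with |x_i - mean| < k * s (1/n form of s)."""
    n = len(sample)
    mean = sum(sample) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in sample) / n)
    inside = sum(1 for x in sample if abs(x - mean) < k * s)
    return inside / n


rng = random.Random(7)
sample = [rng.expovariate(1.0) for _ in range(1000)]   # hypothetical sample

for k in (1.5, 2.0, 3.0):
    p_k = proportion_within(sample, k)
    bound = 1.0 - 1.0 / k ** 2                         # Chebyshev lower bound
    print(f"k = {k}: p_k = {p_k:.3f} >= {bound:.3f}")
```

The bound holds for every sample with $s > 0$, whatever its shape; the observed proportions are typically well above it.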

The primary issue here is not the accuracy (or lack thereof) of Chebyshev’s inequality. Instead, Chebyshev’s inequality and practical experience with actual data suggest that the $\bar{x} \pm 2s$ interval defines the “effective width” of a sample.* As a rule of thumb, most but not all of the points in a sample will fall within this interval. With this in mind, when analyzing experimental data it is common to look for outliers: values so far from the sample mean (for example, $\pm 3s$ or more) that they must be viewed with suspicion.

Linear Data Transformations

It is sometimes the case that the output data generated by a discrete-event simulation, for example the wait times in a single-server service node, are statistically analyzed in one system of units (say seconds) and later there is a need to convert to a different system of units (say minutes). Usually this conversion of units is a linear data transformation and, if so, the change in system statistics can be determined directly, without any need to re-process the converted data.

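That is, if $x'_i = a x_i + b$ for $i = 1, 2, \ldots, n$ then how do the sample mean $\bar{x}$ and standard deviation $s$ of the $x$-data relate to the sample mean $\bar{x}'$ and standard deviation $s'$ of the $x'$-data? The answer is that

$$\bar{x}' = \frac{1}{n} \sum_{i=1}^{n} x'_i = \frac{1}{n} \sum_{i=1}^{n} (a x_i + b) = \frac{a}{n} \left( \sum_{i=1}^{n} x_i \right) + b = a\bar{x} + b$$

and

$$(s')^2 = \frac{1}{n} \sum_{i=1}^{n} (x'_i - \bar{x}')^2 = \frac{1}{n} \sum_{i=1}^{n} (a x_i + b - a\bar{x} - b)^2 = \frac{a^2}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 = a^2 s^2.$$

Therefore

$$\bar{x}' = a\bar{x} + b \quad \text{and} \quad s' = |a| s.$$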

Example 4.1.2 Suppose the sample $x_1, x_2, \ldots, x_n$ is measured in seconds. To convert to minutes, the transformation is $x'_i = x_i/60$ so that if, say $\bar{x} = 45$ (seconds) and $s = 15$ (seconds), then

$$\bar{x}' = \frac{45}{60} = 0.75 \text{ (minutes)} \quad \text{and} \quad s' = \frac{15}{60} = 0.25 \text{ (minutes)}.$$
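The transformation rules can be verified on a small sample. In the Python sketch below the data values are hypothetical, chosen so that the mean is 45 seconds and the standard deviation is 15 seconds as in Example 4.1.2; `mean_and_sd` is a helper name introduced here. The last lines also show that a pure shift leaves $s$ unchanged but changes the coefficient of variation.

```python
import math


def mean_and_sd(sample):
    """Sample mean and standard deviation (1/n form)."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / n)
    return mean, sd


seconds = [30.0, 60.0, 30.0, 60.0]          # hypothetical data: mean 45, sd 15
a, b = 1.0 / 60.0, 0.0                      # seconds to minutes
minutes = [a * x + b for x in seconds]

m, s = mean_and_sd(seconds)
m_new, s_new = mean_and_sd(minutes)
print(m, s, m_new, s_new)

# Statistics of the transformed data follow from those of the original data.
assert math.isclose(m_new, a * m + b)
assert math.isclose(s_new, abs(a) * s)

# A common shift (a = 1, b = 100) leaves s unchanged but changes s / mean.
shifted = [x + 100.0 for x in seconds]
m_shift, s_shift = mean_and_sd(shifted)
print(s / m, s_shift / m_shift)
```

The absolute value in $s' = |a| s$ matters only when $a$ is negative, for example when a measurement scale is reversed.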

* There is nothing magic about the value $k = 2$. Other values like $k = 2.5$ and $k = 3$ are sometimes offered as alternatives to define the $\bar{x} \pm ks$ interval. To the extent that there is a standard, however, $k = 2$ is it. This issue will be revisited in Chapter 8.

Example 4.1.3 Some people, particularly those used to analyzing so-called “normally distributed” data, like to standardize the data by subtracting the sample mean and dividing the result by the sample standard deviation. That is, given a sample $x_1, x_2, \ldots, x_n$ with sample mean $\bar{x}$ and standard deviation $s$, the corresponding standardized sample is computed as

$$x'_i = \frac{x_i - \bar{x}}{s} \qquad i = 1, 2, \ldots, n.$$

It follows that the sample mean of the standardized sample is $\bar{x}' = 0$ and the sample standard deviation is $s' = 1$. Standardization is commonly used to avoid potential numerical problems associated with samples containing very large or very small values.
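Standardization is the linear transformation with $a = 1/s$ and $b = -\bar{x}/s$. A minimal Python sketch follows; the function name `standardize` and the data values are hypothetical, and the guard against $s = 0$ is an added assumption, since the transformation is undefined when all values are equal.

```python
import math


def standardize(sample):
    """Subtract the sample mean and divide by the sample sd (1/n form)."""
    n = len(sample)
    mean = sum(sample) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in sample) / n)
    if s == 0.0:
        raise ValueError("cannot standardize: sample standard deviation is zero")
    return [(x - mean) / s for x in sample]


# Hypothetical sample with large values and a comparatively small spread.
sample = [1.0e6 + 1.0, 1.0e6 + 2.0, 1.0e6 + 4.0, 1.0e6 + 9.0]
z = standardize(sample)

z_mean = sum(z) / len(z)
z_sd = math.sqrt(sum((v - z_mean) ** 2 for v in z) / len(z))
print(z_mean, z_sd)    # approximately 0 and 1, up to rounding
```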

Nonlinear Data Transformations

Although nonlinear data transformations are difficult to analyze in general, there are
