Experiments: Basic Problems and Solutions
2.1 Precision control of steady-state mean values
2.1.3 Methods of Batch Means
The methods of batch means (BM) that have been proposed can be categorised into three classes: non-overlapping batch means (NOBM), overlapping batch means (OBM), and spaced batch means (SBM).
All of them require that the analysed data sequences are stationary, so initial observations, collected during the initial transient period, should be discarded following, for example, the method discussed in the next section. Then, "steady-state" observations are collected during a single (long) simulation run and, to weaken the correlations between consecutive data, the recorded sequence of n original observations $x_1, x_2, \ldots, x_n$ is divided into non-overlapping batches $(x_{11}, x_{12}, \ldots, x_{1m})$, $(x_{21}, x_{22}, \ldots, x_{2m})$, ... of size m, sufficiently large to make the mean values over these batches (almost) independent.
Thus, mean values over consecutive non-overlapping batches of observations are used as (secondary) output data in the statistical analysis of simulation results. This approach is based on the assumption that observations further apart in time are less correlated; see [BRIL73] for a formal justification. By the central limit theorem [POLY20], the batch means should also be approximately normally distributed.
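The batching step described above can be sketched in a few lines of Python. This is only an illustration, not a procedure taken from the cited works: the function names batch_means and mean_and_half_width are hypothetical, and the Student-t quantile is assumed to be supplied by the caller for the chosen confidence level and number of batches.

```python
import math
import statistics


def batch_means(observations, batch_size):
    """Split observations into non-overlapping batches and return their means.

    Trailing observations that do not fill a whole batch are dropped.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    n_batches = len(observations) // batch_size
    return [
        statistics.fmean(observations[i * batch_size:(i + 1) * batch_size])
        for i in range(n_batches)
    ]


def mean_and_half_width(means, t_quantile):
    """Point estimate and confidence-interval half-width from batch means.

    Treats the batch means as independent, approximately normal data.
    t_quantile is the Student-t quantile for len(means) - 1 degrees of
    freedom, chosen by the caller (an assumption of this sketch).
    """
    k = len(means)
    grand_mean = statistics.fmean(means)
    std_err = statistics.stdev(means) / math.sqrt(k)
    return grand_mean, t_quantile * std_err
```

The interval is only as trustworthy as the independence and normality assumptions on the batch means, which is why the choice of batch size matters.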
Selecting a batch size m that ensures uncorrelated batch means appears to be the major problem. A natural solution is to estimate the correlation between batch means starting from an initial batch size $m_1$, and, if the correlation cannot be ignored, increase the batch size and repeat the test. Thus, at this stage the method in its sequential version requires two procedures: the first responsible for sequentially testing for an acceptable batch size, and the second responsible for sequentially testing the accuracy of estimators. The sequence of batch means can be regarded as non-autocorrelated when the correlation coefficients of all lags assume small magnitudes; say, if they are less than 0.05. One can also determine the threshold for neglecting the autocorrelations in a statistical way, by testing their values at an assumed level of significance; see [ADAM83] and [WELC83, p.306].
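A minimal sketch of the first procedure (searching for an acceptable batch size) might look as follows. Several choices here are assumptions made for illustration and do not come from the cited works: the batch size is doubled on each failure, lags are checked up to 10% of the number of batches, and at least 10 batches are required; the function names are hypothetical.

```python
import statistics


def autocorrelation(values, lag):
    """Sample autocorrelation coefficient of `values` at the given lag."""
    n = len(values)
    mean = statistics.fmean(values)
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0.0:
        return 0.0  # constant sequence: treat as uncorrelated
    num = sum((values[i] - mean) * (values[i + lag] - mean)
              for i in range(n - lag))
    return num / denom


def select_batch_size(observations, initial_size, threshold=0.05,
                      max_lag_fraction=0.1, min_batches=10):
    """Grow the batch size until all tested lags fall below the threshold.

    Returns the accepted batch size, or None if the data run out before
    an acceptable size is found.
    """
    m = initial_size
    while len(observations) // m >= min_batches:
        k = len(observations) // m
        means = [statistics.fmean(observations[i * m:(i + 1) * m])
                 for i in range(k)]
        max_lag = max(1, int(max_lag_fraction * k))
        if all(abs(autocorrelation(means, lag)) < threshold
               for lag in range(1, max_lag + 1)):
            return m
        m *= 2  # assumed growth rule; other increments are possible
    return None
```

A return value of None signals that more observations are needed, which is the point at which a sequential implementation would extend the simulation run.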
One of the problems associated with estimating the correlation coefficients is that estimates of correlation coefficients of higher lags are less reliable since they can be calculated from fewer data points within the batch. Usually it is suggested to consider lags not greater than 25% of the sample size ([BOXJ70, p.33]) or even not greater than 8-10% (cf. [GEIS64]). Law and Carson [LAWC79] have proposed a procedure for selecting the batch size for processes with autocovariances monotonically decreasing with the value of the lag; see also [LAWK82]. In such a case only the lag 1 autocorrelation has to be taken into account. In the same class of processes one may also test batch means against autocorrelation using von Neumann's statistic [VONN41]. Such an approach was applied in [FISH78a] to processes with positive autocorrelation which decreases monotonically with m. Its sequential implementation, together with the control variates variance reduction technique, was proposed in [ANON86]. The observations were batched not by count but by time, i.e., over equal time intervals, whose length was specially selected, giving an uncorrelated sequence of time means over the intervals. Generally, procedures proposed for selecting a proper batch size can employ various statistical techniques and various criteria, and the final size of batches is random; different batch sizes will normally be found even in different replications of the same process.
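As an illustration of a test based on von Neumann's statistic, the sketch below uses the commonly quoted form C = 1 - (sum of squared successive differences) / (2 x sum of squared deviations from the mean), which for k independent normal values is approximately normal with mean 0 and variance (k - 2)/(k^2 - 1). The one-sided decision rule, the default normal quantile of 1.645 and the function names are assumptions of this sketch, not details taken from [FISH78a] or [VONN41].

```python
import math
import statistics


def von_neumann_c(means):
    """Return C = 1 - sum((x[i+1]-x[i])^2) / (2 * sum((x[i]-mean)^2))."""
    k = len(means)
    mean = statistics.fmean(means)
    dev = sum((x - mean) ** 2 for x in means)
    if dev == 0.0:
        raise ValueError("batch means are constant; statistic undefined")
    diff = sum((means[i + 1] - means[i]) ** 2 for i in range(k - 1))
    return 1.0 - diff / (2.0 * dev)


def looks_uncorrelated(means, z_quantile=1.645):
    """One-sided test against positive autocorrelation of the batch means.

    Accepts the batch means as uncorrelated when C does not exceed the
    normal critical value scaled by the approximate standard deviation
    of C under independence.
    """
    k = len(means)
    std = math.sqrt((k - 2) / (k * k - 1))
    return von_neumann_c(means) <= z_quantile * std
```

A strongly trending sequence such as 1, 2, 3, 4, 5 gives C = 0.8 and is rejected, whereas an alternating sequence gives a negative C and passes this one-sided check.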
Some studies of batch means techniques reported poor coverage of batch means interval estimators when they are applied in simulation studies of heavily loaded systems. This is probably because batch sizes that are too small are sometimes accepted as sufficient for getting uncorrelated batch means. For example, the procedures proposed by Fishman in [FISH78a] and [FISH78, p.240] can select batches of as few as 8 observations. Law [LAW83] refers to simulation studies of M/M/1 queues in which the method of batch means with the procedure proposed in [LAWC79] was used. Using $k_b = 10$ batches of size m = 32, for system utilisation $\rho = 0.9$, and 500 repeated simulation experiments, the achieved coverage of the nominal 90% confidence intervals was only 63%. For these reasons, Kleijnen et al. [KLEI82] suggested the use of a modified Fishman's procedure accepting batches at least 100 observations long, while Welch [WELC83, p.307] recommended constructing batches at least 5 times larger than the size m given by a test against autocorrelation, provided that at least 10 such batches can be recorded.
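A coverage experiment of this kind can be sketched as below. It is not a reproduction of the study reported in [LAW83]: the batch size is simply fixed rather than chosen by the [LAWC79] procedure, the observed quantity is assumed to be the delay in queue, the service rate is set to 1, and each run is started from the stationary delay distribution so that no warm-up deletion is needed. The function names and the seed are hypothetical, and the default t quantile 1.833 is only valid for 10 batches at the 90% level.

```python
import math
import random
import statistics


def mm1_delays(n, rho, rng):
    """n consecutive delays in queue of a stationary M/M/1 queue (mu = 1)."""
    lam, mu = rho, 1.0
    # Stationary delay: zero with probability 1 - rho, else Exp(mu - lam).
    w = 0.0 if rng.random() >= rho else rng.expovariate(mu - lam)
    delays = []
    for _ in range(n):
        delays.append(w)
        # Lindley recursion: add a service time, subtract an interarrival time.
        w = max(0.0, w + rng.expovariate(mu) - rng.expovariate(lam))
    return delays


def coverage(rho=0.9, batch_size=32, n_batches=10, replications=500,
             t_quantile=1.833, seed=1):
    """Fraction of replications whose interval contains the true mean delay."""
    rng = random.Random(seed)
    true_mean = rho / (1.0 - rho)  # mean delay in queue when mu = 1
    hits = 0
    for _ in range(replications):
        delays = mm1_delays(batch_size * n_batches, rho, rng)
        means = [statistics.fmean(delays[i * batch_size:(i + 1) * batch_size])
                 for i in range(n_batches)]
        centre = statistics.fmean(means)
        half = t_quantile * statistics.stdev(means) / math.sqrt(n_batches)
        hits += abs(centre - true_mean) <= half
    return hits / replications
```

With batches this short relative to the correlation length of the process, the estimated coverage should be expected to fall noticeably below the nominal 90%; increasing batch_size while keeping n_batches fixed is one way to observe the effect of longer batches.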
Schmeiser [SCME82] analysed theoretically the trade-off between the number of uncorrelated batches, the batch size and the coverage of confidence intervals. It was shown that usually the number of batches used in the analysis of confidence intervals should be not less than 10, and need not be greater than 30, if the simulation run is long enough to secure an adequate degree of normality and independence of batch means. This means that having determined a batch size which gives negligibly correlated and approximately normal batch means (which can sometimes require even a few hundred batches to be tested), there is no need to use more than $k_b = 30$ batches to obtain confidence intervals with a good coverage. Thus, confidence intervals should be more reliable if obtained from a small number of longer batches. Merging the tested batches into fewer, longer ones improves the normality and independence of batch means, and as such usually yields better coverage of the confidence intervals.
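The merging of many tested batches into a small number of longer ones can be sketched as follows. The function name rebatch and the rule of grouping adjacent batch means into equal-sized groups are assumptions of this illustration. It relies on the fact that, for batches of equal size, the mean of a group of adjacent batch means equals the mean of the corresponding longer batch.

```python
import math
import statistics


def rebatch(means, max_batches=30):
    """Merge adjacent equal-size batch means into at most max_batches means.

    Trailing batch means that do not fill a whole group are dropped.
    """
    k = len(means)
    if k <= max_batches:
        return list(means)
    group = math.ceil(k / max_batches)  # batch means merged per new batch
    n_groups = k // group
    return [statistics.fmean(means[i * group:(i + 1) * group])
            for i in range(n_groups)]
```

For example, 100 batch means are merged in groups of 4 into 25 longer batches, which lies within the range of 10 to 30 batches discussed above.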