Latin Hypercube sampling - Variance reduction in random number generation

6.4. Latin Hypercube sampling

In higher dimensions, stratified sampling becomes increasingly impractical: the number of strata needed for the same quality of representation grows exponentially with the number of dimensions. If one dimension is stratified into k sub-sample-spaces, then an n-dimensional problem requires kⁿ strata for every region of the sample space to be represented. This is called the curse of dimensionality.³⁵ In 1979, McKay, Beckman and Conover [35] introduced a method called latin hypercube sampling. They extended the idea of latin squares, in which every element appears exactly once in each row and exactly once in each column of a two-dimensional square array, to the n-dimensional case. The method can therefore be seen as a stratification of every dimension separately rather than of the full sample space, so that an increase in the number of dimensions leads only to a linear increase in the number of values to be sampled.

In latin hypercube sampling, every dimension of the sample space Ωⁿ is divided into k equally sized sub-sample-spaces, and we sample exactly once from each of them to obtain a random sample X^(d) = (X₁^(d), ..., X_k^(d)) for every dimension d = 1, ..., n.

Then, we draw a random permutation π^(d) of {1, ..., k} for every dimension d and match the samples in each dimension according to the corresponding permutation.

³⁵ See also Glasserman [21], p. 214 and p. 236.

Figure 6.4: A set of 10 two-dimensional latin hypercube samples. Note that in both dimensions, there is exactly one sample in each stratum.

An n-dimensional latin hypercube of k samples then looks like
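A plausible way to write such an array is sketched below. The orientation (rows index the k samples, columns the n dimensions) and the notation π^(d)(j) for the j-th entry of the permutation π^(d) are assumptions made for illustration.

```latex
\[
\begin{pmatrix}
X^{(1)}_{\pi^{(1)}(1)} & X^{(2)}_{\pi^{(2)}(1)} & \cdots & X^{(n)}_{\pi^{(n)}(1)} \\
X^{(1)}_{\pi^{(1)}(2)} & X^{(2)}_{\pi^{(2)}(2)} & \cdots & X^{(n)}_{\pi^{(n)}(2)} \\
\vdots & \vdots & \ddots & \vdots \\
X^{(1)}_{\pi^{(1)}(k)} & X^{(2)}_{\pi^{(2)}(k)} & \cdots & X^{(n)}_{\pi^{(n)}(k)}
\end{pmatrix}
\]
```

Under this reading, every column contains each of the k stratified samples of its dimension exactly once.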



Note that we have complete dependence within the timesteps, as X_j^(d) ∈ Ω_i implies X_j^(d) ∉ Ω_i′ for i′ ≠ i. Therefore, we cannot apply the standard method for estimating the confidence region of an estimator. A detailed discussion on how to obtain in-sample confidence intervals can be found in Loh [33]. In our experiments, we calculated the empirical variances from 100 independent simulations to obtain reasonably accurate variance estimates.
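The replication idea can be sketched in Python with NumPy as follows. This is only a minimal illustration under assumptions: Ω = [0, 1], the function names `latin_hypercube`, `replicated_variance` and `sum_estimator` are hypothetical, and the integrand is a toy choice rather than the one used in the experiments.

```python
import numpy as np

def latin_hypercube(k, n, rng):
    """Return a (k, n) array of k latin hypercube samples on [0, 1)^n."""
    # One uniform draw inside each of the k strata, for every dimension.
    samples = (np.arange(k)[:, None] + rng.random((k, n))) / k
    # Permute the strata independently in every dimension.
    for d in range(n):
        samples[:, d] = samples[rng.permutation(k), d]
    return samples

def replicated_variance(estimator, k, n, replications=100, seed=0):
    """Empirical variance of an estimator over independent hypercubes."""
    rng = np.random.default_rng(seed)
    estimates = [estimator(latin_hypercube(k, n, rng))
                 for _ in range(replications)]
    return np.var(estimates, ddof=1)

def sum_estimator(samples):
    """Toy estimator (assumption): sample mean of the coordinate sum."""
    return samples.sum(axis=1).mean()

var_lhs = replicated_variance(sum_estimator, k=50, n=3)
print(var_lhs)
```

Because the hypercubes are drawn independently of one another, the usual sample variance applies across replications even though the samples within one hypercube are dependent.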

Example Let Ω := [0, 1], n = 3 and k = 5. This represents the case where we want to draw 5 three-dimensional samples, i.e. five random walks with 3 timesteps. The random variables X_i are then uniformly distributed on the sub-sample-spaces Ω_i given by [(i − 1)/5, i/5], and defined by

$$X_i = \frac{i - 1 + U_i}{5}, \quad \text{where } U_i \sim \mathrm{uniform}(0, 1) \tag{6.12}$$

If we draw three random permutations of {1, 2, 3, 4, 5}, for example π⁽¹⁾ = {2, 4, 5, 1, 3}, π⁽²⁾ = {1, 5, 4, 2, 3}, π⁽³⁾ = {4, 3, 5, 2, 1}, the corresponding latin hypercube looks like
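Since the random terms U_i are not fixed, this example can be reproduced numerically. The sketch below is one possible reading, assuming that sample j takes, in dimension d, the value from stratum π^(d)(j) and that rows index samples; the seed and variable names are arbitrary choices.

```python
import numpy as np

# Permutations from the example (1-based stratum indices).
perms = np.array([
    [2, 4, 5, 1, 3],   # pi^(1)
    [1, 5, 4, 2, 3],   # pi^(2)
    [4, 3, 5, 2, 1],   # pi^(3)
])
n, k = perms.shape

rng = np.random.default_rng(0)   # seed chosen arbitrarily
u = rng.random((n, k))           # u[d, i-1] plays the role of U_i in dimension d

# x[d, i-1] = (i - 1 + U_i) / k lies in stratum [(i-1)/k, i/k), cf. (6.12).
x = (np.arange(k) + u) / k

# Sample j takes, in dimension d, the value from stratum pi^(d)(j).
hypercube = np.stack([x[d, perms[d] - 1] for d in range(n)], axis=1)  # shape (k, n)

# 1-based stratum index of every entry of the hypercube.
strata = np.floor(hypercube * k).astype(int) + 1
print(strata)
```

Under these assumptions the printed stratum indices coincide with the transposed permutation table, so the first sample lies in strata (2, 1, 4) and every column contains each stratum exactly once.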

Restriction to permutation of deterministic lattice points

Assume we want to simulate k paths with n timesteps. So far, we considered latin hypercubes where for each problem dimension, the sample space was divided into k equally sized subintervals. Then, the intervals were permuted and matched at random.

For each interval, we generated a random number U_i to represent the variation within the interval, i.e. X_i = (i − 1 + U_i)/k with U_i ∼ uniform(0, 1).

As the number of samples in each dimension increases, the interval size from which we sample decreases as well. As a result, the variance of X_i diminishes at a rate of

$$\frac{d}{dk}\operatorname{Var}(X_i) = \frac{d}{dk}\operatorname{Var}\!\left(\frac{U_i}{k}\right) = -\frac{1}{6}\,k^{-3} \tag{6.14}$$
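The intermediate steps behind (6.14) can be written out as follows, assuming X_i = (i − 1 + U_i)/k as in (6.12) with 5 replaced by k, and using the fact that a uniform(0, 1) variable has variance 1/12:

```latex
\[
\begin{aligned}
\operatorname{Var}(X_i)
  &= \operatorname{Var}\!\left(\frac{i - 1 + U_i}{k}\right)
   = \frac{\operatorname{Var}(U_i)}{k^2}
   = \frac{1}{12\,k^2}, \\
\frac{d}{dk}\operatorname{Var}(X_i)
  &= -\frac{2}{12\,k^3}
   = -\frac{1}{6}\,k^{-3}.
\end{aligned}
\]
```

The constant shift (i − 1)/k does not contribute to the variance, which is why only the term U_i/k appears.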

This gives rise to the idea of setting aside the random term U_i within the subintervals for larger values of k. Instead, it is replaced with its expected value 1/2, so that instead of the intervals [0, 1/k), [1/k, 2/k), ..., [(k − 1)/k, 1) we have the deterministic set of nodes {1/(2k), 3/(2k), ..., (2k − 1)/(2k)} = {x₁, x₂, ..., x_k}. The simulation is then restricted solely to random permutations of the set {1, ..., k}.
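A minimal Python sketch of this restriction, assuming Ω = [0, 1] and rows as paths; the function name `midpoint_latin_hypercube` is hypothetical:

```python
import numpy as np

def midpoint_latin_hypercube(k, n, rng):
    """k paths with n timesteps; each column permutes the k interval midpoints."""
    nodes = (2 * np.arange(1, k + 1) - 1) / (2 * k)   # x_i = (2i - 1) / (2k)
    # The only randomness left is one permutation of the nodes per timestep.
    return np.column_stack([rng.permutation(nodes) for _ in range(n)])

paths = midpoint_latin_hypercube(k=5, n=3, rng=np.random.default_rng(0))
print(paths)
```

For k = 5 every column is a reordering of the nodes 0.1, 0.3, 0.5, 0.7, 0.9.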

As the numerical results for high-dimensional problems in Chapter 7 indicate, this restriction does not affect the accuracy of the results for latin hypercube sampling and can therefore be considered superior in that particular setting.

Since only a finite number of values is possible in every step, the number of possible combinations of lattice points is finite as well. While this is irrelevant for many timesteps, there is a rather high probability that identical paths occur when there are only a few timesteps. In that case an increased sample size does not increase the accuracy, which is unacceptable. For only one timestep, this method reduces to the deterministic midpoint integration rule, since the permutation of the nodes does not influence the value in one dimension.
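The one-timestep case can be checked numerically. The following sketch assumes Ω = [0, 1] and a toy integrand f(x) = x²; both function names are hypothetical.

```python
import numpy as np

def lhs_midpoint_estimate(f, k, rng):
    """One-timestep estimate of E[f(U)] from randomly permuted midpoints."""
    nodes = (2 * np.arange(1, k + 1) - 1) / (2 * k)
    return f(rng.permutation(nodes)).mean()

def midpoint_rule(f, k):
    """Composite midpoint rule for the integral of f over [0, 1]."""
    nodes = (2 * np.arange(1, k + 1) - 1) / (2 * k)
    return f(nodes).mean()

def f(x):
    return x ** 2   # toy integrand (assumption), exact integral 1/3

a = lhs_midpoint_estimate(f, 10, np.random.default_rng(0))
b = midpoint_rule(f, 10)
print(a, b)
```

Both values agree up to floating-point rounding, because the sample mean does not depend on the order of the nodes.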

Example A latin hypercube with deterministic nodes for k paths and n timesteps:



Blocks of latin hypercube samples

Since the permutation of the strata in the time dimensions relies on sorting algorithms, the complexity of generating a sample is superlinear. Especially for applications where the main difficulty lies within the sampling dimensions rather than the accuracy in one timestep, it is reasonable to split the generation of the latin hypercube samples into multiple batches. This approach is quite popular in the literature. One possible source is Cheng [11], who introduces the idea under the name of cascaded latin hypercube sampling.

Instead of generating one large latin hypercube with k samples in n dimensions, we generate c smaller latin hypercubes. Each has a number k/c of paths and the same number n of time dimensions.

The gain in efficiency is caused by the way the permutations are generated. Usually, this is done by drawing k indexed random numbers and sorting them. The indices of the ordered list then represent a random permutation of the numbers 1, ..., k. A good sorting algorithm requires an effort of at least O(k log k) for k elements, so reducing the number of elements to be sorted leads to a superlinear decrease in the complexity. Simulation results show a remarkable effectiveness of this method.
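The block construction and the sort-based permutation can be sketched as follows. This is an illustrative sketch, not the cited implementation: it assumes Ω = [0, 1], that k is divisible by c, and uses hypothetical function names.

```python
import numpy as np

def permutation_by_sorting(m, rng):
    """Random permutation of 0..m-1 obtained by sorting m uniform draws."""
    return np.argsort(rng.random(m))

def blocked_latin_hypercube(k, n, c, rng):
    """Stack c independent latin hypercubes of k // c paths each."""
    if k % c != 0:
        raise ValueError("k must be divisible by c")
    m = k // c
    blocks = []
    for _ in range(c):
        block = np.empty((m, n))
        for d in range(n):
            strata = permutation_by_sorting(m, rng)   # sorts only m numbers
            block[:, d] = (strata + rng.random(m)) / m
        blocks.append(block)
    return np.vstack(blocks)

samples = blocked_latin_hypercube(k=1000, n=4, c=10, rng=np.random.default_rng(0))
print(samples.shape)
```

Each block is stratified into only k/c intervals per dimension, so the stratification is coarser than in a single hypercube of k samples; this is the trade-off for sorting shorter lists.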

Example

By definition, the simulated timesteps are independent of one another. However, when measuring the empirical correlation between the timesteps, the samples usually exhibit some correlation. This section first briefly covers orthogonal or uncorrelated latin hypercubes as they are described in the literature by Ye [45]. Here, a deterministic orthogonal latin hypercube is constructed. Secondly, we present the approach we used in our experiments, which follows Barraquand [6]. We first generate a latin hypercube sample, measure the empirical correlations and correct the samples thereafter.
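The measurement step can be illustrated with a short Python sketch. It assumes Ω = [0, 1] and arbitrary values for k, n and the seed, and it covers only the measurement of the empirical correlations, not the subsequent correction.

```python
import numpy as np

rng = np.random.default_rng(0)   # seed chosen arbitrarily
k, n = 20, 4

# Plain latin hypercube: permuted strata plus uniform jitter, per timestep.
samples = np.column_stack(
    [(rng.permutation(k) + rng.random(k)) / k for _ in range(n)]
)

# Empirical correlation matrix between the n timesteps (columns).
corr = np.corrcoef(samples, rowvar=False)

# Largest absolute correlation between two different timesteps.
max_offdiag = np.abs(corr - np.eye(n)).max()
print(max_offdiag)
```

For small k the off-diagonal entries are typically clearly different from zero, which is the effect described above.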

In 1998, Ye [45] introduced a method to construct integer matrices whose columns are fully orthogonal with respect to the canonical scalar product (X^(i))^T X^(j). For a number n = 2^m or n = 2^m + 1, his method is able to create an orthogonal matrix with up to 2m − 2 columns or timesteps.
