5.2.4 Latin Hypercube Sampling

Section 5.2.1 stated that a set of training runs is used to create the emulator model, which approximates the simulator. These training runs are acquired using Latin hypercube (LHC) sampling [76].

A hyper-rectangle (also called an n-orthotope) is the generalisation of a rectangle to an arbitrary number of dimensions. Just as a two-dimensional rectangle can be extended to a cuboid by specifying a range for a third variable over a third axis, the cuboid can be extended into higher dimensions, for example by specifying n_d ranges for n_d variables over n_d axes.

Latin hypercube sampling is a method of taking a sample of n points from a specified hyper-rectangle. When taking the sample, each axis of the hyper-rectangle formed by the variables of interest (i.e. v₁, ..., v_{N_v}, d₁, ..., d_{N_d}) is divided into n intervals. A sample of size n is then taken which varies the values of the N_v + N_d variables such that, for each variable, exactly one of the sample points has a value in each of the n intervals of that variable's axis.

Figure 5.1 illustrates this process when taking an LHC sample of size 6 from 2 variables of interest. The first variable is assumed to take a value in the range 0.95 to 1.05 and the second in the range 0 to 1500. An LHC sample can be acquired for any given ranges of two (or more) variables, but these particular choices match the example that will be detailed in Section 5.3.1.

As the desired sample size is 6, Figure 5.1 (a) shows how each axis is first divided into 6 intervals. Figure 5.1 (b) then illustrates how the sample is taken such that exactly one point of the sample lies in each row and each column of the resulting grid. Using the centre point of each of these intervals is known as lattice sampling [101].

It is not necessary for each point to lie in the centre of its grid cell. For example, the LHC sample of Figure 5.1 (b) could be used only to identify which interval each point of the sample lies in. A uniform sample within that interval could then be taken to determine the actual values of the sample. Such a sample is illustrated in Figure 5.1 (c), where the points lie in the same rows and columns as in Figure 5.1 (b), but have been randomly sampled (from a uniform distribution) within those intervals.
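The two variants described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the procedure actually used to produce Figure 5.1: the function name `latin_hypercube`, its `centred` flag, the use of equal-width intervals and the use of NumPy are all choices made here for illustration. The ranges are those of the two example variables.

```python
import numpy as np

def latin_hypercube(n, ranges, centred=False, rng=None):
    """Return an (n, d) array with one point per interval on every axis.

    Hypothetical helper: assumes each axis is split into n equal-width intervals.
    """
    rng = np.random.default_rng(rng)
    ranges = np.asarray(ranges, dtype=float)    # shape (d, 2): [low, high] per variable
    d = ranges.shape[0]
    # For each variable, an independent random ordering of the interval indices 0..n-1.
    cells = np.column_stack([rng.permutation(n) for _ in range(d)])
    # Position inside each interval: 0.5 for the centre, otherwise uniform on [0, 1).
    offset = np.full((n, d), 0.5) if centred else rng.random((n, d))
    unit = (cells + offset) / n                 # points in the unit hypercube
    low, high = ranges[:, 0], ranges[:, 1]
    return low + unit * (high - low)            # rescale to the requested ranges

ranges = [(0.95, 1.05), (0.0, 1500.0)]
lattice = latin_hypercube(6, ranges, centred=True, rng=0)   # interval centres
jittered = latin_hypercube(6, ranges, rng=0)                # uniform within intervals
```

Because both calls use the same seed and draw the interval orderings first, `jittered` occupies the same rows and columns as `lattice`, which mirrors the relationship between panels (b) and (c) of Figure 5.1.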

The Latin hypercube sample gives the input values to be used when simulating training runs, which in turn are used to fit the emulator model. An appropriate number of training runs must be selected; this is considered in Sections 5.3.3 and 6.1.3 by examining how the predictive power of a fitted model varies with the size of the LHC sample used.

A potential flaw of LHC sampling is that it is possible for the values of the variables to be highly correlated, which would give a sparse coverage of the sample space. This is illustrated in Figure 5.2, which shows a 6 point sample. This is still technically an LHC sample as each row and column contains exactly one point, but the coverage of the space is very poor in comparison to that of Figure 5.1 (c).
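As a small check of this point, the sketch below builds a deliberately poor sample in which point i is placed in interval i of both variables. The construction is an assumption made here for illustration (it is not taken from Figure 5.2, whose exact points are not given in the text), and it uses interval centres and the ranges of the two example variables.

```python
import numpy as np

n = 6
low = np.array([0.95, 0.0])
high = np.array([1.05, 1500.0])

# Use the same interval index for both variables: point i sits in cell (i, i).
cells = np.column_stack([np.arange(n), np.arange(n)])
diagonal = low + (cells + 0.5) / n * (high - low)

# Each interval of each axis still holds exactly one point ...
is_lhc = all(sorted(cells[:, j].tolist()) == list(range(n)) for j in range(2))
# ... yet the two variables are perfectly correlated.
corr = np.corrcoef(diagonal, rowvar=False)[0, 1]
print(is_lhc, corr)
```

The sample satisfies the LHC condition, but the correlation between the two variables is 1 up to floating-point rounding, so all points lie on a single line through the sample space.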

As a result, there are many methods available which can reduce the pairwise correlation between variables in an LHC sample, to give a better coverage of the sample space.

Figure 5.1: Plots to illustrate how to take Latin hypercube samples. Axes: Variable 1 (ticks from 0.96 to 1.04) and Variable 2 (ticks from 0 to 1500).
(a) Plot to illustrate how the Latin hypercube sample first breaks the hyper-rectangle into n intervals.
(b) Plot to illustrate a possible Latin hypercube sample of size 6.

Figure 5.2: Plot to illustrate a poor Latin hypercube sample. Axes: Variable 1 (ticks from 0.96 to 1.04) and Variable 2 (ticks from 0 to 1500).

For example, [76] consider using a Cholesky decomposition to transform the sample to reduce correlation between the variables, [11] propose two genetic algorithms to improve the Latin hypercube sample, and [99] detail how orthogonal sampling could be used, which would divide the sample space into several subspaces and take separate LHC samples in each subspace. A very simple alternative would be to take many LHC samples, calculate the pairwise correlations between the variables in each sample, and then simulate training runs with the LHC sample which gives the smallest pairwise correlation. Another common, simple alternative is to take many LHC samples and select the sample with the largest minimum distance between points [74, 98, 23].
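The two simple selection strategies described above can be sketched as follows. This is an illustrative sketch only: the helper names, the number of candidates (200), the seed, and the choice to measure distances in the unit hypercube (so that variables with very different ranges contribute comparably) are assumptions made here, not details from the cited works.

```python
import numpy as np

def unit_lhc(n, d, rng):
    """One LHC sample of n points in the unit hypercube [0, 1)^d."""
    cells = np.column_stack([rng.permutation(n) for _ in range(d)])
    return (cells + rng.random((n, d))) / n

def max_abs_correlation(sample):
    """Largest absolute pairwise correlation between the variables."""
    corr = np.corrcoef(sample, rowvar=False)
    off_diagonal = corr[~np.eye(corr.shape[0], dtype=bool)]
    return np.abs(off_diagonal).max()

def min_distance(sample):
    """Smallest Euclidean distance between any two points of the sample."""
    diffs = sample[:, None, :] - sample[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    return dists[np.triu_indices(len(sample), k=1)].min()

rng = np.random.default_rng(1)
candidates = [unit_lhc(6, 2, rng) for _ in range(200)]

# Strategy 1: keep the candidate with the smallest pairwise correlation.
least_correlated = min(candidates, key=max_abs_correlation)
# Strategy 2: keep the candidate with the largest minimum inter-point distance.
maximin = max(candidates, key=min_distance)
```

The chosen sample would then be rescaled to the actual variable ranges before being used to define training runs. The two criteria need not select the same candidate.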

Latin hypercubes have clear advantages over grids (i.e. evenly distributed points over a given range). First, they give a denser coverage of the region sampled, in the sense that a wider range of values of each variable is taken. For example, Figure 5.3 (a) illustrates a sample of size 16 when using a 4 by 4 grid to determine the input values of training runs, whilst Figure 5.3 (b) illustrates a 16 point LHC sample.

For the grid, only 4 unique values of each variable are used (each repeated 4 times), whereas for the 16 point Latin hypercube sample 16 unique values are sampled for each variable.

The problem is even worse in higher dimensions. For example, in 3 dimensions a 4 by 4 by 4 grid would use 64 points, though each variable would take only 4 unique values (with each value repeated 16 times across the points in the sample).

However, in Latin hypercube sampling a sample of size 64 would use 64 unique values for each variable, which should allow models to give a better fit.
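The counts quoted above can be checked directly. The sketch below works in the unit hypercube and uses interval centres for the LHC points; the helper names and these simplifications are assumptions made for illustration.

```python
import itertools
import numpy as np

def unique_per_variable(points):
    """Number of distinct values taken by each variable (column)."""
    return [len(np.unique(points[:, j])) for j in range(points.shape[1])]

levels = np.linspace(0.0, 1.0, 4)                                # 4 levels per axis
grid_2d = np.array(list(itertools.product(levels, repeat=2)))   # 4 by 4 grid, 16 points
grid_3d = np.array(list(itertools.product(levels, repeat=3)))   # 4 by 4 by 4 grid, 64 points

rng = np.random.default_rng(0)

def unit_lhc(n, d):
    """LHC sample of n points in d dimensions, placed at interval centres."""
    cells = np.column_stack([rng.permutation(n) for _ in range(d)])
    return (cells + 0.5) / n

print(unique_per_variable(grid_2d), unique_per_variable(unit_lhc(16, 2)))
# [4, 4] [16, 16]
print(unique_per_variable(grid_3d), unique_per_variable(unit_lhc(64, 3)))
# [4, 4, 4] [64, 64, 64]
```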

Figure 5.3: Plots to compare a Latin hypercube sample to a grid of points. Axes: Variable 1 (ticks from 0.96 to 1.04) and Variable 2 (ticks from 0 to 1500).
(a) Plot to illustrate points which form a 4 by 4 grid.
(b) Plot to illustrate a 16 point LHC sample.

In addition, the structure of a grid can be very limiting to the sample sizes that can be considered. For example, in 3 dimensions a 4 by 4 by 4 grid would use 64 points, whereas a 5 by 5 by 5 grid would use 125 points, with no grid size available in between. However, an LHC sample can be of any size we desire. Further, grid sizes increase exponentially with dimension, whereas this is not necessarily the case for LHC samples.
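As a quick arithmetic check, a full grid with k points per axis in d dimensions contains k^d points. The short sketch below (the function name `grid_sizes` is introduced here purely for illustration) lists the sample sizes a grid permits in 3 dimensions and shows how a grid with 4 points per axis grows with dimension.

```python
def grid_sizes(dimensions, max_points_per_axis):
    """Total points in a full grid with k points per axis, for k = 2..max."""
    return [k ** dimensions for k in range(2, max_points_per_axis + 1)]

# Sample sizes available to a grid in 3 dimensions.
print(grid_sizes(3, 6))                   # [8, 27, 64, 125, 216]

# Size of a grid with 4 points per axis as the dimension grows from 1 to 6.
print([4 ** d for d in range(1, 7)])      # [4, 16, 64, 256, 1024, 4096]
```

An LHC sample, by contrast, can be drawn with any number of points n in any dimension, so the sample size can be chosen from the simulation budget rather than from the grid structure.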

In document Bayesian Framework for Multi-Stage Transmission Expansion Planning Under Uncertainty via Emulation (Page 175-179)