Subproblem C - Efficient random set uncertainty quantification by means of advanced sampling te

For this subproblem, we were asked to find the range of the metrics $J_1$ and $J_2$ with the improved uncertainty models. The metric $J_1$ is the expected value of the worst-case requirement metric $w$, while the metric $J_2$ represents the failure probability of the system. To solve this subproblem, the two strategies introduced in Section 8.2 have been used.

Figure 8.5: Normalised histogram of $p(\theta \mid D_e)$ obtained using the Approximate Bayesian Computation method with (a) 25 and (b) 50 experimental observations of $x_1$. The normalisation assigns a value of 1 to the bin with the highest number of counts. The red line represents the cut-off value used to determine the updated range.

8.4.1 Optimisation in the epistemic space (Double Loop approach)

A global optimisation is performed in the epistemic space $\Theta \equiv \times_{i=1}^{31} I_i$ in order to find those points in $\Theta$ that produce the upper and lower bounds on $J_1$ and $J_2$. For any candidate solution $\theta_i \in \Theta$ provided by the optimisation algorithm, a set of $n = 1000$ random points $\{\alpha_j,\ j = 1, 2, \ldots, n\}$ is drawn from the aleatory space $\Omega \equiv (0, 1]^{17}$ to estimate the metrics $J_1$ and $J_2$. The number of samples from the aleatory space was selected after performing a convergence test. More specifically, in this test both $J_1$ and $J_2$ are estimated with increasing values of $n$ (i.e. 100, 500, 1000, 5000 and 10000) for 5 representative realisations of the epistemic space, as shown in Figure 8.7. From the figure, it can be seen that $n = 1000$ points are sufficient for estimating $J_1$ and $J_2$, with a C.o.V. of 0.1 and 0.05 respectively. The confidence of these estimates can be improved by using a larger sample size, at the expense of a much higher computational cost. The

search for the lower and upper bounds is performed by means of Monte Carlo optimisation using Latin Hypercube sampling, with approximately 50000 samples. A total of $5 \times 10^7$ evaluations of the function x_to_g (model $f$) are thus required to complete the analysis. Here, Monte Carlo is a convenient method for solving the optimisation problem, as the objective functions $J_1$ and $J_2$ can be quite noisy, varying by approximately $\pm 10\%$ around the true value. In order to take into account the estimation error introduced by using finite sample sets, the minimum and maximum of the objective functions $J_i$, $i = 1, 2$, are redefined as the lower estimate $J_i (1 - t_{\alpha/2}\,\mathrm{C.o.V.})$ and the upper estimate $J_i (1 + t_{\alpha/2}\,\mathrm{C.o.V.})$, respectively, where $\alpha = 0.14$ and $t_{\alpha/2} = 1.48$ is the Student's t percentile that leaves a probability of $\alpha/2 = 0.07$ in each tail, i.e. a two-sided 86% confidence level (see also [59]).

Figure 8.7: Effect of the number of samples generated in the aleatory space on the inner-loop estimation of $J_1$ and $J_2$.
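The double-loop procedure described in this subsection can be illustrated with a minimal sketch. It is not the implementation used for the analysis: `toy_w` is a hypothetical two-parameter stand-in for the model (the real problem has 31 epistemic and 17 aleatory dimensions), failure is assumed to mean $w \geq 0$, and the function names are chosen for illustration only. The outer loop draws Latin Hypercube samples in the epistemic box, the inner loop estimates $J_1$ and $J_2$ by Monte Carlo, and the bounds are widened by $t_{\alpha/2}$ times the C.o.V. as described above.

```python
import math
import random
import statistics

T_HALF_ALPHA = 1.48  # t_{alpha/2} quoted in the text for alpha = 0.14


def toy_w(theta, alpha):
    """Hypothetical stand-in for the model; NOT the x_to_g function of the text."""
    return theta[0] * alpha[0] + theta[1] * (alpha[1] - 0.5)


def latin_hypercube(n_points, bounds, rng):
    """Draw n_points Latin Hypercube samples from the box given by `bounds`."""
    columns = []
    for low, high in bounds:
        # One uniform draw inside each of the n_points equal-width strata,
        # shuffled so that strata are paired at random across dimensions.
        strata = [(k + rng.random()) / n_points for k in range(n_points)]
        rng.shuffle(strata)
        columns.append([low + u * (high - low) for u in strata])
    return [list(point) for point in zip(*columns)]


def mean_and_cov(samples):
    """Return the sample mean and the C.o.V. of that mean (std. error / |mean|)."""
    mean = statistics.fmean(samples)
    std_error = statistics.stdev(samples) / math.sqrt(len(samples))
    cov = std_error / abs(mean) if mean != 0.0 else 0.0
    return mean, cov


def double_loop(bounds, n_outer, n_inner, seed=0):
    """Outer loop over epistemic points, inner Monte Carlo loop over aleatory points."""
    rng = random.Random(seed)
    lower = {"J1": math.inf, "J2": math.inf}
    upper = {"J1": -math.inf, "J2": -math.inf}
    for theta in latin_hypercube(n_outer, bounds, rng):
        # Aleatory points in (0, 1]^2 (the text uses (0, 1]^17).
        w = [toy_w(theta, (1.0 - rng.random(), 1.0 - rng.random()))
             for _ in range(n_inner)]
        estimates = {
            "J1": mean_and_cov(w),  # expected value of w
            # Failure probability, ASSUMING failure means w >= 0.
            "J2": mean_and_cov([1.0 if value >= 0.0 else 0.0 for value in w]),
        }
        for name, (value, cov) in estimates.items():
            # Conservative bounds J(1 -/+ t C.o.V.); this form assumes J > 0.
            lower[name] = min(lower[name], value * (1.0 - T_HALF_ALPHA * cov))
            upper[name] = max(upper[name], value * (1.0 + T_HALF_ALPHA * cov))
    return {name: (lower[name], upper[name]) for name in lower}


if __name__ == "__main__":
    print(double_loop(bounds=[(0.5, 2.0), (1.0, 3.0)], n_outer=200, n_inner=1000))
```

The cost of the sketch is `n_outer * n_inner` model evaluations, which mirrors the $50000 \times 1000 = 5 \times 10^7$ evaluations quoted in the text. Because each candidate is evaluated independently, the outer loop is the natural place to parallelise.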

Note that, in order to run the analysis within a reasonable time, parallelisation lies at the foundation of this approach. On a common dual-core personal computer, a single estimation of $J_i$ takes approximately 3.4 minutes, so a complete analysis (about 50000 estimations) would take approximately 120 days. By means of a double parallelisation, it has been possible to reduce the running time by more than one order of magnitude and to complete the analysis in just $\sim 80$ hours. A double parallelisation strategy, unlike a standard parallelisation, makes use of both local processors and cluster units to process the jobs. In other words, the jobs are first sent in parallel to the cluster units and subsequently distributed to every processing unit on each cluster machine. In this way, approximately one hundred percent of the computing power available on the cluster machines is used.

8.4.2 Propagation of focal sets (Random Set approach)

Using the propagation of focal sets method, $n = 1000$ random vectors $\{\alpha_j,\ j = 1, 2, \ldots, n\}$ are drawn from the aleatory space $\Omega \equiv (0, 1]^{17}$. In order to evaluate Equations (2.8) and (2.9), genetic algorithms with a population of 125 individuals and 50 generations are adopted, requiring a total computational cost of $5 \times 10^6$ evaluations of $w$. Figure 8.8 shows the convergence of the genetic algorithms for two representative focal elements. Convergence is achieved within 30 generations for the identification of the minimum/maximum in Eq. (2.9).

For this approach, parallelisation is also essential. In fact, approximately $5 \times 10^6$ evaluations of the function x_to_g are required to complete a full analysis. Although in this case the use of GA makes the parallelisation somewhat more involved (jobs need to be sent at every iteration of the algorithm), it is still possible to reduce the running time by approximately one or two orders of magnitude (as in the standard approach).
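A minimal sketch of the focal-set propagation described in this subsection is given here, under several assumptions that should be read as such. Equations (2.8) and (2.9) are not reproduced in this section, so the aggregation used in the code (mean of the per-element minima and maxima for $J_1$; fraction of focal elements whose minimum or maximum of $w$ is non-negative for $J_2$) is an assumed reading of them. The p-box, the function `toy_w` and all names are hypothetical, and plain random search is swapped in for the genetic algorithm to keep the example short.

```python
import random
import statistics


def toy_w(x):
    """Hypothetical requirement metric; NOT the model used in the text."""
    return x[0] * x[0] - x[1]


def focal_element(alpha, pbox_shifts):
    """Map one aleatory point to a box (one interval per input variable).

    Toy distribution-free p-box: each variable is bounded by two uniform
    CDFs of unit width starting at `low` and `high`, so the alpha-cut of
    the p-box is the interval [low + alpha, high + alpha].
    """
    return [(low + a, high + a) for a, (low, high) in zip(alpha, pbox_shifts)]


def box_extrema(box, budget, rng):
    """Approximate min and max of toy_w over `box` by plain random search
    (a stand-in for the genetic algorithm used in the text)."""
    values = [toy_w([rng.uniform(low, high) for low, high in box])
              for _ in range(budget)]
    return min(values), max(values)


def propagate_focal_sets(pbox_shifts, n, budget, seed=0):
    """Propagate n focal elements and aggregate them into bounds on J1 and J2."""
    rng = random.Random(seed)
    minima, maxima = [], []
    for _ in range(n):
        alpha = [1.0 - rng.random() for _ in pbox_shifts]  # point in (0, 1]^d
        w_min, w_max = box_extrema(focal_element(alpha, pbox_shifts), budget, rng)
        minima.append(w_min)
        maxima.append(w_max)
    return {
        # Bounds on the expected value of w.
        "J1": (statistics.fmean(minima), statistics.fmean(maxima)),
        # Bounds on the failure probability, ASSUMING failure means w >= 0.
        "J2": (sum(v >= 0.0 for v in minima) / n,
               sum(v >= 0.0 for v in maxima) / n),
    }


if __name__ == "__main__":
    print(propagate_focal_sets([(-1.0, -0.5), (0.0, 0.5)], n=1000, budget=200))
```

Each focal element needs its own pair of optimisations, so the cost is `2 * n` optimisation runs regardless of how the inner optimiser is chosen. The objective inside each box is deterministic, which is the property the text highlights for this approach.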

Results. The results for the reduced uncertainty model and the improved model are summarised in Table 8.3.

Using the proposed methods, it has been possible to bound the actual solution for the targeted metrics. As expected, the improved uncertainty model is far more informative than the reduced model, as shown by a considerable reduction in the upper bound of $J_1$. An even more significant difference is found for the range of $J_2$ (see Table 8.3), where the model of uncertainty goes from being totally uninformative, $J_2 \in [0, 1]$, to $J_2 \in [0.20, 0.41]$. In Table 8.3, the bounds of $J_1$ and $J_2$ are obtained by means of the two proposed approaches, i.e. optimisation in the epistemic space and propagation of focal sets. Note also that the optimisation in the epistemic space (standard approach) provided tighter bounds than the propagation of focal sets (counter approach). This result was expected inasmuch as the random set methodology cannot cope with distributional probability boxes and has to treat them as distribution-free p-boxes.

In terms of the number of optimisation tasks, the optimisation approach in the epistemic space is less demanding than the propagation of focal sets, since only four optimisation tasks are required to find the lower and upper bounds of $J_1$ and $J_2$, while the counter (Random Set) approach requires a pair of optimisation tasks for each focal element and for each quantity of interest (i.e. $J_1$ and $J_2$). On the other hand, the Random Set approach performs an optimisation on a deterministic objective function, which increases the chances of finding the global optima.

Table 8.3: Bounds of the statistics $J_1$ and $J_2$ for the reduced and improved uncertainty models.

Approach      Metric   Reduced uncertainty model           Improved uncertainty model
Double Loop   $J_1$    $[1.37 \times 10^{-2},\ 4.97]$      $[2.88 \times 10^{-2},\ 1.11]$
Double Loop   $J_2$    $[6.4 \times 10^{-2},\ 0.82]$       $[0.24,\ 0.38]$
Focal Sets    $J_1$    $[-1.57 \times 10^{-4},\ 54.05]$    $[-1.10 \times 10^{-4},\ 3.05]$
Focal Sets    $J_2$    $[0,\ 1]$                           $[0.20,\ 0.41]$

Both approaches are based on global optimisation strategies and hence both suffer from the curse of dimensionality. As the number of dimensions grows, the proposed approaches require an increasingly large sample size (number of individuals and generations) in order to explore the optimisation domain properly. Consequently, there is no guarantee that the calculated optima are actually the global optima. In forward uncertainty propagation, missing the global optima may mean computing ranges of the targeted variables that are narrower than the sought ones. In this case, the methods produce an inner (under-)estimation of the actual solution, which may lead to an under-prediction of the targeted metric.
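The inner-estimation effect can be made concrete with a small, purely illustrative experiment that is not part of the original analysis. The test function `sphere` (a hypothetical choice with a known range of $[0, d]$ on the box $[-1, 1]^d$) is searched with a fixed budget of random samples. The range found always lies inside the true range, and for a fixed budget the recovered fraction of the true range shrinks as the dimension $d$ grows.

```python
import random


def sphere(x):
    """Test function with known range [0, dim] on the box [-1, 1]^dim."""
    return sum(v * v for v in x)


def sampled_range(dim, n_samples, seed=0):
    """Range of `sphere` found by random search with a fixed budget."""
    rng = random.Random(seed)
    values = [sphere([rng.uniform(-1.0, 1.0) for _ in range(dim)])
              for _ in range(n_samples)]
    return min(values), max(values)


def coverage(dim, n_samples, seed=0):
    """Fraction of the true range width recovered by the search."""
    low, high = sampled_range(dim, n_samples, seed)
    return (high - low) / dim


if __name__ == "__main__":
    for dim in (2, 10, 30):
        print(dim, sampled_range(dim, 1000), round(coverage(dim, 1000), 2))
```

A population-based optimiser such as a genetic algorithm explores the domain far more efficiently than random search, but the qualitative point is the same: any optimum it returns is attained by the function, so the resulting interval can only be an inner approximation of the true one.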

Figure 8.8: Convergence of the objective function, $w$, to the minimum and maximum for a representative focal element. Genetic Algorithms have been used with a population of 1000 individuals, converging after 53 iterations.

In document Efficient random set uncertainty quantification by means of advanced sampling techniques (Page 125-129)