
3.5 Metamodels

3.5.3 Kriging

The kriging approximation method is a more complex version of the polynomial fit method. It uses sample data points to build a model that predicts the output or performance at interpolated design points. A low-order polynomial is fitted through the data points, but the predictions are allowed to deviate from this polynomial according to a correlation model, such as a Gaussian distribution over all the existing points. The characteristics of the Gaussian distribution are based on the correlation between the sample points, i.e. on their proximity to each other. This allows the kriging parameters to change with the prediction point, giving the method more flexibility while still maintaining accuracy [41]. Assume the output or performance, P, is represented as:

$$P(x, y) = f(x, y) + Z(x, y) \tag{3.43}$$

where f(x, y) is the polynomial and Z(x, y) is the correlation model, which is based on the Gaussian distribution for all the cases in this project. The correlation of all the sample points with each other, and the correlation between the required point and all the sample points, are found as the Gaussian distribution of the distances between these points, with a ‘roughness’ parameter, θ, for each design parameter. Z is therefore a function of θ as well as of x and y, i.e. Z(θ, x, y). All the parameters and values are normalised, and then denormalised at the end, so that the mean of Z(x, y) is 0. The data are also scaled separately for each parameter to lie between 0 and 1. The relationship between P and F can then be described as:

where β can be approximated as:

$$\beta = \left(F^{T} Z^{-1} F\right)^{-1} F^{T} Z^{-1} P \tag{3.45}$$

and the variance can be estimated as:

$$\sigma^{2} = \frac{1}{m}\,(P - F\beta)^{T} Z^{-1} (P - F\beta) \tag{3.46}$$

where m is the number of initial data points. The matrix Z, β and σ² all depend on θ. θ is effectively a width parameter that determines how far the influence of a data point extends [43]. It acts like a weight placed on each input according to its influence on the output. A low value means a high correlation, so each point influences many other points. A large value means less correlation, so each point has less influence on the others. In this way, θ can also be used to identify the design parameters that have the most influence on the performance parameters.
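The estimates in (3.45) and (3.46) can be sketched numerically as follows. This is a minimal illustration rather than the project's Fortran program: the five sample points, the value of θ, the constant polynomial term (a column of ones for F) and the helper names `gaussian_corr` and `fit_beta_sigma2` are all assumptions made for the example.

```python
import numpy as np

def gaussian_corr(X, theta):
    """Gaussian correlation between all pairs of rows of X, one theta per column."""
    diff = X[:, None, :] - X[None, :, :]      # pairwise differences, shape (m, m, k)
    return np.exp(-np.sum(theta * diff**2, axis=2))

def fit_beta_sigma2(F, Z, P):
    """Generalised least-squares beta (3.45) and variance estimate (3.46)."""
    m = P.shape[0]
    # Solve linear systems instead of forming Z^{-1} explicitly.
    Zinv_F = np.linalg.solve(Z, F)
    Zinv_P = np.linalg.solve(Z, P)
    beta = np.linalg.solve(F.T @ Zinv_F, F.T @ Zinv_P)
    resid = P - F @ beta
    sigma2 = resid @ np.linalg.solve(Z, resid) / m
    return beta, sigma2

# Hypothetical normalised sample: two design parameters, five points.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
P = np.array([1.0, 2.0, 3.0, 4.0, 2.5])
F = np.ones((X.shape[0], 1))                  # constant polynomial term (assumed)
theta = np.array([2.0, 2.0])                  # assumed roughness values

Z = gaussian_corr(X, theta)
beta, sigma2 = fit_beta_sigma2(F, Z, P)
print("beta =", beta, "sigma^2 =", sigma2)
```

Solving the linear systems directly avoids inverting Z, which tends to be better conditioned when sample points lie close together.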

The optimum value of θ is defined as the maximum likelihood estimator, the maximiser of

$$-\frac{1}{2}\left(m \ln \sigma^{2} + \ln |Z|\right) \tag{3.47}$$

where |Z| is the determinant of Z [41]. This value is found iteratively [41], although for small cases with few inputs a good estimate can be made by comparing predictions with the original data. Since that is the case for this project, this simpler approach was used to obtain θ and to speed up the kriging process.
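For comparison, the likelihood-based choice of θ can be sketched as a coarse grid search. This is not the procedure used in this project, which estimated θ by comparing predictions with the original data. The 3 × 3 sample, the test function, the candidate θ values, the constant polynomial term and the small diagonal `nugget` are all assumptions made for illustration.

```python
import itertools
import numpy as np

def gaussian_corr(X, theta):
    """Gaussian correlation between all pairs of rows of X, one theta per column."""
    diff = X[:, None, :] - X[None, :, :]
    return np.exp(-np.sum(theta * diff**2, axis=2))

def log_likelihood(theta, X, P, F, nugget=1e-10):
    """Likelihood measure to be maximised over theta (larger is better)."""
    m = P.shape[0]
    # Tiny diagonal term for numerical stability; an assumption, not in the text.
    Z = gaussian_corr(X, np.asarray(theta)) + nugget * np.eye(m)
    beta = np.linalg.solve(F.T @ np.linalg.solve(Z, F), F.T @ np.linalg.solve(Z, P))
    resid = P - F @ beta
    sigma2 = resid @ np.linalg.solve(Z, resid) / m
    sign, logdet = np.linalg.slogdet(Z)       # log-determinant without overflow
    if sigma2 <= 0.0 or sign <= 0.0:
        return -np.inf
    return -0.5 * (m * np.log(sigma2) + logdet)

# Hypothetical 3 x 3 sample on the normalised unit square.
g = np.linspace(0.0, 1.0, 3)
X = np.array([[x, y] for x in g for y in g])
P = np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2
F = np.ones((X.shape[0], 1))                  # constant polynomial term (assumed)

# Evaluate every (theta_x, theta_y) pair from a small candidate list.
candidates = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0]
scores = {t: log_likelihood(t, X, P, F)
          for t in itertools.product(candidates, repeat=2)}
best_theta = max(scores, key=scores.get)
print("best (theta_x, theta_y):", best_theta)
```

A grid is only practical for a handful of design parameters; with more inputs, an iterative optimiser would normally replace the exhaustive loop.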

The Gaussian correlation function, Z(x, y), can be found as the covariance matrix:

$$\mathrm{Cov}(i, j) = \exp\left(-\sum\left[\theta_x (x_i - x_j)^2 + \theta_y (y_i - y_j)^2\right]\right) \tag{3.48}$$

or

$$\exp\left(-\sum\left[d^2\right]\right) \tag{3.49}$$

$$\mathrm{Cov}(i, r) = \exp\left(-\sum\left[\theta_x (x_i - x_r)^2 + \theta_y (y_i - y_r)^2\right]\right) \tag{3.50}$$

where i and j are the sample points and r is the required point, x and y are the design parameters, and d is the distance between the points, which in this case is squared. In general, the power to which the distance is raised can lie between 0 and 2, and it represents the differentiability of the response function with respect to the design parameters. Values close to 0 indicate that the function is not differentiable or smooth, while values closer to 2 are suited to differentiable functions [41].
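The covariance terms in (3.48) and (3.50) can be sketched in a few lines. The function name `corr`, the sample coordinates and the θ values are assumptions made for illustration. The exponent `p` is exposed as an argument so that values other than 2 can be tried.

```python
import numpy as np

def corr(A, B, theta, p=2.0):
    """Correlation between every row of A and every row of B.

    Each row is one point, each column one design parameter, and theta holds
    one roughness value per column. p = 2 gives the Gaussian case.
    """
    diff = np.abs(A[:, None, :] - B[None, :, :])   # shape (len(A), len(B), k)
    return np.exp(-np.sum(theta * diff**p, axis=2))

# Hypothetical normalised sample points and one required point.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
Xr = np.array([[0.25, 0.75]])
theta = np.array([2.0, 0.5])                       # assumed (theta_x, theta_y)

cov_ij = corr(X, X, theta)      # sample-to-sample matrix, as in (3.48)
cov_ir = corr(X, Xr, theta)     # sample-to-required-point vector, as in (3.50)
print(cov_ij.round(3))
print(cov_ir.round(3))
```

Here the larger θ on x makes the correlation decay faster along x than along y, which is the sense in which θ indicates how strongly each design parameter influences the output.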

The weight matrix, λ, that contains all the weights to obtain the required point from each point in the design space, can then be found as

$$\lambda = \mathrm{Cov}(i, j)^{-1}\,\mathrm{Cov}(i, r) \tag{3.51}$$

Then the prediction can be made as the summation of each appropriately weighted objective function value:

$$\hat{P}_r = \sum_{i=0}^{n} \lambda_i \times P_i(x, y) \tag{3.52}$$
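The weights in (3.51) and the prediction in (3.52) can be sketched as below. The sample data, the θ values and the function names are assumptions made for illustration. Subtracting the sample mean before weighting, and adding it back afterwards, stands in for the normalisation and denormalisation steps and is also an assumption.

```python
import numpy as np

def corr(A, B, theta):
    """Gaussian correlation between every row of A and every row of B."""
    diff = A[:, None, :] - B[None, :, :]
    return np.exp(-np.sum(theta * diff**2, axis=2))

def krige_predict(X, P, Xr, theta):
    """Predict at the required points Xr from samples (X, P)."""
    cov_ij = corr(X, X, theta)                 # sample-to-sample correlations
    cov_ir = corr(X, Xr, theta)                # sample-to-required correlations
    lam = np.linalg.solve(cov_ij, cov_ir)      # weights, equation (3.51)
    mean = P.mean()                            # assumed constant trend
    return mean + lam.T @ (P - mean)           # weighted sum, equation (3.52)

# Hypothetical 3 x 3 sample on the normalised unit square.
g = np.linspace(0.0, 1.0, 3)
X = np.array([[x, y] for x in g for y in g])
P = np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2
theta = np.array([5.0, 5.0])                   # assumed roughness values

at_samples = krige_predict(X, P, X, theta)     # should reproduce the data
at_new = krige_predict(X, P, np.array([[0.3, 0.6]]), theta)
print(np.max(np.abs(at_samples - P)), at_new)
```

Predicting at the sample points themselves is a useful check: the weights reduce to selecting the matching sample, so the original data should be reproduced to within rounding error.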

More details can be found in the references [41, 134]. For the polynomial fit, polynomials of up to order two are common. In some cases a constant value, such as the mean of all the data, is sufficient.
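The polynomial options mentioned above can be illustrated by building the matrix F for two design parameters. The column ordering, the function name `poly_basis` and the sample values are assumptions made for this sketch.

```python
import numpy as np

def poly_basis(x, y, order):
    """Matrix of polynomial terms in x and y, up to the given order (0, 1 or 2)."""
    if order not in (0, 1, 2):
        raise ValueError("order must be 0, 1 or 2")
    cols = [np.ones_like(x)]                   # constant term
    if order >= 1:
        cols += [x, y]                         # linear terms
    if order >= 2:
        cols += [x**2, x * y, y**2]            # quadratic terms
    return np.column_stack(cols)

# Hypothetical normalised design parameters and outputs.
x = np.array([0.0, 1.0, 0.0, 1.0, 0.5, 0.25])
y = np.array([0.0, 0.0, 1.0, 1.0, 0.5, 0.75])
P = np.array([1.0, 2.0, 3.0, 4.0, 2.5, 2.9])

F0 = poly_basis(x, y, 0)    # shape (6, 1)
F1 = poly_basis(x, y, 1)    # shape (6, 3)
F2 = poly_basis(x, y, 2)    # shape (6, 6)

# With the constant basis, an ordinary least-squares fit returns the data mean.
beta0 = np.linalg.lstsq(F0, P, rcond=None)[0]
print(beta0, P.mean())
```

The constant basis therefore corresponds to the case where the mean of all the data is used as the polynomial part.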

Figure 3.18 compares the data in Figure 3.17 as predicted by the polynomial, ANN and kriging metamodels. The kriging method tends to smooth the surface more than the ANN, although this can be addressed by adjusting the parameters. Nevertheless, both the ANN and the kriging method seem to perform consistently well across all the cases with little change in parameters, the ANN slightly more so than the kriging method. The advantage of kriging is that no training time is required. Both methods are capable of making reasonably accurate predictions. An example of the Fortran kriging program is included in Appendix B.4.1.

Figure 3.18: Comparison of ANN prediction accuracy with a polynomial of order 10 and kriging for $\Delta C_m$. 3 hidden layers (hl) and 15 neurons (n) were used for the ANN. The number of inputs is kept constant at 2. The solver training and prediction comparison is included. On the right, the blue surface is the ANN prediction and the red is the kriging prediction.