CHAPTER 2. GENERALIZED PARAMETERIZATION METHOD
2.7 Case Study: Three GP Methods and Genetic Algorithm
2.7.1 Introduction
This section uses four base methods to formulate three GP methods. Voronoi tessellation (VT) is chosen as the zonation method and is combined with the natural neighbor (NN) interpolation, inverse distance (ID) interpolation, and ordinary kriging (OK) methods. The three resulting GP methods, NN-VT, ID-VT, and OK-VT, are used to demonstrate the flexibility of the GP method. A genetic algorithm is used to find the optimal parameters for these three GP methods.
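To make the pairing of a zonation method with an interpolation method concrete, the following Python sketch shows one way an ID-VT estimate could be formed. It assumes the GP estimate has the form pi_k + sum over j != k of phi_j (pi_j - pi_k) beta_j, where k is the sample site whose Voronoi zone contains the point, phi_j are normalized inverse-distance weights, and beta_j in [0, 1] are the data weighting coefficients. That form, the power of 2, and all sample values are assumptions made for illustration; the defining equations are given earlier in the chapter and are not reproduced here.

```python
import numpy as np

def id_weights(x0, xs, power=2.0):
    """Normalized inverse-distance weights of the sample sites xs at point x0."""
    d = np.linalg.norm(xs - x0, axis=1)
    if np.any(d == 0.0):                      # x0 coincides with a sample site
        hit = (d == 0.0).astype(float)
        return hit / hit.sum()
    w = d ** -power
    return w / w.sum()

def gp_estimate(x0, xs, pi_s, beta, power=2.0):
    """Assumed GP form: pi_k + sum_{j != k} phi_j * (pi_j - pi_k) * beta_j."""
    k = np.argmin(np.linalg.norm(xs - x0, axis=1))    # Voronoi zone = nearest site
    phi = id_weights(x0, xs, power)
    return pi_s[k] + np.sum(phi * (pi_s - pi_s[k]) * beta)   # j = k term is zero

# Hypothetical ln(K) samples on a 3200 m square (not data from the case study)
xs = np.array([[400.0, 400.0], [2800.0, 600.0], [900.0, 2700.0], [2500.0, 2400.0]])
pi_s = np.array([0.8, 1.9, 1.2, 2.2])
x0 = np.array([1000.0, 1000.0])

zonation = gp_estimate(x0, xs, pi_s, np.zeros(4))      # beta = 0: zone value
interp = gp_estimate(x0, xs, pi_s, np.ones(4))         # beta = 1: ID interpolation
blend = gp_estimate(x0, xs, pi_s, np.full(4, 0.5))     # intermediate estimate
```

Under these assumptions, setting every beta to 0 returns the Voronoi zone value, setting every beta to 1 returns the inverse-distance interpolation, and intermediate values fall between the two.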
2.7.2 Numerical Example
Figure 2.4 shows a natural log-hydraulic conductivity distribution in a two-dimensional confined aquifer. The characteristics of the aquifer and modeling information are listed in Table 2.1. MODFLOW-2000 (Harbaugh et al., 2000) is used for the groundwater flow simulation. The aquifer is excited by six pumping wells with constant pumping rates. The pumping sites and rates are shown in Figure 2.4. Groundwater heads are collected at eleven observation locations (circles in Figure 2.4) at the end of each day of the 30-day stress period. Consequently, 11 × 30 = 330 head observations are obtained and then corrupted by Gaussian noise with zero mean and a constant standard deviation $\sigma_h = 0.01$ m to represent the observation error. Moreover, 60 hydraulic conductivity values are sampled at the locations shown in Figure 2.5(a). A Voronoi zone structure of 60 zones is created from the 60 sample locations and is used in the GP method. The hydraulic conductivity is assumed to be log-normally distributed. An exponential semivariogram model (Figure 2.5(b)), $\gamma(d) = 2.4\,\exp(d/4679.4)$, where $d$ is the distance lag, is obtained according to these 60 sampled values.
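The observation count and the noise model described above can be sketched in a few lines of Python. The head values below are a constant placeholder rather than MODFLOW output, and the random seed is arbitrary; only the array shape (30 days by 11 sites) and the standard deviation of 0.01 m come from the text.

```python
import numpy as np

n_sites, n_days = 11, 30          # observation locations and daily records
sigma_h = 0.01                    # standard deviation of head observation error, m

rng = np.random.default_rng(seed=0)                  # arbitrary seed
h_true = np.full((n_days, n_sites), 40.0)            # placeholder heads, not model output
h_obs = h_true + rng.normal(0.0, sigma_h, size=h_true.shape)

n_obs = h_obs.size                # total number of head observations
```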
Figure 2.4 Synthetic aquifer: The true ln(K) distribution. Circles: head observation sites. Crosses: pumping wells.
Figure 2.5 Synthetic aquifer: (a) The K sampled locations and Voronoi tessellation. (b) The experimental semivariogram, $\gamma(d) = 2.4\,\exp(d/4679.4)$.

Table 2.1: Characteristics of the synthetic aquifer and modeling information
Aquifer                                  Confined
Dimensions                               3200 meters by 3200 meters
Boundaries                               AD: constant head (h = 40 meters)
                                         AB, BC, and CD: impervious
Hydraulic conductivity (K)               2 to 10 m/day
Specific storage                         $10^{-4}$ meter$^{-1}$
Number of pumping wells                  6 (crosses in Figure 2.4)
Number of head observation boreholes     11 (circles in Figure 2.4)
Number of K measurements                 60 (pluses in Figure 2.5(a))
Discretization                           64 rows by 64 columns
Stress period                            30 days
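As an illustration of how the Voronoi zone structure relates to the model grid in Table 2.1, the sketch below assigns each of the 64 by 64 cells to its nearest sample site, which is the defining property of a Voronoi zone. The 60 site coordinates are drawn at random here and are hypothetical; the actual sample locations are those of Figure 2.5(a).

```python
import numpy as np

n_rows, n_cols, length = 64, 64, 3200.0      # grid and domain size from Table 2.1
dx = length / n_cols                          # cell width in m
dy = length / n_rows                          # cell height in m

# Cell-center coordinates, flattened row by row
xc = (np.arange(n_cols) + 0.5) * dx
yc = (np.arange(n_rows) + 0.5) * dy
X, Y = np.meshgrid(xc, yc)
cells = np.column_stack([X.ravel(), Y.ravel()])          # shape (4096, 2)

rng = np.random.default_rng(1)
sites = rng.uniform(0.0, length, size=(60, 2))           # hypothetical sample sites

# Squared distance from every cell to every site; the nearest site defines the zone
d2 = ((cells[:, None, :] - sites[None, :, :]) ** 2).sum(axis=2)
zone = d2.argmin(axis=1).reshape(n_rows, n_cols)         # zone index per grid cell
```

With this grid, each cell is 50 m by 50 m, and `zone[i, j]` gives the index of the K measurement that a pure zonation method would assign to that cell.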
2.7.3 Sensitivity Analysis
This section studies only the GP sensitivity, $\partial\pi/\partial\beta$. The Jacobian matrix and the gradient will be shown in Chapter 4 with a real case study. Based on Eq. (2.18), the GP sensitivity is calculated, and the resulting values are plotted in Figure 2.6. The hydraulic conductivity estimates obtained using the four base methods can be found in Figure 2.2. Generally speaking, high GP sensitivity occurs where there is a sharp change in the hydraulic conductivity distribution. For the NN-VT GP method, a low-high-low-high-low pattern can be observed from the top-left corner to the bottom-right corner. A similar pattern can be found in the OK-VT GP sensitivity. However, such a pattern is not obvious in the ID-VT GP sensitivity. Figure 2.2 shows that the inverse distance interpolation looks like a series of alternating highs and lows, with no flat areas. This explains why sensitivities are higher everywhere in the ID-VT GP sensitivity distribution.
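The link between sharp changes and high sensitivity can be illustrated with a small Python sketch. It assumes the same GP form as a weighted blend, pi_k + sum over j != k of phi_j (pi_j - pi_k) beta_j, with inverse-distance weights; under that assumption the derivative with respect to beta_j is phi_j (pi_j - pi_k), so it scales with the contrast between a sample and the zone value. This is not a reproduction of Eq. (2.18), and the sample values are hypothetical.

```python
import numpy as np

def id_vt_parts(x0, xs, power=2.0):
    """Voronoi zone index k and normalized inverse-distance weights at x0.
    Assumes x0 does not coincide with a sample site."""
    d = np.linalg.norm(xs - x0, axis=1)
    phi = d ** -power
    return int(np.argmin(d)), phi / phi.sum()

def gp_estimate(x0, xs, pi_s, beta):
    k, phi = id_vt_parts(x0, xs)
    return pi_s[k] + np.sum(phi * (pi_s - pi_s[k]) * beta)

def gp_sensitivity(x0, xs, pi_s):
    """d(pi_GP)/d(beta_j) = phi_j * (pi_j - pi_k) under the assumed GP form."""
    k, phi = id_vt_parts(x0, xs)
    return phi * (pi_s - pi_s[k])

# Hypothetical ln(K) samples (not data from the case study)
xs = np.array([[400.0, 400.0], [2800.0, 600.0], [900.0, 2700.0], [2500.0, 2400.0]])
pi_s = np.array([0.8, 1.9, 1.2, 2.2])
x0 = np.array([1000.0, 1000.0])

sens = gp_sensitivity(x0, xs, pi_s)

# Forward finite-difference check of the analytical derivative
beta0, step = np.full(4, 0.3), 1e-6
base = gp_estimate(x0, xs, pi_s, beta0)
fd = np.array([(gp_estimate(x0, xs, pi_s, beta0 + step * np.eye(4)[j]) - base) / step
               for j in range(4)])
```

In this sketch the sensitivity to the zone's own sample is exactly zero, and the finite-difference values agree with the analytical ones because the assumed estimate is linear in beta.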
2.7.4 Global Optimization
The inverse problem posed in section 2.3.1, which involves groundwater modeling, is nonlinear and leads to a complicated optimization problem. Gradient-based algorithms are anticipated to be inefficient in solving Eq. (2.8) because a large number of gradient evaluations through forward modeling of groundwater flow are necessary. In addition, many local optima may exist. In our experience, a genetic algorithm (GA) is suitable for solving this nonlinear problem because GA is a derivative-free heuristic method and is able to search for a global optimum with multiple search points (Salomon, 1998). GA searches for a global optimal solution using a set of processes and operators analogous to bio-evolution processes (e.g., selection, crossover, mutation, reproduction, and replacement) to improve the fitness of chromosomes generation by generation (Goldberg, 1989).
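The operators named above can be sketched as a minimal real-coded GA in Python. This is not the Matlab solver used in this study. The tournament selection, the single-elite replacement, and the reading of the crossover and mutation probabilities as per-gene probabilities are assumptions, and the quadratic fitness function is a hypothetical stand-in for the negative head misfit that a MODFLOW run would supply. The population size, generation count, and the two probabilities follow the values reported in this section.

```python
import numpy as np

def run_ga(fitness, n_genes, bounds, pop_size=50, n_gen=100,
           p_cross=0.5, p_mut=0.08, seed=0):
    """Maximize `fitness` over real-valued chromosomes within `bounds`."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    pop = rng.uniform(lo, hi, size=(pop_size, n_genes))
    fit = np.array([fitness(c) for c in pop])
    for _ in range(n_gen):
        # Selection: binary tournament, the fitter of two random chromosomes wins
        a = rng.integers(pop_size, size=pop_size)
        b = rng.integers(pop_size, size=pop_size)
        parents = pop[np.where(fit[a] >= fit[b], a, b)]
        # Uniform crossover: each gene is taken from a mate with probability p_cross
        mates = parents[rng.permutation(pop_size)]
        swap = rng.random(parents.shape) < p_cross
        children = np.where(swap, mates, parents)
        # Mutation: each gene is redrawn uniformly with probability p_mut
        mutate = rng.random(children.shape) < p_mut
        children = np.where(mutate, rng.uniform(lo, hi, size=children.shape), children)
        # Replacement with elitism: the best parent overwrites the worst child
        child_fit = np.array([fitness(c) for c in children])
        best, worst = np.argmax(fit), np.argmin(child_fit)
        children[worst], child_fit[worst] = pop[best], fit[best]
        pop, fit = children, child_fit
    best = np.argmax(fit)
    return pop[best], fit[best]

# Hypothetical target vector standing in for the unknown optimal parameters
target = np.array([0.2, 0.9, 0.5, 0.7, 0.1])
best_x, best_f = run_ga(lambda c: -np.sum((c - target) ** 2), 5, (0.0, 1.0))
```

Because the best chromosome always survives, the best fitness in this sketch never decreases from one generation to the next.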
A GA solver in Matlab (Houck et al., 1996) is used to search for the near-global optimal solution, and a gradient-based solver provided in Matlab (Han, 1977; Coleman and Li, 1996) is used to search for the optimal $\beta$ value. MODFLOW is linked with GA as a simulation-optimization model to obtain the simulated head observations and evaluate the fitness. For each GA run, a population of 50 chromosomes is evolved for 100 generations. The probability of uniform crossover is 0.5, and the probability of mutation is 0.08. A convergent GA fitness value is found after five GA runs (a total of 15,000 evaluations). The fitting residuals and estimation uncertainty are shown in Table 2.2, where the fitting residual is the least-squares error $[\mathbf{h}(\boldsymbol{\beta}) - \mathbf{h}^{obs}]^T [\mathbf{h}(\boldsymbol{\beta}) - \mathbf{h}^{obs}]$ and the uncertainty is calculated as $\mathrm{trace}(\mathbf{Cov}_{GP})$.
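The two summary measures reported in Table 2.2 can be computed as in the following Python sketch. It assumes that the fitting residual is the sum of squared differences between simulated and observed heads and that the uncertainty is the trace of the estimation covariance matrix. The function names and all numbers are hypothetical and are not taken from the case study.

```python
import numpy as np

def fitting_residual(h_sim, h_obs):
    """Sum of squared head residuals, (h_sim - h_obs)^T (h_sim - h_obs)."""
    r = np.asarray(h_sim, dtype=float) - np.asarray(h_obs, dtype=float)
    return float(r @ r)

def total_uncertainty(cov):
    """Trace of the covariance matrix, i.e. the sum of estimation variances."""
    return float(np.trace(np.asarray(cov, dtype=float)))

# Hypothetical values for illustration only
h_obs = np.array([39.2, 38.7, 37.9])
h_sim = np.array([39.1, 38.9, 37.9])
cov = np.array([[0.04, 0.01],
                [0.01, 0.09]])

residual = fitting_residual(h_sim, h_obs)
uncertainty = total_uncertainty(cov)
```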
In summary, the indicator GP is able to honor the sampled data, minimize the misfit to the observations, and identify a reasonable conditional estimation with small conditional variances.
Table 2.2 Fitting residuals and estimation uncertainty.
                                                 GP methods (β = β_opt)
                     NN      ID      OK      VT      NN-VT   ID-VT   OK-VT
Fitting residual     2.46    44.70   1.69    2.93    0.42    12.31   0.19
Uncertainty          434     614     425     701     520     552     518
2.7.5 Identification Results
The results are based on the RMSE formulation given in section 2.3.1. The other formulations will be tested in the following chapters. The identified ln(K) distributions and estimation variances using NN-VT, ID-VT, and OK-VT are plotted in Figure 2.7.
Figure 2.6 The GP sensitivity plotted at the locations of the GP parameters for (a) OK-VT, (b) NN-VT, and (c) ID-VT; the circle size represents the value of the GP sensitivity.
Figure 2.7 (a) NN-VT estimation, (b) NN-VT estimation variance, (c) ID-VT estimation, (d) ID-VT estimation variance, (e) OK-VT estimation, (f) OK-VT estimation variance.
2.8 Summary
This chapter explored many aspects of the generalized parameterization method. Three interpolation methods (natural neighbor interpolation, inverse distance interpolation, and ordinary kriging) and one zonation method (Voronoi tessellation) were introduced. To formulate the inverse problem for the identification of GP parameters, the root mean square error, maximum log-likelihood, and maximum conditional log-likelihood approaches were proposed. Three optimization algorithms, the genetic algorithm, the hill climbing algorithm, and the BFGS algorithm, were presented for the optimization problem. For the sensitivity analysis and gradient calculation, methods were derived to calculate the GP sensitivity, the Jacobian matrix, and the gradient of the objective function with respect to the GP parameters. The problem of insensitive GP parameters was also discussed. The Fisher information was derived as a means of obtaining the GP parameter covariance, and issues concerning positive definiteness were discussed.
A numerical example was presented to demonstrate the proposed GP methodologies. The results showed that the generalized parameterization method can provide a better representation of the conditional estimation of hydraulic conductivity in a random field. The Voronoi zone structure was combined with a natural neighbor interpolation method, an inverse distance interpolation method, and an ordinary kriging method to formulate three GP methods that capture the non-smoothness of heterogeneity. The complexity of the inverse problem of identifying the optimal GP parameters was greatly reduced by using a genetic algorithm (GA). We conclude that these GP methods are able to find an optimal conditional estimate of hydraulic conductivity that lies between those of the zonation and interpolation methods and yields the minimal misfit to the groundwater head observations.