Environmental Models
BOX 2.3 AN EXAMPLE OF LILLIEFORS TEST FOR A DATA VECTOR OF 10,000 NUMBERS COMING FROM A NORMAL OR UNIFORM DISTRIBUTION
2.4.3.5 Parameter Validation for the Monod Kinetics
TABLE 2.1
Estimation of the Monod Kinetics from Noisy Data (σ = 0.05) with Differing Simplex Starting Points

              Ks (mg COD L−1)   µmax (h−1)   Y       bh (h−1)   Starting Point             E(P)
True values   20.000            0.500        0.500   0.030
Case 1        19.0997           0.483        0.525   0.0303     [22.0 0.5 0.37 0.04]^T     0.265
Case 2        31.605            0.711        0.485   0.0289     [33.3 0.8 0.57 0.08]^T     0.548

Source: Marsili-Libelli, S. et al., Ecol. Model., 165, 127–146, 2003.

[Figure 2.40: two contour panels, Case 1 and Case 2, with axes Ks and µmax]
FIGURE 2.40 Contour portrait of E(P) and the importance of a good search initialization. The dot indicates the location of the minimum P reached by the optimization algorithm for the two simplex initialization points of Table 2.1. Only in case 1 is the true minimum reached. (Redrawn with permission from Marsili-Libelli, S. et al., Ecol. Model., 165, 127–146, 2003.)

To demonstrate the above validation method, based on the coincidence or divergence of the confidence ellipsoids defined by Equation 2.83, the parameters of the Monod kinetics were again estimated using noisy data, starting the search from two differing initial guesses, as described in detail in Marsili-Libelli et al. (2003). The initial and final parameter values are listed in Table 2.1, and the final estimates in the (µmax, Ks) plane are shown in Figure 2.40 for both cases: case 1 (successful estimation) and case 2 (faulty estimation). In real life, where the ‘real’ parameter values are unknown, there would be no way of knowing whether the results are reliable. For this reason, the previous validity test is used, and the confidence ellipses computed with the two methods are compared. Figure 2.41 shows the coincidence of the ellipses, denoting that the estimated parameters are reliable, whereas Figure 2.42 shows a considerable divergence, which should be interpreted as a warning that the final estimates are unreliable and deserve further investigation. For the ‘good’ estimates of case 1, the confidence intervals of the individual parameters are then computed by Equation 2.89 and listed in Table 2.2.
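The idea behind this check can be illustrated with a deliberately simplified Python sketch. It is not the dynamic Monod model of Marsili-Libelli et al. (2003), nor the companion MATLAB code. It assumes a static Monod rate law µ(S) = µmax S/(Ks + S), noise-free data generated with the true values of Table 2.1, hypothetical substrate levels `S_data`, and a sum-of-squares error functional. Under those assumptions, the parameter standard deviations derived from the exact Hessian and from the FIM-like product JᵀJ (both up to a common noise factor, omitted here) coincide at the true parameters and drift apart at the case 2 values, which in this toy problem are not a minimum of the functional.

```python
def monod(S, mu_max, Ks):
    """Static Monod rate law; an illustrative stand-in for the dynamic model."""
    return mu_max * S / (Ks + S)

def half_hessians(S_data, y_data, mu_max, Ks):
    """Return (JtJ, H/2) for E(P) = sum (y - f)^2, as 2x2 nested lists."""
    jtj = [[0.0, 0.0], [0.0, 0.0]]
    hh = [[0.0, 0.0], [0.0, 0.0]]
    for S, y in zip(S_data, y_data):
        d = Ks + S
        g = [S / d, -mu_max * S / d**2]              # gradient of f w.r.t. (mu_max, Ks)
        f2 = [[0.0, -S / d**2],                      # second derivatives of f
              [-S / d**2, 2.0 * mu_max * S / d**3]]
        r = y - monod(S, mu_max, Ks)                 # residual
        for i in range(2):
            for j in range(2):
                jtj[i][j] += g[i] * g[j]
                hh[i][j] += g[i] * g[j] - r * f2[i][j]
    return jtj, hh

def std_from(m):
    """Square roots of the diagonal of the inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(m[1][1] / det) ** 0.5, (m[0][0] / det) ** 0.5]

def ellipse_mismatch(S_data, y_data, mu_max, Ks):
    """Largest relative difference between FIM-based and Hessian-based std values."""
    jtj, hh = half_hessians(S_data, y_data, mu_max, Ks)
    s_f, s_h = std_from(jtj), std_from(hh)
    return max(abs(a - b) / a for a, b in zip(s_f, s_h))

S_data = [5.0, 10.0, 20.0, 40.0, 80.0]              # hypothetical substrate levels
y_data = [monod(S, 0.5, 20.0) for S in S_data]      # noise-free data, true values

print(ellipse_mismatch(S_data, y_data, 0.5, 20.0))      # true parameters
print(ellipse_mismatch(S_data, y_data, 0.711, 31.605))  # case 2 estimates
```

A mismatch close to zero corresponds to coinciding ellipses. A large mismatch plays the role of the warning discussed above.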
2.4.4 A MATLAB Exercise: Parameter Estimation of the S&P Model
To wrap up the previous notions, let us again consider the simple S&P model consisting of two river reaches, each with an upstream point source of biodegradable pollutant. The model (2.32) will be used again, but now the exercise is aimed at selecting differing sampling points along the DO trajectories and seeing how the sampling alternatives influence the estimation of the model parameters (Kb, Kc). But first, let us consider how the selection of the variables included in the error functional changes the nature of the estimation: if both model variables (BOD and DO) are used, then Figure 2.43a shows that the parameter correlation, which gives rise to the ‘narrow valley’ shape, increases. In fact, the E_BOD,DO contours are more elongated than those produced by E_DO. Further, Figure 2.43b shows that BOD gives a significant contribution but, being more difficult to measure than DO, it will not be much missed in the subsequent estimation, based on DO alone.
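The two error functionals can be written down in a few lines. The Python sketch below is only an illustration. It assumes the classical single-reach Streeter & Phelps solution rather than the two-reach model (2.32), and the initial BOD `B0`, initial deficit `D0`, saturation value `Cs`, nominal parameters (Kb = 0.12, Kc = 0.20) and sampling times are all hypothetical.

```python
import math

def streeter_phelps(t, Kb, Kc, B0=20.0, D0=1.0, Cs=9.0):
    """Classical single-reach solution: returns (BOD, DO) at time t (needs Kb != Kc)."""
    bod = B0 * math.exp(-Kb * t)
    deficit = ((Kb * B0 / (Kc - Kb)) * (math.exp(-Kb * t) - math.exp(-Kc * t))
               + D0 * math.exp(-Kc * t))
    return bod, Cs - deficit

def error_functionals(times, bod_obs, do_obs, Kb, Kc):
    """Return (E_DO, E_BOD_DO): squared errors on DO only and on BOD plus DO."""
    e_do = e_bod = 0.0
    for t, b_obs, c_obs in zip(times, bod_obs, do_obs):
        b, c = streeter_phelps(t, Kb, Kc)
        e_do += (c_obs - c) ** 2
        e_bod += (b_obs - b) ** 2
    return e_do, e_do + e_bod

# Noise-free synthetic observations from the hypothetical nominal parameters.
times = [2.0 * k for k in range(1, 21)]
obs = [streeter_phelps(t, 0.12, 0.20) for t in times]
bod_obs = [b for b, _ in obs]
do_obs = [c for _, c in obs]

for Kb, Kc in [(0.12, 0.20), (0.10, 0.20), (0.12, 0.25), (0.14, 0.24)]:
    e_do, e_both = error_functionals(times, bod_obs, do_obs, Kb, Kc)
    print(f"Kb={Kb:.2f} Kc={Kc:.2f}  E_DO={e_do:.4f}  E_BOD,DO={e_both:.4f}")
```

Evaluating both functionals on a grid of (Kb, Kc) values would give contour plots of the kind compared in Figure 2.43a.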
[Figure 2.41: pairwise confidence-ellipse panels for the parameters Ks, µmax, Y and bh; legend: C_H, C_FIM]
FIGURE 2.41 Confidence ellipsoids in the case 1 estimation. The coincidence of the confidence regions computed with the Hessian (dashed line) and the FIM matrix (solid line) indicates that the simplex converged to the ‘real’ parameters (dot). (Redrawn with permission from Marsili-Libelli, S. et al., Ecol. Model., 165, 127–146, 2003.)

Two interconnected MATLAB scripts were designed for this exercise, and their use is illustrated in the flowchart of Figure 2.44. The software described here can be retrieved from the companion software bundle, in the subfolder \Exercises\Chapter_2\Calibrate_S&P. The first module to be used is Fisher_contour_SP_DO.m. It enables the user to simulate a sampling campaign by clicking with the mouse around the highest-sensitivity spots (right-click to exit the sampling). Then, the FIM and the estimation bounds are computed based on these samples. All the data and the computed quantities (error surface, FIM, covariance matrix and parameter bounds) are then saved for the subsequent estimation, performed by the other module (Cal_SP_DO.m), which also validates the estimates by applying both the non-parametric and parametric methods.
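The step from a set of sampling instants to the FIM and the parameter bounds can be sketched as follows. This is a rough Python illustration, not the content of Fisher_contour_SP_DO.m. It assumes the classical single-reach DO solution with hypothetical `B0`, `D0`, `Cs` and nominal parameters, finite-difference sensitivities, independent Gaussian noise of standard deviation `sigma`, and bounds taken as 1.96 standard deviations (the source may use a different multiplier).

```python
import math

def do_model(t, Kb, Kc, B0=20.0, D0=1.0, Cs=9.0):
    """DO from the classical Streeter & Phelps solution (hypothetical B0, D0, Cs)."""
    deficit = ((Kb * B0 / (Kc - Kb)) * (math.exp(-Kb * t) - math.exp(-Kc * t))
               + D0 * math.exp(-Kc * t))
    return Cs - deficit

def sensitivities(t, Kb, Kc, h=1e-6):
    """Central finite-difference sensitivities of DO with respect to (Kb, Kc)."""
    s_b = (do_model(t, Kb + h, Kc) - do_model(t, Kb - h, Kc)) / (2 * h)
    s_c = (do_model(t, Kb, Kc + h) - do_model(t, Kb, Kc - h)) / (2 * h)
    return s_b, s_c

def fim_bounds(sample_times, Kb, Kc, sigma, z=1.96):
    """FIM = sum(s s^T) / sigma^2; bounds = z * sqrt(diag(FIM^-1)) for (Kb, Kc)."""
    f11 = f12 = f22 = 0.0
    for t in sample_times:
        s_b, s_c = sensitivities(t, Kb, Kc)
        f11 += s_b * s_b / sigma**2
        f12 += s_b * s_c / sigma**2
        f22 += s_c * s_c / sigma**2
    det = f11 * f22 - f12 * f12
    return z * math.sqrt(f22 / det), z * math.sqrt(f11 / det)

few = [4.0, 8.0, 12.0]                      # hypothetical sampling instants
more = few + [6.0, 10.0, 16.0, 24.0]        # the same campaign with extra samples
print(fim_bounds(few, 0.12, 0.20, sigma=0.1))
print(fim_bounds(more, 0.12, 0.20, sigma=0.1))
```

Because each added sample contributes a non-negative term to the FIM, the bounds can only shrink as samples are added. Samples placed where the sensitivities are large shrink them fastest, which is why the sampling is guided towards the high-sensitivity spots.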
[Figure 2.42: pairwise confidence-ellipse panels for the parameters Ks, µmax, Y and bh; legend: C_H, C_FIM]
FIGURE 2.42 Confidence ellipsoids in the case 2 estimation. The divergence between the two confidence regions indicates that the simplex did not converge to the ‘real’ parameters (dot). (Redrawn with permission from Marsili-Libelli, S. et al., Ecol. Model., 165, 127–146, 2003.)

TABLE 2.2
Values and Confidence Intervals of the Correctly Estimated Monod Kinetic Parameters (Case 1 in Table 2.1)

          Ks (mg COD L−1)      µmax (h−1)          Y                   bh (h−1)
Case 1    19.09968 ± 1.72181   0.48298 ± 0.03212   0.50524 ± 0.01139   0.03025 ± 0.00057

Source: Marsili-Libelli, S. et al., Ecol. Model., 165, 127–146, 2003.

Figure 2.45 shows the estimation results of the S&P model with a noisy data set (DO_sag_noise.mat) concentrated in the high-sensitivity zones. Given the data and the model, the FIM can be computed before the estimation, so that the expected confidence intervals for the two parameters are known in advance. In this case, these values are (δKb = ±0.01059; δKc = ±0.011368), as shown in the left part of Box 2.4. After performing the actual model calibration, the results confirm that the estimated parameters are indeed inside the expected confidence intervals. In fact,

$$\delta K_b = 0.0028218 < 0.01059, \qquad \delta K_c = 0.0037711 < 0.011368$$
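The comparison between the a priori FIM bounds and the actual estimation errors is a plain element-wise check. A minimal Python sketch, using only the values quoted in the text for the noisy data set (the dictionary and function names are illustrative):

```python
# Values quoted in the text for the noisy data set (DO_sag_noise.mat).
expected_bounds = {"Kb": 0.01059, "Kc": 0.011368}     # bounds predicted by the FIM
param_errors = {"Kb": 0.0028218, "Kc": 0.0037711}     # parameter errors after calibration

def within_bounds(errors, bounds):
    """Map each parameter to True when its absolute error does not exceed its bound."""
    return {name: abs(errors[name]) <= bounds[name] for name in bounds}

print(within_bounds(param_errors, expected_bounds))   # {'Kb': True, 'Kc': True}
```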
[Figure 2.43: (a) contour plot and (b) 3D surface of the error functionals over the (Kb, Kc) plane]
FIGURE 2.43 Contour profile (a) and 3D visualization (b) of the error functionals for the calibration of the Streeter & Phelps model using either both state variables BOD and DO (dashed lines) or only DO (solid lines). The counterintuitive result is that using only DO measurements, the ‘narrow valley’ problem is alleviated, though the gradient decreases. The 3D surface (b) shows that, although BOD gives a considerable contribution, it narrows the bottom of the functional.
[Figure 2.44: flowchart with the following elements: Preliminary operations (Fisher_contour_SP_DO.m); Simulate the S&P model with nominal parameters; Sampled data, FIM, Confidence bounds; Model calibration and validation (Cal_SP_DO_Fish.m); Ferr_SP.m; SP.mdl]
FIGURE 2.44 Software organization of the MATLAB scripts for the complete estimation of the Streeter & Phelps model, in the subfolder \Exercises\Chapter_2\Calibrate_S&P.
The residual analysis in Figure 2.45b and d confirms that the residuals are uncorrelated and normally distributed (both the KS and the Lilliefors tests confirm this), and the F-test on the regression line in Figure 2.47a also confirms that the null hypothesis (regression line coinciding with the 1:1 line) cannot be rejected. The estimated parameters are well within the 95% confidence contour of the error functional, as shown in Figure 2.45c.
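The normality part of this residual check can be sketched in Python as a Lilliefors-type test: the Kolmogorov-Smirnov distance is computed against a normal cdf whose mean and standard deviation are estimated from the sample itself. This is not the MATLAB routine used in the exercise. The critical value 0.886/√n is the usual large-sample approximation at the 5% level (n > 30), and both test vectors are synthetic, deterministic stand-ins for real residuals.

```python
import math
from statistics import NormalDist, mean, stdev

def lilliefors_statistic(x):
    """Max distance between the empirical cdf and a normal cdf fitted to the sample."""
    n = len(x)
    fitted = NormalDist(mean(x), stdev(x))
    d = 0.0
    for i, v in enumerate(sorted(x)):
        c = fitted.cdf(v)
        d = max(d, (i + 1) / n - c, c - i / n)
    return d

def looks_normal(x):
    """Compare with the large-sample 5% critical value 0.886 / sqrt(n)."""
    return lilliefors_statistic(x) < 0.886 / math.sqrt(len(x))

n = 2000
normal_like = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]  # ideal normal scores
uniform_like = [(i + 0.5) / n for i in range(n)]                        # evenly spread values

print(looks_normal(normal_like), looks_normal(uniform_like))
```

With these inputs, the normal scores pass the check and the evenly spread values fail it.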
Figure 2.46a shows another example of the S&P model identification, but this time a highly biased data set (DO_bias.mat) was used, where it is assumed that the oximeter has a strong bias, so that its readings are consistently lower than the actual DO value. The instrument thus has a low accuracy (large deviation from the true values) but a high precision (little measurement dispersion). Owing to the high precision, the preliminary computation of the FIM yields surprisingly narrow confidence bounds (δKb = ±0.0055481; δKc = ±0.0065491), which are not satisfied by the subsequent estimation; in fact, the differences between real and estimated parameters are

$$\delta K_b = 0.027927 > 0.0055481, \qquad \delta K_c = 0.046436 > 0.0065491 \tag{2.91}$$
[Figure 2.45: four panels (a) to (d); legends: Calibrated model, Nominal model, Data; Exp. cdf, Fitted cdf; FIM approx. conf. region, Exact conf. region; axis label: Kb]
FIGURE 2.45 Estimation results for the Streeter & Phelps model using the data file DO_sag_noise.mat. In (a), the fitted model response is compared with the ‘real’ response, showing their coincidence. In (c), the dashed contour corresponds to the 95% exact confidence region and includes the approximate confidence region based on the FIM. The estimated parameters are well inside both regions. The graphs in (b) assess the residuals, showing the lack of autocorrelation, while in (d) the Gaussian cdf is well approximated by the experimental cumulative distribution.

Also, Figure 2.46c shows that the estimated parameters are just on the border of the 95% confidence contour of the error functional. The residual analysis in Figure 2.46b indicates a high autocorrelation, as a consequence of a biased model, whereas the F-test of Figure 2.47b yields an extremely high F value, confirming that this identification should be rejected as a consequence of the poor data quality.
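The autocorrelation symptom mentioned above is easy to reproduce. The Python sketch below compares the lag-1 autocorrelation of two synthetic residual sequences with the approximate 95% band ±1.96/√n. Neither sequence comes from DO_bias.mat: `white` is plain Gaussian noise, and `drifting` is a hypothetical slowly varying mismatch of the kind left behind when a model is fitted to biased data.

```python
import math
import random

def lag_autocorrelation(res, lag=1):
    """Sample autocorrelation of a residual sequence at the given lag."""
    n = len(res)
    m = sum(res) / n
    var = sum((r - m) ** 2 for r in res)
    cov = sum((res[i] - m) * (res[i + lag] - m) for i in range(n - lag))
    return cov / var

n = 200
rng = random.Random(1)                       # fixed seed for repeatability
white = [rng.gauss(0.0, 0.05) for _ in range(n)]
drifting = [0.3 * math.sin(2 * math.pi * i / n) + rng.gauss(0.0, 0.02)
            for i in range(n)]

band = 1.96 / math.sqrt(n)                   # approximate 95% band for white residuals
for name, res in [("white", white), ("drifting", drifting)]:
    r1 = lag_autocorrelation(res)
    print(f"{name}: r1 = {r1:.3f}, outside 95% band: {abs(r1) > band}")
```

A lag-1 value close to 1, as obtained for the drifting sequence, signals that the residuals still contain structure the model failed to explain.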