B.7 Sensitivity analysis
B.7.2 Sensitivity analysis – incomplete information
Although research for information on past floods might seek to obtain a complete census of past flooding events, some events may not get picked up in the historical record. This section investigates the possible effects of including incomplete historical records in the estimation procedure.
The conceptualisation of the testing framework within the simulation study for this exercise is not trivial, as a random number 𝑘 of threshold exceedances is recorded in the Monte Carlo generation of each historical sample. The simulation setting forced at least one historical record above the threshold to be generated, but it did not fix the number of exceedances of the perception threshold.
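The random number of exceedances can be illustrated with a minimal sketch. It assumes that years are independent, so that the count over ℎ years is binomial with success probability 1 − 𝑝, and that the "at least one exceedance" condition is imposed by redrawing. The function name and the rejection step are illustrative assumptions, not a description of the generator actually used in the study.

```python
import numpy as np

def simulate_exceedance_count(h, p, rng):
    """Draw K, the number of years out of h that exceed the perception threshold.

    Assumption: independent years, so K ~ Binomial(h, 1 - p).
    Assumption: at least one exceedance is forced by redrawing until K >= 1.
    """
    while True:
        k = int(rng.binomial(h, 1.0 - p))
        if k >= 1:
            return k

rng = np.random.default_rng(1)
# K differs from one Monte Carlo replicate to the next; it is not fixed in advance.
counts = [simulate_exceedance_count(50, 0.85, rng) for _ in range(1000)]
print(min(counts), max(counts), sum(counts) / len(counts))
```

Because 𝑘 varies between replicates, any rule for deleting events has to be expressed relative to the number of exceedances actually generated.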
The expected number of threshold exceedances in a given historical record of length ℎ, with a perception threshold corresponding to the 𝑝th percentile (with 𝑝 expressed as a probability), is 𝐸[𝐾] = ℎ(1 − 𝑝). The sensitivity of the estimation procedure to incomplete information (that is, historical events not included in the estimation procedure) was evaluated only for the simulation settings in which the expected number of events above the threshold is larger than 5 (𝐸[𝐾] > 5). Furthermore, the number of events that could be omitted from the historical record differs between simulation settings: if the total number of threshold exceedances is 5, no more than 4 historical records can be missing if the dataset is still to contain some historical information.
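The arithmetic behind these rules can be sketched as follows. The function names are hypothetical, and the (ℎ, 𝑝) pairs are those quoted in this section (a threshold at the 100-year event is taken to mean 𝑝 = 0.99).

```python
def expected_exceedances(h, p):
    """E[K] = h * (1 - p) for a record of h years and non-exceedance probability p."""
    return h * (1.0 - p)

def is_tested(h, p, minimum=5.0):
    """True when the setting has more than `minimum` expected exceedances."""
    return expected_exceedances(h, p) > minimum

def max_deletable(k):
    """Largest number of events that can be dropped while keeping at least one."""
    return max(k - 1, 0)

for h, p in [(50, 0.85), (100, 0.85), (2300, 0.99)]:
    print(h, p, round(expected_exceedances(h, p), 2), is_tested(h, p))
```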
Depending on the expected number of events in the historical record, sets of events of increasing size 𝑘− were deleted from the historical record, with 𝑘− taking values in (1, 2, 3, 5, 6, 10, 15, 30, 45). For each simulation setting, the values that 𝑘− can take are limited by the number of events in the historical record, and the effect of deleting the information for 𝑘− events on the final estimate depends on the actual true value of 𝑘.
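The deletion step can be sketched as below. How the deleted events were selected is not stated in this section, so the sketch assumes they are removed uniformly at random; the function names are hypothetical.

```python
import numpy as np

# Deletion counts quoted in the text.
K_MINUS_CANDIDATES = (1, 2, 3, 5, 6, 10, 15, 30, 45)

def admissible_deletions(k, candidates=K_MINUS_CANDIDATES):
    """Deletion counts that still leave at least one of the k historical events."""
    return [k_minus for k_minus in candidates if k_minus < k]

def drop_events(historical, k_minus, rng):
    """Return the historical sample with k_minus events removed.

    Assumption: the removed events are chosen uniformly at random.
    """
    historical = np.asarray(historical, dtype=float)
    k = historical.size
    if not 0 <= k_minus < k:
        raise ValueError("k_minus must leave at least one historical event")
    keep = rng.choice(k, size=k - k_minus, replace=False)
    return historical[np.sort(keep)]

rng = np.random.default_rng(0)
sample = [310.0, 345.0, 360.0, 402.0, 455.0, 470.0, 520.0]  # made-up peak flows
for k_minus in admissible_deletions(len(sample)):
    print(k_minus, drop_events(sample, k_minus, rng))
```

With 7 events in the sample, the admissible deletion counts run from 1 to 6, so the largest deletion leaves a single historical event.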
A subsample of results for the simulation settings with a low expected number of threshold exceedances (at most 15) is shown in Figure B.26, where the RMSE for the shape parameter is plotted against the number of events deleted from the historical record. Only selected simulation settings are shown in each panel, and the expected number of historical events exceeding the threshold differs between settings. Each line type and symbol indicates a different sample size, while colours indicate the historical to systematic length ratio. For example, in the upper left panel, the upper blue line with dots indicates the simulation setting in which a systematic record of length 10 years is augmented by an historical record covering a period of 50 years, in which one would expect to observe on average (1 − 0.85) × 50 = 7.5 events above the threshold. The greenish line with dots just below indicates the simulation setting in which a systematic record of length 10 years is augmented with an historical record covering a period of 100 years, in which one would expect to observe on average (1 − 0.85) × 100 = 15 events above the threshold. Thus it is possible to test the effect of excluding up to 6 events from the historical record. The effect of omitting one or more historical data points can be assessed by looking at the increase in the RMSE as the number of missing data points increases.
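For reference, the RMSE plotted for each setting can be computed from the Monte Carlo estimates as in the sketch below. The numbers are toy values chosen only to show the calculation; they are not results from the study, and the function name is hypothetical.

```python
import numpy as np

def rmse(estimates, true_value):
    """Root mean square error of a set of estimates around the known true value."""
    estimates = np.asarray(estimates, dtype=float)
    return float(np.sqrt(np.mean((estimates - true_value) ** 2)))

true_shape = 0.0
# Toy shape estimates from three replicates, keyed by the number of deleted events.
toy_estimates = {
    0: [0.05, -0.05, 0.05],
    2: [0.10, -0.10, 0.10],
}
for k_minus, shapes in toy_estimates.items():
    print(k_minus, rmse(shapes, true_shape))
```

Plotting these values against the number of deleted events gives one line of the kind shown in each panel.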
Figure B.26 RMSE for the shape parameter as a function of the number of historical events missing in the historical record using the likelihood estimation method
Notes: Original historical records contain on average at most 15 points.
Missing information in the historical sample does not seem to have a very large impact on the quality of the estimate of the shape parameter, although in most of the examples shown in Figure B.26 the number of events discarded from the historical sample tends to be a relatively small proportion of the whole historical information. However, since considerable effort is made when constructing series of historical events, it is hoped that only a small percentage of past events are absent from the dataset.
To give a more complete description of the relationship between the number of missing data points and the loss in terms of the RMSE of the shape parameter, the results for a larger set of simulation settings are given in Figure B.27. The x-axis is shown on a log scale to make the figure more readable. Note that some additional simulation settings with a higher perception threshold 𝑋0 were included to investigate the effect of the incomplete samples; for these simulation settings the expected number of threshold exceedances is not very high even for very long historical periods, which corresponds to more realistic data availability scenarios.
Figure B.27 RMSE for the shape parameter as a function of the historical events missing in the historical record using the likelihood estimation method
The estimation procedure appears to be relatively robust to the case in which a fairly high number of historical events is missing from the historical record. The RMSE becomes much larger only when very large proportions of the historical events are missing. For example, in the upper right panel of Figure B.27, the long dashed line with squares represents the results for the case in which a systematic record of 46 years is augmented with an historical record spanning ℎ = 2,300 years. Since the perception threshold corresponds to the 100-year event, on average there would be 23 historical events exceeding 𝑋0. The estimation remains fairly stable as the number of missing data points increases up to 𝑘− = 6 and only gives visibly worse results when 10 points are missing from the historical record. This corresponds to almost half of the historical sample. The effect of the missing information is stronger in the case of negative shape parameters.
To investigate the overall performance of the likelihood method for design event estimation when the historical sample is incomplete, Figure B.28 shows the RMSE for the log of the estimates of different 𝑄𝑇 values obtained when using a systematic sample of 46 years and an historical period of 230 years or 2,300 years. The RMSE obtained when using the PWM/L-moment method with the systematic record only is also displayed for reference.
Figure B.28 RMSE for log(𝑄𝑇) for a systematic sample size of 46 and an historical period of (A) ℎ = 230 and (B) ℎ = 2,300
Notes: The expected number of historical events is shown for each perception threshold 𝑋0.
Colours indicate the number of events not included in the historical record 𝑘−.
The dashed line indicates the PWM estimate for systematic data only. Estimation methods: likelihood.
The lack of threshold exceedances in the historical record has a stronger effect on overall performance for lower perception thresholds and for distributions with negative shape parameters. Interestingly, when events with a high return period are estimated, the likelihood method seems to give a lower RMSE than the traditional L-moment analysis even if a large part of the historical sample is missing.
Results for the (P)PWM setting are shown in Figure B.29 and Figure B.30, which show the effect of an increasing number of missing data points in the historical record on the RMSE for the shape parameter. These figures can be compared with Figure B.26 and Figure B.27. The (P)PWM method seems to be less robust than the likelihood method to the lack of information in the historical record.
Figure B.29 RMSE for the shape parameter as a function of the historical events missing in the historical record using the (P)PWM estimation method
Figure B.30 RMSE for the shape parameter as a function of the historical events missing in the historical record using the (P)PWM estimation method