Appendix B: Statistical methods for the inclusion of historical data in flood frequency analysis

B.1 Introduction

The advantages of incorporating information on historical flood events into classical flood frequency analysis (FFA) have long been recognised (Stedinger and Cohn 1986, Bayliss and Reed 2001, Payrastre et al. 2011) and a number of authors have proposed appropriate methods. The inclusion of historical records has been shown to substantially reduce the uncertainty around estimated design events and can provide insight into the rarest events, which might have pre-dated relatively short systematic records of river flow. The procedures used are usually extensions of 2 estimation methods often used in hydrology: L-moments and maximum likelihood estimation. L-moments were introduced by Hosking (1990) and are widely used by hydrologists, in particular in regional flood frequency analysis (RFFA). The statistical methods described in the Flood Estimation Handbook (FEH) and its subsequent updates are based on L-moment estimation. L-moments can be computed as linear combinations of probability weighted moments (PWM), and Wang (1990a, 1990b) introduced partial probability weighted moments (PPWM) to accommodate censored samples such as historical records. Historical records are censored samples in the sense that much of the available information is simply the fact that a number of events did not exceed a certain perception threshold: for these events no information on the flow magnitude is available, but it is known that they were below a given value.
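The relationship between probability weighted moments and L-moments can be illustrated with a minimal Python sketch. It uses the standard unbiased sample PWM estimator for a complete (uncensored) sample and the usual linear combinations giving the first 4 L-moments. The function names `sample_pwms` and `sample_l_moments` are hypothetical, chosen for illustration only, and the sketch does not implement the partial (censored) variant of Wang (1990a, 1990b).

```python
def sample_pwms(flows, n_moments=4):
    """Unbiased sample probability weighted moments b_0 ... b_{n_moments-1}.

    b_r = (1/n) * sum_j [(j-1)(j-2)...(j-r)] / [(n-1)(n-2)...(n-r)] * x_(j),
    where x_(1) <= ... <= x_(n) is the sample sorted in ascending order.
    """
    x = sorted(flows)
    n = len(x)
    if n < n_moments:
        raise ValueError("need at least n_moments observations")
    pwms = []
    for r in range(n_moments):
        total = 0.0
        for j, xj in enumerate(x, start=1):
            weight = 1.0
            for m in range(1, r + 1):
                weight *= (j - m) / (n - m)
            total += weight * xj
        pwms.append(total / n)
    return pwms


def sample_l_moments(flows):
    """First 4 sample L-moments as linear combinations of the sample PWMs."""
    b0, b1, b2, b3 = sample_pwms(flows, 4)
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
    return l1, l2, l3, l4
```

For an evenly spaced sample such as `[1, 2, 3, 4, 5]`, the first L-moment is the sample mean and the third L-moment is zero, reflecting the symmetry of the sample.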

Maximum Likelihood (ML) estimation (Azzalini 1996, Coles 2001) is widely used due to its flexibility and the optimal asymptotic properties which maximum likelihood estimates possess, namely unbiasedness and efficiency. Stedinger and Cohn (1986) show how to modify the likelihood function to include historical data. In their review of methods to include historical data in flood frequency analysis, Bayliss and Reed (2001) note that approaches based on Maximum Likelihood seem to be used more frequently by researchers, despite the potential for numerical failure when maximising the likelihood. Indeed, in a more recent review, Kjeldsen et al. (2014) found that, apart from Spain, those countries that have standardised procedures in place for the use of historical data in FFA recommend Maximum Likelihood approaches, which can also be combined with a Bayesian approach. A brief introduction to (Partial) Probability Weighted Moments and Maximum Likelihood methods is given below. The reader is referred to the references in the text for more information.

The standard methods for flood frequency estimation rely on samples of systematic records of measured high flow at a given gauging station, represented as 𝐱 = (𝑥₁, …, 𝑥ₙ). Typically, it is assumed that the data available follow a specific probability distribution indexed by some parameters 𝛉 (𝑋 ∼ 𝐹(𝑥, 𝛉)), and statistical methods are employed to estimate the parameters of the distribution based on the available data. Historical data, when available, typically consist of the information that floods of a certain magnitude occurred at some point in time and that these floods were the largest events within a certain period (for example, since the beginning of the printing of a local newspaper). In particular, the methods presented in this appendix deal mostly with the case in which the magnitude of k events across h years is known, although the case in which it is known only that k events exceeded the perception threshold is also investigated.

All k events have a magnitude above a value 𝑋₀, often called the perception threshold, since it corresponds to the threshold above which a flood would have been large enough to be noted in historical sources or to leave recognisable signs across the catchment. One important assumption made in this setting is that the k historical floods for which some information is available are all of the events above the perception threshold that occurred during the period covered by the h years.
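To make the roles of h, k and 𝑋₀ concrete, the sketch below writes out a censored-sample log-likelihood of the kind described by Stedinger and Cohn (1986): the systematic and historical flows contribute density terms, while each of the h − k historical years without a recorded flood contributes the probability of staying below 𝑋₀. Several choices here are assumptions made only for illustration: the Gumbel distribution is used because its density and distribution function are short to write, the h historical years are assumed not to overlap the systematic record, the constant binomial coefficient is omitted because it does not depend on the parameters, and all function names are hypothetical.

```python
import math


def gumbel_log_pdf(x, mu, beta):
    """Log density of a Gumbel distribution with location mu and scale beta."""
    z = (x - mu) / beta
    return -math.log(beta) - z - math.exp(-z)


def gumbel_log_cdf(x, mu, beta):
    """Log distribution function: log F(x) = -exp(-(x - mu) / beta)."""
    return -math.exp(-(x - mu) / beta)


def loglik_with_historical(mu, beta, systematic, historical, x0, h):
    """Log-likelihood when the magnitudes of the k historical events are known."""
    k = len(historical)
    if beta <= 0:
        raise ValueError("beta must be positive")
    if k > h:
        raise ValueError("k cannot exceed the number of historical years h")
    if any(y <= x0 for y in historical):
        raise ValueError("historical events must exceed the perception threshold")
    ll = sum(gumbel_log_pdf(x, mu, beta) for x in systematic)
    ll += sum(gumbel_log_pdf(y, mu, beta) for y in historical)
    # The remaining h - k historical years are known only to be below x0.
    ll += (h - k) * gumbel_log_cdf(x0, mu, beta)
    return ll


def loglik_exceedance_count_only(mu, beta, systematic, k, x0, h):
    """Log-likelihood when only the number k of exceedances of x0 is known."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    if k > h:
        raise ValueError("k cannot exceed the number of historical years h")
    log_f0 = gumbel_log_cdf(x0, mu, beta)
    log_exceed = math.log1p(-math.exp(log_f0))  # log(1 - F(x0))
    ll = sum(gumbel_log_pdf(x, mu, beta) for x in systematic)
    ll += (h - k) * log_f0 + k * log_exceed
    return ll
```

In practice these functions would be passed to a numerical optimiser to find the parameter values that maximise the log-likelihood. A different distribution could be substituted by replacing the 2 Gumbel helper functions.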

A good understanding of the values of h, k and 𝑋₀ is a necessary prerequisite before applying any estimation procedure that combines systematic and historical data, as these values have a large impact on the final estimates (see, for example, the discussion in Strupczewski et al. 2014 or Macdonald et al. 2014). Figure B.1 illustrates the quantities h, k, n and 𝑋₀ used throughout the report.

Figure B.1 Historical data example (River Wear at Durham), showing a total of k = 6 historical events (red bars) above the perception threshold 𝑋₀ (dashed red line), recorded across the h = 154 year-long historical period

Notes: The n = 51 year-long systematic record of gauged peak flows is also shown (black bars).

One additional important aspect of any statistical procedure used to estimate flood frequencies is the assumption that all data points, from both the systematic and the historical records, come from the same distribution, that is, that the process under study is stationary. If there is reason to believe that large changes have occurred in the flood generation process (for example, changes in the basin properties, or disruptions of the floodplain or the river channel that could alter the hydraulic properties of the river), a thorough assessment of whether events from the past can be representative of the present situation should be performed. This also applies to changes that might result from climate change, for example, diminishing snowfall and snowmelt.

Payrastre et al. (2011) note that changes in the basin properties are not likely to affect the magnitude of extremely large and rare events, and it is likely that including historical events would still give more complete information on the very rare cases. For events of relatively small size (for example, a magnitude in line with those recorded in a 25-year record), it is important to ensure that the historical record can be directly integrated with the systematic records.

Given the fluctuations between flood-rich and flood-poor periods, information about past events could help to give better estimates of the frequency of large events which might not have been registered in the systematic record. Ideally, historical records could be used to gain a better understanding of the natural fluctuations of river flows, as hinted in Macdonald (2014). In the likelihood approach, a non-stationary model could be employed to relate flood risk to one or more external variables, but a very long and rich historical record would be required to gain a good understanding of large-scale variability.

B.2 Methods
