3 METHODS
3.2 Methods for Flood Frequency Analysis
3.2.1 Event Definition
Following the collation, checking and preliminary analysis of the datasets, extraction of extreme values and event definition were required. Independent extreme events were determined relative to each variable type. For sea levels, an independent event was defined as occurring at each successive tide. For river flow above the tidal reach, an event was defined as the duration of a high flow period, typically around 48 hours, although many flow events extended over several days because successive rainfall events maintained high groundwater levels.
For the point of interest (Lewes), event definition was more complex because the interactions of tide, surge and river flow differed between events. An event analysis was undertaken to compare historical extreme high water levels at Lewes with simultaneous sea level and flow observations, in order to establish their dependency on high tides and river flows.
3.2.2 Annual Maxima (AMAX) Series
Extreme values are, by their nature, rare at the point of interest. Annual (water-year) maxima were extracted from the daily maxima series, from October 1st to September 30th, creating an annual maxima (AMAX) extreme value series for each variable. This ensured that each annual maximum was drawn from a complete winter and summer season, allowing seasonality effects to be identified.
Due to the variable nature of hydrologic data recording, the vast majority of water-years contained some period of null values. To assess whether any missing periods in each series may have included other high (and possibly the highest) annual value, each data series was cross-checked with neighbouring recorded series for the same period to see if high values were likely. Seasonality was also taken into account, with winter months most likely to contain the maxima values from each meteorologically driven series. The maxima value was extracted on a year-by-year basis, and the percentage of missing data from each annual maxima series was calculated and included with the maxima values to display their relative accuracy (Appendix A.1).
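The water-year extraction and missing-data percentage described above can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: the function names and the input layout (a dictionary of dates mapped to daily maxima, with `None` for null values) are assumptions, and the cross-check against neighbouring series is not reproduced.

```python
from datetime import date

def water_year(d):
    """Label a date by the calendar year in which its water year starts (1 October)."""
    return d.year if d.month >= 10 else d.year - 1

def amax_with_missing(daily):
    """Extract water-year maxima from a daily maxima series.

    daily: dict mapping date -> daily maximum (float), or None for a null value.
           Days absent from the dict are also counted as missing.
    Returns {water_year: (annual_maximum_or_None, percent_missing)}.
    """
    by_year = {}
    for d, value in daily.items():
        by_year.setdefault(water_year(d), []).append(value)

    result = {}
    for wy, values in sorted(by_year.items()):
        # Length of the water year in days (365 or 366).
        n_days = (date(wy + 1, 10, 1) - date(wy, 10, 1)).days
        present = [v for v in values if v is not None]
        pct_missing = 100.0 * (n_days - len(present)) / n_days
        result[wy] = (max(present) if present else None, pct_missing)
    return result
```

Reporting the missing percentage alongside each maximum, rather than discarding incomplete years, mirrors the approach in the text of keeping every year and flagging its relative accuracy.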
The annual maxima (AMAX) series at Barcombe Mills, Lewes Corporation Yard, Lewes Gas Works and Newhaven were identified as the four primary hydrological series for the flood frequency analysis, due to their locations at the fluvial and tidal limits of the lower Ouse (Barcombe Mills and Newhaven), and at intermediate points of interest (Lewes Corporation Yard and Lewes Gas Works). Each AMAX series was extracted and extended (where possible) to provide a long series of AMAX observations at each location.
3.2.3 Peaks-Over-Threshold Series
An AMAX series extracts only the largest event from each calendar or water year, which may disguise the true historical pattern and rarity of events, since any given year may contain more than one significant or extreme event. A peaks-over-threshold (POT) series instead uses a threshold exceedance approach to select the peak value of each significant event in each series.
A POT approach was applied to each series, selecting independent peak events that exceeded generic (i.e. percentile-based) threshold levels applied to each dataset. The process eliminated the non-extreme peaks (such as the everyday tidal peaks) and produced a series of the highest values drawn uniformly from across each dataset, independent of the calendar or water year.
Five POT series were calculated for each variable using threshold values selected as:
• 95th, 98th and 99th percentiles,
• an average of 5 POT exceedances per year based on the whole dataset, and
• selecting the lowest AMAX value as the threshold level.
The lowest AMAX value was selected as a threshold level for each series so as not to ignore observations from years when the peak values were relatively low. This produced at least one peak value per water-year, with many years containing numerous extreme values. To ensure that POT events were independent, exceedances falling on the same day or within a 3-day window (±1 day of the highest exceedance) were treated as a single event, and only the peak value within that window was retained. Although it was not possible to take other factors into account, such as high groundwater levels from a previous POT event, the process enabled the POT series to represent the extremal nature of flooding events as accurately as possible.
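The threshold selection and ±1 day independence rule described above can be sketched as follows. This is one possible reading of the rule, assuming daily data held as (date, value) pairs; the function names, the linear-interpolation percentile and the "largest exceedance first" selection order are illustrative assumptions rather than the method used in the study.

```python
from datetime import date

def percentile(values, q):
    """Percentile (q between 0 and 100) of a non-empty list, by linear interpolation."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def pot_events(series, threshold, window_days=1):
    """Select independent peaks over a threshold.

    series: list of (date, value) pairs.
    Exceedances are visited from largest to smallest; an exceedance is dropped
    if it lies within +/- window_days of a peak that has already been kept.
    Returns the retained (date, value) peaks in date order.
    """
    exceedances = sorted(
        (p for p in series if p[1] > threshold),
        key=lambda p: p[1],
        reverse=True,
    )
    peaks = []
    for d, value in exceedances:
        if all(abs((d - kept_date).days) > window_days for kept_date, _ in peaks):
            peaks.append((d, value))
    return sorted(peaks)
```

A threshold such as the 95th percentile would be obtained with `percentile([v for _, v in series], 95)`; the lowest-AMAX threshold would simply be the minimum of the AMAX series.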
3.2.4 Distribution Selection & Return Period Estimation
Extreme value analysis is used to make inferences about the size and frequency of extreme events. The frequency of occurrence of the extreme hydrological observations was analysed using statistical probability distributions fitted to the annual maxima sequence of observations. The annual extreme hydrological observations are located in the extreme tail of the parent probability distribution. As such, a distribution which fits the complete duration series would not be suitable for the extreme values. A suitable distribution for extreme values is the Generalized Extreme Value (GEV) distribution, which merges the type I, II and III families of extreme value distributions (commonly known as Gumbel, Fréchet and Weibull) to allow for a continuous range. The extreme value distributions have been found to be well suited to describing annual series of extreme values from UK hydrological data (e.g. Chow et al., 1988; Environment Agency, 2002) and were recommended for extreme distribution fitting in the Flood Estimation Handbook (Robson and Reed, 1999).
The GEV distribution has three parameters of location µ, scale α and shape k. The GEV probability distribution function for −∞ ≤ x ≤ ∞ is then given as:

$$F(x;\, \mu, \alpha, k) = \exp\left\{ -\left[ 1 + k\left( \frac{x - \mu}{\alpha} \right) \right]^{-1/k} \right\} \qquad (3.1)$$

When k < 0, the GEV distribution is equivalent to the type III (Weibull) extreme value distribution. Similarly, when k > 0, the GEV distribution is equivalent to the type II (Fréchet) extreme value distribution. As k approaches the limit of 0, the GEV becomes the type I (Gumbel) extreme value distribution.
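As an illustration of Equation 3.1 and its three cases, the distribution function can be evaluated as in the sketch below. It assumes the sign convention in which k > 0 gives the type II (Fréchet) form, as stated in the text; the function name and the tolerance used to switch to the Gumbel limit are hypothetical choices.

```python
import math

def gev_cdf(x, mu, alpha, k):
    """Evaluate F(x; mu, alpha, k) = exp(-[1 + k (x - mu) / alpha] ** (-1 / k)).

    mu: location, alpha: scale (> 0), k: shape.
    k > 0 -> type II (Frechet), bounded below at mu - alpha / k
    k < 0 -> type III (Weibull), bounded above at mu - alpha / k
    k -> 0 -> type I (Gumbel), F = exp(-exp(-(x - mu) / alpha))
    """
    if alpha <= 0:
        raise ValueError("scale parameter alpha must be positive")
    z = (x - mu) / alpha
    if abs(k) < 1e-12:
        # Gumbel limit as k approaches 0.
        return math.exp(-math.exp(-z))
    t = 1.0 + k * z
    if t <= 0.0:
        # Outside the support: below the lower bound (k > 0) or above the upper bound (k < 0).
        return 0.0 if k > 0 else 1.0
    return math.exp(-(t ** (-1.0 / k)))
```

At x = µ the function returns exp(−1) for every value of k, which is a convenient sanity check on a fitted location parameter.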
An extreme value analysis was undertaken for each hydrological data series. The suitability of the GEV distribution was checked by calculating the goodness of fit for each dataset using the Anderson-Darling test and by estimating the coefficient of skew. The GEV distribution was fitted to each annual maxima extreme series using the Flood Estimation Handbook (Reed, 1999) software package WINFAP-FEH. The fitted probability distributions for each hydrological variable were extrapolated to extreme values to estimate return periods beyond the duration of the series. Each of the distributions was extrapolated up to a maximum of the 1:200 year return period.
However, the majority of the data series extended to approximately 50 years; return periods and estimated magnitudes above this level were therefore treated with caution.
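The extrapolation to a given return period amounts to inverting the fitted GEV distribution at the non-exceedance probability 1 − 1/T. The sketch below assumes the form F(x) = exp{−[1 + k(x − µ)/α]^(−1/k)} with already fitted parameters; it does not reproduce the WINFAP-FEH fitting procedure, and the function name is hypothetical.

```python
import math

def gev_return_level(T, mu, alpha, k):
    """Magnitude with a return period of T years under a fitted GEV distribution.

    Solves F(x) = 1 - 1/T for x, where
    F(x) = exp(-[1 + k (x - mu) / alpha] ** (-1 / k)).
    """
    if T <= 1:
        raise ValueError("return period T must be greater than 1 year")
    y = -math.log(1.0 - 1.0 / T)  # equals -ln F at the target probability
    if abs(k) < 1e-12:
        # Gumbel limit as k approaches 0.
        return mu - alpha * math.log(y)
    return mu + (alpha / k) * (y ** (-k) - 1.0)
```

For example, `gev_return_level(200, mu, alpha, k)` gives the 1:200 year estimate for a set of fitted parameters; with roughly 50 years of record, such a value lies well beyond the observed range.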
3.2.5 Statistical Correlation
Each hydrological POT series was cross-correlated with relevant corresponding POT series to provide an indication of the relationship and possible dependence (or independence) between each pair, and to establish the primary variables in the production of extreme water levels at the point of interest. Each pair of hydrological variables was statistically correlated to indicate the relationship between the series. P values were obtained using ANOVA multiple regression analysis. Results were taken as significant where P < 0.05. Percentages of simultaneous and independent occurrences were also calculated to further assess the relationship. Time-lags of 1 and 2 days were also introduced to establish whether the correlation differed over longer time periods.
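The lagged correlation and the percentage of simultaneous occurrences can be illustrated with the sketch below. It assumes two aligned daily series of equal length and event days expressed as integer day indices; the function names are hypothetical, Pearson's coefficient is used as a stand-in for the correlation measure, and the ANOVA-based P values are not computed here.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def lagged_r(x, y, lag):
    """Correlate x on day t with y on day t + lag (lag >= 0, in days)."""
    if lag == 0:
        return pearson_r(x, y)
    return pearson_r(x[:-lag], y[lag:])

def percent_simultaneous(days_a, days_b, lag=0):
    """Percentage of events in days_a matched by an event in days_b exactly lag days later."""
    days_b = set(days_b)
    hits = sum(1 for d in days_a if d + lag in days_b)
    return 100.0 * hits / len(days_a)
```

The percentage of independent occurrences follows as 100 minus the simultaneous percentage; repeating both calculations for lags of 0, 1 and 2 days gives the comparison described in the text.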