2.3 Covariate parameterisations
2.3.5 Radial functions
First applied to geodesic approximation, radial basis functions (RBFs) have since been widely applied in a number of fields, such as neural networks, image processing and interpolation, kinetic modelling, and the solution of differential and integral equations. There is a vast literature on radial basis functions in different fields of statistics. While further details are beyond the scope of this work, we refer the reader to the papers by Powell (1987) and Girosi and Poggio (1990) for thorough reviews of the theory involved, and to the monograph by Buhmann (2003) for an overview of their application and adaptation to different fields.
Radial functions are a special class of functions whose response changes monotonically with distance from a central point $c$. This distinguishing feature unites many specific formulations, which differ in the choice of centre, distance scale and the precise shape of the radial function. The most general formula for any RBF in any dimension is
$$h(x) = f\left((x - c)^{\top} R^{-1} (x - c)\right),$$
where $f(\cdot)$ is the specific function used, $c$ is the centre and $R$ is the metric chosen. This metric is often Euclidean, such that $R = r^2 I$ for some scalar radius $r$ and identity matrix $I$. Common examples of function choices include Cauchy, multiquadric and Gaussian functions, the last of which in 1-D is
$$h(x) = \exp\left(-\frac{(x - c)^2}{r^2}\right), \tag{2.3.6}$$
for some scalar input $x \in (-\infty, \infty)$. Gaussian density functions can also be classified as RBFs, as they are just a version of Eq. 2.3.6 scaled so that the density integrates to 1. Gaussian RBFs have a localised impact: they show a significant response only in a neighbourhood of the centre, although theoretically they have global support.
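As a quick numerical illustration of Eq. 2.3.6 and of the scaling remark, the sketch below evaluates a 1-D Gaussian RBF and checks that dividing it by $r\sqrt{\pi}$ gives a normal density with mean $c$ and variance $r^2/2$. The function name `gaussian_rbf` and the values of `c` and `r` are illustrative assumptions, not quantities taken from this work.

```python
import numpy as np

def gaussian_rbf(x, c, r):
    """Gaussian radial basis function of Eq. 2.3.6: exp(-(x - c)^2 / r^2)."""
    x = np.asarray(x, dtype=float)
    return np.exp(-((x - c) ** 2) / r ** 2)

# Illustrative (assumed) centre and radius.
c, r = 1.0, 0.5
x = np.linspace(c - 10 * r, c + 10 * r, 20001)
h = gaussian_rbf(x, c, r)

# Normal density with mean c and variance r^2 / 2, written out explicitly.
sigma2 = r ** 2 / 2
density = np.exp(-((x - c) ** 2) / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)

# The same density obtained by rescaling the RBF.
scaled_rbf = h / (r * np.sqrt(np.pi))

# Trapezoidal approximation of the integral of the rescaled RBF.
dx = x[1] - x[0]
area = np.sum((scaled_rbf[:-1] + scaled_rbf[1:]) / 2) * dx

# Localised response: the RBF is already small three radii from the centre.
response_at_3r = gaussian_rbf(c + 3 * r, c, r)
```

Here `response_at_3r` equals $\exp(-9)$, which illustrates the localised impact of the function despite its global support.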
RBFs can be employed in any sort of linear or nonlinear regression model in the same way as the other basis function types considered in Sections 2.3.3 and 2.3.4. In particular, assume again that we have a one-dimensional predictor $X$; then we can model the variation of a covariate-dependent parameter $\theta(x)$ using a linear combination of RBFs,
$$\theta(x) = \sum_{j=1}^{p} \beta_j f_j(x),$$
where each $f_j$ is characterised by its own parameters $c_j$ and $r_j$, and $p$ is the total number of radial function components.
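The linear combination above can be sketched as a design matrix with one column per radial component. In the minimal example below we assume Gaussian components of the form in Eq. 2.3.6; the helper names `rbf_design_matrix` and `theta`, and all numerical values, are hypothetical choices made for illustration only.

```python
import numpy as np

def rbf_design_matrix(x, centres, radii):
    """Return the n x p matrix with (i, j) entry f_j(x_i), for Gaussian f_j."""
    x = np.asarray(x, dtype=float)[:, None]                # shape (n, 1)
    centres = np.asarray(centres, dtype=float)[None, :]    # shape (1, p)
    radii = np.asarray(radii, dtype=float)[None, :]        # shape (1, p)
    return np.exp(-((x - centres) ** 2) / radii ** 2)

def theta(x, beta, centres, radii):
    """Evaluate theta(x) = sum_j beta_j f_j(x) at each entry of x."""
    return rbf_design_matrix(x, centres, radii) @ np.asarray(beta, dtype=float)

# Assumed example with p = 3 components on a predictor in [0, 1].
centres = [0.0, 0.5, 1.0]
radii = [0.25, 0.25, 0.25]
beta = np.array([1.0, -2.0, 0.5])

x = np.linspace(0.0, 1.0, 50)
H = rbf_design_matrix(x, centres, radii)
theta_values = theta(x, beta, centres, radii)

# Because theta is linear in beta, noise-free values recover beta by least squares.
beta_hat, *_ = np.linalg.lstsq(H, theta_values, rcond=None)
```

Since $\theta(x)$ is linear in the coefficients $\beta_j$ once the centres and radii are fixed, the same matrix `H` could be used inside any likelihood in which $\theta$ appears.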
Chapter 3
A comparison of peaks over threshold methods
In Chapter 1, we considered the importance of accurate statistical models for extreme data in various applications, such as modelling the extreme conditions to which offshore structures are subject. In Chapter 2, we introduced the statistical theory of extreme value modelling and noted that the applications where it is used often produce non-homogeneous data, so that considering covariates becomes essential to proposing a realistic model. For example, one cause of structural damage to offshore sites is storm waves, with the most severe sea states usually being wind generated. Consequently, we may suspect that the height of the waves will change according to, for example, the season, the geographic location of the sampling site, or the direction of the wind. Constructing a model that includes such factors is necessary, and statistical tools to analyse and extrapolate from such a model become essential. Hence, in Section 2.1.3, we reviewed the standard approach to accounting for covariate dependence in extreme data.
In this chapter, we focus our attention on peaks over threshold (POT) methods, where we study the tail of the distribution by considering only observations that lie above an arbitrary value. In the stationary case, two models are available from the literature, namely the generalised Pareto distribution (GPD) and the non-homogeneous Poisson point process (NHPP) formulation. This chapter aims to compare them and to show that, although they are theoretically equivalent (Smith, 1989; Davison and Smith, 1990), in practice each method has its own advantages and limitations. Moving from one model to the other is straightforward for stationary processes. However, we are mainly interested in modelling and analysing non-stationary processes, for which model parameters are no longer directly transferable using the equations in Section 2.1.2. Furthermore, correctly capturing covariate effects is complicated, in the EVT framework, by the reduced amount of data available by construction. This may lead to added instability in the optimisation of the likelihoods used by maximum likelihood (ML) methods for model fitting. This is particularly true for the Poisson point process formulation, where additional issues arise compared to the equivalent GPD model, as shown in Section 3.1.2.
In the following sections, we systematically review the performance of model fitting and extrapolation for both the GPD and NHPP models in the non-stationary case, and highlight areas with potential for further development.
3.1 Threshold approaches
The peaks over threshold approach is suited to the types of application and datasets described in Chapter 1, which comprise more than just annual or monthly maxima. In particular, data may be available daily, hourly or even sub-hourly.
The approach was used, among others, in the seminal work by Smith (1989) on air pollution data, and was popularised soon after by Davison and Smith (1990). Assume that $Y_1, \ldots, Y_n$ are independent and identically distributed (i.i.d.) random variables with the same unknown distribution function $F_Y$ over some domain $\Omega$, where, say, $\Omega = \mathbb{R}$. It seems natural to define as extreme those observations that exceed a certain value, that is, the $Y_i$, $i = 1, \ldots, n$, above a chosen threshold $u \in \mathbb{R}$. Two equivalent models exist in the literature to analyse observations within this framework, namely:
1. The generalised Pareto distribution with distribution function, for y > u,
$$P[Y < y] = 1 - \phi_u \left[1 + \xi\,\frac{y - u}{\psi_u}\right]_+^{-1/\xi},$$
with scale $\psi_u > 0$, exceedance probability $\phi_u \in [0, 1]$, both conditional on the threshold $u$, and shape $\xi \in \mathbb{R}$;
2. The non-homogeneous Poisson point process with intensity on [0,1]×[u,∞),
$$\lambda(t, y) = \frac{1}{\sigma} \left[1 + \xi\,\frac{y - \mu}{\sigma}\right]_+^{-1 - 1/\xi},$$
with location $\mu \in \mathbb{R}$, scale $\sigma \in \mathbb{R}_+$, and shape $\xi \in \mathbb{R}$, using the standard notation $[\cdot]_+ := \max\{0, \cdot\}$.
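The two formulations listed above can be written down directly. The sketch below is a minimal implementation for $\xi \neq 0$ only; the function names `gpd_cdf` and `nhpp_intensity` and every parameter value are illustrative assumptions. As a sanity check, it integrates the intensity over $[u, \infty)$ numerically and compares the result with the closed form $[1 + \xi (u - \mu)/\sigma]_+^{-1/\xi}$, which is the expected number of exceedances of $u$ per unit of time.

```python
import numpy as np

def gpd_cdf(y, u, psi_u, xi, phi_u):
    """P[Y < y] for y > u under the GPD model (xi != 0 assumed)."""
    y = np.atleast_1d(np.asarray(y, dtype=float))
    base = np.maximum(0.0, 1.0 + xi * (y - u) / psi_u)   # the [.]_+ operator
    return 1.0 - phi_u * base ** (-1.0 / xi)

def nhpp_intensity(y, mu, sigma, xi):
    """NHPP intensity lambda(t, y); constant in t, zero where [.]_+ vanishes."""
    y = np.atleast_1d(np.asarray(y, dtype=float))
    base = np.maximum(0.0, 1.0 + xi * (y - mu) / sigma)
    out = np.zeros_like(base)
    pos = base > 0
    out[pos] = base[pos] ** (-1.0 - 1.0 / xi) / sigma
    return out

# Assumed illustrative parameters (negative shape, hence a finite upper endpoint).
mu, sigma, xi = 10.0, 2.0, -0.2
u, psi_u, phi_u = 12.0, 1.6, 0.05

# Integrate the intensity from u to the upper endpoint mu - sigma / xi = 20.
grid = np.linspace(u, mu - sigma / xi, 100001)
lam = nhpp_intensity(grid, mu, sigma, xi)
expected_count = np.sum((lam[:-1] + lam[1:]) / 2 * np.diff(grid))

# Closed form of the same integral.
closed_form = max(0.0, 1.0 + xi * (u - mu) / sigma) ** (-1.0 / xi)
```

At the threshold itself, `gpd_cdf(u, u, psi_u, xi, phi_u)` returns $1 - \phi_u$, as the formula requires.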
Although these models have already been introduced in Section 2.1.2, it is worth noting a few intrinsic characteristics and results for the two approaches, as both have advantages and drawbacks and neither prevails as an overall “better” model. In particular,
• The Poisson point process parameters are threshold invariant, provided that the threshold $u$ is far enough into the tail of the distribution. Any subset of the extreme observations obtained with a new choice of threshold $v > u$ will then follow the same distribution, i.e. it will be described by an NHPP with the same parameters $\mu$, $\sigma$, $\xi$;
• The Poisson point process parameters are strongly dependent, which makes parameter estimation harder to perform. Wadsworth and Tawn (2012) suggest that introducing an additional factor in the likelihood may help to reduce the correlation between them, as detailed in Section 3.1.1. In a similar manner, Sharkey and Tawn (2017) propose reparameterising the NHPP parameters in terms of a tuning parameter for a Bayesian implementation, and introduce a method for choosing this additional term so as to obtain near-orthogonality of the model parameters for stationary processes, or when a linear trend in one covariate is present in the location parameter only.
• For the GPD model, threshold invariance no longer holds, as $\psi_u$ and $\phi_u$ depend on the threshold. The choice of this threshold, as we mentioned earlier, is in itself an issue for both the NHPP and GPD models and may be subjective, which in turn affects model fitting and extrapolation.
• There exists a reformulation of the GPD scale parameter $\psi_u$ in terms of a lower threshold $u_0 < u$ for which the assumption of GPD-distributed exceedances still holds. This re-parameterisation is often used in threshold selection methods that test for stability of the parameters over a range of threshold candidates. It essentially uses Eq. 2.1.9 to re-parameterise the GPD scale parameter as $\psi^* = \psi_u - \xi u$, which is constant with respect to $u$. An alternative re-parameterisation follows the work by Cox and Reid (1987) and Chavez-Demoulin and Davison (2005) to obtain more computationally advantageous formulations of the GPD parameters. It consists of moving from the GPD parameters $(\psi_u, \xi)$ to the asymptotically independent pair $(\nu_u, \xi)$, where $\nu_u = \psi_u (1 + \xi)$.
• Although the threshold stability property often leads theoretical statisticians to prefer the NHPP approach, the GPD formulation is usually preferred by applied users, as it is more immediately interpretable.
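The re-parameterisations of the GPD scale mentioned in the list above can be checked numerically. The sketch below assumes that Eq. 2.1.9 takes the standard threshold stability form $\psi_u = \psi_{u_0} + \xi (u - u_0)$ for $u > u_0$, which is not restated in this section; the function name `gpd_survival` and all numerical values are likewise illustrative assumptions. It verifies that exceedances of a higher threshold remain GPD with the shifted scale, that $\psi^* = \psi_u - \xi u$ does not change with $u$, and that $\nu_u = \psi_u (1 + \xi)$ still does.

```python
import numpy as np

def gpd_survival(y, u, psi_u, xi):
    """P[Y > y | Y > u] under the GPD: [1 + xi (y - u) / psi_u]_+^(-1/xi), xi != 0."""
    y = np.asarray(y, dtype=float)
    base = np.maximum(0.0, 1.0 + xi * (y - u) / psi_u)
    return base ** (-1.0 / xi)

# Assumed illustrative values for the lower threshold, its scale, and the shape.
u0, psi_u0, xi = 2.0, 1.5, 0.1
thresholds = np.array([2.0, 2.5, 3.0, 4.0])

# Assumed form of Eq. 2.1.9: the scale grows linearly in the threshold.
psi_u = psi_u0 + xi * (thresholds - u0)

psi_star = psi_u - xi * thresholds   # threshold-invariant scale
nu_u = psi_u * (1.0 + xi)            # alternative scale, still depends on u

# Conditioning the model at u0 on exceeding u should match the model at u.
y = np.linspace(4.0, 10.0, 50)
max_abs_diff = 0.0
for u, psi in zip(thresholds, psi_u):
    conditional = gpd_survival(y, u0, psi_u0, xi) / gpd_survival(u, u0, psi_u0, xi)
    direct = gpd_survival(y, u, psi, xi)
    max_abs_diff = max(max_abs_diff, float(np.max(np.abs(conditional - direct))))
```

With these values every entry of `psi_star` equals $\psi_{u_0} - \xi u_0 = 1.3$, which is the stability that threshold selection diagnostics look for across candidate thresholds.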