Results and discussion - Classifying and exploiting structure in multivariate extremes

We begin this section by considering the results of the MCMC implementation. We run our MCMC chains for 20000 iterations, and discard the first 5000 iterations as burn-in to aid convergence. Examples of the chains produced are provided in Figure 6.3.1 for the scale and shape parameters $\psi_{10,6}$ and $\xi_{10,6}$. Estimates of these parameters were obtained using the posterior means of their respective MCMC chains. These plots demonstrate that good mixing has been achieved for this case; similar results were obtained across other stations and months.

[Figure 6.3.1: trace plots of Scale and Shape against Iterations.]
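The post-processing described above (discarding burn-in, then taking the posterior mean of what remains) can be sketched as follows. This is a minimal illustration, not the implementation used in the thesis: the chain here is a hypothetical AR(1) stand-in generated with NumPy, and the names `posterior_mean` and `scale_chain`, the target value 0.17 and the starting value 0.30 are all assumptions made for the example. Only the iteration counts come from the text.

```python
import numpy as np

N_ITER = 20000   # total MCMC iterations, as stated in the text
BURN_IN = 5000   # iterations discarded as burn-in, as stated in the text


def posterior_mean(chain, burn_in=BURN_IN):
    """Mean of a 1-D chain after dropping the first `burn_in` draws."""
    chain = np.asarray(chain, dtype=float)
    if chain.ndim != 1 or chain.size <= burn_in:
        raise ValueError("chain must be 1-D and longer than the burn-in")
    return float(chain[burn_in:].mean())


# Hypothetical stand-in for a sampler's output: an autocorrelated AR(1)
# series that starts far from its long-run level of 0.17 (assumed value).
rng = np.random.default_rng(1)
scale_chain = np.empty(N_ITER)
scale_chain[0] = 0.30  # deliberately poor starting value
for t in range(1, N_ITER):
    scale_chain[t] = 0.17 + 0.9 * (scale_chain[t - 1] - 0.17) + rng.normal(0.0, 0.005)

retained = scale_chain[BURN_IN:].size      # draws kept for inference
scale_estimate = posterior_mean(scale_chain)
print(retained, round(scale_estimate, 3))
```

With 20000 iterations and a burn-in of 5000, 15000 draws are retained, which matches the iteration axis of the trace plots.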

We now explore the monthly variation in the estimated model parameters by focussing on results at four nearby stations. The locations of these stations are shown in the top left panel of Figure 6.3.2. The data set contains over 8000 observations for stations 2 and 5, and no observations for stations 7 and 10. The top right and bottom left panels of Figure 6.3.2 show our estimates of the scale and shape parameters, respectively, at these four locations. These plots demonstrate the seasonality in the parameter estimates, with higher values of both the scale and shape generally corresponding to summer and autumn months. This effect is maintained in the predicted 0.998 quantiles, shown in the bottom right panel of Figure 6.3.2, which are typically highest between June and October. A similar trend was observed at other sites, particularly those with limited data where estimates are more heavily influenced by information from other locations, due to the spatial smoothing imposed by the model.

[Figure 6.3.2 panels: Station Locations (Longitude, Latitude); Scale Parameter (σ by Month); Shape Parameter (ξ by Month); Predicted Quantiles (Rainfall (mm) by Month).]

Figure 6.3.2: Location of stations 2 (purple), 5 (pink), 7 (orange) and 10 (blue), as well as estimates of the corresponding scale and shape parameters and predicted 0.998 quantiles.

We now consider our estimates in the context of the competition, which used the quantile loss function of Koenker (2005). In particular, as in the challenge, we consider the percentage improvement provided by our method over benchmark predictions. The competition was split into two challenges: Challenge 1 involved only sites where observations were available, with the benchmark quantile estimates being given by the monthly maxima at each station; Challenge 2 included predictions for all sites, with the benchmark for those sites with no data being taken as the average of the quantiles predicted in Challenge 1 for each month. Our method gave a 59.9% improvement over the benchmark for Challenge 1, and a 57.7% improvement for Challenge 2. Table 6.3.1 shows the performance of our approach using this same metric, but with the results separated by month.
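For readers unfamiliar with the metric, the following is a minimal sketch of the standard quantile (check) loss, $\rho_\tau(u) = u\,(\tau - \mathbb{1}\{u < 0\})$, together with one natural definition of percentage improvement over a benchmark. Two points are assumptions rather than details taken from the text: that the loss is averaged over observations, and that the improvement is computed as $100\,(1 - \text{loss}_{\text{method}}/\text{loss}_{\text{benchmark}})$. The function names and the toy data are hypothetical.

```python
import numpy as np


def quantile_loss(y, q_hat, tau):
    """Average check loss rho_tau(y - q_hat) over all observations."""
    u = np.asarray(y, dtype=float) - np.asarray(q_hat, dtype=float)
    # (u < 0) is a boolean array; it acts as the indicator 1{u < 0}.
    return float(np.mean(u * (tau - (u < 0))))


def percentage_improvement(loss_method, loss_benchmark):
    """Assumed definition: relative reduction in loss, as a percentage."""
    return 100.0 * (1.0 - loss_method / loss_benchmark)


# Toy data (hypothetical): observations and two competing predictions
# of the 0.998 quantile.
tau = 0.998
y = np.array([0.0, 0.2, 0.1, 1.5, 0.0])
q_method = np.full_like(y, 1.6)      # hypothetical model prediction
q_benchmark = np.full_like(y, 3.0)   # hypothetical benchmark prediction

loss_m = quantile_loss(y, q_method, tau)
loss_b = quantile_loss(y, q_benchmark, tau)
print(percentage_improvement(loss_m, loss_b))
```

At a level as high as 0.998, the loss penalises under-prediction far more heavily than over-prediction, so a prediction that is slightly too high costs much less than one that is slightly too low.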

             Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
Challenge 1  57.7  71.1  60.0  65.0  43.7  62.8  65.9  77.0  38.7  38.4  52.2  33.4
Challenge 2  54.4  69.3  57.4  61.9  43.1  60.7  64.2  75.4  37.9  36.4  49.3  31.3

Table 6.3.1: Percentage improvement over the benchmark for Challenges 1 and 2 across each month.

As is to be expected, our method performed better across all months in Challenge 1, where only predictions for sites with observations were considered. Looking at these results separately for each month allows us to identify possible areas for improvement. In particular, the scores for September, October and December are lower than for other months, suggesting that the method could be improved by focussing on the modelling of autumn and winter months.
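The two observations in this paragraph can be checked directly against the values in Table 6.3.1. The short sketch below copies the table entries as given; the variable names are illustrative only.

```python
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Percentage improvement over the benchmark, copied from Table 6.3.1.
challenge_1 = [57.7, 71.1, 60.0, 65.0, 43.7, 62.8, 65.9, 77.0, 38.7, 38.4, 52.2, 33.4]
challenge_2 = [54.4, 69.3, 57.4, 61.9, 43.1, 60.7, 64.2, 75.4, 37.9, 36.4, 49.3, 31.3]

# Is the Challenge 1 score higher than the Challenge 2 score in every month?
c1_better_everywhere = all(a > b for a, b in zip(challenge_1, challenge_2))


def lowest_three(scores):
    """Months with the three smallest scores, smallest first."""
    ranked = sorted(zip(scores, months))
    return [month for _, month in ranked[:3]]


print(c1_better_everywhere)        # True
print(lowest_three(challenge_1))   # ['Dec', 'Oct', 'Sep']
print(lowest_three(challenge_2))   # ['Dec', 'Oct', 'Sep']
```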

Discussion

7.1 Thesis summary

The aim of this thesis was to present novel theoretical and methodological results for multivariate extremes, with particular emphasis on extremal dependence between multivariate random variables. In this chapter, we first summarize the contributions of this work, before presenting some ideas for further work in Section 7.2.

In Chapter 3, we proposed approaches for identifying subsets of variables that can take their largest values simultaneously, while the others are of smaller order. This involved the introduction of a novel set of indices, based on a regular variation assumption, that describe the extremal dependence structure between variables. We proposed two inferential methods to estimate these parameters, as well as the proportion of extremal mass associated with our $2^d - 1$ sub-cones, each chosen to represent a different subset of variables taking its largest values simultaneously. This methodology could be applied to aid model selection, or in the construction of mixture models that exhibit the required extremal dependence structures. We demonstrated these methods through a simulation study and an application to river flow data.

In Chapter 4, we discussed extensions of the results of Chapter 3, by considering variables in terms of their radial-angular components. The first of these radial-angular methods approximates the various faces of the angular simplex using a partitioning approach similar to the methods in Chapter 3. The remaining proposed methods use a soft-thresholding technique that takes into account the distance of points from different faces of the simplex; we presented a version of this for our hidden regular variation assumption, as well as a weighted variant of the method proposed by Goix et al. (2016). All of these methods were compared in a simulation study, where different methods were shown to perform best in different cases.

In Chapter 5, we carried out a theoretical investigation into the extremal dependence properties of vine copulas. In particular, we studied the coefficient of tail dependence $\eta_C$ (Ledford and Tawn, 1996) for certain vine copula examples. This involved applying the geometric approach of Nolde (2014) for calculating $\eta_C$ when the joint density is known, and extending the approach for cases where only higher order joint densities can be obtained analytically. We focussed on trivariate vine copulas constructed from extreme value and inverted extreme value components, and obtained results for higher dimensional D-vines and C-vines formed from inverted extreme value pair copulas. We demonstrated our results through a series of (inverted) logistic examples.

Finally, in Chapter 6 we presented the results of a team competition for the EVA 2017 conference. The challenge involved predicting extreme precipitation quantiles for several stations in the Netherlands. Our team proposed a Bayesian approach with a hierarchical structure that modelled spatio-temporal dependence through the model parameters; our estimation procedure involved MCMC techniques and spatial interpolation. The approach performed well in terms of the quantile loss metric proposed by the challenge organizer.
