
Statistical Models

Chapter overview

In this chapter I develop and evaluate a series of novel statistical models for North Atlantic storm data obtained by objective feature tracking on the ERA-40 reanalysis, bringing together physical understanding of the processes of storm formation and statistical methods of modelling point processes. The first half of the chapter describes the methods used, and the second half presents the results.

The models are compared using an information theoretic approach but the selection of a “correct” model is avoided; instead, the exercise is used to discuss the relative importance of different physical processes to the behaviour of North Atlantic storm events.

The long-term behaviour of a state-of-the-art climate simulation (ECHAM5) is compared with the short reanalysis dataset, demonstrating the need for longer observation periods (especially when considering the behaviour of the more extreme storms).

3.1 Introduction
3.1.1 Motivation
3.1.2 Starting points
3.1.3 Previous work

Methods

3.2 Definitions and overview of models
3.2.1 Poisson process with seasonal cycle
3.2.2 Cox process
3.2.3 Hawkes process

3.3 Simulation
3.3.1 Varying rate point process simulation
3.3.2 Thinning algorithm

3.4 Parameter estimation
3.4.1 Graphical parameter estimation
3.4.2 Maximum likelihood estimation

3.5 Clustering analysis

3.6 Uncertainty analysis
3.6.1 Uncertainty due to sample size
3.6.2 Uncertainty due to numerics
3.6.3 Uncertainty due to functional form

3.7 Model comparison
3.7.1 Information theoretic approaches
3.7.2 Choice of comparison method

Application to reanalysis data: results and discussion

3.8 Data: ERA-40 and TRACK

3.9 Seasonal cycle only
3.9.1 Parameter estimation
3.9.2 Implications for clustering statistics

3.10 Cox process
3.10.1 Parameter estimation: the NAO
3.10.2 Parameter estimation: Cox process
3.10.3 Implications for clustering statistics

3.11 Hawkes process
3.11.1 Standard Hawkes process
3.11.2 Seasonal Hawkes process
3.11.3 Symmetric seasonal Hawkes process

3.12 Model comparison

3.13 Comparison with long climate simulations
3.13.1 Model selection
3.13.2 Storm statistics

3.14 Conclusions
3.14.1 Interpretation of model results
3.14.2 Lack of NAO dependence
3.14.3 Use for prediction and other purposes

3.1 Introduction

3.1.1 Motivation

The use of state-of-the-art dynamical climate simulators is limited to a small number of modelling centres which have powerful computing facilities and resources to devote to the task. Almost by definition, any model which is accessible to a desktop computer is no longer “state-of-the-art” and will have been superseded by a more “advanced” simulator. Similarly, any model which can be run a large number of times will be less detailed than one which utilises the maximum computing power available. However, there is a need for simulators to be accessible beyond the main modelling centres and capable of running many times to generate uncertainty estimates.

There are two strategies for achieving such a goal: the first is to begin from the top down with the climate simulators themselves, using older or lower resolution versions, and the second is to begin from the bottom up, creating new models which represent phenomena in a more statistical manner, omitting process details but capturing behaviours and correlations. The top-down strategy is used by projects such as climateprediction.net [5], which takes a second-tier climate model (HadCM3 and variants) and runs it many thousands of times to produce, for example, climate sensitivity analyses [240]. The bottom-up process is more commonly used to model events for which we have many observations and/or less detailed understanding of the physical processes involved, for example extreme rainfall events or insurance losses [270]. In this chapter I describe an application of the bottom-up strategy to North Atlantic storm modelling, and evaluate it with reference to some results from a longer simulation.

There are of course many limitations of such a model, not least of which is the objection that a few degrees of freedom cannot possibly represent the variety of physical phenomena which contribute to the existence and variability of the North Atlantic storm track. However, the object of creating any model, from the simple to the complex, is to understand which processes are important and which (if any†) can be safely ignored. Both the bottom-up and top-down strategies of model creation (and arguably also the state-of-the-art climate simulators) are attempting to achieve a minimal representation of the process of interest; the success of each can be judged against observations, and the strategies can be compared with one another.

3.1.2 Starting points

Let us begin by considering storms as events which occur “randomly”, but according to some underlying distribution which may change over time.

The simplest example of a point process is the Poisson point process, where events occur “randomly” in such a way that the rate (expected number of events observed in unit time) is constant,

$$\lambda(t) = \mu_0, \tag{3.1}$$

and the inter-event waiting times $T$ follow an exponential distribution with mean and standard deviation $1/\mu_0$:

$$T(t) \sim \mu_0 e^{-\mu_0 t}. \tag{3.2}$$
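As a quick numerical check of the constant-rate case, the following minimal sketch simulates a Poisson process by accumulating exponential waiting times and confirms that both the mean and the standard deviation of the gaps are close to 1/μ0. The function names, the rate value and the record length are illustrative assumptions, not values used elsewhere in this chapter.

```python
import random

def simulate_poisson_times(rate, t_end, seed=0):
    """Event times of a constant-rate Poisson process on [0, t_end)."""
    rng = random.Random(seed)
    times = []
    t = rng.expovariate(rate)  # first waiting time, mean 1/rate
    while t < t_end:
        times.append(t)
        t += rng.expovariate(rate)  # independent exponential gaps
    return times

def waiting_times(times):
    """Gaps between consecutive events."""
    return [b - a for a, b in zip(times, times[1:])]

mu0 = 2.0  # illustrative rate (events per unit time), an assumption
events = simulate_poisson_times(mu0, t_end=5000.0)
gaps = waiting_times(events)

mean_gap = sum(gaps) / len(gaps)
var_gap = sum((g - mean_gap) ** 2 for g in gaps) / (len(gaps) - 1)
std_gap = var_gap ** 0.5

print(f"empirical rate {len(events) / 5000.0:.3f}, "
      f"mean gap {mean_gap:.3f}, std gap {std_gap:.3f}, 1/mu0 {1 / mu0:.3f}")
```

For an exponential distribution the mean and standard deviation coincide, so a sample ratio of standard deviation to mean that differs markedly from one is already a hint that the constant-rate description is inadequate.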

If storm events were generated by a constant-rate Poisson process then there would be no time-dependent correlation of events. We can test this specifically by sorting the inter-event waiting times (see Section 3.5 for a description of how these are obtained) into short, medium and long gaps, and then considering the distribution of the following gaps. Figure 3.1 shows that short gaps tend to be followed by short gaps (clusters), and long gaps tend to be followed by long gaps (lulls between clusters). This chapter will make extensive use of such histograms as a visualisation of the distribution of waiting times and their uncertainty for different statistical models.
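A coarse version of this test can be sketched as follows. The sketch classifies each gap as short, medium or long and tabulates the class of the gap that follows it. The tercile boundaries and the function names are assumptions made for illustration; the text does not specify how the three classes are delimited. For waiting times from a constant-rate Poisson process every entry should be close to 1/3, whereas clustered data would show excess weight on the diagonal.

```python
import random

CLASSES = ("short", "medium", "long")

def classify_gaps(gaps):
    """Label each gap using tercile boundaries (an assumed choice)."""
    ordered = sorted(gaps)
    n = len(ordered)
    lower, upper = ordered[n // 3], ordered[(2 * n) // 3]
    return ["short" if g < lower else "long" if g >= upper else "medium"
            for g in gaps]

def following_gap_table(gaps):
    """Row-normalised table: P(class of next gap | class of current gap)."""
    labels = classify_gaps(gaps)
    counts = {a: {b: 0 for b in CLASSES} for a in CLASSES}
    for current, following in zip(labels, labels[1:]):
        counts[current][following] += 1
    table = {}
    for a in CLASSES:
        total = sum(counts[a].values())
        table[a] = {b: counts[a][b] / total for b in CLASSES}
    return table

# Reference case: independent exponential gaps (constant-rate Poisson process).
rng = random.Random(1)
poisson_gaps = [rng.expovariate(1.0) for _ in range(30000)]
table = following_gap_table(poisson_gaps)

for a in CLASSES:
    print(a, {b: round(p, 3) for b, p in table[a].items()})
```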

This is a very simple test, which makes no assumptions about the form of the data and demonstrates that the Poisson representation is insufficient since greater structure is visible. Generalisations of (3.1) can take many forms, usually involving some dependence of the rate λ either on the history of previous events or on external factors such as time, spatial position, or some other process†.

In this chapter I will consider three main behaviours of interest and the corresponding model formulations, as follows:

1. seasonality: a simple dependence on the time of year;

2. Cox (“doubly stochastic”) behaviour: dependence on some background process which may or may not be observable;

3. Hawkes (“self-exciting”) behaviour: self-excitation or self-inhibition (when one event happens, it changes the instantaneous rate λ such that the next event is likely to happen sooner/later than it would otherwise have done).
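To make the three behaviours listed above concrete, the sketch below writes one possible rate function for each. The functional forms (a cosine seasonal cycle, a log-linear dependence on a background process, and an exponentially decaying response to past events) are common textbook choices adopted here as assumptions; they are not necessarily the forms defined in Section 3.2, and all function and parameter names are hypothetical.

```python
import math

def seasonal_rate(t, mu0, amplitude, phase, period=365.25):
    """Rate with a simple annual cycle (assumed cosine form, t in days)."""
    return mu0 * (1.0 + amplitude * math.cos(2.0 * math.pi * (t - phase) / period))

def cox_rate(t, mu0, beta, background):
    """Rate modulated by a background process x(t) (assumed log-linear link)."""
    return mu0 * math.exp(beta * background(t))

def hawkes_rate(t, mu0, alpha, tau, history):
    """Rate that responds to past events (assumed exponential kernel).

    alpha > 0 gives self-excitation, alpha < 0 self-inhibition; the rate is
    clipped at zero because a rate cannot be negative.
    """
    response = sum(alpha * math.exp(-(t - ti) / tau) for ti in history if ti < t)
    return max(0.0, mu0 + response)

# Illustrative values only.
past_events = [0.0]
excited = hawkes_rate(1.0, mu0=0.5, alpha=0.3, tau=2.0, history=past_events)
inhibited = hawkes_rate(1.0, mu0=0.5, alpha=-0.3, tau=2.0, history=past_events)
print(f"excited {excited:.3f}, inhibited {inhibited:.3f}")
```

With no past events the Hawkes form reduces to the constant rate of (3.1), and setting the amplitude or the coupling `beta` to zero does the same for the other two forms, so each can be read as a one-step generalisation of the constant-rate Poisson process.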

These are reasonable first guesses because they correspond to different aspects of the physical understanding of how storms are generated.

The seasonal cycle is an obvious influence: at the European end of the North Atlantic storm track it is commonly observed that there are more strong storms in winter than in summer [50,48], which is due to the prevalence of stronger westerlies and a stronger temperature gradient in the North Atlantic in winter causing stronger depressions to develop (see Literature Review, Section 2.8).

In other literature, λ is often referred to as the intensity of the process, but in the current context I will always use rate to avoid confusion with the storm intensity. λ has units 1/[T].
