Chapter 3 Stochastic Climate Modelling
3.4 Empirical Methods to Model Reduction
The MTV strategy described above relies upon the approximation of nonlinear terms by a stochastic process. This inevitably introduces some unknown parameters. In practice these are estimated from observations of very long runs of the full model. Generally there are one or two unknown parameters for each degree of freedom in the system. Such estimation would not be possible when constructing a model from observations of the real atmosphere, where only observations of the variables of interest may be available; in that case an empirically derived model may work equally well. In this section we discuss some data driven methods for producing low dimensional models of the atmosphere and ocean.
Penland [1996] used the centred Ornstein-Uhlenbeck (OU) process
$$dx = Cx\,dt + \Sigma\,dB$$
to model sea surface temperature anomalies and test their potential for predicting the El Niño Southern Oscillation (ENSO). ENSO is a basin wide warming phenomenon in the South Pacific Ocean which occurs quasi-periodically with an approximate period of 18 months. Parameters of the OU process are estimated by taking moments of the Fokker-Planck equation. One computes an estimate of the so-called Green's function matrix as
$$G(\tau_0) = \exp(B\tau_0) = \langle x(t+\tau_0)\,x(t)^T\rangle\,\langle x(t)\,x(t)^T\rangle^{-1},$$
where $B$ here denotes the drift matrix of the OU process (written $C$ above), not the Brownian motion.
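The estimator above can be sketched numerically. The following is a minimal illustration, not Penland's implementation: it simulates a hypothetical two dimensional OU process with an Euler-Maruyama scheme, estimates the Green's function from lagged covariances, and recovers the drift as the matrix logarithm of $G(\tau_0)$ divided by $\tau_0$. The function names, the test matrix `C_true`, the noise amplitude and the step sizes are all assumptions made for the example.

```python
import numpy as np

def simulate_ou(C, Sigma, dt, n_steps, rng):
    """Euler-Maruyama simulation of dx = C x dt + Sigma dB (rows are times)."""
    d = C.shape[0]
    x = np.zeros((n_steps, d))
    dB = rng.standard_normal((n_steps - 1, d)) * np.sqrt(dt)
    for k in range(n_steps - 1):
        x[k + 1] = x[k] + (C @ x[k]) * dt + Sigma @ dB[k]
    return x

def estimate_green(x, lag):
    """G = <x(t+lag) x(t)^T> <x(t) x(t)^T>^{-1} from a centred time series."""
    x = x - x.mean(axis=0)
    x0, x1 = x[:-lag], x[lag:]
    c_lag = x1.T @ x0 / len(x0)
    c_0 = x0.T @ x0 / len(x0)
    return c_lag @ np.linalg.inv(c_0)

def drift_from_green(G, tau0):
    """Drift matrix log(G)/tau0 via eigendecomposition (assumes G diagonalisable)."""
    lam, V = np.linalg.eig(G)
    log_G = V @ np.diag(np.log(lam.astype(complex))) @ np.linalg.inv(V)
    return log_G.real / tau0

rng = np.random.default_rng(0)
C_true = np.array([[-0.5, 1.0], [-1.0, -0.5]])   # hypothetical stable drift
Sigma = 0.3 * np.eye(2)                          # hypothetical noise amplitude
dt, lag = 0.01, 50                               # tau0 = lag * dt = 0.5
x = simulate_ou(C_true, Sigma, dt, 200_000, rng)
G_hat = estimate_green(x, lag)
C_hat = drift_from_green(G_hat, lag * dt)
print(np.round(C_hat, 2))
```

The recovered matrix carries both sampling error and a small bias from the time discretisation, so it should only be expected to match `C_true` approximately.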
The eigenvectors of the matrix $G(\tau_0)$ are known as Principal Oscillation Patterns (POPs) or Empirical Normal Modes and are discussed earlier in the chapter. The Linear Inverse Model (LIM) employed here is closely related to POP analysis.
The author uses the Green's function matrix to determine the optimal initial structure for the system to evolve to the most probable prediction. This is determined to be the leading eigenvector of $G^T G(\tau_0)$.
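As a small illustration of the leading eigenvector of $G^T G$, the sketch below computes it for a hypothetical non-normal $2\times 2$ Green's function. It assumes the Euclidean norm as the measure of amplitude, which need not be the norm used in the cited work; the matrix `G` and the function name are invented for the example.

```python
import numpy as np

def optimal_initial_structure(G):
    """Unit vector phi maximising |G phi|^2 / |phi|^2, and the maximal ratio."""
    evals, evecs = np.linalg.eigh(G.T @ G)   # symmetric matrix, ascending eigenvalues
    return evecs[:, -1], evals[-1]

# Hypothetical Green's function: both eigenvalues lie inside the unit circle,
# yet the off-diagonal coupling allows transient growth.
G = np.array([[0.8, 1.5],
              [0.0, 0.6]])
phi, growth = optimal_initial_structure(G)
print(phi, growth)
```

Here every normal mode of `G` decays, but the leading eigenvalue of $G^T G$ exceeds one, so the corresponding initial pattern grows over the interval $\tau_0$ before decaying.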
The OU process is fitted to monthly mean data taken from a 4°×10° grid, with parameters estimated using $\tau_0 = 4$ months. Subsequent estimates of $B$ are found to be insensitive to $\tau_0$, which supports the choice of a linear model. Penland [1996] finds that the decay time scale of the estimated POPs is shorter than that of the ENSO oscillation. This may suggest that there is some interaction between modes or that this linear model is inappropriate. The author assesses predictability for lead times of 3, 6 and 9 months using the root mean square (RMS) error. The forecast at a 3 month lead time has some predictive skill but performs poorly during the strong warm phase of ENSO. It is found that this forecasting method can capture the ENSO pattern and persistence but not the magnitude. This could again indicate a problem with using a linear model.
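The forecasting step can be sketched as follows: estimate $G$ at a training lag $\tau_0$, propagate it to other lead times through $G(\tau) = \exp(B\tau)$, and compare the RMS error with that of a persistence forecast on held-out data. This is a hedged toy example on synthetic data from a hypothetical linear map `A` with unit time steps standing in for months; it does not reproduce the sea surface temperature analysis described above.

```python
import numpy as np

def lagged_green(x, lag):
    """Estimate G(lag) = <x(t+lag) x(t)^T> <x(t) x(t)^T>^{-1} (rows are times)."""
    x0, x1 = x[:-lag], x[lag:]
    return (x1.T @ x0) @ np.linalg.inv(x0.T @ x0)

def green_at(G0, tau0, tau):
    """G(tau) = exp(B tau) with B = log(G0)/tau0, assuming G0 is diagonalisable."""
    lam, V = np.linalg.eig(G0)
    powered = np.diag(lam.astype(complex) ** (tau / tau0))
    return (V @ powered @ np.linalg.inv(V)).real

def rmse(pred, truth):
    return float(np.sqrt(np.mean((pred - truth) ** 2)))

# Synthetic "monthly" data from a hypothetical damped rotation with white noise.
rng = np.random.default_rng(1)
A = np.array([[0.9, 0.2], [-0.2, 0.9]])
n = 6000
x = np.zeros((n, 2))
for k in range(n - 1):
    x[k + 1] = A @ x[k] + rng.standard_normal(2)

train, test = x[:3000], x[3000:]
mean = train.mean(axis=0)
train, test = train - mean, test - mean

G4 = lagged_green(train, 4)              # training lag tau0 = 4 steps
skill = {}
for lead in (3, 6, 9):
    G = green_at(G4, 4, lead)
    truth = test[lead:]
    lim_error = rmse(test[:-lead] @ G.T, truth)
    persistence_error = rmse(test[:-lead], truth)
    skill[lead] = (lim_error, persistence_error)
print(skill)
```

Because the synthetic data really are generated by a linear system, the linear forecast is expected to beat persistence at every lead time; with real data, a failure to do so is one of the diagnostics of nonlinearity mentioned in the text.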
Further work has been done on fitting LIMs to ENSO. Johnson et al. [2000] find that their model of sea surface temperature EOFs is improved, as measured by RMS error, by including the first two EOFs of subsurface heat content anomalies. Penland and Matrosova [1998] apply LIM to Atlantic sea surface temperature anomalies to determine whether any predictive skill is gained by using global sea surface temperatures as predictors. They confirm that this is the case. Applications to atmospheric data have mostly focussed upon the related problem of determining POPs from data. POPs are derived from the assumption of a linear model and represent the normal modes of the dynamics. They differ from EOFs in that they are not optimised for explained variance and they do not form a set of orthogonal patterns. They are dynamical modes of the system, not standing patterns like EOFs. For example, Xu and von Storch [1990] determine POPs for sea level pressure between 15°S and 40°S in order to describe the development of the Southern Oscillation. They discovered that the 30-60 day oscillation may be predicted by the POP forecast scheme for several days, with more skill than persistence or an Auto-Regressive Moving Average (ARMA) model. von Storch and Baumhefner [1991] extend this work to predictions of the equatorial velocity field and examine the accuracy using the anomaly correlation skill score.
A LIM is applied to Northern Hemisphere wintertime low frequency variability by Winkler et al. [2001]. 30 EOFs, capturing 90% of the variability, were computed for combined 250 hPa and 750 hPa Northern Hemisphere (NH) streamfunction anomalies, together with 7 EOFs, capturing 70% of the variability, for tropical diabatic heating from 30°S to 30°N. The LIM was then formed from this 37 component vector using the same methods as Penland [1996], discussed above. The measure of predictive skill used is the local anomaly correlation in 250 hPa streamfunction at a lead time of 14 days. By this measure the LIM outperforms forecasts based on climatology, persistence, a barotropic numerical model and a baroclinic model. The LIM is competitive with the then current medium range prediction model of NCEP, which has $O(10^6)$ variables. The authors attribute the skill of the LIM to its ability to approximate some of the nonlinear effects that do not appear in models constructed by linearising the full system. They also note the importance of including the tropical diabatic heating as a dynamic variable rather than an external forcing, which gives a marked improvement on the work of Penland and Ghil [1993]. They also apply a LIM to extratropical variability with tropical heating as a dynamical variable and report predictive skill only modestly better than the persistence prediction, although this poor result may have more to do with their attempt to build a model for all seasons. Winkler et al. [2001] conclude that the dynamics of extratropical variability are essentially linear and stable if sufficient variables are included in the model, although the LIM still fails to capture the full amount of wintertime variability.
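The construction of a reduced state vector from two sets of truncated EOFs, as described above, can be sketched with a singular value decomposition. The example below is illustrative only: the random arrays `psi` and `heat` are hypothetical stand-ins for streamfunction and heating anomaly fields, and the function name and the rule of keeping the smallest number of EOFs that reaches a target variance fraction are assumptions.

```python
import numpy as np

def eof_truncate(data, frac):
    """Leading EOFs (rows), their principal components, and the variance fraction kept."""
    anomalies = data - data.mean(axis=0)
    _, s, Vt = np.linalg.svd(anomalies, full_matrices=False)
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    # smallest k whose cumulative explained variance reaches frac
    k = min(int(np.searchsorted(explained, frac)) + 1, len(s))
    return Vt[:k], anomalies @ Vt[:k].T, float(explained[k - 1])

rng = np.random.default_rng(3)
# Hypothetical anomaly fields: rows are times, columns are grid points.
psi = rng.standard_normal((500, 40)) * np.linspace(3.0, 0.1, 40)
heat = rng.standard_normal((500, 15)) * np.linspace(2.0, 0.1, 15)

eofs_psi, pcs_psi, frac_psi = eof_truncate(psi, 0.90)
eofs_heat, pcs_heat, frac_heat = eof_truncate(heat, 0.70)

# Combined state vector on which a LIM could then be fitted.
state = np.hstack([pcs_psi, pcs_heat])
print(state.shape, round(frac_psi, 3), round(frac_heat, 3))
```

Treating the heating principal components as part of the state, rather than as an external forcing, simply corresponds to stacking them into the same vector before the linear model is estimated.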
In many geophysical systems linear dynamics with white noise forcing are not sufficient. Kravtsov et al. [2005] suggest a data driven approach to constructing a nonlinear stochastic model. In particular they consider quadratic models such that the inference is still linear in the parameters. They estimate the parameters from model data using the least squares procedure where the dependent variables are the time derivatives. Their novel suggestion is to account for the autocorrelation in the residuals by adding extra unobserved levels. Each extra level is a linear equation for the residual. In this way more levels are added until the residuals on the final level are uncorrelated in time. Their model equations are
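$$
\begin{aligned}
dx_i &= \left(x^T A_i x + b_i^{(0)} x + c_i^{(0)}\right)dt + r_i^{(0)}\,dt,\\
dr_i^{(0)} &= b_i^{(1)}\,[x, r^{(0)}]\,dt + r_i^{(1)}\,dt,\\
dr_i^{(1)} &= b_i^{(2)}\,[x, r^{(0)}, r^{(1)}]\,dt + r_i^{(2)}\,dt,\\
&\;\;\vdots\\
dr_i^{(L)} &= b_i^{(L+1)}\,[x, r^{(0)}, r^{(1)}, \ldots, r^{(L)}]\,dt + dr_i^{(L+1)}.
\end{aligned}
$$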
Only the first level has nonlinear terms in the climate variable $x$; the others are linear equations for the residuals. They iteratively add more levels until the lag-1 autocorrelation of the residual is zero. The structure is similar to a multivariate autoregressive moving average model except that nonlinear terms are included.
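The level-by-level procedure can be sketched as a sequence of least squares fits. The code below is a simplified illustration under stated assumptions, not the implementation of Kravtsov et al. [2005]: time derivatives are approximated by finite differences, a level is accepted once the largest lag-1 autocorrelation of its residual falls below a hypothetical tolerance `tol`, and the test data come from an invented scalar system driven by red noise.

```python
import numpy as np

def quadratic_features(x):
    """Columns: all products x_j x_k (j <= k), then x itself, then a constant."""
    n, d = x.shape
    j, k = np.triu_indices(d)
    return np.hstack([x[:, j] * x[:, k], x, np.ones((n, 1))])

def lag1_autocorr(r):
    r = r - r.mean(axis=0)
    return np.sum(r[1:] * r[:-1], axis=0) / np.sum(r * r, axis=0)

def fit_multilevel(x, dt, max_extra_levels=5, tol=0.05):
    """Quadratic main level, then linear levels for the residuals until they are white."""
    target = np.diff(x, axis=0) / dt          # finite-difference time derivative
    preds = quadratic_features(x[:-1])
    history = [x]                             # x, r0, r1, ... used as predictors
    coefs = []
    for _ in range(max_extra_levels + 1):
        coef, *_ = np.linalg.lstsq(preds, target, rcond=None)
        resid = target - preds @ coef
        coefs.append(coef)
        if np.max(np.abs(lag1_autocorr(resid))) < tol:
            break
        history.append(resid)
        target = np.diff(resid, axis=0) / dt  # next level models d(resid)/dt
        m = len(target)
        preds = np.hstack([h[:m] for h in history])
    return coefs, resid

# Hypothetical test system: dx = (-x + r) dt with r an OU (red noise) process.
rng = np.random.default_rng(2)
n, dt, gamma, sigma = 20_000, 0.01, 2.0, 1.0
x = np.zeros((n, 1))
r = 0.0
for k in range(n - 1):
    x[k + 1] = x[k] + (-x[k] + r) * dt
    r = r + (-gamma * r) * dt + sigma * np.sqrt(dt) * rng.standard_normal()

coefs, resid = fit_multilevel(x, dt)
print(len(coefs), lag1_autocorr(resid))
```

In this toy case the main level leaves a strongly autocorrelated residual, and one additional linear level is expected to reduce it to approximately white noise, which mirrors the stopping rule described in the text.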
They demonstrate this method by estimating parameters for the three dimensional Lorenz model, which is a deterministic chaotic system, and also for stochastic cubic models. They report that their method is able to reproduce the parameters for the nonlinear terms but that there is dependence upon the data sampling strategy. In particular the estimated errors are large for infrequent observations. One of the main themes of this thesis is to develop an inference method that works with infrequent observations. Kravtsov et al. [2005] also discuss the problem of having such a large number of parameters that the problem is ill-conditioned. In their application to a semi-realistic atmospheric model they have over 3,000 parameters to estimate for a 15 dimensional system. They use a regularisation procedure which makes the inference well posed: specifically the methods of principal component regression and partial least squares. Motivated by this problem, in this thesis we use a Bayesian approach in which one places priors on the parameters. This leads to a well posed inverse problem.
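To illustrate why a prior makes an under-determined regression well posed, consider a linear-Gaussian toy problem with more parameters than observations. The sketch below is an assumption-laden stand-in for the SDE inference developed in this thesis: it uses a zero-mean isotropic Gaussian prior, for which the posterior mean coincides with a ridge regression estimate, and all sizes and variances are hypothetical.

```python
import numpy as np

def gaussian_posterior(X, y, noise_var, prior_var):
    """Posterior mean and covariance for y = X w + noise with prior w ~ N(0, prior_var I)."""
    d = X.shape[1]
    precision = X.T @ X / noise_var + np.eye(d) / prior_var
    cov = np.linalg.inv(precision)
    mean = cov @ X.T @ y / noise_var
    return mean, cov

rng = np.random.default_rng(4)
n_obs, n_par = 20, 50                       # fewer observations than parameters
X = rng.standard_normal((n_obs, n_par))
w_true = rng.standard_normal(n_par)
y = X @ w_true + 0.1 * rng.standard_normal(n_obs)

# Ordinary least squares is ill-posed here: X^T X is rank deficient.
rank = np.linalg.matrix_rank(X.T @ X)

mean, cov = gaussian_posterior(X, y, noise_var=0.01, prior_var=1.0)
print(rank, n_par, np.linalg.eigvalsh(cov).min() > 0)
```

The prior adds a positive multiple of the identity to the normal equations, so the posterior precision is invertible regardless of the rank of the data matrix, and the posterior covariance provides the quantified uncertainty that a point estimate lacks.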
Kravtsov et al. [2005] apply their method to the barotropic quasi-geostrophic three level model of Marshall and Molteni [1993]. They use a cross validation method to determine the number of variables for the reduced model. By splitting the data into two sets they train the model on one and test its predictive performance on the other. They then determine the number of levels needed to account for the autocorrelation in the noise. They settle for using 15 variables and three levels and report that the reduced model has a similar climatology to the full model. They analyse the PDFs for the full and reduced models by fitting Gaussian mixture models to the data. In both cases four mixture components were optimal, and the resulting clusters were similar. They looked at the ability of the reduced model to attribute the correct probability mass to regions associated with persistent flow regimes for the Northern Hemisphere. They confirm that the reduced model can capture the statistics of the positive and negative phases of the Arctic Oscillation and North Atlantic Oscillation. They use Singular Spectrum Analysis to determine the skill of the reduced model in capturing the low frequency variability. They compare their results to those from a single level model and conclude that the single level model is indistinguishable from a red spectrum, whereas the multilevel model can capture the correct spectrum of the principal components of the full model. Moreover, they state that the single level model is sensitive to the particular realisation of the noise used and can have trajectories which diverge away from the stable patterns of the full model.
As argued by Majda and Yuan [2012], multilevel quadratic regression can produce nonphysical behaviour such as finite time blow up and non-existence of an invariant measure. They also note the effects of error due to the sampling interval of the training data. They argue that a physics based model (motivated by the homogenisation procedure outlined above) with cubic non-linearity has more predictive skill.
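Finite time blow up can be seen in the simplest scalar quadratic model. The following worked example is a textbook illustration, not the analysis of Majda and Yuan [2012]; the coefficients $a$ and $c$ are hypothetical.

```latex
\begin{aligned}
\dot{x} &= a x^2, \quad x(0) = x_0 > 0,\ a > 0
  &&\Longrightarrow\quad x(t) = \frac{x_0}{1 - a x_0 t},
  \qquad x(t) \to \infty \ \text{as}\ t \to t^* = \frac{1}{a x_0}, \\
\dot{x} &= a x^2 - c x^3, \quad c > 0
  &&\Longrightarrow\quad \frac{d}{dt}\,\tfrac{1}{2}x^2 = a x^3 - c x^4 < 0
  \ \ \text{whenever}\ |x| > a/c .
\end{aligned}
```

In the first case the solution escapes to infinity in finite time, so no invariant measure can exist. In the second, the cubic damping makes the energy $\tfrac{1}{2}x^2$ decrease outside a bounded interval, so trajectories remain bounded.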
In this section we have argued that linear models are insufficient to model the low frequency variability of the atmosphere well; that sparse observations can lead to errors and inconsistency in estimates of parameters for diffusion models; and that it is desirable to use physically motivated non-linear models such as those resulting from the rigorous homogenisation (MTV) procedure. However, the MTV procedure relies upon the estimation of hundreds of parameters from observations of the full system and may be inappropriate when there is a lack of time scale separation. Therefore, we argue for a data driven approach where the parametric model is motivated by the MTV procedure: for atmospheric models with quadratic nonlinearity this is usually a cubic model with noise that is linear in the state. We also argue in favour of theoretically well understood likelihood based inference to estimate the parameters. This leads to estimates with quantifiable errors. In particular the Bayesian approach will allow us to overcome any possible ill-posedness of the inference and will also prove useful in restricting the parameter space to give stable models. To overcome the problem of errors associated with infrequent observations we develop data imputation methods proposed in the SDE inference literature. In the next section we introduce the models with which we will work.