Experimental Configurations - Parallel Filter Algorithms for Data Assimilation in Oceanography

To assess the filtering abilities of the different filter algorithms, identical twin experiments are performed with a toy model using the nonlinear shallow water equations, see e.g. [62],¹

$$\partial_t \vec{u} + (\vec{u} \cdot \nabla)\vec{u} + \vec{f} \times \vec{u} + g \nabla h = 0 \qquad (4.1)$$

$$\partial_t h + \nabla \cdot \big( (H_0 + h)\,\vec{u} \big) = 0 \qquad (4.2)$$

where $\vec{u}(\vec{r}, t) = (u(\vec{r}, t), v(\vec{r}, t))$ is the velocity field and $h(\vec{r}, t)$ is the field of the sea surface elevation ($\vec{r} = (x, y)$ is the 2-dimensional location vector). $H_0(\vec{r}, t)$ is the sea depth and $g$ is the gravitational acceleration. Further, $\vec{f} = 2\Omega \sin\theta \, \vec{k}$, where $\Omega$ is the angular velocity of the Earth, $\theta$ is the latitude, and $\vec{k}$ is the vertical unit vector.

The shallow water equations are discretized in potential enstrophy conserving form according to Sadourny [71] with the extension to include the Coriolis term. The model domain is chosen as a box measuring 950 km per side with a flat bottom at 1000 meters depth. Periodic boundary conditions are applied in zonal and meridional directions.

The Coriolis parameter 2Ω sin θ is constant over the domain with a value of 10⁻⁴ s⁻¹. This corresponds to an f-plane approximation at a latitude of θ = 45°N. The experiments were performed with 30 × 30 grid points and a time step of 100 s using a leapfrog scheme.
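As a quick plausibility check of the constant Coriolis parameter, the following minimal sketch evaluates 2Ω sin θ at 45°N. The value of Ω is the standard textbook figure and is not taken from the text. The grid spacing assumes that the 30 grid points divide the periodic 950 km box into 30 equal cells. The names `OMEGA`, `coriolis_parameter`, `f0` and `dx` are illustrative only.

```python
import math

# Earth's angular velocity in rad/s (standard value; an assumption, not from the text)
OMEGA = 7.2921e-5


def coriolis_parameter(latitude_deg: float) -> float:
    """Return f = 2 * Omega * sin(latitude) in 1/s."""
    return 2.0 * OMEGA * math.sin(math.radians(latitude_deg))


# Coriolis parameter at 45 degrees north, close to the 1e-4 1/s used in the text
f0 = coriolis_parameter(45.0)

# Grid spacing in metres, assuming 30 equal cells across the periodic 950 km box
dx = 950e3 / 30

print(f"f0 = {f0:.3e} 1/s, dx = {dx / 1e3:.2f} km")
```

The computed value differs from 10⁻⁴ s⁻¹ by only a few percent, so the rounded constant is consistent with the stated latitude.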

The state vector x, used in the filter algorithms, consists of the surface elevation h and the horizontal velocity components u and v at the grid points. The state dimension amounts to n = 2700. This number is sufficiently large to obtain meaningful filter results also for the low-rank algorithms, but it is still small enough to allow for a direct study of the filter-represented covariance matrices.
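The layout of the state vector can be sketched as follows. The ordering (all h values, then u, then v) and the helper names `pack_state` and `unpack_state` are assumptions made for illustration. The text only states which fields the vector contains and that its dimension is n = 2700.

```python
import numpy as np

NX, NY = 30, 30  # grid size from the text


def pack_state(h, u, v):
    """Concatenate the three 2-D fields into one state vector.

    The ordering h, u, v is an assumption; the text does not specify it.
    """
    return np.concatenate([h.ravel(), u.ravel(), v.ravel()])


def unpack_state(x):
    """Split a state vector back into the h, u and v fields."""
    h, u, v = np.split(x, 3)
    return h.reshape(NY, NX), u.reshape(NY, NX), v.reshape(NY, NX)


# Placeholder fields, used only to demonstrate the dimensions
h = np.zeros((NY, NX))
u = np.ones((NY, NX))
v = np.full((NY, NX), 2.0)

x = pack_state(h, u, v)
print(x.shape)  # three fields of 900 values each
```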

For the twin experiments the 'true' state trajectory of the system is generated by initializing with the state shown in the left panel of figure 4.1. It is in geostrophic balance and has a shape that ensures nonlinear evolution with the shallow water equations. Synthetic observations of the surface elevation at each grid point are generated by adding normally distributed random numbers of variance 10⁻⁴ m² to the true surface elevation. Using only the surface elevation as observations, the dimension of the observation vector is m = 900. The generated observations are quite accurate in comparison to the amplitude of the true surface elevation. This is useful, since the dependence of filtering performance on ensemble size can be better assessed for large ensembles with accurate observations. In the twin experiments it is assumed that the model is exact, thus no model error is simulated.
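The generation of the synthetic observations can be illustrated with a short sketch. The function name `make_observations`, the seeded generator and the flat placeholder field `h_true` are assumptions for demonstration. In the actual experiments the true elevation would come from the model run.

```python
import numpy as np


def make_observations(h_true, variance=1e-4, rng=None):
    """Add zero-mean Gaussian noise of the given variance (m^2) to the true elevation.

    Returns the observation vector with one entry per grid point.
    """
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(loc=0.0, scale=np.sqrt(variance), size=h_true.shape)
    return (h_true + noise).ravel()


rng = np.random.default_rng(0)   # fixed seed, only for reproducibility of the sketch
h_true = np.zeros((30, 30))      # placeholder for the true surface elevation
y = make_observations(h_true, rng=rng)

print(y.shape)  # one observation per grid point, so m = 900
```

A variance of 10⁻⁴ m² corresponds to an observation error standard deviation of 1 cm.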

¹ We use the notation $\vec{u}$ for a spatially continuous vector field. The discretization of a field $h$, which is represented as a vector, is denoted by $\mathbf{h}$.

[Figure 4.1, right panel title: "Mean over 8000 time steps"; color scale: h [m]]

Figure 4.1: Surface elevation and velocity field of the true initial state (left) and mean state over 8000 time steps using every 10th step (right).

Two types of experiments are performed. For the first one, referred to as experiment 'A', the initialization of the model state estimate $x^a_0$ and the corresponding covariance matrix $P^a_0$ is performed for all three filter algorithms by applying the EOF procedure described by Pham et al. [68], which uses a sequence of model states. The initial state estimate $x^a_0$ is chosen as the mean state of the true model simulation over 8000 time steps using every 10th time step. It is shown in the right panel of figure 4.1. The covariance matrix $P^a_0$ is computed as the variation of the true model trajectory about this mean. This matrix does not reflect the estimated error of the initial state but the estimated mean temporal variability of the model state. The procedure, however, yields a consistent and simple way to obtain variance estimates together with estimates of the covariances.
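The type A initialization can be sketched as a sample mean and sample covariance over subsampled model states. This is a minimal illustration and not the procedure of Pham et al. [68] in detail. The normalization by the number of states minus one, the function name `mean_and_covariance` and the small random `trajectory` are assumptions.

```python
import numpy as np


def mean_and_covariance(snapshots):
    """Return the mean state and the sample covariance of a set of model states.

    snapshots has shape (number of states, n). The divisor (count - 1) is an
    assumption; the text does not state the normalization.
    """
    x_mean = snapshots.mean(axis=0)
    anomalies = snapshots - x_mean
    P = anomalies.T @ anomalies / (snapshots.shape[0] - 1)
    return x_mean, P


rng = np.random.default_rng(0)
trajectory = rng.standard_normal((40, 6))  # toy stand-in: 40 time steps, n = 6
snapshots = trajectory[::10]               # keep every 10th step, as in the text

x_mean, P = mean_and_covariance(snapshots)
print(x_mean.shape, P.shape)
```

Because the anomalies sum to zero, a covariance matrix built from k states has rank at most k − 1.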

This mean and covariance matrix serve as a baseline. However, it soon turned out that all algorithms can improve this "state of large ignorance". A much more enlightening setting would be to use a model state and covariance matrix that are already quite accurate and difficult to improve. To this end, the initialization of the second type of experiments, referred to as experiment 'B', is conducted with the estimated state and covariance matrix after the second analysis update from an assimilation experiment of type A with the EnKF using a very large ensemble of N = 5000 members. This is a very accurate state estimate whose rms deviation from the true state is two orders of magnitude smaller than the initial estimate of type A. The structure of this state is thus very similar to that of the true initial state displayed in the left panel of figure 4.1.

In addition, the covariance matrix of type B is an estimated error covariance matrix of the state estimate. It has a strongly different structure compared with the covariance matrix of type A. This is obvious from the eigenvalue spectrum, displayed in figure 4.2.

For type A the covariance matrix is ill-conditioned and the ten largest eigenmodes already explain 99% of the variance. In contrast to this, 371 eigenmodes are required to explain 90% of the variance for type B.
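The number of eigenmodes needed to explain a given fraction of the variance can be computed from the eigenvalue spectrum, as in the following sketch. The function name `modes_needed` and the small diagonal test matrix are hypothetical. The matrix is not related to the covariance matrices of types A and B.

```python
import numpy as np


def modes_needed(P, fraction):
    """Return the smallest number of leading eigenmodes of the symmetric
    matrix P whose eigenvalues sum to at least `fraction` of the trace."""
    eigvals = np.linalg.eigvalsh(P)[::-1]        # sort in descending order
    eigvals = np.clip(eigvals, 0.0, None)        # remove tiny negative round-off
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    return int(np.searchsorted(cumulative, fraction) + 1)


# Toy matrix with a steep spectrum: the first mode carries 90% of the variance
P_toy = np.diag([90.0, 5.0, 3.0, 2.0])

print(modes_needed(P_toy, 0.85), modes_needed(P_toy, 0.97))
```

A steep spectrum gives a small count, while a flat spectrum requires many modes for the same fraction.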

[Figure 4.2, axes: eigenvalue (logarithmic scale, 10⁻⁸ to 10⁴) versus eigenvalue index (0 to 500); curves: type A, type B]

Figure 4.2: Eigenvalues for the covariance matrices for experiments of type A and B up to eigenvalue index 500.

Decomposed low-rank approximations $\hat{P}^a_0 = V_0 U_0 V_0^T$ of the covariance matrix $P^a_0$ are required to initialize the SEEK and SEIK filters. These are computed by incomplete eigenvalue decompositions of $P^a_0$, retaining only the r largest eigenmodes. The N ensemble states required for the EnKF algorithm have been generated from the state estimate $x^a_0$ and the covariance matrix $P^a_0$ by a transformation of independent random numbers. For this, the eigenvalue decomposition of $P^a_0$ is computed, yielding $P^a_0 = V U V^T$. The eigenvectors are scaled by the square root of the corresponding eigenvalue as $L = V U^{1/2}$. For each ensemble state $\{x^{a(\alpha)}_0, \alpha = 1, \ldots, N\}$, each scaled eigenvector $L^{(i)}$ is multiplied by a random number $b^{(\alpha)}_i$ from a normal distribution of zero mean and unit variance and added to the state estimate $x^a_0$:

$$x^{a(\alpha)}_0 = x^a_0 + \sum_{i=1}^{q} b^{(\alpha)}_i L^{(i)}; \qquad \alpha = 1, \ldots, N \qquad (4.3)$$

Since the prescribed covariance matrix has a maximum rank of 799, we use only q = 799 eigenmodes in equation (4.3).
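The ensemble generation of equation (4.3) can be sketched with NumPy as follows. The function names `scaled_eigenvectors` and `sample_ensemble`, the seeded generator and the 3 × 3 demonstration matrix are assumptions for illustration. The product of the scaled eigenvectors with their transpose also gives the rank-q approximation of the covariance matrix.

```python
import numpy as np


def scaled_eigenvectors(P, q):
    """Return L = V U^{1/2} for the q largest eigenmodes of the symmetric matrix P."""
    eigvals, eigvecs = np.linalg.eigh(P)           # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:q]          # indices of the q largest modes
    lam = np.clip(eigvals[order], 0.0, None)       # guard against negative round-off
    return eigvecs[:, order] * np.sqrt(lam)        # scale each column by sqrt(eigenvalue)


def sample_ensemble(x0, P, n_members, q, rng):
    """Draw ensemble states x0 + sum_i b_i L^(i) with b_i ~ N(0, 1), as in (4.3).

    Returns an array of shape (n, n_members), one ensemble state per column.
    """
    L = scaled_eigenvectors(P, q)                  # shape (n, q)
    b = rng.standard_normal((q, n_members))        # independent random coefficients
    return x0[:, None] + L @ b


rng = np.random.default_rng(0)                     # fixed seed for the sketch
x0_demo = np.array([1.0, 2.0, 3.0])                # toy state estimate
P_demo = np.diag([4.0, 1.0, 0.0])                  # toy covariance of rank 2
ensemble = sample_ensemble(x0_demo, P_demo, n_members=50, q=2, rng=rng)

print(ensemble.shape)
```

For a finite ensemble the sample mean and covariance only approximate the prescribed values, with sampling errors that shrink as N grows.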

The assimilation experiments are performed over an interval of 8000 time steps for type A and 7600 time steps for type B with an analysis phase every 200 time steps. For a particular ensemble size N the rank in SEEK and SEIK is set to r = N − 1. In this case the number of model evaluations is equal for all three filter algorithms and the filtering performances can be directly related to computing time. Below, the expression "ensemble size" is used to denote the number of different model states to be evolved. It will be equal to N for the EnKF and r + 1 for the SEEK and SEIK algorithms.
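The experiment schedule can be summarized in a few lines. This sketch assumes that the first analysis takes place after 200 time steps and that one is performed at the final step. That assumption is consistent with type B starting after the second analysis of type A (8000 − 2 × 200 = 7600), but the text does not state it explicitly. The function names are illustrative.

```python
def analysis_steps(n_steps, interval=200):
    """Return the time steps at which an analysis is performed.

    Assumes the first analysis after `interval` steps and one at the final step.
    """
    return list(range(interval, n_steps + 1, interval))


def seek_seik_rank(ensemble_size):
    """Rank r used in SEEK and SEIK so that r + 1 model states are evolved."""
    return ensemble_size - 1


steps_a = analysis_steps(8000)   # type A
steps_b = analysis_steps(7600)   # type B

print(len(steps_a), len(steps_b), seek_seik_rank(30))
```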
