Miyoshi et al. (2013) and Terasaki and Miyoshi (2014) showed that the interaction between OECs and the observation operator is very important. In particular, they showed that when the observations are direct measurements of the state variables, observations with correlated errors (either positive or negative correlations) can reduce the entropy of the posterior probability distribution function (PDF) compared with the case when the observation errors are uncorrelated. However, when the observation operator is expressed as a linear combination of the state variables with positive coefficients, the sign of the OECs becomes important. In this case, OECs only reduce the entropy of the posterior PDF, compared with uncorrelated errors, if the OECs are negative. Through a series of idealized experiments, we will show how these results can be explained in terms of the analysis scales that the observations can constrain (see Section 4). Liu and Rabier (2002) studied the case in which OECs originate from the observations measuring different scales from those modelled. This effect was simulated in their experiments by generating observations from a model run at a higher spectral resolution than that used in the assimilation. In this case, the OECs and the observation operator are intrinsically related. Liu and Rabier (2002) presented results on the optimal thinning of these observations with correlated errors. They found that if the OECs are correctly modelled, then increasing the observation density beyond a certain threshold does not further reduce the analysis error.


since the original radar observations are thinned to a 6-km mesh to remove spatial correlations between adjacent observations, the O − B covariance from the O − B statistics exists only for distances larger than 6 km. Although the observation error variance cannot be determined, the background error variance and the length scale for the background error covariance can be determined using the method in Section 2.1. Estimated length scales for radial velocity, rainwater, snow, and graupel are 7.7, 4.3, 4.5, and 4.0 km, respectively. For all observation types, the O − B covariances from the fitted Gaussian function represent the O − B covariances from the O − B statistics well. Note that the length-scale values for stream function and velocity potential from the NMC-based statistics are about 90 and 70 km, respectively, and the length scales for rainwater, snow, and graupel are specified as 6 km in the WRFDA system. Although the length scale for radial velocity is estimated, the length scales of the control variables associated with wind (i.e. stream function and velocity potential) are tuned, because data assimilation is done in control-variable space in WRFDA.
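The length-scale estimation described above, fitting a Gaussian function to binned O − B covariances, can be sketched as follows. This is a minimal illustration with synthetic covariance values (the numbers and helper names are our own, not from the paper); it exploits the fact that a Gaussian covariance model is linear in squared distance after taking logs.

```python
import numpy as np

# Synthetic binned O-B covariances at separations > 6 km (illustrative values;
# in the paper these come from O-B statistics of real radar observations).
rng = np.random.default_rng(0)
dist = np.arange(6.0, 22.0, 2.0)          # separation distance (km)
true_var, true_L = 4.0, 7.7               # background error variance, length scale
cov = true_var * np.exp(-dist**2 / (2 * true_L**2)) + rng.normal(0, 0.02, dist.size)

# Gaussian covariance model cov(d) = var * exp(-d^2 / (2 L^2)) becomes linear
# in d^2 after taking logs, so a least-squares line fit recovers var and L.
slope, intercept = np.polyfit(dist**2, np.log(cov), 1)
var_hat = np.exp(intercept)               # extrapolated covariance at d = 0
L_hat = np.sqrt(-1.0 / (2.0 * slope))     # fitted length scale (km)
print(f"estimated variance = {var_hat:.2f}, length scale = {L_hat:.2f} km")
```

The extrapolated intercept at zero separation gives the background error variance, since the O − B covariance at nonzero separation is attributed to the (spatially correlated) background errors once adjacent observations have been decorrelated by thinning.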

Abstract. The assimilation of satellite-based water level observations (WLOs) into 2-D hydrodynamic models can keep flood forecasts on track or be used for reanalysis to obtain improved assessments of previous flood footprints. In either case, satellites provide spatially dense observation fields, but with spatially correlated errors. To date, assimilation methods in flood forecasting either incorrectly neglect the spatial correlation in the observation errors or, in the best of cases, deal with it by thinning methods. These thinning methods result in a sparse set of observations whose error correlations are assumed to be negligible. Here, with a case study, we show that the assimilation diagnostics that make use of statistical averages of observation-minus-background and observation-minus-analysis residuals are useful to estimate error correlations in WLOs. The average estimated correlation length scale of 7 km is longer than the expected value of 250 m. Furthermore, the correlations do not decrease monotonically; this unexpected behaviour is shown to be the result of assimilating some anomalous observations. Accurate estimates of the observation error statistics can be used to support quality control protocols and provide insight into which observations it is most beneficial to assimilate. Therefore, the understanding gained in this paper will contribute towards the correct assimilation of denser datasets.


The erroneous model is integrated as

$$\mathbf{x}_i = \mathcal{M}^e_{i-1 \to i}(\mathbf{x}_{i-1}), \qquad i = 1, 2, \ldots, 50, \tag{37}$$

where $\mathbf{x}_i = (x_i\; y_i\; z_i\; w_i\; v_i)^T$ is the model state vector. The true model state at time $t_i$ can be obtained as

$$\mathbf{x}^t_i = \mathcal{M}^e_{i-1 \to i}(\mathbf{x}^t_{i-1}) + \boldsymbol{\eta}_i, \tag{38}$$

where the model-error vector $\boldsymbol{\eta}_i \sim N(\mathbf{0}, \mathbf{Q}_i)$. The forms of the erroneous model (37) and the true model (38) are the same as (7) and (8) used in the derivation of both the combined model-error and observation-error covariance matrix R* (14) and the estimated combined error covariance matrix R* (32), except that here nonlinear model equations are used, as opposed to linear model matrices. One of the key objectives of this section is to show that the theory developed with linear models is also successfully applicable to models of a nonlinear nature. We have discussed the capabilities of NWP centres, such as ECMWF, to estimate the diagonal entries of the background model-error covariance matrix, evolved using the model matrix and subsequently mapped to observation space using the randomization method. If our method for computing R* (32) were implemented operationally using similar randomization techniques, then only the diagonal elements of the combined error covariance matrix would be specified. Therefore, we wish to show in our numerical experiments with the idealized erroneous model (37) that, even when only the diagonal elements of the combined error covariance matrix are calculated and used within the data assimilation process, improvements in analysis accuracy can be obtained. Specifically, this ignores the presence of time and multivariate cross-correlations in both the observation error and the model error. For the experiments in this section we use an assimilation window length of 50 time steps. We define the true initial conditions:
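Stepping back from the excerpt: the claim that even a diagonal approximation to the combined error covariance can improve the analysis can be checked exactly in a small linear-Gaussian sketch (our own toy setup, not the authors' experiment). The expected analysis error is computed for a gain built from the full combined covariance, from its diagonal only, and for no assimilation at all.

```python
import numpy as np

# Toy linear-Gaussian setup: state observed directly (H = I).
n = 5
B = 2.0 * np.eye(n)                                  # background error covariance

# True combined (observation + model) error covariance: correlated
d = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
R_true = 1.0 * 0.6**d                                # Markov-type correlations
R_diag = np.diag(np.diag(R_true))                    # diagonal approximation

def expected_analysis_error(R_assumed):
    """Trace of the actual analysis error covariance when the gain is built
    with R_assumed but the real combined errors have covariance R_true."""
    K = B @ np.linalg.inv(B + R_assumed)
    I_KH = np.eye(n) - K
    A = I_KH @ B @ I_KH.T + K @ R_true @ K.T
    return np.trace(A)

err_full = expected_analysis_error(R_true)   # optimal: full combined covariance
err_diag = expected_analysis_error(R_diag)   # diagonal-only approximation
err_none = np.trace(B)                       # no assimilation (background only)
print(err_full, err_diag, err_none)
```

The full covariance is optimal by construction, but the diagonal approximation still reduces the expected error well below the background level, which is the qualitative behaviour the section's experiments are designed to demonstrate.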


decade, uncorrelated observation errors were assumed for all instruments. This is partly due to the difficulty of estimating observation error statistics. Additionally, using non-diagonal correlation matrices increases the computational expense of inverting the observation error covariance matrix. For spatial correlations, thinning is one technique that can allow users to neglect correlated observation errors. For some instruments, estimated correlation length scales are shorter than typical thinning distances [Bennitt et al., 2017], meaning that thinning can be a valid technique. However, for other instruments the correlation length scales have been found to be much longer than reasonable thinning distances [Waller et al., 2016a,c; Cordoba et al., 2017], meaning that correlations must be taken into account. The use of thinning results in a large number of observations being discarded: in the Météo-France convection-permitting limited-area model, radar observations are horizontally thinned by a factor of 64 and infrared satellite observations by a factor of 400 [Michel, 2018]. Thinning may also be necessary due to the large size of observation datasets, which can cause difficulties with storage and computational resources. However, alternative data compression methods, such as using a Fourier transform to retain only the largest modes of observation information, may allow a larger amount of information to be retained while reducing the computational burden [Fowler, 2019].
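The Fourier-based compression idea can be sketched as follows. This is a minimal illustration of the general technique, not Fowler's (2019) actual method: a dense 1-D observation profile is transformed, only the largest-amplitude modes are retained, and the profile is reconstructed from them.

```python
import numpy as np

# Dense 1-D "observation profile": a smooth signal plus small-scale detail.
n = 256
x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
obs = np.sin(x) + 0.5 * np.sin(3 * x) + 0.1 * np.sin(20 * x)

# Transform to spectral space and keep only the k largest-amplitude modes.
k = 8
spec = np.fft.rfft(obs)
keep = np.argsort(np.abs(spec))[-k:]          # indices of the k largest modes
compressed = np.zeros_like(spec)
compressed[keep] = spec[keep]

# Reconstruct from the retained modes; most of the information survives even
# though only k of the n//2 + 1 complex coefficients are stored.
recon = np.fft.irfft(compressed, n)
rel_err = np.linalg.norm(recon - obs) / np.linalg.norm(obs)
print(f"kept {k} of {spec.size} modes, relative error = {rel_err:.3e}")
```

Unlike thinning, which discards whole observations, this retains the dominant scales of the entire field; the contrast is exactly the point made in the paragraph above.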


In the Lorenz 1996 example, although the optimal methods were adaptive to the changing flow of the ensemble, the compression was actually relatively static (see Table 2 for an example of how the most important wave numbers change over the five observation times). Future work will look further at the effect of the flow-dependent estimate on the data compression, using a more physically realistic model in which prior error correlations are more dynamic: for example, the multivariate modified shallow-water model of Kent et al. (2017), which represents simplified dynamics of cumulus convection and associated precipitation, and the corresponding disruption to large-scale balances. Methods for reducing the computational cost of on-line data compression will also be investigated, as well as the possibility of retaining some baseline scales.



6. Conclusions
To make better use of observations in data assimilation it is necessary to understand and correctly represent their associated error statistics in the assimilation method. One popular method for estimating observation error statistics, which makes use of information in the background and analysis residuals, is that of Desroziers et al. (2005). Although this method has been used both in simple experiments and in operational systems to provide estimates of the observation error statistics, the behaviour of the diagnostic is not well understood. In this work we have developed a theoretical understanding of the non-iterative application of the diagnostic and illustrated it with simple examples. We note that in these cases the statistical nature of the diagnostic is not considered, as the values are calculated directly and not from samples of the analysis and background residuals. When estimates are instead calculated from samples, it is inevitable that further noise will be introduced.
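The Desroziers et al. (2005) diagnostic estimates R from the statistical average of products of observation-minus-analysis and observation-minus-background residuals, E[d_a d_b^T] ≈ R, provided the gain used in the analysis is consistent with the true error statistics. A minimal linear-Gaussian Monte Carlo sketch (our own toy setup, with direct observations):

```python
import numpy as np

# Toy linear-Gaussian system with direct observations (H = I).
n, n_samples = 4, 200_000
rng = np.random.default_rng(42)
B = 2.0 * np.eye(n)                         # background error covariance
R = 1.0 * np.eye(n)                         # true observation error covariance
K = B @ np.linalg.inv(B + R)                # optimal Kalman gain (H = I)

# Sample background and observation errors, form the residuals.
eps_b = rng.multivariate_normal(np.zeros(n), B, n_samples)
eps_o = rng.multivariate_normal(np.zeros(n), R, n_samples)
d_b = eps_o - eps_b                         # observation-minus-background (innovation)
d_a = d_b - d_b @ K.T                       # observation-minus-analysis: d_a = (I - KH) d_b

# Desroziers diagnostic: E[d_a d_b^T] recovers R when K is consistent.
R_est = d_a.T @ d_b / n_samples
print(np.round(R_est, 2))
```

The sample average converges to R because E[d_a d_b^T] = (I − K)(B + R) = R for the optimal gain; with a finite sample, the estimate carries exactly the sampling noise the conclusions above refer to.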


ued throughout 2013, and a series of 7-day forecasts, 3 days apart, with identical base dates as illustrated in Fig. 2, are carried out from 3 January 2013. The forecast experiments were done behind real time; therefore, observations in the 12–24 h prior to the forecast base time were available to both systems, whereas in practice they would not be available in this period in a real-time system. The model is forced by 3-hourly prescribed surface fluxes of momentum, heat and salt from the Bureau of Meteorology operational global NWP system version 1, known as ACCESS-G APS1 (Australian Community Climate and Earth System Simulator). For data assimilation, the EnKF-C software (Sakov, 2015) is used in Ensemble Optimal Interpolation (EnOI) (Evensen, 2003) mode. The analysis equation and background error covariances can be written as
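The excerpt is cut off before the equations themselves; for reference, the EnOI update is commonly written in the following form (a standard statement of the method, e.g. Evensen (2003), not necessarily the exact notation of this paper):

```latex
% EnOI analysis: a single background state x^b is updated using a static
% ensemble of anomalies A' to model the background error covariance B.
\mathbf{x}^a = \mathbf{x}^b + \mathbf{K}\left(\mathbf{y} - \mathbf{H}\mathbf{x}^b\right),
\qquad
\mathbf{K} = \mathbf{B}\mathbf{H}^{\mathrm{T}}
             \left(\mathbf{H}\mathbf{B}\mathbf{H}^{\mathrm{T}} + \mathbf{R}\right)^{-1},
\qquad
\mathbf{B} = \frac{\alpha}{m-1}\,\mathbf{A}'\mathbf{A}'^{\mathrm{T}},
```

where A' holds the m static ensemble anomalies (deviations from the ensemble mean) and α is a scaling factor tuning the background error variance.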

Other potential issues are the choice of channel selection and observation sampling. The work above was undertaken when there was a fixed IASI channel selection for 1D-Var and 4D-Var; currently the channel selection is adaptive and depends on additional quality control related to cloudy fields of view. This does not affect the use of the diagnostic nor the substance of the results presented here; it requires only a modification of the application. Also, the statistics used in the calculation of the covariance matrices were taken from a global sample, and hence local effects could be masked. However, it is not proposed that these exact diagnosed error structures be used in the Met Office assimilation system; rather, they provide the motivation and framework for future investigations.


With the development of convection-permitting numerical weather prediction, the efficient use of high-resolution observations in data assimilation is becoming increasingly important. The operational assimilation of these observations, such as Doppler radar radial winds (DRWs), is now common, although to avoid violating the assumption of uncorrelated observation errors the observation density is severely reduced. Improving the quantity of observations used, and the impact that they have on the forecast, requires the introduction of the full, potentially correlated, error statistics. In this work, observation error statistics are calculated for the DRWs that are assimilated into the Met Office high-resolution U.K. model (UKV), using a diagnostic that makes use of statistical averages of observation-minus-background and observation-minus-analysis residuals. This is the first in-depth study using the diagnostic to estimate both horizontal and along-beam observation error statistics. The new results show that the DRW error standard deviations are similar to those used operationally and increase with observation height. Surprisingly, the estimated observation error correlation length scales are longer than the operational thinning distance. They depend both on the height of the observation and on its distance from the radar. Further tests show that the long correlations cannot be attributed to the background error covariance matrix used in the assimilation, although they are, in part, a result of using superobservations and a simplified observation operator. The inclusion of correlated error statistics in the assimilation allows less thinning of the data and hence better use of the high-resolution observations.


The error in the optimal solution (or 'analysis error') is naturally defined as the difference between the solution u and the true state u_t; this error is quantified by the analysis error covariance matrix (see, for example, Thacker, 1989; Rabier and Courtier, 1992; Fisher and Courtier, 1995; Yang et al., 1996; Gejadze et al., 2008). This perception of uncertainties in the 4D-Var method is probably inherited from nonlinear least-squares (or nonlinear regression) theory (Hartley and Booker, 1965). A less widespread point of view is to consider the 4D-Var method in the framework of Bayesian methods. Among the first to write on the Bayesian perspective on DA one should probably mention Lorenc (1986) and Tarantola (1987). For a comprehensive review of recent advances in DA from this point of view see, for example, Wikle and Berliner (2007) and Stuart (2010). It has been recognized that, for Gaussian data errors (which include observation and background/prior errors), the Bayesian approach leads to the same standard 4D-Var cost functional J(u) to be minimized. However, it is not yet widely recognized that the conception of the estimation error in Bayesian theory is somewhat different from that in nonlinear least-squares theory and, as a result, the Bayesian posterior covariance is not exactly the analysis error covariance. These are conceptually different objects, which can sometimes be approximated by the same estimate. In the linear case they are quantitatively equal; in the nonlinear case the difference may become quite noticeable in practical terms. Note that the analysis error covariance computed at the optimal solution can also be called 'posterior', because it is, in some way, conditioned on the data (observations and background/prior). However, this is not the same as the Bayesian posterior covariance.
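The linear-case equality mentioned above can be verified directly: the Bayesian posterior covariance is the inverse Hessian of the quadratic cost, (B⁻¹ + HᵀR⁻¹H)⁻¹, while the analysis error covariance of the optimal (BLUE) estimate is (I − KH)B. A small numerical sketch (our own illustration with random matrices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 3                                   # state and observation dimensions

# Random SPD background and observation error covariances, random linear H.
Lb = rng.standard_normal((n, n))
B = Lb @ Lb.T + n * np.eye(n)
Lr = rng.standard_normal((m, m))
R = Lr @ Lr.T + m * np.eye(m)
H = rng.standard_normal((m, n))

# Bayesian posterior covariance: inverse Hessian of the quadratic cost.
A_bayes = np.linalg.inv(np.linalg.inv(B) + H.T @ np.linalg.inv(R) @ H)

# Analysis error covariance of the BLUE: (I - K H) B with the optimal gain K.
K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
A_blue = (np.eye(n) - K @ H) @ B

print(np.allclose(A_bayes, A_blue))
```

The agreement is an instance of the Sherman-Morrison-Woodbury identity; it is exactly this identity that fails to carry over once the operators become nonlinear, which is the distinction the passage draws.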


The present study aims at providing a method to explicitly quantify the error of process-based terrestrial models, in particular for global CCDASs. The conclusions also apply to site-scale parameter optimisation schemes. Denoting as "prior" the state of the carbon-cycle model before any observational constraint, we propose to analyse the statistics of the prior residuals (observations minus prior simulations) with the help of the assigned prior parameter uncertainties projected into the observation space. Within the Bayesian framework, these two pieces of information and the observation error, which is the summed contribution of model and measurement errors, are linked together. We apply this method to the global biosphere model ORganising Carbon and Hydrology In Dynamic EcosystEms (ORCHIDEE; Krinner et al., 2005) in temperate deciduous broadleaf forests, using measurements of the daily net ecosystem exchange (NEE) flux at twelve eddy-covariance flux measurement sites as the observable quantity. We take advantage of previous studies that have characterised the uncertainties of these measurements (e.g., Richardson et al., 2008). The inferred structure of the observation (model + measurement) error on the modelled net carbon fluxes is then projected into the space of atmospheric concentrations in order to characterise its structure when assimilating concentration measurements with a CCDAS.


correctly in the experiment, i.e. the random background errors are sampled from the same distribution as is modelled in the cost function.
The analysis errors E1 and E2 at t_0 are given in Tables 5 and 6, for u and f respectively. The tables show that the different approximations to R still have an impact on the analysis accuracy when the background errors are correlated, and the impact is of a similar order of magnitude to that seen in Experiment 1 (with uncorrelated background errors). As before, the diagonal approximations give some of the worst performances. However, unlike in Experiment 1, variance inflation improves the results slightly. This change in behaviour with correlated versus uncorrelated background errors is consistent with our earlier 3D-Var information content results (Stewart et al., 2008), as discussed in Section 3.2. The results using an ED approximation are mixed: for the u field the performance is only comparable to the diagonal approximations, although for the f field the ED approximation yields better results. Fisher (2005) notes a potential problem with the eigendecomposition approach in that the approximate R matrices contain spurious correlations, although it is hoped that contributions from these spurious correlations may cancel out in the analysis. We hypothesise that the particular realisation of the observation and background noise used in this experiment has amplified this problem, although more detailed experiments beyond the scope of this article would be needed to verify this hypothesis definitively. Overall, the Markov approximations provide the best results in terms of analysis accuracy (as also seen in Experiments 1 and 2). Finally, we note that the detailed results seen in Experiments 1 and 2 change when background errors are correlated, but the general conclusion still holds: it is better to include some level of correlation structure in the observation error covariance matrix approximation than to incorrectly assume error independence.
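A Markov approximation of the kind referred to above is typically a first-order autoregressive correlation matrix, R_ij = σ² ρ^|i−j|. A small sketch of the construction (our own illustration of the general form; the experiments' exact parameters may differ), showing two properties that make it attractive: it is a valid covariance for |ρ| < 1, and its inverse is tridiagonal, so it is cheap to apply.

```python
import numpy as np

def markov_R(n, sigma2, rho):
    """First-order Markov (AR(1)) observation error covariance:
    R[i, j] = sigma2 * rho**|i - j|."""
    idx = np.arange(n)
    return sigma2 * rho ** np.abs(np.subtract.outer(idx, idx))

R = markov_R(6, sigma2=1.0, rho=0.5)

# Symmetric positive definite for |rho| < 1, hence a valid covariance;
# the inverse of an AR(1) covariance is tridiagonal.
eigvals = np.linalg.eigvalsh(R)
Rinv = np.linalg.inv(R)
print(eigvals.min() > 0)
print(np.allclose(np.triu(Rinv, 2), 0.0, atol=1e-8))
```

The tridiagonal inverse means the R⁻¹ products needed in a variational cost function cost O(n) rather than O(n²), which is one reason Markov forms compete so well against diagonal approximations here.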


In a second study, carried out at the Met Office (Peter Weston, 2016; personal communication), default Cross-track Infrared Sounder (CrIS) data from the ∼14 km resolution field of view (FOV) were compared against averaged CrIS data created by averaging the 3 × 3 FOVs in each field of regard (FOR) to create superobservations with an effective resolution of ∼42 km. The motivation for this study was to better match the scales between the observations and the models being used, where the forecast model is N768 (∼17 km horizontal resolution) and the assimilation model is N216 (∼60 km resolution). The error characteristics of both datasets were estimated using a posteriori diagnostics, such as those described in Section 5, and showed that the averaged dataset had smaller error standard deviations and weaker correlations, due to smaller representation errors and lower instrument noise resulting from the averaging. Another effect of the averaging is that the number of observations suitable for assimilation is reduced, as more of the larger FOVs are contaminated by cloud. When compared in NWP assimilation trials (including correlated observation-error covariances and using a 4D-Var algorithm), the results were broadly neutral, with very slight degradations of up to 0.5% in background fits to observations sensitive to mid-tropospheric temperature and humidity. Therefore, it appears that the negative effect of the reduction in the number of observations assimilated due to cloud contamination outweighs the benefit of the smaller errors obtained from the better scale matching and reduced instrument noise.


(Manuscript received 6 November 2012; in final form 1 May 2013)
ABSTRACT
Data assimilation methods which avoid the assumption of Gaussian error statistics are being developed for geoscience applications. We investigate how the relaxation of the Gaussian assumption affects the impact observations have within the assimilation process. The effect of non-Gaussian observation error (described by the likelihood) is compared to previously published work studying the effect of a non-Gaussian prior. The observation impact is measured in three ways: the sensitivity of the analysis to the observations, the mutual information, and the relative entropy. These three measures have all been studied in the case of Gaussian data assimilation and, in this case, have a known analytical form. It is shown that the analysis sensitivity can also be derived analytically when at least one of the prior or the likelihood is Gaussian. This derivation shows an interesting asymmetry in the relationship between analysis sensitivity and analysis error covariance when the two different sources of non-Gaussian structure are considered (likelihood vs. prior). This is illustrated for a simple scalar case and used to infer the effect of the non-Gaussian structure on mutual information and relative entropy, which are more natural choices of metric in non-Gaussian data assimilation. It is concluded that approximating non-Gaussian error distributions as Gaussian can give significantly erroneous estimates of observation impact. The degree of the error depends not only on the nature of the non-Gaussian structure, but also on the metric used to measure the observation impact and the source of the non-Gaussian structure.

Keywords: mutual information, relative entropy, sensitivity
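For reference, in the scalar Gaussian case the impact measures mentioned above have simple closed forms: the analysis sensitivity to the observation is the gain k = σ_b²/(σ_b² + σ_o²), and the mutual information is MI = ½ ln(σ_b²/σ_a²). A small sketch of these standard Gaussian formulas (our own illustration, not the paper's non-Gaussian derivation):

```python
import numpy as np

# Scalar Gaussian data assimilation: prior N(xb, sb2), obs error N(0, so2).
sb2, so2 = 4.0, 1.0                     # prior and observation error variances

k = sb2 / (sb2 + so2)                   # gain = sensitivity of analysis to obs
sa2 = (1.0 - k) * sb2                   # posterior (analysis) variance

# Mutual information: expected entropy reduction due to the observation.
mi = 0.5 * np.log(sb2 / sa2)

print(f"sensitivity = {k:.3f}, analysis variance = {sa2:.3f}, MI = {mi:.3f} nats")
```

It is against these analytic Gaussian baselines that the paper measures how far non-Gaussian likelihoods or priors shift the apparent observation impact.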


One issue that appears to have been overlooked is that the diagnostics are derived assuming that the analysis is calculated using a best linear unbiased estimator. In recent work, the diagnostic has been applied to calculate observation errors where the analysis has been calculated using an ensemble assimilation scheme employing domain and observation localization techniques (Lange and Janjić, 2016; Schraff et al., 2016). In this article, we consider whether the diagnostics of Desroziers et al. (2005) are still appropriate for calculating observation-error statistics for observations used in a local assimilation scheme. We provide a new derivation of the diagnostics using the analysis calculated by a local ensemble assimilation. From this derivation, we show that the diagnostic equations no longer hold and that the statistical averages of observation-minus-background and observation-minus-analysis residuals no longer result, in general, in an estimate of the observation-error covariance matrix. However, further analysis of our derived diagnostics shows that, under certain circumstances, some elements of the observation-error covariance matrix can, in principle, be recovered exactly. Those elements that cannot, in principle, be derived exactly we describe as 'incorrectly estimated'. Furthermore, we provide a method to determine which elements of the observation-error covariance matrix can be estimated correctly. In particular, the correct estimation of correlations depends on both the localization radius and the observation operator. We provide some special cases showing that, depending on the specific background- and observation-error statistics and observation operators, one may be lucky and, in theory, able to recover all elements of the observation-error covariance matrix, or unlucky and able to recover none.
We also use examples to show that it is possible that, theoretically, some elements will be estimated incorrectly by the diagnostic but, due to the choice of specific background- and observation-error statistics and observation operators, the estimated values may be close to the true values. However, some prior knowledge of the true statistics is required to be able to validate the quality of the incorrect estimates. Therefore, if the estimated error statistics are to be utilized further, it is necessary to find another method to assign values to those elements of the covariance matrix that cannot be estimated correctly by the diagnostic. This may be achieved by, for example, applying techniques such as those of Higham (2002) to provide a nearest approximate correlation matrix.


We also discuss the various physical and chemical processes employed in the WRF-Chem model in more detail. Table A1 summarizes the WRF-Chem configuration options used in this study. To evaluate the cross-variable component of the forecast error covariance, we select simplified physical processes rather than sophisticated ones. Regarding the atmospheric processes, we use the recommended physics options for the regional climate case at 30 km grid size in our experiments. For the chemical options, the Carbon Bond Mechanism version Z (CBM-Z) without a dimethyl sulfide scheme is used for the gas-phase chemistry. The CBM-Z photochemical mechanism contains 55 prognostic species and 134 reactions, using the lumped-structure approach for condensing organic chemical species and reactions (Fast et al., 2006). It also uses a regime-dependent approach based on partitioned kinetics, with background, anthropogenic, and biogenic submechanisms, to save computational time (Fast et al., 2006). Furthermore, we consider the chemical tendency diagnostic for equation budget analysis. However, we did not consider convective parameterization, which can simulate subgrid convective transport, wet scavenging, and aqueous chemistry, because of the simple experimental setting, even for a typhoon case.

In traffic applications, according to [3], DA problems have been addressed with Kalman filter extensions following the seminal work of Kalman [4]. They have been addressed with analytic extensions of the Kalman filter such as the Extended Kalman Filter (EKF) [5], the Unscented Kalman Filter (UKF) [6] or the Mixture Kalman Filter (MKF) [7]. Other works rely on sample-based replications, such as the Ensemble Kalman Filter (EnKF) [8] or the Particle Filter (PF) [9]. KF-based DA methods assume that the traffic model is linear, or at least differentiable. Several works aimed to use a KF method associated with an Eulerian LWR model, the Cell Transmission Model [10]. More recently, DA was explored within a Lagrangian-space traffic model with both loop data [11] and probe data [12]. These methods do not take into account model or observation
